CopyOnWriteArraySet vs ConcurrentHashMap-backed Set [duplicate]

There seem to be a lot of different implementations and ways to generate thread-safe Sets in Java.
Some examples include
1) CopyOnWriteArraySet
2) Collections.synchronizedSet(Set set)
3) ConcurrentSkipListSet
4) Collections.newSetFromMap(new ConcurrentHashMap())
5) Other Sets generated in a way similar to (4)
These examples come from Concurrency Pattern: Concurrent Set implementations in Java 6
Could someone please simply explain the differences, advantages, and disadvantages of these examples and others? I'm having trouble understanding and keeping everything straight from the Java Std Docs.

The CopyOnWriteArraySet is a quite simple implementation - it basically keeps its elements in an array, and when changing the set, it copies the array. Iterations and other accesses which are running at this time continue with the old array, avoiding the need for synchronization between readers and writers (though writing itself needs to be synchronized). The normally fast set operations (especially contains()) are quite slow here, as the array is searched in linear time.
Use this only for really small sets which will be read (iterated) often and changed seldom. (Swing's listener-sets would be an example, but these are not really sets, and should be only used from the EDT anyway.)
Collections.synchronizedSet will simply wrap a synchronized-block around each method of the original set. You should not access the original set directly. This means that no two methods of the set can be executed concurrently (one will block until the other finishes) - this is thread-safe, but you will not have concurrency if multiple threads are using the set. If you use the iterator, you usually still need to synchronize externally to avoid ConcurrentModificationExceptions when modifying the set between iterator calls. The performance will be like the performance of the original set (but with some synchronization overhead, and blocking if used concurrently).
Use this if you only have low concurrency, and want to be sure all changes are immediately visible to the other threads.
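For illustration, a minimal sketch of the external synchronization required for iteration; per the Javadoc, the lock must be the returned wrapper set itself:

Set<String> set = Collections.synchronizedSet(new HashSet<String>());

// each single call is thread-safe, but iteration is a compound action:
// manually lock the wrapper for the whole traversal
synchronized (set) {
    for (String s : set) {
        System.out.println(s);
    }
}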
ConcurrentSkipListSet is the concurrent SortedSet implementation, with most basic operations in O(log n). It allows concurrent adding/removing and reading/iteration, where iteration may or may not reflect changes made since the iterator was created. The bulk operations are simply multiple single calls, not done atomically - other threads may observe only some of them.
Obviously you can use this only if you have some total order on your elements.
This looks like an ideal candidate for high-concurrency situations, for not-too-large sets (because of the O(log n)).
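As a small illustration of that ordering requirement (the User type and its getName() are made up for this sketch):

// elements must be Comparable, or a Comparator must be supplied on construction
NavigableSet<Integer> numbers = new ConcurrentSkipListSet<>();

NavigableSet<User> byName =
        new ConcurrentSkipListSet<>(Comparator.comparing(User::getName));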
For the ConcurrentHashMap (and the Set derived from it): here most basic operations are (on average, given a good and fast hashCode()) in O(1) (but might degenerate to O(n) when many keys have the same hash code), like for HashMap/HashSet. There is limited concurrency for writing (the table is partitioned, and write access is synchronized on the needed partition), while read access is fully concurrent with itself and with the writing threads (but might not yet see the results of changes currently being written). The iterator may or may not see changes made since it was created, and bulk operations are not atomic.
Resizing is slow (as for HashMap/HashSet), thus you should try to avoid this by estimating the needed size on creation (and using about 1/3 more of that, as it resizes when 3/4 full).
Use this when you have large sets, a good (and fast) hash function and can estimate the set size and needed concurrency before creating the map.
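A sketch of that sizing advice; the expected size here is a made-up assumption. Since the map resizes when 3/4 full, give it roughly 1/3 extra capacity:

int expected = 1_000_000; // assumption: the size can be estimated up front
Set<String> set = Collections.newSetFromMap(
        new ConcurrentHashMap<String, Boolean>(expected * 4 / 3 + 1));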
Are there other concurrent map implementations one could use here?

It is possible to combine the contains() performance of HashSet with the concurrency-related properties of CopyOnWriteArraySet by using an AtomicReference<Set> and replacing the whole set on each modification.
The implementation sketch:
public abstract class CopyOnWriteSet<E> implements Set<E> {

    private final AtomicReference<Set<E>> ref;

    protected CopyOnWriteSet( Collection<? extends E> c ) {
        ref = new AtomicReference<Set<E>>( new HashSet<E>( c ) );
    }

    @Override
    public boolean contains( Object o ) {
        return ref.get().contains( o );
    }

    @Override
    public boolean add( E e ) {
        while ( true ) {
            Set<E> current = ref.get();
            if ( current.contains( e ) ) {
                return false;
            }
            Set<E> modified = new HashSet<E>( current );
            modified.add( e );
            if ( ref.compareAndSet( current, modified ) ) { // retry on concurrent change
                return true;
            }
        }
    }

    @Override
    public boolean remove( Object o ) {
        while ( true ) {
            Set<E> current = ref.get();
            if ( !current.contains( o ) ) {
                return false;
            }
            Set<E> modified = new HashSet<E>( current );
            modified.remove( o );
            if ( ref.compareAndSet( current, modified ) ) {
                return true;
            }
        }
    }
}

If the Javadocs don't help, you probably should just find a book or article to read about data structures. At a glance:
CopyOnWriteArraySet makes a new copy of the underlying array every time you mutate the collection, so writes are slow and Iterators are fast and consistent.
Collections.synchronizedSet() uses old-school synchronized method calls to make a Set threadsafe. This would be a low-performing version.
ConcurrentSkipListSet offers performant writes with inconsistent batch operations (addAll, removeAll, etc.) and Iterators.
Collections.newSetFromMap(new ConcurrentHashMap()) has the semantics of ConcurrentHashMap, which I believe isn't necessarily optimized for reads or writes, but like ConcurrentSkipListSet, has inconsistent batch operations.

Concurrent set of weak references
Another twist is a thread-safe set of weak references.
Such a set is handy for tracking subscribers in a pub-sub scenario. When a subscriber is going out of scope in other places, and therefore headed towards becoming a candidate for garbage-collection, the subscriber need not be bothered with gracefully unsubscribing. The weak reference allows the subscriber to complete its transition to being a candidate for garbage-collection. When the garbage is eventually collected, the entry in the set is removed.
While no such set is directly provided with the bundled classes, you can create one with a few calls.
First we start with making a Set of weak references by leveraging the WeakHashMap class. This is shown in the class documentation for Collections.newSetFromMap.
Set< YourClassGoesHere > weakHashSet =
Collections
.newSetFromMap(
new WeakHashMap< YourClassGoesHere , Boolean >()
)
;
The Value of the map, Boolean, is irrelevant here as the Key of the map makes up our Set.
In a scenario such as pub-sub, we need thread-safety if the subscribers and publishers are operating on separate threads (quite likely the case).
Go one step further by wrapping as a synchronized set to make this set thread-safe. Feed into a call to Collections.synchronizedSet.
this.subscribers =
Collections.synchronizedSet(
Collections.newSetFromMap(
new WeakHashMap <>() // Parameterized types `< YourClassGoesHere , Boolean >` are inferred, no need to specify.
)
);
Now we can add and remove subscribers from our resulting Set. And any “disappearing” subscribers will eventually be automatically removed after garbage-collection executes. When this execution happens depends on your JVM’s garbage-collector implementation, and on the runtime situation at the moment. For discussion and an example of when and how the underlying WeakHashMap clears the expired entries, see this Question, Is WeakHashMap ever-growing, or does it clear out the garbage keys?

Related

Using Threads and Flyweight pattern together in Java?

I'm new to both multi-threading and using design patterns.
I've some threads using explicit multi-threading, and each is supposed to compute the factorial of a number if it hasn't ever been computed by any thread. I'm using the Flyweight pattern for this.
public class Fact {
    private static final Map<String, Fact> instances = new HashMap<String, Fact>();
    private final long comp;

    private Fact(long comp) {
        this.comp = comp;
    }

    public static Fact getInstance(int num) {
        String key = String.valueOf(num);
        if (!instances.containsKey(key)) {
            long comp = factorial(num); // calculate factorial of num
            instances.put(key, new Fact(comp));
        }
        return instances.get(key);
    }

    // simple iterative factorial (the original elided this computation)
    private static long factorial(int num) {
        long result = 1;
        for (int i = 2; i <= num; i++) {
            result *= i;
        }
        return result;
    }

    public long get_Comp() {
        return this.comp;
    }
}
public class Th implements Runnable {
    // code elided

    @Override
    public void run() {
        // get a number and check whether it's already in the HashMap; if not, compute it
    }
}
If I do so then is it right to say that my Threads Th are computing Factorials?
If I add the computation in Fact (Flyweight) class then does it remain Flyweight, I guess yes.
Any other way of doing what I wish would be highly appreciated as well.
There are a couple of aims you might have here, and what to do depends on which one you are after.
So it seems in this case you are attempting to avoid repeated computation, but that computation is not particularly expensive. You could run into a problem of lock contention. Therefore, to make it thread safe use ThreadLocal<Map<String, Fact>>. Potentially InheritableThreadLocal<Map<String, Fact>> where childValue copies the Map.
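A minimal sketch of that per-thread cache, reusing the Fact class and the factorial helper from the question:

private static final ThreadLocal<Map<String, Fact>> instances =
        ThreadLocal.withInitial(HashMap::new);

public static Fact getInstance(int num) {
    // each thread owns its map, so there is no locking and no contention at all
    return instances.get()
            .computeIfAbsent(String.valueOf(num), key -> new Fact(factorial(num)));
}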
Often there are a known set of values that are likely to be common, and you just want these. In that case, compute a Map (or array) during class static initialisation.
If you want the flyweights to be shared between threads and be unique, use ConcurrentHashMap together with the Map.computeIfAbsent method.
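A sketch of that variant, again assuming the factorial helper from the question; note that ConcurrentHashMap's own computeIfAbsent performs the whole check-then-create step atomically per key:

private static final ConcurrentMap<String, Fact> instances = new ConcurrentHashMap<>();

public static Fact getInstance(int num) {
    // for ConcurrentHashMap, the mapping function is invoked at most once per key
    return instances.computeIfAbsent(String.valueOf(num), key -> new Fact(factorial(num)));
}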
If you want the flyweights to be shared between threads, be unique, and you want to make sure you only do the computation once, it gets a bit more difficult. You need to put (if absent) a placeholder into the ConcurrentMap; if the current thread wins, replace it with the computed value and notify; otherwise, wait for the computation.
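One common shape for that placeholder pattern is the Future-based memoizer described in Java Concurrency in Practice; a hedged sketch, keyed on Integer and reusing the hypothetical factorial helper:

private static final ConcurrentMap<Integer, Future<Fact>> cache = new ConcurrentHashMap<>();

public static Fact getInstance(int num) throws InterruptedException, ExecutionException {
    Future<Fact> f = cache.get(num);
    if (f == null) {
        FutureTask<Fact> task = new FutureTask<>(() -> new Fact(factorial(num)));
        f = cache.putIfAbsent(num, task); // the placeholder goes in first
        if (f == null) {                  // this thread won the race...
            f = task;
            task.run();                   // ...so it performs the computation exactly once
        }
    }
    return f.get();                       // losers block until the winner's result is ready
}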
Now if you want the flyweights to be garbage collected, you would want WeakHashMap. This cannot be a ConcurrentMap using the Java SE collections, which makes it a bit hopeless. You can use good old-fashioned locking. Alternatively, the value can be a WeakReference<Fact>, but you'll need to manage eviction yourself.
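A hedged sketch of the WeakReference-valued variant; uniqueness becomes best-effort, and cleared entries are only replaced lazily on the next lookup:

private static final ConcurrentMap<Integer, WeakReference<Fact>> cache =
        new ConcurrentHashMap<>();

public static Fact getInstance(int num) {
    WeakReference<Fact> ref = cache.get(num);
    Fact fact = (ref == null) ? null : ref.get();
    if (fact == null) {
        fact = new Fact(factorial(num));
        cache.put(num, new WeakReference<>(fact)); // racy: concurrent callers may briefly duplicate
    }
    return fact;
}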
It may be that a strong reference to Fact is only kept intermittently but you don't want it to be recreated too often, in which case you will need SoftReference instead of WeakReference. Indeed WeakHashMap can behave surprisingly, in some circumstances causing performance to drop to unusable after previously working fine.
(Note, in this case your Map would be better keyed on Integer.)

Iterator versus Stream of Java 8

To take advantage of the wide range of query methods included in java.util.stream of JDK 8, I am attempting to design domain models where getters of relationships with * multiplicity (zero or more instances) return a Stream<T>, instead of an Iterable<T> or Iterator<T>.
My doubt is whether there is any additional overhead incurred by the Stream<T> in comparison to the Iterator<T>.
So, is there any disadvantage of compromising my domain model with a Stream<T>?
Or instead, should I always return an Iterator<T> or Iterable<T>, and leave to the end-user the decision of choosing whether to use a stream, or not, by converting that iterator with the StreamUtils?
Note that returning a Collection is not a valid option because in this case most of the relationships are lazy and with unknown size.
There's lots of performance advice here, but sadly much of it is guesswork, and little of it points to the real performance considerations.
@Holger gets it right by pointing out that we should resist the seemingly overwhelming tendency to let the performance tail wag the API design dog.
While there are a zillion considerations that can make a stream slower than, the same as, or faster than some other form of traversal in any given case, there are some factors that point to streams having a performance advantage where it counts -- on big data sets.
There is some additional fixed startup overhead of creating a Stream compared to creating an Iterator -- a few more objects before you start calculating. If your data set is large, it doesn't matter; it's a small startup cost amortized over a lot of computation. (And if your data set is small, it probably also doesn't matter -- because if your program is operating on small data sets, performance is generally not your #1 concern either.) Where this does matter is when going parallel; any time spent setting up the pipeline goes into the serial fraction of Amdahl's law; if you look at the implementation, we work hard to keep the object count down during stream setup, but I'd be happy to find ways to reduce it as that has a direct effect on the breakeven data set size where parallel starts to win over sequential.
But, more important than the fixed startup cost is the per-element access cost. Here, streams actually win -- and often win big -- which some may find surprising. (In our performance tests, we routinely see stream pipelines which can outperform their for-loop over Collection counterparts.) And, there's a simple explanation for this: Spliterator has fundamentally lower per-element access costs than Iterator, even sequentially. There are several reasons for this.
The Iterator protocol is fundamentally less efficient. It requires calling two methods to get each element. Further, because Iterators must be robust to things like calling next() without hasNext(), or hasNext() multiple times without next(), both of these methods generally have to do some defensive coding (and generally more statefulness and branching), which adds to inefficiency. On the other hand, even the slow way to traverse a spliterator (tryAdvance) doesn't have this burden. (It's even worse for concurrent data structures, because the next/hasNext duality is fundamentally racy, and Iterator implementations have to do more work to defend against concurrent modifications than do Spliterator implementations.)
Spliterator further offers a "fast-path" iteration -- forEachRemaining -- which can be used most of the time (reduction, forEach), further reducing the overhead of the iteration code that mediates access to the data structure internals. This also tends to inline very well, which in turn increases the effectiveness of other optimizations such as code motion, bounds check elimination, etc.
Further, traversal via Spliterator tends to have many fewer heap writes than with Iterator. With Iterator, every element causes one or more heap writes (unless the Iterator can be scalarized via escape analysis and its fields hoisted into registers.) Among other issues, this causes GC card mark activity, leading to cache line contention for the card marks. On the other hand, Spliterators tend to have less state, and industrial-strength forEachRemaining implementations tend to defer writing anything to the heap until the end of the traversal, instead storing the iteration state in locals which naturally map to registers, resulting in reduced memory bus activity.
Summary: don't worry, be happy. Spliterator is a better Iterator, even without parallelism. (They're also generally just easier to write and harder to get wrong.)
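To make the protocol difference concrete, a minimal sketch of the traversal styles over the same source:

List<String> list = Arrays.asList("a", "b", "c");

// Iterator: two method calls per element, each doing defensive state checks
Iterator<String> it = list.iterator();
while (it.hasNext()) {
    System.out.println(it.next());
}

// Spliterator, slow path: a single method call per element
Spliterator<String> sp = list.spliterator();
while (sp.tryAdvance(System.out::println)) { }

// Spliterator, fast path: one call for the whole traversal,
// letting the source run its own tight internal loop
list.spliterator().forEachRemaining(System.out::println);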
Let’s compare the common operation of iterating over all elements, assuming that the source is an ArrayList. Then, there are three standard ways to achieve this:
Collection.forEach
final E[] elementData = (E[]) this.elementData;
final int size = this.size;
for (int i=0; modCount == expectedModCount && i < size; i++) {
action.accept(elementData[i]);
}
Iterator.forEachRemaining
final Object[] elementData = ArrayList.this.elementData;
if (i >= elementData.length) {
throw new ConcurrentModificationException();
}
while (i != size && modCount == expectedModCount) {
consumer.accept((E) elementData[i++]);
}
Stream.forEach which will end up calling Spliterator.forEachRemaining
if ((i = index) >= 0 && (index = hi) <= a.length) {
for (; i < hi; ++i) {
@SuppressWarnings("unchecked") E e = (E) a[i];
action.accept(e);
}
if (lst.modCount == mc)
return;
}
As you can see, the inner loop of the implementation code, where these operations end up, is basically the same, iterating over indices and directly reading the array and passing the element to the Consumer.
Similar things apply to all standard collections of the JRE; all of them have adapted implementations for all of these approaches, even if you are using a read-only wrapper. In the latter case, the Stream API would even slightly win, as Collection.forEach has to be called on the read-only view in order to delegate to the original collection’s forEach. Similarly, the iterator has to be wrapped to protect against attempts to invoke the remove() method. In contrast, spliterator() can directly return the original collection’s Spliterator, as it has no modification support. Thus, the stream of a read-only view is exactly the same as the stream of the original collection.
All these differences are hardly noticeable when measuring real-life performance, though, since, as said, the inner loop, which is the most performance-relevant part, is the same in all cases.
The question is which conclusion to draw from that. You still can return a read-only wrapper view to the original collection, as the caller still may invoke stream().forEach(…) to directly iterate in the context of the original collection.
Since the performance isn’t really different, you should rather focus on the higher level design like discussed in “Should I return a Collection or a Stream?”

Is there an improved alternative to Java CopyOnWriteArrayList implementation and how can I request a change to Java spec?

CopyOnWriteArrayList almost has the behavior I want, and if unnecessary copies were removed it would be exactly what I am looking for. In particular, it could act exactly like ArrayList for adds made to the end of the ArrayList - i.e., there is no reason to actually make a new copy every single time which is so wasteful. It could just virtually restrict the end of the ArrayList to capture the snapshot for the readers, and update the end after the new items are added.
This enhancement seems like it would be worth having since for many applications the most common type of addition would be to the end of the ArrayList - which is even a reason for choosing to use an ArrayList to begin with.
There would also be no extra overhead, since it would only skip the copy when appending, and although it would still have to check whether a resize is necessary, ArrayList has to do this anyway.
Is there any alternative implementation or data structure that has this behavior without the unnecessary copies for additions at the end (i.e., thread-safe and optimized to allow frequent reads with writes only being additions at the end of the list)?
How can I submit a change request to request a change to the Java specification to eliminate copies for additions to the end of a CopyOnWriteArrayList (unless a re-size is necessary)?
I'd really liked to see this changed with the core Java libraries rather than maintaining and using my own custom code.
Sounds like you're looking for a BlockingDeque, and in particular an ArrayBlockingQueue.
You may also want a ConcurrentLinkedQueue, which uses a "wait-free" algorithm (aka non-blocking) and may therefore be faster in many circumstances. It's only a Queue (not a Deque), so you can only insert at the tail and remove at the head, but it sounds like that might be good for your use case. But in exchange for the wait-free algorithm, it has to use a linked list rather than an array internally, and that means more memory (including more garbage when you pop items) and worse memory locality. The wait-free algorithm also relies on a compare-and-set (CAS) loop, which means that while it's faster in the "normal" case, it can actually be slower under high contention, as each thread needs to try its CAS several times before it wins and is able to move forward.
My guess is that the reason lists don't get as much love in java.util.concurrent is that a list is an inherently racy data structure in most use cases other than iteration. For instance, something like if (!list.isEmpty()) { return list.get(0); } is racy unless it's surrounded by a synchronized block, in which case you don't need an inherently thread-safe structure. What you really need is a "list-type" interface that only allows operations at the ends -- and that's exactly what Queue and Deque are.
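For instance, a small sketch of those end-only operations with ConcurrentLinkedQueue:

Queue<Runnable> queue = new ConcurrentLinkedQueue<>();

queue.offer(() -> System.out.println("work")); // non-blocking append at the tail
Runnable head = queue.poll();                  // remove from the head, or null if empty

// iteration is weakly consistent: it never throws
// ConcurrentModificationException, but may miss concurrent updates
for (Runnable r : queue) {
    r.run();
}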
To answer your questions:
I'm not aware of an alternative implementation that is a fully functional list.
If your idea is truly viable, I can think of a number of ways to proceed:
You can submit "requests for enhancement" (RFE) through the Java Bugs Database. However, in this case I doubt that you will get a positive response. (Certainly, not a timely one!)
You could create an RFE issue on Guava or Apache Commons issues tracker. This might be more fruitful, though it depends on convincing them ...
You could submit a patch to the OpenJDK team with an implementation of your idea. I can't say what the result might be ...
You could submit a patch (as above) to Guava or Apache Commons via their respective issues trackers. This is the approach that is most likely to succeed, though it still depends on convincing "them" that it is technically sound, and "a good thing".
You could just put the code for your proposed alternative implementation on Github, and see what happens.
However, all of this presupposes that your idea is actually going to work. Based on the scant information you have provided, I'm doubtful. I suspect that there may be issues with incomplete encapsulation, concurrency and/or not implementing the List abstraction fully / correctly.
I suggest that you put your code on Github so that other people can take a good hard look at it.
there is no reason to actually make a new copy every single time which is so wasteful.
This is how it works: it replaces the previous array with a new array in a compare-and-swap action. It is a key part of the thread-safety design that you always get a new array, even if all you do is replace an entry.
thread-safe and optimized to allow frequent reads with writes only being additions at the end of the list
This is heavily optimised for reads; any other solution will be faster for writes but slower for reads, and you have to decide which one you really want.
You can have a custom data structure which is the best of both worlds, but it is no longer a generic solution, which is what CopyOnWriteArrayList and ArrayDeque provide.
How can I submit a change request to request a change to the Java specification to eliminate copies for additions to the end of a CopyOnWriteArrayList (unless a re-size is necessary)?
You can do this through the bugs database, but what you propose is a fundamental change in how the data structure works. I suggest proposing a new/different data structure which works the way you want. In the meantime, I suggest implementing it yourself as a working example, as you will get what you want faster.
I would start with an AtomicReferenceArray, as this can be used to perform the low-level actions you need. The only problem with it is that it is not resizable, so you would need to determine the maximum size you would ever need.
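A hedged sketch of that idea (the class name is made up): writers serialize among themselves, readers are lock-free, and each element is published before the size, so a reader never observes an unfinished slot. The fixed capacity is the limitation mentioned above.

public final class AppendOnlySnapshotList<E> {
    private final AtomicReferenceArray<E> elements;
    private final AtomicInteger size = new AtomicInteger();

    public AppendOnlySnapshotList(int maxCapacity) {
        elements = new AtomicReferenceArray<>(maxCapacity);
    }

    public synchronized void add(E e) {  // writers serialize among themselves
        int index = size.get();
        elements.set(index, e);          // volatile write publishes the element...
        size.set(index + 1);             // ...before the new size becomes visible
    }

    public int size() {
        return size.get();
    }

    public E get(int index) {            // lock-free read
        if (index >= size.get()) {
            throw new IndexOutOfBoundsException(String.valueOf(index));
        }
        return elements.get(index);
    }
}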
CopyOnWriteArrayList has a performance drawback because it creates a copy of the underlying array of the list on write operations. The array copying makes the write operations slow. CopyOnWriteArrayList may be advantageous for a List with a high read rate and a low write rate.
Eventually I started coding my own implementation using java.util.concurrent.locks.ReadWriteLock. I did my implementation simply by maintaining an object-level ReadWriteLock instance, acquiring the read lock in the read operations and the write lock in the write operations. The code looks like this:
public class ConcurrentList< T > implements List< T >
{
private final ReadWriteLock readWriteLock = new ReentrantReadWriteLock();
private final List< T > list;
public ConcurrentList( List<T> list )
{
this.list = list;
}
public boolean remove( Object o )
{
readWriteLock.writeLock().lock();
boolean ret;
try
{
ret = list.remove( o );
}
finally
{
readWriteLock.writeLock().unlock();
}
return ret;
}
public boolean add( T t )
{
readWriteLock.writeLock().lock();
boolean ret;
try
{
ret = list.add( t );
}
finally
{
readWriteLock.writeLock().unlock();
}
return ret;
}
public void clear()
{
readWriteLock.writeLock().lock();
try
{
list.clear();
}
finally
{
readWriteLock.writeLock().unlock();
}
}
public int size()
{
readWriteLock.readLock().lock();
try
{
return list.size();
}
finally
{
readWriteLock.readLock().unlock();
}
}
public boolean contains( Object o )
{
readWriteLock.readLock().lock();
try
{
return list.contains( o );
}
finally
{
readWriteLock.readLock().unlock();
}
}
public T get( int index )
{
readWriteLock.readLock().lock();
try
{
return list.get( index );
}
finally
{
readWriteLock.readLock().unlock();
}
}
//etc
}
The performance improvement observed was notable.
Total time taken for 5000 reads + 5000 writes (read:write ratio 1:1) by 10 threads was:
ArrayList - 16450 ns (not thread-safe)
ConcurrentList - 20999 ns
Vector - 35696 ns
CopyOnWriteArrayList - 197032 ns
Please follow this link for more info about the test case used to obtain the above results.
However, in order to avoid ConcurrentModificationException when using the Iterator, I just create a copy of the current List and return that copy's iterator. This means this list does not return an Iterator which can modify the original List. Well, for me, this is OK for the moment.
public Iterator<T> iterator()
{
readWriteLock.readLock().lock();
try
{
return new ArrayList<T>( list ).iterator();
}
finally
{
readWriteLock.readLock().unlock();
}
}
After some googling I found out that CopyOnWriteArrayList has a similar implementation, as it does not return an Iterator which can modify the original List. The Javadoc says:
The returned iterator provides a snapshot of the state of the list when the iterator was constructed. No synchronization is needed while traversing the iterator. The iterator does NOT support the remove method.

What is the name of this locking technique?

I've got a gigantic Trove map and a method that I need to call very often from multiple threads. Most of the time this method shall return true. The threads are doing heavy number crunching, and I noticed that there was some contention due to the following method (it's just an example; my actual code is a bit different):
synchronized boolean containsSpecial() {
return troveMap.contains(key);
}
Note that it's an "append only" map: once a key is added, it stays in there forever (which is important for what comes next, I think).
I noticed that by changing the above to:
boolean containsSpecial() {
if ( troveMap.contains(key) ) {
// most of the time (>90%) we shall pass here, dodging lock-acquisition
return true;
}
synchronized (this) {
return troveMap.contains(key);
}
}
I get a 20% speedup on my number crunching (verified on lots of runs, running during long times etc.).
Does this optimization look correct (knowing that once a key is there it shall stay there forever)?
What is the name for this technique?
EDIT
The code that updates the map is called way less often than the containsSpecial() method and looks like this (I've synchronized the entire method):
synchronized void addSpecialKeyValue( key, value ) {
....
}
This code is not correct.
Trove doesn't handle concurrent use itself; it's like java.util.HashMap in that regard. So, like HashMap, even seemingly innocent, read-only methods like containsKey() could throw a runtime exception or, worse, enter an infinite loop if another thread modifies the map concurrently. I don't know the internals of Trove, but with HashMap, rehashing when the load factor is exceeded, or removing entries can cause failures in other threads that are only reading.
If the operation takes a significant amount of time compared to lock management, using a read-write lock to eliminate the serialization bottleneck will improve performance greatly. In the class documentation for ReentrantReadWriteLock, there are "Sample usages"; you can use the second example, for RWDictionary, as a guide.
In this case, the map operations may be so fast that the locking overhead dominates. If that's the case, you'll need to profile on the target system to see whether a synchronized block or a read-write lock is faster.
Either way, the important point is that you can't safely remove all synchronization, or you'll have consistency and visibility problems.
It's called wrong locking ;-) Actually, it is some variant of the double-checked locking approach. And the original version of that approach is just plain wrong in Java.
Java threads are allowed to keep private copies of variables in their local memory (think: core-local cache of a multi-core machine). Any Java implementation is allowed to never write changes back into the global memory unless some synchronization happens.
So it is entirely possible that one of your threads has a local memory in which troveMap.contains(key) evaluates to true. In that case it never synchronizes, and it never gets the updated memory.
Additionally, what happens when contains() sees an inconsistent view of the troveMap data structure?
Look up the Java memory model for the details. Or have a look at this book: Java Concurrency in Practice.
This looks unsafe to me. Specifically, the unsynchronized calls will be able to see partial updates, either due to memory visibility (a previous put not getting fully published, since you haven't told the JMM it needs to be) or due to a plain old race. Imagine if TroveMap.contains has some internal variable that it assumes won't change during the course of contains. This code lets that invariant break.
Regarding the memory visibility, the problem with that isn't false negatives (you use the synchronized double-check for that), but that trove's invariants may be violated. For instance, if they have a counter, and they require that counter == someInternalArray.length at all times, the lack of synchronization may be violating that.
My first thought was to make troveMap's reference volatile, and to re-write the reference every time you add to the map:
// with the troveMap field declared volatile:
synchronized (this) {
    troveMap.put(key, value);
    troveMap = troveMap; // self-assignment: the volatile write publishes prior changes to readers
}
That way, you're setting up a memory barrier such that anyone who reads the troveMap will be guaranteed to see everything that had happened to it before its most recent assignment -- that is, its latest state. This solves the memory issues, but it doesn't solve the race conditions.
Depending on how quickly your data changes, maybe a Bloom filter could help? Or some other structure that's more optimized for certain fast paths?
Under the conditions you describe, it's easy to imagine a map implementation for which you can get false negatives by failing to synchronize. The only way I can imagine obtaining false positives is an implementation in which key insertions are non-atomic and a partial key insertion happens to look like another key you are testing for.
You don't say what kind of map you have implemented, but the stock map implementations store keys by assigning references. According to the Java Language Specification:
Writes to and reads of references are always atomic, regardless of whether they are implemented as 32 or 64 bit values.
If your map implementation uses object references as keys, then I don't see how you can get in trouble.
EDIT
The above was written in ignorance of Trove itself. After a little research, I found the following post by Rob Eden (one of the developers of Trove) on whether Trove maps are concurrent:
Trove does not modify the internal structure on retrievals. However, this is an implementation detail not a guarantee so I can't say that it won't change in future versions.
So it seems like this approach will work for now but may not be safe at all in a future version. It may be best to use one of Trove's synchronized map classes, despite the penalty.
I think you would be better off with a ConcurrentHashMap, which doesn't need explicit locking and allows concurrent reads:
boolean containsSpecial() {
    return troveMap.containsKey(key); // a Map exposes containsKey rather than Trove's contains
}

void addSpecialKeyValue( key, value ) {
    troveMap.putIfAbsent(key, value);
}
Another option is using a ReadWriteLock, which allows concurrent reads but no concurrent writes:
ReadWriteLock rwlock = new ReentrantReadWriteLock();

boolean containsSpecial() {
    rwlock.readLock().lock();
    try {
        return troveMap.contains(key);
    } finally {
        rwlock.readLock().unlock(); // Lock has unlock(), not release()
    }
}

void addSpecialKeyValue( key, value ) {
    rwlock.writeLock().lock();
    try {
        //...
        troveMap.put(key, value);
    } finally {
        rwlock.writeLock().unlock();
    }
}
Why reinvent the wheel? Simply use ConcurrentHashMap.putIfAbsent.

