I'm reading the official Java documentation on wrapper implementations, the static methods in Collections that return a synchronized collection, for example: List<Type> list = Collections.synchronizedList(new ArrayList<Type>());
...
What I don't understand is the following (quoting from the Java documentation):
A collection created in this fashion is every bit as thread-safe as a normally synchronized collection, such as a Vector.
In the face of concurrent access, it is imperative that the user manually synchronize on the returned collection when iterating over it. The reason is that iteration is accomplished via multiple calls into the collection, which must be composed into a single atomic operation...
How can it be every bit as thread-safe and still need manual synchronization when iterating?
It is thread safe in the sense that each of its individual methods is thread safe, but if you perform compound actions on the collection, your code is at risk of concurrency issues.
For example:
List<String> synchronizedList = Collections.synchronizedList(someList);
synchronizedList.add(whatever); // this is thread safe
The individual method add() is thread safe, but if I perform the following:
List<String> synchronizedList = Collections.synchronizedList(someList);
if(!synchronizedList.contains(whatever))
synchronizedList.add(whatever); // this is not thread safe
The if-then-add operation is not thread safe, because another thread might add whatever to the list between the contains() check and the add() call.
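A minimal sketch of one way to make the check-then-add step atomic: hold the list's own monitor around both calls. The names AddIfAbsentSketch and addIfAbsent are hypothetical. Locking on the wrapper itself relies on the documented convention that the list returned by Collections.synchronizedList synchronizes on itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class AddIfAbsentSketch {
    // Adds the value only if it is not already present; returns true if it was added.
    static <T> boolean addIfAbsent(List<T> synchronizedList, T value) {
        // Hold the wrapper's lock so no other thread can add between contains() and add().
        synchronized (synchronizedList) {
            if (!synchronizedList.contains(value)) {
                return synchronizedList.add(value);
            }
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> list = Collections.synchronizedList(new ArrayList<>());
        System.out.println(addIfAbsent(list, "a")); // first call adds the element
        System.out.println(addIfAbsent(list, "a")); // second call finds it already present
        System.out.println(list);
    }
}
```

This only works if every thread that performs the compound action goes through the same lock; a thread calling contains() and add() directly would still race.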
There is no contradiction here: collections returned from synchronizedXyz suffer from the same shortcoming as the synchronized collections available to you directly, namely the need to synchronize manually when iterating over the collection.
The problem of external iteration cannot be solved by better class design, because iterating over a collection inherently requires making multiple calls to its methods (see this Q&A for a detailed explanation).
Note that starting with Java 1.8 you can iterate without additional synchronization by using the forEach method of your synchronized collection. This is thread-safe, and comes with additional benefits; see this Q&A for details.
The reason this differs from iterating externally is that the forEach implementation inside the collection takes care of synchronizing the iteration for you.
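The two styles of iteration can be contrasted in a short sketch. The class and method names here are hypothetical; the example assumes the list passed in was created with Collections.synchronizedList.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SyncIterationSketch {
    // External iteration: the caller must hold the list's lock for the whole loop.
    static int sumWithIterator(List<Integer> synchronizedList) {
        int sum = 0;
        synchronized (synchronizedList) {
            for (int n : synchronizedList) {
                sum += n;
            }
        }
        return sum;
    }

    // Internal iteration (Java 8+): the wrapper's forEach acquires the lock itself.
    static int sumWithForEach(List<Integer> synchronizedList) {
        int[] sum = {0}; // one-element array so the lambda can update it
        synchronizedList.forEach(n -> sum[0] += n);
        return sum[0];
    }

    public static void main(String[] args) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        list.add(1);
        list.add(2);
        list.add(3);
        System.out.println(sumWithIterator(list));
        System.out.println(sumWithForEach(list));
    }
}
```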
Related
I'm new to Java 8 and working on a problem where multiple threads (~10) write values to a ConcurrentHashMap. I have another dedicated thread that reads all the values present in the ConcurrentHashMap and returns them (every 30 seconds). Is iterating over the result of the values() method the recommended way of fetching results without getting a ConcurrentModificationException?
Note: I am perfectly fine with getting stale data
I went over the official docs which says:
Retrieval operations generally do not block, so may overlap with update operations. Retrievals reflect the results of the most recently completed update operations holding upon their onset. For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries. Similarly, Iterators, Spliterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw ConcurrentModificationException.
However doc of values() method says:
Returns a Collection view of the values contained in this map
Is the below code thread safe?
for (String name : myMap.values()) {
    System.out.println("name: " + name);
}
Is iterating over result of values() method the recommended way of fetching results without getting a ConcurrentModificationException?
Yes. It is the recommended way, and you won't get a ConcurrentModificationException.
As the package level javadoc states:
Most concurrent Collection implementations (including most Queues) also differ from the usual java.util conventions in that their Iterators and Spliterators provide weakly consistent rather than fast-fail traversal:
they may proceed concurrently with other operations
they will never throw ConcurrentModificationException
they are guaranteed to traverse elements as they existed upon construction exactly once, and may (but are not guaranteed to) reflect any modifications subsequent to construction.
Is the below code thread safe?
for (String name : myMap.values()) {
    System.out.println("name: " + name);
}
Yes ... with some qualifications.
Thread safety really means that the code works according to its specified behavior in a multi-threaded application. The problem is that you haven't said clearly what you expect the code to actually do.
What we can say is the following:
The iteration will see values as per the previously stated guarantees.
The memory model guarantees mean that there shouldn't be any nasty behavior with stale values ... unless you mutate value objects after putting them into the map. (If you do that, then the object's methods need to be implemented to cope with that; e.g. they may need to be synchronized. This is moot for String values, since they are immutable.)
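As a rough illustration of the guarantees above, the following sketch copies the values into a private list while a writer thread is still running. The class name, method name, and key/value strings are made up for the example; the only behaviour relied on is that iterating over values() of a ConcurrentHashMap never throws ConcurrentModificationException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

class ValuesSnapshotSketch {
    // Copies whatever values the weakly consistent iterator sees into a private list.
    static List<String> snapshotValues(ConcurrentHashMap<String, String> map) {
        List<String> copy = new ArrayList<>();
        for (String value : map.values()) {
            copy.add(value);
        }
        return copy;
    }

    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                map.put("key" + i, "value" + i);
            }
        });
        writer.start();
        // Safe to call while the writer runs; the snapshot may hold anywhere from 0 to 10000 values.
        System.out.println("during writes: " + snapshotValues(map).size());
        writer.join();
        System.out.println("after writes: " + snapshotValues(map).size());
    }
}
```

The snapshot may be stale by the time it is returned, which matches the stated requirement that stale data is acceptable.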
Hashtable is a thread-safe collection, but does using an ArrayList (which is not thread-safe) as the value type endanger the thread safety of the whole structure?
Hashtable<Employee, ArrayList<Car>> carDealership = new Hashtable<>();
Further on, I am planning to wrap every operation on the ArrayLists in a synchronized block to prevent race conditions when calling any of their methods.
However, I haven't declared the ArrayLists in the Hashtable as synchronized lists, which would be done with the following code:
Collections.synchronizedList(new ArrayList<>())
This would obviously happen when I add the ArrayLists to the Hashtable.
How can I be sure that the ArrayLists in the Hashtable are thread-safe?
Is it enough to pass a thread-safe list to the put() method of the Hashtable (without even worrying about the Hashtable's constructor)? In other words, does the put() method of the Hashtable not even check whether I am passing a thread-safe or an unsafe argument?
Note: Thread-safety is a requirement. Otherwise I wouldn't have opted for this implementation.
The only way to ensure that the values in the Hashtable or ConcurrentHashMap are thread-safe is to wrap the map in a way that prevents anyone from adding something that you don't control. Never expose the Map itself or any of the Lists contained in it to other parts of your code. Provide methods to get snapshot copies if you need them, provide methods to add values to the lists, but make sure the class wrapping the map is the one that creates all lists that can ever get added to it. Iteration over the "live" lists in your map will require external synchronisation (as mentioned in the JavaDocs of synchronizedList).
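The advice above could be sketched roughly as follows. ListRegistrySketch and its method names are hypothetical, and the class is generic rather than tied to the Employee and Car types from the question. It assumes a ConcurrentHashMap as the backing map; a Hashtable would work the same way here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ListRegistrySketch<K, V> {
    // Never exposed: no outside code can insert a list this class did not create.
    private final ConcurrentMap<K, List<V>> lists = new ConcurrentHashMap<>();

    // The registry creates every list itself, so each one is a synchronized wrapper.
    public void add(K key, V item) {
        lists.computeIfAbsent(key, k -> Collections.synchronizedList(new ArrayList<>()))
             .add(item);
    }

    // Returns an independent copy; callers never see the live list.
    public List<V> snapshot(K key) {
        List<V> live = lists.get(key);
        if (live == null) {
            return new ArrayList<>();
        }
        // Copying reads the whole list, so hold its lock while doing so.
        synchronized (live) {
            return new ArrayList<>(live);
        }
    }
}
```

Because callers only ever receive copies, they can iterate over the result without any locking of their own.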
Both Hashtable and ConcurrentHashMap are thread-safe in that concurrent operations will not leave them in an invalid state. This means e.g. that if you invoke put from two threads with the same key, one of them will return the value the other inserted as the "old" value. But of course you can't tell which will be the first and which will be second in advance without some external synchronization.
The implementations are quite different, though: Hashtable and the synchronized Map returned by Collections.synchronizedMap(new HashMap()) are similar in that they basically add synchronized modifiers to most methods. This can be inefficient if you have lots of threads (i.e. high contention for the locks) that mostly read and only occasionally modify the map. ConcurrentHashMap provides more fine-grained locking:
Retrieval operations (including get) generally do not block
which can yield significantly better performance, depending on your use case. It also provides a richer API with powerful search and bulk-modification operations.
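To give a feel for that richer API, here is a small sketch using merge, search and reduceValues, all of which exist on ConcurrentHashMap since Java 8. The class name, method names, and the word-counting scenario are invented for illustration. Long.MAX_VALUE as the parallelism threshold asks for sequential execution.

```java
import java.util.concurrent.ConcurrentHashMap;

class BulkOpsSketch {
    // Counts occurrences; merge() performs the read-modify-write atomically per key.
    static ConcurrentHashMap<String, Integer> countWords(String... words) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    // Returns some key whose count is at least min, or null if there is none.
    static String anyKeyWithAtLeast(ConcurrentHashMap<String, Integer> counts, int min) {
        return counts.search(Long.MAX_VALUE, (k, v) -> v >= min ? k : null);
    }

    // Sums all values; reduceValues returns null for an empty map.
    static int total(ConcurrentHashMap<String, Integer> counts) {
        Integer sum = counts.reduceValues(Long.MAX_VALUE, Integer::sum);
        return sum == null ? 0 : sum;
    }
}
```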
Yes, using ArrayList in this case is not thread safe: any thread can always get the list from the table and operate on it.
CopyOnWriteArrayList is a good substitute for it.
But you still have the case where one thread takes the collection (saves it in a variable) and another thread replaces it in the table with a different one.
If you are not going to replace the lists inside the table, then this is not a problem.
I have a ConcurrentHashMap<String, Object> concurrentMap;
I need to return String[] with keys of the map.
Is the following code:
public String[] listKeys() {
    return concurrentMap.keySet().toArray(new String[0]);
}
thread safe?
While ConcurrentHashMap is a thread-safe class, the Iterator used on the keys is NOT CERTAIN to be in sync with any subsequent changes to the map once it has been created...
From the spec:
public Set<K> keySet()
Returns a Set view of the keys contained in this map. ...
...
The view's iterator is a "weakly consistent" iterator that will
never throw ConcurrentModificationException, and guarantees to
traverse elements as they existed upon construction of the iterator,
and may (but is not guaranteed to) reflect any modifications
subsequent to construction.
Yes and no. Thread safety is only fuzzily defined as soon as you extend the scope.
Generally, concurrent collections implement all their methods in ways that allow concurrent access by multiple threads, or if they can't, provide mechanisms to serialize such accesses (e.g. synchronization) transparently. Thus, they are safe in the sense they ensure they preserve a valid internal structure and method calls give valid results.
The fuzziness starts if you look at the details, e.g. toArray() will return some kind of snapshot of the collection's contents. There is no guarantee that the contents will not already have changed by the time the method returns. So while the call is thread safe, the result will not fulfill the usual invariants (e.g. the array contents may not be the same as the collection's).
If you need consistency over the scope of multiple calls to a concurrent collection, you need to provide mechanisms within the code calling the methods to ensure the required consistency.
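A small sketch of the snapshot behaviour described above. KeySnapshotSketch and its put method are hypothetical scaffolding. It uses the typed toArray(new String[0]) overload, since the no-argument toArray() returns an Object[] that cannot be cast to String[].

```java
import java.util.concurrent.ConcurrentHashMap;

class KeySnapshotSketch {
    private final ConcurrentHashMap<String, Object> concurrentMap = new ConcurrentHashMap<>();

    public void put(String key, Object value) {
        concurrentMap.put(key, value);
    }

    // Safe to call concurrently with writers; the array is a copy and is
    // not updated when the map changes afterwards.
    public String[] listKeys() {
        return concurrentMap.keySet().toArray(new String[0]);
    }
}
```

The call itself is thread safe, but the returned array may already be out of date when the caller looks at it.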
I have a general question regarding synchronized List.
Let's say that in the constructor I am creating a list
List synchronizedList = Collections.synchronizedList(list);
and I have one method that adds an object to the list.
public void add(String s) {
    synchronizedList.add(s);
}
There is another thread that checks every few seconds whether there are any rows, dumps them to a file, and deletes them all.
Now let's say I iterate over the rows and save each one to the db.
After the iteration is complete, I clear the list.
How does the multithreading support help me?
Another thread could add an element to the list just before the clear() occurs, and that element would be lost.
The only alternative is to manage the lock myself, in which case I don't really need a synchronized list.
The synchronized list returned by Collections won't help in your case. It's only good if you need to guarantee serial access to individual method calls. If you need to synchronize around a larger set of operations, then you need to manually wrap that code in a synchronized block. The Javadoc states:
It is imperative that the user manually synchronize on the returned list when iterating over it.
If your list is used elsewhere you can at least safeguard it from individual method calls that would otherwise not be thread-safe. If you're entirely managing the list however, you can just add a synchronized block to your add method and use the same lock that you'll use when iterating over it.
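For the case where you manage the list entirely yourself, a minimal sketch could look like this. RowBufferSketch, drain and the private lock object are hypothetical names; the point is only that add and the copy-then-clear step share one lock, so no row can slip in between the copy and the clear.

```java
import java.util.ArrayList;
import java.util.List;

class RowBufferSketch {
    private final Object lock = new Object();
    private final List<String> rows = new ArrayList<>(); // plain list, guarded by lock

    public void add(String s) {
        synchronized (lock) {
            rows.add(s);
        }
    }

    // Atomically takes everything buffered so far and empties the buffer.
    public List<String> drain() {
        synchronized (lock) {
            List<String> batch = new ArrayList<>(rows);
            rows.clear();
            return batch;
        }
    }
}
```

The dumping thread can then write the returned batch to the file or database outside the lock, so writers are blocked only for the duration of the copy.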
synchronizedList indeed only guarantees that every method call on the list is synchronized. If you need multiple operations to be done in a synchronized way, you have to handle the synchronization yourself.
BTW, this is explicitly stated in the javadoc for Collections.synchronizedList:
It is imperative that the user manually synchronize on the returned list when iterating over it:
List list = Collections.synchronizedList(new ArrayList());
...
synchronized(list) {
Iterator i = list.iterator(); // Must be in synchronized block
while (i.hasNext())
foo(i.next());
}
A synchronized list means that each individual operation on that list is guaranteed to be atomic. The scenario you describe requires some locking outside the list. Consider semaphores, or use synchronized blocks to implement monitors. Take a look at java.util.concurrent.
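As one possible direction within java.util.concurrent, a BlockingQueue can replace the list altogether. This is a sketch under the assumption that the order of rows matters but random access does not. QueueDrainSketch is a hypothetical name, while LinkedBlockingQueue and drainTo are standard library APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueDrainSketch {
    private final BlockingQueue<String> rows = new LinkedBlockingQueue<>();

    public void add(String s) {
        rows.add(s); // thread-safe without any external locking
    }

    // Moves the currently queued rows into a private batch; rows added
    // concurrently are either in this batch or stay queued for the next one.
    public List<String> drain() {
        List<String> batch = new ArrayList<>();
        rows.drainTo(batch);
        return batch;
    }
}
```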
If all attributes (fields, or data members) of a Java collection are thread-safe (CopyOnWriteArraySet, ConcurrentHashMap, BlockingQueue, ...), can we say that this collection is thread-safe?
An example:
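import java.util.concurrent.CopyOnWriteArraySet;

public class AmIThreadSafe {
    // Initialized here so that add() and clear() do not hit a null field.
    private final CopyOnWriteArraySet<Object> threadSafeAttribute = new CopyOnWriteArraySet<>();

    public void add(Object o) {
        threadSafeAttribute.add(o);
    }

    public void clear() {
        threadSafeAttribute.clear();
    }
}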
In this sample, can we say that AmIThreadSafe is thread-safe?
Assuming by "attributes" you mean "what the collection holds", then no. Just because the Collection holds thread-safe items does not mean that the Collection's implementation implements add(), clear(), remove(), etc., in a thread-safe manner.
Short answer: No.
Slightly longer answer: because add() and clear() are not in any way synchronized, and HashSet isn't itself synchronized, it's possible for multiple threads to be in them at the same time.
Edit following comment: Ah. Now the short answer is Yes, sorta. :)
The reason for the "sorta" (American slang meaning partially, btw) is that it's possible for two operations to be atomically safe, but to be unsafe when used in combination to make a compound operation.
In your given example, where only add() and clear() are supported, this can't happen.
But in a more complete class, where we would have more of the Collection interface, imagine a caller who needs to add an entry to the set iff the set has no more than 100 entries already.
This caller would want to write a method something like this:
void addIfNotOverLimit (AmIThreadSafe set, Object o, int limit) {
if (set.size() < limit) // ## thread-safe call 1
set.add(o); // ## thread-safe call 2
}
The problem is that while each call is itself thread-safe, two threads could be in addIfNotOverLimit at once (or, for that matter, adding through another method altogether). Thread A could call size() and get 99, but be interrupted before it calls add(); thread B could then add an entry, and once thread A resumes and adds its own, the set would be over its limit.
Moral? Compound operations make the definition of 'thread safe' more complex.
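One way to close that gap is to move the compound operation into the class and guard it with a single lock. The sketch below is an assumption-laden illustration, not the class from the question: BoundedSetSketch is a hypothetical name, and it uses a plain HashSet because the synchronized methods already serialize all access.

```java
import java.util.HashSet;
import java.util.Set;

class BoundedSetSketch {
    private final Set<Object> items = new HashSet<>(); // guarded by this
    private final int limit;

    BoundedSetSketch(int limit) {
        this.limit = limit;
    }

    // The size check and the add happen under one lock, so the limit cannot be overshot.
    public synchronized boolean addIfNotOverLimit(Object o) {
        if (items.size() < limit) {
            return items.add(o);
        }
        return false;
    }

    public synchronized int size() {
        return items.size();
    }
}
```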
No, because the state of an object is the "sum" of all of its attributes.
For instance, you could have 2 thread-safe collections as attributes in your object. Additionally, your object could depend on some sort of correlation between these 2 collections (e.g. if an object is in one collection, it is in the other collection, and vice versa). Simply using 2 thread-safe collections will not ensure that that correlation is true at all points in time. You would need additional concurrency control in your object to ensure that this constraint holds across the 2 collections.
Since most non-trivial objects have some type of correlation relationship across their attributes, using thread-safe collections as attributes is not sufficient to make an object thread-safe.
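A sketch of such additional concurrency control, using a made-up two-way directory as the correlated pair. BiDirectorySketch and all of its members are hypothetical. Both maps are individually thread-safe, yet the invariant that every name maps to an id and back is only maintained because every update and every check runs under the same lock.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class BiDirectorySketch {
    private final Map<String, Integer> idByName = new ConcurrentHashMap<>();
    private final Map<Integer, String> nameById = new ConcurrentHashMap<>();
    private final Object lock = new Object();

    // Updates both maps as one unit and removes any mapping that would become stale.
    public void register(String name, int id) {
        synchronized (lock) {
            Integer oldId = idByName.put(name, id);
            if (oldId != null) {
                nameById.remove(oldId);
            }
            String oldName = nameById.put(id, name);
            if (oldName != null && !oldName.equals(name)) {
                idByName.remove(oldName);
            }
        }
    }

    public void unregister(String name) {
        synchronized (lock) {
            Integer id = idByName.remove(name);
            if (id != null) {
                nameById.remove(id);
            }
        }
    }

    // Checks the invariant under the lock, so it never observes a half-applied update.
    public boolean isConsistent() {
        synchronized (lock) {
            if (idByName.size() != nameById.size()) {
                return false;
            }
            for (Map.Entry<String, Integer> e : idByName.entrySet()) {
                if (!e.getKey().equals(nameById.get(e.getValue()))) {
                    return false;
                }
            }
            return true;
        }
    }
}
```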
What is thread safety?
Thread safety simply means that the fields of an object or class always maintain a valid state, as observed by other objects and classes, even when used concurrently by multiple threads.
A thread-safe object is one that always maintains a valid state, as observed by other classes and objects, even in a multithreaded environment.
According to the API documentation, you have to use this function to ensure thread-safety:
synchronizedCollection(Collection c)
Returns a synchronized (thread-safe) collection backed by the specified collection
Reading that, it is my opinion that you have to use the above function to ensure a thread-safe Collection. However, you do not have to use it for all Collections, and there are faster Collections that are thread-safe, such as ConcurrentHashMap. The underlying nature of CopyOnWriteArraySet ensures thread-safe operations.