Can Java LinkedList be read in multiple-threads safely? - java

I would like multiple threads to iterate through the elements in a LinkedList. I do not need to write into the LinkedList. Is it safe to do so? Or do I need a synchronized list to make it work?
Thank you!

They can do this safely, PROVIDED THAT:
they synchronize with (all of) the threads that have written the list BEFORE they start the iterations, and
no threads modify the list during the iterations.
The first point is necessary, because unless there is proper synchronization before you start, there is a possibility that one of the "writing" threads has unflushed changes for the list data structures in local cache memory or registers, or one of the reading threads has stale list state in its cache or registers.
(This is one of those cases where a solid understanding of the Java memory model is needed to know whether the scenario is truly thread-safe.)
Or do I need a synchronized list to make it work
You don't necessarily need to go that far. All you need to do is ensure that there is a "happens-before" relationship at the appropriate point, and there are a variety of ways to achieve that. For instance, it is sufficient for the writer thread to create and populate the list and then pass it to the reader thread objects before calling start() on them.
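As a rough sketch of that last option (the class and method names here are made up for illustration), the writer populates the list first and only then starts the readers. Thread.start() gives the happens-before edge, so no synchronized wrapper is involved:

```java
import java.util.LinkedList;
import java.util.List;

class SharedListReaders {
    // Hypothetical helper: fills a list, then lets several threads iterate over it.
    static long[] sumInParallel(int elements, int readers) {
        // The writer (this thread) finishes all modifications first.
        List<Integer> list = new LinkedList<>();
        for (int i = 1; i <= elements; i++) {
            list.add(i);
        }

        long[] sums = new long[readers];
        Thread[] threads = new Thread[readers];
        for (int r = 0; r < readers; r++) {
            final int slot = r;
            threads[r] = new Thread(() -> {
                long sum = 0;
                for (int value : list) { // read-only iteration, one iterator per thread
                    sum += value;
                }
                sums[slot] = sum;
            });
        }
        // Thread.start() happens-before every action in the started thread,
        // so each reader sees the fully populated list.
        for (Thread t : threads) {
            t.start();
        }
        try {
            // Thread.join() makes each reader's write to sums[] visible here.
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sums;
    }

    public static void main(String[] args) {
        for (long sum : sumInParallel(1000, 4)) {
            System.out.println(sum);
        }
    }
}
```

If the list were modified after the readers had started, this guarantee would no longer hold.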

From the Java documentation:
Note that this implementation is not synchronized. If multiple threads access a linked list concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be "wrapped" using the Collections.
In other words, if you are truly just iterating, you're all right; just be careful that no thread modifies the list structurally in the meantime.

Related

Why are we here, specifically, saying that ArrayList is not thread safe?

Description: If we use the same object reference among multiple threads, no object is thread-safe. Similarly, if any collection reference is shared among multiple threads, then that collection is not thread-safe, since other threads can access it. So why are we specifically saying that ArrayList is not thread-safe? What about the other collections?
You misunderstand the meaning of "thread-safe."
When we say "class X is thread-safe," we are not saying that you don't have to worry about the thread-safety of a program that uses it. If you build a program using nothing but thread-safe objects, that does not guarantee that your program will be thread-safe.
So what does it guarantee?
Suppose you have a List. Suppose that two threads, A and B, each write different values to the same index in the list, suppose that some thread C reads from that index, and suppose that none of those three threads uses any synchronization.
If the list is "thread-safe," then you can be assured that thread C will get one of three possible values:
The value that thread A wrote,
The value that thread B wrote,
The value that was stored at that index before either thread A or thread B wrote.
If the list is not thread-safe, then any of those same three things could happen, but also, other things could happen:
Thread C could get a value that was never in the list,
The list could behave in broken ways in the future for thread C even if no other thread continues to use it,
The program could crash,
etc. (I don't know how many other strange things could happen.)
When we say that a class is "thread-safe" we are saying that it will always behave in predictable, reasonable ways, even when its methods are concurrently called by multiple threads.
If you write a program that uses a "thread-safe" list, and if it depends on thread C reading one particular value of the three possibilities that I listed above, then your program has a thread-safety problem, even though the list itself does not.
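Here is a small sketch of the A/B/C scenario, assuming a list wrapped with Collections.synchronizedList as the "thread-safe" list (the class and method names are mine). Whichever way the threads interleave, C gets one of the three values, but which one is up to timing:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ThreeOutcomes {
    // Runs the race once and returns whatever thread C happened to read.
    static int raceOnce() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        list.add(0); // the value stored before A or B writes

        int[] seen = new int[1];
        Thread a = new Thread(() -> list.set(0, 1));       // thread A writes 1
        Thread b = new Thread(() -> list.set(0, 2));       // thread B writes 2
        Thread c = new Thread(() -> seen[0] = list.get(0)); // thread C reads

        a.start();
        b.start();
        c.start();
        try {
            a.join();
            b.join();
            c.join(); // join makes C's write to seen[] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0]; // 0, 1 or 2, never anything else
    }

    public static void main(String[] args) {
        System.out.println("C read: " + raceOnce());
    }
}
```

A program that needs C to read, say, specifically the value written by B would have to add its own coordination between the threads.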
I haven't checked but I think that all standard Collection implementations state if they are thread-safe or not. So you know if you can share that collection among different threads without synchronization.
CopyOnWriteArrayList for example is a thread-safe List implementation.
ArrayList is unsynchronized in implementation. When an object is unsynchronized, it means that it is not locked while being modified structurally. A structural modification is any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.
What you are referring to is a list whose elements are being added or deleted; that differs from merely setting the value of an existing element.
The reference only points to the start of the array; how many elements it holds is what is in question. When an unsynchronized list is structurally modified while another thread iterates over its elements, the integrity of the elements seen by that iteration is hard to guarantee. I hope I was able to convey the message plainly.
For more details, see the Oracle documentation for ArrayList and ConcurrentModificationException.
ArrayList:
Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be "wrapped" using the Collections.synchronizedList method.
ConcurrentModificationException:
Note that fail-fast behavior cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification.
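To see the fail-fast check in action without relying on thread timing, here is a single-threaded stand-in (names are made up for the example). It modifies the list structurally in the middle of an iteration, which is the same check a concurrent writer would trip, except that across threads it is only best-effort:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class FailFastDemo {
    // Returns true if the iterator noticed the structural modification.
    static boolean modifyWhileIterating() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : list) {
                list.add(s + "!"); // structural modification during iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            // Thrown by the iterator's next() once it sees the changed modification count.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("iterator failed fast: " + modifyWhileIterating());
    }
}
```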

Is the keySet iterator of ConcurrentHashMap thread-safe?

I am just trying to explore what "thread-safe" means.
Below is my understanding:
It looks to me like it means allowing multiple threads to access a collection at the same time, irrespective of its synchronization. For example, any method without the synchronized keyword on it is thread-safe, meaning multiple threads can access it.
It is up to the developer to add more logic (synchronization) to this method to maintain data integrity while multiple threads are accessing it. That is separate from thread safety.
If my statement above is false, just read the Javadoc below for ConcurrentHashMap:
keySet: The view's iterator is a "weakly consistent" iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.
The above statement says that the keySet iterator does not guarantee data integrity while multiple threads are modifying the collection.
Could you please tell me: is the keySet iterator of ConcurrentHashMap thread-safe?
And is my understanding of thread safety correct?
keySet: The view's iterator is a "weakly consistent" iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction
This by itself explains that the keySet iterator of ConcurrentHashMap is thread-safe.
The general idea behind the java.util.concurrent package is to provide a set of data structures that offer thread-safe access without strong consistency. This way these objects achieve higher concurrency than fully locked objects.
Being thread-safe means that, even without any explicit synchronization, you never corrupt the objects. In Hashtable and HashMap, some methods are potential problems for multi-threaded access, such as the remove method, which first checks that the element exists and then removes it. These kinds of methods are implemented as atomic operations in ConcurrentHashMap, so you need not be afraid of losing data.
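A minimal sketch of what "atomic" buys you here. I picked merge and the two-argument remove as the illustrations; the class and method names are invented for the example:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class AtomicMapOps {
    // Several threads bump the same counter; merge() does the read-modify-write atomically.
    static int countConcurrently(int threads, int incrementsPerThread) {
        ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge("hits", 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts.get("hits");
    }

    // remove(key, value) checks and removes in one atomic step.
    static boolean removeOnlyIfUnchanged() {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("k", 1);
        boolean removedStale = map.remove("k", 2);   // value differs, nothing removed
        boolean removedCurrent = map.remove("k", 1); // value matches, entry removed
        return !removedStale && removedCurrent && map.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 1000));
        System.out.println(removeOnlyIfUnchanged());
    }
}
```

Doing the same with a separate get followed by put on a plain HashMap could lose increments, or worse.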
However, this does not mean that the class is automatically locked for each operation. High-level operations such as putAll and iterators are not synchronized. The class does not provide strong consistency. The order and timing of your operations are guaranteed not to corrupt the object, but are not guaranteed to produce accurate results.
For example, if you print the object concurrently with a call to putAll, you might see partially populated output. Using an iterator concurrently with new insertions also might not reflect all of the insertions, as you quoted.
This weak consistency is a separate matter from thread safety. Even though the results might surprise you, you are assured that nothing is lost or accidentally overwritten; elements are added to and removed from your object without any problem. If this behaviour is sufficient for your requirements, you are advised to use the java.util.concurrent classes. If you need more consistency, then you need to use the synchronized classes from java.util or use synchronization yourself.
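Here is a small sketch of that weakly consistent behaviour (names are mine, and the insertion happens on the iterating thread only to keep the example deterministic enough to run). The map is modified while its keySet is being iterated: no exception is thrown, every key present at the start is visited, and the late key may or may not show up:

```java
import java.util.concurrent.ConcurrentHashMap;

class WeaklyConsistentDemo {
    // Returns {keys visited, final map size}.
    static int[] iterateWhileInserting() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);

        int visited = 0;
        for (String key : map.keySet()) {
            // Modifying the map mid-iteration never throws
            // ConcurrentModificationException for this iterator.
            map.put("late", 0);
            visited++;
        }
        // The iterator may or may not have reflected the "late" key.
        return new int[] {visited, map.size()};
    }

    public static void main(String[] args) {
        int[] result = iterateWhileInserting();
        System.out.println("visited " + result[0] + " of " + result[1] + " keys");
    }
}
```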
By your definition the Set returned by ConcurrentHashMap.keySet() is thread safe.
However, it may act in very strange ways, as pointed out in the quote you included.
As a Set, entries may appear and/or disappear at random. I.e. if you call contains twice on the same object, the two results may differ.
As an Iterable you could begin two iterations of its underlying objects in two different threads and discover that the two iterations enumerate different entries.
Furthermore, contains and iteration may not match either.
This will not occur, however, if you somehow lock the underlying Map against modification while you hold your Set; but the need to do that does not imply that the structure is not thread-safe.

Should I access (not change) an object from multiple threads?

My situation is that I have two threads. The 1st thread produces a number of objects which the 2nd thread does not have access to until all of them are created. After that the 2nd thread reads fields in those objects but does so concurrently with the 1st. At this point no thread is changing the values of the fields of the objects.
The objects are not synchronized. Should I synchronize them or not?
What I would recommend is to use an AtomicReference<Collection<SomeObject>>. The first thread would produce the collection of objects and do a reference.set(collection). The 2nd thread would see the objects (reference.get()) only after they have been set on the AtomicReference. See the javadocs for AtomicReference. You could also hold your objects in an array or any type of collection such as a List.
It is important to realize that after you set the collection (or array) on the AtomicReference, you cannot make any changes to the collection. You can't add additional items, clear it, etc. If you want true concurrent access to a collection of objects, then you should look into ConcurrentHashMap and friends.
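Roughly, the hand-off could look like the sketch below. The names are hypothetical, and the busy-wait in the reader is only there to keep the example short; a real program would likely use a latch or a queue instead:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class PublishOnce {
    // Returns the total length of the strings the reader saw.
    static int publishAndRead() {
        AtomicReference<List<String>> reference = new AtomicReference<>();
        int[] total = new int[1];

        Thread reader = new Thread(() -> {
            List<String> items;
            // Wait until the producer has published the finished collection.
            while ((items = reference.get()) == null) {
                Thread.yield();
            }
            int length = 0;
            for (String s : items) {
                length += s.length();
            }
            total[0] = length;
        });
        reader.start();

        // Producer: build everything first, then publish with a single set().
        List<String> built = new ArrayList<>();
        built.add("alpha");
        built.add("beta");
        built.add("gamma");
        // Wrapping it as unmodifiable guards against accidental changes after publication.
        reference.set(Collections.unmodifiableList(built));

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead());
    }
}
```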
Should I synchronize them or not?
If the objects are not going to be mutated at all after they are put in your collection then you do not need to make them synchronized.
There's nothing wrong with reading data from multiple threads at the same time. Issues arise when you attempt to modify that data. So long as the objects are fully initialized and the second thread is guaranteed to see the actual values (no issues with caching etc.), there is no problem with reading data from multiple threads concurrently.

Multiple threads iterating over the same map

I was recently writing a concurrent program in Java and came across the following dilemma: suppose you have a global data structure which is part of a regular non-synchronized, non-concurrent library, such as HashMap. Is it OK to allow multiple threads to iterate through the collection (just reading, no modifications), perhaps at different, interleaved periods, i.e. thread1 might be halfway through iterating when thread2 gets its iterator on the same map?
It is OK. The ability to do this is the reason an interface such as Iterator exists. Every thread iterating over the collection has its own iterator instance that holds its state (e.g. where you are now in the iteration).
This allows several threads to iterate over the same collection simultaneously.
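For what it's worth, a quick sketch of this (names invented for the example). The map is filled before any task is submitted, and each task then walks it with its own iterator:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SharedMapIteration {
    static List<Integer> sumValuesInParallel(int tasks) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        // Read-only view: nobody can modify the map through this reference.
        Map<String, Integer> readOnly = Collections.unmodifiableMap(map);

        Callable<Integer> sumTask = () -> {
            int sum = 0;
            // Each call creates a fresh iterator with its own position.
            Iterator<Map.Entry<String, Integer>> it = readOnly.entrySet().iterator();
            while (it.hasNext()) {
                sum += it.next().getValue();
            }
            return sum;
        };

        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Integer> results = new ArrayList<>();
            // Submitting a task happens-before its execution, so tasks see the filled map.
            for (Future<Integer> f : pool.invokeAll(Collections.nCopies(tasks, sumTask))) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumValuesInParallel(3));
    }
}
```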
It should be fine, as long as there are no writers.
This issue is similar to the readers-writer lock, where multiple readers are allowed to read the data, but not while a writer "has" the lock for it. There is no concurrency issue with multiple reads at the same time. [A data race can occur only when there is at least one write.]
Problems only arise when you attempt concurrent modifications on a data structure.
For instance, if one thread is iterating over the content of a Map, and another thread deletes elements from that collection, you'll be heading for serious trouble.
If you do need some threads to modify that collection safely, Java provides mechanisms to do so, namely ConcurrentHashMap.
See also the related question "ConcurrentHashMap in Java?".
There is also Hashtable, which implements the same Map interface as HashMap but is synchronized. Its use is no longer advised (it is a legacy class, though not formally deprecated), since its performance suffers when the number of elements becomes larger (compared to ConcurrentHashMap, which doesn't need to lock the entire collection).
If you happen to have a Map that is not synchronized and you need to have several threads reading and writing on top of it, you can use Collections.synchronizedMap(Map) to get a synchronized version of it.
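One caveat worth showing with the wrapper: individual calls such as put are synchronized for you, but iterating over the map's views still requires locking the map manually, as the Collections.synchronizedMap javadoc points out. A sketch, with made-up names:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SynchronizedMapDemo {
    static int fillAndSum(int threads, int perThread) {
        Map<Integer, Integer> map = Collections.synchronizedMap(new HashMap<>());

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // each thread writes its own key range
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, 1); // each put locks the wrapper internally
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        int sum = 0;
        // Iteration is a multi-step operation, so hold the map's lock for all of it.
        synchronized (map) {
            for (int value : map.values()) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(fillAndSum(4, 500));
    }
}
```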
The above answers are certainly good advice. In general, when writing Java with concurrent threads, so long as you do not modify a data structure, you need not worry about multiple threads concurrently reading that structure.
Should you have a similar problem in the future, except that the global data structure could be concurrently modified, I would suggest writing a Java class that all threads use to access and modify the structure. This class could implement its own concurrency methodology, using either synchronized methods or locks. The Java tutorial has a very good explanation of Java's concurrency mechanisms. I have personally done this and it is fairly straightforward.
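Such a class might look something like the sketch below. GuardedList and its methods are names I made up, and I went with a ReentrantReadWriteLock so that readers don't block each other; plain synchronized methods would work too:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical wrapper: every access to the list goes through this class.
class GuardedList<T> {
    private final List<T> items = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    void add(T item) {
        lock.writeLock().lock(); // exclusive: no readers or other writers
        try {
            items.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    List<T> snapshot() {
        lock.readLock().lock(); // shared: many readers may hold it at once
        try {
            return new ArrayList<>(items); // copy, so callers iterate without the lock
        } finally {
            lock.readLock().unlock();
        }
    }
}

class GuardedListDemo {
    static int fillConcurrently(int threads, int perThread) {
        GuardedList<Integer> list = new GuardedList<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return list.snapshot().size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 250));
    }
}
```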

Returning Java Arrays vs Collections

I'm trying to think through some design in regards to memory allocation and multithreading in a java app and this is what I'm wondering:
I have a class that has a synchronized Collection, say a list, that is getting updated several times a second, but all updates happen within the class and its own thread, not from other threads. However, I have many other threads that call the getCollection() method and do a foreach to iterate over its contents in a read-only fashion. This is what I don't know:
If another thread is iterating the synchronized collection, will the single thread that performs the updates have to wait until a point in time when no other threads are iterating?
My second question: it seems to make sense to return an array copy of the collection, not the collection itself, by calling .toArray. But from a memory point of view, won't that have to allocate a new array the size of the collection's contents every time? If it is called hundreds of times a second on a collection that has several thousand objects in it, I don't know whether that makes sense.
Also, if I never return the collection itself, then is making the list synchronized no longer necessary?
Would appreciate any input. Thanks! - Duncan
if another thread is iterating the synchronized collection will the single thread that performs the updates have to wait until a point in time when no other threads are iterating?
If you're talking about synchronized (not concurrent) collections then yes.
As for the second question, it looks like a real use case for java.util.concurrent.CopyOnWriteArrayList.
I would suggest you use CopyOnWriteArrayList. It is thread-safe and can be read efficiently by any number of threads. Provided you have a small number of updates, this should be fine.
However, to answer your questions: if you iterate over a synchronized collection while it is being modified, you may get a ConcurrentModificationException (COWAL, i.e. CopyOnWriteArrayList, doesn't have this problem). Your update will not be blocked by this; only your readers will have a problem.
Instead of creating a copy each time getCollection is called, you can create a copy each time the collection is modified (far less often). This is what COWAL does for you.
If you return a copy on demand, you will still need to synchronize the collection.
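To illustrate the CopyOnWriteArrayList behaviour mentioned above, here is a tiny single-threaded sketch (names are mine). The list is modified in the middle of a foreach, and the running iterator simply keeps working on the snapshot it started with:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIteration {
    // Returns {elements visited, final list size}.
    static int[] iterateWhileAdding() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c"));
        int visited = 0;
        for (String s : list) {
            // Each add copies the backing array; the iterator above
            // still refers to the array that existed when it was created.
            list.add(s + "!");
            visited++;
        }
        return new int[] {visited, list.size()};
    }

    public static void main(String[] args) {
        int[] result = iterateWhileAdding();
        System.out.println("visited " + result[0] + ", size now " + result[1]);
    }
}
```

The price is one array copy per modification, which is why it suits lists that are read far more often than they are written.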
Probably the easiest way to deal with this is to keep two collections: one that is updated by the class itself, and a read-only copy in a volatile field that is returned when getCollection() is called.
The latter needs to be recreated by the process that updates the main collection when appropriate. This allows you to atomically update your collection: change several elements in one go, while hiding the intermediate states.
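A minimal sketch of that two-collection idea, assuming a single updating thread (SnapshotHolder and update are hypothetical names; getCollection is the method from the question):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SnapshotHolder {
    // Working copy: touched only by the single updating thread.
    private final List<String> working = new ArrayList<>();
    // Read-only copy handed to readers; volatile so a new snapshot is visible at once.
    private volatile List<String> published = Collections.emptyList();

    // Assumed to be called only from the owning (updating) thread.
    void update(String... newItems) {
        working.addAll(Arrays.asList(newItems)); // several changes ...
        // ... become visible to readers in a single step.
        published = Collections.unmodifiableList(new ArrayList<>(working));
    }

    // Safe to call from any thread; the returned list never changes.
    List<String> getCollection() {
        return published;
    }

    public static void main(String[] args) {
        SnapshotHolder holder = new SnapshotHolder();
        holder.update("a", "b");
        System.out.println(holder.getCollection());
    }
}
```

Readers that grabbed an older snapshot keep iterating over it undisturbed while the next one is being prepared.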
If your updates are infrequent and every update leaves the collection in a consistent state, then use the CopyOnWriteArrayList already suggested.
It seems that the collection is being updated frequently and #getCollection() is being called frequently. You could use CopyOnWriteArrayList, but you'll be creating a copy every time you modify the array. So you'll need to see how this affects performance.
Another option is to task the thread within the class with making a copy every time #getCollection is called. This will involve #getCollection waiting for the internal class thread to complete.
If you just want #getCollection to return a recent copy and not the most up-to-date copy, then you can have the internal thread periodically create a copy of the collection that gets returned by #getCollection. The reference to the copy will need to be volatile or held in an AtomicReference.
