Is KeySet iterator of ConcurrentHashMap is threadsafe? - java

I just trying to explore What is ThreadSafe mean?
Below are my understanding:
It looks like for me; allowing multiple threads to access a collection at the same time; this is irrespective of its synchronization. For example any method without synchronized keyword on it; is thread safe, means mutiple threads can access it.
It is up to a developer choice to maintain some more logic (synchronization) on this method to maintain data integrity while multi-threads are accessing it. That is separate from thread safe.
If my above statement is false; just read the below JAVA DOC for `ConcurrentHashMap:
keySet: The view's iterator is a "weakly consistent" iterator that will never throw
ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.
The above statement says keySet iterator will not guarantee the data integrity; while multi-threads are modifying the collection.
Could you please answer me, *Is KeySet iterator of ConcurrentHashMap is threadsafe?
And my understanding on thread safe is correct ??

keySet: The view's iterator is a "weakly consistent" iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction
This itself explains, that KeySet iterator of ConcurrentHashMap is threadsafe.

General idea behind the java.util.concurrent package is providing a set of data structures that provide thread-safe access without strong consistency. This way these objects achieve higher concurrency then properly locked objects.
Being thread safe means that, even without any explicit synchronization you never corrupt the objects. In HashTable and HashMap some methods are potential problems for multi-thread access, such as remove method, that first checks that the element exists, then removes it. These kind of methods are implemented as atomic operations in ConcurrentHashMap, thus you do not need to afraid that you will lose some data.
However it does not mean that this class is automatically locked for each operation. High level operations such as putAll and iterators are not synchronized. The class does not provide strong consistency. The order and timing of your operations are guaranteed to not to corrupt the object, but are not guaranteed to generate accurate results.
For example if your print the object concurrently with a call to putAll, you might see a partially populated output. Using an iterator concurrently with new insertions also might not reflect all insertions as you quoted.
This is different from being thread safe. Even though the results might surprise you, you are assured that nothing is lost or accidentally overwritten, elements are added to and removed from your object without any problem. If this behaviour is sufficient for your requirements you are advised to use java.util.concurrent classes. If you need more consistency, then you need to use synchronized classes from java.util or use synchronization yourself.

By your definition the Set returned by ConcurrentHashMap.keySet() is thread safe.
However, it may act in very strange ways, as pointed out in the quote you included.
As a Set, entries may appear and/or disappear at random. I.e. if you call contains twice on the same object, the two results may differ.
As an Iterable you could begin two iterations of its underlying objects in two different threads and discover that the two iterations enumerate different entries.
Furthermre, contains and iteration may not match either.
This activity will not occur, however, if you somehow lock the underlying Map from modification while you have hold of your Set but the need to do that does not imply that the structure is not thread safe.

Related

Why are we here, specifically, saying that ArrayList is not thread safe?

Description: If we use same object reference among multiple threads, no object is thread safe. Similarly, if any collection reference is shared among multiple threads then that collection is not thread-safe since other threads can access it. So, Why are we here specifically saying that ArrayList is not thread-safe? What about the other Collections?
You misunderstand the meaning of "thread-safe."
When we say "class X is thread-safe," We are not saying that you don't have to worry about the thread-safety of a program that uses it. If you build a program using nothing but thread-safe objects, that does not guarantee that your program will be thread-safe.
So what does it guarantee?
Suppose you have a List. Suppose that two threads, A and B, each write different values to the same index in the list, suppose that some thread C reads from that index, and suppose that none of those three threads uses any synchronization.
If the list is "thread-safe," then you can be assured that thread C will get one of three possible values:
The value that thread A wrote,
The value that thread B wrote,
The value that was stored at that index before either thread A or thread B wrote.
If the list is not thread-safe, then any of those same three things could happen, but also, other things could happen:
Thread C could get a value that was never in the list,
The list could behave in broken ways in the future for thread C even if no other thread continues to use it,
The program could crash,
etc. (I don't know how many other strange things could happen.)
When we say that a class is "thread-safe" we are saying that it will always behave in predictable, reasonable ways, even when its methods are concurrently called by multiple threads.
If you write a program that uses a "thread-safe" list, and if it depends on thread C reading one particular value of the three possibilities that I listed above, then your program has a thread-safety problem, even though the list itself does not.
I haven't checked but I think that all standard Collection implementations state if they are thread-safe or not. So you know if you can share that collection among different threads without synchronization.
CopyOnWriteArrayList for example is a thread-safe List implementation.
ArrayList is unsynchronized in implementation. When an object is unsynchronized it means that is is not locked while being modified structurally. A structural modification is any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.
What you are referring to is an array which the elements are being added to or being deleted from and can be modified this differs from it having its value being set.
Reference is in regards with the pointer of the start of the array but how many elements are there is in question and having an unsynchronized object being modified in the sense of elements while the elements are being iterated over by another thread the integrity of the elements in the list is hard to guarantee. I hope I was able to convey the message plainly.
Look for more details here in Oracle: Array List and ConcurrentModificationException
ArrayList:
Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be "wrapped" using the Collections.synchronizedList method.
ConcurrentModificationException:
Note that fail-fast behavior cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification.

Are Synchronized Blocks needed for Blocking Queues

public BlockingQueue<Message> Queue;
Queue = new LinkedBlockingQueue<>();
I know if I use, say a synchronized List, I need to surround it in synchronized blocks to safely use it across threads
Is that the same for Blocking Queues?
No you do not need to surround with synchronized blocks.
From the JDK javadocs...
BlockingQueue implementations are thread-safe. All queuing methods achieve their effects atomically using internal locks or other forms of concurrency control. However, the bulk Collection operations addAll, containsAll, retainAll and removeAll are not necessarily performed atomically unless specified otherwise in an implementation. So it is possible, for example, for addAll(c) to fail (throwing an exception) after adding only some of the elements in c.
Just want to point out that from my experience the classes in the java.util.concurrent package of the JDK do not need synchronization blocks. Those classes manage the concurrency for you and are typically thread-safe. Whether intentional or not, seems like the java.util.concurrent has superseded the need to use synchronization blocks in modern Java code.
Depends on use case, will explain 2 scenarios where you may need synchronized blocks or dont need it.
Case 1: Not required while using queuing methods e.g. put, take etc.
Why not required is explained here, important line is below:
BlockingQueue implementations are thread-safe. All queuing methods
achieve their effects atomically using internal locks or other forms
of concurrency control.
Case 2: Required while iterating over blocking queues and most concurrent collections
Since iterator (one example from comments) is weakly consistent, meaning it reflects some but not necessarily all of the changes that have been made to its backing collection since it was created. So if you care about reflecting all changes you need to use synchronized blocks/ Locks while iterating.
You are thinking about synchronization at too low a level. It doesn't have anything to do with what classes you use. It's about protecting data and objects that are shared between threads.
If one thread is able to modify any single data object or group of related data objects while other threads are able to look at or modify the same object(s) at the same time, then you probably need synchronization. The reason is, it often is not possible for one thread to modify data in a meaningful way without temporarily putting the data into an invalid state.
The purpose of synchronization is to prevent other threads from seeing the invalid state and possibly doing bad things to the same data or to other data as a result.
Java's Collections.synchronizedList(...) gives you a way for two or more threads to share a List in such a way that the list itself is safe from being corrupted by the action of the different threads. But, It does not offer any protection for the data objects that are in the List. If your application needs that protection, then it's up to you to supply it.
If you need the equivalent protection for a queue, you can use any of the several classes that implement java.util.concurrent.BlockingQueue. But beware! The same caveat applies. The queue itself will be protected from corruption, but the protection does not automatically extend to the objects that your threads pass through the queue.

Is ConcurrentSkipListMap put method thread-safe?

Recently while exploring ConcurrentSkipListMap I went through its implementation and found that its put method is not thread-safe. It internally calls doPut which actually adds the item. But I found that this method does not use any kind of lock similar to ConcurrentHashMap.
Therefore, I want to know whether add is thread-safe or not. Looking at the method it seems that it is not thread-safe--that is if this method is executed by two threads simultaneously then a problem may occur.
I know ConcurrentSkipListMap internally uses a skiplist data structure but I was expecting add method to be thread safe. Am I understanding anything wrong ? Is ConcurrentSkipListMap really not thread-safe ?
Just because it doesn't use a Lock doesn't make it thread unsafe. The Skip list structure can be implemented lock free.
You should read the API carefully.
... Insertion, removal, update, and access operations safely execute concurrently by multiple threads. Iterators are weakly consistent, returning elements reflecting the state of the map at some point at or since the creation of the iterator. They do not throw ConcurrentModificationException, and may proceed concurrently with other operations. ...
The comments in the implementation say:
Given the use of tree-like index nodes, you might wonder why this
doesn't use some kind of search tree instead, which would support
somewhat faster search operations. The reason is that there are no
known efficient lock-free insertion and deletion algorithms for search
trees. The immutability of the "down" links of index nodes (as opposed
to mutable "left" fields in true trees) makes this tractable using
only CAS operations.
So they use some low level programming features with compare-and-swap operations to make changes to the map atomic. With this they ensure thread safety without the need to synchronize access.
You can read it in more detail in the source code.
We should trust Java API. And this is what java.util.concurrent package docs says:
Concurrent Collections
Besides Queues, this package supplies Collection implementations designed for use in multithreaded contexts: ConcurrentHashMap, ConcurrentSkipListMap, ConcurrentSkipListSet, CopyOnWriteArrayList, and CopyOnWriteArraySet.

One concurrent collection inside another: is it thread safe

[Question]: Is it thread safe to use ConcurrentHashMap<Object, ConcurrentHashMap<Object, Object>> or not.
[Optional to answer]: Also what about another concurrent maps types? And what about concurrent collections?
P.S. I'm asking only about java.util.concurrent package.
Specific Usage Context:
//we have
ConcurrentHashMap<Object, ConcurrentHashMap<Object, Object>> map = new ConcurrentHashMap<Object, ConcurrentHashMap<Object, Object>>();
//each string can be executed separately and concurently
ConcurrentHashMap<Object, Object> subMap = new ConcurrentHashMap<Object, Object>()
map.put(key, subMap);
map.remove(key);
map.get(key);
map.get(key).put(key, ref);
map.get(key).remove(key);
Maybe my solution lays around Guava HashBasedTable?
You can't define thread safety without the specific context in which you plan to use your collections.
The concurrent collections you have named are thread-safe on their own in the sense that their internal invariants will not be broken by concurrent access; however that's just one bullet point on the thread safety checklist.
If you perform anything more than a single operation on your structure, which must be atomic as a whole, then you will not get thread safety just by using these classes. You will have to resort to classic locking, or some quite elaborate, and usually unmotivated, lock-free updating scheme.
Using the examples from your question, consider the following.
Thread 1 executes
map.get(mapKey).put(key, value);
At the same time, Thread 2 executes
map.remove(mapKey);
What is the outcome? Thread 1 may be putting something to a map which has already been removed, or it may even get a null result from get. In most cases more coordination will be needed for correctness.
Concurrent Collections means multiple thread could perform add/remove operation on collection same time, No it is not thread safe
More Detail:
for further please read
What's the difference between ConcurrentHashMap and Collections.synchronizedMap(Map)?
Is ConcurrentHashMap totally safe?
The concurrent collections are thread safe for reads; but you must expect ConcurrentModificationException in case of competing concurrent updates or when modifying a Collection while another thread is iterating over it.
this is what the javadoc of ConcurrentHashMap says:
However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access
So, they ARE thread-safe in terms of modifying it.
UPDATE
same javadoc http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html says:
Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove). Retrievals reflect the results of the most recently completed update operations holding upon their onset. For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries. Similarly, Iterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw ConcurrentModificationException. However, iterators are designed to be used by only one thread at a time.
In general the classes which are part of java.util.concurrent provide additional performance at the (potential) penalty of additional coding complexity.
The issue that I see with nesting ConcurrentMap instances is managing the populating the outer map with values at given keys. If all the keys are known upfront and values placed in the map in some sort of initialization phase, there are no issues (but you also likely would not need to have the outer map be a ConcurrentMap). If you need to be able to insert new maps into the outer map as you go, the work becomes a bit more complicated. When creating a new map to insert into the outer map, you would need to use the putIfAbsentmethod[1] and pay attention to the returned value to determine what instance to add data to.
[1] - http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentMap.html#putIfAbsent(K,%20V)

Can Java LinkedList be read in multiple-threads safely?

I would like multiple threads to iterate through the elements in a LinkedList. I do not need to write into the LinkedList. Is it safe to do so? Or do I need a synchronized list to make it work?
Thank you!
They can do this safely, PROVIDED THAT:
they synchronize with (all of) the threads that have written the list BEFORE they start the iterations, and
no threads modify the list during the iterations.
The first point is necessary, because unless there is proper synchronization before you start, there is a possibility that one of the "writing" threads has unflushed changes for the list data structures in local cache memory or registers, or one of the reading threads has stale list state in its cache or registers.
(This is one of those cases where a solid understanding of the Java memory model is needed to know whether the scenario is truly thread-safe.)
Or do I need a synchronized list to make it work
You don't necessarily need to go that far. All you need to do is to ensure that there is a "happens-before" relationship at the appropriate point, and there are a variety of ways to achieve that. For instance, if the list is created and written by the writer thread, and the writer then passes the list to the reader thread objects before calling start() on them.
From the Java documentation:
Note that this implementation is not synchronized. If multiple threads access a linked list concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be "wrapped" using the Collections.
In other words, if you are truly just iterating through then you're alright, just be careful.

Categories

Resources