I am not sure how to properly use the Collections.synchronizedList() implementation.
I have these two:
public synchronized static List<CurrencyBox> getOrderList() {
return Collections.synchronizedList(orderList);
}
and
public static List<CurrencyBox> getOrderList() {
return Collections.synchronizedList(orderList);
}
So as far as I understood, synchronizedList really returns the orderList and not a copy, correct?
So if I want to guarantee atomic operations, like add and remove, which of the implementations above is correct?
And does anything change with Java 9? Is this still the way to go, or do you have any other suggestion?
Thank you
Without context it's a bit hard to tell, but neither of the snippets provided gives you guaranteed atomic operations.
The documentation states:
Returns a synchronized (thread-safe) list backed by the specified
list. In order to guarantee serial access, it is critical that all
access to the backing list is accomplished through the returned list.
So even if you synchronize the method, the best you'll get is a guarantee that no two threads are creating the synchronized wrapper at the same time. Each call still returns a new wrapper with its own lock, and orderList itself stays reachable without any lock.
You need to wrap the original list with Collections.synchronizedList once, up front, and return that stored wrapper each time.
private static List<CurrencyBox> orderList = Collections.synchronizedList(new ArrayList<CurrencyBox>());
public static List<CurrencyBox> getOrderList() {
return orderList;
}
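To make the "wrap once, share the wrapper" advice concrete, here is a minimal sketch. The class name SharedWrapperDemo and the helper fillConcurrently are made up for illustration, and Integer stands in for CurrencyBox so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SharedWrapperDemo {
    // Wrapped exactly once: every caller gets this same wrapper and so
    // contends on the same lock.
    private static final List<Integer> ORDER_LIST =
            Collections.synchronizedList(new ArrayList<>());

    static List<Integer> getOrderList() {
        return ORDER_LIST;
    }

    // Hypothetical helper: starts `threads` writers that each add
    // `perThread` elements through the shared wrapper, then waits for them.
    static void fillConcurrently(int threads, int perThread) {
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    getOrderList().add(i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        fillConcurrently(4, 10_000);
        // Every add went through the same lock, so none is lost.
        System.out.println(getOrderList().size());
    }
}
```

If getOrderList() wrapped the backing list anew on each call instead, every caller would lock a different wrapper object and the adds could interleave inside the unsynchronized ArrayList.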
A synchronized list only synchronizes the individual methods of the list.
This means a thread cannot modify the list while another thread is running one of its methods. The list is locked for the duration of each method call.
As an example, let's say two threads call addAll on your list, each with a different list as the argument (A = A1, A2, A3 and B = B1, B2, B3).
Because the method is synchronized, you can be sure the two lists won't be interleaved at random, like A1, B1, A2, A3, B2, B3.
You don't decide when one thread hands over to the other, so you can get either A1, A2, A3, B1, B2, B3 or B1, B2, B3, A1, A2, A3.
Credit : jhamon
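The addAll scenario above can be tried out with a small sketch like the following. The class and method names (AddAllDemo, mergeConcurrently) are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class AddAllDemo {
    // Two threads each call addAll once on the same synchronized list.
    static List<String> mergeConcurrently() {
        List<String> shared = Collections.synchronizedList(new ArrayList<>());
        Thread a = new Thread(() -> shared.addAll(Arrays.asList("A1", "A2", "A3")));
        Thread b = new Thread(() -> shared.addAll(Arrays.asList("B1", "B2", "B3")));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Both writers are done, so copying here needs no extra locking.
        return new ArrayList<>(shared);
    }

    public static void main(String[] args) {
        // Either all A's followed by all B's, or the other way round.
        System.out.println(mergeConcurrently());
    }
}
```

Which of the two orders you get depends on thread scheduling, but each addAll runs as one block.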
How do I lock a data structure (such as List) when someone is iterating over it?
For example, let's say I have this class with a list in it:
class A{
private List<Integer> list = new ArrayList<>();
public A() {
// initialize this.list
}
public List<Integer> getList() {
return list;
}
}
And I run this code:
public static void main(String[] args) {
A a = new A();
Thread t1 = new Thread(()->{
a.getList().forEach(System.out::println);
});
Thread t2 = new Thread(()->{
a.getList().removeIf(e->e==1);
});
t1.start();
t2.start();
}
I don't have a single block of code that uses the list, so I can't use synchronized().
I was thinking of locking the getList() method after it has been called but how can I know if the caller has finished using it so I could unlock it?
And I don't want to use CopyOnWriteArrayList because I care about performance.
after it has been called but how can I know if the caller has finished using it so I could unlock it?
That's impossible. The iterator API fundamentally doesn't require that you explicitly 'close' them, so, this is simply not something you can make happen. You have a problem here:
Iterating over the same list from multiple threads is a problem if anybody modifies that list in the meantime. Actually, threads are immaterial: if you modify a list and then interact with an iterator created before the modification, you get a ConcurrentModificationException as soon as that iterator is advanced. Involve threads, and you merely usually get a CoModEx; you may get bizarre behaviour if you haven't set up your locking properly.
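The single-threaded case is easy to reproduce. The sketch below uses a made-up class name (StaleIteratorDemo) and a plain ArrayList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class StaleIteratorDemo {
    // Returns true if advancing an iterator that was created before a
    // structural modification fails fast.
    static boolean staleIteratorThrows() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        Iterator<Integer> it = list.iterator(); // created before the change
        list.add(4);                            // structural modification
        try {
            it.next();                          // the iterator notices here
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(staleIteratorThrows()); // prints "true"
    }
}
```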
Your chosen solution is "I shall lock the list.. but how do I do that? Better ask SO". But that's not the correct solution.
You have a few options:
Use a lock
It's not specifically the iteration that you need to lock, it's "whatever interacts with this list". Make an actual lock object, and define that any interaction of any kind with this list must occur in the confines of this lock.
Thread t1 = new Thread(() -> {
a.acquireLock();
try {
a.getList().forEach(System.out::println);
} finally {
a.releaseLock();
}
});
t1.start();
Where acquireLock and releaseLock are methods you write that use a ReadWriteLock to do their thing.
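One possible shape for such lock methods is sketched below. It is an assumption, not the only way to do it: the class name LockedHolder is invented, and acquireLock/releaseLock are split into hypothetical read and write variants because a ReadWriteLock distinguishes the two.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockedHolder {
    private final List<Integer> list = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Several readers may hold the read lock at once;
    // the write lock is exclusive.
    void acquireReadLock()  { lock.readLock().lock(); }
    void releaseReadLock()  { lock.readLock().unlock(); }
    void acquireWriteLock() { lock.writeLock().lock(); }
    void releaseWriteLock() { lock.writeLock().unlock(); }

    // Callers must hold one of the locks while they touch the list.
    List<Integer> getList() { return list; }

    static int demo() {
        LockedHolder a = new LockedHolder();
        a.acquireWriteLock();
        try {
            a.getList().add(1);
            a.getList().add(2);
            a.getList().removeIf(e -> e == 1);
        } finally {
            a.releaseWriteLock();
        }
        int sum = 0;
        a.acquireReadLock();
        try {
            for (int e : a.getList()) {
                sum += e;
            }
        } finally {
            a.releaseReadLock();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```

The scheme only works if every piece of code that touches the list follows the convention; nothing in the compiler enforces it.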
Use CopyOnWriteArrayList
COWList is an implementation of java.util.List with the property that it copies the backing store anytime you change anything about it. This has the benefit that any iterator you made is guaranteed to never throw ConcurrentModificationException: When you start iterating over it, you will end up iterating each value that was there as the list was when you began the iteration. Even if your code, or any other thread, starts modifying that list halfway through. The downside is, of course, that it is making lots of copies if you make lots of modifications, so this is not a good idea if the list is large and you're modifying it a lot.
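The snapshot behaviour can be seen even in a single thread. In this sketch (class and method names are illustrative only) the loop modifies the list it is iterating over, which would fail with an ArrayList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CowIterationDemo {
    // Iterates over the list while appending to it, and returns the
    // elements the iteration actually visited.
    static List<Integer> iterateWhileModifying(List<Integer> list) {
        List<Integer> seen = new ArrayList<>();
        for (Integer value : list) {
            seen.add(value);
            list.add(value * 10); // copies the backing array; the iterator is unaffected
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> list = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(iterateWhileModifying(list)); // [1, 2, 3]
        System.out.println(list);                        // [1, 2, 3, 10, 20, 30]
    }
}
```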
Get rid of the getList() method, move the tasks into the object itself.
I don't know what a is (the object you call .getList() on), but apparently one of the things it should expose is some job that you really can't do with a getList() call. It's not just that you want the contents: you want to get the contents in a stable fashion (perhaps the class should instead have a method that gives you a copy of the list), or perhaps you want to do a thing to each element inside it (e.g. instead of getting the list and calling .forEach(System.out::println) on it, pass System.out::println to a and let it do the work). You can then focus your locks or other solutions to avoid clashes in that code, and not in callers of a.
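A rough sketch of that idea, assuming plain synchronized methods are good enough for the workload. NumberStore and its method names are hypothetical; the point is only that the list never leaves the object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

class NumberStore {
    // Never handed out, so all access goes through the methods below.
    private final List<Integer> list = new ArrayList<>();

    synchronized void add(int value) {
        list.add(value);
    }

    // The whole iteration runs while the lock is held.
    synchronized void forEach(Consumer<? super Integer> action) {
        list.forEach(action);
    }

    synchronized boolean removeIf(Predicate<? super Integer> filter) {
        return list.removeIf(filter);
    }

    // Hands out a copy, made under the lock, instead of the live list.
    synchronized List<Integer> snapshot() {
        return new ArrayList<>(list);
    }

    public static void main(String[] args) {
        NumberStore store = new NumberStore();
        store.add(1);
        store.add(2);
        store.add(3);
        Thread printer = new Thread(() -> store.forEach(System.out::println));
        Thread remover = new Thread(() -> store.removeIf(e -> e == 1));
        printer.start();
        remover.start();
    }
}
```

The action passed to forEach should not call back into the store to modify it, since it runs while the list is being iterated.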
Make a copy yourself
This doesn't actually work, even though it seems like it: Immediately clone the list after you receive it. This doesn't work, because cloning the list is itself an operation that iterates, just like .forEach(System.out::println) does, so if another thread interacts with the list while you are making your clone, it fails. Use one of the above 3 solutions instead.
For example, in the code below, we have to wrap list in a synchronized block when doing the iteration. Does the Collections.synchronizedList make the list synchronized? Why do we do this if it doesn't provide any convenience? Thanks!
List<Integer> list = Collections.synchronizedList( new ArrayList<>(Arrays.asList(4,3,52)));
synchronized(list) {
for(int data: list)
System.out.print(data+" ");
}
See https://docs.oracle.com/javase/tutorial/collections/implementations/wrapper.html
The reason is that iteration is accomplished via multiple calls into the collection, which must be composed into a single atomic operation.
Also see https://www.baeldung.com/java-synchronized-collections
Why do we do this if it doesn't provide any convenience
That it does not help you when iterating is not the same as providing no convenience.
All of the methods - get, size, set, isEmpty etc - are synchronized. This means that they have visibility of all writes made in any thread.
Without the synchronization, there is no guarantee that updates made in one thread are visible to any other threads, so one thread might see a size of 5 while another sees a size of 6, for example.
The mechanism for making the list synchronized is to make its methods synchronized: this effectively means that the body of each method is wrapped in a synchronized (this) { ... } block.
The iterator() method cannot be covered in the same way. Any lock taken inside iterator() would be released when iterator() returns, not when you finish iterating, and the JDK wrapper in fact returns the backing list's iterator without locking at all. It's a fundamental limitation of the way the language is designed.
So you have to help the language by adding the synchronized block yourself.
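As an illustration, the sketch below iterates over a synchronized list while another thread keeps adding to it. The names (GuardedIterationDemo, sum, demo) are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class GuardedIterationDemo {
    // Holds the wrapper's own lock for the whole loop, so no add()
    // can slip in between two steps of the iteration.
    static long sum(List<Integer> syncList) {
        long total = 0;
        synchronized (syncList) {
            for (int value : syncList) {
                total += value;
            }
        }
        return total;
    }

    static long demo() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                list.add(1);
            }
        });
        writer.start();
        for (int i = 0; i < 100; i++) {
            sum(list); // safe even though the writer is still running
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum(list);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 10000
    }
}
```

Synchronizing on the wrapper works because the wrapper's methods lock on the wrapper itself; locking the backing ArrayList instead would not exclude them.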
The wrapper is used to synchronize the addition and removal of elements in the wrapped collection.
The Javadoc mentions that iteration is not synchronized and that you need to synchronize it yourself.
* It is imperative that the user manually synchronize on the returned
* list when iterating over it
But the other access operations are thread-safe and also establish a happens-before relation (since they use synchronized).
The Collections.synchronizedList method synchronizes methods like add and remove. However, it does not synchronize an iteration started via the iterator() method. Consider the following scenario:
Thread 1 is iterating through the list
Thread 2 is adding an element into it
In this case you are likely to get a ConcurrentModificationException, and hence it's imperative to synchronize the code that iterates.
In thread A, an ArrayList is created. It is managed from thread A only.
In thread B, I want to copy that to a new instance.
The requirement is that copyList should not fail and should return a consistent version of the list (= existed at some time at least during the copying process).
My approach is this:
public static <T> ArrayList<T> copyList(ArrayList<? extends T> list) {
List<? extends T> unmodifiableList = Collections.unmodifiableList(list);
return new ArrayList<T>(unmodifiableList);
}
Q1: Does that satisfy the requirements?
Q2: How can I do the same without Collections.unmodifiableList, probably with iterators and try-catch blocks?
UPD. That is an interview question I was asked a year ago. I understand it is a bad idea to use non-thread-safe collections like ArrayList in a multithreaded environment.
No. ArrayList is not thread-safe and you are not using any explicit synchronization.
While your method is running, the first thread can modify the original list, so the unmodifiable view (and the copy made from it) may not be valid.
The simplest way I think is the following:
Replace the list with a synchronized version of it.
In copyList, synchronize on the list and make the copy.
For example, something like:
List<T> l = Collections.synchronizedList(new ArrayList<T>());
...
public static <T> List<T> copyList(List<? extends T> list) {
List<T> copyList = null;
synchronized(list) {
copyList = new ArrayList<T>(list);
}
return copyList;
}
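To see that this yields a consistent copy, the sketch below copies the list while a writer thread is still appending. ConsistentCopyDemo and copyWhileWriting are hypothetical names; copyList follows the answer above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ConsistentCopyDemo {
    // Copies under the same lock that the synchronized wrapper uses
    // for add(), so the copy reflects one moment in time.
    static <T> List<T> copyList(List<? extends T> list) {
        synchronized (list) {
            return new ArrayList<T>(list);
        }
    }

    // A writer appends 0, 1, 2, ... while this thread takes a copy.
    static List<Integer> copyWhileWriting() {
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>());
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 50_000; i++) {
                shared.add(i);
            }
        });
        writer.start();
        List<Integer> copy = copyList(shared);
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return copy;
    }

    public static void main(String[] args) {
        // Some prefix 0..k-1 of the final list; k depends on timing.
        System.out.println(copyWhileWriting().size());
    }
}
```

This relies on thread A writing through the synchronized wrapper as well; if it keeps using the raw ArrayList, the lock protects nothing.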
You should synchronize access to the ArrayList, or replace ArrayList with a concurrent collection like CopyOnWriteArrayList.
Without doing that you might end up with a copy that is inconsistent.
There is absolutely no way to create a consistent copy of a plain ArrayList if the "owning" thread does not offer some protocol to do so.
Without any protocol, thread A can modify the list potentially at any time, meaning thread B never gets a chance to ensure that it sees a consistent state of the list.
To actually allow a consistent copy to be made, thread A must ensure that any modifications it has made are written to memory and are visible to other threads.
Normally, the VM is allowed to reorder instructions, reads and writes as it sees fit, provided no difference can be observed from within the thread executing the program. This includes, for example, delaying writes by holding values in CPU registers or on the local stack.
The only way to ensure that everything is consistently written to main memory is for thread A to execute an instruction that presents a reordering barrier to the VM (e.g. a synchronized block or a volatile field access).
So without some cooperation from thread A, there is no way to ensure above conditions are guaranteed to be fulfilled.
Common ways of solving this are to synchronize access to the List by only using it in a safely wrapped form (Collections.synchronizedList), or to use a List implementation that has these guarantees built in (any concurrent list implementation).
The javadoc for Collections.unmodifiableList(...) says, "Returns an unmodifiable view of the specified list."
The key word there is "view". That means it does not copy the data. All it does is create a wrapper for the given list with mutators that all throw exceptions rather than modify the base list.
Yes, but I actually create new ArrayList(Collections.unmodif...), wouldn't this work?
Oops! I missed that. If you're going to copy the list, then there's no point in calling unmodifiableList(). The only code that will ever access the unmodifiable view is the code that's right there in the same method where it's created. You don't have to worry about that code modifying the list contents because you wrote it.
On the other hand, if you're going to copy the list when other threads could be updating the list, then you're going to need synchronization all around. Every place where code could update the list needs to be in a synchronized block, as does the code that makes the copy. Of course, all of those synchronized blocks must synchronize on the same object.
Some programmers will use the list object itself as the lock object. Others will prefer to use a separate object.
Q1: Does that satisfy the requirements?
If the provided list is modified while it is being copied with new ArrayList<T>(unmodifiableList), the copy may fail (for example with a ConcurrentModificationException) or come out inconsistent, even though you wrapped the list using Collections.unmodifiableList. The unmodifiable wrapper simply delegates to the wrapped list, and since that is a non-thread-safe list, the wrapper adds no protection.
What you could do is indeed use CopyOnWriteArrayList instead, as it is a thread-safe list implementation that provides consistent snapshots of the list when you iterate over it. Another way would be to have thread A regularly publish a safe copy to the other threads, created with new ArrayList<T>(myList). Since thread A is the only thread that modifies the list, we know that no other thread will modify it while the copy is being created, so this is safe.
Q2: How can I do the same without Collections.unmodifiableList with
probably iterators and try-catch blocks?
As mentioned above, Collections.unmodifiableList does not help to make this thread-safe. For me the only thing that could make sense is actually the opposite: thread A (the only thread that can modify the list) creates a safe copy of the ArrayList using new ArrayList<T>(list), then hands the other threads an unmodifiable view of that copy using Collections.unmodifiableList(copy).
Generally speaking, you should avoid naming implementations in your method signatures, especially public ones. Use interfaces or abstract classes instead; otherwise you expose implementation details to the users of your API, which is not expected. So here it should be List or Collection, not ArrayList.
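The "owning thread publishes a copy" idea could look roughly like this. Everything here is an assumption made for illustration: the class name SnapshotPublisher, publishing on every add, and the use of a volatile field as the hand-over point.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SnapshotPublisher {
    // Touched only by the owning thread (thread A).
    private final List<String> working = new ArrayList<>();
    // Written by thread A, read by anyone; volatile makes each new
    // snapshot visible to the other threads.
    private volatile List<String> published = Collections.emptyList();

    // Must be called by the owning thread only.
    void add(String item) {
        working.add(item);
        // The copy is made by the only thread that modifies `working`,
        // so nothing can change it halfway through.
        published = Collections.unmodifiableList(new ArrayList<>(working));
    }

    // Safe to call from any thread; the returned list never changes.
    List<String> latest() {
        return published;
    }

    public static void main(String[] args) {
        SnapshotPublisher publisher = new SnapshotPublisher();
        publisher.add("a");
        publisher.add("b");
        System.out.println(publisher.latest()); // [a, b]
    }
}
```

Copying on every add is the simplest variant; publishing less often trades freshness for fewer copies.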
I have a List read (iterated through) many times and by multiple threads but updated rarely (reads are more than 50,000 times more numerous). EDIT: in fact, an array would suffice in this case, instead of a List.
When the list is updated, it's simply replaced with a different version (there are no add() or remove() calls).
A CopyOnWriteArrayList avoids the disadvantages of a synchronized list, but I'm not sure that setting the list to the new value is atomic. I read this question as well.
To show some code. Consider the following as an attribute of a singleton Spring bean.
List<MyObject> myList; //the list read many times and concurrently.
//called by many threads
public void doStuff(){
for (MyObject mo : myList){
//do something
}
}
//called rarely. It's synchronized to prevent concurrent updates
//but my question is about thread-safety with regards to readers
public synchronized void updateList(List<MyObject> newList){ // newList is a CopyOnWriteArrayList<>();
myList = newList; //is this statement executed atomically, and is it thread-safe for readers?
}
Do I need to use a ReadWriteLock to achieve a thread-safe set?
The need for a ReadWriteLock depends on what you need to achieve.
If you want to ensure that the reference is updated atomically and that readers see the new value, you can use AtomicReference (or, in your case, it is enough to mark the reference as volatile). However, if your goal is that the updater thread should wait until all reading threads have finished iterating over the old list before it updates the reference, then ReadWriteLock is the way to go.
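A minimal sketch of the AtomicReference variant follows. ListHolder and totalLength are made-up names, String stands in for MyObject, and wrapping the new list in an unmodifiable copy is an extra precaution rather than something the question requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

class ListHolder {
    private final AtomicReference<List<String>> myList =
            new AtomicReference<>(Collections.<String>emptyList());

    // Called by many threads. The reference is read once, so the loop
    // keeps working on the same list even if it is replaced meanwhile.
    int totalLength() {
        List<String> current = myList.get();
        int total = 0;
        for (String s : current) {
            total += s.length();
        }
        return total;
    }

    // Called rarely; synchronized only to serialize concurrent updaters.
    synchronized void updateList(List<String> newList) {
        myList.set(Collections.unmodifiableList(new ArrayList<>(newList)));
    }

    public static void main(String[] args) {
        ListHolder holder = new ListHolder();
        holder.updateList(Arrays.asList("a", "bb"));
        System.out.println(holder.totalLength()); // prints 3
    }
}
```

A reader that started before the swap simply finishes on the old list; that is the behaviour a ReadWriteLock would prevent, at the cost of blocking.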
Updated the question. Please check the second part of the question.
I need to build up a master list of book ids. I have multiple threaded tasks which brings up a subset of book ids. As soon as each task execution is completed, I need to add them to the super list of book ids. Hence I am planning to pass below aggregator class instance to all of my execution tasks and have them call the updateBookIds() method. To ensure it's thread safe, I have kept the addAll code in synchronized block.
Can any one suggest is this same as Synchronized list? Can I just say Collections.newSynchronizedList and call addAll to that list from all thread tasks? Please clarify.
public class SynchronizedBookIdsAggregator {
private List<String> bookIds;
public SynchronizedBookIdsAggregator(){
bookIds = new ArrayList<String>();
}
public void updateBookIds(List<String> ids){
synchronized (this) {
bookIds.addAll(ids);
}
}
public List<String> getBookIds() {
return bookIds;
}
public void setBookIds(List<String> bookIds) {
this.bookIds = bookIds;
}
}
Thanks,
Harish
Second Approach
After the discussion below, I am currently planning to go with the approach below. Please let me know if I am doing anything wrong here:
public class BooksManager{
private static Logger logger = LoggerFactory.getLogger();
private List<String> fetchMasterListOfBookIds(){
List<String> masterBookIds = Collections.synchronizedList(new ArrayList<String>());
List<String> libraryCodes = getAllLibraries();
ExecutorService libraryBookIdsExecutor = Executors.newFixedThreadPool(BookManagerConstants.LIBRARY_BOOK_IDS_EXECUTOR_POOL_SIZE);
for(String libraryCode : libraryCodes){
LibraryBookIdsCollectionTask libraryTask = new LibraryBookIdsCollectionTask(libraryCode, masterBookIds);
libraryBookIdsExecutor.execute(libraryTask);
}
libraryBookIdsExecutor.shutdown();
//Now the fetching of master list is complete.
//So I will just continue my processing of the master list
}
}
public class LibraryBookIdsCollectionTask implements Runnable {
private String libraryCode;
private List<String> masterBookIds;
public LibraryBookIdsCollectionTask(String libraryCode,List<String> masterBookIds){
this.libraryCode = libraryCode;
this.masterBookIds = masterBookIds;
}
public void run(){
List<String> bookids = new ArrayList<String>();//TODO get this list from iconnect call
synchronized (masterBookIds) {
masterBookIds.addAll(bookids);
}
}
}
Thanks,
Harish
Can I just say Collections.newSynchronizedList and call addAll to that list from all thread tasks?
If you're referring to Collections.synchronizedList, then yes, that would work fine. That will give you an object that implements the List interface where all of the methods from that interface are synchronized, including addAll.
Consider sticking with what you have, though, since it's arguably a cleaner design. If you pass the raw List to your tasks, then they get access to all of the methods on that interface, whereas all they really need to know is that there's an addAll method. Using your SynchronizedBookIdsAggregator keeps your tasks decoupled from design dependence on the List interface, and removes the temptation for them to call something other than addAll.
In cases like this, I tend to look for a Sink interface of some sort, but there never seems to be one around when I need it...
The code you have implemented does not create a synchronization point for someone who accesses the list via getBookIds(), which means they could see inconsistent data. Furthermore, someone who has retrieved the list via getBookIds() must perform external synchronization before accessing the list. Your question also doesn't show how you are actually using the SynchronizedBookIdsAggregator class, which leaves us with not enough information to fully answer your question.
Below would be a safer version of the class:
public class SynchronizedBookIdsAggregator {
private List<String> bookIds;
public SynchronizedBookIdsAggregator() {
bookIds = new ArrayList<String>();
}
public void updateBookIds(List<String> ids){
synchronized (this) {
bookIds.addAll(ids);
}
}
public List<String> getBookIds() {
// synchronized here for memory visibility of the bookIds field
synchronized(this) {
return bookIds;
}
}
public void setBookIds(List<String> bookIds) {
// synchronized here for memory visibility of the bookIds field
synchronized(this) {
this.bookIds = bookIds;
}
}
}
As alluded to earlier, the above code still has a potential problem with some thread accessing the ArrayList after it has been retrieved by getBookIds(). Since the ArrayList itself is not synchronized, accessing it after retrieving it should be synchronized on the chosen guard object:
public class SomeOtherClass {
public void run() {
SynchronizedBookIdsAggregator aggregator = getAggregator();
List<String> bookIds = aggregator.getBookIds();
// Access to the bookIds list must happen while synchronized on the
// chosen guard object -- in this case, aggregator
synchronized(aggregator) {
// ... work with the bookIds list here ...
}
}
}
I can imagine using Collections.synchronizedList as part of the design of this aggregator, but it is not a panacea. Concurrency design really requires an understanding of the underlying concerns, more than "picking the right tool / collection for the job" (although the latter is not unimportant).
Another potential option to look at is CopyOnWriteArrayList.
As skaffman alluded to, it might be better to not allow direct access to the bookIds list at all (e.g., remove the getter and setter). If you enforce that all access to the list must run through methods written in SynchronizedBookIdsAggregator, then SynchronizedBookIdsAggregator can enforce all concurrency control of the list. As my answer above indicates, allowing consumers of the aggregator to use a "getter" to get the list creates a problem for the user of that list: to write correct code they must have knowledge of the synchronization strategy / guard object, and furthermore they must also use that knowledge to actively synchronize externally and correctly.
Regarding your second approach. What you have shown looks technically correct (good!).
But, presumably you are going to read from masterBookIds at some point, too? And you don't show or describe that part of the program! So when you start thinking about when and how you are going to read masterBookIds (i.e. the return value of fetchMasterListOfBookIds()), just remember to consider concurrency concerns there too! :)
If you make sure all tasks/worker threads have finished before you start reading masterBookIds, you shouldn't have to do anything special.
But, at least in the code you have shown, you aren't ensuring that.
Note that libraryBookIdsExecutor.shutdown() returns immediately. So if you start using the masterBookIds list immediately after fetchMasterListOfBookIds() returns, you will be reading masterBookIds while your worker threads are actively writing data to it, and this entails some extra considerations.
Maybe this is what you want -- maybe you want to read the collection while it is being written to, to show realtime results or something. But then you must consider synchronizing properly on the collection if you want to iterate over it while it is being written to.
If you would just like to make sure all writes to masterBookIds by worker threads have completed before fetchMasterListOfBookIds() returns, you could use ExecutorService.awaitTermination (in combination with .shutdown(), which you are already calling).
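Combining shutdown() with awaitTermination might look like the sketch below. The pool size, the one-minute timeout, the generated ids and the class name AwaitDemo are all placeholders, not values from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class AwaitDemo {
    static List<String> fetchMasterList(int tasks, int idsPerTask) {
        List<String> master = Collections.synchronizedList(new ArrayList<>());
        ExecutorService executor = Executors.newFixedThreadPool(4);
        for (int t = 0; t < tasks; t++) {
            final int taskNo = t;
            executor.execute(() -> {
                // Stand-in for the real lookup of one library's book ids.
                List<String> ids = new ArrayList<>();
                for (int i = 0; i < idsPerTask; i++) {
                    ids.add("book-" + taskNo + "-" + i);
                }
                master.addAll(ids);
            });
        }
        executor.shutdown(); // stops accepting tasks, returns immediately
        try {
            // Blocks until the submitted tasks finish or the timeout expires.
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return master; // all writers are done at this point
    }

    public static void main(String[] args) {
        System.out.println(fetchMasterList(5, 100).size()); // prints 500
    }
}
```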
Collections.SynchronizedList (which is the wrapper type you'd get) would synchronize almost every method on either itself or a mutex object you pass to the constructor (or Collections.synchronizedList(...) ). Thus it would basically be the same as your approach.
All the methods called through the wrapper returned by Collections.synchronizedList() will be synchronized. This means that the addAll method of a normal List, when called through this wrapper, will behave something like this:
public synchronized boolean addAll(Collection<? extends E> c)
So every method call on the list (using the returned reference and not the original reference) will be synchronized.
However, there is no synchronization between different method calls.
Consider the following code snippet:
List<String> l = Collections.synchronizedList(new ArrayList<String>());
l.add("Hello");
l.add("World");
When multiple threads run this same code, it is quite possible that after Thread A has added "Hello", Thread B runs and adds both "Hello" and "World" to the list, and only then Thread A resumes. The list would then contain ["Hello", "Hello", "World", "World"] instead of ["Hello", "World", "Hello", "World"] as expected. This is just an example to show that the list is not thread-safe across different method calls. If we want the above code to produce the desired result, it should be inside a synchronized block with a lock on the list (or this).
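A small sketch of that fix, with invented names (PairedAddDemo, addPair, run): both add calls sit inside one synchronized block on the list, so each "Hello" is always followed directly by its "World".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PairedAddDemo {
    // The two calls form one atomic unit because the lock on the list
    // is held across both of them.
    static void addPair(List<String> list) {
        synchronized (list) {
            list.add("Hello");
            list.add("World");
        }
    }

    static List<String> run(int threads, int pairsPerThread) {
        List<String> list = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < pairsPerThread; i++) {
                    addPair(list);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(run(2, 2)); // [Hello, World, Hello, World]
    }
}
```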
However, with your design there is only one method call, so it is the same as using Collections.synchronizedList().
Moreover, as Mike Clark rightly pointed out, you should also synchronize getBookIds() and setBookIds(). Synchronizing on the List itself would be clearer, since it is like locking the list before operating on it and unlocking it afterwards, so that nothing in between can use the List.