Why does CopyOnWriteArrayList copy when writing? - java

From the CopyOnWriteArrayList.java, the add method is as follows:
public boolean add(E e) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        Object[] newElements = Arrays.copyOf(elements, len + 1);
        newElements[len] = e;
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}
It's not hard to understand that the add operation must lock; what confuses me is that it copies the old data to a new array and abandons the previous one.
Meanwhile, the get method is as follows:
public E get(int index) {
    return (E)(getArray()[index]);
}
The get method takes no lock.
I have found some explanations; some say that copying to a new array prevents add and get from operating on the same array.
My question is: why can't two threads read and write the same array at the same time?

If you look at the declaration of the array reference variable at the top of the CopyOnWriteArrayList class, you will find the answer to your question.
private volatile transient Object[] array; // this is volatile
return (E)(getArray()[index]);
This returns the element at index from the most recently published array, so it is thread-safe.
final Object[] getArray() {
    return array;
}
getArray returns a reference to the array.

Actually the reason that the write path locks is not because it needs to provide thread safety considering the read path, but because it wants to serialize writers. Since the copy-on-write technique replaces the volatile reference, it's usually best to serialize that operation.
The key to this idea is that writes are accomplished by copying the existing value, modifying it, and replacing the reference. It also follows that once set the object pointed by the reference is always read only (i.e. no mutation is done directly on the object referred by the reference). Therefore, readers can access it safely without synchronization.
Reads and writes can happen concurrently. However, the implication is that the reads will see the soon-to-be-stale state until the volatile reference set is done.
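The copy, modify, and swap sequence described above can be sketched with a hand-rolled container. This is only an illustration under stated assumptions: TinyCowList and its methods are hypothetical names, it stores int values for brevity, and it uses synchronized instead of the ReentrantLock shown in the question. It is not the JDK implementation.

```java
import java.util.Arrays;

// Hypothetical minimal copy-on-write list; a sketch, not the JDK source.
class TinyCowList {
    // Readers only ever dereference this field; volatile publishes each new array.
    private volatile int[] array = new int[0];

    // Writers are serialized so that two concurrent adds cannot lose an update.
    synchronized void add(int value) {
        int[] old = array;
        int[] copy = Arrays.copyOf(old, old.length + 1);
        copy[old.length] = value;
        array = copy; // single volatile write: readers see the old or the new array, never a mix
    }

    // No lock: an array is never mutated after it has been published.
    int get(int index) {
        return array[index];
    }

    // Returns the current backing array as a snapshot (callers must not modify it).
    int[] snapshot() {
        return array;
    }
}
```

A reader that grabbed snapshot() before a write keeps working on the old array, which is exactly the "soon-to-be-stale state" mentioned above.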

Calling get() from multiple threads at the same time causes no problems.
Because the array reference is volatile, each call reads the most recently published array and returns the element from it.
But during add() or set() a new array is created every time, so that readers never observe a half-finished modification. This is one way to make an object thread-safe: make its published state immutable.
If the same array object were modified in place during add or set, traversal would have to be synchronized as well; otherwise a thread adding or removing an element during a traversal could make it throw an exception.
As the Javadoc puts it:
A thread-safe variant of java.util.ArrayList in which all mutative operations (add, set, and so on) are implemented by making a fresh copy of the underlying array.
This is ordinarily too costly, but may be more efficient than alternatives when traversal operations vastly outnumber mutations, and is useful when you cannot or don't want to synchronize traversals
See this example:
package com.concurrent;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
public class CopyOnWriteArrayListTest {
/**
* @param args
*/
public static void main(String[] args) {
CopyOnWriteArrayList<Integer> list=new CopyOnWriteArrayList<>();
Viewer viewer=new Viewer();
viewer.setList(list);
Thread t1=new Thread(viewer);
Adder adder=new Adder();
adder.setList(list);
Thread t=new Thread(adder);
t.start();
t1.start();
}
static class Adder implements Runnable{
private List<Integer> list;
public void setList(List<Integer> list) {
this.list = list;
}
@Override
public void run() {
for(int i=0;i<100;i++){
list.add(i);
System.out.println("Added-"+i);
try {
Thread.sleep(500);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
}
static class Viewer implements Runnable{
private List<Integer> list;
public void setList(List<Integer> list) {
this.list = list;
}
@Override
public void run() {
while (true) {
System.out.println("Length of list->"+list.size());
for (Integer i : list) {
System.out.println("Reading-"+i);
try {
Thread.sleep(500);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
}
}
}
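For a deterministic, single-threaded view of the same behaviour, the following sketch modifies a list in the middle of a for-each loop. The class and method names are made up for illustration. With a CopyOnWriteArrayList the loop keeps walking the array that existed when the iterator was created; with a plain ArrayList the iterator fails fast.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIterationDemo {
    // Iterates the list and appends one element during the first pass of the loop.
    // Returns how many elements the loop visited, or -1 if the iterator failed fast.
    static int iterateWhileAdding(List<Integer> list) {
        int visited = 0;
        try {
            for (Integer ignored : list) {
                if (visited == 0) {
                    list.add(99); // structural modification in the middle of the iteration
                }
                visited++;
            }
        } catch (ConcurrentModificationException e) {
            return -1;
        }
        return visited;
    }

    public static void main(String[] args) {
        List<Integer> cow = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> plain = new ArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println("CopyOnWriteArrayList visited: " + iterateWhileAdding(cow));
        System.out.println("ArrayList visited: " + iterateWhileAdding(plain));
    }
}
```

The copy-on-write loop visits the three original elements and does not see the 99 added during the loop, while the ArrayList loop ends with -1 because its iterator throws ConcurrentModificationException.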

Related

Java Concurrency In Practice. Listing 5.6

In Java Concurrency in Practice, the author gives the following example of a non-thread-safe class that, behind the scenes, invokes an iterator on a set object; if multiple threads are involved, this may cause a ConcurrentModificationException. This is understood: one thread is modifying the collection, the other is iterating over it, and boom!
What I do not understand is that the author says this code can be fixed by wrapping the HashSet with Collections.synchronizedSet(). How would this fix the problem? Even though access to all the methods will be synchronized and guarded by the same intrinsic lock, once the iterator object is obtained there is no guarantee that another thread won't modify the collection while the iteration is in progress.
Quote from the book:
If HiddenIterator wrapped the HashSet with a synchronizedSet, encapsulating the synchronization, this sort of error would not occur.
public class HiddenIterator {
//Solution :
//If HiddenIterator wrapped the HashSet with a synchronizedSet, encapsulating the synchronization,
//this sort of error would not occur.
//@GuardedBy("this")
private final Set<Integer> set = new HashSet<Integer>();
public synchronized void add(Integer i) {
set.add(i);
}
public synchronized void remove(Integer i) {
set.remove(i);
}
public void addTenThings() {
Random r = new Random();
for (int i = 0; i < 10; i++)
add(r.nextInt());
/*The string concatenation gets turned by the compiler into a call to StringBuilder.append(Object),
* which in turn invokes the collection's toString method - and the implementation of toString in
* the standard collections iterates the collection and calls toString on each element to
* produce a nicely formatted representation of the collection's contents. */
System.out.println("DEBUG: added ten elements to " + set);
}
}
If someone could help me understand that, I'd be grateful.
Here is how I think it could've been fixed:
public class HiddenIterator {
private final Set<Integer> set = Collections.synchronizedSet(new HashSet<Integer>());
public void add(Integer i) {
set.add(i);
}
public void remove(Integer i) {
set.remove(i);
}
public void addTenThings() {
Random r = new Random();
for (int i = 0; i < 10; i++)
add(r.nextInt());
// synchronizing in set's intrinsic lock
synchronized(set) {
System.out.println("DEBUG: added ten elements to " + set);
}
}
}
Or, as an alternative, one could keep synchronized keyword for add() and remove() methods. We'd be synchronizing on this in this case. Also, we'd have to add a synchronized block (again sync'ed on this) into addTenThings(), which would contain a single operation - logging with implicit iteration:
public class HiddenIterator {
private final Set<Integer> set = new HashSet<Integer>();
public synchronized void add(Integer i) {
set.add(i);
}
public synchronized void remove(Integer i) {
set.remove(i);
}
public void addTenThings() {
Random r = new Random();
for (int i = 0; i < 10; i++)
add(r.nextInt());
synchronized(this) {
System.out.println("DEBUG: added ten elements to " + set);
}
}
}
Collections.synchronizedSet() wraps the collection in an instance of an internal class called SynchronizedSet, which extends SynchronizedCollection. Now let's look at how SynchronizedCollection.toString() is implemented:
public String toString() {
synchronized (mutex) {return c.toString();}
}
Basically the iteration is still there, hidden in the c.toString() call, but it's already synchronized with all other methods of this wrapper collection. So you don't need to repeat the synchronization in your code.
Edited
synchronizedSet()::toString()
As Sergei Petunin rightly pointed out, the toString() method of the set returned by Collections.synchronizedSet() takes care of synchronization internally, so no manual synchronization is necessary in this case.
external iteration on synchronizedSet()
once the iterator object is obtained, there is no guarantee that the other thread won't modify the collection once an iteration is being made.
In cases of external iteration, such as a for-each loop or an explicit Iterator, wrapping the iteration in a synchronized (set) block is both required and sufficient.
That's why the Javadoc of Collections.synchronizedSortedSet() (with an equivalent note on synchronizedSet()) states that
It is imperative that the user manually synchronize on the returned
sorted set when iterating over it or any of its subSet, headSet, or
tailSet views.
SortedSet s = Collections.synchronizedSortedSet(new TreeSet());
...
synchronized (s) {
Iterator i = s.iterator(); // Must be in the synchronized block
while (i.hasNext())
foo(i.next());
}
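Both cases can be put side by side in a small sketch. The class and method names below are hypothetical; the only library behaviour relied on is that the wrapper returned by Collections.synchronizedSet() synchronizes single calls (including toString()) on itself, so a client can take the same lock with synchronized (set) around an explicit loop.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SynchronizedIterationDemo {
    private final Set<Integer> set = Collections.synchronizedSet(new HashSet<Integer>());

    void add(int value) {
        set.add(value); // single call: the wrapper locks internally
    }

    // Explicit iteration: hold the wrapper's lock for the whole loop.
    long sum() {
        long total = 0;
        synchronized (set) {
            for (int value : set) {
                total += value;
            }
        }
        return total;
    }

    // Hidden iteration via toString(): the wrapper already synchronizes this call.
    String describe() {
        return "elements: " + set;
    }
}
```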
manual synchronization
Your second version, with the synchronized add/remove methods of the class HiddenIterator and synchronized (this), would work too, but combined with the wrapper it introduces unnecessary overhead, as adding/removing would be synchronized twice (by HiddenIterator and by Collections.synchronizedSet(..)).
However, in this case you could omit the Collections.synchronizedSet(..) as HiddenIterator takes care of all the synchronization required when accessing the private Set field.

Thread-safe serializable Collection with atomic replace

I am facing a problem in my program when multiple threads access the same server over RMI. The server keeps a list as a cache and performs some expensive computations that sometimes change that list. After a computation has finished, the list is serialized and sent to the client.
First problem: if the list is changed while it is being serialized (e.g. by a different client requesting some data), a ConcurrentModificationException is (probably) thrown, resulting in an EOFException for the RMI call / the deserialization on the client side.
Therefore I need some kind of list structure that is "stable" for serialization while possibly being changed by a different thread.
Solutions we tried:
regular ArrayList / Set - not working because of concurrency
deep-copying the entire structure before every serialization - far too expensive
CopyOnWriteArrayList - expensive as well since it copies the list, and it reveals the second problem: we need to be able to atomically replace any element in the list. Currently that is either not thread-safe (first delete, then add, which is even more expensive) or only doable by locking the list, which forces the different threads to run in sequence.
Therefore my question is:
Do you know of a Collection implementation which allows us to serialize the Collection thread-safe while other Threads modify it and which contains some way of atomically replacing elements?
A bonus would be if the list would not need to be copied before serialization! Creating a snapshot for every serialization would be okay, but still meh :/
Illustration of the problem (C = compute, A = add to list, R = remove from list, S = serialize):
Thread1  Thread2
C
A
A        C
C        A
S        C
S        R   <---- Remove and add have to be performed without Thread1 serializing
S        A   <---- anything in between (atomically), and it has to be done without
S        S         blocking other threads' computations and serializations for long,
S                  and no third thread must be allowed to start serializing in this
S                  in-between state
S
The simplest solution would be to apply external synchronization to the ArrayList, possibly via a read-write lock like this:
public class SyncList<T> implements Serializable {
private static final long serialVersionUID = -6184959782243333803L;
private List<T> list = new ArrayList<>();
private transient Lock readLock, writeLock;
public SyncList() {
ReentrantReadWriteLock readWriteLock = new ReentrantReadWriteLock();
readLock = readWriteLock.readLock();
writeLock = readWriteLock.writeLock();
}
public void add(T element) {
writeLock.lock();
try {
list.add(element);
} finally {
writeLock.unlock();
}
}
public T get(int index) {
readLock.lock();
try {
return list.get(index);
} finally {
readLock.unlock();
}
}
public String dump() {
readLock.lock();
try {
return list.toString();
} finally {
readLock.unlock();
}
}
public boolean replace(T old, T newElement) {
writeLock.lock();
try {
int pos = list.indexOf(old);
if (pos < 0)
return false;
list.set(pos, newElement);
return true;
} finally {
writeLock.unlock();
}
}
private void writeObject(ObjectOutputStream out) throws IOException {
readLock.lock();
try {
out.writeObject(list);
} finally {
readLock.unlock();
}
}
@SuppressWarnings("unchecked")
private void readObject(ObjectInputStream in) throws IOException,
ClassNotFoundException {
list = (List<T>) in.readObject();
ReentrantReadWriteLock readWriteLock = new ReentrantReadWriteLock();
readLock = readWriteLock.readLock();
writeLock = readWriteLock.writeLock();
}
}
Provide any operations you like, just properly use either read-lock or write-lock.
My initial thought, which turned out to be wrong, was that the CopyOnWriteArrayList was a bad idea since it copies everything. But of course it only performs a shallow copy, copying just the references, not a deep copy that duplicates all the objects as well.
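The shallow-copy point can be checked with a few lines. This is a minimal sketch with made-up names: it stores a mutable object, forces another write (and therefore another array copy), and then compares references.

```java
import java.util.concurrent.CopyOnWriteArrayList;

class ShallowCopyDemo {
    // Returns true if the list still holds the very same object after a later write.
    static boolean sameElementAfterWrite() {
        CopyOnWriteArrayList<StringBuilder> list = new CopyOnWriteArrayList<>();
        StringBuilder first = new StringBuilder("payload");
        list.add(first);
        list.add(new StringBuilder("other")); // this write allocates a fresh backing array
        return list.get(0) == first;          // reference comparison: the element itself was not cloned
    }
}
```

Each write therefore costs one array of references, regardless of how large the referenced objects are.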
Therefore we went with the CopyOnWriteArrayList, because it already offered a lot of the needed functionality. The only remaining problem was the replace operation, which grew even more complex and became an addIfAbsentOrReplace.
We tried the CopyOnWriteArraySet, but that did not fit our needs because it only offered addIfAbsent. In our case we had an instance c1 of a class C, which we needed to store and later replace with an updated new instance c2. Of course we override equals and hashCode. Now we had to choose whether we wanted equals to return true or false for the two only minimally different objects. Neither option worked, because
true would mean that the objects are the same, so the set would not even bother adding the new object c2 because c1 is already in it
false would mean c2 would be added but c1 would not be removed
Therefore CopyOnWriteArrayList. That list already offers a
public void replaceAll(UnaryOperator<E> operator) { ... }
which somewhat fits our needs. It lets us replace the object we need via custom comparison.
We utilized it in the following way:
protected <T extends OurSpecialClass> void addIfAbsentOrReplace(T toAdd, List<T> elementList) {
OurSpecialClassReplaceOperator<T> op = new OurSpecialClassReplaceOperator<>(toAdd);
synchronized (elementList) {
elementList.replaceAll(op);
if (!op.isReplaced()) {
elementList.add(toAdd);
}
}
}
private class OurSpecialClassReplaceOperator<T extends OurSpecialClass> implements UnaryOperator<T> {
private boolean replaced = false;
private T toAdd;
public OurSpecialClassReplaceOperator(T toAdd) {
this.toAdd = toAdd;
}
@Override
public T apply(T toAdd) {
if (this.toAdd.getID().equals(toAdd.getID())) {
replaced = true;
return this.toAdd;
}
return toAdd;
}
public boolean isReplaced() {
return replaced;
}
}
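A self-contained variant of the same idea is sketched below. It swaps replaceAll for an index search followed by set(int, E), and it uses hypothetical names (Item, ItemCache, addOrReplace) in place of OurSpecialClass. One assumption deserves emphasis: the external monitor only makes the compound operation atomic with respect to other writers that take the same monitor, so every mutation has to go through this method.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical element type standing in for OurSpecialClass.
final class Item {
    final String id;
    final String payload;

    Item(String id, String payload) {
        this.id = id;
        this.payload = payload;
    }
}

class ItemCache {
    private final List<Item> items = new CopyOnWriteArrayList<>();

    // All writers must use this method: the monitor on `this` is what turns
    // "find, then set or add" into one step relative to other writers.
    synchronized void addOrReplace(Item updated) {
        for (int i = 0; i < items.size(); i++) {
            if (items.get(i).id.equals(updated.id)) {
                items.set(i, updated); // one array copy; readers see old or new element
                return;
            }
        }
        items.add(updated);
    }

    // Readers take no lock: they iterate over an immutable snapshot.
    String payloadOf(String id) {
        for (Item item : items) {
            if (item.id.equals(id)) {
                return item.payload;
            }
        }
        return null;
    }

    int size() {
        return items.size();
    }
}
```

Because the replacement is a single set call, a concurrent reader or serializer never observes a state in which the old element is gone and the new one is not yet present.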

Problems with race conditions on ConcurrentHashMap

I have a multithreaded application in which n threads write to a ConcurrentHashMap. Another n threads read from that map and copy its value to a copy list.
After that, the original list is removed from the map.
For some reason I always get a ConcurrentModificationException.
I even tried to create my own lock mechanism with a volatile boolean, but it doesn't work. When using Google Guava's Lists.newLinkedList() I get a ConcurrentModificationException. When using the standard way, new LinkedList(list), I get an ArrayIndexOutOfBoundsException.
Here is a compiling code example:
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
public class VolatileTest {
    public static Map<String, List<String>> logMessages = new ConcurrentHashMap<String, List<String>>();
    public static AtomicBoolean lock = new AtomicBoolean(false);
    public static void main(String[] args) {
        // Reader thread: prints the list stored under "test", then removes it from the map.
        new Thread() {
            @Override
            public void run() {
                while (true) {
                    try {
                        if (!VolatileTest.lock.get()) {
                            VolatileTest.lock.set(true);
                            List<String> list = VolatileTest.logMessages.get("test");
                            if (list != null) {
                                List<String> copyList = Collections.synchronizedList(list);
                                for (String string : copyList) {
                                    System.out.println(string);
                                }
                                VolatileTest.logMessages.remove("test");
                            }
                            VolatileTest.lock.set(false);
                        }
                    } catch (ConcurrentModificationException ex) {
                        ex.printStackTrace();
                        System.exit(1);
                    }
                }
            }
        }.start();
        // Writer thread: appends a message to the list stored under "test".
        new Thread() {
            @Override
            public void run() {
                while (true) {
                    if (!VolatileTest.lock.get()) {
                        VolatileTest.lock.set(true);
                        List<String> list = VolatileTest.logMessages.get("test");
                        if (list == null) {
                            list = Collections.synchronizedList(new LinkedList<String>());
                        }
                        list.add("TestError");
                        VolatileTest.logMessages.put("test", list);
                        VolatileTest.lock.set(false);
                    }
                }
            }
        }.start();
    }
}
You get a ConcurrentModificationException because your locking is broken: the reader thread iterates over the same list that the writer thread is writing to at the same time.
Your code looks like an attempt at lock-free coding. If so, you must use a CAS operation like this:
while (!VolatileTest.lock.compareAndSet(false, true)) { } // or while (VolatileTest.lock.getAndSet(true)) {} - try to get lock
try {
    // code to execute under lock
} finally {
    VolatileTest.lock.set(false); // unlock
}
Your
if (!VolatileTest.lock.get()) {
VolatileTest.lock.set(true);
...
}
is not atomic. Or you can use synchronized section or any other standard locking mechanism (ReadWriteLock, for instance)
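The difference between check-then-set and a single compareAndSet can be made concrete with a small spin lock. This is a sketch with hypothetical names (SpinLockDemo, runDemo); a real application would normally prefer synchronized or a ReentrantLock, and busy-waiting is only reasonable for very short critical sections.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SpinLockDemo {
    private final AtomicBoolean locked = new AtomicBoolean(false);
    private int counter = 0; // only touched while the spin lock is held

    void increment() {
        // Test and set in one atomic step: only one thread can flip false -> true.
        while (!locked.compareAndSet(false, true)) {
            Thread.yield(); // busy-wait until the current holder releases the lock
        }
        try {
            counter++;
        } finally {
            locked.set(false); // release
        }
    }

    // Runs two threads that each increment 10,000 times and returns the final count.
    static int runDemo() {
        SpinLockDemo demo = new SpinLockDemo();
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                demo.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return demo.counter;
    }
}
```

With the lock acquired through compareAndSet, no increment is lost and runDemo() returns 20000; with a separate get() and set(true), both threads could enter the critical section together.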
Also, if you deal with a list for reading and writing using one lock, you don't have to use synchronized list then. And moreover, you don't need even ConcurrentHashMap.
So:
use one global lock and plain HashMap/ArrayList OR
remove your global lock, use ConcurrentHashMap and plain ArrayList with synchronized on each particular instance of the list OR
use a Queue (some BlockingQueue or ConcurrentLinkedQueue) instead of all of your current stuff OR
use something like Disruptor (http://lmax-exchange.github.io/disruptor/) for inter-thread communication with many options. Also, here is a good example of how to build lock-free queues http://psy-lob-saw.blogspot.ru/2013/03/single-producerconsumer-lock-free-queue.html
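As an illustration of the queue option from the list above, here is a minimal sketch built on LinkedBlockingQueue. The class and method names are assumptions made for this example. Writers only enqueue; the reader moves whatever has accumulated into a private list with drainTo, so no list is ever iterated while another thread modifies it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class LogQueueDemo {
    private final BlockingQueue<String> messages = new LinkedBlockingQueue<>();

    // Called by writer threads.
    void log(String message) {
        messages.add(message);
    }

    // Called by the reader: moves everything queued so far into a list that
    // only the caller can see, so it can be iterated without any locking.
    List<String> drain() {
        List<String> batch = new ArrayList<>();
        messages.drainTo(batch);
        return batch;
    }

    // Starts four writers and drains concurrently until every message has arrived.
    static int runDemo() {
        LogQueueDemo demo = new LogQueueDemo();
        int writers = 4;
        int perWriter = 1_000;
        for (int w = 0; w < writers; w++) {
            new Thread(() -> {
                for (int i = 0; i < perWriter; i++) {
                    demo.log("TestError");
                }
            }).start();
        }
        int seen = 0;
        while (seen < writers * perWriter) {
            seen += demo.drain().size();
            Thread.yield();
        }
        return seen;
    }
}
```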
ConcurrentHashMap is fail-safe, meaning the map itself will not throw a ConcurrentModificationException. The exception comes from the List<String> stored in the map: one of your threads iterates over it while the other thread modifies it.
I would suggest that you don't try to lock the whole map operation, but instead look at making access to the list thread-safe, perhaps by using Vector or a synchronizedList.
Also note that your entry condition if (!VolatileTest.lock) { in both threads means they can both run at the same time: the boolean initially holds false, so both may try to work on the same list at the same time.
As already mentioned the locking pattern does not look valid. It is better to use synchronized. The below code works for me
final Object obj = new Object();
and then
synchronized (obj){....} instead of if (!VolatileTest.lock) {.....}

How to block write access to the array from Thread while reading

I have two threads running in parallel, and to get information about their internal results I have created an int array of length 8. Each thread, according to its id, updates its own region of the statu array; threads are not allowed to write to each other's region. To read and display the statu array correctly, I am trying to write a getStatu method. While it reads the result, I want to block the other threads from writing to the statu array; unfortunately, I do not know how to block those writes while getStatu is reading and displaying the result. How can I do this?
Note: if any part is unclear, tell me and I will fix it.
class A{
Semaphore semaphore;
int [] statu; // statu is of length 8
void update(int idOfThread, int []statu_){
try {
semaphore.acquire();
int idx = idOfThread * 4;
statu[idx] = statu_[0];
statu[idx+1] = statu_[1];
statu[idx+2] = statu_[2];
statu[idx+3] = statu_[3];
} catch (...) {
} finally {
semaphore.release();
}
}
int[] getStatu(){
// Block write access of threads
// display statu array
// return statu array as return value
// release block, so threads can write to the array
}
}
Apart from using a lock/sync mechanism other than Semaphore, here is a proposal to improve this a little.
Putting both status[4] arrays into a single array[8] is not the best solution. Consider task A writing its quadruplet: it must lock out task B reading the same, but there's no point in locking out task B writing B's quadruplet, and vice versa.
Generally speaking, the granularity of what is being locked is an important factor: locking the entire database is nonsense (except for overall processing like backup), whereas locking individual fields of a record would produce excessive overhead.
There are possibly better ways to get to where you want to, but only you know what you are trying to do. Going with your own scheme, there are things you are doing wrong. First thing, currently you are not achieving the granular locking you are planning to. For that you must have an array of semaphores. So the acquisition will look something like
semaphore[idOfThread].acquire();
Secondly, one thing you've not realised is that controlled access to data among threads is a co-operative activity. You cannot lock on one thread and not care to deal with locking on another and somehow impose the access control.
So unless the caller of your getStatu() will use the same set of semaphores when inspecting the array, your best bet is for getStatu() to make a new int[] array, copying segments of each thread after locking with the respective semaphore. So the array returned by getStatu() will be a snapshot at the point of call.
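The scheme described above (one semaphore per thread, and a getStatu() that copies each segment under the matching semaphore) could look roughly like this. StatusBoard and SLOTS_PER_THREAD are names invented for the sketch, and acquireUninterruptibly() is used only to keep the example free of InterruptedException handling.

```java
import java.util.concurrent.Semaphore;

class StatusBoard {
    private static final int SLOTS_PER_THREAD = 4;
    private final int[] statu;
    private final Semaphore[] guards; // one binary semaphore per thread's segment

    StatusBoard(int threads) {
        statu = new int[threads * SLOTS_PER_THREAD];
        guards = new Semaphore[threads];
        for (int i = 0; i < threads; i++) {
            guards[i] = new Semaphore(1);
        }
    }

    // A worker locks only its own segment, so workers never block each other.
    void update(int idOfThread, int[] values) {
        guards[idOfThread].acquireUninterruptibly();
        try {
            System.arraycopy(values, 0, statu, idOfThread * SLOTS_PER_THREAD, SLOTS_PER_THREAD);
        } finally {
            guards[idOfThread].release();
        }
    }

    // Copies each segment while holding that segment's semaphore and returns the copy.
    int[] getStatu() {
        int[] snapshot = new int[statu.length];
        for (int id = 0; id < guards.length; id++) {
            guards[id].acquireUninterruptibly();
            try {
                int from = id * SLOTS_PER_THREAD;
                System.arraycopy(statu, from, snapshot, from, SLOTS_PER_THREAD);
            } finally {
                guards[id].release();
            }
        }
        return snapshot;
    }
}
```

Each quadruplet in the returned array is internally consistent, but the segments are copied one after another, so the snapshot does not freeze all threads at the same instant. The caller can display or modify the copy without holding any lock.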
Try the code below; call afterStatu() to read the array and then release the block.
class A {
Semaphore semaphore;
int[] statu; // statu is of length 8
private volatile boolean stuck; // volatile so that threads spinning in update() see the change
public A() {
}
void update(int idOfThread, int[] statu_) {
// if true, control will not go further
while (stuck);
try {
semaphore.acquire();
int idx = idOfThread * 4;
statu[idx] = statu_[0];
statu[idx + 1] = statu_[1];
statu[idx + 2] = statu_[2];
statu[idx + 3] = statu_[3];
} catch (Exception e) {
} finally {
semaphore.release();
}
}
int[] getStatu() {
// Block write access of threads
stuck = true;
// display statu array
for (int eachStatu : statu) {
System.out.println(eachStatu);
}
// return statu array as return value
return statu;
}
public void afterStatu() {
getStatu();
// release block, so threads can write to the array
stuck = false;
}
}
ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
int[] statu;
void update() {
lock.writeLock().lock();
try {
// update statu
} finally {
lock.writeLock().unlock();
}
}
int[] getStatu() {
lock.readLock().lock();
try {
// return statu
} finally {
lock.readLock().unlock();
}
}
Like ac3 said, only you know what you are trying to do.
Here's a solution that might be useful in the case where every thread that calls update() does so frequently, and calls to getStatu() are infrequent. It's complex, but it allows most of the update() calls to happen without any locking at all.
static final int NUMBER_OF_WORKER_THREADS = ...;
final AtomicReference<CountDownLatch> pauseRequested = new AtomicReference<CountDownLatch>(null);
final Object lock = new Object();
int[] statu = ...
//called in "worker" thread.
void update() {
if (pauseRequested.get() != null) {
pause();
}
... update my slots in statu[] array ...
}
private void pause() {
notifyMasterThatIAmPaused();
waitForMasterToLiftPauseRequest();
}
private void notifyMasterThatIAmPaused() {
pauseRequested.get().countDown();
}
private void waitForMasterToLiftPauseRequest() {
synchronized(lock) {
while (pauseRequested.get() != null) {
lock.wait();
}
}
}
//called in "master" thread
int[] getStatu( ) {
int[] result;
CountDownLatch cdl = requestWorkersToPause();
waitForWorkersToPause(cdl);
result = Arrays.copyOf(statu, statu.length);
liftPauseRequest();
return result;
}
private CountDownLatch requestWorkersToPause() {
CountDownLatch cdl = new CountDownLatch(NUMBER_OF_WORKER_THREADS);
pauseRequested.set(cdl);
return cdl;
}
private void waitForWorkersToPause(CountDownLatch cdl) {
cdl.await();
}
private void liftPauseRequest() {
synchronized(lock) {
pauseRequested.set(null);
lock.notifyAll();
}
}

Java synchronized block for more than one object?

I have two arrays, and I need to synchronize access to them across threads. I am going to put them in a synchronized block. The problem is, I can pass only one of them to 'synchronized' at a time.
How do I ensure that the access to both the arrays is synchronized?
Do I put them in a class and create an object of it?
Or I access the other array only in the synchronized block, and this takes care of synchronized access to it?
Thanks,
Whatever you do don't do this:
synchronized (array1) {
synchronized (array2) {
// do stuff
}
}
This is likely to lead to deadlock unless you are very careful. If you do this approach, you must ensure you have an unchanging partial order on the objects - Google "Dining Philosophers" for discussion of the pitfalls.
Basically what you have to do is create one lock object that you will use if you want to access either array and then use that for all array access. It's coarse-grained but safe. You could do it this way:
public static class TwoArrays {
private int[] array1 = ...
private int[] array2 = ...
private final Object LOCK = new Object();
public void doUpdate() {
synchronized (LOCK) {
...
}
}
}
If you need a finer-grained method you want to use the Java 5+ concurrent utilities such as ReadWriteLock but this will be more complicated to implement and error-prone.
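If nested locking on the two arrays cannot be avoided, the ordering requirement mentioned above can be enforced in one helper so that no caller can get it wrong. The sketch below is one common way to do this, using System.identityHashCode to derive a global order plus a tie-breaker lock for the rare case of equal hashes; OrderedLocks and withBoth are hypothetical names.

```java
class OrderedLocks {
    private static final Object TIE_BREAKER = new Object();

    // Runs the action while holding the monitors of both arrays. The monitors are
    // always taken in the same global order, whatever the argument order is.
    static void withBoth(int[] first, int[] second, Runnable action) {
        int h1 = System.identityHashCode(first);
        int h2 = System.identityHashCode(second);
        if (h1 < h2) {
            synchronized (first) {
                synchronized (second) {
                    action.run();
                }
            }
        } else if (h1 > h2) {
            synchronized (second) {
                synchronized (first) {
                    action.run();
                }
            }
        } else {
            // Equal hashes: a third lock serializes the ambiguous case.
            synchronized (TIE_BREAKER) {
                synchronized (first) {
                    synchronized (second) {
                        action.run();
                    }
                }
            }
        }
    }
}
```

A call such as OrderedLocks.withBoth(array1, array2, () -> { ... }) and another with the arguments swapped acquire the monitors in the same order, which removes the lock-order inversion that causes the deadlock. Every access to either array still has to go through a lock on that array for this to help.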
Prior to Java 5, I'd have written things like this:
// pre Java 5 code:
Object lock = new Object();
// ...
synchronized(lock) {
// do something that requires synchronized access
}
But since Java 5, I'd use classes from java.util.concurrent.locks (personally, I don't find this more complicated or error-prone):
// Java 5 Code Using Locks
Lock lock = // ...
lock.lock();
try {
// do something that requires synchronized access
}
finally {
lock.unlock();
}
If you need read-write locking, here is an example implemented using the read-write locks from Java 5:
private ReadWriteLock rwl = new ReentrantReadWriteLock();
private Lock rLock = rwl.readLock();
private Lock wLock = rwl.writeLock();
private List<String> data = new ArrayList<String>();
public String getData(int index) {
rLock.lock();
try {
return data.get(index);
} finally {
rLock.unlock();
}
}
public void addData(int index, String element) {
wLock.lock();
try {
data.add(index, element);
} finally {
wLock.unlock();
}
}
Of course, adapt it to suit your needs.
