Establishing Happens-Before in Arrays - java

Have a quick synchronization question, here is what I have:
a) Class1 has a concurrent hash map defined as follows:
ConcurrentMap<String, int[][]> map = new ConcurrentHashMap<String, int[][]>();
b) Class2 has a thread, called Thread1. Thread1 creates an Id and checks if map contains it. If it does, it retrieves the value (int[][]), modifies the contents and puts it back. If it doesn't, it creates a new int[][] and stores it. This process of check->modify/create happens frequently.
private class Thread1 implements Runnable{
public void run(){
//keepRunning is volatile
while( keepRunning ){
String id = "ItemA";
int[][] value = map.get(id);
//If value is null, create an int[][] and put it back as value for Id
//If value is not null, modify the contents according to some logic
}
}
}
c) Finally, I have another thread, called Thread2. This thread takes an id, checks if the map has a value for it. If it doesn't, nothing happens. If it does, then it sums up the values in int[][] and uses the number for some calculation (no modifications here).
I am trying to figure out if my operations are atomic.
Operations in b) are fine as the creation/modification of the array and insertion into the map is confined to only one thread (Thread1).
Also, since insertion into the map establishes a happens-before action, this will ensure that c) will see the updated values in int[][].
However, I am not too sure of what would happen if Thread2 looks up the same int[][] in the map and tries to sum it up while Thread1 is modifying it.
Am I correct in thinking that Thread2 will see the older (but uncorrupted) values in int[][]? My reasoning is that until Thread1 has finished putting the value back into the map, the new modifications won't be visible to Thread2.
Thanks very much.

Your operations are not atomic: Thread2 may sum the values while Thread1 is still modifying them, because both threads hold a reference to the same int[][] object.
To avoid that, you'll need to duplicate the original array, modify the duplicate, and put the copy back into the map.
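A minimal sketch of that copy-then-publish approach, assuming a single writer thread as in the question. The class name CopyOnWriteGrid, the helper names, and the 2x2 array size are hypothetical and only serve the illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class CopyOnWriteGrid {
    static final ConcurrentMap<String, int[][]> map = new ConcurrentHashMap<>();

    // Deep-copies a 2D array so the already published original is never mutated.
    static int[][] deepCopy(int[][] src) {
        int[][] copy = new int[src.length][];
        for (int i = 0; i < src.length; i++) {
            copy[i] = src[i].clone();
        }
        return copy;
    }

    // Writer side (Thread1): build a modified copy, then publish it with put().
    // Assumes only one thread ever calls this method for a given id.
    static void increment(String id, int row, int col) {
        int[][] current = map.get(id);
        int[][] next = (current == null) ? new int[2][2] : deepCopy(current); // 2x2 is an assumed size
        next[row][col]++;
        map.put(id, next);
    }

    // Reader side (Thread2): the array obtained here is never modified after put().
    static int sum(String id) {
        int[][] value = map.get(id);
        if (value == null) {
            return 0;
        }
        int total = 0;
        for (int[] row : value) {
            for (int cell : row) {
                total += cell;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        increment("ItemA", 0, 0);
        increment("ItemA", 1, 1);
        System.out.println(sum("ItemA"));
    }
}
```

Because each published array is never written to again, a reader either sees the previous array or the new one, never a half-updated one. The cost is one array copy per update.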

Related

Is it possible to synchronize ConcurrentHashMap update by its key?

I have tons of update operations on a ConcurrentHashMap and I need to minimize synchronization when updating this map.
Please see the code below.
public ConcurrentHashMap<String, Integer> dataMap = new ConcurrentHashMap<>();
public void updateData(String key, int data) {
if ( dataMap.containsKey(key)) {
// do something with previous value and new value and update it
}
}
For example, when one thread calls updateData with key "A", every other attempt to call updateData with key "A" should block until the first thread is done. Meanwhile, I want another thread calling updateData with key "B" to run concurrently.
I wonder if there is any fancy way to lock the map simply by its key.
I think you are looking for the compute method.
It takes a function that is given the key and the current value (or null if there is no value yet), and computes the new value.
ConcurrentHashMap guarantees that only one such function runs at a time for a given key. A concurrent second call for the same key will block until the first one completes.
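A minimal sketch of updateData built on compute. The merge logic (adding the new data to the previous value) and the class and method names are assumptions for illustration; the question does not say what the real update does.

```java
import java.util.concurrent.ConcurrentHashMap;

class PerKeyUpdate {
    // The remapping function runs atomically for the given key.
    static void updateData(ConcurrentHashMap<String, Integer> dataMap, String key, int data) {
        // Hypothetical update rule: store data if absent, otherwise add it to the old value.
        dataMap.compute(key, (k, previous) -> previous == null ? data : previous + data);
    }

    // Runs several threads that all update key "A" and returns the final value.
    static int runDemo(int threads, int updatesPerThread) {
        ConcurrentHashMap<String, Integer> dataMap = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < updatesPerThread; n++) {
                    updateData(dataMap, "A", 1);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return dataMap.get("A");
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000)); // no update is lost, so this prints 4000
    }
}
```

If the update should only happen when the key is already present, as in the containsKey check of the question, computeIfPresent is the closer match. Either way, the function passed in should be short and must not update other mappings of the same map.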

Data Races on individual elements of an AtomicReference

I had a question related to accessing individual elements via an Atomic Reference.
If I have an int array and an atomic reference to it, will reading and writing individual elements of the array via the AtomicReference variable cause data races?
In the code below, num is an int array and aRnumbers is the atomic reference to it.
In threads 1 and 2, I access aRnumbers.get()[1] and increment it by 1.
I can access individual elements via the atomic reference and get accurate results each time, apparently without data races: aRnumbers.get()[1] is 22 in the main thread after both threads complete.
But since the atomic reference is defined on the array and not on the individual elements, shouldn't there be a data race in this case leading to 21/22 as the output?
Isn't having data races in this case the motivation for having a AtomicIntegerArray data structure which provides a separate AtomicReference to each element?
Please find below the Java code that I am trying to run. Could anyone kindly let me know where I am going wrong.
import java.util.concurrent.atomic.AtomicReference;
public class AtomicReferenceExample {
private static int[] num= new int[2];
private static AtomicReference<int[]> aRnumbers;
public static void main(String[] args) throws InterruptedException {
Thread t1 = new Thread(new MyRun1());
Thread t2 = new Thread(new MyRun2());
num[0]=10;
num[1]=20;
aRnumbers = new AtomicReference<int[]>(num);
System.out.println("In Main before:"+aRnumbers.get()[0]+aRnumbers.get()[1]);
t1.start();
t2.start();
t1.join();
t2.join();
System.out.println("In Main after:"+aRnumbers.get()[0]+aRnumbers.get()[1]);
}
static class MyRun1 implements Runnable {
public void run() {
System.out.println("In T1 before:"+aRnumbers.get()[1]);
aRnumbers.get()[1]=aRnumbers.get()[1]+1;
}
}
static class MyRun2 implements Runnable {
public void run() {
System.out.println("In T2 before:"+aRnumbers.get()[1]);
aRnumbers.get()[1]=aRnumbers.get()[1]+1;
}
}
}
shouldn't there be a data race in this case leading to 21/22 as the output?
Indeed there is. Your threads are so short-lived that most likely they are not running at the same time.
Isn't having data races in this case the motivation for having a AtomicIntegerArray data structure which provides a separate AtomicReference to each element?
Yes, it is.
Could anyone kindly let me know where I am going wrong.
Starting a thread takes 1 - 10 milliseconds.
Incrementing a value like this, even without the code being JIT-compiled, is likely to take far less than 50 microseconds. Once optimised, it would take about 50 - 200 nanoseconds per increment.
As starting a thread takes about 20 - 200x longer than the operation itself, the two threads won't be running at the same time, so the race condition never shows up.
Try incrementing the value a few million times in each thread, so that both threads are running at the same time and the race condition becomes visible.
Incrementing an element consists of three steps:
Reading the value.
Incrementing the value.
Writing the value back.
A race condition can occur. Take an example: Thread 1 reads the value (let's say 20). Task switch. Thread 2 reads the value (20 again), increments it and writes it back (21). Task switch. The first thread increments the value and writes it back (21). So while 2 incrementing operations took place, the final value is still incremented only by one.
The data structure does not help in this case. A thread safe collection helps keeping the structure consistent when concurrent threads are adding, accessing and removing elements. But here you need to lock access to an element during the three steps of the increment operation.
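As a minimal sketch of the AtomicIntegerArray alternative raised in the question: incrementAndGet performs the read, increment, and write of one element as a single atomic step, so no increment is lost even when the loops overlap. The class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

class AtomicArrayIncrement {
    // Starts the given number of threads, each incrementing element 1 perThread times.
    static int incrementConcurrently(int threads, int perThread) {
        AtomicIntegerArray numbers = new AtomicIntegerArray(new int[] {10, 20});
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    numbers.incrementAndGet(1); // atomic read-modify-write on index 1
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return numbers.get(1);
    }

    public static void main(String[] args) {
        // 20 + 2 * 1_000_000 increments
        System.out.println(incrementConcurrently(2, 1_000_000)); // prints 2000020
    }
}
```

Replacing the AtomicIntegerArray with a plain int[] and numbers[1]++ in the same loop will typically produce a smaller, varying result, which is the lost-update effect described above.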

Can we use AtomicInteger as a local variable in a method and achieve thread safety?

public void tSafe(List<Foo> list, Properties status) {
if(list == null) return;
String key = "COUNT";
AtomicInteger a = new AtomicInteger(Integer.valueOf(status.getProperty(key,"0")));
list.parallelStream().filter(Foo::check).
forEach(foo -> {status.setProperty(key, String.valueOf(a.incrementAndGet())); }
);
}
private interface Foo {
public boolean check();
}
Description:
In the above example, status is a shared Properties object that contains a key named COUNT. My aim is to increment the count and put it back in the properties, to count the number of checks performed. Suppose the tSafe method is called by multiple threads: do I get the correct count at the end? Note that I've used AtomicInteger a as a local variable.
If you only have one thread, this will work. However, if you have more than one thread calling this, only some of the operations are thread safe. This will be fine provided each thread operates on different list and status objects.
As status is a thread safe collection, you can lock it, and provided the list is not changed in another thread, this would work.
In general, working with Strings as numbers in a thread safe manner is very tricky to get right. You are far better off making your value thread safe, i.e. an AtomicInteger, and never anything else.
No, this will not guarantee thread safety. Even though incrementAndGet() is itself atomic, getting a value from the Properties object and setting it back is not.
Consider the following scenario:
Thread #1 gets a value from the Properties object. For argument's sake let's say it's "100".
Thread #2 gets a value from the Properties object. Since nothing has happened, this value is still "100".
Thread #1 creates an AtomicInteger, increments it, and places "101" in the Properties object.
Thread #2 does exactly the same, and places "101" in the Properties object, instead of the 102 you expected.
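The scenario above can be replayed deterministically in a single thread by writing the four steps out in the order given. This is only an illustration of the interleaving, not a concurrent test; the class name LostUpdateReplay is hypothetical.

```java
import java.util.Properties;
import java.util.concurrent.atomic.AtomicInteger;

class LostUpdateReplay {
    static String replay() {
        Properties status = new Properties();
        status.setProperty("COUNT", "100");

        // Steps 1 and 2: both callers read "100" before either one writes.
        AtomicInteger first = new AtomicInteger(Integer.parseInt(status.getProperty("COUNT", "0")));
        AtomicInteger second = new AtomicInteger(Integer.parseInt(status.getProperty("COUNT", "0")));

        // Step 3: the first caller increments its own local counter and stores "101".
        status.setProperty("COUNT", String.valueOf(first.incrementAndGet()));

        // Step 4: the second caller does the same with its own counter and stores "101" again.
        status.setProperty("COUNT", String.valueOf(second.incrementAndGet()));

        return status.getProperty("COUNT");
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 101, although two increments happened
    }
}
```

Each call has its own AtomicInteger, so the atomicity of incrementAndGet() protects nothing that is actually shared between the callers.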
EDIT:
On a more productive note, a better approach would be to just store the AtomicInteger in your status map and increment it in place. That way, you have a single instance and don't have to worry about races as described above. As the Properties class extends Hashtable<Object, Object>, this should technically work, although Properties really isn't intended for values that aren't Strings, and you'd be much better off with a modern thread safe Map implementation, such as a ConcurrentHashMap:
public void tSafe(List<Foo> list, ConcurrentMap<String, AtomicInteger> status) {
if(list == null) {
return;
}
String key = "COUNT";
status.putIfAbsent(key, new AtomicInteger(0));
list.parallelStream()
.filter(Foo::check)
.forEach(foo -> { status.get(key).incrementAndGet(); });
}

java Volatile/synchronization on arraylist

My program looks like this:
public class Main {
private static ArrayList<T> list;
public static void main(String[] args) {
new DataListener().start();
new DataUpdater().start();
}
static class DataListener extends Thread {
@Override
public void run() {
while(true){
//Reading the ArrayList and displaying the updated data
Thread.sleep(5000);
}
}
}
static class DataUpdater extends Thread{
@Override
public void run() {
//Continuously receive data and update ArrayList;
}
}
}
In order to use this ArrayList in both threads, I know two options:
To make the ArrayList volatile. However I read in this article that making variables volatile is only allowed if it "Writes to the variable do not depend on its current value." which I think in this case it does (because for example when you do an add operation on an ArrayList, the contents of the ArrayList after this operation depend on the current contents of the ArrayList, or doesn't it?). Also the DataUpdater has to remove some elements from the list every now and then, and I also read that editing a volatile variable from different threads is not possible.
To make this ArrayList a synchronized variable. However, my DataUpdater will continuously update the ArrayList, so won't this block the DataListener from reading the ArrayList?
Did I misunderstand any concepts here or is there another option to make this possible?
Volatile won't help you at all. The meaning of volatile is that changes made by thread A to a shared variable are visible to thread B immediately. Usually such changes may be in some cache visible only to the thread that made them, and volatile just tells the JVM not to do any caching or optimization that will result in the value update being delayed.
So it is not a means of synchronization. It's just a means of ensuring visibility of change. Moreover, it's change to the variable, not to the object referenced by that variable. That is, if you mark list as volatile, it will only make any difference if you assign a new list to list, not if you change the content of the list!
Your other suggestion was to make the ArrayList a synchronized variable. There is a misconception here. Variables can't be synchronized. The only thing that can be synchronized is code - either an entire method or a specific block inside it. You use an object as the synchronization monitor.
The monitor is the object itself (actually, it's a logical part of the object that is the monitor), not the variable. If you assign a different object to the same variable after synchronizing on the old value, then you won't have your old monitor available.
But in any case, it's not the object that's synchronized, it's code that you decided to synchronize using that object.
You can therefore use the list as the monitor for synchronizing the operations on it. But you can not have list synchronized.
Suppose you want to synchronize your operations using the list as a monitor, you should design it so that the writer thread doesn't hold the lock all the time. That is, it just grabs it for a single read-update, insert, etc., and then releases it. Grabs it again for the next operation, then releases it. If you synchronize the whole method or the whole update loop, the other thread will never be able to read it.
In the reading thread, you should probably do something like:
List<T> listCopy;
synchronized (list) {
listCopy = new ArrayList<>(list);
}
// Use listCopy for displaying the value rather than list
This is because displaying is potentially slow - it may involve I/O, updating GUI etc. So to minimize the lock time, you just copy the values from the list, and then release the monitor so that the updating thread can do its work.
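Put together, the pattern described above (short critical sections in the writer, a copy under the lock in the reader) could look like the following minimal sketch. The class SnapshotList, its method names, and the Integer element type are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotList {
    private final List<Integer> list = new ArrayList<>();

    // Writer: hold the monitor only for one short operation at a time.
    void add(int value) {
        synchronized (list) {
            list.add(value);
        }
    }

    // Reader: copy under the lock, then work on the copy without holding the lock.
    List<Integer> snapshot() {
        synchronized (list) {
            return new ArrayList<>(list);
        }
    }

    // Runs one updater thread while the calling thread takes snapshots.
    static int runDemo(int count) {
        SnapshotList shared = new SnapshotList();
        Thread updater = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                shared.add(i);
            }
        });
        updater.start();
        // Iterating the copy is safe even though the updater keeps adding elements.
        long partialSum = 0;
        for (int value : shared.snapshot()) {
            partialSum += value;
        }
        try {
            updater.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.snapshot().size();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(10_000)); // all additions are present after join: prints 10000
    }
}
```

Keeping the list private and synchronizing inside the two methods also ensures that no caller can touch the list without holding the monitor.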
Other than that, there are many types of objects in the java.util.concurrent package etc. that are designed to help in situations like this, where one side is writing and the other is reading. Check the documentation - perhaps a ConcurrentLinkedDeque will work for you.
Indeed, neither of the two solutions is sufficient. You actually need to synchronize the complete iteration over the ArrayList, and every write access to the ArrayList:
synchronized(list) {
for (T t : list) {
...
}
}
and
synchronized(list) {
// read/add/modify the list
}
make the ArrayList volatile.
You can't make an ArrayList volatile. You can't make any object volatile. The only things in Java that can be volatile are fields.
In your example, list is not an ArrayList.
private static ArrayList<T> list;
list is a static field of the Main class.
The volatile keyword only matters when one thread updates the field, and another thread subsequently accesses the field.
This line updates the list, but does not update the volatile field:
list.add(e);
After executing that line, the list has changed, but the field still refers to the same list object.

Why does HashMap.get(key) need to be synchronized when change operations are synchronized?

I use the .get(...), .put(...) and .clear() operations from multiple threads on one HashMap. .put(...) and .clear() are inside a synchronized block but .get(...) is not. I can't imagine that this will cause problems but in other code I've seen .get() is pretty much always synchronized.
relevant code for get/put
Object value = map.get(key);
if(value == null) {
synchronized (map) {
value = map.get(key); // check again, might have been changed in between
if(value == null) {
map.put(key, new Value(...));
}
}
}
and clear is just:
synchronized (map) {
map.clear();
}
The write operations will invalidate caches because of the synchronized and the get(...) returns either null or an instance. I can't really see what could go wrong or what would improve by putting the .get(...) operation into a synchronized(map) block.
Here is one simple scenario that would produce a problem on unsynchronized get:
Thread A starts get, computes the hash bucket number, and gets pre-empted
Thread B calls clear(), so a smaller array of buckets gets allocated
Thread A wakes up, and may run into the index-out-of-bounds exception
Here is a more complex scenario:
Thread A locks the map for an update, and gets pre-empted
Thread B initiates a get operation, computes the hash bucket number, and gets pre-empted
Thread A wakes up, and continues the put, and realizes that the buckets need resizing
Thread A allocates new buckets, copies old content into them, and adds the new item
Thread B wakes up, and continues the search using the old bucket index on the new array of buckets.
At this point, B is probably not going to find the right item, because it is very likely to be in a hash bucket at a different index. That is why get needs to be synchronized as well.
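A minimal sketch of the get/put code from the question with the read moved inside the lock, so that no thread ever touches the HashMap while another one is restructuring it. The class SynchronizedCache and its method names are hypothetical, and a plain Object stands in for the question's Value type.

```java
import java.util.HashMap;
import java.util.Map;

class SynchronizedCache {
    private final Map<String, Object> map = new HashMap<>();

    // Both the lookup and the insertion happen while holding the map's monitor.
    Object getOrCreate(String key) {
        synchronized (map) {
            Object value = map.get(key);
            if (value == null) {
                value = new Object(); // placeholder for new Value(...)
                map.put(key, value);
            }
            return value;
        }
    }

    void clear() {
        synchronized (map) {
            map.clear();
        }
    }

    public static void main(String[] args) {
        SynchronizedCache cache = new SynchronizedCache();
        Object first = cache.getOrCreate("k");
        System.out.println(first == cache.getOrCreate("k")); // prints true
        cache.clear();
        System.out.println(first == cache.getOrCreate("k")); // prints false
    }
}
```

If the locking turns out to be a bottleneck, a ConcurrentHashMap with computeIfAbsent gives the same get-or-create behaviour without an explicit synchronized block.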
