Java fork join concurrent HashMap solve lost updates - java

I have a list of users, and each user has a sequence of places they have visited (e.g. list = 1,2,3,1,2,8,10,1, etc.). Now I want to figure out how often each place has been visited. Furthermore, I really want to use fork/join for that. My actual question is: do you know a way to use the ConcurrentHashMap here? The current problem is that there are lost updates at
map.put(i, map.get(i)+1);// lost updates here
Do you have a nice idea to solve that without locking the whole map (is there a partial lock for parts of the map, as there is for put())? I know I could create a map for each user and then join them again, but I thought perhaps someone has a better solution.
public class ForkUsers extends RecursiveAction{
ArrayList<User>users;
ConcurrentHashMap<Integer,Integer>map;
int indexfrom;
int indexto;
ForkUsers(ArrayList<User>users,ConcurrentHashMap<Integer,Integer> map,int indexfrom,int indexto){
this.users=users;
this.map=map;
this.indexfrom=indexfrom;
this.indexto=indexto;
}
void computeDirectly(User user){
for(Integer i:user.getVisitedPlaces()){
if(map.get(i)==null){
map.putIfAbsent(i, 1);
}else{
map.put(i, map.get(i)+1);// lost updates here
}
}
}
protected void compute() {
if(indexfrom==indexto){
computeDirectly(users.get(indexfrom));
}else{
int half=(indexfrom+indexto)/2;
invokeAll(new ForkUsers(users,map,indexfrom,half),new ForkUsers(users,map,half+1,indexto));
}
}
}

Even though you're using a ConcurrentHashMap, that doesn't prevent read-update-write race conditions: two threads can both call get, both add 1, and both put the same value back, so only a single update is recorded. You can either synchronize the whole read-update-write operation or (my preference) use an AtomicInteger for the value and call incrementAndGet instead.
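A minimal sketch of the AtomicInteger approach, assuming Java 8 or later and Integer place IDs. The class VisitCounter and its method names are made up for illustration and are not part of the question's code. computeIfAbsent creates the counter atomically, and incrementAndGet performs the read-update-write as one atomic step.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the question's code.
class VisitCounter {
    private final ConcurrentHashMap<Integer, AtomicInteger> counts = new ConcurrentHashMap<>();

    // Safe to call from many threads: counter creation and increment are both atomic.
    void record(int place) {
        counts.computeIfAbsent(place, p -> new AtomicInteger()).incrementAndGet();
    }

    // Returns 0 for a place that was never recorded.
    int visits(int place) {
        AtomicInteger c = counts.get(place);
        return c == null ? 0 : c.get();
    }

    // Counts every place in every list, processing the lists in parallel.
    static VisitCounter countAll(List<List<Integer>> visitsPerUser) {
        VisitCounter counter = new VisitCounter();
        visitsPerUser.parallelStream().forEach(places -> places.forEach(counter::record));
        return counter;
    }
}
```

In the question's computeDirectly, the if/else would shrink to a single call of this form on a ConcurrentHashMap<Integer, AtomicInteger>.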

Related

Thread safety writing in a map

I have a block of code provided below:
Map<String, BigDecimal> salesMap = new HashMap<>();
orderItems.parallelStream().forEach(orderItem -> {
synchronized (this) {
int itemId = orderItem.getItemId();
Item item = settingsClient.getItemByItemId(itemId);
String revenueCenterName = itemIdAndRevenueCenterNameMap.get(itemId);
updateSalesMap(salesMap, "Gross Sales: " + revenueCenterName, orderItem.getNetSales().toPlainString());
}
});
private void updateSalesMap(Map<String,BigDecimal> salesMap, String key, String amount) {
BigDecimal bd = getSalesAmount(salesMap, key);
int scale = 2;
if (StringUtils.isBlank(amount)) {
amount = "0.00";
}
BigDecimal addMe = BigDecimal.valueOf(Double.valueOf(amount)).setScale(scale, RoundingMode.HALF_UP);
salesMap.put(key, bd.add(addMe));
}
The code works fine, but if I don't use the synchronized block, I end up with varying data in the map. As far as I know, streams are thread safe, so I am curious about what's happening. I tried to use ConcurrentHashMap, but nothing seemed to change.
My idea is that the map data is not written to RAM, and reads/writes are done in the thread cache, hence we end up with varying data.
Is that correct? If so, I will use the volatile keyword rather than a synchronized block.
Note: I just found that I can't declare a variable volatile inside a method.
As far as I know, streams are thread safe, so I am curious about what's happening.
They are, as long as you only operate on the stream itself. The problem is that you manipulate another variable at the same time (the map in this case). The idea of streams is that the operations on each element are totally independent; see the ideas of functional programming.
I tried to use ConcurrentHashMap but it seems nothing changed.
The issue comes from your approach. The general idea is that individual atomic operations on ConcurrentHashMap are thread safe. However, if you perform two thread safe operations one after the other, the combination is not atomic, and so not thread safe. You need to synchronize it yourself or come up with some other solution.
In the updateSalesMap() method you first get the value from the map, do some calculations, and then update the value. This sequence of operations isn't atomic, so performing it on a ConcurrentHashMap won't change much.
One possible way to make the update atomic in this case would be to use ConcurrentHashMap.compute() (see the Javadocs).
You are doing a read operation using getSalesAmount(salesMap, key) and a write operation using salesMap.put(key, bd.add(addMe)) in separate statements. Splitting the update this way is non-atomic regardless of the kind of Map you use. The synchronized block will solve this, of course.
Alternatively, you can use ConcurrentHashMap's compute(K key, BiFunction<? super K, ? super V, ? extends V> remappingFunction) for the kind of atomicity you are looking for.
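As a rough sketch of the compute() suggestion, assuming Java 8 or later: the class SalesTotals and its method names are hypothetical stand-ins for the question's code, and the amount is parsed with new BigDecimal(String) rather than through Double. The remapping function runs atomically for its key, so no external synchronized block is needed.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the question's sales aggregation.
class SalesTotals {
    private final ConcurrentHashMap<String, BigDecimal> salesMap = new ConcurrentHashMap<>();

    // compute() runs the read-add-write for one key as a single atomic step.
    void add(String key, String amount) {
        BigDecimal addMe = new BigDecimal(amount == null || amount.trim().isEmpty() ? "0.00" : amount)
                .setScale(2, RoundingMode.HALF_UP);
        salesMap.compute(key, (k, current) -> current == null ? addMe : current.add(addMe));
    }

    // Returns zero for a key that has no entries yet.
    BigDecimal total(String key) {
        return salesMap.getOrDefault(key, BigDecimal.ZERO);
    }
}
```

salesMap.merge(key, addMe, BigDecimal::add) is an equivalent, shorter form of the same atomic update.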
I made updateSalesMap thread-safe, and that works for me:
protected synchronized void updateSalesMap(Map<String, BigDecimal> salesMap, String s, String amount) {
BigDecimal bd = updateSalesAmount(salesMap, s);
int scale = 2;
if (StringUtils.isBlank(amount)) {
amount = "0.00";
}
BigDecimal addMe = BigDecimal.valueOf(Double.valueOf(amount)).setScale(scale, RoundingMode.HALF_UP);
salesMap.put(s, bd.add(addMe));
}

Is following code Thread safe

I have a scenario where I have to maintain a Map that can be populated by multiple threads, each modifying its respective List (the unique identifier/key being the thread name). When the list size for a thread exceeds a fixed batch size, we have to persist the records in the DB.
Sample code below:
private volatile ConcurrentHashMap<String, List<T>> instrumentMap = new ConcurrentHashMap<String, List<T>>();
private ReadWriteLock lock ;
public void addAll(List<T> entityList, String threadName) {
try {
lock.readLock().lock();
List<T> instrumentList = instrumentMap.get(threadName);
if(instrumentList == null) {
instrumentList = new ArrayList<T>(batchSize);
instrumentMap.put(threadName, instrumentList);
}
if(instrumentList.size() >= batchSize -1){
instrumentList.addAll(entityList);
recordSaver.persist(instrumentList);
instrumentList.clear();
} else {
instrumentList.addAll(entityList);
}
} finally {
lock.readLock().unlock();
}
}
There is one more separate thread that runs every 2 minutes to persist all the records in the Map (to make sure something is persisted every 2 minutes and the map does not get too big). When it starts, it blocks all other threads (check the readLock and writeLock usage, where writeLock has higher priority).
if(//Some condition) {
Thread.sleep(//2 minutes);
aggregator.getLock().writeLock().lock();
List<T> instrumentList = instrumentMap .values().stream().flatMap(x->x.stream()).collect(Collectors.toList());
if(instrumentList.size() > 0) {
saver.persist(instrumentList);
instrumentMap .values().parallelStream().forEach(x -> x.clear());
aggregator.getLock().writeLock().unlock();
}
This solution works fine for almost every scenario we tested, except that sometimes some of the records go missing, i.e. they are not persisted at all although they were added to the Map.
My question is: what is the problem with this code?
Is ConcurrentHashMap not the best solution here?
Does the usage of the read/write lock have some problem here?
Should I go with sequential processing?
No, it's not thread safe.
The problem is that you are using the read lock of the ReadWriteLock. This doesn't guarantee exclusive access for making updates. You'd need to use the write lock for that.
But you don't really need to use a separate lock at all. You can simply use the ConcurrentHashMap.compute method:
instrumentMap.compute(threadName, (tn, instrumentList) -> {
if (instrumentList == null) {
instrumentList = new ArrayList<>();
}
if(instrumentList.size() >= batchSize -1) {
instrumentList.addAll(entityList);
recordSaver.persist(instrumentList);
instrumentList.clear();
} else {
instrumentList.addAll(entityList);
}
return instrumentList;
});
This allows you to update items in the list whilst also guaranteeing exclusive access to the list for a given key.
I suspect that you could split the compute call into computeIfAbsent (to add the list if one is not there) and then a computeIfPresent (to update/persist the list): the atomicity of these two operations is not necessary here. But there is no real point in splitting them up.
Additionally, instrumentMap almost certainly shouldn't be volatile. Unless you really want to reassign its value (given this code, I doubt that), remove volatile and make it final.
Similarly, non-final locks are questionable too. If you stick with using a lock, make that final too.
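Putting the points above together, here is a hedged sketch of a lock-free version with a final map. The class BatchBuffer, the Consumer standing in for recordSaver.persist, and the flush method are assumptions for illustration, and the batch threshold is simplified to "flush once the list holds at least batchSize items". Note that other updates to the map may be blocked while a compute function runs, so a slow persist call inside it can stall other threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical sketch; the Consumer stands in for recordSaver.persist.
class BatchBuffer<T> {
    private final ConcurrentHashMap<String, List<T>> instrumentMap = new ConcurrentHashMap<>();
    private final int batchSize;
    private final Consumer<List<T>> saver;

    BatchBuffer(int batchSize, Consumer<List<T>> saver) {
        this.batchSize = batchSize;
        this.saver = saver;
    }

    // compute() gives exclusive access to the list stored under this key.
    void addAll(List<T> entities, String threadName) {
        instrumentMap.compute(threadName, (name, list) -> {
            if (list == null) list = new ArrayList<>();
            list.addAll(entities);
            if (list.size() >= batchSize) {
                saver.accept(new ArrayList<>(list)); // hand over a copy, then reset
                list.clear();
            }
            return list;
        });
    }

    // Periodic flush: drains each list under the same per-key exclusivity.
    void flush() {
        for (String name : instrumentMap.keySet()) {
            instrumentMap.computeIfPresent(name, (n, list) -> {
                if (!list.isEmpty()) {
                    saver.accept(new ArrayList<>(list));
                    list.clear();
                }
                return list;
            });
        }
    }
}
```

Because both addAll and flush go through the map's atomic per-key operations, a record is either persisted by the batch path or still in the list when the flush reaches that key.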

Multithread communication: how good is the use of Atomic Variables like AtomicInteger? why is there no AtomicFloat?

Intro:
I want to create a multithreaded Android app. My problem is the communication between the threads. I read about it and came across things like the Looper/Handler design, which seemed quite involved, and atomic variables like AtomicInteger. For now, I use AtomicInteger for communication, but since I am not very experienced in Java, I am not sure whether that is bad in my case or whether there is a better solution for my particular purpose. I also got a little suspicious of my method when I noticed that I actually need something like AtomicFloat, but it does not exist. I felt like I was misusing the concept. I also found that you can write an AtomicFloat yourself, but I am just not sure if I am on the right track or if there is a better technique.
Question:
Is it ok/good to use Atomic Variables and implement also AtomicFloat for my particular purpose (described below) or is there a better way of handling the communication?
Purpose/Architecture of the App using AtomicVariables so far:
I have 4 Threads with the following purpose:
1.SensorThread: Reads sensor data and saves the most recent values in AtomicVariables like
AtomicFloat gyro_z,AtomicFloat gyro_y, ...
2.CommunicationThread: Communicates with the PC, interprets commands which come from the socket, and sets the state of the app in terms of an AtomicInteger:
AtomicInteger state;
3.UIThread: Displays current sensor values from
AtomicFloat gyro_z,AtomicFloat gyro_y,
4.ComputationThread: uses sensor values AtomicFloat gyro_z,AtomicFloat gyro_y, ... and state AtomicInteger state to perform calculation and send commands over USB.
You basically have a readers-writers problem, with two readers and (for the moment) only one writer. If you just want to pass simple types between threads, an AtomicInteger or a similarly implemented AtomicFloat will be just fine.
However, a more accommodating solution, which would let you work with more complex data types, is a ReadWriteLock protecting the code where you read or write your object data:
e.g.:
private ReadWriteLock readWriteLock = new ReentrantReadWriteLock(); //the reentrant impl
....
public void readMethod() {
readWriteLock.readLock().lock();
try {
//code that simply _reads_ your object
} finally {
readWriteLock.readLock().unlock();
}
}
public void writeMethod() {
readWriteLock.writeLock().lock();
try {
//... code that modifies your shared object / objects
} finally {
readWriteLock.writeLock().unlock();
}
}
This allows either a single writer or multiple concurrent readers to access your shared objects at any one time.
This would enable you for example to work with a complex type that looks like this:
public class SensorRead {
public java.util.Date dateTimeForSample;
public float value;
}
With this data type you have to take care that the two fields are set and read together, safely and atomically. The AtomicXXX types for single primitive values are no longer enough here.
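A small sketch of how the lock pattern above could guard the two fields of a sensor sample. The class LatestSensorRead, its nested Sample type, and the method names are hypothetical, not taken from the question's app.

```java
import java.util.Date;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical holder for the most recent sample; names are illustrative.
class LatestSensorRead {
    // Immutable copy handed to readers.
    static final class Sample {
        final Date dateTimeForSample;
        final float value;
        Sample(Date dateTimeForSample, float value) {
            this.dateTimeForSample = dateTimeForSample;
            this.value = value;
        }
    }

    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private Date dateTimeForSample = new Date(0);
    private float value;

    // Writer: both fields change under the write lock, so readers never see a mix.
    void update(Date when, float newValue) {
        lock.writeLock().lock();
        try {
            dateTimeForSample = when;
            value = newValue;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reader: several threads may hold the read lock at once.
    Sample snapshot() {
        lock.readLock().lock();
        try {
            return new Sample(new Date(dateTimeForSample.getTime()), value);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

In the architecture described in the question, the sensor thread would call update, while the UI and computation threads would call snapshot.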
You have to first ask yourself if you truly need the functionality of a theoretical AtomicFloat. The only benefit you could have over a simple volatile float is the compareAndSet and the addAndGet operations (since I guess increment and decrement don't really make sense in the case of floats).
If you really need those, you could probably implement them by studying the code of AtomicInteger e.g.:
public final int addAndGet(int delta) {
for (;;) {
int current = get();
int next = current + delta;
if (compareAndSet(current, next))
return next;
}
}
Now the only problem here is that compareAndSet uses platform-specific calls that don't exist for floats, so you'll probably need to emulate it by using the Float.floatToIntBits method to obtain an int, then use the CAS of AtomicInteger, something like:
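// Store the float's bit pattern in an AtomicInteger so its CAS can be reused.
private final AtomicInteger bits = new AtomicInteger(Float.floatToIntBits(0f));
public final float get() {
return Float.intBitsToFloat(bits.get());
}
public final void set(float newValue) {
bits.set(Float.floatToIntBits(newValue));
}
public final boolean compareAndSet(float expect, float next) {
// Compares bit patterns, so 0.0f and -0.0f count as different values.
return bits.compareAndSet(Float.floatToIntBits(expect),
Float.floatToIntBits(next));
}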
public final float addAndGet(float delta) {
for (;;) {
float current = get();
float next = current + delta;
if (compareAndSet(current, next))
return next;
}
}

Atomically incrementing counters stored in ConcurrentHashMap

I would like to collect some metrics from various places in a web app. To keep it simple, all of these will be counters, so the only modifying operation is to increment them by 1.
The increments will be concurrent and frequent. Reads (dumping the stats) are rare.
I was thinking of using a ConcurrentHashMap. The issue is how to increment the counters correctly. Since the map doesn't have an "increment" operation, I need to read the current value first, increment it, then put the new value in the map. Without more code, this is not an atomic operation.
Is it possible to achieve this without synchronization (which would defeat the purpose of the ConcurrentHashMap)? Do I need to look at Guava?
Thanks for any pointers.
P.S.
There is a related question on SO (Most efficient way to increment a Map value in Java) but focused on performance and not multi-threading
UPDATE
For those arriving here through searches on the same topic: besides the answers below, there's a useful presentation which incidentally covers the same topic. See slides 24-33.
In Java 8:
ConcurrentHashMap<String, LongAdder> map = new ConcurrentHashMap<>();
map.computeIfAbsent("key", k -> new LongAdder()).increment();
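A slightly fuller sketch of the same idea, wrapped in a class so it can be exercised from several threads. The class name Metrics and its methods are illustrative assumptions. LongAdder.sum() is not an atomic snapshot while increments are still running, which is usually acceptable for a rare stats dump.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical metrics registry built on ConcurrentHashMap + LongAdder.
class Metrics {
    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Hot path: creates the adder on first use, then increments without locking the map.
    void increment(String name) {
        counters.computeIfAbsent(name, k -> new LongAdder()).increment();
    }

    // Rare read path: copies the current sums into a sorted map.
    Map<String, Long> dump() {
        Map<String, Long> out = new TreeMap<>();
        counters.forEach((name, adder) -> out.put(name, adder.sum()));
        return out;
    }
}
```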
Guava's new AtomicLongMap (in release 11) might address this need.
You're pretty close. Why don't you try something like a ConcurrentHashMap<Key, AtomicLong>?
If your Keys (metrics) are unchanging, you could even just use a standard HashMap (they are threadsafe if readonly, but you'd be well advised to make this explicit with an ImmutableMap from Google Collections or Collections.unmodifiableMap, etc.).
This way, you can use map.get(myKey).incrementAndGet() to bump statistics.
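For the fixed-key case, a minimal sketch might look like the following. FixedMetrics is a made-up name, and Collections.unmodifiableMap is used in place of Guava's ImmutableMap to avoid an extra dependency. The map is never modified after construction, so only the AtomicLong values change.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter set whose keys are fixed at construction time.
class FixedMetrics {
    private final Map<String, AtomicLong> counters;

    FixedMetrics(String... names) {
        Map<String, AtomicLong> m = new HashMap<>();
        for (String name : names) m.put(name, new AtomicLong());
        // Read-only from here on, and published through a final field.
        this.counters = Collections.unmodifiableMap(m);
    }

    // Throws NullPointerException for a name that was not registered.
    long increment(String name) {
        return counters.get(name).incrementAndGet();
    }

    long get(String name) {
        return counters.get(name).get();
    }
}
```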
Other than going with AtomicLong, you can do the usual cas-loop thing:
private final ConcurrentMap<Key,Long> counts =
new ConcurrentHashMap<Key,Long>();
public void increment(Key key) {
if (counts.putIfAbsent(key, 1L) == null) {
return;
}
Long old;
do {
old = counts.get(key);
} while (!counts.replace(key, old, old+1)); // Assumes no removal.
}
(I've not written a do-while loop for ages.)
For small values the Long will probably be "cached". For larger values, it may require allocation. But the allocations are actually extremely fast (and you can cache further); it depends on what you expect in the worst case.
I needed to do the same.
I'm using ConcurrentHashMap + AtomicInteger.
Also, a ReentrantReadWriteLock was introduced for an atomic flush (very similar behavior).
Tested with 10 keys and 10 threads per key. Nothing was lost.
I just haven't tried several flushing threads yet, but I hope it will work.
The massive single-user-mode flush is bothering me...
I want to remove the read/write lock and break the flushing down into small pieces. Tomorrow.
private ConcurrentHashMap<String,AtomicInteger> counters = new ConcurrentHashMap<String, AtomicInteger>();
private ReadWriteLock rwLock = new ReentrantReadWriteLock();
public void count(String invoker) {
rwLock.readLock().lock();
try{
AtomicInteger currentValue = counters.get(invoker);
// if entry is absent - initialize it. If other thread has added value before - we will yield and not replace existing value
if(currentValue == null){
// value we want to init with
AtomicInteger newValue = new AtomicInteger(0);
// try to put and get old
AtomicInteger oldValue = counters.putIfAbsent(invoker, newValue);
// if old value not null - our insertion failed, lets use old value as it's in the map
// if old value is null - our value was inserted - lets use it
currentValue = oldValue != null ? oldValue : newValue;
}
// counter +1
currentValue.incrementAndGet();
}finally {
rwLock.readLock().unlock();
}
}
/**
* @return Map with counting results
*/
public Map<String, Integer> getCount() {
// stop all updates (readlocks)
rwLock.writeLock().lock();
try{
HashMap<String, Integer> resultMap = new HashMap<String, Integer>();
// read all Integers to a new map
for(Map.Entry<String,AtomicInteger> entry: counters.entrySet()){
resultMap.put(entry.getKey(), entry.getValue().intValue());
}
// reset ConcurrentMap
counters.clear();
return resultMap;
}finally {
rwLock.writeLock().unlock();
}
}
I did a benchmark to compare the performance of LongAdder and AtomicLong.
LongAdder had a better performance in my benchmark: for 500 iterations using a map with size 100 (10 concurrent threads), the average time for LongAdder was 1270ms while that for AtomicLong was 1315ms.

Lock Free Array Element Swapping

In a multi-threaded environment, we use synchronized locking to swap array elements in a thread-safe way.
// a is char array.
synchronized(a) {
char tmp = a[1];
a[1] = a[0];
a[0] = tmp;
}
Is it possible to use the following API in the above situation to get lock-free array element swapping? If yes, how?
http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/AtomicReferenceFieldUpdater.html#compareAndSet%28T,%20V,%20V%29
Regardless of API used you won't be able to achieve both thread-safe and lock-free array element swapping in Java.
The element swapping requires multiple read and update operations that need to be performed atomically. To simulate the atomicity you need a lock.
EDIT:
An alternative to lock-free algorithm might be micro-locking: instead of locking the entire array it’s possible to lock only elements that are being swapped.
The value of this approach is questionable. That is to say, if the algorithm that requires swapping elements can guarantee that different threads work on different parts of the array, then no synchronisation is required.
In the opposite case, when different threads can actually attempt to swap overlapping elements, the thread execution order will matter. For example, if one thread tries to swap elements 0 and 1 of the array and another simultaneously attempts to swap 1 and 2, the result will depend entirely on the order of execution: for an initial {'a','b','c'} you can end up with either {'b','c','a'} or {'c','a','b'}. Hence you'd require a more sophisticated synchronisation.
Here is a quick and dirty class for character arrays that implements micro locking:
import java.util.concurrent.atomic.AtomicIntegerArray;
class SyncCharArray {
final private char array [];
final private AtomicIntegerArray locktable;
SyncCharArray (char array[])
{
this.array = array;
// create a lock table the size of the array
// to track currently locked elements
this.locktable = new AtomicIntegerArray(array.length);
for (int i = 0;i<array.length;i++) unlock(i);
}
void swap (int idx1, int idx2)
{
// return if the same element
if (idx1==idx2) return;
// lock element with the smaller index first to avoid possible deadlock
lock(Math.min(idx1,idx2));
lock(Math.max(idx1,idx2));
char tmp = array[idx1];
array [idx1] = array[idx2];
unlock(idx1);
array[idx2] = tmp;
unlock(idx2);
}
private void lock (int idx)
{
// if the required element is locked, then wait ...
while (!locktable.compareAndSet(idx,0,1)) Thread.yield();
}
private void unlock (int idx)
{
locktable.set(idx,0);
}
}
You’d need to create the SyncCharArray and then pass it to all threads that require swapping:
char array [] = {'a','b','c','d','e','f'};
SyncCharArray sca = new SyncCharArray(array);
// then pass sca to any threads that require swapping
// then within a thread
sca.swap(1,3);
Hope that makes some sense.
UPDATE:
Some testing demonstrated that unless you have a great number of threads accessing the array simultaneously (100+ on run-of-the-mill hardware), a simple synchronized (array) {} works much faster than the elaborate synchronisation.
// lock-free swap of array[i] and array[j] (assumes the array contains only non-null elements and i != j)
static <T> void swap(AtomicReferenceArray<T> array, int i, int j) {
while (true) {
T ai = array.getAndSet(i, null);
if (ai == null) continue;
T aj = array.getAndSet(j, null);
if (aj == null) {
array.set(i, ai);
continue;
}
array.set(i, aj);
array.set(j, ai);
break;
}
}
The closest you're going to get is java.util.concurrent.atomic.AtomicReferenceArray, which offers CAS-based operations such as boolean compareAndSet(int i, E expect, E update). It does not have a swap(int pos1, int pos2) operation though so you're going to have to emulate it with two compareAndSet calls.
"The principal threat to scalability in concurrent applications is the exclusive resource lock." - Java Concurrency in Practice.
I think you need a lock, but as others mention that lock can be more granular than it is at present.
You can use lock striping like java.util.concurrent.ConcurrentHashMap.
The API you mentioned, as already stated by others, may only be used to set values of a single object, not an array. Nor even for two objects simultaneously, so you wouldn't have a secure swap anyway.
The solution depends on your specific situation. Can the array be replaced by another data structure? Is it also changing in size concurrently?
If you must use an array, it could be changed to hold updatable objects (neither primitive types nor Character), and you could synchronize on both objects being swapped. A data structure like this would work:
public class CharValue {
public char c;
}
CharValue[] a = new CharValue[N];
Remember to use a deterministic synchronization order to avoid deadlocks (http://en.wikipedia.org/wiki/Deadlock#Circular_wait_prevention)! You could simply follow index ordering.
If items should also be added to or removed from the collection concurrently, you could use a Map instead, synchronize swaps on the Map.Entry objects, and use a synchronized Map implementation. A simple List wouldn't do, because there are no isolated structures for retaining the values (or you don't have access to them).
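A hedged sketch of such an index-ordered swap: the OrderedSwap class and the CharValue constructor are additions for illustration, and the code assumes every array slot holds a distinct, non-null CharValue that is never replaced.

```java
// Mutable holder, as in the answer, with a convenience constructor added.
class CharValue {
    char c;
    CharValue(char c) { this.c = c; }
}

// Hypothetical helper showing index-ordered locking.
class OrderedSwap {
    // Always locks the element with the lower index first, so two threads
    // can never hold one monitor each while waiting for the other's.
    static void swap(CharValue[] a, int i, int j) {
        if (i == j) return;
        CharValue first = a[Math.min(i, j)];
        CharValue second = a[Math.max(i, j)];
        synchronized (first) {
            synchronized (second) {
                char tmp = first.c;
                first.c = second.c;
                second.c = tmp;
            }
        }
    }
}
```

Only the two elements involved are locked, so swaps on disjoint pairs can proceed in parallel.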
I don't think the AtomicReferenceFieldUpdater is meant for array access, and even if it were, it only provides atomic guarantees on one reference at a time. AFAIK, all the classes in java.util.concurrent.atomic only provide atomic access to one reference at a time. In order to change two or more references as one atomic operation, you must use some kind of locking.
