I would like to collect some metrics from various places in a web app. To keep it simple, all these will be counters and therefore the only modifier operation is to increment them by 1.
The increments will be concurrent and frequent. Reads (dumping the stats) are rare.
I was thinking of using a ConcurrentHashMap. The issue is how to increment the counters correctly. Since the map doesn't have an "increment" operation, I need to read the current value first, increment it, and then put the new value in the map. Without more code, this is not an atomic operation.
Is it possible to achieve this without synchronization (which would defeat the purpose of the ConcurrentHashMap)? Do I need to look at Guava?
Thanks for any pointers.
P.S.
There is a related question on SO (Most efficient way to increment a Map value in Java) but focused on performance and not multi-threading
UPDATE
For those arriving here through searches on the same topic: besides the answers below, there's a useful presentation which incidentally covers the same topic. See slides 24-33.
In Java 8:
ConcurrentHashMap<String, LongAdder> map = new ConcurrentHashMap<>();
map.computeIfAbsent("key", k -> new LongAdder()).increment();
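To show how the two lines above might fit into a metrics holder, here is a minimal sketch. The class name MetricCounters, the counter name "requests", and the thread and iteration counts are assumptions made for illustration. The rare read path uses LongAdder.sum(), which is not an atomic snapshot if increments are still in flight.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class MetricCounters {
    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    /** Hot path: creates the adder on first use, then increments it. */
    void increment(String name) {
        counters.computeIfAbsent(name, k -> new LongAdder()).increment();
    }

    /** Rare path: copies the current sums into a sorted map for reporting. */
    Map<String, Long> snapshot() {
        Map<String, Long> result = new TreeMap<>();
        counters.forEach((name, adder) -> result.put(name, adder.sum()));
        return result;
    }

    /** Starts 8 threads that each bump the same (hypothetical) counter 10,000 times. */
    static long demo() {
        MetricCounters metrics = new MetricCounters();
        Thread[] workers = new Thread[8];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 10_000; i++) {
                    metrics.increment("requests");
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return metrics.snapshot().get("requests");
    }
}
```

Once all writers have finished, the sum should equal the total number of increments, since no update is lost.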
Guava's new AtomicLongMap (in release 11) might address this need.
You're pretty close. Why don't you try something like a ConcurrentHashMap<Key, AtomicLong>?
If your Keys (metrics) are unchanging, you could even just use a standard HashMap (they are threadsafe if readonly, but you'd be well advised to make this explicit with an ImmutableMap from Google Collections or Collections.unmodifiableMap, etc.).
This way, you can use map.get(myKey).incrementAndGet() to bump statistics.
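A minimal sketch of the fixed-key variant follows, assuming the set of metrics is known up front. The Metric enum and the class name are hypothetical. The map itself is never modified after construction; only the AtomicLong values change.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

class FixedKeyCounters {
    /** Hypothetical metric names; real keys would come from the application. */
    enum Metric { LOGIN, LOGOUT, ERROR }

    // Final field plus unmodifiable wrapper: the key set is fixed after construction.
    private final Map<Metric, AtomicLong> counters;

    FixedKeyCounters() {
        Map<Metric, AtomicLong> m = new EnumMap<>(Metric.class);
        for (Metric metric : Metric.values()) {
            m.put(metric, new AtomicLong());
        }
        this.counters = Collections.unmodifiableMap(m);
    }

    /** Atomically bumps the counter and returns the new value. */
    long increment(Metric metric) {
        return counters.get(metric).incrementAndGet();
    }

    long get(Metric metric) {
        return counters.get(metric).get();
    }
}
```

Because every key is inserted before the object is shared, readers only ever call get() on the map, which is safe for a map that is no longer being modified.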
Other than going with AtomicLong, you can do the usual CAS-loop thing:
private final ConcurrentMap<Key,Long> counts =
    new ConcurrentHashMap<Key,Long>();

public void increment(Key key) {
    // First increment for this key: insert 1 and finish.
    if (counts.putIfAbsent(key, 1L) == null) {
        return;
    }
    Long old;
    do {
        old = counts.get(key);
    } while (!counts.replace(key, old, old+1)); // Assumes no removal.
}
(I've not written a do-while loop for ages.)
For small values the Long will probably be "cached". For longer values, it may require allocation. But the allocations are actually extremely fast (and you can cache further) - depends upon what you expect, in the worst case.
I needed to do the same.
I'm using ConcurrentHashMap + AtomicInteger.
I also introduced a ReentrantReadWriteLock to make the flush atomic (very similar behavior).
Tested with 10 keys and 10 threads per key. Nothing was lost.
I haven't tried several flushing threads yet, but I hope it will work.
The massive "single-user mode" flush is bothering me...
I want to remove the RWLock and break the flush down into small pieces. Tomorrow.
private ConcurrentHashMap<String,AtomicInteger> counters = new ConcurrentHashMap<String, AtomicInteger>();
private ReadWriteLock rwLock = new ReentrantReadWriteLock();
public void count(String invoker) {
rwLock.readLock().lock();
try{
AtomicInteger currentValue = counters.get(invoker);
// if entry is absent - initialize it. If other thread has added value before - we will yield and not replace existing value
if(currentValue == null){
// value we want to init with
AtomicInteger newValue = new AtomicInteger(0);
// try to put and get old
AtomicInteger oldValue = counters.putIfAbsent(invoker, newValue);
// if old value not null - our insertion failed, lets use old value as it's in the map
// if old value is null - our value was inserted - lets use it
currentValue = oldValue != null ? oldValue : newValue;
}
// counter +1
currentValue.incrementAndGet();
}finally {
rwLock.readLock().unlock();
}
}
/**
* @return Map with counting results
*/
public Map<String, Integer> getCount() {
// stop all updates (readlocks)
rwLock.writeLock().lock();
try{
HashMap<String, Integer> resultMap = new HashMap<String, Integer>();
// read all Integers to a new map
for(Map.Entry<String,AtomicInteger> entry: counters.entrySet()){
resultMap.put(entry.getKey(), entry.getValue().intValue());
}
// reset ConcurrentMap
counters.clear();
return resultMap;
}finally {
rwLock.writeLock().unlock();
}
}
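One way the "flush in small pieces" idea mentioned above could look, as a sketch and not the author's code: each counter is drained on its own with AtomicInteger.getAndSet(0), so no read/write lock is needed. It assumes Java 8 (for computeIfAbsent) and that keys may stay in the map after a flush. The class name is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class FlushableCounters {
    private final ConcurrentHashMap<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    void count(String invoker) {
        counters.computeIfAbsent(invoker, k -> new AtomicInteger()).incrementAndGet();
    }

    /** Drains each counter on its own; entries stay in the map and restart from zero. */
    Map<String, Integer> flush() {
        Map<String, Integer> result = new HashMap<>();
        for (Map.Entry<String, AtomicInteger> entry : counters.entrySet()) {
            // An increment lands either before this call (reported now)
            // or after it (reported by the next flush).
            int drained = entry.getValue().getAndSet(0);
            if (drained != 0) {
                result.put(entry.getKey(), drained);
            }
        }
        return result;
    }
}
```

The trade-off is that the result is not a single point-in-time snapshot across all keys, and the map is never cleared, so it only suits a bounded key set.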
I did a benchmark to compare the performance of LongAdder and AtomicLong.
LongAdder had a better performance in my benchmark: for 500 iterations using a map with size 100 (10 concurrent threads), the average time for LongAdder was 1270ms while that for AtomicLong was 1315ms.
Related
I have a block of code provided below:
Map<String, BigDecimal> salesMap = new HashMap<>();
orderItems.parallelStream().forEach(orderItem -> {
synchronized (this) {
int itemId = orderItem.getItemId();
Item item = settingsClient.getItemByItemId(itemId);
String revenueCenterName = itemIdAndRevenueCenterNameMap.get(itemId);
updateSalesMap(salesMap, "Gross Sales: " + revenueCenterName, orderItem.getNetSales().toPlainString());
}
});
private void updateSalesMap(Map<String,BigDecimal> salesMap, String key, String amount) {
BigDecimal bd = getSalesAmount(salesMap, key);
int scale = 2;
if (StringUtils.isBlank(amount)) {
amount = "0.00";
}
BigDecimal addMe = BigDecimal.valueOf(Double.valueOf(amount)).setScale(scale, RoundingMode.HALF_UP);
salesMap.put(key, bd.add(addMe));
}
The code works fine, but if I don't use the synchronized block, the map ends up with varying data. As far as I know, streams are thread safe, so I am curious about what is happening. I tried using a ConcurrentHashMap, but nothing seemed to change.
My guess is that the map data is not written to RAM and that reads and writes happen in a per-thread cache, so we end up with varying data.
Is that correct? If so, I will use the volatile keyword rather than a synchronized block.
Note: I just found out that I can't declare a variable volatile inside a method.
As far I know, the streams are thread safe, so I get curious about whats happening.
They are, as long as you only operate on the stream itself. The problem is that you are manipulating another variable at the same time (the map, in this case). The idea of streams is that the operations on each element are totally independent; see the ideas of functional programming.
I tried to use ConcurrentHashMap but it seems nothing changed.
The issue comes from your approach. The general idea is that atomic operations on ConcurrentHashMap are thread safe. However, if you perform two thread safe operations together, it won't be atomic and thread safe. You need to synchronize it yourself or come up with some other solution.
In updateSalesMap() method you first get value from the map, do some calculations and then update the value. This sequence of operations isn't atomic - performing them on ConcurrentHashMap won't change much.
One possible way to achieve concurrency in this case would be to use ConcurrentHashMap.compute() (see the Javadocs).
You are doing a read using getSalesAmount(salesMap, key) and a write using salesMap.put(key, bd.add(addMe)) in separate statements. Splitting the operation like this makes it non-atomic, irrespective of the kind of Map you use. The synchronized block will solve this, of course.
Alternatively, you can use ConcurrentHashMap's compute(K key, BiFunction<? super K, ? super V, ? extends V> remappingFunction) for the kind of atomicity you are looking for.
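As a sketch of the atomic-update approach for the sales totals, the example below uses ConcurrentHashMap.merge(), a sibling of compute() that fits "add to the existing value or insert it". The input shape (a list of key/amount pairs) and the class name are assumptions, and the amount is parsed with new BigDecimal(String) instead of going through Double as the original code does.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SalesTotals {
    /**
     * Adds each amount to its key's running total.
     * Each element is a hypothetical {key, amount} pair.
     */
    static Map<String, BigDecimal> total(List<String[]> keyAndAmount) {
        ConcurrentHashMap<String, BigDecimal> salesMap = new ConcurrentHashMap<>();
        keyAndAmount.parallelStream().forEach(pair -> {
            BigDecimal addMe = new BigDecimal(pair[1]).setScale(2, RoundingMode.HALF_UP);
            // Read, add, and write happen as one atomic step for this key.
            salesMap.merge(pair[0], addMe, BigDecimal::add);
        });
        return salesMap;
    }
}
```

With this shape, no synchronized block is needed around the map update, because the read and the write can no longer be interleaved by another thread.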
I made the updateSalesMap method synchronized, and that works for me:
protected synchronized void updateSalesMap(Map<String, BigDecimal> salesMap, String s, String amount) {
BigDecimal bd = updateSalesAmount(salesMap, s);
int scale = 2;
if (StringUtils.isBlank(amount)) {
amount = "0.00";
}
BigDecimal addMe = BigDecimal.valueOf(Double.valueOf(amount)).setScale(scale, RoundingMode.HALF_UP);
salesMap.put(s, bd.add(addMe));
}
I have a scenario where I have to maintain a Map that can be populated by multiple threads, each modifying its respective List (the unique identifier/key being the thread name). When the list size for a thread exceeds a fixed batch size, we have to persist the records in the DB.
Sample code below:
private volatile ConcurrentHashMap<String, List<T>> instrumentMap = new ConcurrentHashMap<String, List<T>>();
private ReadWriteLock lock ;
public void addAll(List<T> entityList, String threadName) {
try {
lock.readLock().lock();
List<T> instrumentList = instrumentMap.get(threadName);
if(instrumentList == null) {
instrumentList = new ArrayList<T>(batchSize);
instrumentMap.put(threadName, instrumentList);
}
if(instrumentList.size() >= batchSize -1){
instrumentList.addAll(entityList);
recordSaver.persist(instrumentList);
instrumentList.clear();
} else {
instrumentList.addAll(entityList);
}
} finally {
lock.readLock().unlock();
}
}
There is one more separate thread that runs every 2 minutes to persist all the records in the Map (to make sure something is persisted every 2 minutes and the map does not get too big). When it starts, it blocks all other threads (check the readLock and writeLock usage, where writeLock has higher priority):
if(//Some condition) {
Thread.sleep(//2 minutes);
aggregator.getLock().writeLock().lock();
List<T> instrumentList = instrumentMap .values().stream().flatMap(x->x.stream()).collect(Collectors.toList());
if(instrumentList.size() > 0) {
saver.persist(instrumentList);
instrumentMap .values().parallelStream().forEach(x -> x.clear());
aggregator.getLock().writeLock().unlock();
}
This solution works fine in almost every scenario we tested, except that sometimes some records go missing, i.e. they are not persisted at all although they were added to the Map correctly.
My question is: what is the problem with this code?
Is ConcurrentHashMap not the best solution here?
Does the usage of the read/write lock cause a problem here?
Should I go with sequential processing?
No, it's not thread safe.
The problem is that you are using the read lock of the ReadWriteLock. This doesn't guarantee exclusive access for making updates. You'd need to use the write lock for that.
But you don't really need to use a separate lock at all. You can simply use the ConcurrentHashMap.compute method:
instrumentMap.compute(threadName, (tn, instrumentList) -> {
if (instrumentList == null) {
instrumentList = new ArrayList<>();
}
if(instrumentList.size() >= batchSize -1) {
instrumentList.addAll(entityList);
recordSaver.persist(instrumentList);
instrumentList.clear();
} else {
instrumentList.addAll(entityList);
}
return instrumentList;
});
This allows you to update items in the list whilst also guaranteeing exclusive access to the list for a given key.
I suspect that you could split the compute call into computeIfAbsent (to add the list if one is not there) and then a computeIfPresent (to update/persist the list): the atomicity of these two operations is not necessary here. But there is no real point in splitting them up.
Additionally, instrumentMap almost certainly shouldn't be volatile. Unless you really want to reassign its value (given this code, I doubt that), remove volatile and make it final.
Similarly, non-final locks are questionable too. If you stick with using a lock, make that final too.
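A variation on the compute-based approach is sketched below. The ConcurrentHashMap Javadoc advises that the function passed to compute should be short and simple, because other updates to the map may block while it runs. This sketch therefore only swaps the full list out inside compute and persists it afterwards. The Consumer named saver is a hypothetical stand-in for recordSaver.persist, and the batch threshold is checked after adding, which differs slightly from the original condition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

class BatchAggregator<T> {
    private final ConcurrentHashMap<String, List<T>> instrumentMap = new ConcurrentHashMap<>();
    private final int batchSize;
    private final Consumer<List<T>> saver; // hypothetical stand-in for recordSaver.persist

    BatchAggregator(int batchSize, Consumer<List<T>> saver) {
        this.batchSize = batchSize;
        this.saver = saver;
    }

    void addAll(List<T> entityList, String threadName) {
        List<List<T>> full = new ArrayList<>(1); // carries a full batch out of the lambda
        instrumentMap.compute(threadName, (name, list) -> {
            List<T> merged = list;
            if (merged == null) {
                merged = new ArrayList<>();
            }
            merged.addAll(entityList);
            if (merged.size() < batchSize) {
                return merged;
            }
            full.add(merged);
            return new ArrayList<>(); // start a fresh batch for this key
        });
        full.forEach(saver); // persist only after compute() has returned
    }

    /** Periodic flush: atomically detaches each key's list, then persists it. */
    void flushAll() {
        for (String key : instrumentMap.keySet()) {
            List<T> drained = instrumentMap.remove(key);
            if (drained != null && !drained.isEmpty()) {
                saver.accept(drained);
            }
        }
    }
}
```

Once a list has been swapped out or removed, no other thread can reach it through the map, so it can be persisted without holding any lock.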
I was trying to get the maximum of a calculated value in a loop, and I wanted it to be thread safe. So I decided to use AtomicInteger and Math.max, but I can't find a way to make the operation atomic.
AtomicInteger value = new AtomicInteger(0);
// Having some cycle here... {
Integer anotherCalculatedValue = ...;
value.set(Math.max(value.get(), anotherCalculatedValue));
}
return value.get()
The problem is that I perform two operations, so this is not thread safe. How can I solve this? Is using synchronized the only way?
If Java 8 is available you can use:
AtomicInteger value = new AtomicInteger(0);
Integer anotherCalculatedValue = ...;
value.getAndAccumulate(anotherCalculatedValue, Math::max);
Which from the specification will:
Atomically updates the current value with the results of
applying the given function to the current and given values,
returning the previous value.
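A small self-contained sketch of the same idea, using accumulateAndGet (the sibling method that returns the updated value instead of the previous one). The class name and the use of a parallel IntStream to produce candidate values are assumptions for the demo.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class RunningMax {
    private final AtomicInteger max = new AtomicInteger(Integer.MIN_VALUE);

    /** Atomically raises the stored maximum if candidate is larger; returns the new maximum. */
    int offer(int candidate) {
        return max.accumulateAndGet(candidate, Math::max);
    }

    int get() {
        return max.get();
    }

    /** Feeds 0..9999 from parallel stream workers and returns the final maximum. */
    static int demo() {
        RunningMax running = new RunningMax();
        IntStream.range(0, 10_000).parallel().forEach(running::offer);
        return running.get();
    }
}
```

The function may be re-applied when threads contend, so it should be side-effect free, which Math::max is.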
I have a ConcurrentHashMap which I am populating from multiple threads.
private static Map<DataCode, Long> errorMap = new ConcurrentHashMap<DataCode, Long>();
public static void addError(DataCode error) {
if (errorMap.keySet().contains(error)) {
errorMap.put(error, errorMap.get(error) + 1);
} else {
errorMap.put(error, 1L);
}
}
My addError method above is called from multiple threads, which populate errorMap. I am not sure whether this is thread safe. Is there anything wrong with what I am doing here?
Any explanation of why it can skip updates will help me to understand better.
Whether this is safe depends on what you mean. It won't throw exceptions or corrupt the map, but it can skip updates. Consider:
Thread1: errorMap.get(error) returns 1
Thread2: errorMap.get(error) returns 1
Thread1: errorMap.put(error, 1+1);
Thread2: errorMap.put(error, 1+1);
A similar race exists around the keySet().contains(error) operation. To fix this you'll need to use atomic operations to update the map.
On Java 8, this is easy:
errorMap.compute(error, (key, oldValue) -> oldValue == null ? 1L : oldValue + 1L);
On older versions of Java you need to use a compare-and-update loop:
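// Note: errorMap must be declared as a ConcurrentMap here, because
// putIfAbsent and replace are not part of the pre-Java 8 Map interface.
Long prevValue;
boolean done;
do {
    prevValue = errorMap.get(error);
    if (prevValue == null) {
        // putIfAbsent returns null when our value was inserted
        done = errorMap.putIfAbsent(error, 1L) == null;
    } else {
        done = errorMap.replace(error, prevValue, prevValue + 1);
    }
} while (!done);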
With this code, if two threads race one may end up retrying its update, but they'll get the right value in the end.
Alternately, you can also use Guava's AtomicLongMap which does all the thread-safety magic for you and gets higher performance (by avoiding all those boxing operations, among other things):
errorAtomicLongMap.incrementAndGet(error);
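If adding Guava is not an option, a Java 8 alternative to the compute call is ConcurrentHashMap.merge with Long::sum, sketched below. String keys stand in for the DataCode type, which is not shown in the question, and the class name is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

class ErrorCounts {
    // String keys are a stand-in for the question's DataCode type.
    private final ConcurrentHashMap<String, Long> errorMap = new ConcurrentHashMap<>();

    /** Inserts 1 for a new key, or atomically adds 1 to the existing count. */
    void addError(String error) {
        errorMap.merge(error, 1L, Long::sum);
    }

    long count(String error) {
        return errorMap.getOrDefault(error, 0L);
    }
}
```

This still boxes a Long on every update, so AtomicLongMap or a map of LongAdder values may be the better fit when updates are very frequent.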
I have a list of users, and each user has a sequence of places he has visited (e.g. list = 1,2,3,1,2,8,10,1, etc.). Now I want to figure out how often each place has been visited. Furthermore, I really want to use fork/join for that. My actual question is: do you know a way to use the ConcurrentHashMap here? The current problem is that there are lost updates at
map.put(i, map.get(i)+1);// lost updates here
Do you have a nice idea to solve that without locking the whole map (is there a partial lock for parts of the map, as there is for put())? I know I could create a map for each user and then join them again, but I thought perhaps someone has a better solution.
public class ForkUsers extends RecursiveAction{
ArrayList<User>users;
ConcurrentHashMap<Integer,Integer>map;
int indexfrom;
int indexto;
ForkUsers(ArrayList<User>users,ConcurrentHashMap<Integer,Integer> map,int indexfrom,int indexto){
this.users=users;
this.map=map;
this.indexfrom=indexfrom;
this.indexto=indexto;
}
void computeDirectly(User user){
for(Integer i:user.getVisitedPlaces()){
if(map.get(i)==null){
map.putIfAbsent(i, 1);
}else{
map.put(i, map.get(i)+1);// lost updates here
}
}
}
protected void compute() {
if(indexfrom==indexto){
computeDirectly(users.get(indexfrom));
}else{
int half=(indexfrom+indexto)/2;
invokeAll(new ForkUsers(users,map,indexfrom,half),new ForkUsers(users,map,half+1,indexto));
}
}
}
Even though you're using a ConcurrentHashMap, that doesn't prevent read-update-write race conditions; both threads call get, then both add 1, then both put the value with just the single update back. You can either synchronize the whole read-update-write operation or (my preference) use an AtomicInteger for the value and use incrementAndGet instead.
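The AtomicInteger suggestion could be sketched as follows, assuming Java 8 for computeIfAbsent. A List<List<Integer>> stands in for the question's User objects and their visited places, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.atomic.AtomicInteger;

class CountVisits extends RecursiveAction {
    private final List<List<Integer>> visitsPerUser; // stand-in for users and their visited places
    private final ConcurrentHashMap<Integer, AtomicInteger> counts;
    private final int from; // inclusive
    private final int to;   // inclusive

    CountVisits(List<List<Integer>> visitsPerUser,
                ConcurrentHashMap<Integer, AtomicInteger> counts, int from, int to) {
        this.visitsPerUser = visitsPerUser;
        this.counts = counts;
        this.from = from;
        this.to = to;
    }

    @Override
    protected void compute() {
        if (from == to) {
            for (Integer place : visitsPerUser.get(from)) {
                // The counter is created at most once per key; the increment itself is atomic.
                counts.computeIfAbsent(place, p -> new AtomicInteger()).incrementAndGet();
            }
        } else {
            int half = (from + to) >>> 1;
            invokeAll(new CountVisits(visitsPerUser, counts, from, half),
                      new CountVisits(visitsPerUser, counts, half + 1, to));
        }
    }

    /** Counts all visits using the common fork/join pool. */
    static ConcurrentHashMap<Integer, AtomicInteger> countAll(List<List<Integer>> visitsPerUser) {
        ConcurrentHashMap<Integer, AtomicInteger> counts = new ConcurrentHashMap<>();
        if (!visitsPerUser.isEmpty()) {
            ForkJoinPool.commonPool()
                .invoke(new CountVisits(visitsPerUser, counts, 0, visitsPerUser.size() - 1));
        }
        return counts;
    }
}
```

No part of the map is locked by the caller; contention is limited to threads that happen to increment the same place at the same time.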