Synchronization on ConcurrentHashMap - java

In my application I am using a ConcurrentHashMap and I need this type of "custom put-if-absent" method to be executed atomically.
public boolean putIfSameMappingNotExistingAlready(String key, String newValue) {
    String value;
    synchronized (concurrentHashMap) {
        if ((value = concurrentHashMap.putIfAbsent(key, newValue)) == null) {
            // There was no mapping for the key
            return true;
        } else if (value.equals(newValue)) {
            // The mapping <key, newValue> already exists in the map
            return false;
        } else {
            concurrentHashMap.put(key, newValue);
            return true;
        }
    }
}
I read (in the concurrent package documentation) that
A concurrent collection is thread-safe, but not governed by a single exclusion lock.
So you cannot get an exclusive lock on a ConcurrentHashMap.
My questions are:
Is the code above thread-safe? To me it looks like it is guaranteed that the code in the synchronized block can be executed by only a single thread at a time, but I want to confirm it.
Wouldn't it be "cleaner" to use Collections.synchronizedMap() instead of ConcurrentHashMap in this case?
Thanks a lot!

The following code uses a compare-and-set loop (as suggested by SlakS) to implement thread safety (Note the infinite loop):
/**
 * Updates or adds the mapping for the given key.
 * Returns true if the operation was successful, and false
 * if key is already mapped to newValue.
 */
public boolean updateOrAddMapping(String key, String newValue) {
    while (true) {
        // try to insert if absent
        String oldValue = concurrentHashMap.putIfAbsent(key, newValue);
        if (oldValue == null) return true;
        // test if the values are equal
        if (oldValue.equals(newValue)) return false;
        // not equal, so we have to replace the mapping:
        // try to replace oldValue by newValue
        if (concurrentHashMap.replace(key, oldValue, newValue)) return true;
        // someone changed the mapping in the meantime!
        // loop and try again from the start.
    }
}

By synchronizing on the entire collection like that you are essentially replacing the fine-grained synchronization within the concurrent collection with your own blunt-force approach.
If you aren't using the concurrency protections elsewhere then you could just use a standard HashMap for this and wrap it in your own synchronization. Using a synchronizedMap may work, but wouldn't cover multi-step operations such as the one above, where you put, check, and put again; a sketch of the hand-rolled approach follows.
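For illustration, a minimal sketch of that approach (the class and accessor names here are made up): a plain HashMap guarded by the wrapper's own lock, so the multi-step check-then-act is atomic as long as every access goes through the synchronized methods.
import java.util.HashMap;
import java.util.Map;

public class GuardedMap {
    private final Map<String, String> map = new HashMap<>();

    // the whole check-then-act runs under one lock
    public synchronized boolean putIfSameMappingNotExistingAlready(String key, String newValue) {
        if (newValue.equals(map.get(key))) {
            // the exact mapping <key, newValue> already exists
            return false;
        }
        map.put(key, newValue); // insert or overwrite
        return true;
    }

    // every other access must also be synchronized on the same object
    public synchronized String get(String key) {
        return map.get(key);
    }
}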

ConcurrentHashMap removal of keys

I'm trying to figure out how to write a thread-safe cache with expiring entries. It will be used as a no-hits cache: if an entry is not found in some storage, I will put it in this cache and avoid the subsequent lookups for the next few minutes.
There will be multiple threads reading and writing this cache.
There will be just a single ThreadSafeCache instance in my application.
I'm not sure whether removing an entry in the contains method will cause synchronization issues.
How may I test this class for thread-safety?
Kind regards
public class ThreadSafeCache
{
    private final Clock clock = Clock.systemUTC();
    private final Duration expiration = Duration.ofMinutes(10);
    private final ConcurrentHashMap<CacheKey, CacheValue> internalMap = new ConcurrentHashMap<>();

    public boolean contains(String a, String b, byte[] c, String d)
    {
        CacheKey key = new CacheKey(a, b, c, d);
        CacheValue value = internalMap.get(key);
        if (value == null || value.isExpired())
        {
            internalMap.remove(key);
            return false;
        }
        return true;
    }

    public void put(String a, String b, byte[] c, String d)
    {
        internalMap.computeIfAbsent(new CacheKey(a, b, c, d), key -> new CacheValue());
    }

    private class CacheValue
    {
        private final Instant insertionDate;

        private CacheValue()
        {
            this.insertionDate = clock.instant();
        }

        boolean isExpired()
        {
            return Duration.between(insertionDate, clock.instant()).compareTo(expiration) > 0;
        }
    }
}
You are calling two Map operations in the same function, meaning there is scope for interleaving (i.e. another operation happens in between the two operations in the function, changing its behaviour). To fix this, you can put the map operations in a synchronized (internalMap) {} block. Note that you must do this in any method that interacts with the map in two discrete calls.
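For illustration, here is the contains method from the question with that change applied (a sketch; it only excludes other code that synchronizes on the same internalMap):
public boolean contains(String a, String b, byte[] c, String d)
{
    CacheKey key = new CacheKey(a, b, c, d);
    synchronized (internalMap)
    {
        // no other thread holding this lock can run between the get and the remove
        CacheValue value = internalMap.get(key);
        if (value == null || value.isExpired())
        {
            internalMap.remove(key);
            return false;
        }
        return true;
    }
}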
From a code-style point of view, it is bad practice to modify the map in the contains method. This makes your code less predictable. Another person coming to your code for the first time (or you in a few months' time) may not remember that contains() actually modifies the cache. contains implies that it simply checks the cache rather than modifying it.
My recommendation would be:
If the key has expired, simply return false from contains.
In a get()-style method, check whether the value has expired and, if it has, compute a new one there (see the sketch below).
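A sketch of that recommendation using the fields from the question (the get method here is the hypothetical one the recommendation refers to): contains only reads, and get replaces an expired value atomically via compute().
public boolean contains(String a, String b, byte[] c, String d)
{
    CacheValue value = internalMap.get(new CacheKey(a, b, c, d));
    // read-only: no surprising side effects for callers of contains
    return value != null && !value.isExpired();
}

public CacheValue get(String a, String b, byte[] c, String d)
{
    // compute() runs the remapping function atomically for this key, so an
    // expired value is replaced exactly once even under contention
    return internalMap.compute(new CacheKey(a, b, c, d),
            (key, existing) -> (existing == null || existing.isExpired()) ? new CacheValue() : existing);
}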
Your question: "I'm not sure whether removing an entry in the contains method will cause synchronization issues".
=> There is no problem with the remove operation, because you are using a concurrent collection (ConcurrentHashMap), which is the best choice here.
Extra: another way to get a synchronized collection is Collections.synchronizedMap(myMap), but it is not OK if we remove entries while iterating in a multithreaded environment (for example in a loop); it throws a ConcurrentModificationException.
So using a concurrent collection (e.g. ConcurrentHashMap) is always recommended.
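A small illustration of that difference (single-threaded for brevity, with made-up keys): removing through the map while iterating a synchronizedMap fails fast, while a ConcurrentHashMap iterator is weakly consistent and tolerates the removal.
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IterationRemovalDemo {
    public static void main(String[] args) {
        Map<String, String> syncMap = Collections.synchronizedMap(new HashMap<>());
        syncMap.put("a", "1");
        syncMap.put("b", "2");
        try {
            for (String key : syncMap.keySet()) {
                syncMap.remove(key); // structural change during iteration
            }
        } catch (ConcurrentModificationException e) {
            System.out.println("synchronizedMap iterator failed fast");
        }

        Map<String, String> concurrentMap = new ConcurrentHashMap<>();
        concurrentMap.put("a", "1");
        concurrentMap.put("b", "2");
        for (String key : concurrentMap.keySet()) {
            concurrentMap.remove(key); // weakly consistent iterator: no exception
        }
        System.out.println("ConcurrentHashMap after removals: " + concurrentMap);
    }
}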

Populating map from multiple threads

I have a ConcurrentHashMap which I am populating from multiple threads as shown below:
private static Map<ErrorData, Long> holder = new ConcurrentHashMap<ErrorData, Long>();

public static void addError(ErrorData error) {
    if (holder.keySet().contains(error)) {
        holder.put(error, holder.get(error) + 1);
    } else {
        holder.put(error, 1L);
    }
}
Is there any possibility of race condition in above code and it can skip updates? Also how can I use Guava AtomicLongMap here if that can give better performance?
I am on Java 7.
Yes, there is a possibility of a race, because the contains check and the put are not performed atomically.
You can use AtomicLongMap as follows, which does this check atomically:
private static final AtomicLongMap<ErrorData> holder = AtomicLongMap.create();
public static void addError(ErrorData error) {
holder.getAndIncrement(error);
}
As described in the javadoc:
[T]he typical mechanism for writing to this map is addAndGet(K, long), which adds a long to the value currently associated with K. If a key has not yet been associated with a value, its implicit value is zero.
and
All operations are atomic unless otherwise noted.
If you are using java 8, you can take advantage of the new merge method:
holder.merge(error, 1L, Long::sum);
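Wrapped into the method from the question, that one-liner would look like this:
public static void addError(ErrorData error) {
    // merge() atomically maps the key to 1 if absent, or adds 1 to the existing count
    holder.merge(error, 1L, Long::sum);
}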
A 'vanilla' Java 5+ solution:
public static void addError(final ErrorData errorData) {
    Long previous = holder.putIfAbsent(errorData, 1L);
    // if the error data is already mapped to some value
    if (previous != null) {
        // keep trying to replace the existing value until no concurrent update gets in the way
        while (!holder.replace(errorData, previous, previous + 1)) {
            previous = holder.get(errorData);
        }
    }
}
In Java 7 or older versions you need to use a compare-and-update loop (assuming holder is declared as a ConcurrentMap so that putIfAbsent and replace are available):
Long prevValue;
boolean done;
do {
    prevValue = holder.get(error);
    if (prevValue == null) {
        // putIfAbsent returns null only if this thread installed the initial value
        done = holder.putIfAbsent(error, 1L) == null;
    } else {
        // replace succeeds only if the value is still prevValue
        done = holder.replace(error, prevValue, prevValue + 1);
    }
} while (!done);
With this code, if two threads race one may end up retrying its update, but they'll get the right value in the end.
Consider what can happen with the original check-then-put code:
Thread1: holder.get(error) returns 1
Thread2: holder.get(error) returns 1
Thread1: holder.put(error, 1+1);
Thread2: holder.put(error, 1+1);
To fix this you need to use atomic operations to update the map.

Putting a value in a google guava loadingCache

This is my loading cache definition:
private class ProductValue {
    private long regionAValue;
    private long regionBValue;
    // constructor and general stuff here
}

private final LoadingCache<ProductId, ProductValue> productCache = CacheBuilder.newBuilder()
        .expireAfterAccess(4, TimeUnit.MINUTES)
        .build(new CacheLoader<ProductId, ProductValue>() {
            @Override
            public ProductValue load(final ProductId productId) throws Exception {
                return updateProductValues(productId);
            }
        });

private ProductValue updateProductValues(final ProductId productId) {
    // Read from disk and return
}
Now I have a use case where I'm required to set the value of regionA or regionB in the cache until the next update happens. I'm utterly confused about the concurrency implications of the logic I have:
public void setProductValue(final ProductId productId, final boolean isTypeA, final long newValue) throws ExecutionException {
    ProductValue existingValues = productCache.get(productId); // 1
    if (isTypeA) {
        existingValues.regionAValue = newValue;
    } else {
        existingValues.regionBValue = newValue;
    }
    productCache.put(productId, existingValues); // 2
}
At 1 I just read the reference stored in the cache for the given key; this get is thread-safe because the loading cache acts like a concurrent map. But between 1 and 2 this entry can be overwritten by some other thread. Since I've already mutated the value through the reference that is in the cache, do I need to put the key-value pair back at all? Do I need line 2?
(Disclaimer: I am not a Guava Cache expert)
I think you have two concurrency issues in your code:
You have two operations that mutate the object in existingValues, that is existingValues.regionAValue = ... and existingValues.setRegionValue(...). Other threads can see the state when only one operation is applied. I think that is not wanted. (correct?)
Between the get() and the put() the value may be loaded again into the cache, and the put() then overwrites that newly loaded value.
Regarding 1:
If you have more reads of the object than writes, a good option is to use an immutable object. You don't touch the cached instance; you make a copy of the original object, mutate the copy, and put the new object into the cache. This way only the final state becomes visible.
Regarding 2:
Atomic CAS operations can help you here (e.g. JSR107 compatible caches). The useful method would be boolean replace(K key, V oldValue, V newValue);
In Google Guava the CAS methods are accessible via the ConcurrentMap interface, which you can retrieve via asMap().
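Putting both points together against the cache from the question, a sketch might look like this (it assumes ProductValue gains a copy constructor; the retry loop is the compare-and-set part):
public void setProductValue(final ProductId productId, final boolean isTypeA, final long newValue)
        throws ExecutionException {
    while (true) {
        ProductValue existing = productCache.get(productId); // loads the value if absent
        ProductValue updated = new ProductValue(existing);    // copy; never mutate the cached instance
        if (isTypeA) {
            updated.regionAValue = newValue;
        } else {
            updated.regionBValue = newValue;
        }
        // asMap() exposes the cache as a ConcurrentMap; replace() succeeds only if the
        // entry still holds the exact instance we read, otherwise we loop and retry
        if (productCache.asMap().replace(productId, existing, updated)) {
            return;
        }
    }
}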

My ConcurrentHashMap's value type is List; how do I make appending to that list thread-safe?

My class extends ConcurrentHashMap[String, immutable.List[String]]
and it has two methods:
def addEntry(key: String, newList: immutable.List[String]) = {
  ...
  // if the key exists, append the newList to the existing one,
  // otherwise set the newList as the value
}

def resetEntry(key: String): Unit = {
  this.remove(key)
}
In order to make the addEntry method thread-safe, I tried:
this.get(key).synchronized {
  // append or set here
}
but that will raise a NullPointerException if the key does not exist, and calling putIfAbsent(key, new immutable.List()) before synchronizing won't work either, because between the putIfAbsent and entering the synchronized block the key may be removed by resetEntry.
Making addEntry and resetEntry both synchronized methods would work, but the lock is too coarse.
So, what could I do?
PS: this post is similar to How to make updating BigDecimal within ConcurrentHashMap thread safe, but please help me figure out the actual code rather than just general guidance.
--update--
Check out https://stackoverflow.com/a/34309186/404145; I solved this more than 3 years later.
Instead of removing the entry, can you simply clear it? You can still use a synchronized list and ensure atomicity.
def resetEntry(key: String, currentBatchSize: Int): Unit = {
this.get(key).clear();
}
This works on the assumption that each key has an entry. For example, if this.get(key) == null you would want to insert a new synchronizedList, which should act as a clear as well.
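A sketch of that idea in plain Java (the wrapper name is made up): every key maps to a synchronized list that is created atomically on first use and is only ever cleared, never removed, so addEntry never sees null.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

public class ListCache {
    private final ConcurrentHashMap<String, List<String>> map = new ConcurrentHashMap<>();

    public void addEntry(String key, List<String> newList) {
        // computeIfAbsent installs the synchronized list atomically on first use
        List<String> list = map.computeIfAbsent(key,
                k -> Collections.synchronizedList(new ArrayList<>()));
        list.addAll(newList); // addAll on a synchronized list is one locked operation
    }

    public void resetEntry(String key) {
        List<String> list = map.get(key);
        if (list != null) {
            list.clear(); // the entry stays in the map, so concurrent addEntry calls never see null
        }
    }
}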
After more than 3 years, I think now I can answer my question.
The original problem is:
I get a ConcurrentHashMap[String, List], many threads are appending values to it, how can I make it thread-safe?
Making addEntry() synchronized like this will work, right?
synchronize(map.get(key)){
map.append(key, value)
}
In most cases yes except when map.get(key) is null, which will cause NullPointerException.
So what about adding map.putIfAbsent(key, new List) like this:
map.putIfAbsent(key, new List)
synchronize(map.get(key)){
map.append(key, value)
}
Better now, but if another thread calls resetEntry() right after the putIfAbsent(), we will see the NullPointerException again.
Making addEntry and resetEntry both synchronized methods would work, but the lock is too coarse.
So what about an entry-level lock when appending and a map-level lock when resetting?
Here comes the ReentrantReadWriteLock:
When calling addEntry(), we acquire a shared (read) lock on the map, which lets appends run as concurrently as possible; when calling resetEntry(), we acquire an exclusive (write) lock to make sure no other thread is changing the map at the same time.
The code looks like this:
class MyMap extends ConcurrentHashMap[String, immutable.List[String]] {
  val lock = new ReentrantReadWriteLock()

  def addEntry(key: String, newList: immutable.List[String]) = {
    lock.readLock.lock()
    try {
      // if the key exists, append the newList to the existing one,
      // otherwise set the newList as the value
      this.putIfAbsent(key, immutable.List.empty[String])
      this(key).synchronized {
        this(key) += newList // pseudo-code: append newList to the list currently mapped to key
      }
    } finally {
      lock.readLock.unlock() // release the shared lock even if appending throws
    }
  }

  def resetEntry(key: String, currentBatchSize: Int): Unit = {
    lock.writeLock.lock()
    try {
      this.remove(key)
    } finally {
      lock.writeLock.unlock() // release the exclusive lock even if remove throws
    }
  }
}
You can try a method inspired by the CAS (Compare and Swap) process:
(in pseudo-java-scala-code, as my Scala is still in its infancy)
def addEntry(key: String, newList: immutable.List[String]) = {
  val existing = putIfAbsent(key, newList);
  if (existing != null) {
    synchronized(existing) {
      if (get(key) == existing) { // ask again for the value inside the synchronized block to ensure consistency; this is the compare part of CAS
        return put(key, existing ++ newList); // swap the old value for the new one
      } else {
        throw new ConcurrentModificationException(); // how else to mark failure?
      }
    }
  }
  return existing;
}

ConcurrentHashMap putIfAbsent : atomicity when followed by a get() call

I wanted to discuss a specific use I have of a concurrent map to sense check my logic...
If I use ConcurrentHashMap, I can do the familiar
private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<K, V>();
public V getExampleOne(K key) {
map.putIfAbsent(key, new Object());
return map.get(key);
}
but I realise that a race condition exists whereby if I remove the item from the map between the putIfAbsent and the get, the method above would return something that no longer exists in the collection. This may or may not be fine, but let's assume that for my use case it's not OK.
What I'd really like is to have the whole thing atomic. So,
public V getExampleTwo(K key) {
return map.putIfAbsent(key, new Object());
}
but as this expands out to
if (!map.containsKey(key))
return map.put(key, value); [1]
return map.get(key);
which for line [1] will return null on first use (i.e., map.put will return the previous value, which for first-time use is null).
I can't have it return null in this instance.
Which leaves me with something like;
public V getExampleThree(K key) {
    Object object = new Object();
    V value = map.putIfAbsent(key, object);
    if (value == null)
        return object;
    return value;
}
So, finally, my question: how do the examples above differ in semantics? Does getExampleThree ensure atomicity like getExampleTwo but avoid the null return correctly? Are there other problems with getExampleThree?
I was hoping for a bit of discussion around the choices. I realise I could use a map other than ConcurrentHashMap and synchronize around clients calling my get method and a method to remove from the map, but that seems to defeat the purpose (the non-blocking nature) of the ConcurrentHashMap. Is that my only choice to keep the data accurate?
I guess that's a big part of why you'd choose ConcurrentHashMap: that it's visible/up-to-date/accurate at the point you interact with it, but there may be an impact further down the line if old data is going to be a problem...
It sounds like you are trying to create a global lock object for a key.
Instead of deleting an entry with the possibility of having it re-created almost immediately, I would only delete the entry when you are pretty sure it's not needed ever again.
Otherwise, if you are using this for a lock, you can have two threads locking on different objects for the same key.
If that's not OK, you can busy-loop it:
public V getExampleOne(K key) {
for(Object o = null, ret = null; (ret = map.get(key)) == null; )
map.putIfAbsent(key, o == null ? o = new Object() : o);
return ret;
}
it can still be removed or replaced as soon as the loop exits, so it's effectively much the same as:
public V getExampleThree(K key) {
Object o = new Object();
map.putIfAbsent(key, o);
Object ret = map.get(key);
return ret == null ? o : ret;
}
So, finally, my question: how do the examples above differ in semantics?
The difference is only the obvious.
Does getExampleThree ensure atomicity like getExampleTwo but avoid the null return correctly?
Yes.
Are there other problems with getExampleThree?
Only if you assume the very next call will give you the same value (i.e., if you forget it can be removed by another thread in the meantime).
The methods have different semantics:
getExampleOne is not atomic.
getExampleTwo returns null if the new object was inserted into the map. This differs from the behavior of getExampleOne, but it is atomic.
getExampleThree is probably what you want. It is atomic, and it returns the object that is in the map after the point in time of the putIfAbsent call. But it is a problem when nulls are valid values in your application: the null return value is then ambiguous.
However, depending on the situation it might not be the actual object at the point in time when you use the return value. You then need explicit locking.
Why not simply use the first version and synchronize the method?
public synchronized V getExampleOne(K key) {
map.putIfAbsent(key, new Object());
return map.get(key);
}
While it won't provide you maximum parallelism, it's also true that you only have two operations and getExampleThree, while correct, is less readable and less easy to understand for someone else reading your code.
I think you will find the trick is to assume you will be non-atomic and handle it.
I am not really clear what you are looking for. Let me know if this is off at a tangent and I'll modify.
Perhaps you are looking for something like:
private final ConcurrentHashMap<String, Object> map = new ConcurrentHashMap<>();

/*
 * Guaranteed to return the object replaced.
 *
 * Note that by the time this method returns another thread
 * may have already replaced the object again.
 *
 * All we can guarantee here is that the return value WAS
 * associated with the key in the map at the time the entry was
 * replaced.
 *
 * A null return implies that either a null object was associated
 * with the specified key or there was no entry in the map for
 * the specified key with 'null' as its 'value'.
 */
public Object consistentReplace(String key, Object newValue) {
    Object oldValue = map.get(key);
    while (!map.replace(key, oldValue, newValue)) {
        // failed to replace because someone got in there before me
        oldValue = map.get(key);
    }
    return oldValue;
}
