Synchronized on HashMap value object


I've got a question about synchronization of objects inside a Map (same objects I later change value of). I want to atomically read, do checks and possibly do updates to a value from a map without locking the entire map. Is this a valid way to work with synchronization of objects?
private final Map<String, AtomicInteger> valueMap = new HashMap<>();
public Response addValue(@NotNull String key, @NotNull Integer value) {
AtomicInteger currentValue = valueMap.get(key);
if (currentValue == null) {
synchronized (valueMap) {
// Doublecheck that value hasn't been changed before entering synchronized
currentValue = valueMap.get(key);
if (currentValue == null) {
currentValue = new AtomicInteger(0);
valueMap.put(key, currentValue);
}
}
}
synchronized (valueMap.get(key)) {
// Check that value hasn't been changed when changing synchronized blocks
currentValue = valueMap.get(key);
if (currentValue.get() + value > MAX_LIMIT) {
return OVERFLOW;
}
currentValue.addAndGet(value);
return OK;
}
}

I fail to see much of a difference between your approach and that of a standard ConcurrentHashMap, aside from the fact that ConcurrentHashMap has been heavily tested and can be configured for minimal overhead with the exact number of threads you want to run the code with.
In a ConcurrentHashMap, you would use the replace(K key, V old, V new) method to atomically update key to new only when the old value has not changed.
The space savings from removing all those AtomicIntegers and the time savings from lower synchronization overhead will probably compensate for having to wrap the replace(k, old, new) calls in while-loops:
ConcurrentHashMap<String, Integer> valueMap =
new ConcurrentHashMap<>(16, .75f, expectedConcurrentThreadCount);
public Response addToKey(@NotNull String key, @NotNull Integer value) {
if (value > MAX_LIMIT) {
// probably should set value to MAX_LIMIT-1 before failing
return OVERFLOW;
}
boolean updated = false;
do {
Integer old = valueMap.putIfAbsent(key, value);
if (old == null) {
// it was absent, and now it has been updated to value: ok
updated = true;
} else if (old + value > MAX_LIMIT) {
// probably should set value to MAX_LIMIT-1 before failing
return OVERFLOW;
} else {
updated = valueMap.replace(key, old, old+value);
}
} while (! updated);
return OK;
}
Also, on the plus side, this code works even if the key was removed after checking it (yours throws an NPE in this case).
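For what it's worth, on Java 8+ the same check-then-add can probably be written without the retry loop by using ConcurrentHashMap.compute, which runs the remapping function atomically for the given key. This is only a sketch under assumptions: the class name BoundedCounters, the MAX_LIMIT value of 100 and the boolean return (instead of the Response type from the question) are all made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class; MAX_LIMIT and the boolean result are illustrative assumptions.
class BoundedCounters {
    static final int MAX_LIMIT = 100; // assumed limit, not taken from the question

    private final ConcurrentHashMap<String, Integer> valueMap = new ConcurrentHashMap<>();

    /** Adds value to the counter for key; returns false and leaves the map unchanged on overflow. */
    boolean addValue(String key, int value) {
        boolean[] ok = { true };
        // compute() executes the lambda atomically with respect to other updates of this key
        valueMap.compute(key, (k, old) -> {
            int current = (old == null) ? 0 : old;
            if ((long) current + value > MAX_LIMIT) {
                ok[0] = false;
                return old; // keep the old mapping (null keeps the key absent)
            }
            return current + value;
        });
        return ok[0];
    }

    Integer get(String key) {
        return valueMap.get(key);
    }
}
```

The read, the limit check and the write all happen inside one atomic map operation, so there is no window in which another thread can change or remove the value in between.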

Related

How to achieve Map putIfAbsent semantics with computeIfAbsent efficiency?

Consider the following code:
ConcurrentHashMap<String, Value> map = new ConcurrentHashMap<>();
boolean foo(String key) {
Value value = map.get(key);
if (value == null) {
value = map.putIfAbsent(key, new Value());
if (value == null) {
// do some stuff
return true;
}
}
// do some other stuff
return false;
}
Assume that foo() is called by multiple threads concurrently. Also assume that calling new Value() is expensive. The code is verbose and can still result in redundant Value objects created. Can the above logic be implemented in a way that guarantees no redundant Value objects are created (i.e. new Value() is called at most once)? I'm looking for a clean implementation- minimal code without acquiring locks explicitly.
computeIfAbsent could have been a good alternative, however its return semantics are not in line with the required logic.

Some minimal code that does the job:
boolean foo(String key) {
AtomicBoolean flag = new AtomicBoolean();
Value value = map.computeIfAbsent(key, k -> {flag.set(true); return new Value();});
if (flag.get()) {
// do some stuff
} else {
// do some other stuff
}
return flag.get();
}

One solution is to store Future<Value> instead of Value in the map:
ConcurrentHashMap<String, Future<Value>> map = new ConcurrentHashMap<>();
boolean foo(String key) {
Future<Value> value = map.get(key);
if (value == null) {
value = map.putIfAbsent(key, new FutureTask<Value>(() -> new Value()));
if (value == null) {
// do some stuff
return true;
}
}
// do some other stuff
return false;
}
You can access the underlying value by calling value.get(), which will block until the computation is complete.
There is a chance that more than one FutureTask is created, but only one will reach the map and only one computation of new Value() will be done.

First let's fix the fact that you are not acting atomically, and do a needless look-up. Two threads could both simultaneously pass the first value == null check. Not really a problem now (except 2 Values will be created, which is slow), but a bug waiting to happen if someone adds an else clause to the second value == null check. It's cleaner this way too.
boolean foo(String key) {
Value value = map.putIfAbsent(key, new Value());
if (value == null) {
// do some stuff
return true;
}
else {
// do some other stuff
return false;
}
}
Now let's address the fact that Value creation is slow (sounds like you are abusing the constructor, but anyway).
boolean foo(String key) {
final AtomicBoolean wasCreated = new AtomicBoolean(false);
final Value value = map.computeIfAbsent(key, k -> {
wasCreated.set(true);
return new Value();
});
if (wasCreated.get()) {
// do some stuff
return true;
}
else {
// do some other stuff
return false;
}
}

One way is to use local state and update it in computeIfAbsent's mapping function:
boolean foo(String key) {
boolean[] b = { false };
map.computeIfAbsent(key, k -> {
// do some stuff
b[0] = true;
return new Value();
});
return b[0];
}
Because mappingFunction is only run if the key is not present in the map, you can guarantee that the heavy new Value() is only called when necessary and that the return value is set to true only when there was no mapping before the call.
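To make that claim easy to check, here is a small self-contained sketch. The class name CreateOnce, the empty Value class and the constructions counter are assumptions added purely for illustration; the counter only records how often the expensive path ran.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; Value stands in for the expensive object from the question.
class CreateOnce {
    static class Value { }

    private final ConcurrentHashMap<String, Value> map = new ConcurrentHashMap<>();
    // Counts how often the mapping function (and thus new Value()) actually ran.
    final AtomicInteger constructions = new AtomicInteger();

    /** Returns true only for the call that created the mapping for key. */
    boolean foo(String key) {
        boolean[] created = { false };
        map.computeIfAbsent(key, k -> {
            constructions.incrementAndGet();
            created[0] = true;
            return new Value();
        });
        return created[0];
    }
}
```

Calling foo("k") twice on the same instance should return true and then false, with constructions staying at 1 for that key.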

Consider your method foo (which is literally a wrapper for putIfAbsent).
String key = "test";
if(foo(key)){
//Positive Branch.
} else{
//Negative Branch.
}
Now, in a multi-threaded environment, thread A calls and completes foo and adds a new value. It is going down the Positive branch. Before it enters the Positive branch, it gets scheduled for later. Thread B calls and completes foo, it continues executing and goes down the Negative branch.
The Negative branch is getting executed before the Positive branch, which in my mind is not wanted. In your particular case it might be OK. compute might be a better replacement.
map.compute( key, ( k, old ) -> {
if(old==null){
Value v = new Value();
//positive branch.
return v;
} else{
//negative branch.
return old;
}
});
Now you would be acting atomically on whether or not the value exists.

synchronize a method by achieving better performance?

I have a class that is being called by multiple threads on multi core machine. I want to make it thread safe.
The add method will be called by multiple threads. If the key exists, it should add the new value to the current value; otherwise it should just put the key and value in the map.
Now, to make it thread safe, I was planning to synchronize the add method, but it will destroy performance. Is there any better way by which we can achieve better performance without synchronizing the add method?
class Test {
private final Map<Integer, Integer> map = new ConcurrentHashMap<>();
public void add(int key, int value) {
if (map.containsKey(key)) {
int val = map.get(key);
map.put(key, val + value);
return;
}
map.put(key, value);
}
public Object getResult() {
return map.toString();
}
}

but it will destroy performance
It likely wouldn't destroy performance. It will reduce it some, with further reduction if there is a high collision rate.
Is there any better way by which we can achieve better performance?
Yes, use merge() (Java 8+). Quoting the javadoc:
If the specified key is not already associated with a value or is associated with null, associates it with the given non-null value. Otherwise, replaces the associated value with the results of the given remapping function, or removes if the result is null.
Example:
public void add(int key, int value) {
map.merge(key, value, (a, b) -> a + b);
}
Or using a method reference to sum(int a, int b) instead of a lambda expression:
public void add(int key, int value) {
map.merge(key, value, Integer::sum);
}
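If you want to convince yourself that merge() loses no updates under contention, a rough sketch like the following might help. The class name MergeCounter and the hammer helper are hypothetical and exist only for this demonstration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class built around the add() method shown above.
class MergeCounter {
    private final Map<Integer, Integer> map = new ConcurrentHashMap<>();

    void add(int key, int value) {
        // ConcurrentHashMap performs the whole merge atomically for the key
        map.merge(key, value, Integer::sum);
    }

    int get(int key) {
        return map.getOrDefault(key, 0);
    }

    /** Starts `threads` threads that each add 1 to key 0 `perThread` times; returns the final total. */
    static int hammer(int threads, int perThread) {
        MergeCounter counter = new MergeCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.add(0, 1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get(0);
    }
}
```

With the containsKey/get/put version from the question, the same experiment can come up short, because two threads may read the same old value before either writes back.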

Use merge:
class Test {
final Map<Integer, Integer> map = new ConcurrentHashMap<>();
public void add(int key, int value) {
map.merge(key, value, Integer::sum);
}
public Object getResult() {
return map.toString();
}
}
Java 7 solution if you absolutely can't use synchronized (or, you absolutely cannot lock explicitly):
class Test {
final Map<Integer, AtomicInteger> map = new ConcurrentHashMap<>();
public void add(int key, int value) {
get(key).addAndGet(value);
}
private AtomicInteger get(int key) {
AtomicInteger current = map.get(key);
if (current == null) {
AtomicInteger ai = new AtomicInteger();
current = map.putIfAbsent(key, ai);
if (current == null) {
current = ai;
}
}
return current;
}
public Object getResult() {
return map.toString();
}
}

synchronized causes a bottleneck only when you run an expensive operation holding a lock.
In your case by adding a synchronized you are doing:
1. check a hashmap for existence of a key
2. get the value mapped to that key
3. do an addition and put the result back to the hashmap.
All these operations are super cheap, O(1), and unless you are using some strange pattern for the integer keys, it is very unlikely that you will get degenerate performance due to collisions.
If you can't use merge as the other answers point out, I would suggest just synchronizing. You should be this concerned about performance only in critical hot paths, and only after you have actually profiled and confirmed that there is an issue there.
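A minimal sketch of what "just synchronize" could look like, assuming a plain HashMap guarded by the instance lock (the class name SynchronizedAdder and the get accessor are made up for the example):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class; every access to the map goes through the same lock.
class SynchronizedAdder {
    private final Map<Integer, Integer> map = new HashMap<>();

    synchronized void add(int key, int value) {
        // lookup, addition and put all happen while holding the lock
        Integer current = map.get(key);
        map.put(key, current == null ? value : current + value);
    }

    synchronized int get(int key) {
        Integer current = map.get(key);
        return current == null ? 0 : current;
    }
}
```

The lock is held only for a hash lookup, an addition and a put, which is the cheap case described above.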

How to search map in between value efficiently?

Map<Long, Object> map = new TreeMap<>();
map.put(100L, object100);
map.put(120L, object120);
map.put(200L, object200);
map.put(277L, object277);
map.put(300L, object300);
map.put(348L, object348);
map.put(400L, object400);
//...
If a method gets a value in between the map's key and the next map's key, it would return the first key's object. For example, if the search method is invoked with the value 350, it should return object348.
The difference of value in the keys is not fixed.
But searching like that requires the iteration through all the entries until it gets the correct value. So, how do I make this efficient?

I'm not absolutely clear on whether you want only the object for the key which is lower than the target number, or the object for the nearest key either below or above.
I suspect you're asking just for the object for the key below, in which case NavigableMap.floorKey(K) should find what you seek.
But just in case you'd prefer to find the object whose key has the value nearest to the target value, then this should do what you need:
public static Object findNearestTo(long targetNumber) {
if (map.isEmpty()) {
return null; // or throw an appropriate exception.
}
Object exactMatch = map.get(targetNumber);
if (exactMatch != null) {
return exactMatch;
}
Long nearestBelow = map.floorKey(targetNumber);
Long nearestAbove = map.ceilingKey(targetNumber);
if (nearestBelow == null) {
return map.get(nearestAbove);
} else if (nearestAbove == null) {
return map.get(nearestBelow);
}
if (targetNumber - nearestBelow <= nearestAbove - targetNumber) {
return map.get(nearestBelow);
} else {
return map.get(nearestAbove);
}
}
Note that where the target number is an equal distance from the nearest below and the nearest above, it will favour the object in the key with the lower value. But you can favour the higher value simply by changing <= to < in the final if test.

As pointed out in the comment, check out NavigableMap.
Method map.floorEntry(key) should do what you want.
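A short sketch of that, assuming the map is declared as a TreeMap (so the NavigableMap methods are visible) and using strings as stand-ins for the objects in the question. The class name FloorLookup is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; the String values stand in for object100, object120, ...
class FloorLookup {
    private final TreeMap<Long, String> map = new TreeMap<>();

    FloorLookup() {
        map.put(100L, "object100");
        map.put(120L, "object120");
        map.put(200L, "object200");
        map.put(277L, "object277");
        map.put(300L, "object300");
        map.put(348L, "object348");
        map.put(400L, "object400");
    }

    /** Returns the value for the greatest key <= target, or null if every key is larger. */
    String find(long target) {
        Map.Entry<Long, String> entry = map.floorEntry(target);
        return entry == null ? null : entry.getValue();
    }
}
```

Here find(350) yields "object348". TreeMap is a red-black tree, so the lookup is O(log n) rather than a scan over all entries.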

Atomic compareAndSet but with callback?

I know that AtomicReference has compareAndSet, but I feel like what I want to do is this
private final AtomicReference<Boolean> initialized = new AtomicReference<>( false );
...
atomicRef.compareSetAndDo( false, true, () -> {
// stuff that only happens if false
});
this would probably work too, might be better.
atomicRef.compareAndSet( false, () -> {
// stuff that only happens if false
// if I die still false.
return true;
});
I've noticed there's some new functional constructs but I'm not sure if any of them are what I'm looking for.
Can any of the new constructs do this? if so please provide an example.
update
To attempt to simplify my problem, I'm trying to find a less error prone way to guard code in a "do once for object" or (really) lazy initializer fashion, and I know that some developers on my team find compareAndSet confusing.

guard code in a "do once for object"
How exactly to implement that depends on what you want other threads that attempt to execute the same thing in the meantime to do. If you just let them run past the CAS, they may observe things in an intermediate state while the one thread that succeeded is still performing its action.
or (really) lazy initializer fashion
That construct is not thread-safe if you're using it for lazy initializers, because one thread may set the "is initialized" boolean to true and then execute the block, while another thread observes the true state but reads an empty result.
You can use AtomicReference::updateAndGet if multiple concurrent/repeated initialization attempts are acceptable, with one object winning in the end and the others being discarded by GC. The update function should be side-effect-free.
Otherwise you should just use the double-checked locking pattern with a volatile reference field.
Of course you can always package any of these into a higher order function that returns a Runnable or Supplier which you then assign to a final field.
// == FunctionalUtils.java
/** @param mayRunMultipleTimes must be side-effect-free */
public static <T> Supplier<T> instantiateOne(Supplier<T> mayRunMultipleTimes) {
AtomicReference<T> ref = new AtomicReference<>(null);
return () -> {
T val = ref.get(); // fast-path if already initialized
if(val != null)
return val;
return ref.updateAndGet(v -> v == null ? mayRunMultipleTimes.get() : v);
};
}
// == ClassWithLazyField.java
private final Supplier<Foo> lazyInstanceVal = FunctionalUtils.instantiateOne(() -> new Foo());
public Foo getFoo() {
return lazyInstanceVal.get();
}
You can easily encapsulate various custom control-flow and locking patterns this way. Here are two of my own..

compareAndSet returns true if the update was done, and false if the actual value was not equal to the expected value.
So just use
if (ref.compareAndSet(expectedValue, newValue)) {
...
}
That said, I don't really understand your examples, since you're passing true and false to a method taking object references as argument. And your second example doesn't do the same thing as the first one. If the second is what you want, I think what you're after is
ref.getAndUpdate(value -> {
if (value.equals(expectedValue)) {
return someNewValue(value);
}
else {
return value;
}
});

You’re over-complicating things. Just because there are now lambda expressions, you don’t need to solve everything with lambdas:
private volatile boolean initialized;
…
if(!initialized) synchronized(this) {
if(!initialized) {
// stuff to be done exactly once
initialized=true;
}
}
Double-checked locking might not have a good reputation, but for non-static properties there are few alternatives.
If you consider multiple threads accessing it concurrently in the uninitialized state and want a guarantee that the action runs only once, and that it has completed before dependent code is executed, an Atomic… object won’t help you.
There’s only one thread that can successfully perform compareAndSet(false,true), but since failure implies that the flag already has the new value, i.e. is initialized, all other threads will proceed as if the “stuff to be done exactly once” has been done while it might still be running. The alternative would be reading the flag first, conditionally performing the stuff and calling compareAndSet afterwards, but that allows multiple concurrent executions of “stuff”. This is also what happens with updateAndGet or accumulateAndGet and its provided function.
To guarantee exactly one execution before proceeding, threads must get blocked if the “stuff” is currently being executed. The code above does this. Note that once the “stuff” has been done, there will be no locking anymore and the performance characteristics of the volatile read are the same as for the Atomic… read.
The only solution that is simpler to program is to use a ConcurrentMap:
private final ConcurrentHashMap<String,Boolean> initialized=new ConcurrentHashMap<>();
…
initialized.computeIfAbsent("dummy", ignore -> {
// stuff to do exactly once
return true;
});
It might look a bit oversized, but it provides exactly the required performance characteristics. It will guard the initial computation using synchronized (or well, an implementation dependent exclusion mechanism) but perform a single read with volatile semantics on subsequent queries.
If you want a more lightweight solution, you may stay with the double checked locking shown at the beginning of this answer…
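If the concern is that teammates find the pattern error-prone, one option is to write the double-checked locking once and hide it behind a Supplier. The following is only a sketch: the class name Lazy is made up, and it assumes the initializer never returns null (null is used as the "not yet initialized" marker).

```java
import java.util.function.Supplier;

// Hypothetical helper; assumes the initializer never returns null.
class Lazy<T> implements Supplier<T> {
    private final Supplier<T> initializer;
    private volatile T value; // volatile is required for safe publication

    Lazy(Supplier<T> initializer) {
        this.initializer = initializer;
    }

    @Override
    public T get() {
        T result = value;            // first check, no locking
        if (result == null) {
            synchronized (this) {
                result = value;      // second check, under the lock
                if (result == null) {
                    result = initializer.get(); // runs at most once
                    value = result;
                }
            }
        }
        return result;
    }
}
```

A field such as `private final Lazy<Foo> foo = new Lazy<>(Foo::new);` (with Foo as a placeholder type) then gives callers a plain foo.get() with no compareAndSet in sight. Threads that arrive while the initializer is running block on the lock and see the finished value.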

I know this is old, but I've found there is no perfect way to achieve this, more specifically this:
trying to find a less error prone way to guard code in a "do (anything) once..."
I'll add to this "while respecting a happens before behavior." which is required for instantiating singletons in your case.
IMO The best way to achieve this is by means of a synchronized function:
public<T> T transaction(Function<NonSyncObject, T> transaction) {
synchronized (lock) {
return transaction.apply(nonSyncObject);
}
}
This allows you to perform atomic "transactions" on the given object.
Other options are double-check spin-locks:
for (;;) {
T t = atomicT.get();
T newT = new T();
if (atomicT.compareAndSet(t, newT)) return;
}
On this one new T(); will get executed repeatedly until the value is set successfully, so it is not really a "do something once".
This would only work on copy on write transactions, and could help on "instantiating objects once" (which in reality is instantiating many but at the end is referencing the same) by tweaking the code.
The final option is a worse-performing version of the first one, but this one is a true happens-before AND ONCE (as opposed to the double-check spin-lock):
public void doSomething(Runnable r) {
while (!atomicBoolean.compareAndSet(false, true)) {}
// Do some heavy stuff ONCE
r.run();
atomicBoolean.set(false);
}
The reason why the first one is the better option is that it is doing what this one does, but in a more optimized way.
As a side note, in my projects I've actually used the code below (similar to @the8472's answer), which at the time I thought was safe, and it may be:
public T get() {
T res = ref.get();
if (res == null) {
res = builder.get();
if (ref.compareAndSet(null, res))
return res;
else
return ref.get();
} else {
return res;
}
}
The thing about this code is that, like the copy-on-write loop, it generates multiple instances, one for each contending thread, but only the first one is cached; all the other constructions eventually get GC'd.
Looking at the putIfAbsent method I see the benefit is the skipping of 17 lines of code and then a synchronized body:
/** Implementation for put and putIfAbsent */
final V putVal(K key, V value, boolean onlyIfAbsent) {
if (key == null || value == null) throw new NullPointerException();
int hash = spread(key.hashCode());
int binCount = 0;
for (Node<K,V>[] tab = table;;) {
Node<K,V> f; int n, i, fh;
if (tab == null || (n = tab.length) == 0)
tab = initTable();
else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
if (casTabAt(tab, i, null,
new Node<K,V>(hash, key, value, null)))
break; // no lock when adding to empty bin
}
else if ((fh = f.hash) == MOVED)
tab = helpTransfer(tab, f);
else {
V oldVal = null;
synchronized (f) {
if (tabAt(tab, i) == f) {
And then the synchronized body itself is another 34 lines:
synchronized (f) {
if (tabAt(tab, i) == f) {
if (fh >= 0) {
binCount = 1;
for (Node<K,V> e = f;; ++binCount) {
K ek;
if (e.hash == hash &&
((ek = e.key) == key ||
(ek != null && key.equals(ek)))) {
oldVal = e.val;
if (!onlyIfAbsent)
e.val = value;
break;
}
Node<K,V> pred = e;
if ((e = e.next) == null) {
pred.next = new Node<K,V>(hash, key,
value, null);
break;
}
}
}
else if (f instanceof TreeBin) {
Node<K,V> p;
binCount = 2;
if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
value)) != null) {
oldVal = p.val;
if (!onlyIfAbsent)
p.val = value;
}
}
}
}
The pro(s) of using a ConcurrentHashMap is that it will undoubtedly work.

Create and put a map value only if not already present, and get it: thread-safe implementation

What is the best way to make this snippet thread-safe?
private static final Map<A, B> MAP = new HashMap<A, B>();
public static B putIfNeededAndGet(A key) {
B value = MAP.get(key);
if (value == null) {
value = buildB(...);
MAP.put(key, value);
}
return value;
}
private static B buildB(...) {
// business, can be quite long
}
Here are the few solutions I could think about:
I could use a ConcurrentHashMap, but if I understood correctly, it just makes the atomic put and get operations thread-safe, i.e. it does not ensure that the buildB() method is called only once for a given value.
I could use Collections.synchronizedMap(new HashMap<A, B>()), but I would have the same issue as the first point.
I could make the whole putIfNeededAndGet() method synchronized, but I can have a great many threads accessing this method together, so it could be quite expensive.
I could use the double-checked locking pattern, but there is still the related out-of-order writes issue.
What other solutions may I have?
I know this is a quite common topic on the Web, but I didn't find a clear, full and working example yet.

Use ConcurrentHashMap and the lazy init pattern which you used
public static B putIfNeededAndGet(A key) {
B value = map.get(key);
if (value == null) {
value = buildB(...);
B oldValue = map.putIfAbsent(key, value);
if (oldValue != null) {
value = oldValue;
}
}
return value;
}

This might not be the answer you're looking for, but use the Guava CacheBuilder, it already does all that and more:
private static final LoadingCache<A, B> CACHE = CacheBuilder.newBuilder()
.maximumSize(100) // if necessary
.build(
new CacheLoader<A, B>() {
public B load(A key) {
return buildB(key);
}
});
You can also easily add timed expiration and other features as well.
This cache will ensure that load() (or in your case buildB) will not be called concurrently with the same key. If one thread is already building a B, then any other caller will just wait for that thread.

In the above solution it is possible that many threads will call buildB(...) simultaneously, so all of them will compute it. In my case I am using a Future, and only the single thread that gets null as the old value will run buildB; the rest will wait on f.get().
private static final ConcurrentMap<A, Future<B>> map = new ConcurrentHashMap<A, Future<B>>();
public static B putIfNeededAndGet(final A key) throws InterruptedException {
    while (true) {
        Future<B> f = map.get(key);
        if (f == null) {
            Callable<B> eval = new Callable<B>() {
                public B call() throws InterruptedException {
                    return buildB(...);
                }
            };
            FutureTask<B> ft = new FutureTask<B>(eval);
            f = map.putIfAbsent(key, ft);
            if (f == null) {
                // this thread won the race, so it runs the computation
                f = ft;
                ft.run();
            }
        }
        try {
            return f.get();
        } catch (CancellationException e) {
            // the computation was cancelled: drop the stale future and retry
            map.remove(key, f);
        } catch (ExecutionException e) {
            // buildB failed: drop the failed future and propagate instead of retrying forever
            map.remove(key, f);
            throw new IllegalStateException(e.getCause());
        }
    }
}

Thought maybe this will be useful for someone else as well, using java 8 lambdas I created this function which worked great for me:
private <T> T getOrCreate(Object key, Map<Object, T> map,
        Function<Object, T> creationFunction) {
    T value = map.get(key);
    // if the value doesn't exist yet, create it and add it to the map
    if (value == null) {
        value = creationFunction.apply(key);
        map.put(key, value);
    }
    return value;
}
then you can use it like this:
Object o = getOrCreate(key, map, s -> createSpecialObjectWithKey(key));
I created this for something specific but changed the context and code to look more general; that is why my creationFunction has one parameter. It can also have no parameters.
You can also make it more generic by changing Object to a generic type. If it's not clear, let me know and I'll add another example.
UPDATE:
I just found out about Map.computeIfAbsent which basically does the same, gotta love java 8 :)
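For the original question, a computeIfAbsent-based version might look like the sketch below. The class name Memoizer and the builder function are assumptions standing in for MAP and buildB. The once-per-key behaviour relies on the map being a ConcurrentHashMap specifically, whose computeIfAbsent applies the function at most once per key; a plain HashMap gives no such guarantee under concurrent access.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper; builder plays the role of buildB from the question.
class Memoizer<A, B> {
    private final ConcurrentHashMap<A, B> cache = new ConcurrentHashMap<>();
    private final Function<A, B> builder;

    Memoizer(Function<A, B> builder) {
        this.builder = builder;
    }

    /** Returns the cached value for key, building it first if it is absent. */
    B putIfNeededAndGet(A key) {
        // other threads asking for the same key wait until the builder finishes
        return cache.computeIfAbsent(key, builder);
    }
}
```

One caveat: the builder runs while the map holds an internal lock for that bin, so it should not try to update the same map, and a very long build can delay updates to other keys that hash to the same bin.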
