I have a class that is called by multiple threads on a multi-core machine, and I want to make it thread safe.
The add method will be called by multiple threads. If the key already exists, the new value should be added to the current value; otherwise the key and value should simply be put into the map.
Now, to make it thread safe, I was planning to synchronize the add method, but it will destroy performance. Is there any better way by which we can achieve better performance without synchronizing the add method?
class Test {
    private final Map<Integer, Integer> map = new ConcurrentHashMap<>();

    public void add(int key, int value) {
        if (map.containsKey(key)) {
            int val = map.get(key);
            map.put(key, val + value);
            return;
        }
        map.put(key, value);
    }

    public Object getResult() {
        return map.toString();
    }
}
but it will destroy performance
It likely wouldn't destroy performance. It will reduce it some, with further reduction if there is a high collision rate.
Is there any better way by which we can achieve better performance?
Yes, use merge() (Java 8+). Quoting the javadoc:
If the specified key is not already associated with a value or is associated with null, associates it with the given non-null value. Otherwise, replaces the associated value with the results of the given remapping function, or removes if the result is null.
Example:
public void add(int key, int value) {
    map.merge(key, value, (a, b) -> a + b);
}
Or, using a method reference to Integer.sum(int a, int b) instead of a lambda expression:
public void add(int key, int value) {
    map.merge(key, value, Integer::sum);
}
Use merge:
class Test {
    final Map<Integer, Integer> map = new ConcurrentHashMap<>();

    public void add(int key, int value) {
        map.merge(key, value, Integer::sum);
    }

    public Object getResult() {
        return map.toString();
    }
}
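Note that on a ConcurrentHashMap the whole merge invocation is performed atomically, so concurrent add calls cannot lose updates. A small illustrative check (not part of the original answer; the thread and iteration counts are arbitrary):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class MergeDemo {
    public static void main(String[] args) throws InterruptedException {
        Test t = new Test();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 8; i++) {
            pool.submit(() -> {
                for (int j = 0; j < 1000; j++) {
                    t.add(1, 1);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println(t.getResult()); // expected: {1=8000}
    }
}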
A Java 7 solution, if you absolutely can't use synchronized (or rather, absolutely cannot lock explicitly):
class Test {
    final Map<Integer, AtomicInteger> map = new ConcurrentHashMap<>();

    public void add(int key, int value) {
        get(key).addAndGet(value);
    }

    private AtomicInteger get(int key) {
        AtomicInteger current = map.get(key);
        if (current == null) {
            AtomicInteger ai = new AtomicInteger();
            // putIfAbsent returns the existing counter if another thread won the race
            current = map.putIfAbsent(key, ai);
            if (current == null) {
                current = ai;
            }
        }
        return current;
    }

    public Object getResult() {
        return map.toString();
    }
}
synchronized causes a bottleneck only when you run an expensive operation while holding the lock.
In your case by adding a synchronized you are doing:
1. check a hashmap for existence of a key
2. get the value mapped to that key
3. do an addition and put the result back to the hashmap.
All these operations are super cheap, O(1), and unless you are using some strange pattern for the keys (which are integers) it is very unlikely that you will get degenerate performance due to collisions.
I would suggest, if you can't use merge as the other answers point out, to just synchronize. You should be this concerned about performance only in critical hot paths, and only after you have actually profiled that there is an issue there.
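For reference, a minimal sketch of the plain synchronized variant (assuming a HashMap guarded by the object's monitor; getResult needs the same lock):

import java.util.HashMap;
import java.util.Map;

class SynchronizedTest {
    private final Map<Integer, Integer> map = new HashMap<>();

    public synchronized void add(int key, int value) {
        Integer val = map.get(key);                      // one lookup instead of containsKey + get
        map.put(key, val == null ? value : val + value);
    }

    public synchronized Object getResult() {
        return map.toString();
    }
}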
Related
I want to implement a utility that gets an Enum object by its string value. Here is my implementation.
IStringEnum.java
public interface IStringEnum {
    String getValue();
}
StringEnumUtil.java
public class StringEnumUtil {
    private volatile static Map<String, Map<String, Enum>> stringEnumMap = new HashMap<>();

    private StringEnumUtil() {}

    public static <T extends Enum<T>> Enum fromString(Class<T> enumClass, String symbol) {
        final String enumClassName = enumClass.getName();
        if (!stringEnumMap.containsKey(enumClassName)) {
            synchronized (enumClass) {
                if (!stringEnumMap.containsKey(enumClassName)) {
                    System.out.println("aaa:" + stringEnumMap.get(enumClassName));
                    Map<String, Enum> innerMap = new HashMap<>();
                    EnumSet<T> set = EnumSet.allOf(enumClass);
                    for (Enum e : set) {
                        if (e instanceof IStringEnum) {
                            innerMap.put(((IStringEnum) e).getValue(), e);
                        }
                    }
                    stringEnumMap.put(enumClassName, innerMap);
                }
            }
        }
        return stringEnumMap.get(enumClassName).get(symbol);
    }
}
I wrote a unit test in order to check whether it works in the multi-threaded case.
StringEnumUtilTest.java
public class StringEnumUtilTest {
    enum TestEnum implements IStringEnum {
        ONE("one");

        TestEnum(String value) {
            this.value = value;
        }

        @Override
        public String getValue() {
            return this.value;
        }

        private String value;
    }

    @Test
    public void testFromStringMultiThreadShouldOk() {
        final int numThread = 100;
        CountDownLatch startLatch = new CountDownLatch(1);
        CountDownLatch doneLatch = new CountDownLatch(numThread);
        List<Boolean> resultList = new LinkedList<>();

        for (int i = 0; i < numThread; ++i) {
            new Thread(() -> {
                try {
                    startLatch.await();
                } catch (Exception e) {
                    e.printStackTrace();
                }
                resultList.add(StringEnumUtil.fromString(TestEnum.class, "one") != null);
                doneLatch.countDown();
            }).start();
        }

        startLatch.countDown();
        try {
            doneLatch.await();
        } catch (Exception e) {
            e.printStackTrace();
        }
        assertEquals(numThread, resultList.stream().filter(item -> item.booleanValue()).count());
    }
}
The test result is:
aaa:null
java.lang.AssertionError:
Expected :100
Actual :98
This shows that only one thread executed this line of code:
System.out.println("aaa:" + stringEnumMap.get(enumClassName));
So the initialization code is executed by only one thread.
The strange thing is that for some threads the result is null after executing this line of code:
return stringEnumMap.get(enumClassName).get(symbol);
Since there is no NullPointerException, stringEnumMap.get(enumClassName) must return the reference to innerMap. But why does calling get(symbol) on innerMap return null?
Please help, it has been driving me crazy the whole day!
The problem is due to the line
List<Boolean> resultList = new LinkedList<>();
From JavaDoc of LinkedList:
Note that this implementation is not synchronized. If multiple threads access a linked list concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be "wrapped" using the Collections.synchronizedList method. This is best done at creation time, to prevent accidental unsynchronized access to the list:
List list = Collections.synchronizedList(new LinkedList(...));
As LinkedList is not thread safe, unexpected behavior may happen during the add operation.
This causes the resultList size to be less than the thread count, and hence the actual count is less than the expected count.
To get the correct result, wrap the list with Collections.synchronizedList as suggested.
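For example, in the test:

List<Boolean> resultList = Collections.synchronizedList(new LinkedList<>());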
Although your implementation is fine, I suggest you follow Matt Timmermans' answer for a simpler and more robust solution.
stringEnumMap should be a ConcurrentHashMap<String, Map<String,Enum>>, and use computeIfAbsent to do the lazy initialization.
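A minimal sketch of that suggestion, keeping the same IStringEnum interface from the question:

private static final ConcurrentMap<String, Map<String, Enum>> stringEnumMap = new ConcurrentHashMap<>();

public static <T extends Enum<T>> Enum fromString(Class<T> enumClass, String symbol) {
    Map<String, Enum> innerMap = stringEnumMap.computeIfAbsent(enumClass.getName(), name -> {
        Map<String, Enum> m = new HashMap<>();
        for (T e : EnumSet.allOf(enumClass)) {
            if (e instanceof IStringEnum) {
                m.put(((IStringEnum) e).getValue(), e);
            }
        }
        return m;
    });
    return innerMap.get(symbol);
}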
ConcurrentMap interface
As others noted, if manipulating a Map across threads you must account for concurrency.
You could handle concurrent access yourself. But there is no need. Java comes with two implementations of Map that are built to internally handle concurrency. These implementations implement the ConcurrentMap interface.
ConcurrentSkipListMap
ConcurrentHashMap
The first maintains the keys in sorted order, implementing the NavigableMap interface.
Here is a table I authored to show the characteristics of all the implementations of Map bundled with Java 11.
You might find other third-party implementations of the ConcurrentMap interface.
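A small illustration of the difference between the two (not from the original answer; the keys are arbitrary):

import java.util.Map;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

class ConcurrentMapDemo {
    public static void main(String[] args) {
        NavigableMap<String, Integer> sorted = new ConcurrentSkipListMap<>();
        sorted.put("banana", 2);
        sorted.put("apple", 1);
        System.out.println(sorted.keySet());  // [apple, banana] - keys kept in sorted order

        Map<String, Integer> hashed = new ConcurrentHashMap<>();
        hashed.put("banana", 2);
        hashed.put("apple", 1);
        System.out.println(hashed.keySet());  // iteration order is unspecified
    }
}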
try moving
if (!stringEnumMap.containsKey(enumClassName))
and the
return stringEnumMap.get(enumClassName).get(symbol);
into the synchronized block.
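A rough sketch of that suggestion, assuming the same fields as the question (note that it serializes all readers of a given enum class):

public static <T extends Enum<T>> Enum fromString(Class<T> enumClass, String symbol) {
    final String enumClassName = enumClass.getName();
    synchronized (enumClass) {
        if (!stringEnumMap.containsKey(enumClassName)) {
            // ... build innerMap and put it into stringEnumMap, exactly as in the question ...
        }
        return stringEnumMap.get(enumClassName).get(symbol);
    }
}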
I tried to use the Set interface as the value for a Hazelcast IMap instance, and when I ran my test I found that the test hung inside the ConcurrentMap#compute method.
Why do I get an infinite loop when I use a Hazelcast IMap in this code:
import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.IMap;

import java.io.Serializable;
import java.util.*;

public class Main {
    public static void main(String[] args) {
        IMap<String, HashSet<StringWrapper>> store = Hazelcast.newHazelcastInstance(
                new Config().addMapConfig(new MapConfig("store"))
        ).getMap("store");

        store.compute("user", (k, value) -> {
            HashSet<StringWrapper> newValues = Objects.isNull(value) ? new HashSet<>() : new HashSet<>(value);
            newValues.add(new StringWrapper("user"));
            return newValues;
        });
        store.compute("user", (k, value) -> {
            HashSet<StringWrapper> newValues = Objects.isNull(value) ? new HashSet<>() : new HashSet<>(value);
            newValues.add(new StringWrapper("user"));
            return newValues;
        });

        System.out.println(store.keySet());
    }

    // Data class
    public static class StringWrapper implements Serializable {
        String value;

        public StringWrapper() {}

        public StringWrapper(String value) {
            this.value = value;
        }

        public String getValue() {
            return value;
        }

        public void setValue(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            if (!super.equals(o)) return false;
            StringWrapper value = (StringWrapper) o;
            return Objects.equals(this.value, value.value);
        }

        @Override
        public int hashCode() {
            return Objects.hash(super.hashCode(), value);
        }
    }
}
Hazelcast: 3.9.3
Java: build 1.8.0_161-b12
Operating system: macOS High Sierra 10.13.3
@Alykoff I reproduced the issue based on the above example & an ArrayList version, which is reported as a GitHub issue: https://github.com/hazelcast/hazelcast/issues/12557.
There are 2 separate problems:
1 - When using HashSet, the problem is how Java deserializes the HashSet/ArrayList (collections) & how the compute method works. Inside the compute method (since Hazelcast is compiled against Java 6 & there is no compute method to override, the default implementation from ConcurrentMap is called), this block causes the infinite loop:
// replace
if (replace(key, oldValue, newValue)) {
    // replaced as expected.
    return newValue;
}

// some other value replaced old value. try again.
oldValue = get(key);
This replace method calls IMap's replace method. IMap checks whether the current value is equal to the user-supplied value. But because of a Java serialization optimization, the check fails. Please look at the HashSet.readObject method. You'll see that when deserializing the HashSet, since the element size is known, it creates the inner HashMap with a capacity:
// Set the capacity according to the size and load factor ensuring that
// the HashMap is at least 25% full but clamping to maximum capacity.
capacity = (int) Math.min(size * Math.min(1 / loadFactor, 4.0f),
HashMap.MAXIMUM_CAPACITY);
But your HashSet, created without an initial capacity, has a default capacity of 16, while the deserialized one has an initial capacity of 1. This changes the serialized form: index 51 contains the current capacity, & it seems the JDK re-calculates it based on the size when deserializing the object, to minimize the size.
Please see the example below:
HazelcastInstance hz = Hazelcast.newHazelcastInstance();
IMap<String, Collection<String>> store = hz.getMap("store");
Collection<String> val = new HashSet<>();
val.add("a");
store.put("a", val);
Collection<String> oldVal = store.get("a");
byte[] dataOld = ((HazelcastInstanceProxy) hz).getSerializationService().toBytes(oldVal);
byte[] dataNew = ((HazelcastInstanceProxy) hz).getSerializationService().toBytes(val);
System.out.println(Arrays.equals(dataNew, dataOld));
This code prints false. But if you create the HashSet with the initial size 1, then both byte arrays are equal. And in your case, you won't get an infinite loop.
2 - When using ArrayList, or any other collection, there's another problem, which you pointed out above. Due to how the compute method is implemented in ConcurrentMap, when you assign the old value to newValue & add a new element, you actually modify oldValue, thus causing the replace method to fail. But when you change the code to new ArrayList(value), you're creating a new ArrayList & the value collection is not modified. It's a best practice to wrap a collection before using it if you don't want to modify the original one. The same works for HashSet if you create it with size 1, due to the first issue I explained.
So in your case, you should use
Collection<String> newValues = Objects.isNull(value) ? new HashSet<>(1) : new HashSet<>(value);
or
Collection<String> newValues = Objects.isNull(value) ? new ArrayList<>() : new ArrayList<>(value);
That HashSet case seems to be a JDK issue rather than an optimization. I don't know if any of these cases can be solved/fixed in Hazelcast, unless Hazelcast overrides the HashXXX collection serialization & overrides the compute method.
I've got a question about synchronizing on objects inside a Map (the same objects whose values I later change). I want to atomically read, do checks and possibly update a value from a map without locking the entire map. Is this a valid way to work with synchronization of objects?
private final Map<String, AtomicInteger> valueMap = new HashMap<>();

public Response addValue(@NotNull String key, @NotNull Integer value) {
    AtomicInteger currentValue = valueMap.get(key);
    if (currentValue == null) {
        synchronized (valueMap) {
            // Doublecheck that value hasn't been changed before entering synchronized
            currentValue = valueMap.get(key);
            if (currentValue == null) {
                currentValue = new AtomicInteger(0);
                valueMap.put(key, currentValue);
            }
        }
    }
    synchronized (valueMap.get(key)) {
        // Check that value hasn't been changed when changing synchronized blocks
        currentValue = valueMap.get(key);
        if (currentValue.get() + value > MAX_LIMIT) {
            return OVERFLOW;
        }
        currentValue.addAndGet(value);
        return OK;
    }
}
I fail to see much of a difference between your approach and that of a standard ConcurrentHashMap - aside from the fact that ConcurrentHashMap has been heavily tested and can be configured for minimal overhead with the exact number of threads you want to run the code with.
In a ConcurrentHashMap, you would use the replace(K key, V old, V new) method to atomically update key to new only when the old value has not changed.
The space savings from removing all those AtomicIntegers and the time savings from lower synchronization overhead will probably compensate for having to wrap the replace(k, old, new) calls within while-loops:
ConcurrentHashMap<String, Integer> valueMap =
        new ConcurrentHashMap<>(16, .75f, expectedConcurrentThreadCount);

public Response addToKey(@NotNull String key, @NotNull Integer value) {
    if (value > MAX_LIMIT) {
        // probably should set value to MAX_LIMIT-1 before failing
        return OVERFLOW;
    }
    boolean updated = false;
    do {
        Integer old = valueMap.putIfAbsent(key, value);
        if (old == null) {
            // it was absent, and now it has been updated to value: ok
            updated = true;
        } else if (old + value > MAX_LIMIT) {
            // probably should set value to MAX_LIMIT-1 before failing
            return OVERFLOW;
        } else {
            updated = valueMap.replace(key, old, old + value);
        }
    } while (!updated);
    return OK;
}
Also, on the plus side, this code works even if the key was removed after checking it (yours throws an NPE in this case).
I was just wondering what would happen if the key of a HashMap is mutable. The test program below demonstrates this, and I am unable to understand why, when both the equals and hashCode methods return true and the same value, map.containsKey returns false.
public class MutableKeyHashMap {
    public static void main(String[] a) {
        HashMap<Mutable, String> map = new HashMap<Mutable, String>();
        Mutable m1 = new Mutable(5);
        map.put(m1, "m1");
        Mutable m2 = new Mutable(5);
        System.out.println(map.containsKey(m2));
        m2.setA(6);
        m1.setA(6);
        Mutable m3 = map.keySet().iterator().next();
        System.out.println(map.containsKey(m2) + " " + m3.hashCode() + " " + m2.hashCode() + " " + m3.equals(m2));
    }
}

class Mutable {
    int a;

    public Mutable(int a) {
        this.a = a;
    }

    @Override
    public boolean equals(Object obj) {
        Mutable m = (Mutable) obj;
        return m.a == this.a;
    }

    @Override
    public int hashCode() {
        return a;
    }

    public void setA(int a) {
        this.a = a;
    }

    public int getA() {
        return a;
    }
}
This is the output:
true
false 6 6 true
The javadoc explains it
Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map.
Basically, don't use mutable objects as keys in a Map, you're going to get burnt
To extrapolate, because the docs may not appear clear: I believe the pertinent point here is "changed in a manner that affects equals comparisons", and you seem to be assuming that equals(Object) is called each time containsKey is invoked. The docs don't say that; the wording implies they may be allowed to cache computations.
Looking at the source, it seems that because your hashCode returns a different value (was 5, now 6), it's possible that it's being looked up in a different bucket based on implementation details.
You can think of it this way: the Map has 16 buckets. When you give it an object with A == 5, it tosses it into bucket 5. Now you can change A to 6, but it's still in bucket 5. The Map doesn't know you changed A; it doesn't rearrange things internally.
Now you come over with another object with A == 6, and you ask the Map if it has one of those. It goes and looks in bucket 6 and says "Nope, nothing there." It's not going to go and check all the other buckets for you.
Obviously how things get put into buckets is more complicated than that, but that's how it works at the core.
The HashMap puts your object at the location for hash key 5. Then you change the key to 6 and use containsKey to ask the map whether it contains the object. The map looks at position 6 and finds nothing, so it answers false.
So don't do that, then.
When you put "m1" the first time around, hashCode() was 5. Thus the HashMap used 5 to place the value into the appropriate bucket. After changing m2, the hashCode() was 6, so when you tried looking for the value you put in, the bucket it looked in was different.
A code example to accompany ptomli's answer.
import java.util.*;

class Elem {
    private int n;

    public Elem(int n) {
        this.n = n;
    }

    public void setN(int n) {
        this.n = n;
    }

    @Override
    public int hashCode() {
        return n;
    }

    @Override
    public boolean equals(Object e) {
        if (this == e)
            return true;
        if (!(e instanceof Elem))
            return false;
        Elem an = (Elem) e;
        return n == an.n;
    }
}

public class MapTest {
    public static void main(String[] args) {
        Elem e1 = new Elem(1);
        Elem e2 = new Elem(2);

        HashMap<Elem, Integer> map = new HashMap<>();
        map.put(e1, 100);
        map.put(e2, 200);

        System.out.println("before modification: " + map.get(e1));
        e1.setN(9);
        System.out.println("after modification using updated key: " + map.get(e1));

        Elem e3 = new Elem(1);
        System.out.println("after modification using key which equals to the original key: " + map.get(e3));
    }
}
Compile and run it. The result is:
before modification: 100
after modification using updated key: null
after modification using key which equals to the original key: null
I am using Java 6 on Linux.
What is the best way to make this snippet thread-safe?
private static final Map<A, B> MAP = new HashMap<A, B>();

public static B putIfNeededAndGet(A key) {
    B value = MAP.get(key);
    if (value == null) {
        value = buildB(...);
        MAP.put(key, value);
    }
    return value;
}

private static B buildB(...) {
    // business, can be quite long
}
Here are the few solutions I could think of:
I could use a ConcurrentHashMap, but if I understood correctly, it only makes the individual put and get operations thread-safe, i.e. it does not ensure that the buildB() method is called only once for a given key.
I could use Collections.synchronizedMap(new HashMap<A, B>()), but I would have the same issue as the first point.
I could make the whole putIfNeededAndGet() method synchronized, but I may have really many threads accessing this method at the same time, so it could be quite expensive.
I could use the double-checked locking pattern, but there is still the related out-of-order writes issue.
What other solutions may I have?
I know this is quite a common topic on the Web, but I haven't found a clear, full and working example yet.
Use a ConcurrentHashMap and the lazy init pattern which you used:
public static B putIfNeededAndGet(A key) {
    B value = map.get(key);
    if (value == null) {
        value = buildB(...);
        B oldValue = map.putIfAbsent(key, value);
        if (oldValue != null) {
            value = oldValue;
        }
    }
    return value;
}
This might not be the answer you're looking for, but use the Guava CacheBuilder; it already does all that and more:
private static final LoadingCache<A, B> CACHE = CacheBuilder.newBuilder()
        .maximumSize(100) // if necessary
        .build(
                new CacheLoader<A, B>() {
                    public B load(A key) {
                        return buildB(key);
                    }
                });
You can also easily add timed expiration and other features.
This cache will ensure that load() (or in your case buildB) will not be called concurrently with the same key. If one thread is already building a B, then any other caller will just wait for that thread.
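Retrieval then looks something like this (a minimal sketch; LoadingCache.get throws a checked ExecutionException, while getUnchecked wraps it):

B value = CACHE.getUnchecked(key);   // builds via buildB(key) on first access, then caches
// or, if you want to handle failures of buildB explicitly:
B value2 = CACHE.get(key);           // throws ExecutionException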
In the above solution it is possible that many threads will call buildB(...) simultaneously, hence all of them will compute it. But in my case, using a Future, only the single thread that gets the old value as null will compute buildB; the rest will wait on f.get().
private static final ConcurrentMap<A, Future<B>> map = new ConcurrentHashMap<A, Future<B>>();

public static B putIfNeededAndGet(A key) {
    while (true) {
        Future<B> f = map.get(key);
        if (f == null) {
            Callable<B> eval = new Callable<B>() {
                public B call() throws InterruptedException {
                    return buildB(...);
                }
            };
            FutureTask<B> ft = new FutureTask<B>(eval);
            f = map.putIfAbsent(key, ft);
            if (f == null) {
                // this thread won the race, so it runs the computation
                f = ft;
                ft.run();
            }
        }
        try {
            return f.get();
        } catch (CancellationException e) {
            map.remove(key, f);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
    }
}
Thought maybe this will be useful for someone else as well: using Java 8 lambdas I created this function, which worked great for me:
private <T> T getOrCreate(Object key, Map<Object, T> map,
                          Function<Object, T> creationFunction) {
    T value = map.get(key);
    // if the value doesn't exist yet - create and add it
    if (value == null) {
        value = creationFunction.apply(key);
        map.put(key, value);
    }
    return value;
}
then you can use it like this:
Object o = getOrCreate(key, map, s -> createSpecialObjectWithKey(key));
I created this for something specific but changed the context and code to look more general; that is why my creationFunction has one parameter (it could also have no parameters).
You can also make it more generic by changing Object to a generic type; if that's not clear, let me know and I'll add another example.
UPDATE:
I just found out about Map.computeIfAbsent, which basically does the same thing. Gotta love Java 8 :)
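For example, a minimal sketch of the same idea with computeIfAbsent (createSpecialObjectWithKey is the hypothetical factory from the usage example above):

Object o = map.computeIfAbsent(key, k -> createSpecialObjectWithKey(k));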