Java thread safe DAO - java

I have a DAOClass which is called from many Threads as below for inserting into a set of tables -
public class DAOClass
{
private HashMap<String, HelperClass> insertBuffer;
public DAOClass()
{
insertBuffer = new HashMap<String, HelperClass>();
}
public int[] createSomeTable(String key, SomeTableRecord someTableRecord)
{
List<SomeTableRecord> someTableRecList;
HelperClass buf = insertBuffer.get(key);
if (buf == null)
{
buf = new HelperClass();
insertBuffer.put(key, buf);
}
someTableRecList = buf.getSomeTableBuffer();
someTableRecList.add(someTableRecord);
if(someTableRecList.size() >= Global.limit())
{
return flushSomeTableInsertCache(key);
}
else
{
return null;
}
}
public int[] flushSomeTableInsertCache(String key)
{
HelperClass buf = insertBuffer.get(key);
int[] retVal = null;
if (buf != null && buf.getSomeTableBuffer() != null)
{
retVal = createSomeTableBuffered(buf.getSomeTableBuffer());
buf.getSomeTableBuffer().clear();
}
return retVal;
}
public int[] createSomeTableBuffered(final List<SomeTableRecord> someTableRecordList)
{
// INSERT QUERY GOES HERE from LIST..
}
}
Different threads call the createSomeTable method, which adds to an ArrayList held by a HelperClass. The buffers live in a HashMap, but the keys overlap, i.e. the same key is hit by multiple threads simultaneously, which corrupts the HashMap and causes untimely flushes.
Helper Class follows -
class HelperClass {
private String key;
private ArrayList<SomeTableRecord> someTableBuffer;
private ArrayList<SomeTable1Record> someTable1Buffer;
HelperClass() {
someTableBuffer = new ArrayList<SomeTableRecord>();
someTable1Buffer = new ArrayList<SomeTable1Record>();
}
public ArrayList<SomeTableRecord> getSomeTableBuffer() {
return someTableBuffer;
}
public ArrayList<SomeTable1Record> getSomeTable1Buffer() {
return someTable1Buffer;
}
}
But this is apparently not thread safe, because the keys are not disjoint across threads. Can you please suggest some corrections to the classes so that they are thread safe?

You should rather use ArrayList<HelperClass> than HashMap. To avoid conflicts, use
public synchronized int[] createSomeTable(String key, SomeTableRecord someTableRecord)
to protect your buffer.
UPDATE:
To protect the buffer even in Spring, add synchronized to flushSomeTableInsertCache as well:
public synchronized int[] flushSomeTableInsertCache(String key)
Actually, you don't use the keys just to identify the elements.
Otherwise, watching for key collisions this way is not a good strategy, because collisions can also happen between two flushes. You should either check for them in the database or keep a separate HashSet of the keys (if you are sure that it holds all of them).

Use class ConcurrentHashMap instead.

insertBuffer is the only state here. Modifying its content in a multi-threaded environment might result in unexpected behavior. You can either synchronize access to it or use ConcurrentHashMap instead of HashMap.
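Note that swapping in ConcurrentHashMap only protects the map itself; the ArrayList stored under each key still needs its own guard. A minimal sketch of combining both, assuming String records for brevity (BufferedInserter, its limit parameter and the add method are illustrative names, not part of the original classes):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for DAOClass; records are plain Strings for brevity.
class BufferedInserter {
    private final ConcurrentMap<String, List<String>> insertBuffer = new ConcurrentHashMap<>();
    private final int limit; // stands in for Global.limit()

    BufferedInserter(int limit) {
        this.limit = limit;
    }

    // Returns the batch to insert once the buffer for this key is full, otherwise null.
    List<String> add(String key, String record) {
        // computeIfAbsent creates the per-key buffer atomically.
        List<String> buffer = insertBuffer.computeIfAbsent(key, k -> new ArrayList<>());
        // The ArrayList itself is not thread safe, so guard add/size/clear with its monitor.
        synchronized (buffer) {
            buffer.add(record);
            if (buffer.size() < limit) {
                return null;
            }
            List<String> batch = new ArrayList<>(buffer);
            buffer.clear();
            return batch; // the caller can run the batched INSERT outside the lock
        }
    }
}
```

Threads that hit different keys do not block each other here; only threads that share a key contend for the same buffer.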

I would use synchronized methods rather than ConcurrentHashMap. However, using ConcurrentHashMap might solve your thread-safety issue as well.

The simplest way to separate the usage is to create one DAOClass object for each thread.
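One way to get one buffer per thread without passing objects around is a ThreadLocal. This is only a sketch under the assumption that it is acceptable for each thread to flush its own records separately; PerThreadBuffer and its methods are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-thread buffer; each thread gets its own list, so no locking is needed.
class PerThreadBuffer {
    private static final ThreadLocal<List<String>> BUFFER =
            ThreadLocal.withInitial(ArrayList::new);

    static void add(String record) {
        BUFFER.get().add(record);
    }

    static int size() {
        return BUFFER.get().size();
    }
}
```

The trade-off is that records with the same key end up spread over several buffers, so batches fill more slowly and each thread has to flush its own remainder.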

Change your implementation to ConcurrentHashMap, that will solve your concurrency issue.

Related

How to implement thread-safe HashMap lazy initialization when getting value in Java?

I want to implement a util getting an Enum object by its string value. Here is my implementation.
IStringEnum.java
public interface IStringEnum {
String getValue();
}
StringEnumUtil.java
public class StringEnumUtil {
private volatile static Map<String, Map<String, Enum>> stringEnumMap = new HashMap<>();
private StringEnumUtil() {}
public static <T extends Enum<T>> Enum fromString(Class<T> enumClass, String symbol) {
final String enumClassName = enumClass.getName();
if (!stringEnumMap.containsKey(enumClassName)) {
synchronized (enumClass) {
if (!stringEnumMap.containsKey(enumClassName)) {
System.out.println("aaa:" + stringEnumMap.get(enumClassName));
Map<String, Enum> innerMap = new HashMap<>();
EnumSet<T> set = EnumSet.allOf(enumClass);
for (Enum e: set) {
if (e instanceof IStringEnum) {
innerMap.put(((IStringEnum) e).getValue(), e);
}
}
stringEnumMap.put(enumClassName, innerMap);
}
}
}
return stringEnumMap.get(enumClassName).get(symbol);
}
}
I wrote a unit test in order to test whether it works in multi-thread case.
StringEnumUtilTest.java
public class StringEnumUtilTest {
enum TestEnum implements IStringEnum {
ONE("one");
TestEnum(String value) {
this.value = value;
}
@Override
public String getValue() {
return this.value;
}
private String value;
}
@Test
public void testFromStringMultiThreadShouldOk() {
final int numThread = 100;
CountDownLatch startLatch = new CountDownLatch(1);
CountDownLatch doneLatch = new CountDownLatch(numThread);
List<Boolean> resultList = new LinkedList<>();
for (int i = 0; i < numThread; ++i) {
new Thread(() -> {
try {
startLatch.await();
} catch (Exception e) {
e.printStackTrace();
}
resultList.add(StringEnumUtil.fromString(TestEnum.class, "one") != null);
doneLatch.countDown();
}).start();
}
startLatch.countDown();
try {
doneLatch.await();
} catch (Exception e) {
e.printStackTrace();
}
assertEquals(numThread, resultList.stream().filter(item -> item.booleanValue()).count());
}
}
The testing result is:
aaa:null
java.lang.AssertionError:
Expected :100
Actual :98
This shows that only one thread executes this line of code:
System.out.println("aaa:" + stringEnumMap.get(enumClassName));
So the initialization code is executed by only one thread.
The strange thing is that some threads get null after executing this line of code:
return stringEnumMap.get(enumClassName).get(symbol);
Since there is no NullPointerException, stringEnumMap.get(enumClassName) must return the reference to innerMap. But why does calling get(symbol) on innerMap then return null?
Please help, this has been driving me crazy the whole day!
The problem is due to the line
List<Boolean> resultList = new LinkedList<>();
From JavaDoc of LinkedList:
Note that this implementation is not synchronized. If multiple threads access a linked list concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be "wrapped" using the Collections.synchronizedList method. This is best done at creation time, to prevent accidental unsynchronized access to the list:
List list = Collections.synchronizedList(new LinkedList(...));
LinkedList is not thread safe, so unexpected behavior may happen during concurrent add operations.
This can leave resultList with fewer elements than the number of threads, and hence the actual count is lower than the expected count.
To get the correct result, wrap the list with Collections.synchronizedList as suggested.
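A self-contained sketch of the same test pattern with the wrapped list (SynchronizedListDemo and addFromManyThreads are illustrative names, not part of the original test):

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class SynchronizedListDemo {
    // Starts numThreads threads that each add one element, and returns the final list size.
    static int addFromManyThreads(int numThreads) {
        List<Boolean> results = Collections.synchronizedList(new LinkedList<>());
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(numThreads);
        for (int i = 0; i < numThreads; i++) {
            new Thread(() -> {
                try {
                    start.await(); // release all threads at once to provoke contention
                    results.add(Boolean.TRUE);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results.size();
    }
}
```

With the synchronized wrapper every add is serialized, so no element is lost.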
Although your implementation is fine, I suggest following Matt Timmermans's answer for a simpler and more robust solution.
stringEnumMap should be a ConcurrentHashMap<String, Map<String,Enum>>, and use computeIfAbsent to do the lazy initialization.
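A minimal sketch of that suggestion. It keys the cache by Class rather than by class name, which is a small deviation from the question; StringValued, Digit and LazyEnumLookup are hypothetical names standing in for IStringEnum, TestEnum and StringEnumUtil:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

interface StringValued { // plays the role of IStringEnum
    String getValue();
}

enum Digit implements StringValued { // hypothetical example enum
    ONE("one"), TWO("two");

    private final String value;

    Digit(String value) {
        this.value = value;
    }

    @Override
    public String getValue() {
        return value;
    }
}

final class LazyEnumLookup {
    private static final ConcurrentMap<Class<?>, Map<String, Enum<?>>> CACHE =
            new ConcurrentHashMap<>();

    private LazyEnumLookup() {}

    static <T extends Enum<T>> T fromString(Class<T> enumClass, String symbol) {
        // The inner map is built at most once per enum class and published by the outer map.
        Map<String, Enum<?>> inner = CACHE.computeIfAbsent(enumClass, c -> {
            Map<String, Enum<?>> m = new HashMap<>();
            for (T e : enumClass.getEnumConstants()) {
                if (e instanceof StringValued) {
                    m.put(((StringValued) e).getValue(), e);
                }
            }
            return m;
        });
        Enum<?> found = inner.get(symbol);
        return found == null ? null : enumClass.cast(found);
    }
}
```

The inner HashMap is never modified after computeIfAbsent returns it, so it can be read from any thread without further locking.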
ConcurrentMap interface
As others noted, if manipulating a Map across threads you must account for concurrency.
You could handle concurrent access yourself. But there is no need. Java comes with two implementations of Map that are built to internally handle concurrency. These implementations implement the ConcurrentMap interface.
ConcurrentSkipListMap
ConcurrentHashMap
The first maintains the keys in sorted order, implementing the NavigableMap interface.
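A short illustration of the difference between the two, using made-up keys (ConcurrentMapChoices is just an illustrative holder class):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

class ConcurrentMapChoices {
    // Keys come back in sorted order regardless of insertion order.
    static List<String> sortedKeys() {
        ConcurrentNavigableMap<String, Integer> sorted = new ConcurrentSkipListMap<>();
        sorted.put("pear", 3);
        sorted.put("apple", 1);
        sorted.put("mango", 2);
        return new ArrayList<>(sorted.keySet());
    }

    // ConcurrentHashMap makes no ordering promise but offers the same atomic operations.
    static int putTwice() {
        ConcurrentMap<String, Integer> hashed = new ConcurrentHashMap<>();
        hashed.putIfAbsent("apple", 1);
        hashed.putIfAbsent("apple", 99); // ignored: the key is already present
        return hashed.get("apple");
    }
}
```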
Here is a table I authored to show the characteristics of all the implementations of Map bundled with Java 11.
You might find other third-party implementations of the ConcurrentMap interface.
Try moving
if (!stringEnumMap.containsKey(enumClassName))
and the
return stringEnumMap.get(enumClassName).get(symbol);
into the synchronized block.

synchronize a method by achieving better performance?

I have a class that is being called by multiple threads on multi core machine. I want to make it thread safe.
The add method will be called by multiple threads. If the key exists, it should add the new value to the current value; otherwise it should just put the key and value in the map.
Now to make it thread safe, I was planning to synchronize add method but it will destroy performance. Is there any better way by which we can achieve better performance without synchronizing add method?
class Test {
private final Map<Integer, Integer> map = new ConcurrentHashMap<>();
public void add(int key, int value) {
if (map.containsKey(key)) {
int val = map.get(key);
map.put(key, val + value);
return;
}
map.put(key, value);
}
public Object getResult() {
return map.toString();
}
}
but it will destroy performance
It likely wouldn't destroy performance. It will reduce it some, with further reduction if there is a high collision rate.
Is there any better way by which we can achieve better performance?
Yes, use merge() (Java 8+). Quoting the javadoc:
If the specified key is not already associated with a value or is associated with null, associates it with the given non-null value. Otherwise, replaces the associated value with the results of the given remapping function, or removes if the result is null.
Example:
public void add(int key, int value) {
map.merge(key, value, (a, b) -> a + b);
}
Or using a method reference to sum(int a, int b) instead of a lambda expression:
public void add(int key, int value) {
map.merge(key, value, Integer::sum);
}
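To see that merge holds up under contention, here is a small sketch that hammers a single key from several threads. MergeCounter, hammer and the key 42 are illustrative choices, not part of the question's class:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MergeCounter {
    private final Map<Integer, Integer> map = new ConcurrentHashMap<>();

    void add(int key, int value) {
        map.merge(key, value, Integer::sum); // atomic read-modify-write per key
    }

    int get(int key) {
        return map.getOrDefault(key, 0);
    }

    // Runs `threads` threads that each add 1 to the same key `perThread` times.
    static int hammer(int threads, int perThread) {
        MergeCounter counter = new MergeCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.add(42, 1);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get(42);
    }
}
```

The containsKey/get/put sequence from the question can lose updates in the same scenario, because another thread may write between the get and the put.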
Use merge:
class Test {
final Map<Integer, Integer> map = new ConcurrentHashMap<>();
public void add(int key, int value) {
map.merge(key, value, Integer::sum);
}
public Object getResult() {
return map.toString();
}
}
Java 7 solution if you absolutely can't use synchronized (or, you absolutely cannot lock explicitly):
class Test {
final Map<Integer, AtomicInteger> map = new ConcurrentHashMap<>();
public void add(int key, int value) {
get(key).addAndGet(value);
}
private AtomicInteger get(int key) {
AtomicInteger current = map.get(key);
if (current == null) {
AtomicInteger ai = new AtomicInteger();
current = map.putIfAbsent(key, ai);
if (current == null) {
current = ai;
}
}
return current;
}
public Object getResult() {
return map.toString();
}
}
synchronized causes a bottleneck only when you run an expensive operation holding a lock.
In your case by adding a synchronized you are doing:
1. check a hashmap for existence of a key
2. get the value mapped to that key
3. do an addition and put the result back to the hashmap.
All these operations are super cheap, O(1), and unless you are using some strange pattern for the keys, which are integers, it is very unlikely that you will get degenerate performance due to collisions.
If you can't use merge as the other answers point out, I would suggest just synchronizing. You should be this concerned about performance only in critical hot paths, and only after you have actually profiled and found that there is an issue there.
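A sketch of the plain synchronized variant, assuming all access to the map goes through these two methods (SynchronizedAdder is an illustrative name; once every access is synchronized, a plain HashMap is sufficient):

```java
import java.util.HashMap;
import java.util.Map;

class SynchronizedAdder {
    private final Map<Integer, Integer> map = new HashMap<>();

    // The lookup, the addition and the put happen under one lock, so no update is lost.
    public synchronized void add(int key, int value) {
        Integer current = map.get(key);
        map.put(key, current == null ? value : current + value);
    }

    public synchronized String getResult() {
        return map.toString();
    }
}
```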

Call synchronised on a function in java

Is there a way I can call synchronised (or something similar) on a block of code. For example (pseudo code),
public int getA(int id) {
if (flag) {
return synchronized(fetchA(id))
} else {
return fetchA(id)
}
}
public int fetchA(int id) {
if (map.get(id) == null) {
p = generate(id)
map.put(id, p)
return map.get(id)
} else {
return map.get(id)
}
}
In this case I want the function to take a lock on the object map if flag is set to true, and not take a lock otherwise. I have read that synchronised takes locks only on objects. Is there something else I can use instead of synchronised?
Synchronising on an object is exactly the right thing to do. In your case the shared object is the map.
You can do a
synchronized(map) {
return fetchA(id);
}
The locking is some sort of contract: if you access map, then you'll have to lock it. This can be cumbersome and error prone. Hence the better option is to lock and release within the fetchA() method, like
public int fetchA(int id) {
synchronized (map) {
// Check and insert under the same lock so that generate(id) runs at most once per id.
if (map.get(id) == null) {
map.put(id, generate(id));
}
return map.get(id);
}
}
This way any other method can simply call fetchA() without being aware of the need of locking.
An alternative is to declare the function synchronized, which locks on this rather than on map. That way only a single thread at a time can enter the function on a given instance.
public synchronized int fetchA(int id) { ... }
Having said this, be careful with nested locking. That's a good way to produce deadlocks.
To answer your direct question: just use a synchronized block:
synchronized (something) {
return fetchA(id);
}
But your approach is not great in the first place. For one thing, there is the computeIfAbsent method which does exactly what your fetchA method does:
public int fetchA(int id) {
return map.computeIfAbsent(id, k -> generate(k));
}
For another, it seems that flag is going to be a constant for the instance, since it doesn't make sense to access the map in a synchronized way only some of the time.
So, choose a Map implementation based on the flag in the constructor:
if (flag) {
map = new ConcurrentHashMap<>();
} else {
map = new HashMap<>();
}
and then simply don't worry about whether you need to synchronize in your method:
public int getA(int id) {
return map.computeIfAbsent(id, k -> generate(k));
}
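Put together, the idea might look like the following sketch. ValueCache is an illustrative class name, and generate is a hypothetical stand-in for the question's generate(id):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ValueCache {
    private final Map<Integer, Integer> map;

    ValueCache(boolean threadSafe) {
        // Pick the implementation once; the methods below never need to know which one it is.
        if (threadSafe) {
            this.map = new ConcurrentHashMap<>();
        } else {
            this.map = new HashMap<>();
        }
    }

    int getA(int id) {
        return map.computeIfAbsent(id, ValueCache::generate);
    }

    // Hypothetical stand-in for generate(id); here it simply squares the id.
    private static int generate(int id) {
        return id * id;
    }
}
```

Only the instance created with threadSafe set to true may be shared between threads.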
You could use ConcurrentHashMap. Unlike HashMap, its iterators are weakly consistent and do not throw ConcurrentModificationException if one thread modifies the map while another is iterating over it.
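A small single-threaded check of how the two map types react to modification during iteration. ConcurrentHashMap documents its iterators as weakly consistent, while HashMap iterators are fail-fast; IterationCheck and throwsOnModification are illustrative names:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class IterationCheck {
    // Returns true if modifying the map while iterating over it raised
    // ConcurrentModificationException.
    static boolean throwsOnModification(Map<Integer, String> map) {
        for (int i = 0; i < 3; i++) {
            map.put(i, "v" + i);
        }
        try {
            for (Integer key : map.keySet()) {
                if (key < 100) {
                    map.put(key + 100, "added"); // structural change mid-iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

Calling it with a HashMap reports the exception; calling it with a ConcurrentHashMap completes the loop normally, although the iteration may or may not see the entries added along the way.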
At the code-block level there are four places where you can use synchronized:
Instance methods
Static methods
Code blocks inside instance methods
Code blocks inside static methods
Refer to this for more info: Code block level synchronization
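The four forms listed above can be sketched as follows (SyncForms and its counters are made up for illustration):

```java
class SyncForms {
    private static int staticCount = 0;
    private int instanceCount = 0;

    // 1. Instance method: locks on `this`.
    synchronized void incInstance() {
        instanceCount++;
    }

    // 2. Static method: locks on SyncForms.class.
    static synchronized void incStatic() {
        staticCount++;
    }

    // 3. Block inside an instance method: locks on the object you name.
    void incInstanceBlock() {
        synchronized (this) {
            instanceCount++;
        }
    }

    // 4. Block inside a static method: there is no `this`, so name the class
    //    or another shared object.
    static void incStaticBlock() {
        synchronized (SyncForms.class) {
            staticCount++;
        }
    }

    synchronized int instanceCount() {
        return instanceCount;
    }

    static synchronized int staticCount() {
        return staticCount;
    }
}
```

Forms 1 and 3 share one lock per instance, and forms 2 and 4 share the class-wide lock, so an instance method and a static method never exclude each other.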

java - concurrent updates to multiple objects

I am new to Java concurrency. I have a class holding data (doubles in the sample code below) that should be accessed with a get (like a Map), but with the data stored internally in an array for performance reasons.
This runs in a multithreaded environment, and the index must be updated sometimes.
public class ConcurrencySampleCode {
private static Object lock = new Object();
private Map<String, Integer> map = ...
private double[] array = ...
public Double get(String id) {
synchronized (lock) {
Integer i = map.get(id);
if (i == null) {
return null;
}
return array[i];
}
}
public void update() {
Map<String, Integer> tmpMap = updateMap(...);
double[] tmpArray = updateArray(...);
synchronized (lock) { // should be atomic
map = tmpMap;
array = tmpArray;
}
}
}
I am not sure whether this code is correct or not? Also, is the synchronized keyword needed in the get function ?
Is there a better way of doing this ?
Thanks for your help
There's nothing wrong with your code. Because get and update both synchronize on the same lock, every thread already sees the updated map and array; the volatile keyword would only become necessary if you removed the synchronized block from get, and even then it would not make the two assignments atomic as a pair. I'm also not sure you want the lock to be static.
As an alternative, you may want to check out the java.util.concurrent.atomic package. It has some handy thread-safe variable types. For example, you could move your map and array into their own class, then use an AtomicReference to store the object.
public class ConcurrencySampleCode {
private AtomicReference<DoubleMap> atomicMap = new AtomicReference(new DoubleMap());
//Inner class used to hold the map and array pair
public class DoubleMap {
private Map<String, Integer> map = ...
private double[] array = ...
}
public Double get(String id) {
DoubleMap map = atomicMap.get();
...
}
public void update() {
Map<String, Integer> tmpMap = updateMap(...);
double[] tmpArray = updateArray(...);
DoubleMap newMap = new DoubleMap(tmpMap, tmpArray);
atomicMap.set(newMap);
}
}
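A fuller, compilable sketch of the same idea, assuming the new map and array are passed in by the caller. IndexedValues and Snapshot are illustrative names, and the defensive copies are an extra precaution not present in the outline above:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

class IndexedValues {
    // Immutable pair: the map and the array are always replaced together.
    private static final class Snapshot {
        final Map<String, Integer> index;
        final double[] values;

        Snapshot(Map<String, Integer> index, double[] values) {
            this.index = new HashMap<>(index); // defensive copies
            this.values = values.clone();
        }
    }

    private final AtomicReference<Snapshot> current =
            new AtomicReference<>(new Snapshot(new HashMap<>(), new double[0]));

    public Double get(String id) {
        Snapshot s = current.get(); // read the reference once, then use only `s`
        Integer i = s.index.get(id);
        if (i == null) {
            return null;
        }
        return s.values[i];
    }

    public void update(Map<String, Integer> newIndex, double[] newValues) {
        current.set(new Snapshot(newIndex, newValues)); // one atomic swap
    }
}
```

Readers never see a new map paired with an old array, because both travel inside the same Snapshot object.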
There is a lot going on in concurrent programming, but for one thing your update() method is faulty. As written, multiple threads can call update() at the same time, and each of them runs both updateMap(...) and updateArray(...) before the synchronized block is reached. This means that the last thread to enter the synchronized block overwrites the map and array without incorporating the changes made by the earlier update calls.
Long story short, try to use and understand ConcurrentHashMap.

Create and put a map value only if not already present, and get it: thread-safe implementation

What is the best way to make this snippet thread-safe?
private static final Map<A, B> MAP = new HashMap<A, B>();
public static B putIfNeededAndGet(A key) {
B value = MAP.get(key);
if (value == null) {
value = buildB(...);
MAP.put(key, value);
}
return value;
}
private static B buildB(...) {
// business, can be quite long
}
Here are the few solutions I could think about:
I could use a ConcurrentHashMap, but if I understood correctly, it only makes the individual put and get operations atomic and thread-safe, i.e. it does not ensure that buildB() is called only once for a given key.
I could use Collections.synchronizedMap(new HashMap<A, B>()), but I would have the same issue as in the first point.
I could make the whole putIfNeededAndGet() method synchronized, but a great many threads may call this method at the same time, so it could be quite expensive.
I could use the double-checked locking pattern, but there is still the related out-of-order writes issue.
What other solutions may I have?
I know this is a quite common topic on the Web, but I didn't find a clear, full and working example yet.
Use ConcurrentHashMap and the lazy init pattern which you used
public static B putIfNeededAndGet(A key) {
B value = map.get(key);
if (value == null) {
value = buildB(...);
B oldValue = map.putIfAbsent(key, value);
if (oldValue != null) {
value = oldValue;
}
}
return value;
}
This might not be the answer you're looking for, but use the Guava CacheBuilder, it already does all that and more:
private static final LoadingCache<A, B> CACHE = CacheBuilder.newBuilder()
.maximumSize(100) // if necessary
.build(
new CacheLoader<A, B>() {
public B load(A key) {
return buildB(key);
}
});
You can also easily add timed expiration and other features as well.
This cache will ensure that load() (or in your case buildB) will not be called concurrently with the same key. If one thread is already building a B, then any other caller will just wait for that thread.
In the above solution it is possible that many threads will call buildB(...) simultaneously, so all of them will compute the value. In my case I am using a Future: only a single thread gets null as the old value, so only that thread runs buildB, and the rest wait on f.get().
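private static final ConcurrentMap<A, Future<B>> map = new ConcurrentHashMap<A, Future<B>>();
public static B putIfNeededAndGet(final A key) throws InterruptedException {
while (true) {
Future<B> f = map.get(key);
if (f == null) {
Callable<B> eval = new Callable<B>() {
public B call() throws InterruptedException {
return buildB(...);
}
};
FutureTask<B> ft = new FutureTask<B>(eval);
f = map.putIfAbsent(key, ft);
if (f == null) {
// This thread won the race, so it runs the computation itself.
f = ft;
ft.run();
}
}
try {
return f.get();
} catch (CancellationException e) {
// The computation was cancelled: drop the stale Future and retry.
map.remove(key, f);
} catch (ExecutionException e) {
// buildB failed: drop the failed Future and rethrow, otherwise the loop would spin forever.
map.remove(key, f);
throw new IllegalStateException(e.getCause());
}
}
}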
I thought this might be useful for someone else as well. Using Java 8 lambdas, I created this function, which worked great for me:
private <T> T getOrCreate(Object key, Map<Object, T> map,
Function<Object, T> creationFunction) {
T value = map.get(key);
// if the value doesn't exist yet, create and add it
if (value == null) {
value = creationFunction.apply(key);
map.put(key, value);
}
return value;
}
then you can use it like this:
Object o = getOrCreate(key, map, s -> createSpecialObjectWithKey(key));
I created this for something specific but changed the context and code to look more general; that is why my creationFunction has one parameter. It could also have no parameters.
You can also make it more generic by changing Object to a generic type. If it's not clear, let me know and I'll add another example.
UPDATE:
I just found out about Map.computeIfAbsent, which basically does the same thing. Gotta love Java 8 :)
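On a ConcurrentHashMap, computeIfAbsent is also atomic, so the builder runs at most once per key even when many threads ask for it at the same time. A minimal sketch, with String keys and values standing in for A and B (BuildOnce, buildsFor and the body of buildB are hypothetical):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class BuildOnce {
    private static final ConcurrentMap<String, String> MAP = new ConcurrentHashMap<>();
    private static final AtomicInteger BUILD_CALLS = new AtomicInteger();

    static String putIfNeededAndGet(String key) {
        // ConcurrentHashMap applies the function at most once per absent key.
        return MAP.computeIfAbsent(key, BuildOnce::buildB);
    }

    // Hypothetical stand-in for the question's buildB(...).
    private static String buildB(String key) {
        BUILD_CALLS.incrementAndGet();
        return "B(" + key + ")";
    }

    // Requests the same key from several threads and reports how often buildB ran.
    static int buildsFor(String key, int threads) {
        int before = BUILD_CALLS.get();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> putIfNeededAndGet(key));
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return BUILD_CALLS.get() - before;
    }
}
```

One caveat: while the function runs, other updates that land in the same part of the map may block, so a very long-running buildB is better served by the Future-based approach.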
