How to synchronize multiple threads from accessing some common data - java

I have three different threads, each of which creates its own object to read and manipulate some data that is common to all the threads. I need to ensure that only one thread at a time is given access.
The example goes something like this.
public interface CommonData {
    public void addData(); // adds data to the cache
    public String getDataAccessKey(); // key that is common across threads for each data type
}
/*
* Singleton class
*/
public class CommonDataCache {
private final Map dataMap = new HashMap(); // this takes keys and values as custom objects
}
The implementation class of the interface would look like this
class CommonDataImpl implements CommonData {
    private final String key;
    public CommonDataImpl(String key) {
        this.key = key;
    }
    public void addData() {
        // access the singleton cache class and add
    }
    public String getDataAccessKey() {
        return key;
    }
}
Each thread will be invoked as follows:
CommonData data = new CommonDataImpl("Key1");
new Thread(() -> data.addData()).start();
CommonData data1 = new CommonDataImpl("Key1");
new Thread(() -> data1.addData()).start();
CommonData data2 = new CommonDataImpl("Key1");
new Thread(() -> data2.addData()).start();
Now, I need to synchronize those threads if and only if the keys of the data objects (passed on to the threads) are the same.
My thought process so far:
I tried to have a class that provides the lock on the fly for a given key which looks something like this.
/*
* Singleton class
*/
public class DataAccessKeyToLockProvider {
    private final Map<String, ReentrantLock> accessKeyToLockHolder = new ConcurrentHashMap<>();
    private DataAccessKeyToLockProvider() {
    }
    public ReentrantLock getLock(String key) {
        // computeIfAbsent returns the existing or the newly created lock;
        // putIfAbsent would return null on the first insertion
        return accessKeyToLockHolder.computeIfAbsent(key, k -> new ReentrantLock());
    }
    public void removeLock(String key) {
        accessKeyToLockHolder.remove(key);
    }
}
So each thread would call this class, get the lock, use it, and remove it once the processing is done. But this can also result in a case where the second thread gets the lock object that was inserted by the first thread and waits for the first thread to release it. Once the first thread removes the lock, the third thread gets a different lock altogether, so the second and third threads are no longer synchronized with each other.
Something like this:
new Thread(() -> {
ReentrantLock lock = DataAccessKeyToLockProvider.getLock(data.getDataAccessKey());
lock.lock();
data.addData();
lock.unlock();
DataAccessKeyToLockProvider.removeLock(data.getDataAccessKey());
}).start();
Please let me know if you need any additional details to help me resolve my problem.
P.S.: Removing the key from the lock provider is more or less mandatory, as I will be dealing with millions of keys (not necessarily strings), so I don't want the lock provider to eat up my memory.
Inspired by the solution provided by @rzwitserloot, I have tried to put together some generic code that waits for the other thread to complete its processing before giving access to the next thread.
public class GenericKeyToLockProvider<K> {
private volatile Map<K, ReentrantLock> keyToLockHolder = new ConcurrentHashMap<>();
public synchronized ReentrantLock getLock(K key) {
ReentrantLock existingLock = keyToLockHolder.get(key);
try {
if (existingLock != null && existingLock.isLocked()) {
existingLock.lock(); // Waits for the thread that acquired the lock previously to release it
}
return keyToLockHolder.put(key, new ReentrantLock()); // Override with the new lock
} finally {
if (existingLock != null) {
existingLock.unlock();
}
}
}
}
But it looks like the entry made by the last thread would never be removed. Is there any way to solve this?

First, a clarification: you either use ReentrantLock, or you use synchronized. You don't synchronize on a ReentrantLock instance (you can synchronize on any object you want). If you want to go the lock route, you call the lock() method on your lock object, using a try/finally guard to ensure you always call unlock() later (and you don't use synchronized at all).
synchronized is low-level API. Lock, and all the other classes in the java.util.concurrent package are higher level and offer far more abstractions. It's generally a good idea to just peruse the javadoc of all the classes in the j.u.c package from time to time, very useful stuff in there.
The key issue is to remove all references to a lock object (thus ensuring it can be garbage collected), but not until you are certain there are zero active threads locking on it. Your current approach does not know how many threads are waiting. That needs to be fixed. Once you return an instance of a Lock object, it's 'out of your hands' and it is not possible to track whether the caller is ever going to call lock on it. Thus, you can't do that. Instead, call lock as part of the job; the getLock method should actually do the locking as part of the operation. That way, YOU get to control the process flow. However, let's first take a step back:
You say you'll have millions of keys. Okay; but it is somewhat unlikely you'll have millions of threads. After all, a thread requires a stack, and even using the -Xss parameter to reduce the stack size to the minimum of 128k or so, a million threads implies you're using up 128GB of RAM just for stacks; seems unlikely.
So, whilst you might have millions of keys, the number of 'locked' keys is MUCH smaller. Let's focus on those.
You could make a ConcurrentHashMap which maps your string keys to lock objects. Then:
To acquire a lock:
Create a new lock object (literally: Object o = new Object(); we are going to be using synchronized) and add it to the map using putIfAbsent. putIfAbsent returns the value previously associated with the key, or null if there was none; so if it returns null, you were the one to add the key/value pair: you got the lock, go, run the code. Once you're done, acquire the sync lock on your object, send a notification, remove the entry, and release:
public void doWithLocking(String key, Runnable op) {
    Object locker = new Object();
    Object o = concurrentMap.putIfAbsent(key, locker);
    if (o == null) { // null means no previous mapping: we own the key now
        op.run();
        synchronized (locker) {
            locker.notifyAll(); // wake up everybody waiting.
            concurrentMap.remove(key); // this has to be inside!
        }
    } else {
        ...
    }
}
To wait until the lock is available, first acquire a lock on the locker object, THEN check if the concurrentMap still contains it. If not, you're now free to retry this operation. If it's still in, then we now wait for a notification. In any case we always just retry from scratch. Thus:
public void doWithLocking(String key, Runnable op) throws InterruptedException {
    while (true) {
        Object locker = new Object();
        Object o = concurrentMap.putIfAbsent(key, locker);
        if (o == null) { // null means no previous mapping: we own the key now
            try {
                op.run();
            } finally {
                // We want to unlock even if the operation throws!
                synchronized (locker) {
                    locker.notifyAll(); // wake up everybody waiting.
                    concurrentMap.remove(key); // this has to be inside!
                }
            }
            return;
        } else {
            synchronized (o) {
                // Only wait if 'o' is still the registered locker; otherwise its
                // owner has already finished and nobody would ever notify us.
                if (concurrentMap.get(key) == o) o.wait();
            }
        }
    }
}
Instead of this setup where you pass the operation to execute along with the lock key, you could have tandem 'lock' and 'unlock' methods but now you run the risk of writing code that forgets to call unlock. Hence why I wouldn't advise it!
You can call this with, for example:
keyedLockSupportThingie.doWithLocking("mykey", () -> {
System.out.println("Hello, from safety!");
});
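As a complement to the wait/notify version above, here is a minimal sketch of the "know how many threads are interested in a lock" idea, using a per-key reference count. It is not the code from this answer: the class name RefCountedKeyLocks and the methods runLocked and size are hypothetical, and the sketch assumes keys with a stable equals/hashCode. The entry is created and its counter incremented atomically inside ConcurrentHashMap.compute, and the last user removes the entry, so the map only holds keys that are currently in use.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper: one lock per key, removed as soon as nobody uses it.
final class RefCountedKeyLocks<K> {

    private static final class Entry {
        final ReentrantLock lock = new ReentrantLock();
        int users; // only touched inside compute(), which is atomic per key
    }

    private final ConcurrentHashMap<K, Entry> entries = new ConcurrentHashMap<>();

    void runLocked(K key, Runnable op) {
        // Register interest: create the entry if needed and count this caller.
        Entry entry = entries.compute(key, (k, existing) -> {
            Entry e = (existing == null) ? new Entry() : existing;
            e.users++;
            return e;
        });
        entry.lock.lock();
        try {
            op.run();
        } finally {
            entry.lock.unlock();
            // Deregister: returning null from compute() removes the mapping.
            entries.compute(key, (k, e) -> (--e.users == 0) ? null : e);
        }
    }

    // Number of keys that currently have at least one interested thread.
    int size() {
        return entries.size();
    }
}
```

Because the operation is passed in, the caller cannot forget to unlock, which matches the design choice recommended above.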

Related

Concurrently find and remove element from collection or wait

public class MyClass {
private List<Integer> resources = new ArrayList<>();
public synchronized Integer getAndRemoveResourceOrWait(Integer requestedResource) throws InterruptedException {
while(resources.stream().noneMatch((r) -> { return r >= requestedResource; })) {
wait();
}
Integer found = resources.stream().filter((r) -> {
return r >= requestedResource;
}).findFirst().get();
resources.remove(found);
return found;
}
public void addResource(Integer resource) {
resources.add(resource);
notifyAll();
}
}
Thread "A" occasionally invokes addResource with a random value.
A few other threads actively invoke getAndRemoveResourceOrWait.
What do I need to do to let the method getAndRemoveResourceOrWait work concurrently?
For example, thread "X" invokes getAndRemoveResourceOrWait with the value 128, which does not exist in the resources collection, so it starts waiting for it. While it is waiting, thread "Y" invokes getAndRemoveResourceOrWait with the value 64, which does exist in the resources collection. Thread "Y" should not wait for thread "X" to complete.
What do I need to do to let the method getAndRemoveResourceOrWait work concurrently?
It simply needs to run on a different thread to the one that calls addResource(resource).
Note that getAndRemoveResourceOrWait is a blocking (synchronous) operation in the sense that the thread making the call is blocked until it gets the answer. However, one thread that is calling getAndRemoveResourceOrWait does not block another thread calling getAndRemoveResourceOrWait. The key is that the wait() call releases the mutex, and then reacquires it after the thread has been notified. What will happen here is that a notifyAll will cause all waiting threads to wake up; they then reacquire the mutex one at a time.
However, there is a bug in your addResource method. The method needs to be declared as synchronized. If you call notifyAll() while the current thread does not hold the mutex on this, you will get an IllegalMonitorStateException. (And this is also necessary to ensure that the updates to the shared resources object are visible ... in both directions.)
Also, this implementation is not going to scale well:
Each waiting thread will scan the entire resource list on every update; i.e. on every call to addResource.
When a waiting thread finds a resource, it will scan the list twice more to remove it.
All of this is done while holding the mutex on the shared MyClass instance ... which blocks addResource as well.
UPDATE - Assuming that the resource values are unique, a better solution would be to replace ArrayList with TreeSet. This should work:
public class MyClass {
private TreeSet<Integer> resources = new TreeSet<>();
public synchronized Integer getAndRemoveResourceOrWait(
Integer resource) throws InterruptedException {
while (true) {
Integer found = resources.tailSet(resource, true).pollFirst();
if (found != null) {
return found;
}
wait();
}
}
public synchronized void addResource(Integer resource) {
resources.add(resource);
notifyAll();
}
}
(I also tried ConcurrentSkipListSet but I couldn't figure out a way to avoid using a mutex while adding and removing. If you were trying to remove an equal resource, it could be done ...)
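To see why a waiting caller does not block the others, here is a small self-contained sketch of the TreeSet approach. ResourcePool, takeAtLeast and add are hypothetical names, and converting InterruptedException into an unchecked exception is a simplification made only for this sketch, not part of the answer.

```java
import java.util.TreeSet;

// Hypothetical variant of MyClass, kept small for demonstration.
final class ResourcePool {
    private final TreeSet<Integer> resources = new TreeSet<>();

    // Blocks until some resource >= min is available, then removes and returns it.
    synchronized Integer takeAtLeast(int min) {
        try {
            while (true) {
                Integer found = resources.ceiling(min); // smallest element >= min, or null
                if (found != null) {
                    resources.remove(found);
                    return found;
                }
                wait(); // releases the monitor on this pool while waiting
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    synchronized void add(int resource) {
        resources.add(resource);
        notifyAll(); // every waiter re-checks its own condition
    }
}
```

If the pool holds only 64, a thread asking for 128 parks inside wait() and gives up the monitor, so another thread asking for 64 gets it right away; a later add(200) wakes the first thread.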

Java How to implement lock on ConcurrentHashMap read

TL;DR: in Java I have N threads, each using a shared collection. ConcurrentHashMap allows me to lock on write, but not on read. What I need is to lock a specific item of the collection, read the previous data, do some computation, and update the values. If two threads receive two messages from the same sender, the second thread has to wait for the first one to finish, before doing its stuff.
Long version:
These threads are receiving chronologically ordered messages, and they have to update the collection based on a messageSenderID.
My simplified code is as follows:
public class Parent {
private Map<String, MyObject> myObjects;
ExecutorService executor;
List<Future<?>> runnables = new ArrayList<Future<?>>();
public Parent(){
myObjects= new ConcurrentHashMap<String, MyObject>();
executor = Executors.newFixedThreadPool(10);
for (int i = 0; i < 10; i++) {
WorkerThread worker = new WorkerThread("worker_" + i);
Future<?> future = executor.submit(worker);
runnables.add(future);
}
}
private synchronized JSONObject getMessageFromSender(){
// Get a message from the common source
}
private synchronized MyObject getMyObject(String id){
MyObject myObject = myObjects.get(id);
if (myObject == null) {
myObject = new MyObject(id);
myObjects.put(id, myObject);
}
return myObject;
}
private class WorkerThread implements Runnable {
private String name;
public WorkerThread(String name) {
this.name = name;
}
@Override
public void run() {
while(!isStopped()) {
JSONObject message = getMessageFromSender();
String id = message.getString("id");
MyObject myObject = getMyObject(id);
synchronized (myObject) {
doLotOfStuff(myObject);
}
}
}
}
}
So basically I have one producer and N consumers, to speed-up processing, but the N consumers have to deal with a common base of data and chronological order has to be respected.
I am currently using a ConcurrentHashMap, but I'm willing to change it if needed.
The code seems to work if messages with same ID arrive enough apart (> 1 second), but if I get two messages with the same ID in the distance of microseconds, I get two threads dealing with the same item in the collection.
I GUESS that my desired behavior is:
Thread 1 Thread 2
--------------------------------------------------------------
read message 1
find ID
lock that ID in collection
do computation and update
read message 2
find ID
lock that ID in collection
do computation and update
While I THINK that this is what happens:
Thread 1 Thread 2
--------------------------------------------------------------
read message 1
read message 2
find ID
lock that ID in collection
do computation and update
find ID
lock that ID in collection
do computation and update
I thought about doing something like
JSONObject message = getMessageFromSender();
synchronized(message){
String id = message.getString("id");
MyObject myObject = getMyObject(id);
synchronized (myObject) {
doLotOfStuff(myObject);
} // well maybe this inner synchronized is superfluous, at this point
}
But I think that would kill the whole purpose of having a multithreaded structure, since I would read one message at a time, and the workers are not doing anything else; and it would be like if I was using a SynchronizedHashMap instead of a ConcurrentHashMap.
For the record, here is the solution I eventually implemented. I'm not sure it is optimal and I still have to test its performance, but at least the input is handled properly.
public class Parent implements Runnable {
private final static int NUM_WORKERS = 10;
ExecutorService executor;
List<Future<?>> futures = new ArrayList<Future<?>>();
List<WorkerThread> workers = new ArrayList<WorkerThread>();
@Override
public void run() {
executor = Executors.newFixedThreadPool(NUM_WORKERS);
for (int i = 0; i < NUM_WORKERS; i++) {
WorkerThread worker = new WorkerThread("worker_" + i);
Future<?> future = executor.submit(worker);
futures.add(future);
workers.add(worker);
}
while(!isStopped()) {
byte[] message = getMessageFromSender();
byte[] id = getId(message);
int n = Integer.valueOf(Byte.toString(id[id.length-1])) % NUM_WORKERS;
if(n >= 0 && n <= (NUM_WORKERS-1)){
workers.get(n).addToQueue(message);
}
}
}
private class WorkerThread implements Runnable {
private String name;
private final Map<String, MyObject> myObjects = new HashMap<>(); // only touched by this worker
private final LinkedBlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
public WorkerThread(String name) {
this.name = name;
}
public void addToQueue(byte[] line) {
queue.add(line);
}
@Override
public void run() {
while(!isStopped()) {
byte[] message = queue.poll();
if(message != null) {
String id = getId(message);
MyObject myObject = getMyObject(id);
doLotOfStuff(myObject);
}
}
}
}
}
Conceptually this is a kind of routing problem. What you need to do is:
Get your main thread (a single thread) to read messages off the queue and push the data to a FIFO queue per id.
Get a single thread to consume messages from each queue.
Locking examples will (probably) not work, because after the second message the order is not guaranteed, even if fair=true.
From Javadoc:
Even when this lock has been set to use a fair ordering policy, a call to tryLock() will immediately acquire the lock if it is available, whether or not other threads are currently waiting for the lock.
One thing for you to decide is whether you want to create a thread per queue (which will exit once the queue is empty) or keep the fixed-size thread pool and manage the extra bookkeeping needed to assign threads to queues.
So, you get a single thread reading from the original queue and writing to the per-id queues, and you also get one thread per id reading from the individual queues. This will ensure task serialization.
In terms of performance, you should see significant speed-up as long as the incoming messages have a nice distribution (id-wise). If you get mostly same-id messages then task will be serialized and also include the overhead for control object creation and synchronization.
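The routing idea can be sketched with a fixed number of single-threaded executors, which corresponds to the "fixed-size pool" option rather than one thread per id. This is an illustration under assumptions, not the answerer's code: KeyedRouter, submit and shutdown are hypothetical names, and ids are assumed to be Strings. Every id always maps to the same lane, and each lane runs its tasks in submission order, so as long as a single thread does the dispatching, messages for one id are processed sequentially and in order without any explicit lock.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical router: same id -> same single-threaded lane.
final class KeyedRouter {
    private final ExecutorService[] lanes;

    KeyedRouter(int laneCount) {
        lanes = new ExecutorService[laneCount];
        for (int i = 0; i < laneCount; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Queues the task on the lane that owns this id.
    CompletableFuture<Void> submit(String id, Runnable task) {
        int lane = Math.floorMod(id.hashCode(), lanes.length); // never negative
        return CompletableFuture.runAsync(task, lanes[lane]);
    }

    void shutdown() {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
    }
}
```

State that belongs to one id is only ever touched by one lane thread, so it can live in a plain HashMap owned by that lane.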
You could use a separate Map for your locks. There's also WeakHashMap, which automatically discards entries once the key is no longer strongly referenced elsewhere.
static final Map<String, Lock> locks = Collections.synchronizedMap(new WeakHashMap<>());
public void lock(String id) throws InterruptedException {
// Grab a Lock out of the map.
Lock l = locks.computeIfAbsent(id, k -> new ReentrantLock());
// Lock it.
l.lockInterruptibly();
}
public void unlock(String id) {
// Is it locked?
Lock l = locks.get(id);
if ( l != null ) {
l.unlock();
}
}
I think you have the right idea with your synchronized blocks, except you mis-analyze a bit and go too far in any case. The outer synchronized block shouldn't force you into dealing with only one message at a time, it just keeps multiple threads from accessing the same message at once. But you don't need it. You really only need that inner synchronized block, on the MyObject instance. That will ensure that only one thread at a time can access any given MyObject instance, while enabling other threads to access messages, the Map and other MyObject instances as much as they want.
JSONObject message = getMessageFromSender();
String id = message.getString("id");
MyObject myObject = getMyObject(id);
synchronized (myObject) {
doLotOfStuff(myObject);
}
If you don't like that, and the updates to the MyObject instances all involve single-method invocations, then you could just synchronize all of those methods. You still retain concurrency in the Map, but you're protecting the MyObject itself from concurrent updates.
class MyObject {
    public synchronized void updateFoo() {
        // ...
    }
    public synchronized void updateBar() {
        // ...
    }
}
When any Thread accesses any updateX() method it will automatically lock out any other Thread from accessing that or any other synchronized method. That would be simplest, if your updates match that pattern.
If not, then you'll need to make all of your worker Threads cooperate by using some sort of locking protocol. The ReentrantLock that OldCurmudgeon suggests is a good choice, but I would put it on MyObject itself. To keep things ordered properly, you should use the fairness parameter (see http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/locks/ReentrantLock.html#ReentrantLock-boolean-). "When set true, under contention, locks favor granting access to the longest-waiting thread."
class MyObject {
private final ReentrantLock lock = new ReentrantLock(true);
public void lock() {
lock.lock();
}
public void unlock() {
lock.unlock();
}
public void updateFoo() {
// ...
}
public void updateBar() {
// ...
}
}
Then you could update things like this:
JSONObject message = getMessageFromSender();
String id = message.getString("id");
MyObject myObject = getMyObject(id);
myObject.lock();
try {
doLotOfStuff(myObject);
}
finally {
myObject.unlock();
}
The important takeaway is that you don't need to control access to the messages, nor the Map. All you need to do is ensure that any given MyObject is being updated by at most one thread at a time.
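When the per-id update is short and touches only that one entry, ConcurrentHashMap can do the read-compute-write step atomically without a separate lock object, via compute or merge. This is an alternative not proposed in the answers above; SenderTotals and its methods are hypothetical, and a running total per sender merely stands in for the real MyObject update.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example: per-sender state updated atomically inside the map.
final class SenderTotals {
    private final ConcurrentHashMap<String, Long> totals = new ConcurrentHashMap<>();

    // Atomically reads the previous total for this sender, adds to it,
    // and stores the result; other senders are not blocked by this call.
    void record(String senderId, long amount) {
        totals.merge(senderId, amount, Long::sum);
    }

    long totalFor(String senderId) {
        return totals.getOrDefault(senderId, 0L);
    }
}
```

The remapping function should stay short and must not modify other entries of the same map. Note that this only makes each update atomic; it does not by itself decide which of two racing messages is applied first.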
Actually, here is a design idea: when a consumer takes a request to work on your object, it should remove the object with that ID from your list of objects and re-insert it once the processing is done. Any other consumer that gets a request to work on the object with the same ID should then block, waiting for the object with that ID to reappear in the list. You will need to add some bookkeeping to keep a record of all existing objects, so that you can distinguish between an object that already exists but is not currently in the list (i.e. is being processed by some other consumer) and an object that does not exist yet.
You could get some speedup if you split the JSON parsing from doLotOfStuff(). One thread listens for messages, parses them, then puts the parsed message on a Queue to maintain chronological order. A second thread reads from that Queue and calls doLotOfStuff() with no need for locking.
However, since you apparently need more than a 2X speedup this is probably insufficient.
Added
Another possibility is multiple HashMaps. For example, if all the IDs are ints, make 10 HashMaps for IDs ending with 0,1,2... Incoming messages get directed to one of 10 threads, which parse the JSON and update their relevant Map. Order is maintained within each Map, and there are no locking or contention issues. Assuming the message IDs are randomly distributed this yields up to a 10x speedup, though there is one extra layer of overhead to get at your Map. e.g.
JSON thread:
while (notInterrupted) {
    read / parse next JSON message
    mapToUse = ID % 10
    pass JSON to that thread's queue
}
Threads 0-9:
while (notInterrupted) {
    take JSON off queue
    // I'm the only one writing to Map#N
    do computation and update ID
}

Java Concurrency: thread-safe modification of values in maps

I'm having a bit of trouble concerning concurrency and maps in Java.
Basically I have multiple threads using (reading and modifying) their own maps, however each of these maps is a part of a larger map which is being read and modified by a further thread:
My main method creates all threads, the threads create their respective maps which are then put into the "main" map:
Map<String, MyObject> mainMap = new HashMap<String, Integer>();
FirstThread t1 = new FirstThread();
mainMap.putAll(t1.getMap());
t1.start();
SecondThread t2 = new SecondThread();
mainMap.putAll(t2.getMap());
t2.start();
ThirdThread t3 = new ThirdThread(mainMap);
t3.start();
The problem I'm facing now is that the third (main) thread sees arbitrary values in the map, depending on when one or both of the other threads update "their" items.
I must however guarantee that the third thread can iterate over - and use the values of - the map without having to fear that a part of what is being read is "old":
FirstThread (analogous to SecondThread):
for (MyObject o : map.values()) {
o.setNewValue(getNewValue());
}
ThirdThread:
for (MyObject o : map.values()) {
doSomethingWith(o.getNewValue());
}
Any ideas? I've considered using a globally accessible (static final Object through a static class) lock which will be synchronized in each thread when the map must be modified.
Or are there specific Map implementations that address this particular problem which I could use?
Thanks in advance!
Edit:
As suggested by @Pyranja, it would be possible to synchronize the getNewValue() method. However, I forgot to mention that I am in fact trying to do something along the lines of transactions, where t1 and t2 modify multiple values before/after t3 works with said values. t3 is implemented in such a way that doSomethingWith() will not actually do anything with the value if it hasn't changed.
To synchronize at a higher level than the individual value objects, you need locks to handle the synchronization between the various threads. One way to do this, without changing your code too much, is a ReadWriteLock. Thread 1 and Thread 2 are writers, Thread 3 is a reader.
You can either do this with two locks, or one. I've sketched out below doing it with one lock, two writer threads, and one reader thread, without worrying about what happens with an exception during data update (ie, transaction rollback...).
All that said, this sounds like a classic producer-consumer scenario. You should consider using something like a BlockingQueue for communication between threads, as is outlined in this question.
There's other things you may want to consider changing as well, like using Runnable instead of extending Thread.
private static final class Value {
public void update() {
}
}
private static final class Key {
}
private final class MyReaderThread extends Thread {
private final Map<Key, Value> allValues;
public MyReaderThread(Map<Key, Value> allValues) {
this.allValues = allValues;
}
@Override
public void run() {
while (!isInterrupted()) {
readData();
}
}
private void readData() {
readLock.lock();
try {
for (Value value : allValues.values()) {
// Do something
}
}
finally {
readLock.unlock();
}
}
}
private final class WriterThread extends Thread {
private final Map<Key, Value> data = new HashMap<Key, Value>();
@Override
public void run() {
while (!isInterrupted()) {
writeData();
}
}
private void writeData() {
writeLock.lock();
try {
for (Value value : data.values()) {
value.update();
}
}
finally {
writeLock.unlock();
}
}
}
private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
private final ReadLock readLock;
private final WriteLock writeLock;
public Thing() {
readLock = lock.readLock();
writeLock = lock.writeLock();
}
public void doStuff() {
WriterThread thread1 = new WriterThread();
WriterThread thread2 = new WriterThread();
Map<Key, Value> allValues = new HashMap<Key, Value>();
allValues.putAll(thread1.data);
allValues.putAll(thread2.data);
MyReaderThread thread3 = new MyReaderThread(allValues);
thread1.start();
thread2.start();
thread3.start();
}
ConcurrentHashMap, from java.util.concurrent, is a thread-safe implementation of Map that provides a much higher degree of concurrency than synchronizedMap. Multiple reads can almost always be performed in parallel, simultaneous reads and writes can usually be performed in parallel, and multiple simultaneous writes can often be performed in parallel. (The class ConcurrentReaderHashMap offers similar parallelism for multiple read operations, but allows only one active write operation.) ConcurrentHashMap is designed to optimize retrieval operations.
Your example code may be misleading. In your first example you create a HashMap<String,Integer> but the second part iterates the map values which in this case are MyObject. The key to synchronization is to understand where and which mutable state is shared.
An Integer is immutable. It can be shared freely (but the reference to an Integer is mutable, so it must be safely published and/or synchronized). But your code example suggests that the maps are populated with mutable MyObject instances.
Given that the map entries (key -> MyObject references) are not changed by any thread, and all maps are created and safely published before any thread starts, it would in my opinion be sufficient to synchronize the modification of MyObject. E.g.:
public class MyObject {
private Object value;
synchronized Object getNewValue() {
return value;
}
synchronized void setNewValue(final Object newValue) {
this.value = newValue;
}
}
If my assumptions are not correct, clarify your question / code example and also consider @jacobm's comment and @Alex's answer.

Java synchronizing based on a parameter (named mutex/lock)

I'm looking for a way to synchronize a method based on the parameter it receives, something like this:
public synchronized void doSomething(name){
//some code
}
I want the method doSomething to be synchronized based on the name parameter like this:
Thread 1: doSomething("a");
Thread 2: doSomething("b");
Thread 3: doSomething("c");
Thread 4: doSomething("a");
Thread 1, Thread 2 and Thread 3 will execute the code without being synchronized, but Thread 4 will wait until Thread 1 has finished the code because it has the same "a" value.
Thanks
UPDATE
Based on Tudor's explanation, I think I'm facing another problem.
Here is a sample of the new code:
private HashMap locks=new HashMap();
public void doSomething(String name){
locks.put(name,new Object());
synchronized(locks.get(name)) {
// ...
}
locks.remove(name);
}
The reason why I don't populate the locks map is because name can have any value.
Based on the sample above, the problem can appear when multiple threads add/delete values from the HashMap at the same time, since HashMap is not thread-safe.
So my question is: if I make the HashMap a ConcurrentHashMap, which is thread-safe, will the synchronized block stop other threads from accessing locks.get(name)?
TL;DR:
I use ConcurrentReferenceHashMap from the Spring Framework. Please check the code below.
Although this thread is old, it is still interesting. Therefore, I would like to share my approach with Spring Framework.
What we are trying to implement is called named mutex/lock. As suggested by Tudor's answer, the idea is to have a Map to store the lock name and the lock object. The code will look like below (I copy it from his answer):
Map<String, Object> locks = new HashMap<String, Object>();
locks.put("a", new Object());
locks.put("b", new Object());
However, this approach has 2 drawbacks:
The OP already pointed out the first one: how to synchronize the access to the locks hash map?
How to remove some locks which are not necessary anymore? Otherwise, the locks hash map will keep growing.
The first problem can be solved by using ConcurrentHashMap. For the second problem, we have two options: manually check and remove locks from the map, or somehow let the garbage collector know which locks are no longer used so that the GC removes them. I will go with the second way.
When we use HashMap or ConcurrentHashMap, it creates strong references. To implement the solution discussed above, weak references should be used instead (to understand what a strong/weak reference is, please refer to this article or this post).
So, I use ConcurrentReferenceHashMap from the Spring Framework. As described in the documentation:
A ConcurrentHashMap that uses soft or weak references for both keys
and values.
This class can be used as an alternative to
Collections.synchronizedMap(new WeakHashMap<K, Reference<V>>()) in
order to support better performance when accessed concurrently. This
implementation follows the same design constraints as
ConcurrentHashMap with the exception that null values and null keys
are supported.
Here is my code. The MutexFactory manages all the locks, where <K> is the type of the key.
@Component
public class MutexFactory<K> {
private ConcurrentReferenceHashMap<K, Object> map;
public MutexFactory() {
this.map = new ConcurrentReferenceHashMap<>();
}
public Object getMutex(K key) {
return this.map.compute(key, (k, v) -> v == null ? new Object() : v);
}
}
Usage:
@Autowired
private MutexFactory<String> mutexFactory;
public void doSomething(String name){
synchronized(mutexFactory.getMutex(name)) {
// ...
}
}
Unit test (this test uses the awaitility library for some methods, e.g. await(), atMost(), until()):
public class MutexFactoryTests {
private final int THREAD_COUNT = 16;
@Test
public void singleKeyTest() {
MutexFactory<String> mutexFactory = new MutexFactory<>();
String id = UUID.randomUUID().toString();
final int[] count = {0};
IntStream.range(0, THREAD_COUNT)
.parallel()
.forEach(i -> {
synchronized (mutexFactory.getMutex(id)) {
count[0]++;
}
});
await().atMost(5, TimeUnit.SECONDS)
.until(() -> count[0] == THREAD_COUNT);
Assert.assertEquals(count[0], THREAD_COUNT);
}
}
Use a map to associate strings with lock objects:
Map<String, Object> locks = new HashMap<String, Object>();
locks.put("a", new Object());
locks.put("b", new Object());
// etc.
then:
public void doSomething(String name){
synchronized(locks.get(name)) {
// ...
}
}
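If the set of names is unbounded and neither pre-populating nor cleaning up a lock map is attractive, lock striping is another option, one not mentioned in the answers here: a fixed array of lock objects indexed by the hash of the name. StripedMutex and its constructor parameter are hypothetical names for this sketch. Equal names always share a stripe, so they are mutually exclusive; different names may occasionally share one too, which costs some parallelism but never correctness, and memory use stays constant.

```java
// Hypothetical striped lock: a fixed pool of monitors shared by all names.
final class StripedMutex {
    private final Object[] stripes;

    StripedMutex(int stripeCount) {
        stripes = new Object[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new Object();
        }
    }

    // Runs body while holding the monitor of the stripe that owns this name.
    void doSomething(String name, Runnable body) {
        Object mutex = stripes[Math.floorMod(name.hashCode(), stripes.length)];
        synchronized (mutex) {
            body.run();
        }
    }
}
```

One caveat with this design: if body itself calls doSomething with a different name, two threads can end up taking two stripes in opposite order and deadlock, so nested use should be avoided.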
The answer of Tudor is fine, but it's static and not scalable. My solution is dynamic and scalable, but it goes with increased complexity in the implementation. The outside world can use this class just like using a Lock, as this class implements the interface. You get an instance of a parameterized lock by the factory method getCanonicalParameterLock.
package lock;
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
public final class ParameterLock implements Lock {
/** Holds a WeakKeyLockPair for each parameter. The mapping may be deleted upon garbage collection
* if the canonical key is not strongly referenced anymore (by the threads using the Lock). */
private static final Map<Object, WeakKeyLockPair> locks = new WeakHashMap<>();
private final Object key;
private final Lock lock;
private ParameterLock (Object key, Lock lock) {
this.key = key;
this.lock = lock;
}
private static final class WeakKeyLockPair {
/** The weakly-referenced parameter. If it were strongly referenced, the entries of
* the lock Map would never be garbage collected, causing a memory leak. */
private final Reference<Object> param;
/** The actual lock object on which threads will synchronize. */
private final Lock lock;
private WeakKeyLockPair (Object param, Lock lock) {
this.param = new WeakReference<>(param);
this.lock = lock;
}
}
public static Lock getCanonicalParameterLock (Object param) {
Object canonical = null;
Lock lock = null;
synchronized (locks) {
WeakKeyLockPair pair = locks.get(param);
if (pair != null) {
canonical = pair.param.get(); // could return null!
}
if (canonical == null) { // no such entry or the reference was cleared in the meantime
canonical = param; // the first thread (the current thread) delivers the new canonical key
pair = new WeakKeyLockPair(canonical, new ReentrantLock());
locks.put(canonical, pair);
}
lock = pair.lock; // read the lock while still holding the monitor: WeakHashMap is not thread-safe
}
// the canonical key is strongly referenced now, so its entry cannot be cleared...
// ... but the key must be kept strongly referenced after this method returns,
// so wrap it in the Lock implementation, which a thread of course needs
// to be able to synchronize. This enforces a thread to have a strong reference
// to the key, while it isn't aware of it (as this method declares to return a
// Lock rather than a ParameterLock).
return new ParameterLock(canonical, lock);
}
@Override
public void lock() {
lock.lock();
}
@Override
public void lockInterruptibly() throws InterruptedException {
lock.lockInterruptibly();
}
@Override
public boolean tryLock() {
return lock.tryLock();
}
@Override
public boolean tryLock(long time, TimeUnit unit) throws InterruptedException {
return lock.tryLock(time, unit);
}
@Override
public void unlock() {
lock.unlock();
}
@Override
public Condition newCondition() {
return lock.newCondition();
}
}
Of course you'd need a canonical key for a given parameter, otherwise threads would not be synchronized as they would be using a different Lock. Canonicalization is the equivalent of the interning of Strings in Tudor's solution. Whereas String.intern() is itself thread-safe, my 'canonical pool' is not, so I need extra synchronization on the WeakHashMap.
This solution works for any type of Object. However, make sure to implement equals and hashCode correctly in custom classes, because if not, threading issues will arise as multiple threads could be using different Lock objects to synchronize on!
The choice for a WeakHashMap is explained by the ease of memory management it brings. How else could one know that no thread is using a particular Lock anymore? And if this could be known, how could you safely delete the entry out of the Map? You would need to synchronize upon deletion, because you have a race condition between an arriving thread wanting to use the Lock, and the action of deleting the Lock from the Map. All these things are just solved by using weak references, so the VM does the work for you, and this simplifies the implementation a lot. If you inspected the API of WeakReference, you would find that relying on weak references is thread-safe.
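For comparison, here is roughly what the manual alternative to weak references could look like. This is a hedged sketch, not the approach used in this answer: the class RefCountedKeyLock and its members are hypothetical, and it relies on ConcurrentHashMap.compute being atomic per key to close the race between an arriving thread and the deletion of the entry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical alternative to the WeakHashMap approach: count the users of each key explicitly.
final class RefCountedKeyLock<K> {
    private static final class Entry {
        final ReentrantLock lock = new ReentrantLock();
        int users; // only modified inside compute/computeIfPresent for this key
    }

    private final ConcurrentHashMap<K, Entry> entries = new ConcurrentHashMap<>();

    void runLocked(K key, Runnable action) {
        // Register interest and fetch (or create) the entry in one atomic step.
        Entry entry = entries.compute(key, (k, v) -> {
            Entry e = (v == null) ? new Entry() : v;
            e.users++;
            return e;
        });
        entry.lock.lock();
        try {
            action.run();
        } finally {
            entry.lock.unlock();
            // Deregister; returning null removes the mapping once nobody is left.
            entries.computeIfPresent(key, (k, v) -> --v.users == 0 ? null : v);
        }
    }

    int size() {
        return entries.size();
    }

    // Small demo: 8 threads increment an unsynchronized counter under equal (not identical) keys.
    static int demo() {
        RefCountedKeyLock<String> locks = new RefCountedKeyLock<>();
        int[] counter = {0};
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    locks.runLocked(new String("key"), () -> counter[0]++);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return locks.size() == 0 ? counter[0] : -1; // -1 would mean an entry leaked
    }
}
```

The bookkeeping is explicit here, which is exactly the complexity the weak-reference design avoids; the trade-off is that cleanup happens deterministically rather than whenever the garbage collector runs.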
Now inspect this test program (you need to run it from inside the ParameterLock class, due to private visibility of some fields):
public static void main(String[] args) {
Runnable run1 = new Runnable() {
@Override
public void run() {
sync(new Integer(5));
System.gc();
}
};
Runnable run2 = new Runnable() {
@Override
public void run() {
sync(new Integer(5));
System.gc();
}
};
Thread t1 = new Thread(run1);
Thread t2 = new Thread(run2);
t1.start();
t2.start();
try {
t1.join();
t2.join();
while (locks.size() != 0) {
System.gc();
System.out.println(locks);
}
System.out.println("FINISHED!");
} catch (InterruptedException ex) {
// those threads won't be interrupted
}
}
private static void sync (Object param) {
Lock lock = ParameterLock.getCanonicalParameterLock(param);
lock.lock();
try {
System.out.println("Thread="+Thread.currentThread().getName()+", lock=" + ((ParameterLock) lock).lock);
// do some work while having the lock
} finally {
lock.unlock();
}
}
Chances are very high that you would see that both threads are using the same lock object, and so they are synchronized. Example output:
Thread=Thread-0, lock=java.util.concurrent.locks.ReentrantLock@8965fb[Locked by thread Thread-0]
Thread=Thread-1, lock=java.util.concurrent.locks.ReentrantLock@8965fb[Locked by thread Thread-1]
FINISHED!
However, with some chance it might be that the 2 threads do not overlap in execution, and therefore it is not required that they use the same lock. You could easily enforce this behavior in debugging mode by setting breakpoints at the right locations, forcing the first or second thread to stop wherever necessary. You will also notice that after the Garbage Collection on the main thread, the WeakHashMap will be cleared, which is of course correct, as the main thread waited for both worker threads to finish their job by calling Thread.join() before calling the garbage collector. This indeed means that no strong reference to the (Parameter)Lock can exist anymore inside a worker thread, so the reference can be cleared from the weak hashmap. If another thread now wants to synchronize on the same parameter, a new Lock will be created in the synchronized part in getCanonicalParameterLock.
Now repeat the test with any pair that has the same canonical representation (= they are equal, so a.equals(b)), and see that it still works:
sync("a");
sync(new String("a"));
sync(new Boolean(true));
sync(new Boolean(true));
etc.
Basically, this class offers you the following functionality:
Parameterized synchronization
Encapsulated memory management
The ability to work with any type of object (under the condition that equals and hashCode is implemented properly)
Implements the Lock interface
This Lock implementation has been tested by modifying an ArrayList concurrently with 10 threads iterating 1000 times, doing this: adding 2 items, then deleting the last found list entry by iterating the full list. A lock is requested per iteration, so in total 10*1000 locks will be requested. No ConcurrentModificationException was thrown, and after all worker threads have finished the total amount of items was 10*1000. On every single modification, a lock was requested by calling ParameterLock.getCanonicalParameterLock(new String("a")), so a new parameter object is used to test the correctness of the canonicalization.
Please note that you shouldn't use String literals or autoboxed primitives as parameters. String literals are automatically interned, so they always have a strong reference; if the first thread arrives with a String literal as its parameter, the lock pool will never be freed of that entry, which is a memory leak. The same goes for autoboxing primitives: e.g. Integer has a caching mechanism that reuses existing Integer objects during autoboxing, which also causes a strong reference to exist. Addressing this, however, is a different story.
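The sharing described here is easy to observe. The short sketch below (the class name SharedInstancesDemo is made up for illustration) compares references rather than values, to show which parameter objects are JVM-wide shared instances and which are fresh objects.

```java
// Illustrative only: shows which objects are shared JVM-wide and therefore stay strongly reachable.
final class SharedInstancesDemo {
    // A literal and an explicitly interned copy are the same pooled instance.
    static boolean literalIsInterned() {
        String literal = "a";
        return literal == new String("a").intern();
    }

    // new String(...) always creates a distinct object, so it can become unreachable later.
    static boolean freshStringIsDistinct() {
        return new String("a") != "a";
    }

    // Autoboxing goes through Integer.valueOf, which caches values from -128 to 127.
    static boolean smallIntegerIsCached() {
        Integer x = 5;
        Integer y = 5;
        return x == y;
    }

    public static void main(String[] args) {
        System.out.println(literalIsInterned());
        System.out.println(freshStringIsDistinct());
        System.out.println(smallIntegerIsCached());
    }
}
```

All three methods return true, which is why a literal or a small autoboxed Integer used as the first parameter pins its entry in the weak map, whereas new String("a") does not.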
Check out this framework. Seems you're looking for something like this.
public class WeatherServiceProxy {
...
private final KeyLockManager lockManager = KeyLockManagers.newManager();
public void updateWeatherData(String cityName, Date samplingTime, float temperature) {
lockManager.executeLocked(cityName, new LockCallback() {
public void doInLock() {
delegate.updateWeatherData(cityName, samplingTime, temperature);
}
});
}
https://code.google.com/p/jkeylockmanager/
I've created a tokenProvider based on the IdMutexProvider of McDowell.
The manager uses a WeakHashMap which takes care of cleaning up unused locks.
You could find my implementation here.
I've found a proper answer through another stackoverflow question: How to acquire a lock by a key
I copied the answer here:
Guava has something like this being released in 13.0; you can get it out of HEAD if you like.
Striped more or less allocates a specific number of locks, and then assigns strings to locks based on their hash code. The API looks more or less like
Striped<Lock> locks = Striped.lock(stripes);
Lock l = locks.get(string);
l.lock();
try {
// do stuff
} finally {
l.unlock();
}
More or less, the controllable number of stripes lets you trade concurrency against memory usage, because allocating a full lock for each string key can get expensive; essentially, you only get lock contention when you get hash collisions, which are (predictably) rare.
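To make the striping idea concrete without depending on Guava, here is a minimal JDK-only sketch. It is not Guava's implementation; the class SimpleStripedLocks and the bit-spreading step are assumptions chosen for illustration.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: a fixed pool of locks indexed by the key's hash code. Not Guava's implementation.
final class SimpleStripedLocks {
    private final Lock[] stripes;

    SimpleStripedLocks(int stripeCount) {
        if (stripeCount <= 0) {
            throw new IllegalArgumentException("stripeCount must be positive");
        }
        stripes = new Lock[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    // Equal keys have equal hash codes, so they always map to the same stripe.
    Lock get(Object key) {
        int h = key.hashCode();
        h ^= h >>> 16; // mix the high bits into the low bits before reducing
        return stripes[Math.floorMod(h, stripes.length)];
    }
}
```

Memory use is fixed by the stripe count, and two different keys only contend when they happen to land on the same stripe.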
Just extending Triet Doan's answer: we also need to handle the case where MutexFactory is used in multiple places, because with the currently suggested code every injection point ends up with the same MutexFactory instance.
For example:
@Autowired
MutexFactory<CustomObject1> mutexFactory1;
@Autowired
MutexFactory<CustomObject2> mutexFactory2;
Both mutexFactory1 and mutexFactory2 will refer to the same factory instance even though their type parameters differ. This is because Spring creates a single instance of MutexFactory during application startup and uses it for both mutexFactory1 and mutexFactory2.
So here is the extra Scope annotation that needs to be added to avoid the above case:
@Component
@Scope(ConfigurableBeanFactory.SCOPE_PROTOTYPE)
public class MutexFactory<K> {
private ConcurrentReferenceHashMap<K, Object> map;
public MutexFactory() {
this.map = new ConcurrentReferenceHashMap<>();
}
public Object getMutex(K key) {
return this.map.compute(key, (k, v) -> v == null ? new Object() : v);
}
}
I've used a cache to store lock objects. My cache expires objects after a period, which only needs to be longer than the time it takes the synchronized process to run.
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
...
private final Cache<String, Object> mediapackageLockCache = CacheBuilder.newBuilder().expireAfterWrite(DEFAULT_CACHE_EXPIRE, TimeUnit.SECONDS).build();
...
public void doSomething(foo) {
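String key = foo.toString();
// putIfAbsent is atomic, so two threads racing on the same key end up with the same lock object
Object lock = new Object();
Object existing = mediapackageLockCache.asMap().putIfAbsent(key, lock);
if (existing != null) {
lock = existing;
}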
synchronized(lock) {
// execute code on foo
...
}
}
I have a much simpler, scalable implementation akin to @timmons' post, taking advantage of Guava's LoadingCache with weakValues. You will want to read the help files on "equality" to understand the suggestion I have made.
Define the following weakValued cache.
private final LoadingCache<String,String> syncStrings = CacheBuilder.newBuilder().weakValues().build(new CacheLoader<String, String>() {
public String load(String x) throws ExecutionException {
return new String(x);
}
});
public void doSomething(String x) {
x = syncStrings.getUnchecked(x); // get(x) would throw the checked ExecutionException
synchronized(x) {
..... // whatever it is you want to do
}
}
Now, thanks to the JVM, we do not have to worry about the cache growing too large: it only holds the cached strings as long as necessary, and the garbage collector and Guava do the heavy lifting.

Java - threads + action

I'm new to Java, so I have a simple question that I don't know where to start with.
I need to write a function that accepts an Action in a multi-threaded program. Only the first thread that enters the function performs the action; all the other threads wait for it to finish and then return from the function without doing anything.
As I said, I don't know where to begin because:
First, there is no static variable inside a function (static as in C/C++), so how do I make only the first thread start the action while the others do nothing?
Second, for the threads to wait, should I use
public synchronized void lala(Action doThis)
{....}
or should I write something like this inside the function
synchronized (this)
{
...
notify();
}
Thanks !
If you want all threads arriving at a method to wait for the first, then they must synchronize on a common object. It could be the same instance (this) on which the methods are invoked, or it could be any other object (an explicit lock object).
If you want to ensure that the first thread is the only one that will perform the action, then you must store this fact somewhere, for all other threads to read, for they will execute the same instructions.
Going by the previous two points, one could lock on this 'fact' variable to achieve the desired outcome:
// Synchronize on this object, and also store the fact in it.
// It is static so that a new Runnable instance does not appear to reset the fact.
// Don't use the Boolean wrapper here, for the value of the flag might be different in certain cases.
static final AtomicBoolean flag = new AtomicBoolean(false);
public void lala(Action doThis)
{
synchronized (flag) // synchronize on the flag so that other threads arriving here, will be forced to wait
{
if(!flag.get()) // This condition is true only for the first thread.
{
doX();
flag.set(true); //set the flag so that other threads will not invoke doX.
}
}
...
doCommonWork();
...
}
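If the other threads should simply wait and then return, as the question describes, a lock-free variant of the same idea is possible. This is a hedged sketch, not the code from this answer: the class RunOnceGate is hypothetical, Runnable stands in for the unspecified Action type, and a CountDownLatch replaces the synchronized block.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: the first caller runs the action, later callers wait until it has finished.
final class RunOnceGate {
    private final AtomicBoolean claimed = new AtomicBoolean(false);
    private final CountDownLatch finished = new CountDownLatch(1);

    // Returns true only for the thread that actually ran the action.
    boolean runOnce(Runnable action) throws InterruptedException {
        if (claimed.compareAndSet(false, true)) { // succeeds for exactly one thread
            try {
                action.run();
            } finally {
                finished.countDown(); // release the waiters even if the action throws
            }
            return true;
        }
        finished.await(); // everyone else blocks until the first thread is done
        return false;
    }

    // Small demo: 8 threads call runOnce; the return value is how often the action ran.
    static int demo() {
        RunOnceGate gate = new RunOnceGate();
        AtomicInteger runs = new AtomicInteger();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                try {
                    gate.runOnce(() -> runs.incrementAndGet());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return runs.get();
    }
}
```

compareAndSet both stores the fact and decides the winner in one atomic step, so no thread can observe the flag as unset after another thread has claimed it.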
If you're doing threading in any recent version of Java, you really should be using the java.util.concurrent package instead of using Threads directly.
Here's one way you could do it:
private final ExecutorService executor = Executors.newCachedThreadPool();
private final Map<Runnable, Future<?>> submitted
= new HashMap<Runnable, Future<?>>();
public void executeOnlyOnce(Runnable action) throws InterruptedException, ExecutionException {
Future<?> future = null;
// NOTE: I was tempted to use a ConcurrentHashMap here, but we don't want to
// get into a possible race with two threads both seeing that a value hasn't
// been computed yet and both starting a computation, so the synchronized
// block ensures that no other thread can be submitting the runnable to the
// executor while we are checking the map. If, on the other hand, it's not
// a problem for two threads to both create the same value (that is, this
// behavior is only intended for caching performance, not for correctness),
// then it should be safe to use a ConcurrentHashMap and use its
// putIfAbsent() method instead.
synchronized(submitted) {
future = submitted.get(action);
if(future == null) {
future = executor.submit(action);
submitted.put(action, future);
}
}
future.get(); // ignore return value because the runnable returns void
}
Note that this assumes that your Action class (I'm assuming you don't mean javax.swing.Action, right?) implements Runnable and also has a reasonable implementation of equals() and hashCode(). Otherwise, you may need to use a different Map implementation (for example, IdentityHashMap).
Also, this assumes that you may have multiple different actions that you want to execute only once. If that's not the case, then you can drop the Map entirely and do something like this:
private final ExecutorService executor = Executors.newCachedThreadPool();
private final Object lock = new Object();
private volatile Runnable action;
private volatile Future<?> future = null;
public void executeOnlyOnce(Runnable action) throws InterruptedException, ExecutionException {
synchronized(lock) {
if(this.action == null) {
this.action = action;
this.future = executor.submit(action);
} else if(!this.action.equals(action)) {
throw new IllegalArgumentException("Unexpected action");
}
}
future.get();
}
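The NOTE in the map-based version mentions ConcurrentHashMap and putIfAbsent. One way to keep exactly-once execution with those is to store a FutureTask per action, so that only the thread that wins putIfAbsent runs it. The sketch below is an assumption-laden variant, not the code above: the class OnceRunner is hypothetical, and the action runs on the calling thread instead of an executor.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant: one FutureTask per action; the thread that wins putIfAbsent runs it.
final class OnceRunner {
    private final ConcurrentMap<Runnable, FutureTask<Void>> tasks = new ConcurrentHashMap<>();

    void executeOnlyOnce(Runnable action) throws InterruptedException, ExecutionException {
        FutureTask<Void> task = tasks.get(action);
        if (task == null) {
            FutureTask<Void> created = new FutureTask<>(action, null);
            task = tasks.putIfAbsent(action, created); // atomic: only one task is ever stored
            if (task == null) {
                task = created;
                created.run(); // this thread won the race, so it performs the action
            }
        }
        task.get(); // all callers wait here until the action has completed
    }

    // Small demo: 8 threads submit the same Runnable; the return value is how often it ran.
    static int demo() {
        OnceRunner runner = new OnceRunner();
        AtomicInteger runs = new AtomicInteger();
        Runnable action = () -> runs.incrementAndGet();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                try {
                    runner.executeOnlyOnce(action);
                } catch (InterruptedException | ExecutionException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return runs.get();
    }
}
```

As with the map-based version, this depends on the action having a sensible equals() and hashCode(); a lambda falls back to identity, which is enough when all threads share the same instance.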
public synchronized void foo()
{
...
}
is equivalent to
public void foo()
{
synchronized(this)
{
...
}
}
so either of the two options should work. I personally like the synchronized method option.
Synchronizing the whole method can sometimes be overkill if there is only a certain part of the code that deals with shared data (for example, a common variable that each thread is updating).
The best approach for performance is to use the synchronized keyword only around the shared data. If you synchronize the whole method when that is not necessary, a lot of threads will be waiting even though they could still do work within their own local scope.
When a thread enters a synchronized block it acquires a lock (if you use this, it locks on the object itself), and the other threads wait until the lock-holding thread has exited. You don't need a notify statement in this situation, because a thread releases the lock when it exits the synchronized block.
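A small sketch of the narrow-scope advice, with a made-up class WordStats: the per-thread work happens outside the lock, and only the update of the shared field is guarded.

```java
// Hypothetical example class; only the update of the shared field is synchronized.
final class WordStats {
    private final Object lock = new Object();
    private long totalLength; // shared between threads, guarded by lock

    void record(String word) {
        // Local work on the thread's own data needs no lock.
        int length = word.trim().length();
        // The lock is released on leaving the block; no notify() is involved.
        synchronized (lock) {
            totalLength += length;
        }
    }

    long totalLength() {
        synchronized (lock) {
            return totalLength;
        }
    }

    // Small demo: 4 threads each record the same 2-character word 1000 times.
    static long demo() {
        WordStats stats = new WordStats();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    stats.record(" ab ");
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return stats.totalLength();
    }
}
```

Threads only queue up for the single addition, so the trimming and length computation can proceed in parallel.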
