static volatile Map currentMap = null; // this must be volatile
static Object lockbox = new Object();

public static void buildNewMap() {   // this is called by the producer
    Map newMap = new HashMap();      // when the data needs to be updated
    synchronized (lockbox) {         // this must be synchronized because
                                     // of the Java memory model
        // .. do stuff to put things in newMap
        newMap.put(....);
        newMap.put(....);
    }
    /* After the above synchronization block, everything that is in the HashMap is
       visible outside this thread */
    /* Now make the updated set of values available to the consumer threads.
       As long as this write operation can complete without being interrupted,
       and is guaranteed to be written to shared memory, and the consumer can
       live with the out-of-date information temporarily, this should work fine */
    currentMap = newMap;
}

public static Object getFromCurrentMap(Object key) {
    Map m = null;
    Object result = null;
    m = currentMap;          // no locking around this is required
    if (m != null) {         // should only be null during initialization
        result = m.get(key); // get on a HashMap is not synchronized
        // Do any additional processing needed using the result
    }
    return(result);
}
This is a code sample from this article: https://www.ibm.com/developerworks/library/j-hashmap/index.html
I still don't understand why we need the synchronized block in the buildNewMap method. What additional visibility guarantees does it provide beyond what the volatile publication at currentMap = newMap; already gives?
When we read the map reference at m = currentMap; we rely on the volatile write/read pair, and the reading thread doesn't even know about the synchronization in the producer thread...
If the HashMap is only modified before it is written to currentMap, its content is guaranteed to be visible to other threads. This is because there is a happens-before edge between writing the map content and writing to currentMap (program order); there is a happens-before edge (volatile variable) between that write and a subsequent read of currentMap; and there is a happens-before edge between reading the variable and reading the content (program order). Since happens-before is transitive, there is a happens-before edge between writing the content and reading the content.
The synchronized block doesn't seem to serve any purpose.
The Java memory model provides strong guarantees regarding volatile writes; according to this article:
http://tutorials.jenkov.com/java-concurrency/volatile.html
In particular:
If Thread A writes to a volatile variable and Thread B subsequently reads the same volatile variable, then all variables visible to Thread A before writing the volatile variable, will also be visible to Thread B after it has read the volatile variable.
If Thread A reads a volatile variable, then all variables visible to Thread A when reading the volatile variable will also be re-read from main memory.
So it looks like the synchronized block is unnecessary.
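For illustration only, here is a minimal sketch of the same publication pattern with the synchronized block removed (the class and method names are invented for this example). It assumes, as the article's pattern does, that the map is never modified after the volatile write:
import java.util.HashMap;
import java.util.Map;

public class VolatilePublisher {
    // The volatile write/read pair is the only ordering point the consumer relies on.
    private static volatile Map<String, String> currentMap = null;

    // Producer: build the map completely, then publish it with a single volatile write.
    static void rebuild() {
        Map<String, String> newMap = new HashMap<>();
        newMap.put("a", "1");   // plain writes; only the producer thread sees them so far
        newMap.put("b", "2");
        currentMap = newMap;    // volatile write: publishes everything written above
    }

    // Consumer: one volatile read, then plain reads on the snapshot.
    static String lookup(String key) {
        Map<String, String> m = currentMap;    // volatile read
        return m == null ? null : m.get(key);  // safe only because the map is never mutated after publication
    }
}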
Related
I have a scenario where I have multiple counters objects. Each counters object can be incremented by multiple threads at the same time, so there is a set of ReentrantLocks, one per object, and this works well: each object can be modified by only one thread at a given moment.
Here's the catch: there is a process which runs every 15 minutes, collects all counters objects, does some calculations and clears the counters. This thread does not lock anything, so situations like the one below can occur:
incrementing_thread fetches a counters object and increments some counters
clearing_thread fetches all counters objects, does some calculations and clears the counters
clearing_thread saves the counters objects to the cache
incrementing_thread saves its counters object to the cache
In such situations one of the counters objects is messed up, because in the end the clearing operation is discarded and the state of the counters is the same as before clearing.
What I want:
All incrementing_threads lock on a specific counters object, so each object can be modified by only one thread while independent objects can be modified by multiple threads at the same time; this already works great.
When clearing_thread starts, it sets some sort of flag which is read by all incrementing_threads, and they have to wait until the flag is cleared.
I have backup plans:
clearing_thread locks on all objects, but I don't like this idea because it can take too long, and if it blocks on one of the objects it could potentially block all threads.
I could clear the counters in a for loop, object by object, but then while clearing one object the other objects can still be modified, which is not ideal for me.
As you can see I have some options, but I'm wondering if there is a better way to do this.
UPDATE
I was asked for the code, so here it is.
Below is an example of one of the methods that increments counters on an object.
public void sipIncomingCall(String objName) {
    try {
        lock(objName);
        Stats stat = getStatisticsForObj(objName);
        long l = stat.getSipIncomingConnections().incrementAndGet();
        stat.getSipConnectionsSum().incrementAndGet();
        LOGGER.debug("incrementing sip incoming connections to {}, objName {}", l, objName);
        putStatisticsForObj(objName, stat);
    } finally {
        unlock(objName);
    }
}
lock() and unlock() methods:
private Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

protected void lock(String key) {
    // computeIfAbsent stores the lock atomically, so all threads share the same
    // instance per key (getOrDefault would hand each caller its own new lock).
    ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
    lock.lock();
}

protected void unlock(String key) {
    ReentrantLock lock = locks.get(key);
    if (lock != null) {
        lock.unlock();
    }
}
methods getStatisticsForObj() and putStatisticsForObj():
private MgcfStats getStatisticsForObj(String tgName) {
    // get object from local cache (or hazelcast)
    return Cluster.getTgStatistics(tgName);
}

private void putStatisticsForObj(String tgName, MgcfStats stats) {
    // saving to local cache and hazelcast
    Cluster.putTgStatistics(tgName, stats);
}
Below is a fragment from the "clearing_thread" which copies all statistics objects to a local map and then clears the statistics in the Cluster:
statisticsData.setObjStats(new HashMap<>(Cluster.getTgStatistics()));
Cluster.clearTgStatistics();
You may use a ReadWriteLock.
Incrementing threads acquire the read lock before incrementing a value.
The clearing thread acquires the write lock.
You still need the individual locks for each counter.
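A rough sketch of that idea (all names here are made up, and an AtomicLong per key stands in for the question's per-counter ReentrantLocks, serving the same purpose of making individual increments atomic): incrementers hold the shared side of the ReadWriteLock, the periodic clearing pass holds the exclusive side, so no increment can interleave with the copy-and-clear.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class CounterRegistry {
    // One AtomicLong per key stands in for the per-counter locks of the question.
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();
    // Shared (read) side = "an increment is in progress", exclusive (write) side = "clearing".
    private final ReentrantReadWriteLock clearGate = new ReentrantReadWriteLock();

    public void increment(String key) {
        clearGate.readLock().lock();   // many incrementing threads may hold this at once
        try {
            counters.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
        } finally {
            clearGate.readLock().unlock();
        }
    }

    // Would run every 15 minutes: snapshot and clear with no increments interleaved.
    public Map<String, Long> snapshotAndClear() {
        clearGate.writeLock().lock();  // waits until all in-flight increments finish
        try {
            Map<String, Long> snapshot = new ConcurrentHashMap<>();
            counters.forEach((k, v) -> snapshot.put(k, v.getAndSet(0)));
            return snapshot;
        } finally {
            clearGate.writeLock().unlock();
        }
    }
}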
I have three different threads which create three different objects to read/manipulate some data which is common to all the threads. Now, I need to ensure that only one thread at a time is given access.
The example goes something like this.
public interface CommonData {
    void addData();            // adds data to the cache
    String getDataAccessKey(); // key that will be common across different threads for each data type
}

/*
 * Singleton class
 */
public class CommonDataCache {
    private final Map dataMap = new HashMap(); // this takes keys and values as custom objects
}
The implementation class of the interface would look like this
class CommonDataImpl implements CommonData {
    private String key;

    public CommonDataImpl(String key) {
        this.key = key;
    }

    public void addData() {
        // access the singleton cache class and add
    }

    public String getDataAccessKey() {
        return key;
    }
}
Each thread will be invoked as follows:
CommonData data = new CommonDataImpl("Key1");
new Thread(() -> data.addData()).start();
CommonData data1 = new CommonDataImpl("Key1");
new Thread(() -> data1.addData()).start();
CommonData data2 = new CommonDataImpl("Key1");
new Thread(() -> data2.addData()).start();
Now, I need to synchronize those threads if and only if the keys of the data objects (passed to the threads) are the same.
My thought process so far:
I tried to have a class that provides the lock on the fly for a given key which looks something like this.
/*
* Singleton class
*/
public class DataAccessKeyToLockProvider {
    private volatile Map<String, ReentrantLock> accessKeyToLockHolder = new ConcurrentHashMap<>();

    private DataAccessKeyToLockProvider() {
    }

    public ReentrantLock getLock(String key) {
        // computeIfAbsent returns the lock actually stored in the map
        // (putIfAbsent would return null on the first insertion)
        return accessKeyToLockHolder.computeIfAbsent(key, k -> new ReentrantLock());
    }

    public void removeLock(String key) {
        ReentrantLock removedLock = accessKeyToLockHolder.remove(key);
    }
}
So each thread would call this class, get the lock, use it, and remove it once the processing is done. But this can result in a case where the second thread gets the lock object that was inserted by the first thread and waits for the first thread to release it. Once the first thread removes the lock, the third thread would then get a different lock altogether, so the second and third threads are no longer synchronized with each other.
Something like this:
new Thread(() -> {
    ReentrantLock lock = DataAccessKeyToLockProvider.get(data.getDataAccessKey());
    lock.lock();
    data.addData();
    lock.unlock();
    DataAccessKeyToLockProvider.remove(data.getDataAccessKey());
}).start();
Please let me know if you need any additional details to help me resolve my problem
P.S.: Removing the key from the lock provider is pretty much mandatory, as I will be dealing with millions of keys (not necessarily strings), so I don't want the lock provider to eat up my memory.
Inspired by the solution provided by @rzwitserloot, I have tried to put together some generic code that waits for the other thread to complete its processing before giving access to the next thread.
public class GenericKeyToLockProvider<K> {
    private volatile Map<K, ReentrantLock> keyToLockHolder = new ConcurrentHashMap<>();

    public synchronized ReentrantLock getLock(K key) {
        ReentrantLock existingLock = keyToLockHolder.get(key);
        try {
            if (existingLock != null && existingLock.isLocked()) {
                existingLock.lock(); // Waits for the thread that acquired the lock previously to release it
            }
            return keyToLockHolder.put(key, new ReentrantLock()); // Override with the new lock
        } finally {
            if (existingLock != null) {
                existingLock.unlock();
            }
        }
    }
}
But it looks like the entry made by the last thread would never be removed. Any way to solve this?
First, a clarification: you either use ReentrantLock, OR you use synchronized. You don't synchronize on a ReentrantLock instance (you synchronize on any object you want); or, if you want to go the lock route, you call the lock() method on your lock object, using a try/finally guard to ensure you always call unlock() later (and don't use synchronized at all).
synchronized is a low-level API. Lock and the other classes in the java.util.concurrent package are higher level and offer far more abstractions. It's generally a good idea to just peruse the javadoc of all the classes in the j.u.c package from time to time; there is very useful stuff in there.
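For readers less familiar with the two idioms, here is a small illustrative sketch (not a solution to the question itself; the class and method names are invented):
import java.util.concurrent.locks.ReentrantLock;

public class TwoIdioms {
    private final Object monitor = new Object();             // monitor for synchronized
    private final ReentrantLock lock = new ReentrantLock();  // explicit Lock

    void withSynchronized() {
        synchronized (monitor) {   // any object can serve as the monitor
            // critical section
        }
    }

    void withLock() {
        lock.lock();               // acquire explicitly...
        try {
            // critical section
        } finally {
            lock.unlock();         // ...and always release in finally
        }
    }
}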
The key issue is to remove all references to a lock object (thus ensuring it can be garbage collected), but not until you are certain there are zero active threads locking on it. Your current approach has no way of knowing how many threads are waiting. That needs to be fixed. Once you return an instance of a Lock object, it's 'out of your hands', and it is not possible to track whether the caller is ever going to call lock on it. Thus, you can't do that. Instead, take the lock as part of the job: the getLock method should actually do the locking as part of the operation. That way, YOU get to control the process flow. However, let's first take a step back:
You say you'll have millions of keys. Okay; but it is somewhat unlikely you'll have millions of threads. After all, a thread requires a stack, and even using the -Xss parameter to reduce the stack size to the minimum of 128k or so, a million threads implies you're using up 128GB of RAM just for stacks; seems unlikely.
So, whilst you might have millions of keys, the number of 'locked' keys is MUCH smaller. Let's focus on those.
You could make a ConcurrentHashMap which maps your string keys to lock objects. Then:
To acquire a lock:
Create a new lock object (literally: Object locker = new Object(); - we are going to be using synchronized) and add it to the map using putIfAbsent. If putIfAbsent returns null, there was no existing entry, so you were the one to add the key/value pair: you've got the lock, go run the code. Once you're done, acquire the sync lock on your object, send a notification, remove the entry, and release:
public void doWithLocking(String key, Runnable op) {
    Object locker = new Object();
    Object o = concurrentMap.putIfAbsent(key, locker);
    if (o == null) {                   // null: no existing entry, so we own the key
        op.run();
        synchronized (locker) {
            locker.notifyAll();        // wake up everybody waiting.
            concurrentMap.remove(key); // this has to be inside!
        }
    } else {
        ...
    }
}
To wait until the lock is available, first acquire a lock on the locker object, THEN check if the concurrentMap still contains it. If not, you're now free to retry this operation. If it's still in, then we now wait for a notification. In any case we always just retry from scratch. Thus:
public void performWithLocking(String key, Runnable op) throws InterruptedException {
    while (true) {
        Object locker = new Object();
        Object o = concurrentMap.putIfAbsent(key, locker);
        if (o == null) {                       // no existing entry: we own the key
            try {
                op.run();
            } finally {
                // We want to notify the waiters even if the operation throws!
                synchronized (locker) {
                    locker.notifyAll();        // wake up everybody waiting.
                    concurrentMap.remove(key); // this has to be inside!
                }
            }
            return;
        } else {
            synchronized (o) {
                if (concurrentMap.containsKey(key)) o.wait();
            }
        }
    }
}
Instead of this setup, where you pass the operation to execute along with the lock key, you could have tandem 'lock' and 'unlock' methods, but then you run the risk of writing code that forgets to call unlock. Hence why I wouldn't advise it!
You can call this with, for example:
keyedLockSupportThingie.doWithLocking("mykey", () -> {
System.out.println("Hello, from safety!");
});
Let's say I have a volatile reference c to MyClass, and MyClass has an integer field x. If one thread changes the value of x, will the new value be guaranteed visible to all other threads, or does x have to be volatile too?
In other words, is the example below guaranteed to print 2?
public class MyClass {
    private static volatile MyClass c;
    private int x = 1;

    public static void main(String[] args) {
        c = new MyClass();
        Thread thread = new Thread(new Runnable() {
            @Override
            public void run() {
                c.x = 2;
            }
        });
        thread.start();
        try {
            thread.join();
            System.out.println(c.x);
        } catch (InterruptedException ex) {
            //
        }
    }
}
If not, what if I want to manipulate an object whose source code I don't control, such as a Collection? How can I ensure that changes to the Collection object are visible to all threads?
Variable x must be volatile too in general: making c volatile only guarantees visibility of the reference, not of later writes to the object's fields. (In this particular example, though, thread.join() already establishes a happens-before relationship, so 2 is printed either way.)
If so, what if I want to manipulate an object whose source code I
don't control, such as a Collection? How can I ensure that changes to
the collection object are visible to all threads?
To see changes in a collection (assuming it is not a concurrent collection; let's say it is a plain ArrayList), you have to provide a monitor yourself.
Object monitor = new Object();
synchronized(monitor) {
// change collection
}
synchronized(monitor) {
// read collection
}
If read and write operations are synchronized on the monitor, they will work correctly. However, if you have code you don't control and that code modifies the collection without synchronization, there is nothing you can do.
Issue number 2: even with read/write synchronization on the monitor, you can still get ConcurrentModificationExceptions if you iterate over the collection in one thread and modify it in another. So the read in my example is not a reference read, but a value read.
For your first question: yes, x has to be volatile too (or otherwise synchronized). volatile makes sure that writes to the volatile field itself are seen by other threads' reads. It doesn't cascade, however, so volatile doesn't fit all use cases (i.e. just because a reference is volatile doesn't mean all the fields of the referenced object magically become volatile).
In most cases you need to synchronize the access to make sure that all writes are seen by subsequent reads.
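As a hedged sketch of that advice for a collection whose internals you don't control, you can funnel every access through one shared monitor (the class and names below are invented for the example; java.util.Collections.synchronizedList does essentially the same wrapping for you):
import java.util.ArrayList;
import java.util.List;

public class SharedList {
    private final Object monitor = new Object();  // one monitor guards every access
    private final List<String> list = new ArrayList<>();

    public void add(String value) {
        synchronized (monitor) {   // guarded write
            list.add(value);
        }
    }

    public String get(int index) {
        synchronized (monitor) {    // guarded read: same monitor, so the reader
            return list.get(index); // sees all writes made under it earlier
        }
    }
}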
Sorry for my English.
I don't use any of my data fields for locking, so I don't have to think about whether a field could be null.
I always create dedicated fields used only for locking in thread synchronization.
For example:
public class Worker {
    private static final List<Toilet> TOILETS = Arrays.asList(
            new Toilet(1),
            new Toilet(2),
            // ...
            new Toilet(NUMBER_OF_FLOORS)
    );

    // here it is:
    private static final List<String> LOCK_TOILETS = Arrays.asList(
            "LOCK TOILET #1",
            "LOCK TOILET #2",
            // ...
            "LOCK TOILET #" + NUMBER_OF_FLOORS
    );

    private final int floorNumber;

    public void spendWorkingHours() {
        for (int i = 0; i < X; ++i) {
            doWork();
            snackSomething();
            String lockToilet = LOCK_TOILETS.get(floorNumber);
            Toilet theOnlyToiletOnTheFloor = TOILETS.get(floorNumber);
            synchronized (lockToilet) {
                goToToilet(theOnlyToiletOnTheFloor);
            }
        }
    }
}
You should not use Strings for lock objects, especially not string literals.
String literals come from the string pool, and literals with the same content refer to the same object. This means that if two different threads use two "different" string literals that happen to have the same content, they are actually locking on the same object, and deadlock (or unintended contention) can easily occur.
To demonstrate:
// Thread #1
String LOCK1 = "mylock";
synchronized (LOCK1) {
}
// Thread #2
String LOCK2 = "mylock";
synchronized (LOCK2) {
// This is actually the SAME lock,
// might cause deadlock between the 2 synchronized blocks!
// Because LOCK1==LOCK2!
}
Best would be to synchronize on private objects which are not accessible from the "outside". If you use an object for the lock which is visible from "outside" (or is returned by a method), that object is available for anyone else to use as a lock, which you have no control over and which may cause a deadlock with your internal synchronized block.
For example, you can synchronize on the object you wish to guard if it is private, or create a private, internal lock Object:
private final Object LOCK = new Object();
// Later:
synchronized (LOCK) {
// LOCK is not known to any "outsiders", safe to use it as internal lock
}
Using a String may not be the best idea, because this class gets a bit of special treatment, and strings with the same contents may be reused (so locking a toilet on the first floor would also lock the toilet with the same number on the other floors).
Your best choice here is locking the actual toilet.
There is no need for lockToilet; why don't you just use a synchronized statement over each TOILETS element?
Toilet t;
synchronized (TOILETS) {
    t = TOILETS.get(floorNumber);
}
synchronized (t) {
    goToToilet(t);
}
synchronized in this code means that any use of the object between the parentheses is thread-exclusive within the scope between the braces, so that object becomes a lock.
Other answers cover your question regarding the use of Strings for locking (see String interning for more details), so I will just mention a few other considerations:
Although you have defined the list as final (you cannot assign another list instance) and initialized it with .asList(..) (you cannot change its size), this does not make it read-only or thread-safe, so if someone changes elements in that list you might end up in an unstable state. Consider using a read-only list, as sketched below.
You also need to clarify the scope of locking. What are you trying to lock against? If goToToilet changes the object's attributes, then the synchronization would be better placed in the method that changes the state of the object. (This is a design recommendation; the code would work as-is, but it would be prone to errors when the code changes in the future.)
Finally, I would also have a look at the java.util.concurrent structures, as you might find the concurrent collections and locking mechanisms useful.
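A minimal sketch of the read-only suggestion mentioned above (the constant value and the nested Toilet stand-in are assumptions made just for this example):
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class Toilets {
    static final int NUMBER_OF_FLOORS = 3; // assumed value, just for the sketch

    // Unmodifiable wrapper: the list structure can no longer be changed, although
    // the Toilet elements themselves are only as thread-safe as Toilet makes them.
    static final List<Toilet> TOILETS = Collections.unmodifiableList(
            IntStream.rangeClosed(1, NUMBER_OF_FLOORS)
                     .mapToObj(Toilet::new)
                     .collect(Collectors.toList()));

    static final class Toilet {     // stand-in for the Toilet class from the question
        final int floor;
        Toilet(int floor) { this.floor = floor; }
    }
}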
Is volatile redundant in this code?
public class Test {
private volatile Map<String, String> map = null;
public void resetMap() { map = new ConcurrentHashMap<>(); }
public Map<String, String> getMap() { return map; }
}
In other words, does map = new ConcurrentHashMap<>(); provide any visibility guarantees?
As far as I can see, the only guarantee provided by ConcurrentMap is:
Actions in a thread prior to placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access or removal of that object from the ConcurrentMap in another thread.
How about other thread safe collections in java.util.concurrent (CopyOnWriteArrayList, etc.)?
volatile is not redundant, as you are changing the reference to the map; i.e. ConcurrentMap only provides guarantees about the contents of the collection, not about references to it.
An alternative would be
public class Test {
private final Map<String, String> map = new ConcurrentHashMap<>();
public void resetMap() { map.clear(); }
public Map<String, String> getMap() { return map; }
}
How about other thread safe collections in java.util.concurrent (CopyOnWriteArrayList, etc.)?
Only the behaviour of the collection is thread safe. References to the collection are not thread safe, and elements in the collection are not made thread safe by adding them to the collection.
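To illustrate the last point, here is a small self-contained sketch (all names invented): the ConcurrentHashMap operations themselves are safe, but the mutable element stored in it is not, so concurrent updates to it lose increments.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ElementsNotThreadSafe {
    // A plain mutable element; putting it into a concurrent map does not protect it.
    static final class Counter {
        int value;                  // unsynchronized state
        void bump() { value++; }    // racy when called from several threads
    }

    static final Map<String, Counter> MAP = new ConcurrentHashMap<>();

    public static void main(String[] args) throws InterruptedException {
        MAP.put("k", new Counter());        // the map's own operations are thread safe...
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                MAP.get("k").bump();        // ...but this read-modify-write on the element is not
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start(); b.start();
        a.join(); b.join();
        // Typically prints less than 200000 because increments were lost.
        System.out.println(MAP.get("k").value);
    }
}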
volatile is necessary here. It applies to the reference, not to the object it refers to. In other words, it doesn't matter that the object itself is thread safe; without volatile, other threads might not see the new value of the map field (e.g. they might still see the previously referenced concurrent map, or null).
Moreover, even if your object was immutable (e.g. String) you would still need volatile, not to mention other thread-safe collections like CopyOnWriteArrayList.
This is not just about the references. Generally, without the volatile modifier other threads may observe a new reference to an object, but observe the object in a partially constructed state. In general, it is not easy to know, even after consulting the documentation, which objects are safe for publication by a data race. An interesting note is that the JLS does guarantee this for thread-safe immutable objects, so if the docs mention those two properties it should be enough.
ConcurrentHashMap is obviously not an immutable object, so that doesn't apply, and the docs don't mention anything about publication by a data race. By careful inspection of the source code we may conclude that it is indeed safe, however I wouldn't recommend relying on such findings without this property being clearly documented.
Memory Consistency Properties
A write to a volatile field happens-before every subsequent read of that same field. Writes and reads of volatile fields have similar memory consistency effects as entering and exiting monitors, but do not entail mutual exclusion locking.
Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.
OK, I was able to construct an example that breaks (on my machine: JDK 1.7.06 / Win 7 64-bit) if the field is not volatile: the program never prints Loop exited if the stop field is not volatile, and it does print Loop exited if it is volatile. QED.
public class VolatileVisibility extends Thread {

    Map<String, String> stop = null; // make this field volatile and the loop below exits

    public static void main(String[] args) throws InterruptedException {
        VolatileVisibility t = new VolatileVisibility();
        t.start();
        Thread.sleep(100);
        t.stop = new ConcurrentHashMap<>(); // write of reference
        System.out.println("In main: " + t.stop); // read of reference
        System.out.println("Waiting for run to finish");
        Thread.sleep(200);
        System.out.println("Still waiting");
        t.stop.put("a", "b"); // write to the map
        Thread.sleep(200);
        System.exit(0);
    }

    public void run() {
        System.out.println("In run: " + stop); // read of reference
        while (stop == null) {
        }
        System.out.println("Loop exited");
    }
}
My impression is that Doug Lea's concurrent objects can be safely published by data race, so that they are "thread-safe" even if misused. Though he probably wouldn't advertise that publicly.