Is double-checked locking without volatile wrong? - java

I use JDK 1.8. I think that double-checked locking without volatile is correct.
I tested it many times using a CountDownLatch, and the object was always a singleton.
How can I prove that “volatile” is required?
Update 1
Sorry, my code is not formatted, because I can’t receive some JavaScript.
import java.util.concurrent.CountDownLatch;

public class DCLTest {
private static /*volatile*/ Singleton instance = null;
static class Singleton {
public String name;
public Singleton(String name) {
try {
//We can delete this sentence, just to simulate various situations
Thread.sleep(1);
} catch (InterruptedException e) {
e.printStackTrace();
}
this.name = name;
}
}
public static Singleton getInstance() {
if (null == instance) {
synchronized (Singleton.class) {
if (null == instance) {
instance = new Singleton(Thread.currentThread().getName());
}
}
}
return instance;
}
public static void test() throws InterruptedException {
int count = 1;
while (true){
int size = 5000;
final String[] strs = new String[size];
final CountDownLatch countDownLatch = new CountDownLatch(1);
for (int i = 0; i < size; i++) {
final int index = i;
new Thread(()->{
try {
countDownLatch.await();
} catch (InterruptedException e) {
e.printStackTrace();
}
Singleton instance = getInstance();
strs[index] = instance.name;
}).start();
}
Thread.sleep(100);
countDownLatch.countDown();
Thread.sleep(1000);
for (int i = 0; i < size-1; i++) {
if(!(strs[i].equals(strs[i+1]))){
System.out.println("i = " + strs[i] + ",i+1 = "+strs[i+1]);
System.out.println("need volatile");
return;
}
}
System.out.println(count++ + " times");
}
}
public static void main(String[] args) throws InterruptedException {
test();
}
}

The key problem that you are not seeing is that instructions can be reordered, so the order in which they appear in the source code isn't necessarily the order in which they are applied to memory. CPUs and compilers are the cause of this reordering.
I'm not going through the whole example of double-checked locking because many examples are available, but I will provide just enough information for you to do some more research.
if you would have the following code:
if(singleton == null){
synchronized{
if(singleton == null){
singleton = new Singleton("foobar")
}
}
}
Then under the hood something like this will happen.
if(singleton == null){
synchronized{
if(singleton == null){
tmp = alloc(Singleton.class)
tmp.value = "foobar"
singleton = tmp
}
}
}
So far, all is good. But the following reordering is legal:
if(singleton == null){
synchronized{
if(singleton == null){
tmp = alloc(Singleton.class)
singleton = tmp
tmp.value = "foobar"
}
}
}
So this means that a singleton that hasn't been completely constructed (the value has not yet been set) has been written to the singleton global variable. If a different thread would read this variable, it could see a partially created object.
There are other potential problems, like atomicity (e.g. if the value field were a long, it could be fragmented, i.e. a torn read/write), and also visibility (e.g. the compiler could optimize the code so that the load/store from memory is optimized out). Keep in mind that thinking in terms of reading from memory instead of cache is fundamentally flawed, and it is one of the most frequently encountered misunderstandings I see on SO; even many seniors get this wrong. Atomicity, visibility and reordering are part of the Java memory model, and making the singleton variable volatile resolves all these problems. It removes the data race (you can look it up for more details).
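A minimal sketch of the volatile variant described above. The class name `AppConfig` and the helper `sameInstanceSeenByAll` are hypothetical and only serve to exercise the pattern; a passing run does not prove correctness, it merely fails to show a problem.

```java
import java.util.concurrent.CountDownLatch;

final class AppConfig {
    // volatile creates a happens-before edge between the write below and later reads
    private static volatile AppConfig instance;

    final String name;

    private AppConfig(String name) {
        this.name = name;
    }

    static AppConfig getInstance() {
        AppConfig local = instance;            // single volatile read on the fast path
        if (local == null) {
            synchronized (AppConfig.class) {
                local = instance;              // re-check while holding the lock
                if (local == null) {
                    local = new AppConfig(Thread.currentThread().getName());
                    instance = local;          // publish the fully constructed object
                }
            }
        }
        return local;
    }

    // Hypothetical helper: releases all threads at once and reports
    // whether every thread obtained the same object.
    static boolean sameInstanceSeenByAll(int threads) {
        AppConfig[] seen = new AppConfig[threads];
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int index = i;
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                    seen[index] = getInstance();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();                      // join makes the array writes visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        for (AppConfig c : seen) {
            if (c == null || c != seen[0]) {
                return false;
            }
        }
        return true;
    }
}
```

Reading the field into a local variable first keeps the fast path to one volatile read, which is a common refinement rather than a requirement.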
If you want to be really hardcore, it would be sufficient to place a [storestore] barrier between the creation of an object and the assignment to the singleton and a [loadload] barrier on the reading side and make sure you use a VarHandle with opaque for the singleton.
But this goes well beyond what most engineers understand and it won't make much of a performance difference in most situations.
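For the curious, here is a rough sketch of a weaker-than-volatile variant. It swaps in VarHandle acquire/release access modes (Java 9 or later) instead of the explicit fences plus opaque access mentioned above; that substitution is my assumption of the closest readable equivalent, and the class name `RelaxedSingleton` is hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

final class RelaxedSingleton {
    // Deliberately not volatile; all accesses go through the VarHandle below.
    private static RelaxedSingleton instance;
    private static final VarHandle INSTANCE;

    static {
        try {
            INSTANCE = MethodHandles.lookup().findStaticVarHandle(
                    RelaxedSingleton.class, "instance", RelaxedSingleton.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    final String value;

    private RelaxedSingleton(String value) {
        this.value = value;
    }

    static RelaxedSingleton getInstance() {
        // Acquire read: later reads of the object's fields cannot move before it.
        RelaxedSingleton local = (RelaxedSingleton) INSTANCE.getAcquire();
        if (local == null) {
            synchronized (RelaxedSingleton.class) {
                local = (RelaxedSingleton) INSTANCE.getAcquire();
                if (local == null) {
                    local = new RelaxedSingleton("foobar");
                    // Release write: the constructor's stores cannot move after it.
                    INSTANCE.setRelease(local);
                }
            }
        }
        return local;
    }
}
```

In most applications the plain volatile field is the better choice; this form mainly illustrates which ordering guarantees the pattern actually relies on.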
If you want to check if something can break, please check out JCStress:
https://github.com/openjdk/jcstress
It is a great tool and can help you show that your code is broken.

How to prove that it must need “volatile”?
As a general rule, you cannot prove the correctness of a multi-threaded application by testing. You may be able to prove incorrectness, but even that is not guaranteed, as you are observing.
The fact that you haven't succeeded in making your application fail is not a proof that it is correct.
The way to prove correctness is to do a formal (i.e. mathematical) happens-before analysis.
It is fairly straightforward to show that when the singleton is not volatile there are executions in which a happens-before relationship is missing. This may lead to an incorrect outcome, such as a thread seeing a partially initialized object. But it is not guaranteed that you will get an incorrect outcome.
The flip side is that if volatile is used, the happens-before relationships combined with the logic of the code are sufficient to construct a formal (mathematical) proof that you will always get a correct outcome.
(I am not going to construct the proofs here. It is too much effort.)

Related

Share local variable value between barrier threads in java

I've been working on implementing a custom cyclic barrier which adds up the values passed into the await method and returns the sum to all threads after notify is called.
The code:
public class Barrier {
private final int parties;
private int partiesArrived = 0;
private volatile int sum = 0;
private volatile int oldSum = 0;
public Barrier(int parties) {
if (parties < 1) throw new IllegalArgumentException("Number of parties has to be 1 or higher.");
this.parties = parties;
}
public int getParties() { return parties; }
public synchronized int waitBarrier(int value) throws InterruptedException {
partiesArrived += 1;
sum += value;
if (partiesArrived != parties) {
wait();
}
else {
oldSum = sum;
sum = 0;
partiesArrived = 0;
notifyAll();
}
return oldSum;
}
public int getNumberWaiting() { return partiesArrived; }
}
This works, but I hear that there is a way to change the values sum and oldSum (or at least oldSum) into local variables of the waitBarrier method. However, after racking my head over it, I don't see a way.
Is it possible and, if so, how?
However, after racking my head over it, I don't see a way.
Quite so.
Is it possible and, if so, how?
It is not possible.
For some proof:
Try marking a local var as volatile. It won't work: the compiler doesn't allow it. Why not? Because volatile would necessarily be a no-op: local vars simply cannot be shared with another thread.
One might think this is 'sharing' a local:
void test() {
int aLocalVar = 10;
Thread t = new Thread(() -> {
System.out.println("Wow, we're sharing it! " + aLocalVar);
});
t.start();
}
But it's some syntax sugar tripping you up there: Actually (and you can confirm this with javap -c -v to show the bytecode that javac makes for this code), a copy of the local var is handed to the block here. This then explains why, in java, the above fails to compile unless the variable you're trying to share is either [A] marked final or [B] could have been so marked without error (this is called 'the variable is effectively final'). Had java allowed you to access non-(effectively) finals like this, and had java used the copy mechanism that is available, that would be incredibly confusing.
Of course, in java, all non-primitives are references. Pointers, in the parlance of some other languages. Thus, you can 'share' (not really, it'll be a copy) a local var and nevertheless get what you want (share state between 2 threads), because whilst you get a copy of the variable, the variable is just a pointer. It's like this: If I have a piece of paper and it is mine, but I can toss it in a photocopier and give you a copy too, we can't, seemingly, share state. Whatever I scratch on my paper won't magically appear on yours; it's not voodoo paper. But, if there is an address to a house on my paper and I copy it and hand you a copy, it feels like we're sharing that: If you walk over to the house and, I dunno, toss a brick through a window, and I walk over later, I can see it.
Many objects in java are immutable (impervious to bricks), and the primitives aren't references. One solution is to use the AtomicX family which are just simplistic wrappers around a primitive or reference, making them mutable:
AtomicInteger v = new AtomicInteger();
Thread t = new Thread(() -> {v.set(10);});
t.start();
t.join(); // wait for the thread to finish; may throw InterruptedException
System.out.println(v.get());
// prints 10
But no actual sharing of a local happened here. The thread got a -copy- of the reference to a single AtomicInteger instance that lives on the heap, and both threads ended up 'walking over to the house', here.
You can return sum and have the first party clear it:
public synchronized int waitBarrier(int value) throws InterruptedException {
if (partiesArrived == 0) {
sum = 0;
}
partiesArrived++;
sum += value;
if (partiesArrived == parties) {
notifyAll();
} else {
while (partiesArrived < parties) {
wait();
}
}
return sum;
}
Note that the wait condition should always be checked in a loop in case of spurious wakeups. Also, sum doesn't need to be volatile if it's not accessed outside the synchronized block.
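The snippet above never resets partiesArrived, so it only works for a single round. One way to make the barrier reusable, sketched under my own assumptions, is to keep each round's state in a small heap object and hold a local reference to it inside waitBarrier. The names `SumBarrier`, `Round` and `runRound` are hypothetical, and interruption of a waiting party is not handled.

```java
final class SumBarrier {
    // State of one barrier round; shared by all parties of that round.
    private static final class Round {
        int arrived;
        int sum;
        boolean done;
    }

    private final int parties;
    private Round current = new Round();   // guarded by "this"

    SumBarrier(int parties) {
        if (parties < 1) {
            throw new IllegalArgumentException("Number of parties has to be 1 or higher.");
        }
        this.parties = parties;
    }

    synchronized int waitBarrier(int value) throws InterruptedException {
        Round round = current;             // local copy of the reference, not of the state
        round.sum += value;
        round.arrived++;
        if (round.arrived == parties) {
            round.done = true;
            current = new Round();         // later callers start a fresh round
            notifyAll();
        } else {
            while (!round.done) {          // loop guards against spurious wakeups
                wait();
            }
        }
        return round.sum;
    }

    // Hypothetical helper: one thread per value, returns what each thread received.
    static int[] runRound(SumBarrier barrier, int... values) {
        int[] results = new int[values.length];
        Thread[] threads = new Thread[values.length];
        for (int i = 0; i < values.length; i++) {
            final int index = i;
            threads[i] = new Thread(() -> {
                try {
                    results[index] = barrier.waitBarrier(values[index]);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return results;
    }
}
```

The sum still lives on the heap, which is consistent with the point made earlier: only the reference to the shared state can be a local variable.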

Unit test the thread safety of a singleton class in Java?

Let's imagine I have the following java class :
static class Singleton {
static Singleton i;
static Singleton getInstance() {
if (i == null) {
i = new Singleton();
}
return i;
}
}
Now, we all know this will work, but it apparently is not thread safe. I am not actually trying to fix the thread safety; this is more of a demo. My other class is identical, but uses a mutex and synchronization. The unit test will be run against each to show that one is thread safe and the other is not. What might a unit test that fails if getInstance is not thread safe look like?
Well, race conditions are by nature probabilistic so there's no deterministic way to truly generate a race condition. Any possible way against your current code would need to be run many times until the desired outcome is achieved. You can enforce a loose ordering of access on i by making a mock singleton to test against to simulate what a certain condition might look like, though. Rule of thumb with synchronization is preventative measures beat trying to test and figure out what's wrong after bad code is mangled in a code base.
static class Singleton {
static Singleton i;
static Singleton getInstance(int tid) {
if (i == null) {
if (tid % 2 == 0) i = new Singleton();
}
return i;
}
}
So certain threads will write to i and other threads will read i as if they reached "return i" before "the even thread ids were able to check and initialize i" (sort of, not exactly, but it simulates the behavior). Still, there's a race between the even threads in this case, because an even thread may still write to i after another has read null. To improve on this, you'd need to implement thread safety to force the condition where one thread reads i and gets null while the other thread sets i to new Singleton(), which is a thread-unsafe condition. But at that point you're better off just solving the underlying issue (just make getInstance thread safe!)
TLDR: there are infinitely many race conditions that can occur in an unsafe function call. You can mock the code to generate a specific race condition (say, between just two threads), but it's not feasible to just blanket test for "race conditions".
This code worked for me.
The trick is that it is probabilistic, as other users have said.
So, the approach that should be taken is to run for a number of times.
public class SingletonThreadSafety {
public static final int CONCURRENT_THREADS = 4;
private void single() {
// Allocate an array for the singletons
final Singleton[] singleton = new Singleton[CONCURRENT_THREADS];
// Number of threads remaining
final AtomicInteger count = new AtomicInteger(CONCURRENT_THREADS);
// Create the threads
for(int i=0;i<CONCURRENT_THREADS;i++) {
final int l = i; // Capture this value to enter the inner thread class
new Thread() {
public void run() {
singleton[l] = Singleton.getInstance();
count.decrementAndGet();
}
}.start();
}
// Ensure all threads are done
// The sleep(10) keeps this reasonably performant (a bare busy loop
// would be much slower). Other concurrency classes, such as
// CountDownLatch, would do this better.
try { Thread.sleep(10); } catch(InterruptedException ex) { }
while(count.get() >= 1) {
try { Thread.sleep(10); } catch(InterruptedException ex) { }
}
for( int i=0;i<CONCURRENT_THREADS - 1;i++) {
assertTrue(singleton[i] == singleton[i + 1]);
}
}
@Test
public void test() {
for(int i=0;i<1000;i++) {
Singleton.i = null;
single();
System.out.println(i);
}
}
}
This requires a change to the Singleton design: the instance variable must now be accessible from the test class, so that we can reset the Singleton instance to null every time the test is repeated. We then repeat the test 1000 times (if you have more time, you could run more; finding an odd threading problem sometimes requires that).
In some cases this solution works. Unfortunately, it's hard to test a singleton in a way that provokes thread-unsafe behavior.
@Test
public void checkThreadUnSafeSingleton() throws InterruptedException {
int threadsAmount = 500;
Set<Singleton> singletonSet = Collections.newSetFromMap(new ConcurrentHashMap<>());
ExecutorService executorService = Executors.newFixedThreadPool(threadsAmount);
for (int i = 0; i < threadsAmount; i++) {
executorService.execute(() -> {
Singleton singleton = Singleton.getInstance();
singletonSet.add(singleton);
});
}
executorService.shutdown();
executorService.awaitTermination(1, TimeUnit.MINUTES);
Assert.assertEquals(2, singletonSet.size());
}
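A framework-free sketch of the same probing idea. It uses a start latch so that the threads call the factory as close together as possible, and counts distinct objects by identity. The class name `SingletonProbe` is hypothetical, and the factory is passed in so that both the safe and the unsafe variant can be probed.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.function.Supplier;

final class SingletonProbe {
    // Calls the factory from `threads` threads released together and
    // returns how many distinct objects (compared by identity) were handed out.
    static int distinctInstances(Supplier<?> factory, int threads) {
        Set<Object> seen = Collections.synchronizedSet(
                Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>()));
        CountDownLatch ready = new CountDownLatch(threads);
        CountDownLatch go = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                ready.countDown();             // signal that this thread is running
                try {
                    go.await();                // wait for the common start signal
                    seen.add(factory.get());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        try {
            ready.await();
            go.countDown();
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.size();
    }
}
```

A result greater than 1 shows that the factory is not a safe singleton accessor. A result of 1 shows nothing, so the probe would have to be repeated many times, with the singleton reset between runs, to have a reasonable chance of catching a race.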

List concurrency failing

I have an Arraylist that I am constantly adding to and removing from in separate threads.
One thread adds, and the other removes.
This is the class that contains the changing list:
public class DataReceiver {
private static final String DEBUG_TAG = "DataReceiver";
// Class variables
private volatile ArrayList<Byte> buffer;
//private volatile Semaphore dataAmount;
public DataReceiver() {
this.buffer = new ArrayList<Byte>();
//this.dataAmount = new Semaphore(0, true);
}
// Adds a data sample to the data buffer.
public final void addData(byte[] newData, int bytes) {
int newDataPos = 0;
// While there is still data
while(newDataPos < bytes) {
// Fill data buffer array with new data
buffer.add(newData[newDataPos]);
newDataPos++;
//dataAmount.release();
}
return;
}
public synchronized byte getDataByte() {
/*
try {
dataAmount.acquire();
}
catch(InterruptedException e) {
return 0;
}
*/
while(buffer.size() == 0) {
try {
Thread.sleep(250);
}
catch(Exception e) {
Log.d(DEBUG_TAG, "getDataByte: failed to sleep");
}
}
return buffer.remove(0);
}
}
The problem is that every so often I get a null pointer exception when trying to buffer.remove(0). As you can tell from the comments in the code, I tried using a semaphore at one point, but it still intermittently threw null pointer exceptions, so I created my own sleep-poll as a semi-proof-of-concept.
I do not understand why a null pointer exception would occur and/or how to fix it.
If you are handling the object initialization in a different thread it is possible that the constructor is not finished before the
public synchronized byte getDataByte()
is called therefore causing the NullPointerException because
this.buffer = new ArrayList<Byte>();
was never called.
I have a guess as to an explanation. I would do it in comments, but I don't have enough reputation, so hopefully this answer is helpful.
First of all, if you were to declare the addData() function as synchronized, would your problem go away? My guess is that it would.
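One caveat: Thread.sleep() does not release a monitor, so if addData() were synchronized as well, the producer would be blocked for as long as getDataByte() sleep-polls inside its own synchronized method. A sketch of an alternative, assuming a hypothetical class name `QueueDataReceiver`, is to replace the ArrayList with a BlockingQueue so that neither side needs explicit locking or polling:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class QueueDataReceiver {
    // Thread-safe FIFO; replaces the unsynchronized ArrayList.
    private final BlockingQueue<Byte> buffer = new LinkedBlockingQueue<>();

    // Producer side: appends the first `bytes` entries of newData.
    void addData(byte[] newData, int bytes) {
        for (int i = 0; i < bytes; i++) {
            buffer.add(newData[i]);   // the queue is unbounded, so add() is not rejected for capacity
        }
    }

    // Consumer side: blocks until a byte is available instead of sleep-polling.
    byte getDataByte() throws InterruptedException {
        return buffer.take();
    }

    // Number of bytes currently waiting to be consumed.
    int available() {
        return buffer.size();
    }
}
```

Because take() never returns null, the auto-unboxing from Byte to byte in the return statement cannot throw a NullPointerException here.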
My theory is that although you declared buffer as volatile, that is not sufficient protection for your use case. Imagine this case:
addData() gets called and is calling buffer.add()
at the same time, getDataByte() is checking buffer.size() == 0
My theory is that buffer.add() is not an atomic operation. Somewhere during the buffer.add() operation, its internal size counter increments, enabling your getDataByte() call to buffer.size() == 0 to return false. On occasion, getDataByte() continues with its buffer.remove() call before your buffer.add() call completes.
This is based on an excerpt I read here:
https://www.ibm.com/developerworks/java/library/j-jtp06197/
"While the increment operation (x++) may look like a single operation, it is really a compound read-modify-write sequence of operations that must execute atomically -- and volatile does not provide the necessary atomicity."

Synchronizing var instantiation upon multiple threads

I'm trying to sync a var instantiation like this one:
Object o = new Object();
String s = null;
String getS() {
if (s != null) {
return s;
}
// multiple threads stopping here
// maybe using readwritelock? write.lock?
synchronized (o) {
// if previous thread stopped by sync block
// completed before, bypass this
if (s != null) {
return s;
}
// no one before, instantiate s
s = "abc";
}
return s;
}
Is there a better way to handle a single time instantiation of var s? Maybe using locks?
Declare s volatile
volatile String s;
and we'll get a classic double-checked locking implementation. Patterns are formalized best practices, so you don't need to try to further improve this code.
BTW, the example with lazy String initialization makes little sense; it should be an Object that is expensive to create.
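Put together, the pattern could be wrapped in a small reusable holder. This is a sketch under my own assumptions: the class name `LazyValue` is hypothetical, and the supplied factory is assumed never to return null, since null is used to mean "not yet initialized".

```java
import java.util.function.Supplier;

final class LazyValue<T> {
    private final Supplier<T> factory;
    private volatile T value;              // volatile makes the unlocked read safe

    LazyValue(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T local = value;                   // first check, without locking
        if (local == null) {
            synchronized (this) {
                local = value;             // second check, under the lock
                if (local == null) {
                    local = factory.get(); // runs at most once per LazyValue
                    value = local;
                }
            }
        }
        return local;
    }
}
```

For example, `new LazyValue<>(() -> "abc")` would defer creating the value until the first call to get().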
The simplest to write:
private Foo foo;
public synchronized Foo getFoo() {
if (foo == null) {
foo = new Foo();
}
return foo;
}
The downside is that you have synchronization happen every time you access this property, even though synchronization is only needed the first time.
Google "double checked locking in java" for lots of information about ways that you can accomplish the same thing, but with less locking (and therefore potentially better performance).

Synchronizing on String objects in Java

I have a webapp that I am in the middle of doing some load/performance testing on, particularly on a feature where we expect a few hundred users to be accessing the same page and hitting refresh about every 10 seconds on this page. One area of improvement that we found we could make with this function was to cache the responses from the web service for some period of time, since the data is not changing.
After implementing this basic caching, in some further testing I found out that I didn't consider how concurrent threads could access the Cache at the same time. I found that within the matter of ~100ms, about 50 threads were trying to fetch the object from the Cache, finding that it had expired, hitting the web service to fetch the data, and then putting the object back in the cache.
The original code looked something like this:
private SomeData[] getSomeDataByEmail(WebServiceInterface service, String email) {
final String key = "Data-" + email;
SomeData[] data = (SomeData[]) StaticCache.get(key);
if (data == null) {
data = service.getSomeDataForEmail(email);
StaticCache.set(key, data, CACHE_TIME);
}
else {
logger.debug("getSomeDataForEmail: using cached object");
}
return data;
}
So, to make sure that only one thread was calling the web service when the object at key expired, I thought I needed to synchronize the Cache get/set operation, and it seemed like using the cache key would be a good candidate for an object to synchronize on (this way, calls to this method for email b@b.com would not be blocked by method calls to a@a.com).
I updated the method to look like this:
private SomeData[] getSomeDataByEmail(WebServiceInterface service, String email) {
SomeData[] data = null;
final String key = "Data-" + email;
synchronized(key) {
data =(SomeData[]) StaticCache.get(key);
if (data == null) {
data = service.getSomeDataForEmail(email);
StaticCache.set(key, data, CACHE_TIME);
}
else {
logger.debug("getSomeDataForEmail: using cached object");
}
}
return data;
}
I also added logging lines for things like "before synchronization block", "inside synchronization block", "about to leave synchronization block", and "after synchronization block", so I could determine if I was effectively synchronizing the get/set operation.
However it doesn't seem like this has worked. My test logs have output like:
(log output is 'threadname' 'logger name' 'message')
http-80-Processor253 jsp.view-page - getSomeDataForEmail: about to enter synchronization block
http-80-Processor253 jsp.view-page - getSomeDataForEmail: inside synchronization block
http-80-Processor253 cache.StaticCache - get: object at key [SomeData-test@test.com] has expired
http-80-Processor253 cache.StaticCache - get: key [SomeData-test@test.com] returning value [null]
http-80-Processor263 jsp.view-page - getSomeDataForEmail: about to enter synchronization block
http-80-Processor263 jsp.view-page - getSomeDataForEmail: inside synchronization block
http-80-Processor263 cache.StaticCache - get: object at key [SomeData-test@test.com] has expired
http-80-Processor263 cache.StaticCache - get: key [SomeData-test@test.com] returning value [null]
http-80-Processor131 jsp.view-page - getSomeDataForEmail: about to enter synchronization block
http-80-Processor131 jsp.view-page - getSomeDataForEmail: inside synchronization block
http-80-Processor131 cache.StaticCache - get: object at key [SomeData-test@test.com] has expired
http-80-Processor131 cache.StaticCache - get: key [SomeData-test@test.com] returning value [null]
http-80-Processor104 jsp.view-page - getSomeDataForEmail: inside synchronization block
http-80-Processor104 cache.StaticCache - get: object at key [SomeData-test@test.com] has expired
http-80-Processor104 cache.StaticCache - get: key [SomeData-test@test.com] returning value [null]
http-80-Processor252 jsp.view-page - getSomeDataForEmail: about to enter synchronization block
http-80-Processor283 jsp.view-page - getSomeDataForEmail: about to enter synchronization block
http-80-Processor2 jsp.view-page - getSomeDataForEmail: about to enter synchronization block
http-80-Processor2 jsp.view-page - getSomeDataForEmail: inside synchronization block
I wanted to see only one thread at a time entering/exiting the synchronization block around the get/set operations.
Is there an issue in synchronizing on String objects? I thought the cache key would be a good choice as it is unique to the operation, and even though the final String key is declared within the method, I was thinking that each thread would be getting a reference to the same object and therefore would synchronize on this single object.
What am I doing wrong here?
Update: after looking further at the logs, it seems like methods with the same synchronization logic where the key is always the same, such as
final String key = "blah";
...
synchronized(key) { ...
do not exhibit the same concurrency problem - only one thread at a time is entering the block.
Update 2: Thanks to everyone for the help! I accepted the first answer about intern()ing Strings, which solved my initial problem, where multiple threads were entering synchronized blocks where I thought they shouldn't, because the keys had the same value.
As others have pointed out, using intern() for such a purpose and synchronizing on those Strings does indeed turn out to be a bad idea - when running JMeter tests against the webapp to simulate the expected load, I saw the used heap size grow to almost 1GB in just under 20 minutes.
Currently I'm using the simple solution of just synchronizing the entire method. I really like the code samples provided by martinprobst and MBCook, but since I have about 7 similar getData() methods in this class currently (since it needs about 7 different pieces of data from a web service), I didn't want to add almost-duplicate logic about getting and releasing locks to each method. But this is definitely very, very valuable info for future usage. I think these are ultimately the correct answers on how best to make an operation like this thread-safe, and I'd give out more votes to these answers if I could!
Without putting my brain fully into gear, from a quick scan of what you say it looks as though you need to intern() your Strings:
final String firstkey = "Data-" + email;
final String key = firstkey.intern();
Two Strings with the same value are otherwise not necessarily the same object.
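The difference between equal and identical strings can be checked directly. In this small sketch the class and method names (`KeyIdentityDemo`, `buildKey`, `sameMonitor`) are hypothetical; the key is built at run time, as in the question, so each call produces a new String object.

```java
final class KeyIdentityDemo {
    // The concatenation is not a compile-time constant,
    // so every call creates a new String object.
    static String buildKey(String email) {
        return "Data-" + email;
    }

    // synchronized(x) and synchronized(y) exclude each other
    // only if x and y are the very same object.
    static boolean sameMonitor(String x, String y) {
        return x == y;
    }
}
```

With `a = buildKey(email)` and `b = buildKey(email)`, `a.equals(b)` holds but `sameMonitor(a, b)` does not, whereas `sameMonitor(a.intern(), b.intern())` does.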
Note that this may introduce a new point of contention, since deep in the VM, intern() may have to acquire a lock. I have no idea what modern VMs look like in this area, but one hopes they are fiendishly optimised.
I assume you know that StaticCache still needs to be thread-safe. But the contention there should be tiny compared with what you'd have if you were locking on the cache rather than just the key while calling getSomeDataForEmail.
Response to question update:
I think that's because a string literal always yields the same object. Dave Costa points out in a comment that it's even better than that: a literal always yields the canonical representation. So all String literals with the same value anywhere in the program would yield the same object.
Edit
Others have pointed out that synchronizing on intern strings is actually a really bad idea - partly because creating intern strings is permitted to cause them to exist in perpetuity, and partly because if more than one bit of code anywhere in your program synchronizes on intern strings, you have dependencies between those bits of code, and preventing deadlocks or other bugs may be impossible.
Strategies to avoid this by storing a lock object per key string are being developed in other answers as I type.
Here's an alternative - it still uses a singular lock, but we know we're going to need one of those for the cache anyway, and you were talking about 50 threads, not 5000, so that may not be fatal. I'm also assuming that the performance bottleneck here is slow blocking I/O in DoSlowThing() which will therefore hugely benefit from not being serialised. If that's not the bottleneck, then:
If the CPU is busy then this approach may not be sufficient and you need another approach.
If the CPU is not busy, and access to server is not a bottleneck, then this approach is overkill, and you might as well forget both this and per-key locking, put a big synchronized(StaticCache) around the whole operation, and do it the easy way.
Obviously this approach needs to be soak tested for scalability before use -- I guarantee nothing.
This code does NOT require that StaticCache is synchronized or otherwise thread-safe. That needs to be revisited if any other code (for example scheduled clean-up of old data) ever touches the cache.
IN_PROGRESS is a dummy value - not exactly clean, but the code's simple and it saves having two hashtables. It doesn't handle InterruptedException because I don't know what your app wants to do in that case. Also, if DoSlowThing() consistently fails for a given key this code as it stands is not exactly elegant, since every thread through will retry it. Since I don't know what the failure criteria are, and whether they are liable to be temporary or permanent, I don't handle this either, I just make sure threads don't block forever. In practice you may want to put a data value in the cache which indicates 'not available', perhaps with a reason, and a timeout for when to retry.
// do not attempt double-check locking here. I mean it.
synchronized(StaticObject) {
data = StaticCache.get(key);
while (data == IN_PROGRESS) {
// another thread is getting the data
StaticObject.wait();
data = StaticCache.get(key);
}
if (data == null) {
// we must get the data
StaticCache.put(key, IN_PROGRESS, TIME_MAX_VALUE);
}
}
if (data == null) {
// we must get the data
try {
data = server.DoSlowThing(key);
} finally {
synchronized(StaticObject) {
// WARNING: failure here is fatal, and must be allowed to terminate
// the app or else waiters will be left forever. Choose a suitable
// collection type in which replacing the value for a key is guaranteed.
StaticCache.put(key, data, CURRENT_TIME);
StaticObject.notifyAll();
}
}
}
Every time anything is added to the cache, all threads wake up and check the cache (no matter what key they're after), so it's possible to get better performance with less contentious algorithms. However, much of that work will take place during your copious idle CPU time blocking on I/O, so it may not be a problem.
This code could be commoned-up for use with multiple caches, if you define suitable abstractions for the cache and its associated lock, the data it returns, the IN_PROGRESS dummy, and the slow operation to perform. Rolling the whole thing into a method on the cache might not be a bad idea.
Synchronizing on an intern'd String might not be a good idea at all - by interning it, the String turns into a global object, and if you synchronize on the same interned strings in different parts of your application, you might get really weird and basically undebuggable synchronization issues such as deadlocks. It might seem unlikely, but when it happens you are really screwed. As a general rule, only ever synchronize on a local object where you're absolutely sure that no code outside of your module might lock it.
In your case, you can use a synchronized hashtable to store locking objects for your keys.
E.g.:
Object data = StaticCache.get(key, ...);
if (data == null) {
Object lock = lockTable.get(key);
if (lock == null) {
// we're the only one looking for this
lock = new Object();
synchronized(lock) {
lockTable.put(key, lock);
// get stuff
lockTable.remove(key);
}
} else {
synchronized(lock) {
// just to wait for the updater
}
data = StaticCache.get(key);
}
} else {
// use from cache
}
This code has a race condition, where two threads might put an object into the lock table after each other. This should however not be a problem, because then you only have one more thread calling the webservice and updating the cache, which shouldn't be a problem.
If you're invalidating the cache after some time, you should check whether data is null again after retrieving it from the cache, in the lock != null case.
Alternatively, and much easier, you can make the whole cache lookup method ("getSomeDataByEmail") synchronized. This will mean that all threads have to synchronize when they access the cache, which might be a performance problem. But as always, try this simple solution first and see if it's really a problem! In many cases it should not be, as you probably spend much more time processing the result than synchronizing.
Strings are not good candidates for synchronization. If you must synchronize on a String ID, it can be done by using the string to create a mutex (see "synchronizing on an ID"). Whether the cost of that algorithm is worth it depends on whether invoking your service involves any significant I/O.
Also:
I hope the StaticCache.get() and set() methods are threadsafe.
String.intern() comes at a cost (one that varies between VM implementations) and should be used with care.
Here is a safe short Java 8 solution that uses a map of dedicated lock objects for synchronization:
private static final Map<String, Object> keyLocks = new ConcurrentHashMap<>();
private SomeData[] getSomeDataByEmail(WebServiceInterface service, String email) {
final String key = "Data-" + email;
SomeData[] data; // declared outside the block so it is in scope for the return
synchronized (keyLocks.computeIfAbsent(key, k -> new Object())) {
data = StaticCache.get(key);
if (data == null) {
data = service.getSomeDataForEmail(email);
StaticCache.set(key, data);
}
}
return data;
}
It has the drawback that keys and lock objects are retained in the map forever.
This can be worked around like this:
private SomeData[] getSomeDataByEmail(WebServiceInterface service, String email) {
final String key = "Data-" + email;
SomeData[] data; // declared outside the block so it is in scope for the return
synchronized (keyLocks.computeIfAbsent(key, k -> new Object())) {
try {
data = StaticCache.get(key);
if (data == null) {
data = service.getSomeDataForEmail(email);
StaticCache.set(key, data);
}
} finally {
keyLocks.remove(key); // vulnerable to race-conditions
}
}
return data;
}
But then popular keys would be constantly reinserted into the map, with lock objects being reallocated each time.
Update: this also leaves a possible race condition in which two threads concurrently enter the synchronized section for the same key but with different lock objects.
So it may be safer and more efficient to use an expiring Guava Cache:
private static final LoadingCache<String, Object> keyLocks = CacheBuilder.newBuilder()
.expireAfterAccess(10, TimeUnit.MINUTES) // max lock time ever expected
.build(CacheLoader.from(Object::new));
private SomeData[] getSomeDataByEmail(WebServiceInterface service, String email) {
    final String key = "Data-" + email;
    SomeData[] data; // declared outside the synchronized block so it can be returned
    synchronized (keyLocks.getUnchecked(key)) {
        data = StaticCache.get(key);
        if (data == null) {
            data = service.getSomeDataForEmail(email);
            StaticCache.set(key, data);
        }
    }
    return data;
}
Note that it's assumed here that StaticCache is thread-safe and wouldn't suffer from concurrent reads and writes for different keys.
Others have suggested interning the strings, and that will work.
The problem is that Java has to keep interned strings around. I was told it does this even if you're not holding a reference because the value needs to be the same the next time someone uses that string. This means interning all the strings may start eating up memory, which with the load you're describing could be a big problem.
I have seen two solutions to this:
You could synchronize on another object
Instead of the email, make an object that holds the email (say the User object) that holds the value of email as a variable. If you already have another object that represents the person (say you already pulled something from the DB based on their email) you could use that. By implementing the equals method and the hashcode method you can make sure Java considers the objects the same when you do a static cache.contains() to find out if the data is already in the cache (you'll have to synchronize on the cache).
Actually, you could keep a second Map for objects to lock on. Something like this:
Map<String, Object> emailLocks = new HashMap<String, Object>();
Object lock = null;
synchronized (emailLocks) {
lock = emailLocks.get(emailAddress);
if (lock == null) {
lock = new Object();
emailLocks.put(emailAddress, lock);
}
}
synchronized (lock) {
// See if this email is in the cache
// If so, serve that
// If not, generate the data
// Since each of this person's threads synchronizes on this, they won't run
// over each other. Since this lock is only for this person, it won't affect
// other people. The other synchronized block (on emailLocks) is small enough
// that it shouldn't cause a performance problem.
}
This will prevent 15 fetches on the same email address at once. You'll need something to prevent too many entries from ending up in the emailLocks map. Using LRUMaps from Apache Commons would do it.
This will need some tweaking, but it may solve your problem.
Use a different key
If you are willing to put up with possible errors (I don't know how important this is) you could use the hashcode of the String as the key. ints don't need to be interned.
Summary
I hope this helps. Threading is fun, isn't it? You could also use the session to set a value meaning "I'm already working on finding this" and check that to see if the second (third, Nth) thread needs to attempt to create it or just wait for the result to show up in the cache. I guess I had three suggestions.
You can use the 1.5 concurrency utilities to provide a cache designed to allow multiple concurrent access, and a single point of addition (i.e. only one thread ever performing the expensive object "creation"):
private ConcurrentMap<String, Future<SomeData[]>> cache;
private SomeData[] getSomeDataByEmail(final WebServiceInterface service, final String email) throws Exception {
    final String key = "Data-" + email;
    Callable<SomeData[]> call = new Callable<SomeData[]>() {
        public SomeData[] call() {
            return service.getSomeDataForEmail(email);
        }
    };
    FutureTask<SomeData[]> ft;
    Future<SomeData[]> f = cache.putIfAbsent(key, ft = new FutureTask<SomeData[]>(call)); // atomic
    if (f == null) { // this means that the cache had no mapping for the key
        f = ft;
        ft.run();
    }
    return f.get(); // wait on the result being available if it is being calculated in another thread
}
Obviously, this doesn't handle exceptions as you'd want to, and the cache doesn't have eviction built in. Perhaps you could use it as a basis to change your StaticCache class, though.
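To make the exception-handling caveat concrete, here is one possible sketch, not a definitive implementation. The generic FutureCache class and its get method are names invented for the example. It removes a failed Future from the map so that a later call can retry instead of rethrowing the cached failure forever. Eviction is still not handled.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

public class FutureCache<K, V> {
    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();

    public V get(K key, Callable<V> loader) throws InterruptedException, ExecutionException {
        Future<V> f = cache.get(key);
        if (f == null) {
            FutureTask<V> ft = new FutureTask<>(loader);
            f = cache.putIfAbsent(key, ft); // atomic: only one task wins per key
            if (f == null) {
                f = ft;
                ft.run(); // the winning thread computes the value in place
            }
        }
        try {
            return f.get(); // other threads wait here for the winner's result
        } catch (ExecutionException e) {
            // Forget the failed computation, but only if the map still holds
            // this exact Future, so a newer attempt is not discarded.
            cache.remove(key, f);
            throw e;
        }
    }
}
```

Every thread that was waiting on the failed Future still sees the exception. Only calls made after the removal trigger a fresh load.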
Use a decent caching framework such as ehcache.
Implementing a good cache is not as easy as some people believe.
Regarding the comment that String.intern() is a source of memory leaks, that is actually not true.
Interned Strings are garbage collected; it just might take longer, because on certain JVMs (Sun) they are stored in Perm space, which is only touched by full GCs.
Your main problem is not just that there might be multiple instances of String with the same value. The main problem is that you need to have only one monitor on which to synchronize for accessing the StaticCache object. Otherwise multiple threads might end up concurrently modifying StaticCache (albeit under different keys), which most likely doesn't support concurrent modification.
The call:
final String key = "Data-" + email;
creates a new object every time the method is called. Because that object is what you use to lock, and every call to this method creates a new object, then you are not really synchronizing access to the map based on the key.
This also explains your edit: with a static string, every call locks on the same object, so it works.
Using intern() solves the problem because it returns the string from an internal pool kept by the String class, which ensures that two equal strings resolve to the same pooled instance. See
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/String.html#intern()
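The difference is easy to see in a small stand-alone program. InternDemo and buildKey are names made up for this sketch. The key is built from a method parameter so that the concatenation happens at run time, as it does in the OP's method.

```java
public class InternDemo {
    // Concatenation with a non-constant operand happens at run time,
    // so each call returns a newly created String object.
    public static String buildKey(String email) {
        return "Data-" + email;
    }

    public static void main(String[] args) {
        String k1 = buildKey("user@example.com");
        String k2 = buildKey("user@example.com");
        System.out.println(k1 == k2);                   // false: two different monitors
        System.out.println(k1.equals(k2));              // true: same value
        System.out.println(k1.intern() == k2.intern()); // true: one pooled instance
    }
}
```

Two threads that synchronize on k1 and k2 do not exclude each other, while two threads that synchronize on the interned keys do.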
This question seems a bit too broad to me, and so it has attracted an equally broad set of answers. I'll therefore answer the question I was redirected here from, which unfortunately was closed as a duplicate.
public class ValueLock<T> {
private Lock lock = new ReentrantLock();
private Map<T, Condition> conditions = new HashMap<T, Condition>();
public void lock(T t){
lock.lock();
try {
while (conditions.containsKey(t)){
conditions.get(t).awaitUninterruptibly();
}
conditions.put(t, lock.newCondition());
} finally {
lock.unlock();
}
}
public void unlock(T t){
lock.lock();
try {
Condition condition = conditions.get(t);
if (condition == null)
throw new IllegalStateException();// possibly an attempt to release what wasn't acquired
conditions.remove(t);
condition.signalAll();
} finally {
lock.unlock();
}
}
}
On the (outer) lock operation, the (inner) lock is acquired to get exclusive access to the map for a short time. If the corresponding object is already in the map, the current thread waits;
otherwise it puts a new Condition into the map, releases the (inner) lock and proceeds,
and the (outer) lock is considered obtained.
The (outer) unlock operation first acquires the (inner) lock, then removes the object from the map and signals on its Condition.
The class does not use a concurrent version of Map, because every access to it is guarded by the single (inner) lock.
Please note that the semantics of this class's lock() method differ from those of ReentrantLock.lock(): it is not reentrant, so repeated lock() invocations without a paired unlock() will hang the current thread indefinitely.
An example of usage that might apply to the situation the OP described:
ValueLock<String> lock = new ValueLock<String>();
// ... share the lock
String email = "...";
lock.lock(email); // acquire before the try, so finally never unlocks a lock that was not taken
try {
//...
} finally {
lock.unlock(email);
}
This is rather late, but there is quite a lot of incorrect code presented here.
In this example:
private SomeData[] getSomeDataByEmail(WebServiceInterface service, String email) {
SomeData[] data = null;
final String key = "Data-" + email;
synchronized(key) {
data =(SomeData[]) StaticCache.get(key);
if (data == null) {
data = service.getSomeDataForEmail(email);
StaticCache.set(key, data, CACHE_TIME);
}
else {
logger.debug("getSomeDataForEmail: using cached object");
}
}
return data;
}
The synchronization is incorrectly scoped. For a static cache that supports a get/put API, there should be at least synchronization around the get and getIfAbsentPut type operations, for safe access to the cache. The scope of synchronization will be the cache itself.
If updates must be made to the data elements themselves, that adds an additional layer of synchronization, which should be on the individual data elements.
SynchronizedMap can be used in place of explicit synchronization, but care must still be observed. If the wrong APIs are used (get and put instead of putIfAbsent) then the operations won't have the necessary synchronization, despite the use of the synchronized map. Notice the complications introduced by the use of putIfAbsent: Either, the put value must be computed even in cases when it is not needed (because the put cannot know if the put value is needed until the cache contents are examined), or requires a careful use of delegation (say, using Future, which works, but is somewhat of a mismatch; see below), where the put value is obtained on demand if needed.
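On Java 8 and later, one way around the "compute the value even when it is not needed" problem is ConcurrentHashMap.computeIfAbsent, which only invokes the loader when the key is missing and does so atomically per key. This is a different technique from the putIfAbsent approach discussed above. LazyCache and its method names are assumptions made for this sketch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class LazyCache<K, V> {
    // Declared as ConcurrentHashMap (not just ConcurrentMap) because it is this
    // class that documents computeIfAbsent as atomic, with the function applied
    // at most once per key.
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();

    public V get(K key, Function<K, V> loader) {
        // The loader runs only if the key has no mapping yet.
        return cache.computeIfAbsent(key, loader);
    }
}
```

The documentation warns that other updates to the map may block while the function runs and that the computation should be short, so for a slow web service call this may or may not be acceptable.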
The use of Futures is possible, but seems rather awkward, and perhaps a bit of overengineering. The Future API is at its core for asynchronous operations, in particular for operations which may not complete immediately. Involving Future very probably adds a layer of thread creation, an extra and probably unnecessary complication.
The main problem of using Future for this type of operation is that Future inherently ties in multi-threading. Use of Future when a new thread is not necessary means ignoring a lot of the machinery of Future, making it an overly heavy API for this use.
Latest update, 2019:
If you are looking for newer ways of implementing synchronization in Java, this answer is for you.
I found a blog post by Anatoliy Korovin that helps explain synchronized in depth:
How to Synchronize Blocks by the Value of the Object in Java.
It helped me, and I hope new developers will find it useful too.
Why not just render a static html page that gets served to the user and regenerated every x minutes?
I'd also suggest getting rid of the string concatenation entirely if you don't need it.
final String key = "Data-" + email;
Are there other things or types of objects in the cache that use the email address, so that you need the extra "Data-" at the beginning of the key?
If not, I'd just make that
final String key = email;
and you avoid all that extra string creation too.
In case others have a similar problem, the following code works, as far as I can tell:
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
public class KeySynchronizer<T> {
private Map<T, CounterLock> locks = new ConcurrentHashMap<>();
public <U> U synchronize(T key, Supplier<U> supplier) {
CounterLock lock = locks.compute(key, (k, v) ->
v == null ? new CounterLock() : v.increment());
synchronized (lock) {
try {
return supplier.get();
} finally {
if (lock.decrement() == 0) {
// Only removes if key still points to the same value,
// to avoid issue described below.
locks.remove(key, lock);
}
}
}
}
private static final class CounterLock {
private AtomicInteger remaining = new AtomicInteger(1);
private CounterLock increment() {
// Returning a new CounterLock object if remaining = 0 to ensure that
// the lock is not removed in step 5 of the following execution sequence:
// 1) Thread 1 obtains a new CounterLock object from locks.compute (after evaluating "v == null" to true)
// 2) Thread 2 evaluates "v == null" to false in locks.compute
// 3) Thread 1 calls lock.decrement() which sets remaining = 0
// 4) Thread 2 calls v.increment() in locks.compute
// 5) Thread 1 calls locks.remove(key, lock)
return remaining.getAndIncrement() == 0 ? new CounterLock() : this;
}
private int decrement() {
return remaining.decrementAndGet();
}
}
}
In the case of the OP, it would be used like this:
private KeySynchronizer<String> keySynchronizer = new KeySynchronizer<>();
private SomeData[] getSomeDataByEmail(WebServiceInterface service, String email) {
String key = "Data-" + email;
return keySynchronizer.synchronize(key, () -> {
SomeData[] existing = (SomeData[]) StaticCache.get(key);
if (existing == null) {
SomeData[] data = service.getSomeDataForEmail(email);
StaticCache.set(key, data, CACHE_TIME);
return data;
}
logger.debug("getSomeDataForEmail: using cached object");
return existing;
});
}
If nothing should be returned from the synchronized code, the synchronize method can be written like this:
public void synchronize(T key, Runnable runnable) {
CounterLock lock = locks.compute(key, (k, v) ->
v == null ? new CounterLock() : v.increment());
synchronized (lock) {
try {
runnable.run();
} finally {
if (lock.decrement() == 0) {
// Only removes if key still points to the same value,
// to avoid issue described below.
locks.remove(key, lock);
}
}
}
}
I've added a small lock class that can lock/synchronize on any key, including strings.
See implementation for Java 8, Java 6 and a small test.
Java 8:
public class DynamicKeyLock<T> implements Lock
{
private final static ConcurrentHashMap<Object, LockAndCounter> locksMap = new ConcurrentHashMap<>();
private final T key;
public DynamicKeyLock(T lockKey)
{
this.key = lockKey;
}
private static class LockAndCounter
{
private final Lock lock = new ReentrantLock();
private final AtomicInteger counter = new AtomicInteger(0);
}
private LockAndCounter getLock()
{
return locksMap.compute(key, (key, lockAndCounterInner) ->
{
if (lockAndCounterInner == null) {
lockAndCounterInner = new LockAndCounter();
}
lockAndCounterInner.counter.incrementAndGet();
return lockAndCounterInner;
});
}
private void cleanupLock(LockAndCounter lockAndCounterOuter)
{
if (lockAndCounterOuter.counter.decrementAndGet() == 0)
{
locksMap.compute(key, (key, lockAndCounterInner) ->
{
if (lockAndCounterInner == null || lockAndCounterInner.counter.get() == 0) {
return null;
}
return lockAndCounterInner;
});
}
}
@Override
public void lock()
{
LockAndCounter lockAndCounter = getLock();
lockAndCounter.lock.lock();
}
@Override
public void unlock()
{
LockAndCounter lockAndCounter = locksMap.get(key);
lockAndCounter.lock.unlock();
cleanupLock(lockAndCounter);
}
@Override
public void lockInterruptibly() throws InterruptedException
{
LockAndCounter lockAndCounter = getLock();
try
{
lockAndCounter.lock.lockInterruptibly();
}
catch (InterruptedException e)
{
cleanupLock(lockAndCounter);
throw e;
}
}
@Override
public boolean tryLock()
{
LockAndCounter lockAndCounter = getLock();
boolean acquired = lockAndCounter.lock.tryLock();
if (!acquired)
{
cleanupLock(lockAndCounter);
}
return acquired;
}
@Override
public boolean tryLock(long time, TimeUnit unit) throws InterruptedException
{
LockAndCounter lockAndCounter = getLock();
boolean acquired;
try
{
acquired = lockAndCounter.lock.tryLock(time, unit);
}
catch (InterruptedException e)
{
cleanupLock(lockAndCounter);
throw e;
}
if (!acquired)
{
cleanupLock(lockAndCounter);
}
return acquired;
}
@Override
public Condition newCondition()
{
LockAndCounter lockAndCounter = locksMap.get(key);
return lockAndCounter.lock.newCondition();
}
}
Java 6:
public class DynamicKeyLock<T> implements Lock
{
private final static ConcurrentHashMap<Object, LockAndCounter> locksMap = new ConcurrentHashMap<Object, LockAndCounter>();
private final T key;
public DynamicKeyLock(T lockKey) {
this.key = lockKey;
}
private static class LockAndCounter {
private final Lock lock = new ReentrantLock();
private final AtomicInteger counter = new AtomicInteger(0);
}
private LockAndCounter getLock()
{
while (true) // Try to init lock
{
LockAndCounter lockAndCounter = locksMap.get(key);
if (lockAndCounter == null)
{
LockAndCounter newLock = new LockAndCounter();
lockAndCounter = locksMap.putIfAbsent(key, newLock);
if (lockAndCounter == null)
{
lockAndCounter = newLock;
}
}
lockAndCounter.counter.incrementAndGet();
synchronized (lockAndCounter)
{
LockAndCounter lastLockAndCounter = locksMap.get(key);
if (lockAndCounter == lastLockAndCounter)
{
return lockAndCounter;
}
// else some other thread beat us to it, thus try again.
}
}
}
private void cleanupLock(LockAndCounter lockAndCounter)
{
if (lockAndCounter.counter.decrementAndGet() == 0)
{
synchronized (lockAndCounter)
{
if (lockAndCounter.counter.get() == 0)
{
locksMap.remove(key);
}
}
}
}
@Override
public void lock()
{
LockAndCounter lockAndCounter = getLock();
lockAndCounter.lock.lock();
}
@Override
public void unlock()
{
LockAndCounter lockAndCounter = locksMap.get(key);
lockAndCounter.lock.unlock();
cleanupLock(lockAndCounter);
}
@Override
public void lockInterruptibly() throws InterruptedException
{
LockAndCounter lockAndCounter = getLock();
try
{
lockAndCounter.lock.lockInterruptibly();
}
catch (InterruptedException e)
{
cleanupLock(lockAndCounter);
throw e;
}
}
@Override
public boolean tryLock()
{
LockAndCounter lockAndCounter = getLock();
boolean acquired = lockAndCounter.lock.tryLock();
if (!acquired)
{
cleanupLock(lockAndCounter);
}
return acquired;
}
@Override
public boolean tryLock(long time, TimeUnit unit) throws InterruptedException
{
LockAndCounter lockAndCounter = getLock();
boolean acquired;
try
{
acquired = lockAndCounter.lock.tryLock(time, unit);
}
catch (InterruptedException e)
{
cleanupLock(lockAndCounter);
throw e;
}
if (!acquired)
{
cleanupLock(lockAndCounter);
}
return acquired;
}
@Override
public Condition newCondition()
{
LockAndCounter lockAndCounter = locksMap.get(key);
return lockAndCounter.lock.newCondition();
}
}
Test:
public class DynamicKeyLockTest
{
@Test
public void testDifferentKeysDontLock() throws InterruptedException
{
DynamicKeyLock<Object> lock = new DynamicKeyLock<>(new Object());
lock.lock();
AtomicBoolean anotherThreadWasExecuted = new AtomicBoolean(false);
try
{
new Thread(() ->
{
DynamicKeyLock<Object> anotherLock = new DynamicKeyLock<>(new Object());
anotherLock.lock();
try
{
anotherThreadWasExecuted.set(true);
}
finally
{
anotherLock.unlock();
}
}).start();
Thread.sleep(100);
}
finally
{
Assert.assertTrue(anotherThreadWasExecuted.get());
lock.unlock();
}
}
@Test
public void testSameKeysLock() throws InterruptedException
{
Object key = new Object();
DynamicKeyLock<Object> lock = new DynamicKeyLock<>(key);
lock.lock();
AtomicBoolean anotherThreadWasExecuted = new AtomicBoolean(false);
try
{
new Thread(() ->
{
DynamicKeyLock<Object> anotherLock = new DynamicKeyLock<>(key);
anotherLock.lock();
try
{
anotherThreadWasExecuted.set(true);
}
finally
{
anotherLock.unlock();
}
}).start();
Thread.sleep(100);
}
finally
{
Assert.assertFalse(anotherThreadWasExecuted.get());
lock.unlock();
}
}
}
In your case you could use something like this (this doesn't leak any memory):
private Synchronizer<String> synchronizer = new Synchronizer<>();
private SomeData[] getSomeDataByEmail(WebServiceInterface service, String email) {
String key = "Data-" + email;
return synchronizer.synchronizeOn(key, () -> {
SomeData[] data = (SomeData[]) StaticCache.get(key);
if (data == null) {
data = service.getSomeDataForEmail(email);
StaticCache.set(key, data, CACHE_TIME);
} else {
logger.debug("getSomeDataForEmail: using cached object");
}
return data;
});
}
To use it, just add a dependency:
compile 'com.github.matejtymes:javafixes:1.3.0'
You should be very careful using short lived objects with synchronization. Every Java object has an attached monitor and by default this monitor is deflated; however if 2 threads contend on acquiring the monitor, the monitor gets inflated. If the object would be long lived, this isn't a problem. However if the object is short lived, then cleaning up this inflated monitor can be a serious hit on GC times (so higher latencies and reduced throughput). And it can even be tricky to spot on the GC times since it isn't always listed.
If you do want to synchronize, you could use a java.util.concurrent.locks.Lock. Or make use of a manually crafted striped lock and use the hash of the string as an index into that striped lock. You keep this striped lock around, so you don't get the GC problems.
So something like this:
static final Object[] locks = newLockArray();
Object lock = locks[hashToIndex(key.hashCode(), locks.length)];
synchronized (lock) {
    ....
}
int hashToIndex(int hash, int length) {
    // Math.abs(Integer.MIN_VALUE) is still negative, so handle it explicitly
    if (hash == Integer.MIN_VALUE) return 0;
    return Math.abs(hash) % length;
}
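For completeness, a self-contained version of the striped-lock idea might look like the sketch below. StripedLocks and lockFor are names invented for the example, and Math.floorMod is swapped in for the abs-and-modulo helper because it never returns a negative index for a positive stripe count.

```java
public class StripedLocks {
    // Fixed array of long-lived lock objects, allocated once.
    private final Object[] locks;

    public StripedLocks(int stripes) {
        locks = new Object[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new Object();
        }
    }

    // Equal keys always map to the same stripe; unrelated keys may share one.
    public Object lockFor(String key) {
        return locks[Math.floorMod(key.hashCode(), locks.length)];
    }
}
```

A caller would then write synchronized (stripes.lockFor(key)) { ... }. The trade-off is that two different keys hashing to the same stripe will block each other, so the stripe count sets the balance between memory use and false contention.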
Another way of synchronizing on a string object:
String cacheKey = ...;
Object obj = cache.get(cacheKey);
if (obj == null) {
    // Integer.valueOf returns cached instances for small values, so equal hashes share a monitor
    synchronized (Integer.valueOf(Math.abs(cacheKey.hashCode()) % 127)) {
        obj = cache.get(cacheKey);
        if (obj == null) {
            // some call to obtain the obj value, and put it into the cache
        }
    }
}
You can safely use String.intern for synchronization if you can reasonably guarantee that the string value is unique across your system. UUIDs are a good way to approach this. You can associate a UUID with your actual string key, either via a cache, a map, or maybe even by storing the UUID as a field on your entity object.
@Service
public class MySyncService {
    // ConcurrentHashMap so that concurrent callers agree on one UUID per email
    public final Map<String, String> lockMap = new ConcurrentHashMap<String, String>();
    public void syncMethod(String email) {
        String lock = lockMap.computeIfAbsent(email, k -> UUID.randomUUID().toString());
        synchronized (lock.intern()) {
            // do your sync code here
        }
    }
}
