Thread Safety: Maximum efficiency


Good Evening,
I am trying to understand how to use multi-threading and how to implement thread safety in this context.
To get maximum speed out of my threads, should I use:
public void addMarketOrder(MarketOrder marketOrder) {
if (marketOrder.id != this.id) {
return;
}
synchronized (this) {
ordered += marketOrder.ordered;
}
}
or just synchronize the entire method?
public synchronized void addMarketOrder(MarketOrder marketOrder) {
if (marketOrder.id != this.id) {
return;
}
ordered += marketOrder.ordered;
}

Assuming the ids do not change, the first case is preferable to the second. The first case avoids synchronization if the ids do not match. The second case synchronizes even if there isn't any write operation.

If you want an efficient multi-threaded system, then do not let threads communicate with each other. If the threads contend for the same data, you can get significant slowdowns even if you use fast alternatives like volatiles or Atomics.
I'm not sure which parts of your code need to be thread-safe. If it is only a matter of atomically increasing the counter, then making the 'ordered' field an AtomicLong and calling getAndAdd would be a reasonably fast solution that doesn't make use of any locks.
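As a rough sketch of the AtomicLong suggestion above: the class name OrderBook, the constructor, and the long field types are assumptions made for illustration, since the question does not show them. It also assumes the id never changes, so the field is declared final.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the MarketOrder type used in the question.
final class MarketOrder {
    final long id;
    final long ordered;

    MarketOrder(long id, long ordered) {
        this.id = id;
        this.ordered = ordered;
    }
}

// Hypothetical owner class; the question does not name it.
final class OrderBook {
    private final long id;                               // final, so safe to read without a lock
    private final AtomicLong ordered = new AtomicLong(); // starts at 0

    OrderBook(long id) {
        this.id = id;
    }

    void addMarketOrder(MarketOrder marketOrder) {
        if (marketOrder.id != this.id) {
            return;                                      // no shared mutable state touched
        }
        ordered.getAndAdd(marketOrder.ordered);          // atomic read-modify-write, no lock
    }

    long getOrdered() {
        return ordered.get();
    }
}
```

Whether this is actually faster than a synchronized block under heavy contention is something you would have to measure.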

What you are hinting towards is double check locking. The correct form is:
public void addMarketOrder(MarketOrder marketOrder) {
if (marketOrder.id != this.id) {
return;
}
synchronized (this) {
if (marketOrder.id == this.id) {
ordered += marketOrder.ordered;
}
}
}
The check is repeated inside the lock because you shouldn't assume that a condition that held before you acquired the lock still holds once you have it.
Also, if you read the id without synchronization, it should be volatile, because the compiler may optimize away memory reads under certain circumstances, so the value held in one thread could differ from the value in another. In addition, when a field is not volatile, the compiler can reorder operations on the assumption of a single thread, which can make your code misbehave when running with multiple threads.
Volatility rule: any variable that is accessed outside of a synchronized block by multiple threads MUST be final or volatile. Or you will get thread visibility problems in certain circumstances.
You can also synchronize the whole method without any problem. But synchronizing the whole method will make it less efficient (which could matter if you have a lot of processors and this is a highly contended method).

Related

Which one is a better singleton and why?

I was asked in an interview which of the following two ways of declaring a singleton class is better, with proper reasons. Can anybody please share some ideas?
public static Singleton getSingleInstance() {
if (singleInstance == null) {
synchronized (Singleton.class) {
if (singleInstance == null) {
singleInstance = new Singleton();
}
}
}
return singleInstance;
}
OR
public static synchronized Singleton getSingleInstance() {
if (singleInstance == null) {
singleInstance = new Singleton();
}
return singleInstance;
}

As this is an interview question I think their intention goes to #1, as it avoids synchronization for most of the time.

The answer depends on what is more important, performance or code legibility. For performance, the first is better [2], but for legibility, the second is better, as fewer lines of code are easier to understand and to prove free of bugs.
[2] Performance will depend on the speed of volatile reads compared to acquiring a monitor on the class defining the singleton method, assuming the method is called more than once. I don't have any quantitative proof, but I would assume the former is faster.
For what it's worth, the Apache commons LazyInitializer uses the first (http://commons.apache.org/proper/commons-lang/apidocs/src-html/org/apache/commons/lang3/concurrent/LazyInitializer.html):
@Override
public T get() throws ConcurrentException {
// use a temporary variable to reduce the number of reads of the
// volatile field
T result = object;
if (result == null) {
synchronized (this) {
result = object;
if (result == null) {
object = result = initialize();
}
}
}
return result;
}
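For comparison, here is a minimal sketch of the first variant from the question written in the same style. The Singleton class body is assumed, since the question only shows the accessor, and the important detail is that the field is declared volatile; without it the double-checked idiom is not safe under the Java memory model.

```java
// Hypothetical Singleton class; only getSingleInstance() appears in the question.
final class Singleton {
    // volatile is what makes the unsynchronized first check safe.
    private static volatile Singleton singleInstance;

    private Singleton() { }

    static Singleton getSingleInstance() {
        Singleton result = singleInstance;      // single volatile read on the fast path
        if (result == null) {
            synchronized (Singleton.class) {
                result = singleInstance;        // re-check while holding the lock
                if (result == null) {
                    singleInstance = result = new Singleton();
                }
            }
        }
        return result;
    }
}
```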

Depends entirely on your needs.
In this case -- outside of the problem the first has of not properly protecting the test (your if should be inside the synchronization, or you risk race conditions where someone else changes it after you've tested it but before you've finished acting on the result of the test), so you could wind up with two instances -- the two are probably equivalent after the JIT is done with them.
In general, however, the problem with synchronized methods tends to be that they're oversynchronized. Many of them hold onto their lock longer than absolutely necessary, and may cost you performance by forcing other threads to wait longer. Synchronization should be done for the minimum time necessary to make the operation atomic, and often that can be limited to a few key lines rather than the whole method.
On The Other Hand... sometimes it's best not to make the class threadsafe at all. Hashtables were threadsafe; HashMaps (the newer alternative) aren't. That's because getting the lock is a relatively expensive process, especially in multiprocessor environments, and in real-world code if threadsafety is needed it generally wants to cover more than the single hashmap access so locking at this level is more often wasteful than helpful.

Is it acceptable to always use 'this' as a monitor lock?

For example I have a class with 2 counters (in multi-threaded environment):
public class MyClass {
private int counter1;
private int counter2;
public synchronized void increment1() {
counter1++;
}
public synchronized void increment2() {
counter2++;
}
}
There are 2 increment operations that are not related to each other, but I use the same object as the lock (this).
Is it true that if clients simultaneously call the increment1() and increment2() methods, then the increment2() invocation will be blocked until increment1() releases the this monitor?
If it's true, does it mean that I need to provide different monitor locks for each operation (for performance reasons)?

Is it true that if clients simultaneously call the increment1() and increment2() methods, then the increment2() invocation will be blocked until increment1() releases the this monitor?
If they're called on the same instance, then yes.
If it's true, does it mean that I need to provide different monitor locks for each operation (for performance reasons)?
Only you can know that. We don't know your performance requirements. Is this actually a problem in your real code? Are your real operations long-lasting? Do they occur very frequently? Have you performed any diagnostics to estimate the impact of this? Have you profiled your application to find out how much time is being spent waiting for the monitor at all, let alone when it's unnecessary?
I would actually suggest not synchronizing on this for entirely different reasons. It's already hard enough to reason about threading when you do control everything - but when you don't know everything which can acquire a monitor, you're on a hiding to nothing. When you synchronize on this, it means that any other code which has a reference to your object can also synchronize on the same monitor. For example, a client could use:
synchronized (myClass) {
// Do something entirely different
}
This can lead to deadlocks, performance issues, all kinds of things.
If you use a private final field in your class instead, with an object created just to be a monitor, then you know that the only code acquiring that monitor will be your code.
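A minimal sketch of that idea, applied to the class from the question. The class name PrivateLockCounters and the getters are made up for illustration.

```java
// Hypothetical rewrite of MyClass that guards both counters with a private monitor.
final class PrivateLockCounters {
    private final Object lock = new Object();   // never exposed, so client code cannot lock on it
    private int counter1;
    private int counter2;

    void increment1() {
        synchronized (lock) {
            counter1++;
        }
    }

    void increment2() {
        synchronized (lock) {
            counter2++;
        }
    }

    int getCounter1() {
        synchronized (lock) {
            return counter1;
        }
    }

    int getCounter2() {
        synchronized (lock) {
            return counter2;
        }
    }
}
```

A client can still write synchronized (myObject) { ... }, but that no longer interacts with the monitor the class uses internally.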

1) yes it's true that increment1() blocks increment2() and vice versa because they both are implicitly synchronizing on this
2) if you need better performance, consider the lock-free java.util.concurrent.atomic.AtomicInteger class
private AtomicInteger counter1 = new AtomicInteger();
private AtomicInteger counter2 = new AtomicInteger();
public void increment1() {
counter1.getAndIncrement();
}
public void increment2() {
counter2.getAndIncrement();
}

If you synchronize on the method, as you did here, you lock the whole object, so two threads accessing different variables of the same object would block each other anyway.
If you want to synchronize only one counter at a time, so that two threads won't block each other while accessing different variables, you have to put the two counter updates in two synchronized blocks and use different objects as the "lock" of each block.
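Something along these lines, as a sketch only. The names SplitLockCounters, lock1, lock2 and the hammer() helper are invented for the example; hammer() is there just to exercise the class from two threads.

```java
// Hypothetical version of MyClass with one private lock object per counter.
final class SplitLockCounters {
    private final Object lock1 = new Object();  // guards counter1 only
    private final Object lock2 = new Object();  // guards counter2 only
    private int counter1;
    private int counter2;

    void increment1() {
        synchronized (lock1) {
            counter1++;
        }
    }

    void increment2() {
        synchronized (lock2) {
            counter2++;
        }
    }

    int getCounter1() {
        synchronized (lock1) {
            return counter1;
        }
    }

    int getCounter2() {
        synchronized (lock2) {
            return counter2;
        }
    }

    // Two threads each call both increments n times; returns the counters afterwards.
    static SplitLockCounters hammer(int n) {
        SplitLockCounters counters = new SplitLockCounters();
        Runnable work = () -> {
            for (int i = 0; i < n; i++) {
                counters.increment1();
                counters.increment2();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt for the caller
        }
        return counters;
    }
}
```

A thread inside increment1() no longer blocks a thread that wants increment2(), while each counter is still updated under its own lock.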

You are right, it will be a performance bottleneck if you use the same object. You can use a different lock for each counter, or use java.util.concurrent.atomic.AtomicInteger for a concurrent counter.
Like:
public class Counter {
private AtomicInteger count = new AtomicInteger(0);
public void incrementCount() {
count.incrementAndGet();
}
public int getCount() {
return count.get();
}
}

Yes the given code is identical to the following:
public void increment1() {
synchronized(this) {
counter1++;
}
}
public void increment2() {
synchronized(this) {
counter2++;
}
}
which means that only one method can be executed at the same time. You should either provide different locks (and locking on this is a bad idea to begin with), or some other solution. The second one is the one you actually want here: AtomicInteger

Yes, if multiple threads try to call methods on your object they will wait to get the lock (although the order in which they get the lock isn't guaranteed). As with everything, there is no reason to optimise until you know this is the bottleneck in your code.

If you need the performance benefits that can be had from being able to call both operations in parallel, then yes, you do need to provide different monitor objects for the different operations.
However, there is something to be said for premature optimization and that you should make sure that you need it before making your program more complex to accommodate it.

volatile synchronized combination for performance

When synchronization is used, there is a performance impact. Can volatile be used in combination with synchronized to reduce the performance overhead? For example, an instance of Counter will be shared among many threads, and each thread can access Counter's public methods. In the code below, volatile is used for the getter and synchronized is used for the setter.
public class Counter
{
private volatile int count;
public Counter()
{
count = 0;
}
public int getCount()
{
return count;
}
public synchronized void increment()
{
++count;
}
}
Please let me know in which scenarios this might break.

Yes, you definitely can. In fact, if you look at the source code of AtomicInteger, it's essentially what they do. AtomicInteger.get simply returns value, which is a volatile int. The only real difference between what you've done and what they do is that they use a CAS for the increment instead of synchronization. On modern hardware, a CAS can eliminate any mutual exclusion; on older hardware, the JVM will put some sort of mutex around the increment.
Volatile reads are about as fast as non-volatile ones, so the reads will be quite fast.
Not only that, but volatile fields are guaranteed not to tear: see JLS 17.7, which specifies that volatile longs and doubles are not subject to word tearing. So your code would work with a long just as well as an int.
As Diego Frehner points out, you might not see the result of an increment if you get the value "right as" the increment happens -- you'll either see the before or the after. Of course, if get were synchronized you'd have exactly the same behavior from the read thread -- you'd either see the before-increment or post-increment value. So it's really the same either way. In other words, it doesn't make sense to say that you won't see the value as it's happening -- unless you meant word tearing, which (a) you won't get and (b) you would never want.

1. I have personally used this mechanism of volatile combined with synchronized.
2. You can use synchronized alone and you will always get a consistent result, but using volatile alone will not always yield the same result.
3. This is because the volatile keyword is not a synchronization primitive. It merely prevents caching of the value in the thread; it does not prevent two threads from modifying the same value and writing it back concurrently.
4. volatile gives threads concurrent access without a lock, whereas synchronized allows only one thread at a time to enter this and all the other synchronized methods of the same instance.
5. Using both volatile and synchronized combines the two:
volatile - makes changed values visible to other threads and prevents caching,
synchronized - makes sure that only one thread at a time gets access to the synchronized methods of the class.
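To make the division of labour concrete, here is a small sketch of the counter from the question together with a helper that increments it from several threads. The class name VolatileReadCounter and the incrementFromThreads() helper are invented for this example.

```java
// Hypothetical variant of the Counter from the question.
final class VolatileReadCounter {
    private volatile int count;          // volatile: getCount() sees the latest write without a lock

    int getCount() {
        return count;
    }

    synchronized void increment() {      // synchronized: makes the read-modify-write atomic
        ++count;
    }

    // Starts the given number of threads, each incrementing perThread times,
    // waits for all of them, and returns the final count.
    static int incrementFromThreads(int threads, int perThread) {
        VolatileReadCounter counter = new VolatileReadCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // preserve the interrupt for the caller
            }
        }
        return counter.getCount();
    }
}
```

If you drop synchronized from increment() and keep only volatile, increments can be lost, because ++count is still a separate read and write.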

You will not always get the most up-to-date count when calling getCount(). An AtomicInteger could be appropriate for you.

There wouldn't be a performance gain from using both. Volatile guarantees that the value of a variable will be consistent when reading/writing to the variable across threads executing in parallel by preventing caching. Synchronized, when applied to a method (as you do in your example), only allows a single thread to enter that method at a time and blocks others until execution is complete.

Java: How exactly do synchronized operations relate to volatility?

Sorry this is such a long question.
I've been doing lots of research lately into multi-threading as I slowly implement it in a personal project. However, probably due to an abundance of slightly incorrect examples, the use of synchronized blocks and volatility in certain situations is still a bit unclear to me.
My core question is this: Are changes to references and primitives automatically volatile (that is, performed on the main memory and not a cache) when a thread is inside a synchronized block, or does the read also have to be synchronized for it to work properly?
If so: What is the purpose of synchronizing a simple getter method? (see Example 1) Also, are ALL changes sent to main memory as long as the thread has synchronized on anything? E.g. if it is sent off to do loads of work all over the place inside a very high-level sync, will every single change it then makes go to main memory, and nothing ever to cache, until it unlocks again?
If not: Does the change have to be explicitly inside a synchronized block, or can Java actually pick up on, for example, uses of the Lock object? (see Example 3)
In either case: Does the synchronized object need to be related to the reference/primitive being changed in any way (e.g. the immediate object that contains it)? Can I write by syncing on one object and read with another if it's otherwise safe? (see Example 2)
(please note for the following examples that I know that synchronized methods and synchronized(this) are frowned upon and why, but discussion about that is beyond the scope of my question)
Example 1:
class Counter{
int count = 0;
public synchronized void increment(){
count++;
}
public int getCount(){
return count;
}
}
In this example, increment() needs to be synchronized since ++ is not an atomic operation. As such, two threads incrementing at the same time may result in an overall increase of only 1 to the count. Reads and writes of the count primitive need to be atomic (e.g. not a non-volatile long or double), and it is an int, so that's fine.
Does getCount() need to be synchronized here, and why exactly? The explanation I have heard the most is that I will have no guarantee whether the count returned will be the pre- or post-increment value. However, this seems like the explanation for something slightly different that's found itself in the wrong place. I mean, if I were to synchronize getCount(), then I still see no guarantee: it's now down to not knowing the locking order, instead of not knowing whether the actual read happens before or after the actual write.
Example 2:
Is the following example thread-safe, if you assume that, through trickery not shown here, none of these methods will ever be called at the same time? Will count increment in the expected way if it's done using a random method each time, and then be read properly, or does the lock have to be the same object? (By the way, I fully realise how ridiculous this example is, but I'm more interested in theory than practice.)
class Counter{
private final Object lock1 = new Object();
private final Object lock2 = new Object();
private final Object lock3 = new Object();
int count = 0;
public void increment1(){
synchronized(lock1){
count++;
}
}
public void increment2(){
synchronized(lock2){
count++;
}
}
public int getCount(){
synchronized(lock3){
return count;
}
}
}
Example 3:
Is the happens-before relationship simply a Java concept, or is it an actual thing built into the JVM? Even though I can guarantee a conceptual happens-before relationship for this next example, is Java smart enough to pick it up if it's a built-in thing? I am assuming it is not, but is this example actually thread-safe? If it's thread-safe, what about if getCount() did no locking?
class Counter{
private final Lock lock = new ReentrantLock(); // Lock is an interface; java.util.concurrent.locks.ReentrantLock is one implementation
int count = 0;
public void increment(){
lock.lock();
count++;
lock.unlock();
}
public int getCount(){
lock.lock();
int count = this.count;
lock.unlock();
return count;
}
}

Yes, the read has to be synchronized as well. This page says:
The results of a write by one thread are guaranteed to be visible to a
read by another thread only if the write operation happens-before the
read operation.
[...]
An unlock (synchronized block or method exit) of a monitor
happens-before every subsequent lock (synchronized block or method
entry) of that same monitor
The same page says:
Actions prior to "releasing" synchronizer methods such as Lock.unlock,
Semaphore.release, and CountDownLatch.countDown happen-before actions
subsequent to a successful "acquiring" method such as Lock.lock
So locks offer the same visibility guarantees as synchronized blocks.
Whether you use synchronized blocks or locks, the visibility is only guaranteed if the reader thread uses the same monitor or lock as the writer thread.
Your Example 1 is incorrect: the getter must be synchronized as well if you want to see the latest value of the count.
Your example 2 is incorrect because it uses different locks to guard the same count.
Your example 3 is OK. If the getter did not lock, you could see an older value of the count. The happens-before is something that is guaranteed by the JVM. The JVM has to respect the rules specified, by flushing caches to the main memory for example.
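For reference, a sketch of how Example 3 is usually written so that the lock is released even if the guarded code throws. The class name LockedCounter is made up, and ReentrantLock is just one possible Lock implementation; the point is that the reader and the writer go through the same lock.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical variant of Example 3: writer and reader share one lock.
final class LockedCounter {
    private final Lock lock = new ReentrantLock();
    private int count;

    void increment() {
        lock.lock();
        try {
            count++;
        } finally {
            lock.unlock();   // the unlock happens-before any later lock() on the same lock
        }
    }

    int getCount() {
        lock.lock();         // acquiring the same lock makes earlier increments visible
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }
}
```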

Try to view it in terms of two distinct, simple operations:
Locking (mutual exclusion),
Memory barrier (cache sync, instruction reordering barrier).
Entering a synchronized block entails both locking and a memory barrier; leaving the synchronized block entails unlocking and a memory barrier; reading or writing a volatile field entails a memory barrier only. Thinking in these terms, I think you can clarify for yourself all the questions above.
As for Example 1, the reading thread will not have any kind of memory barrier. It's not just between seeing the value before/after read, it's about never observing any change to the var after a thread is started.
Example 2. is the most interesting issue you raise. You are indeed given no guarantees by the JLS in this case. In practice you won't be given any ordering guarantees (it's as if the locking aspect wasn't there at all), but you'll still have the benefit of the memory barriers so you will observe changes, unlike the first example. Basically, this is exactly the same as removing synchronized and tagging the int as volatile (apart from the runtime costs of acquiring locks).
Regarding Example 3, by "just a Java thing" I feel you have generics with erasure in mind, something that only the static code checking is aware of. This is not like that -- both locks and memory barriers are pure runtime artifacts. In fact, the compiler can't reason about them at all.

Questions on Concurrency from Java Guide

So I've been reading up on concurrency and have some questions along the way (guide I followed - though I'm not sure if it's the best source):
1. Processes vs. threads: Is the difference basically that a process is the program as a whole while a thread can be a (small) part of a program?
2. I am not exactly sure why there is both an interrupted() method and an InterruptedException. Why should the interrupted() method even be used? It just seems to me that Java adds an extra layer of indirection.
3. For synchronization (and specifically the example in that link), how does adding the synchronized keyword even fix the problem? I mean, if Thread A gives back its incremented c and Thread B gives back the decremented c and stores it to some other variable, I am not exactly sure how the problem is solved. This may be answering my own question, but is it supposed to be assumed that each thread terminates after it returns an answer? And if that is the case, why would adding synchronized make a difference?
4. I read (in some random PDF) that if you call start() on two threads one after the other, you cannot guarantee that the first thread will run before the second. How would you guarantee it, though?
5. In synchronized statements, I am not completely sure what the point of adding synchronized within the method is. What is wrong with leaving it out? Is it because one expects both to mutate separately, but to be obtained together? Why not just have the two non-synchronized?
6. Is volatile just a keyword for variables, and is it synonymous with synchronized?
7. In the deadlock problem, how does synchronized even help the situation? What makes this situation different from starting two threads that change a variable?
8. Moreover, where is the "wait"/lock for the other person to bowBack? I would have thought that bow() was blocked, not bowBack().
I'll stop here because I think if I went any further without these questions answered, I will not be able to understand the later lessons.

Answers:
1. Yes, a process is an operating system process that has an address space, a thread is a unit of execution, and there can be multiple units of execution in a process.
2. The interrupt() method and InterruptedException are generally used to wake up threads that are waiting, either to have them do something or to terminate.
3. Synchronizing is a form of mutual exclusion or locking, something very standard and required in computer programming. Google these terms and read up on them and you will have your answer.
4. True, this cannot be guaranteed. You would need some mechanism involving synchronization that the threads use to make sure they run in the desired order. This would be specific to the code in the threads.
5. See the answer to #3.
6. Volatile is a way to make sure that a particular variable can be properly shared between different threads. It is necessary on multi-processor machines (which almost everyone has these days) to make sure the value of the variable is consistent between the processors. It is effectively a way to synchronize a single value.
7. Read about deadlocking in more general terms to understand this. Once you understand mutual exclusion and locking you will be able to understand how deadlocks can happen.
8. I have not read the materials that you read, so I don't understand this one. Sorry.
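To illustrate the point about ordering two threads, here is one possible mechanism, sketched with a CountDownLatch from java.util.concurrent. This is only one option (Thread.join() would also work here), and the class name OrderedStart and the "first"/"second" log entries are invented for the example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

final class OrderedStart {
    // Runs two hypothetical tasks and returns the order in which they did their work.
    static List<String> runInOrder() {
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch firstDone = new CountDownLatch(1);

        Thread first = new Thread(() -> {
            log.add("first");
            firstDone.countDown();          // signal that the first task has finished
        });
        Thread second = new Thread(() -> {
            try {
                firstDone.await();          // block until the first task signals
                log.add("second");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        second.start();                     // deliberately started first; start order no longer matters
        first.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log;
    }
}
```

The order of the start() calls does not decide anything here; the latch does.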

I find that the examples used to explain synchronization and volatility are contrived and difficult to understand the purpose of. Here are my preferred examples:
Synchronized:
private Value value;
public void setValue(Value v) {
value = v;
}
public void doSomething() {
if(value != null) {
doFirstThing();
int val = value.getInt(); // Will throw NullPointerException if another
// thread calls setValue(null);
doSecondThing(val);
}
}
The above code is perfectly correct if run in a single-threaded environment. However with even 2 threads there is the possibility that value will be changed in between the check and when it is used. This is because the method doSomething() is not atomic.
To address this, use synchronization:
private Value value;
private final Object lock = new Object(); // final, so the monitor object can never be swapped out
public void setValue(Value v) {
synchronized(lock) {
value = v;
}
}
public void doSomething() {
synchronized(lock) { // Prevents setValue being called by another thread.
if(value != null) {
doFirstThing();
int val = value.getInt(); // Cannot throw NullPointerException.
doSecondThing(val);
}
}
}
Volatile:
private boolean running = true;
// Called by Thread 1.
public void run() {
while(running) {
doSomething();
}
}
// Called by Thread 2.
public void stop() {
running = false;
}
To explain this requires knowledge of the Java Memory Model. It is worth reading about in depth, but the short version for this example is that Threads have their own copies of variables which are only sync'd to main memory on a synchronized block and when a volatile variable is reached. The Java compiler (specifically the JIT) is allowed to optimise the code into this:
public void run() {
while(true) { // Will never end
doSomething();
}
}
To prevent this optimisation you can set a variable to be volatile, which forces the thread to access main memory every time it reads the variable. Note that this is unnecessary if you are using synchronized statements as both keywords cause a sync to main memory.
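A sketch of the corrected loop, assuming the flag is the only state shared between the two threads. The class name Worker, the iterations counter standing in for doSomething(), and the startAndStop() helper are all made up for the example.

```java
// Hypothetical worker whose loop can be stopped from another thread.
final class Worker implements Runnable {
    private volatile boolean running = true;  // volatile: the loop must re-read the flag every iteration
    private long iterations;                  // only touched by the worker thread itself

    @Override
    public void run() {
        while (running) {
            iterations++;                     // stand-in for doSomething()
        }
    }

    void stop() {
        running = false;                      // becomes visible to the worker thread
    }

    // Starts a worker, asks it to stop, and reports whether its thread ended within the timeout.
    static boolean startAndStop(long timeoutMillis) {
        Worker worker = new Worker();
        Thread thread = new Thread(worker);
        thread.start();
        worker.stop();
        try {
            thread.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }
}
```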
I haven't addressed your questions directly as Francis did so. I hope these examples can give you an idea of the concepts in a better way than the examples you saw in the Oracle tutorial.
