Java Thread Synchronization, best concurrent utility, read operation

Java Thread Synchronization, best concurrent utility, read operation - java

I have a java threads related question.
To take a very simple example, lets say I have 2 threads.
Thread A running StockReader Class instance
Thread B running StockAvgDataCollector Class instance
In Thread B, StockAvgDataCollector collects some market Data continuously, does some heavy averaging/manipulation and updates a member variable spAvgData
In Thread A StockReader has access to StockAvgDataCollector instance and its member spAvgData using getspAvgData() method.
So Thread A does READ operation only and Thread B does READ/WRITE operations.
Questions
Now, do I need synchronization or atomic functionality or locking or any concurrency related stuff in this scenario? It doesnt matter if Thread A reads an older value.
Since Thread A is only going READ and not update anything and only Thread B does any WRITE operations, will there be any deadlock scenarios?
I've pasted a paragraph below from the following link. From that paragraph, it seems like I do need to worry about some sort of locking/synchronizing.
http://java.sun.com/developer/technicalArticles/J2SE/concurrency/
Reader/Writer Locks
When using a thread to read data from an object, you do not necessarily need to prevent another thread from reading data at the same time. So long as the threads are only reading and not changing data, there is no reason why they cannot read in parallel. The J2SE 5.0 java.util.concurrent.locks package provides classes that implement this type of locking. The ReadWriteLock interface maintains a pair of associated locks, one for read-only and one for writing. The readLock() may be held simultaneously by multiple reader threads, so long as there are no writers. The writeLock() is exclusive. While in theory, it is clear that the use of reader/writer locks to increase concurrency leads to performance improvements over using a mutual exclusion lock. However, this performance improvement will only be fully realized on a multi-processor and the frequency that the data is read compared to being modified as well as the duration of the read and write operations.
Which concurrent utility would be less expensive and suitable in my example?
java.util.concurrent.atomic ?
java.util.concurrent.locks ?
java.util.concurrent.ConcurrentLinkedQueue ? - In this case StockAvgDataCollector will add and StockReader will remove. No getspAvgData() method will be exposed.
Thanks
Amit

Well, the whole ReadWriteLock thing really makes sense when you have many readers and at least one writer... So you guarantee liveliness (you won't be blocking any reader threads if no one other thread is writing). However, you have only two threads.
If you don't mind thread B reading an old (but not corrupted) value of spAvgData, then I would go for an AtomicDouble (or AtomicReference, depending on what spAvgData's datatype).
So the code would look like this
public class A extends Thread {
// spAvgData
private final AtomicDouble spAvgData = new AtomicDouble(someDefaultValue);
public void run() {
while (compute) {
// do intensive work
// ...
// done with work, update spAvgData
spAvgData.set(resultOfComputation);
}
}
public double getSpAvgData() {
return spAvgData.get();
}
}
// --------------
public class B {
public void someMethod() {
A a = new A();
// after A being created, spAvgData contains a valid value (at least the default)
a.start();
while(read) {
// loll around
a.getSpAvgData();
}
}
}

Yes, synchronization is important and you need to consider two parameters: visibility of the spAvgData variable and atomicity of its update. In order to guarantee visibility of the spAvgData variable in thread B by thread A, the variable can be declared volatile or as an AtomicReference. Also you need to guard that the action of the update is atomic in case there are more invariants involved or the update action is a compound action, using synchronization and locking. If only thread B is updating that variable then you don't need synchronization and visibility should be enough for thread A to read the most up-to-date value of the variable.

If you don't mind that Thread A can read complete nonsense (including partially updated data) then no, you don't need any synchronisation. However, I suspect that you should mind.
If you just use a single mutex, or ReentrantReadWriteLock and don't suspend or sleep without timeout while holding locks then there will be no deadlock. If you do perform unsafe thread operations, or try to roll your own synchronisation solution, then you will need to worry about it.
If you use a blocking queue then you will also need a constantly-running ingestion loop in StockReader. ReadWriteLock is still of benefit on a single core processor - the issues are the same whether the threads are physically running at the same time, or just interleaved by context switches.
If you don't use at least some form of synchronisation (e.g. a volatile) then your reader may never see any change at all.

Related

Why do I need to use synchronized for multiple threads over volatile?

Some people says if multiple threads are reading/writing then you need to use synchronized and if one thread is reading/writing and another one is only reading then you must use volatile. I don't get the difference between this situations.
Basically, the value of a volatile field becomes visible to all readers (other threads in particular) after a write operation completes on it.
Then If I define a variable as volatile, first threadA will read its value, threadA will update its value and write it to memory.After that variable will become visible to threadB. Then why do I need to synchronized block?

Some people says if multiple threads are reading/writing then you need to use synchronized and if one thread is reading/writing and another one is only reading then you must use volatile. I don't get the difference between this situations.
There really isn't a hard and fast rule with this. Choosing whether or not to use synchronized or volatile has more to do with how the objects are being updated as opposed to how many readers or writers there are.
For example, you can achieve multiple readers and writers with an AtomicLong which wraps a volatile long.
private AtomicLong counter = new AtomicLong();
...
// many threads can get/set this counter without synchronized
counter.incrementAndGet();
And there are circumstances where you would need a synchronized block even with a single reader/writer.
synchronized (status) {
status.setNumTransactions(dao.getNumTransactions());
// we don't want the reader thread to see `status` partially updated here
status.setTotalMoney(dao.getTotalMoney());
}
In the above example, since we are making multiple calls to update the status object we may need to ensure that other threads don't see it when the num-transactions has been updated but not the total-money. Yes, AtomicReference handles some of these cases but not all.
To be clear, marking a field volatile ensures memory synchronization. When you read a volatile field you cross a read memory barrier and when you write it you cross a write memory barrier. A synchronized block has a read memory barrier at the start and a write barrier at the end of the block and is has mutex locking to ensure only one thread can enter the block at once.
Sometimes you just need memory barriers to achieve proper sharing of data between threads and sometimes you need locking.

As comments suggest, you might do some further reading. But to give you an idea you can take a look at this stackoverflow question and think for example about the following scenario:
You have couple of variables which need to be in the right state. But although you make them all volatile you need time to update them by some code executing.
Exactly this code may be executed almost at the same time by a different thread. The first variables could be "OK" and somehow synchronized but some other maybe dependent on the first ones and are not correct yet. Thus you need a synchronized block in that case.
To add one more post for further reading about volatile look here

The primary difference between volatile and synchronized is that volatile only guarantees visibility whereas synchronized guarantees both visibility and locking.
If there are multiple read threads and one write thread then volatile usage can ensure that changes by the write thread to the volatile variable are visible to other threads immediately. But you see in this case locking isn't a problem because you only have 1 writing thread.
There are certain rules of thumb for a volatile:
Don't use volatile when its value depends on its previous value
Don't use volatile when it participates in interactions with other invariants
Don't use volatile when there are multiple write threads that update value of volatile variable.
In general, use of volatile should be limited to only those cases where it's relatively easy to reason about its state such as in the case of status flags.
In all other cases where you have shared mutable state always use synchronized wherever shared mutable state is being touched unless declared final and modified only in the constructor without unsafe publication. Volatile is a replacement for synchronized only in special cases as described in my 3 points.

Does explicit lock automatically provide memory visibility?

Sample code:
class Sample{
private int v;
public void setV(){
Lock a=new Lock();
a.lock();
try{
v=1;
}finally{
a.unlock();
}
}
public int getV(){
return v;
}
}
If I have a thread constantly invoke getV and I just do setV once in another thread, Is that reading thread guaranteed to see the new value right after writing? Or do I need to make "V" volatile or AtomicReference?
If the answer is no, then should I change it into:
class Sample{
private int v;
private Lock a=new Lock();
public void setV(){
a.lock();
try{
v=1;
}finally{
a.unlock();
}
}
public int getV(){
a.lock();
try{
int r=v;
}finally{
a.unlock();
}
return r;
}
}

From the documentation:
All Lock implementations must enforce the same memory synchronization semantics as provided by the built-in monitor lock:
A successful lock operation acts like a successful monitorEnter action
A successful unlock operation acts like a successful monitorExit action
If you use Lock in both threads (i.e. the reading and the writing ones), the reading thread will see the new value, because monitorEnter flushes the cache. Otherwise, you need to declare the variable volatile to force a read from memory in the reading thread.

As per Brian's Law...
If you are writing a variable that might next be read by another
thread, or reading a variable that might have last been written by
another thread, you must use synchronization, and further, both the
reader and the writer must synchronize using the same monitor lock.
So it would be appropriate to synchronize both the setter and getter......
Or
Use AtomicInteger.incrementAndGet() instead if you want to avoid the lock-unlock block (ie. synchronized block)

If I have a thread constantly invoke getV and I just do setV once in
another thread, Is that reading thread guaranteed to see the new value
right after writing?
NO, the reading thread may just read its own copy (cached automatically by the CPU Core which the reading thread is running on) of V's value, and thus not get the latest value.
Or do I need to make "V" volatile or AtomicReference?
YES, they both works.
Making V volatile simply stop CPU Core from caching V's value, i.e. every read/write operation to variable V must access the main memory, which is slower (about 100x times slower than read from L1 Cache, see interaction_latency for details)
Using V = new AtomicInteger() works because AtomicInteger use a private volatile int value; internally to provide visiblity.
And, it also works if you use lock (Lock object, synchronized block or method; they all works) on reading and writing thread (as your second code segment does), because (according to the Second Edition of The Java ® Virtual Machine Specification section 8.9)
...Locking any lock conceptually flushes all variables from a thread's
working memory, and unlocking any lock forces the writing out to main
memory of all variables that the thread has assigned...
...If a thread uses a particular shared variable only after locking a
particular lock and before the corresponding unlocking of that same
lock, then the thread will read the shared value of that variable from
main memory after the lock operation, if necessary, and will copy back
to main memory the value most recently assigned to that variable
before the unlock operation. This, in conjunction with the mutual
exclusion rules for locks, suffices to guarantee that values are
correctly transmitted from one thread to another through shared
variables...
P.S. the AtomicXXX classes also provide CAS (Compare And Swap) operations which is useful for mutlthread access.
P.P.S. The jvm specification on this topic has not changed since Java 6, so they are not included in jvm specification for java 7, 8, and 9.
P.P.P.S. According to this article, CPU caches are always coherent, whatever from each core's view. The situation in your question is caused by the 'Memory Ordering Buffers', in which the store & load instructions (which are used to write and read data from memory, accordingly) could be re-ordered for performance. In detail, the buffer allows a load instruction to get ahead of an older store instruction, which exactly cause the problem (getV() is put ahead so it read the value before you change it in the other thread). However, in my opinion, this is more difficult to understand, so "cache for different core" (as JVM specs did) could be a better conceptual model.

You should make volatile or an AtomicInteger. That will insure that the reading thread will eventually see the change, and close enough to "right after" for most purposes. And technically you don't need the Lock for a simple atomic update like this. Take a close look at AtomicInteger's API. set(), compareAndSet(), etc... will all set the value to be visible by reading threads atomically.

Explicit locks, synchronized, atomic reference and volatile, all provide memory visibility. Lock and synchronized do so for code block they surround and atomic reference and volatile to the particular variable declared so. However for the visibility to work correctly, both reading and writing methods should be protected by the same locking object.
It will not work in your case because your getter method is not projected by the lock which protects the setter method. If you make the change it will work as required. Also just declaring the variable as volatile or AtomicInteger or AtomicReference<Integer> will work too.

Implementing a Mutex in Java

I have a multi-threaded application (a web app in Tomcat to be exact). In it there is a class that almost every thread will have its own instance of. In that class there is a section of code in one method that only ONE thread (user) can execute at a time. My research has led me to believe that what I need here is a mutex (which is a semaphore with a count of 1, it would seem).
So, after a bit more research, I think what I should do is the following. Of importance is to note that my lock Object is static.
Am I doing it correctly?
public Class MyClass {
private static Object lock = new Object();
public void myMethod() {
// Stuff that multiple threads can execute simultaneously.
synchronized(MyClass.lock) {
// Stuff that only one thread may execute at a time.
}
}
}

In your code, myMethod may be executed in any thread, but only in one at a time. That means that there can never be two threads executing this method at the same time. I think that's what you want - so: Yes.

Typically, the multithreading problem comes from mutability - where two or more threads are accessing the same data structure and one or more of them modifies it.
The first instinct is to control the access order using locking, as you've suggested - however you can quickly run into lock contention where your application looses a lot of processing time to context switching as your threads are parked on lock monitors.
You can get rid of most of the problem by moving to immutable data structures - so you return a new object from the setters, rather than modifying the existing one, as well as utilising concurrent collections, such a ConcurrentHashMap / CopyOnWriteArrayList.
Concurrent programming is something you'll need to get your head around, especially as throughput comes from parallelisation in todays modern computing world.

This will allow one thread at a time through the block. Other thread will wait, but no queue as such, there is no guarantee that threads will get the lock in a fair manner. In fact with Biased lock, its unlikely to be fair. ;)
Your lock should be final If there is any reason it can't its probably a bug. BTW: You might be able to use synchronized(MyClass.class) instead.

Some questions on java multithreading,

I have a set of questions regarding Java multithreading issues. Please provide me with as much help as you can.
0) Assume we have 2 banking accounts and we need to transfer money between them in a thread-safe way.
i.e.
accountA.money += transferSum;
accountB.money -= transferSum;
Two requirements exist:
no one should be able to see the intermediate results of the operation (i.e. one acount sum is increased, but others is not yet decreased)
reading access should not be blocked during the operation (i.e. old values of account sums should be shown during the operation goes on)
Can you suggest some ideas on this?
1) Assume 2 threads modify some class field via synchronized method or utilizing an explicit lock. Regardless of synchronization, there are no guarantee that this field will be visible to threads, that read it via NOT synchronized method. - is it correct?
2) How long a thread that is awoken by notify method can wait for a lock? Assume we have a code like this:
synchronized(lock) {
lock.notifyall();
//do some very-very long activity
lock.wait() //or the end of synchronized block
}
Can we state that at least one thread will succeed and grab the lock? Can a signal be lost due to some timeout?
3) A quotation from Java Concurrency Book:
"Single-threaded executors also provide sufficient internal synchronization to guarantee that any memory writes made by tasks are visible to subsequent tasks; this means that objects can be safely confined to the "task thread" even though that thread may be replaced with another from time to time."
Does this mean that the only thread-safety issue that remains for a code being executed in single-threaded executor is data race and we can abandon the volatile variables and overlook all visibility issues? It looks like a universal way to solve a great part of concurrency issues.
4) All standard getters and setters are atomic. They need not to be synchronized if the field is marked as volatile. - is it correct?
5) The initiation of static fields and static blocks is accomplished by one thread and thus need not to be synchronized. - is it correct?
6) Why a thread needs to notify others if it leaves the lock with wait() method, but does not need to do this if it leaves the lock by exiting the synchronized block?

0: You can't.
Assuring an atomic update is easy: you synchronize on whatever object holds the bank accounts. But then you either block all readers (because they synchronize as well), or you can't guarantee what the reader will see.
BUT, in a large-scale system such as a banking system, locking on frequently-accessed objects is a bad idea, as it introduces waits into the system. In the specific case of changing two values, this might not be an issue: it will happen so fast that most accesses will be uncontended.
There are certainly ways to avoid such race conditions. Databases do a pretty good job for ba nk accounts (although ultimately they rely on contended access to the end of a transaction).
1) To the best of my knowledge, there are no guarantees other than those established by synchronized or volatile. If one thread makes a synchronized access and one thread does not, the unsynchronized access does not have a memory barrier. (if I'm wrong, I'm sure that I'll be corrected or at least downvoted)
2) To quote that JavaDoc: "The awakened threads will not be able to proceed until the current thread relinquishes the lock on this object." If you decide to throw a sleep into that synchronized block, you'll be unhappy.
3) I'd have to read that quote several times to be sure, but I believe that "single-threaded executor" is the key phrase. If the executor is running only a single thread, then there is a strict happens-before relationship for all operations on that thread. It does not mean that other threads, running in other executors, can ignore synchronization.
4) No. long and double are not atomic (see the JVM spec). Use an AtomicXXX object if you want unsynchronized access to member variables.
5) No. I couldn't find an exact reference in the JVM spec, but section 2.17.5 implies that multiple threads may initialize classes.
6) Because all threads wait until one thread does a notify. If you're in a synchronized block, and leave it with a wait and no notify, every thread will be waiting for a notification that will never happen.

0) Is a difficult problem because you don't want intermediate results to be visible or to lock readers during the operation. To be honest I'm not sure it's possible at all, in order to ensure no thread sees intermediate results you need to block readers while doing both writes.
If you dont want intermediate results visible then you have to lock both back accounts before doing your writing. The best way to do this is to make sure you get and release the locks in the same order each time (otherwise you get a deadlock). E.G. get the lock on the lower account number first and then the greater.
1) Correct, all access must be via a lock/synchronized or use volatile.
2) Forever
3) Using a Single Threaded Executor means that as long as all access is doen by tasks run by that executor you dont need to worry about thread safety/visibilty.
4) Not sure what you mean by standard getters and setters but writes to most variable types (except double and long) are atomic and so don't need sync, just volatile for visibility. Try using the Atomic variants instead.
5) No, it is possible for two threads to try an init some static code, making naive implementations of Singleton unsafe.
6) Sync and Wait/Notify are two different but related mechanisms. Without wait/notify you'd have to spin lock (i.e. keep getting a lock and polling )on a object to get updates

5) The initiation of static fields and static blocks is accomplished by one thread and thus need not to be synchronized. - is it correct?
VM executes static initialization in a synchronized(clazz) block.
static class Foo {
static {
assert Thread.holdsLock(Foo.class); // true
synchronized(Foo.class){ // redundant, already under the lock
....

0) The only way I can see to do this to to store accountA and accountB in an object stored in an AtomicReference. You then make a copy of the object, modify it, and update the reference if it is still the same as the original reference.
AtomicReference<Accounts> accountRef;
Accounts origRef;
Accounts newRef;
do {
origRef = accountRef.get();
// make a deep copy of origRef
newRef.accountA.money += transferSum;
newRef.accountB.money -= transferSum;
} while(accountRef.compareAndSet(origRef, newRef);

Are static variables shared between threads?

My teacher in an upper level Java class on threading said something that I wasn't sure of.
He stated that the following code would not necessarily update the ready variable. According to him, the two threads don't necessarily share the static variable, specifically in the case when each thread (main thread versus ReaderThread) is running on its own processor and therefore doesn't share the same registers/cache/etc and one CPU won't update the other.
Essentially, he said it is possible that ready is updated in the main thread, but NOT in the ReaderThread, so that ReaderThread will loop infinitely.
He also claimed it was possible for the program to print 0 or 42. I understand how 42 could be printed, but not 0. He mentioned this would be the case when the number variable is set to the default value.
I thought perhaps it is not guaranteed that the static variable is updated between the threads, but this strikes me as very odd for Java. Does making ready volatile correct this problem?
He showed this code:
public class NoVisibility {
private static boolean ready;
private static int number;
private static class ReaderThread extends Thread {
public void run() {
while (!ready) Thread.yield();
System.out.println(number);
}
}
public static void main(String[] args) {
new ReaderThread().start();
number = 42;
ready = true;
}
}

There isn't anything special about static variables when it comes to visibility. If they are accessible any thread can get at them, so you're more likely to see concurrency problems because they're more exposed.
There is a visibility issue imposed by the JVM's memory model. Here's an article talking about the memory model and how writes become visible to threads. You can't count on changes one thread makes becoming visible to other threads in a timely manner (actually the JVM has no obligation to make those changes visible to you at all, in any time frame), unless you establish a happens-before relationship.
Here's a quote from that link (supplied in the comment by Jed Wesley-Smith):
Chapter 17 of the Java Language Specification defines the happens-before relation on memory operations such as reads and writes of shared variables. The results of a write by one thread are guaranteed to be visible to a read by another thread only if the write operation happens-before the read operation. The synchronized and volatile constructs, as well as the Thread.start() and Thread.join() methods, can form happens-before relationships. In particular:
Each action in a thread happens-before every action in that thread that comes later in the program's order.
An unlock (synchronized block or method exit) of a monitor happens-before every subsequent lock (synchronized block or method entry) of that same monitor. And because the happens-before relation is transitive, all actions of a thread prior to unlocking happen-before all actions subsequent to any thread locking that monitor.
A write to a volatile field happens-before every subsequent read of that same field. Writes and reads of volatile fields have similar memory consistency effects as entering and exiting monitors, but do not entail mutual exclusion locking.
A call to start on a thread happens-before any action in the started thread.
All actions in a thread happen-before any other thread successfully returns from a join on that thread.

He was talking about visibility and not to be taken too literally.
Static variables are indeed shared between threads, but the changes made in one thread may not be visible to another thread immediately, making it seem like there are two copies of the variable.
This article presents a view that is consistent with how he presented the info:
http://jeremymanson.blogspot.com/2008/11/what-volatile-means-in-java.html
First, you have to understand a little something about the Java memory model. I've struggled a bit over the years to explain it briefly and well. As of today, the best way I can think of to describe it is if you imagine it this way:
Each thread in Java takes place in a separate memory space (this is clearly untrue, so bear with me on this one).
You need to use special mechanisms to guarantee that communication happens between these threads, as you would on a message passing system.
Memory writes that happen in one thread can "leak through" and be seen by another thread, but this is by no means guaranteed. Without explicit communication, you can't guarantee which writes get seen by other threads, or even the order in which they get seen.
...
But again, this is simply a mental model to think about threading and volatile, not literally how the JVM works.

Basically it's true, but actually the problem is more complex. Visibility of shared data can be affected not only by CPU caches, but also by out-of-order execution of instructions.
Therefore Java defines a Memory Model, that states under which circumstances threads can see consistent state of the shared data.
In your particular case, adding volatile guarantees visibility.

They are "shared" of course in the sense that they both refer to the same variable, but they don't necessarily see each other's updates. This is true for any variable, not just static.
And in theory, writes made by another thread can appear to be in a different order, unless the variables are declared volatile or the writes are explicitly synchronized.

Within a single classloader, static fields are always shared. To explicitly scope data to threads, you'd want to use a facility like ThreadLocal.

When you initialize static primitive type variable java default assigns a value for static variables
public static int i ;
when you define the variable like this the default value of i = 0;
thats why there is a possibility to get you 0.
then the main thread updates the value of boolean ready to true. since ready is a static variable , main thread and the other thread reference to the same memory address so the ready variable change. so the secondary thread get out from while loop and print value.
when printing the value initialized value of number is 0. if the thread process has passed while loop before main thread update number variable. then there is a possibility to print 0

#dontocsata
you can go back to your teacher and school him a little :)
few notes from the real world and regardless what you see or be told.
Please NOTE, the words below are regarding this particular case in the exact order shown.
The following 2 variable will reside on the same cache line under virtually any know architecture.
private static boolean ready;
private static int number;
Thread.exit (main thread) is guaranteed to exit and exit is guaranteed to cause a memory fence, due to the thread group thread removal (and many other issues). (it's a synchronized call, and I see no single way to be implemented w/o the sync part since the ThreadGroup must terminate as well if no daemon threads are left, etc).
The started thread ReaderThread is going to keep the process alive since it is not a daemon one!
Thus ready and number will be flushed together (or the number before if a context switch occurs) and there is no real reason for reordering in this case at least I can't even think of one.
You will need something truly weird to see anything but 42. Again I do presume both static variables will be in the same cache line. I just can't imagine a cache line 4 bytes long OR a JVM that will not assign them in a continuous area (cache line).

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.