I have a set of questions regarding Java multithreading issues. Please provide me with as much help as you can.
0) Assume we have 2 banking accounts and we need to transfer money between them in a thread-safe way.
i.e.
accountA.money += transferSum;
accountB.money -= transferSum;
Two requirements exist:
no one should be able to see the intermediate results of the operation (i.e. one account's sum is increased, but the other's is not yet decreased)
reading access should not be blocked during the operation (i.e. the old values of the account sums should be shown while the operation is in progress)
Can you suggest some ideas on this?
1) Assume 2 threads modify some class field via a synchronized method or by using an explicit lock. Regardless of that synchronization, there is no guarantee that changes to this field will be visible to threads that read it via a method that is NOT synchronized. - is it correct?
2) How long can a thread that is awoken by the notify method wait for a lock? Assume we have code like this:
synchronized (lock) {
    lock.notifyAll();
    // do some very-very long activity
    lock.wait(); // or the end of synchronized block
}
Can we state that at least one thread will succeed and grab the lock? Can a signal be lost due to some timeout?
3) A quotation from Java Concurrency Book:
"Single-threaded executors also provide sufficient internal synchronization to guarantee that any memory writes made by tasks are visible to subsequent tasks; this means that objects can be safely confined to the "task thread" even though that thread may be replaced with another from time to time."
Does this mean that the only thread-safety issue that remains for code executed in a single-threaded executor is a data race, and that we can abandon volatile variables and overlook all visibility issues? It looks like a universal way to solve a great part of concurrency issues.
4) All standard getters and setters are atomic. They need not be synchronized if the field is marked as volatile. - is it correct?
5) The initialization of static fields and static blocks is accomplished by one thread and thus need not be synchronized. - is it correct?
6) Why does a thread need to notify others if it leaves the lock with the wait() method, but does not need to do this if it leaves the lock by exiting the synchronized block?
0: You can't.
Assuring an atomic update is easy: you synchronize on whatever object holds the bank accounts. But then you either block all readers (because they synchronize as well), or you can't guarantee what the reader will see.
BUT, in a large-scale system such as a banking system, locking on frequently-accessed objects is a bad idea, as it introduces waits into the system. In the specific case of changing two values, this might not be an issue: it will happen so fast that most accesses will be uncontended.
There are certainly ways to avoid such race conditions. Databases do a pretty good job for bank accounts (although ultimately they rely on contended access to the end of a transaction).
1) To the best of my knowledge, there are no guarantees other than those established by synchronized or volatile. If one thread makes a synchronized access and one thread does not, the unsynchronized access does not have a memory barrier. (if I'm wrong, I'm sure that I'll be corrected or at least downvoted)
2) To quote that JavaDoc: "The awakened threads will not be able to proceed until the current thread relinquishes the lock on this object." If you decide to throw a sleep into that synchronized block, you'll be unhappy.
3) I'd have to read that quote several times to be sure, but I believe that "single-threaded executor" is the key phrase. If the executor is running only a single thread, then there is a strict happens-before relationship for all operations on that thread. It does not mean that other threads, running in other executors, can ignore synchronization.
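4) Not in general. Reads and writes of non-volatile long and double fields are not guaranteed to be atomic (see the Java Language Specification, section 17.7). Declaring the field volatile makes a single read or write atomic, but not a compound operation such as ++. Use an AtomicXXX object if you want unsynchronized read-modify-write access to member variables.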
5) No. I couldn't find an exact reference in the JVM spec, but section 2.17.5 implies that multiple threads may initialize classes.
6) Because all threads wait until one thread does a notify. If you're in a synchronized block, and leave it with a wait and no notify, every thread will be waiting for a notification that will never happen.
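To illustrate points 2 and 6, here is a minimal sketch of the usual guarded-wait idiom. The class name WaitGate and its methods are made up for this example and do not come from the question. The condition lives in a field that is checked in a loop, so a notification cannot be "lost": a thread that arrives after notifyAll() simply sees the flag and never waits.

```java
// Hypothetical one-shot gate built on wait/notifyAll.
final class WaitGate {
    private final Object lock = new Object();
    private boolean open; // guarded by lock

    void open() {
        synchronized (lock) {
            open = true;
            // Waiters are woken here, but each one can only continue after it
            // re-acquires the lock, i.e. after this block has been left.
            lock.notifyAll();
        }
    }

    // Returns true once the gate is open, false if the timeout elapsed first.
    boolean awaitOpen(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (lock) {
            try {
                while (!open) { // loop also guards against spurious wakeups
                    long remaining = deadline - System.currentTimeMillis();
                    if (remaining <= 0) {
                        return false;
                    }
                    lock.wait(remaining); // releases the lock while waiting
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return open;
            }
            return true;
        }
    }
}
```

The timeout here only bounds how long the caller waits for the condition. It does not bound how long an awakened thread may then have to wait to re-acquire the lock.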
0) This is a difficult problem because you don't want intermediate results to be visible, yet you also don't want to lock out readers during the operation. To be honest, I'm not sure it's possible at all: to ensure no thread sees intermediate results, you need to block readers while doing both writes.
If you don't want intermediate results visible then you have to lock both bank accounts before writing. The best way to do this is to make sure you acquire and release the locks in the same order each time (otherwise you can deadlock), e.g. get the lock on the lower account number first and then the greater.
1) Correct, all access must be via a lock/synchronized or use volatile.
2) Forever
3) Using a single-threaded executor means that as long as all access is done by tasks run by that executor, you don't need to worry about thread safety/visibility.
4) Not sure what you mean by standard getters and setters but writes to most variable types (except double and long) are atomic and so don't need sync, just volatile for visibility. Try using the Atomic variants instead.
5) No, it is possible for two threads to try to initialize some static code, making naive implementations of Singleton unsafe.
6) Sync and wait/notify are two different but related mechanisms. Without wait/notify you'd have to spin-lock (i.e. keep getting a lock and polling) on an object to get updates.
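The lock-ordering advice from point 0 of this answer could look roughly like the sketch below. OrderedAccount, its id field and the transfer method are illustrative names, and the sketch assumes every account has a unique number. Note that it blocks readers of an account while a transfer touching it is in progress, so it meets only the first of the two requirements in the question.

```java
// Hypothetical account type; the unique id is used only to order lock acquisition.
final class OrderedAccount {
    final long id;
    private long money; // guarded by this

    OrderedAccount(long id, long money) {
        this.id = id;
        this.money = money;
    }

    long getMoney() {
        synchronized (this) {
            return money;
        }
    }

    // Moves sum from one account to another while holding both locks.
    static void transfer(OrderedAccount from, OrderedAccount to, long sum) {
        // Always lock the lower id first, so two opposite transfers
        // (A to B and B to A) cannot deadlock each other.
        OrderedAccount first = from.id < to.id ? from : to;
        OrderedAccount second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.money -= sum;
                to.money += sum;
            }
        }
    }
}
```

A reader that needs both balances as one consistent pair would have to take both locks in the same order as well; getMoney() alone only protects a single account.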
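5) The initialization of static fields and static blocks is accomplished by one thread and thus need not be synchronized. - is it correct?
The VM executes static initialization under a per-class initialization lock, so only one thread runs a class's static initializer. On VMs where that lock is the monitor of the Class object, this behaves like a synchronized(clazz) block:
static class Foo {
    static {
        assert Thread.holdsLock(Foo.class); // true only where the VM uses the Class monitor; the JLS does not guarantee it
        synchronized (Foo.class) { // redundant on such VMs, already under the lock
            ....
        }
    }
}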
0) The only way I can see to do this is to store accountA and accountB in an object held in an AtomicReference. You then make a copy of the object, modify the copy, and update the reference if it is still the same as the original reference.
AtomicReference<Accounts> accountRef;
Accounts origRef;
Accounts newRef;
do {
    origRef = accountRef.get();
    newRef = new Accounts(origRef); // make a deep copy of origRef (assumes a hypothetical copy constructor)
    newRef.accountA.money += transferSum;
    newRef.accountB.money -= transferSum;
} while (!accountRef.compareAndSet(origRef, newRef)); // retry if another thread swapped the reference first
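A self-contained variant of this copy-and-swap idea is sketched below. It assumes the two balances can be held in one small immutable object; the names AccountsSnapshot and SnapshotBank are invented for the example. Readers call snapshot() without blocking and always get either the old pair of balances or the new pair, never a mix.

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable pair of balances (hypothetical type for this sketch).
final class AccountsSnapshot {
    final long moneyA;
    final long moneyB;

    AccountsSnapshot(long moneyA, long moneyB) {
        this.moneyA = moneyA;
        this.moneyB = moneyB;
    }
}

final class SnapshotBank {
    private final AtomicReference<AccountsSnapshot> ref;

    SnapshotBank(long moneyA, long moneyB) {
        ref = new AtomicReference<>(new AccountsSnapshot(moneyA, moneyB));
    }

    // Same direction as the question's snippet: A is credited, B is debited.
    void transfer(long transferSum) {
        AccountsSnapshot orig;
        AccountsSnapshot updated;
        do {
            orig = ref.get();
            updated = new AccountsSnapshot(orig.moneyA + transferSum,
                                           orig.moneyB - transferSum);
            // compareAndSet fails if another writer replaced the snapshot
            // in the meantime; in that case recompute from the fresh value.
        } while (!ref.compareAndSet(orig, updated));
    }

    // Non-blocking read of a consistent pair of balances.
    AccountsSnapshot snapshot() {
        return ref.get();
    }
}
```

The trade-off is that all accounts sharing one reference contend on the same compare-and-set, so this suits a small group of related values rather than a whole bank.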
Some people say that if multiple threads are reading/writing then you need to use synchronized, and that if one thread is reading/writing and another one is only reading then you must use volatile. I don't get the difference between these situations.
Basically, the value of a volatile field becomes visible to all readers (other threads in particular) after a write operation completes on it.
Then if I define a variable as volatile, threadA will first read its value, then update it and write it to memory. After that the variable will become visible to threadB. Then why do I need a synchronized block?
Some people say that if multiple threads are reading/writing then you need to use synchronized, and that if one thread is reading/writing and another one is only reading then you must use volatile. I don't get the difference between these situations.
There really isn't a hard and fast rule with this. Choosing whether or not to use synchronized or volatile has more to do with how the objects are being updated as opposed to how many readers or writers there are.
For example, you can achieve multiple readers and writers with an AtomicLong which wraps a volatile long.
private AtomicLong counter = new AtomicLong();
...
// many threads can get/set this counter without synchronized
counter.incrementAndGet();
And there are circumstances where you would need a synchronized block even with a single reader/writer.
synchronized (status) {
    status.setNumTransactions(dao.getNumTransactions());
    // we don't want the reader thread to see `status` partially updated here
    status.setTotalMoney(dao.getTotalMoney());
}
In the above example, since we are making multiple calls to update the status object we may need to ensure that other threads don't see it when the num-transactions has been updated but not the total-money. Yes, AtomicReference handles some of these cases but not all.
To be clear, marking a field volatile ensures memory synchronization. When you read a volatile field you cross a read memory barrier, and when you write it you cross a write memory barrier. A synchronized block has a read memory barrier at the start and a write barrier at the end of the block, and it also has mutex locking to ensure only one thread can enter the block at once.
Sometimes you just need memory barriers to achieve proper sharing of data between threads and sometimes you need locking.
As comments suggest, you might do some further reading. But to give you an idea you can take a look at this stackoverflow question and think for example about the following scenario:
You have a couple of variables which need to be in a consistent state. Even if you make them all volatile, the code that updates them needs time to execute.
That same code may be executed at almost the same time by a different thread. The first variables could already be "OK", but others that depend on them may not be correct yet. Thus you need a synchronized block in that case.
To add one more post for further reading about volatile look here
The primary difference between volatile and synchronized is that volatile only guarantees visibility whereas synchronized guarantees both visibility and locking.
If there are multiple read threads and one write thread then volatile usage can ensure that changes by the write thread to the volatile variable are visible to other threads immediately. But you see in this case locking isn't a problem because you only have 1 writing thread.
There are certain rules of thumb for a volatile:
Don't use volatile when its value depends on its previous value
Don't use volatile when it participates in interactions with other invariants
Don't use volatile when there are multiple write threads that update value of volatile variable.
In general, use of volatile should be limited to only those cases where it's relatively easy to reason about its state such as in the case of status flags.
In all other cases where you have shared mutable state always use synchronized wherever shared mutable state is being touched unless declared final and modified only in the constructor without unsafe publication. Volatile is a replacement for synchronized only in special cases as described in my 3 points.
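The status-flag case mentioned above is about the simplest safe use of volatile. The sketch below uses invented names (StopFlagWorker, requestStop, stopsWhenAsked): one thread writes the flag once, another only reads it, and the new value does not depend on the old one.

```java
// Hypothetical worker that runs until another thread asks it to stop.
final class StopFlagWorker implements Runnable {
    // volatile: the write in requestStop() becomes visible to the loop in run()
    private volatile boolean stopRequested;

    void requestStop() {
        stopRequested = true; // plain write, no dependency on the previous value
    }

    @Override
    public void run() {
        while (!stopRequested) {
            Thread.yield(); // stand-in for real work
        }
    }

    // Starts a worker, requests a stop, and reports whether it terminated.
    static boolean stopsWhenAsked() {
        StopFlagWorker worker = new StopFlagWorker();
        Thread thread = new Thread(worker);
        thread.start();
        worker.requestStop();
        try {
            thread.join(5_000); // generous upper bound for the demo
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }
}
```

Without volatile on stopRequested, the loop is allowed to keep reading a stale value and might never exit. A counter incremented by several threads would break all three rules of thumb and needs synchronized or an Atomic class instead.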
I got one main thread that will start up other threads. Those other threads will ask for jobs to be done, and the main thread will make jobs available for the other threads to see and do.
The job that must be done is to set indexes in a huge boolean array to true. They are false by default, and the other threads can only set them to true, never to false. The various jobs may involve setting the same indexes to true.
The main thread finds new jobs depending on two things.
The values in the huge boolean array.
Which jobs have already been done.
How do I make sure the main thread reads fresh values from the huge boolean array?
I can't have the update of the array be through a synchronized method, because that's pretty much all the other threads do, and as such I would only get a pretty much sequential performance.
Let's say the other threads update the huge boolean array by setting many of its indexes to true through a non-synchronized function. How can I make sure the main thread reads the updates, and that they are not just cached locally by the writing thread? Is there any way to make it "push" the update? I'm guessing the main thread should just use a synchronized method to "get" the updates?
For the really complete answer to your question, you should open up a copy of the Java Language Spec, and search for "happens before".
When the JLS says that A "happens before" B, it means that in a valid implementation of the Java language, A is required to actually happen before B. The spec says things like:
If some thread updates a field, and then releases a lock (e.g., leaves a synchronized block), the update "happens before" the lock is released.
If some thread releases a lock, and some other thread subsequently acquires the same lock, the release "happens before" the acquisition.
If some thread acquires a lock, and then reads a field, the acquisition "happens before" the read.
Since "happens before" is a transitive relationship, you can infer that if thread A updates some variables inside a synchronized block and then thread B examines the variables in a block that is synchronized on the same object, then thread B will see what thread A wrote.
Besides entering and leaving synchronized blocks, there are lots of other events (constructing objects, wait()ing/notify()ing objects, start()ing and join()ing threads, reading and writing volatile variables) that allow you to establish "happens before" relationships between threads.
It's not a quick solution to your problem, but the chapter is worth reading.
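As a small illustration of one of the "happens before" edges listed above, the sketch below relies only on Thread.start() and Thread.join(). The class and method names are made up for the example. The worker writes to plain, non-volatile storage, and the caller is still guaranteed to see the write, because everything the worker did happens before join() returns.

```java
// Hypothetical demo of the start()/join() happens-before edges.
final class JoinVisibility {
    static int compute() {
        int[] result = new int[1]; // plain storage, no volatile, no lock

        // Everything done before start() is visible to the worker.
        Thread worker = new Thread(() -> result[0] = 42);
        worker.start();

        try {
            // Everything the worker did is visible once join() returns.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return result[0];
    }
}
```

If the caller read result[0] without joining (or without some other happens-before edge), it could legally observe 0.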
...the main thread will make jobs available for the other threads to see and do...
I can't have the update of the array be through a synchronized method, because that's pretty much all the other threads do, and ...
Sounds like you're saying that each worker thread can only do a trivial amount of work before it must wait for further instructions from the main() thread. If that's true, then most of the workers are going to be waiting most of the time. You'd probably get better performance if you just do all of the work in a single thread.
Assuming that your goal is to make the best use of the available cycles on a multi-processor machine, you will need to partition the work in some way that lets each worker thread go off and do a significant chunk of it before needing to synchronize with any other thread.
I'd use another design pattern. For instance, you could add the indexes of the boolean values to a Set as they're turned on, and then synchronize access to that. Then you can use wait/notify to wake up the main thread.
First of all, don't use boolean arrays in general, use BitSets. See this: boolean[] vs. BitSet: Which is more efficient?
In this case you need an atomic BitSet, so you can't use the java.util.BitSet, but here is one: AtomicBitSet implementation for java
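For readers who cannot follow the link, an atomic bit set can be sketched on top of AtomicLongArray from the JDK. This is not the linked implementation, only a minimal illustration with an invented class name, and it supports just the set and get operations the question needs.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical minimal atomic bit set: 64 bits per array element.
final class SimpleAtomicBitSet {
    private final AtomicLongArray words;

    SimpleAtomicBitSet(int nbits) {
        words = new AtomicLongArray((nbits + 63) / 64);
    }

    // Sets the bit to true; safe to call from many threads without locking.
    void set(int index) {
        int word = index >>> 6;          // index / 64
        long mask = 1L << (index & 63);  // bit position inside the word
        long old;
        do {
            old = words.get(word);
            if ((old & mask) != 0) {
                return; // already set, nothing to do
            }
            // Retry if another thread changed a neighbouring bit meanwhile.
        } while (!words.compareAndSet(word, old, old | mask));
    }

    // Volatile read, so the main thread sees bits set by the workers.
    boolean get(int index) {
        return (words.get(index >>> 6) & (1L << (index & 63))) != 0;
    }
}
```

The compare-and-set loop matters because 64 flags share one long: a plain read-modify-write could overwrite a bit that another thread set at the same moment.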
You could instead model this as message passing rather than mutating shared state. In your description the workers never read the boolean array and only write the completion status. Have you considered using a pending job queue that workers consume from and a completion queue that the master reads? The job status fields can be efficiently maintained by the master thread without any shared state concerns. Depending on your needs, you can use either blocking or non-blocking queues.
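A rough sketch of that message-passing layout is shown below. The class name, the run method, and the idea that a "job" is just an index are all simplifying assumptions for the example. Workers take indexes from a pending queue and report them on a completion queue; only the coordinating thread ever touches the boolean array.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo: jobs are plain indexes, "work" is reporting them back.
final class JobQueueDemo {
    static boolean[] run(int size, int workers) {
        BlockingQueue<Integer> pending = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> completed = new LinkedBlockingQueue<>();
        for (int i = 0; i < size; i++) {
            pending.add(i);
        }

        for (int w = 0; w < workers; w++) {
            Thread t = new Thread(() -> {
                Integer job;
                // poll() returns null once no jobs are left
                while ((job = pending.poll()) != null) {
                    completed.add(job); // report the finished index
                }
            });
            t.setDaemon(true);
            t.start();
        }

        boolean[] done = new boolean[size]; // read and written by this thread only
        try {
            for (int received = 0; received < size; received++) {
                done[completed.take()] = true; // blocks until a result arrives
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while collecting results", e);
        }
        return done;
    }
}
```

The queues provide the necessary memory visibility between threads, so the array itself needs neither volatile nor synchronized.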
I was trying to dive into synchronization.
Here, it's mentioned that "Every object has an intrinsic lock associated with it."
I understand the Object class. As far as we know, it does not have any property like a lock (thus I guess it's called an intrinsic lock). What exactly is this lock (i.e. is it a Lock.java class? Is it some sort of hidden field?) and how is it associated with an object (i.e. is there some mysterious implicit reference to the lock from an object, something happening natively)?
When several threads try to acquire the same lock, one or more threads will be suspended and they will be resumed later. Well where are these threads stored ? What data structure keeps records of waiting threads ?
What logic is used under the covers to pick a thread waiting to enter synchronized method, when there are many waiting ?
Is there any reference on what happens under the covers (step by step) from the "synchronized keyword" to "intrinsic lock acquisition"?
Is there an upper limit on number of threads that are allowed to wait on synchronized ?
1) Yes, the lock is essentially a hidden field on the object. You can't access it except through synchronization.
2) Waiting threads essentially just sleep until the lock is available. They aren't "stored" anyplace special, nor is there a visible queue of waiting threads. Details of implementation are hidden.
3) I don't think there's any promised order. If you explicitly need round-robin scheduling or prioritization or something of that sort, it's your responsibility to implement it (or use a class which implements it for you) on top of the synchronization lock mechanism.
4) This is likely to be handled as an OS semaphore, if you understand those. If you don't, defining them strikes me as too much detail to be addressed properly here... and you don't really need to understand this unless you're reimplementing it.
5) There is no explicit limit, as far as I know. (I haven't checked the official Java Specification, but I understand how this kind of thing gets implemented at the OS level.) Of course at some point you're going to run out of system resources, but I think you'll generally run out of other resources (like memory to run those threads in) first.
One additional note: It's also worth looking at the Atomic... classes. When these can be used, they will be somewhat more efficient in modern processors than traditional Java synchronization can be.
private static void WaitInQueue(Customer c)
{
    synchronized (mutex) {
        // Do some operation here.
    }
}
I need to make threads wait before proceeding (only one at a time). However, it appears that synchronized is not using FIFO to determine which thread should proceed next (it seems like LIFO). Why is this?
How can I ensure that the first thread to wait at synchronized will be the first one to acquire the lock next?
A synchronized block makes no guarantees about fairness - any of the waiting threads may in theory be chosen to execute. If you really want a fair lock (FIFO), switch to the newer locking mechanisms introduced in Java 5+.
See for example the documentation for ReentrantLock.
Here's how you'd use a fair lock:
private final ReentrantLock lock = new ReentrantLock(true); // fair lock
// ...
public void m() {
    lock.lock(); // block until the lock is acquired
    try {
        // ... method body
    } finally {
        lock.unlock();
    }
}
Note, however, that this results in overall degraded performance and so is not recommended.
Quoting from the documentation:
The constructor for this class accepts an optional fairness parameter.
When set true, under contention, locks favor granting access to the
longest-waiting thread. Otherwise this lock does not guarantee any
particular access order. Programs using fair locks accessed by many
threads may display lower overall throughput (i.e., are slower; often
much slower) than those using the default setting.
You can use the Semaphore class, with a fairness setting set to true and a count of 1. This guarantees FIFO order for threads, and is almost identical to having a synchronized block.
ReentrantLock also provides a fairness setting.
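A minimal sketch of the Semaphore variant follows. FairGate and its served counter are invented for the example; the body of waitInQueue stands in for the "do some operation here" part of the question. It uses acquireUninterruptibly() only to keep the example free of checked exceptions.

```java
import java.util.concurrent.Semaphore;

// Hypothetical FIFO gate: one permit, fairness enabled.
final class FairGate {
    // fair = true: waiting threads are granted the permit in arrival order
    private final Semaphore gate = new Semaphore(1, true);
    private int served; // guarded by gate

    void waitInQueue() {
        gate.acquireUninterruptibly();
        try {
            served++; // do some operation here
        } finally {
            gate.release(); // always hand the permit back
        }
    }

    int served() {
        gate.acquireUninterruptibly();
        try {
            return served;
        } finally {
            gate.release();
        }
    }
}
```

One difference from synchronized is worth keeping in mind: a Semaphore is not reentrant, so a thread that calls waitInQueue() again while it already holds the permit would block forever.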
To answer "Why is this?": There's rarely any reason to use a specific order on a lock. The threads hit the lock in random order and may as well leave it the same way. The important thing from the JVM's viewpoint is to keep the cores busy working on your program's code. Generally, if you care about what order your threads run in you need something a lot fancier than a lock or semaphore.
The only good exception I can think of is if your lock always has waiting threads, creating the real possibility that a thread that hits it might wait for many seconds, continually getting bumped to the back of the "queue", while an irate user fumes. Then, FIFO makes a lot of sense. But even here you might want to spend some time trying to speed up the synchronized block (or avoiding it completely) so most threads that hit it don't get blocked.
In summary, think long and hard about your design if you find yourself worrying about the order your threads run in.
You should use Thread.join to wait before proceeding.
Just go through the following link
http://msdn.microsoft.com/en-us/library/dsw9f9ts(v=vs.90).aspx
I have a java threads related question.
To take a very simple example, lets say I have 2 threads.
Thread A running StockReader Class instance
Thread B running StockAvgDataCollector Class instance
In Thread B, StockAvgDataCollector collects some market Data continuously, does some heavy averaging/manipulation and updates a member variable spAvgData
In Thread A StockReader has access to StockAvgDataCollector instance and its member spAvgData using getspAvgData() method.
So Thread A does READ operation only and Thread B does READ/WRITE operations.
Questions
Now, do I need synchronization or atomic functionality or locking or any concurrency-related stuff in this scenario? It doesn't matter if Thread A reads an older value.
Since Thread A is only going to READ and not update anything, and only Thread B does any WRITE operations, will there be any deadlock scenarios?
I've pasted a paragraph below from the following link. From that paragraph, it seems like I do need to worry about some sort of locking/synchronizing.
http://java.sun.com/developer/technicalArticles/J2SE/concurrency/
Reader/Writer Locks
When using a thread to read data from an object, you do not necessarily need to prevent another thread from reading data at the same time. So long as the threads are only reading and not changing data, there is no reason why they cannot read in parallel. The J2SE 5.0 java.util.concurrent.locks package provides classes that implement this type of locking. The ReadWriteLock interface maintains a pair of associated locks, one for read-only and one for writing. The readLock() may be held simultaneously by multiple reader threads, so long as there are no writers. The writeLock() is exclusive. While in theory, it is clear that the use of reader/writer locks to increase concurrency leads to performance improvements over using a mutual exclusion lock. However, this performance improvement will only be fully realized on a multi-processor and the frequency that the data is read compared to being modified as well as the duration of the read and write operations.
Which concurrent utility would be less expensive and suitable in my example?
java.util.concurrent.atomic ?
java.util.concurrent.locks ?
java.util.concurrent.ConcurrentLinkedQueue ? - In this case StockAvgDataCollector will add and StockReader will remove. No getspAvgData() method will be exposed.
Thanks
Amit
Well, the whole ReadWriteLock thing really makes sense when you have many readers and at least one writer... So you guarantee liveness (you won't be blocking any reader threads if no other thread is writing). However, you have only two threads.
If you don't mind the reading thread seeing an old (but not corrupted) value of spAvgData, then I would go for an AtomicDouble (Guava provides one; the JDK does not) or an AtomicReference, depending on spAvgData's datatype. In the code below, A is the computing thread and B is the reader.
So the code would look like this
import com.google.common.util.concurrent.AtomicDouble; // from Guava; the JDK has no AtomicDouble
public class A extends Thread {
    // spAvgData
    private final AtomicDouble spAvgData = new AtomicDouble(someDefaultValue);
    public void run() {
        while (compute) {
            // do intensive work
            // ...
            // done with work, update spAvgData
            spAvgData.set(resultOfComputation);
        }
    }
    public double getSpAvgData() {
        return spAvgData.get();
    }
}
// --------------
public class B {
    public void someMethod() {
        A a = new A();
        // after A being created, spAvgData contains a valid value (at least the default)
        a.start();
        while (read) {
            // loll around
            a.getSpAvgData();
        }
    }
}
Yes, synchronization is important and you need to consider two parameters: visibility of the spAvgData variable and atomicity of its update. In order to guarantee visibility of the spAvgData variable in thread B by thread A, the variable can be declared volatile or as an AtomicReference. Also you need to guard that the action of the update is atomic in case there are more invariants involved or the update action is a compound action, using synchronization and locking. If only thread B is updating that variable then you don't need synchronization and visibility should be enough for thread A to read the most up-to-date value of the variable.
If you don't mind that Thread A can read complete nonsense (including partially updated data) then no, you don't need any synchronisation. However, I suspect that you should mind.
If you just use a single mutex, or ReentrantReadWriteLock and don't suspend or sleep without timeout while holding locks then there will be no deadlock. If you do perform unsafe thread operations, or try to roll your own synchronisation solution, then you will need to worry about it.
If you use a blocking queue then you will also need a constantly-running ingestion loop in StockReader. ReadWriteLock is still of benefit on a single core processor - the issues are the same whether the threads are physically running at the same time, or just interleaved by context switches.
If you don't use at least some form of synchronisation (e.g. a volatile) then your reader may never see any change at all.
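For the single-writer case in the question, a volatile field is probably the cheapest option. The sketch below assumes spAvgData is a single double; the holder class and the publish method are invented names. volatile makes each write visible to the reader and also guarantees that a double is read and written atomically.

```java
// Hypothetical holder shared by the collector (writer) and the reader.
final class AvgDataHolder {
    // volatile: visibility for the reader, and atomic 64-bit reads/writes
    private volatile double spAvgData;

    // Called only by the collector thread after each averaging pass.
    void publish(double value) {
        spAvgData = value;
    }

    // Called by the reader thread; may return a slightly older value,
    // but never a torn or permanently stale one.
    double getSpAvgData() {
        return spAvgData;
    }
}
```

This stops being enough as soon as spAvgData consists of several values that must change together, or a second writer appears. In those cases an immutable object published through an AtomicReference, or a lock, is the safer choice.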