Implementing a Mutex in Java

I have a multi-threaded application (a web app in Tomcat to be exact). In it there is a class that almost every thread will have its own instance of. In that class there is a section of code in one method that only ONE thread (user) can execute at a time. My research has led me to believe that what I need here is a mutex (which is a semaphore with a count of 1, it would seem).
So, after a bit more research, I think what I should do is the following. It is important to note that my lock Object is static.
Am I doing it correctly?
public class MyClass {
    private static Object lock = new Object();

    public void myMethod() {
        // Stuff that multiple threads can execute simultaneously.
        synchronized (MyClass.lock) {
            // Stuff that only one thread may execute at a time.
        }
    }
}

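In your code, myMethod may be called from any thread, but the synchronized block inside it can be executed by only one thread at a time. Because the lock is static, that holds across all instances of MyClass: there can never be two threads inside that block at the same time. I think that's what you want, so: yes.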

Typically, the multithreading problem comes from mutability - where two or more threads are accessing the same data structure and one or more of them modifies it.
The first instinct is to control the access order using locking, as you've suggested. However, you can quickly run into lock contention, where your application loses a lot of processing time to context switching as your threads are parked on lock monitors.
You can get rid of most of the problem by moving to immutable data structures, so that you return a new object from the setters rather than modifying the existing one, as well as utilising concurrent collections such as ConcurrentHashMap / CopyOnWriteArrayList.
Concurrent programming is something you'll need to get your head around, especially as throughput comes from parallelisation in today's computing world.
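As a rough illustration of the immutable-object idea combined with a concurrent collection, here is a minimal sketch. The Price and PriceBoard classes and the "ACME" key are hypothetical names invented for this example; only ConcurrentHashMap and its computeIfPresent method come from the JDK.

```java
import java.util.concurrent.ConcurrentHashMap;

// Immutable value (hypothetical example type): all fields final, no setters.
final class Price {
    private final String symbol;
    private final long cents;

    Price(String symbol, long cents) {
        this.symbol = symbol;
        this.cents = cents;
    }

    String symbol() { return symbol; }
    long cents() { return cents; }

    // "Setter" that returns a new object instead of modifying this one.
    Price withCents(long newCents) { return new Price(symbol, newCents); }
}

class PriceBoard {
    private final ConcurrentHashMap<String, Price> prices = new ConcurrentHashMap<>();

    void publish(Price p) { prices.put(p.symbol(), p); }
    Price get(String symbol) { return prices.get(symbol); }

    // computeIfPresent applies the function atomically for that key,
    // so concurrent raises are not lost.
    void raise(String symbol, long deltaCents) {
        prices.computeIfPresent(symbol, (k, old) -> old.withCents(old.cents() + deltaCents));
    }

    // Four threads each raise the same price 1000 times by one cent.
    static long demo() {
        PriceBoard board = new PriceBoard();
        board.publish(new Price("ACME", 0));
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) board.raise("ACME", 1);
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try { t.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        return board.get("ACME").cents();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

No explicit lock appears in the calling code: readers always see a complete Price object, and the map handles the atomic replacement.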

This will allow one thread at a time through the block. Other threads will wait, but there is no queue as such: there is no guarantee that threads will get the lock in a fair manner. In fact, with biased locking, it's unlikely to be fair. ;)
Your lock should be final. If there is any reason it can't be, it's probably a bug. BTW: you might be able to use synchronized(MyClass.class) instead.
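To make the two points above concrete, here is a small sketch, assuming hypothetical class names (SharedSection, FairSharedSection, MutexDemo). The first class uses a static final lock object with synchronized; the second shows one way to ask for fairness, using java.util.concurrent.locks.ReentrantLock constructed with true. Treat it as an illustration rather than a recommendation for any particular application.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical names; only the locking pattern is taken from the discussion.
class SharedSection {
    // static: one lock shared by every instance; final: the reference cannot be replaced.
    private static final Object LOCK = new Object();
    private static int entries = 0; // guarded by LOCK

    void myMethod() {
        // Code here may run in many threads simultaneously.
        synchronized (LOCK) {
            entries++; // only one thread at a time, across all instances
        }
    }

    static int entries() {
        synchronized (LOCK) { return entries; }
    }
}

class FairSharedSection {
    // Passing true asks for a fair policy: the longest-waiting thread is favoured.
    private static final ReentrantLock LOCK = new ReentrantLock(true);
    private static int entries = 0; // guarded by LOCK

    void myMethod() {
        LOCK.lock();
        try {
            entries++;
        } finally {
            LOCK.unlock(); // release even if the body throws
        }
    }

    static int entries() {
        LOCK.lock();
        try { return entries; } finally { LOCK.unlock(); }
    }
}

class MutexDemo {
    // Runs 4 threads, each with its own instances, and returns how many
    // times each guarded section was entered during this call.
    static int[] run() {
        int beforeA = SharedSection.entries();
        int beforeB = FairSharedSection.entries();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                SharedSection a = new SharedSection();
                FairSharedSection b = new FairSharedSection();
                for (int n = 0; n < 1000; n++) { a.myMethod(); b.myMethod(); }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try { t.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        return new int[] { SharedSection.entries() - beforeA, FairSharedSection.entries() - beforeB };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]);
    }
}
```

A fair lock usually costs some throughput, so it is only worth considering if starvation is an actual problem.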

Related

How does thread synchronization work in Kotlin?

I have been experimenting with Kotlin synchronization and I do not understand from the docs on how the locking mechanism works on thread synchronization over common resources and thus attempted to write this piece of code which further complicates my understanding.
import kotlin.concurrent.thread

fun main() {
    val myList = mutableListOf(1)
    thread {
        myList.forEach {
            while (true) {
                println("T1 : $it")
            }
        }
    }
    thread {
        synchronized(myList) {
            while (true) {
                myList[0] = 9999
                println("**********\n**********\n**********\n")
            }
        }
    }
}
myList is the common resource in question.
The first thread is a simple read operation that I intend to keep the resource utilized in read mode. The second is another thread which requests a lock in order to modify the common resource.
Though the first thread does not contain any synchronization, I would expect it to handle this internally, so that while a function like map or forEach is in progress over a resource, another thread should not be able to lock it. Otherwise the elements being iterated over may change while the map/forEach is in progress (even though that operation may be paused for a bit while another thread has a lock over it).
The output I see instead shows that both the threads are executing in parallel. Both of them are printing the first element in the list and the stars respectively. But in the second thread, even though the stars are being printed, myList[0] is never set to 9999 because the first thread continues to print 1.

Threading and synchronisation are JVM features, not specific to Kotlin. If you can follow Java, there are many resources out there which can explain them fully. But the short answer is: they're quite low-level, and tricky to get right, so please exercise due caution. And if a higher-level construction (work queues/executors, map/reduce, actors...) or immutable objects can do what you need, life will be easier if you use that instead!
But here're the basics. First, in the JVM, every object has a lock, which can be used to control access to something. (That something is usually the object the lock belongs to, but need not be...) The lock can be taken by the code in a particular thread; while it's holding that lock, any other thread which tries to take the lock will block until the first thread releases it.
And that's pretty much all there is! In Java, synchronized is a keyword; in Kotlin, synchronized() is actually a function that takes the object whose lock is to be claimed, and the @Synchronized annotation marks a method as holding the lock of 'this' object while it runs.
Note that holding a lock prevents other threads holding the lock; it doesn't prevent anything else. So I'm afraid your expectation is wrong. That's why you're seeing the threads happily running simultaneously.
Ideally, every class would be written with some consideration for how it interacts with multithreading; it could document itself as 'immutable' (no mutable state to worry about), 'thread-safe' (safe to call from multiple threads simultaneously), 'conditionally thread-safe' (safe to call from multiple threads if certain patterns are adhered to), 'thread-compatible' (taking no special precautions but callers can do their own synchronisation to make it safe), or 'thread-hostile' (impossible to use from multiple threads). But in practice, most don't.
In fact, most turn out to be thread-compatible; and that applies to much of the Java and Kotlin collection classes. So you can do your own synchronisation (as per your synchronized block); but you have to take care to synchronise every possible access to the list -- otherwise, a race condition could leave your list in an inconsistent state.
(And that can mean more than just a dodgy value somewhere. I had a server app with a thread that got stuck in a busy-loop -- chewing up 100% of a CPU but never continuing with the rest of the code -- because I had one thread update a HashMap while another thread was reading it, and I'd missed the synchronisation on one of those. Most embarrassing.)
So, as I said, if you can use a higher-level construction instead, your life will be easier!
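Since the locking rules are JVM-wide, the point about synchronising every access can be sketched in Java. GuardedPair is a hypothetical class made up for this example: the writer keeps two list elements that should always sum to zero, and the reader only ever sees a consistent pair because both sides take the same lock.

```java
import java.util.ArrayList;
import java.util.List;

class GuardedPair {
    private final List<Integer> items = new ArrayList<>(); // guarded by 'items'

    GuardedPair() {
        items.add(0);
        items.add(0);
    }

    // Writer: both elements change under one lock, so the pair stays consistent.
    void update(int n) {
        synchronized (items) {
            items.set(0, n);
            items.set(1, -n);
        }
    }

    // Reader: takes the SAME lock; without it, a half-finished update could be seen.
    int sum() {
        synchronized (items) {
            return items.get(0) + items.get(1);
        }
    }

    // Returns how many inconsistent sums the reader observed.
    static int demo() {
        GuardedPair pair = new GuardedPair();
        Thread writer = new Thread(() -> {
            for (int n = 1; n <= 100_000; n++) pair.update(n);
        });
        writer.start();
        int nonZero = 0;
        while (writer.isAlive()) {
            if (pair.sum() != 0) nonZero++;
        }
        try { writer.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        return nonZero;
    }

    public static void main(String[] args) {
        System.out.println("inconsistent reads: " + demo());
    }
}
```

If the synchronized block were removed from sum(), nothing would stop the reader from running in the middle of update(), which is the situation in the question.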

The second thread is not changing the value of the first list element, as == means compare, not assign. You need to use = to change the value, e.g. myList[0] = 9999. However, in your code it's not guaranteed that the change from the second thread will become visible in the first thread, as thread one is not synchronising on myList.
If you are targeting the JVM you should read about the JVM memory model, e.g. what @Volatile is. Your current approach does not guarantee that the first thread will ever see changes from the second one. You can simplify your code to the broken example below:
import kotlin.concurrent.thread

var counter = 1

fun main() {
    thread {
        while (counter++ < 1000) {
            println("T1: $counter")
        }
    }
    thread {
        while (counter++ < 1000) {
            println("T2: $counter")
        }
    }
}
Which can print strange results like:
T2: 999
T1: 983
T2: 1000
This can be fixed in a few ways, e.g. by using synchronisation.
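One possible fix, sketched in Java because the same JVM rules apply, is to replace the plain counter with an AtomicInteger so that each increment is a single atomic step. The class name AtomicCounterDemo and the bookkeeping set are assumptions made for this illustration; this is not a line-for-line translation of the Kotlin above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {
    // Returns {number of distinct values claimed, number of duplicates seen}.
    static int[] run() {
        AtomicInteger counter = new AtomicInteger(1);
        AtomicInteger duplicates = new AtomicInteger();
        Set<Integer> claimed = ConcurrentHashMap.newKeySet();

        Runnable work = () -> {
            int value;
            // incrementAndGet is one atomic read-modify-write, so no two
            // threads can receive the same value.
            while ((value = counter.incrementAndGet()) <= 1000) {
                if (!claimed.add(value)) duplicates.incrementAndGet();
            }
        };

        Thread t1 = new Thread(work, "T1");
        Thread t2 = new Thread(work, "T2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new int[] { claimed.size(), duplicates.get() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("distinct=" + r[0] + " duplicates=" + r[1]);
    }
}
```

The key difference from the broken version is that the value each thread works with is the one returned by the increment itself, not a second, separate read of the shared variable.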

Do I have to use synchronized on main thread methods?

To be more specific my question is if the main thread methods are already synchronized?
For example:
@MainThread
class MyClass {
    private Object o = null;

    @MainThread
    MyClass() {
    }

    @MainThread
    public Object getObjectFromMainThread() {
        return this.o.getObj2();
    }

    @MainThread
    public void setObjectFromMainThread(Object obj) {
        obj.set(1);
        this.o = obj;
    }

    @AnyThread
    public synchronized Object getObjectFromAnyThread() {
        return this.o;
    }

    @AnyThread
    public synchronized void setObjectFromAnyThread(Object obj) {
        this.o = obj;
    }
}
The methods getObjectFromMainThread and setObjectFromMainThread, which are called only from the main thread, are not synchronized. Do they need to be synchronized as well, or is that not necessary?

The answer to your immediate question is yes, you will have to synchronize the getObjectFromMainThread and setObjectFromMainThread methods in your example. The answer to why there's this need is a mighty deep rabbit hole.
The general problem with multithreading is what happens when multiple threads access shared, mutable state. In this case, the shared, mutable state is this.o. It doesn't matter whether any of the threads involved is the main thread, it's a general problem that arises when more than one thread is in play.
The problem we're dealing with comes down to "what happens when a thread is reading the state at the same time that one or more threads are writing it?", with all its variations. This problem fans out into really intricate subproblems like each processor core having its own copy of the object in its own processor cache.
The only way of handling this is to make explicit what will happen. The synchronized mechanism is one such way. Synchronization involves a lock; when you use a synchronized method, the lock is this:
public synchronized void foo() {
    // this code uses the same lock...
}

public void bar() {
    synchronized (this) {
        // ...as this code
    }
}
Of all the program code that synchronizes on the same lock, only one thread can be executing it at the same time. That means that if (and only if) all code that interacts with this.o runs synchronized to the this lock, the problems described earlier are avoided.
In your example, the presence of setObjectFromAnyThread() means that you must also synchronize setObjectFromMainThread(), otherwise the state in this.o is accessed sometimes-synchronized and sometimes-unsynchronized, which is a broken program.
Synchronization comes at a cost: because you're locking bits of code to be run by one thread at a time (and other threads are made to wait), you remove some or all of the speed-up you gained from using multi-threading in the first place. In some cases, you're better off forgetting multi-threading exists and using a simpler single-threaded program.
Within a multi-threaded program, it's useful to limit the amount of shared, mutable state to a minimum. Any state that's not accessed by more than one thread at a time doesn't need synchronization, and is going to be easier to reason about.
The @MainThread annotation, at least as it exists in Android, indicates that the method is intended to be accessed on the main thread only. It doesn't do anything, it's just there as a signal to both the programmer(s) and the compiler. There is no technical protection mechanism involved at run time; it all comes down to your self-discipline and some compile-time tool support. The advantage of this lack of protection is that there's no runtime overhead.
Multi-threaded programming is complicated and easy to get wrong. The only way to get it right is to truly understand it. There's a book called Java Concurrency In Practice that's a really good explanation of both the general principles and problems of concurrency and the specifics in Java.
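As a minimal sketch of the "all code that touches this.o uses the same lock" rule, the hypothetical SharedHolder class below routes every read and write through synchronized methods, whichever thread calls them. The class and method names are made up for illustration and the Android annotations are left out so the example runs on a plain JVM.

```java
class SharedHolder {
    private Object o; // guarded by 'this'

    // Used from the main thread AND from any other thread: same lock everywhere.
    public synchronized Object get() {
        return o;
    }

    public synchronized void set(Object obj) {
        o = obj;
    }
}

class HolderDemo {
    static Object run() {
        SharedHolder holder = new SharedHolder();
        // A background thread writes through the synchronized setter...
        Thread background = new Thread(() -> holder.set("from background"));
        background.start();
        try {
            background.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        // ...and the main thread reads through the synchronized getter.
        return holder.get();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

When the shared state is a single reference with no compound check-then-act logic around it, declaring the field volatile is a lighter alternative that also gives visibility; the synchronized version is shown here because it matches the question.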

How to Junit test a synchronized object not accessed by two threads at same time?

Is there any way I can make a Junit test to make sure that a synchronized object (in my case HashMap in synchronized block) is not accessed by two threads simultaneously? e.g. forcing two threads to try to access and having exception thrown.
Thanks for your help!

The best framework I've seen to help with thread testing is Thread Weaver. At the very least it offers some deterministic way of thread scheduling, and a limited (yet useful) way of trying to find race conditions.
You can even code up some more intricate thread scheduling scenarios, but those tests will inevitably be white box tests. Still, those can have their use too.

Is there any way I can make a Junit test to make sure that a synchronized object (in my case HashMap in synchronized block) is not accessed by two threads simultaneously?
I'm not sure there is a testing framework to test this but you can certainly write some code that tries to access the protected HashMap over and over with many threads. Unfortunately this is very hard to do reliably since, as @Bohemian mentions, there is no way to be sure how threads run and access the map, especially in concert.
e.g. forcing two threads to try to access and having exception thrown. Thanks for your help!
Yeah this won't happen for 2 reasons. As mentioned, there is no "forcing" of threads. You just don't have that level of control. Also, threads do not throw exceptions because of synchronization problems unless you are doing something other than synchronized(hashMap) { ... }. When a thread is holding the lock on the map, other threads will block until it releases the lock. This is hard to detect and control. If you add code to do the detection and thread control then you get into a Heisenberg situation where you will be affecting the thread behavior because of your monitoring code.
Testing proper synchronization is very difficult and often impossible to do. Reviewing code with other developers to make sure that your HashMap is fully synchronized every time it is used may be more productive.
Lastly, if you are worried about the HashMap then you should maybe consider moving to ConcurrentHashMap or Collections.synchronizedMap(new HashMap). These take care of the synchronization and protection of the actual map for you, although they don't handle race conditions if you are making multiple map calls within one operation. Btw, Hashtable is considered an old class and should not be used.
Hope this helps.

Essentially, you can't, because you can't control when threads are scheduled, let alone coordinate them to test a particular behaviour.
Secondly, not all build servers are multi-threaded (I got bitten by this only a couple of days ago: cheap AWS instances have only 1 CPU), so you can't rely on even having more than one thread available to test with.
Try to refactor your code so the locking part is separated from your application and test that logic in isolation... if you can.
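If you still want some automated signal, one hedged option is a stress probe like the sketch below. It is plain Java rather than JUnit so that it stays self-contained, and OverlapProbe, guardedPut and stress are hypothetical names. The probe counts how many threads are inside the guarded section at once. A result above 1 proves the locking is broken; a result of 1 proves nothing beyond "no overlap was observed this time", for the scheduling reasons given above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class OverlapProbe {
    private final Map<Integer, Integer> map = new HashMap<>(); // guarded by 'map'
    private final AtomicInteger inside = new AtomicInteger();
    private final AtomicInteger maxInside = new AtomicInteger();

    void guardedPut(int key, int value) {
        synchronized (map) {
            // Record how many threads are in the section right now.
            int now = inside.incrementAndGet();
            maxInside.accumulateAndGet(now, Math::max);
            map.put(key, value);
            inside.decrementAndGet();
        }
    }

    // Hammers guardedPut from several threads and returns the highest
    // number of threads ever observed inside the section together.
    static int stress(int threadCount, int iterations) {
        OverlapProbe probe = new OverlapProbe();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                try {
                    start.await(); // line all threads up before releasing them
                } catch (InterruptedException e) {
                    return;
                }
                for (int i = 0; i < iterations; i++) probe.guardedPut(i, i);
            });
            threads[t].start();
        }
        start.countDown();
        for (Thread th : threads) {
            try { th.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        }
        return probe.maxInside.get();
    }

    public static void main(String[] args) {
        System.out.println("max threads inside: " + stress(8, 2000));
    }
}
```

In a JUnit test the body of main would become an assertion that the returned value equals 1.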

As I understand it, you have code similar to this:
synchronized (myHashMap) {
    ...
}
... which means that a thread acquires the lock provided by myHashMap when it enters synchronized block and all other threads that try to enter the same block have to wait, i.e. no other thread can acquire the same lock.
Is there any way I can make a Junit test to make sure that a synchronized object (in my case HashMap in synchronized block) is not accessed by two threads simultaneously?
Knowing the above, why would you do that? If you still want to try then you might want to take a look at this answer.
And last, but not least: I'd recommend against using Hashtable just because it's synchronized. Use ConcurrentHashMap instead.

Java Thread Synchronization, best concurrent utility, read operation

I have a java threads related question.
To take a very simple example, let's say I have 2 threads.
Thread A running StockReader Class instance
Thread B running StockAvgDataCollector Class instance
In Thread B, StockAvgDataCollector collects some market Data continuously, does some heavy averaging/manipulation and updates a member variable spAvgData
In Thread A StockReader has access to StockAvgDataCollector instance and its member spAvgData using getspAvgData() method.
So Thread A does READ operation only and Thread B does READ/WRITE operations.
Questions
Now, do I need synchronization or atomic functionality or locking or any concurrency related stuff in this scenario? It doesn't matter if Thread A reads an older value.
Since Thread A is only going to READ and not update anything, and only Thread B does any WRITE operations, will there be any deadlock scenarios?
I've pasted a paragraph below from the following link. From that paragraph, it seems like I do need to worry about some sort of locking/synchronizing.
http://java.sun.com/developer/technicalArticles/J2SE/concurrency/
Reader/Writer Locks
When using a thread to read data from an object, you do not necessarily need to prevent another thread from reading data at the same time. So long as the threads are only reading and not changing data, there is no reason why they cannot read in parallel. The J2SE 5.0 java.util.concurrent.locks package provides classes that implement this type of locking. The ReadWriteLock interface maintains a pair of associated locks, one for read-only and one for writing. The readLock() may be held simultaneously by multiple reader threads, so long as there are no writers. The writeLock() is exclusive. While in theory, it is clear that the use of reader/writer locks to increase concurrency leads to performance improvements over using a mutual exclusion lock. However, this performance improvement will only be fully realized on a multi-processor and the frequency that the data is read compared to being modified as well as the duration of the read and write operations.
Which concurrent utility would be less expensive and suitable in my example?
java.util.concurrent.atomic ?
java.util.concurrent.locks ?
java.util.concurrent.ConcurrentLinkedQueue ? - In this case StockAvgDataCollector will add and StockReader will remove. No getspAvgData() method will be exposed.
Thanks
Amit

Well, the whole ReadWriteLock thing really makes sense when you have many readers and at least one writer, so that you guarantee liveness (you won't be blocking any reader threads if no other thread is writing). However, you have only two threads.
If you don't mind thread B reading an old (but not corrupted) value of spAvgData, then I would go for an AtomicDouble (which comes from Guava, as the JDK itself has no such class) or an AtomicReference, depending on spAvgData's datatype.
So the code would look like this:
// AtomicDouble is provided by Guava; it is not part of the JDK.
import com.google.common.util.concurrent.AtomicDouble;

public class A extends Thread {
    // spAvgData
    private final AtomicDouble spAvgData = new AtomicDouble(someDefaultValue);

    public void run() {
        while (compute) {
            // do intensive work
            // ...
            // done with work, update spAvgData
            spAvgData.set(resultOfComputation);
        }
    }

    public double getSpAvgData() {
        return spAvgData.get();
    }
}

// --------------

public class B {
    public void someMethod() {
        A a = new A();
        // after A being created, spAvgData contains a valid value (at least the default)
        a.start();
        while (read) {
            // loll around
            a.getSpAvgData();
        }
    }
}

Yes, synchronization is important and you need to consider two things: visibility of the spAvgData variable and atomicity of its update. To guarantee that thread A sees the writes thread B makes to spAvgData, the variable can be declared volatile or held in an AtomicReference. You also need to guard the update with synchronization or locking so that it is atomic if there are more invariants involved or if the update is a compound action. If only thread B is updating that variable and the update is a single write, then you don't need synchronization, and visibility should be enough for thread A to read the most up-to-date value of the variable.
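A JDK-only sketch of the single-writer case described above might look like this. The class name StockAvgDataCollector and the field spAvgData come from the question; the samples array, the demo method and the averaging logic are assumptions added so the example can run.

```java
class StockAvgDataCollector implements Runnable {
    // volatile: each write becomes visible to later reads in other threads,
    // and reads and writes of the double are atomic (no torn values).
    private volatile double spAvgData = 0.0;

    // Hypothetical stand-in for the incoming market data.
    private final double[] samples;

    StockAvgDataCollector(double[] samples) {
        this.samples = samples.clone();
    }

    @Override
    public void run() {
        double sum = 0;
        for (int i = 0; i < samples.length; i++) {
            sum += samples[i];
            // Single writer and a single plain write: no lock needed.
            spAvgData = sum / (i + 1);
        }
    }

    double getSpAvgData() {
        return spAvgData;
    }

    // Reader side: may see an older average while the writer runs,
    // and is guaranteed to see the final one after join().
    static double demo() {
        StockAvgDataCollector collector =
                new StockAvgDataCollector(new double[] { 1, 2, 3, 4 });
        Thread writer = new Thread(collector);
        writer.start();
        double last = collector.getSpAvgData(); // possibly stale, never corrupted
        try {
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        last = collector.getSpAvgData();
        return last;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

This only holds while the update is one write to one field. As soon as two fields must change together, volatile is no longer enough and a lock is needed.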

If you don't mind that Thread A can read complete nonsense (including partially updated data) then no, you don't need any synchronisation. However, I suspect that you should mind.
If you just use a single mutex, or ReentrantReadWriteLock and don't suspend or sleep without timeout while holding locks then there will be no deadlock. If you do perform unsafe thread operations, or try to roll your own synchronisation solution, then you will need to worry about it.
If you use a blocking queue then you will also need a constantly-running ingestion loop in StockReader. ReadWriteLock is still of benefit on a single core processor - the issues are the same whether the threads are physically running at the same time, or just interleaved by context switches.
If you don't use at least some form of synchronisation (e.g. a volatile) then your reader may never see any change at all.
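For the case where a reader could otherwise see partially updated data, here is a minimal sketch with ReentrantReadWriteLock. RunningAverage and its methods are hypothetical names for illustration. The writer changes two fields together under the write lock, and the reader computes from both under the read lock, so it never sees one field updated without the other.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class RunningAverage {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private double sum;  // guarded by lock
    private long count;  // guarded by lock

    void add(double sample) {
        lock.writeLock().lock();
        try {
            sum += sample; // both fields change together
            count++;
        } finally {
            lock.writeLock().unlock();
        }
    }

    double average() {
        lock.readLock().lock(); // many readers may hold this at once
        try {
            return count == 0 ? 0.0 : sum / count;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Every sample is 1.0, so a consistent read is always 0.0 (empty) or 1.0.
    // Returns the number of reads that were neither.
    static int demo() {
        RunningAverage avg = new RunningAverage();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) avg.add(1.0);
        });
        writer.start();
        int bad = 0;
        while (writer.isAlive()) {
            double a = avg.average();
            if (a != 0.0 && a != 1.0) bad++;
        }
        try { writer.join(); } catch (InterruptedException e) { throw new IllegalStateException(e); }
        if (avg.average() != 1.0) bad++;
        return bad;
    }

    public static void main(String[] args) {
        System.out.println("inconsistent reads: " + demo());
    }
}
```

Each lock is released in a finally block and no thread ever waits while holding one, which matches the no-deadlock condition described above.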

Threadsafe publishing of java object structure?

Assuming that I have the following code:
final Catalog catalog = createCatalog();
for (int i = 0; i < 100; i++) {
    new Thread(new CatalogWorker(catalog)).start();
}
"Catalog" is an object structure, and the method createCatalog() and the "Catalog" object structure has not been written with concurrency in mind. There are several non-final, non-volatile references within the product catalog, there may even be mutable state (but that's going to have to be handled)
The way I understand the memory model, this code is not thread-safe. Is there any simple way to make it safe ? (The generalized version of this problem is really about single-threaded construction of shared structures that are created before the threads explode into action)

No, there's no simple way to make it safe. Concurrent use of mutable data types is always tricky. In some situations, making each operation on Catalog synchronized (preferably on a privately-held lock) may work, but usually you'll find that a thread actually wants to perform multiple operations without risking any other threads messing around with things.
Just synchronizing every access to variables should be enough to make the Java memory model problem less relevant - you would always see the most recent values, for example - but the bigger problem itself is still significant.
Any immutable state in Catalog should be fine already: there's a "happens-before" between the construction of the Catalog and the new thread being started. From section 17.4.5 of the spec:
A call to start() on a thread happens-before any actions in the started thread.
(And the construction finishing happens before the call to start(), so the construction happens before any actions in the started thread.)

You need to synchronize every method that reads or changes the state of Catalog to make it thread-safe.
public synchronized <return type> method(<parameter list>) {
    ...
}

Assuming you handle the "non-final, non-volatile references [and] mutable state" (presumably by not actually mutating anything while these threads are running) then I believe this is thread-safe. From the JSR-133 FAQ:
When one action happens before another, the first is guaranteed to be ordered before and visible to the second. The rules of this ordering are as follows:
Each action in a thread happens before every action in that thread that comes later in the program's order.
An unlock on a monitor happens before every subsequent lock on that same monitor.
A write to a volatile field happens before every subsequent read of that same volatile.
A call to start() on a thread happens before any actions in the started thread.
All actions in a thread happen before any other thread successfully returns from a join() on that thread.
Since the threads are started after the call to createCatalog, the result of createCatalog should be visible to those threads without any problems. It's only changes to the Catalog objects that occur after start() is called on the thread that would cause trouble.
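The start() and join() rules can be seen in a small sketch. The Catalog class here is a hypothetical stand-in with no final fields, no volatile and no locks, and the "widget" entry is invented for the example. It is built on one thread and then only read by the workers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Catalog in the question: not written
// with concurrency in mind.
class Catalog {
    private Map<String, Integer> stock; // non-final, non-volatile

    void load() {
        stock = new HashMap<>();
        stock.put("widget", 7);
    }

    int stockOf(String item) {
        return stock.get(item);
    }
}

class CatalogDemo {
    static int run() {
        Catalog catalog = new Catalog();
        catalog.load(); // single-threaded construction, finished before any start()

        int[] seen = new int[4];
        Thread[] workers = new Thread[seen.length];
        for (int i = 0; i < workers.length; i++) {
            final int slot = i;
            // Workers only READ the catalog; nothing mutates it after start().
            workers[i] = new Thread(() -> seen[slot] = catalog.stockOf("widget"));
            workers[i].start(); // start() happens-before the worker's actions
        }

        int total = 0;
        for (int i = 0; i < workers.length; i++) {
            try {
                workers[i].join(); // join() makes the worker's write to seen[i] visible here
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
            total += seen[i];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The guarantee covers only what was written before start(). Any later change to the catalog would need its own synchronization, as the answers above point out.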
