Multithreaded debugging in java

Multithreaded debugging in java - java

I have a program that runs about 50 threads. I employ a producer consumer design pattern to communicate data between the threads. After the program has been running for a while, sometimes it freezes due to one of the BlockingQueue's I use to distribute data between the threads becomes full, and therefore the main distribution part of the program blocks when it tries to add data to this BlockingQueue. In other words, one of the threads stops for some reason and then the blockingQueue it uses to receive data becomes full.
How do I go about debugging this in an efficient manner? I have tried surrounding the content in all run() methods with catch(Exception e), but nothing is ever thrown. I develop in Java/IntelliJ.
Any thoughts, ideas or general guidelines?

"Debug it" by using a logger. I like SLF4J.
Set up log.debug statements before and after each critical operation. Use log.entering and log.exiting calls at the start and end of each method.
While you are 'debugging' you'll run your application with the logger set to a very low level (FINEST) then run your application and watch the logging statements to learn when it fails and what the state is when it fails.
Since you're worried about a threading issue, make sure your log format includes the thread name or number.

general guidelines?
I don't know if this applies to your situation, but a very important guideline is to never have locks being taken in different orders.
An example:
Thread 1:
ResourceA.lock();
ResourceB.lock();
...
ResourceB.unlock();
ResourceA.unlock();
Thread 2:
ResourceB.lock();
ResourceA.lock();
...
ResourceA.unlock();
ResourceB.unlock();
Now if thread 1 is interrupted when it already owns ResourceA but not yet ResourceB, and thread 2 is allowed to run, thread 2 will take ResourceB. Then thread 1 owns ResourceA and waits for ResourceB, and thread 2 owns ResourceB and waits for ResourceA, so you have a deadlock.

Related

Java, why need to use synchronization? instead of using a single thread?

While reading about Java synchronized, I just wondered, if the processing should be in synchronization, why not just creating a single thread (not main thread) and process one by one instead of creating multiple threads.
Because, by 'synchronized', all other threads will be just waiting except single running thread. It seems like the only single thread is working in the time.
Please advise me what I'm missing it.
I would very appreciate it if you could give some use cases.
I read an example, that example about accessing bank account from 2 ATM devices. but it makes me more confused, the blocking(Lock) should be done by the Database side, I think. and I think the 'synchronized' would not work in between multiple EC2 instances.
If my thinking is wrong, please fix me.

If all the code you run with several threads is within a synchronized block, then indeed it makes no difference vs. using a single thread.
However in general your code contains parts which can be run on several threads in parallel and parts which can't. The latter need synchronization but not the former. By using several threads you can speed up the "parallelisable" bits.

Let's consider the following use-case :
Your application is a internet browser game. Every player has a score and can click a button. Every time a player clicks the button, their score is increased and their opponent's is decreased. The first player to reach 10 wins.
As per the nature of the game, and to single a unique winner, you have to consider the two counters increase (and the check for the winner) atomically.
You'll have each player send clickEvents on their own thread and every event will be translated into the increase of the owner's counter, the check on whether the counter reached 10 and the decrease of the opponent's counter.
This is very easily done by synchronizing the method which handles modifying the counters : every concurrent thread will try to obtain the lock, and when they do, they'll execute the code (and finally release the lock).
The locking mechanism is pretty lightweight and only requires a single keyword of code.
If we follow your suggestion to implement another thread that will handle the execution, we'd have to implement the whole thread management logic (more code), to initialize that Thread (more resource) and even so, to guarantee fairness in the handling of events, you still need a way for your client threads to pass the event to your executor thread. The only way I see to do so, is to implement a BlockingQueue, which is also synchronized to prevent the race condition that naturally occurs when trying to add elements from two other thread.
I honnestly don't see a way to resolve this very simple use-case without synchronization (or implementing your own locking algorithm that basically does the same).

You can have a single thread and process one-by-one (and this is done), but there are considerable overheads in doing so and it does not remove the need for synchronization.
You are in a situation where you are starting with multiple threads (for example, you have lots of simultaneous web sessions). You want to do a part of the processing in a single thread - let's say updating some common structure with some new data. You need to pass the new data to the single thread - how do you get it there? You would have to use some kind of message queue (or an equivalent thing) and have the single thread pick requests off the message queue and that would have have to be synchronized anyway, plus there is the overhead of managing the queue, plus the issue that you need to get a reply back from the single thread asynchronously. So you are back to square one.
This technique is used where the processing you need to do is considerable and you don't want to block your main threads for a long time.
In summary: having a single thread does not remove the need for synchronization.

difference between wait() and yield()

So far what I have understood about wait() and yield () methods is that yield() is called when the thread is not carrying out any task and lets the CPU execute some other thread. wait() is used when some thread is put on hold and usually used in the concept of synchronization. However, I fail to understand the difference in their functionality and i'm not sure if what I have understood is right or wrong. Can someone please explain the difference between them(apart from the package they are present in).

aren't they both doing the same task - waiting so that other threads can execute?
Not even close, because yield() does not wait for anything.
Every thread can be in one of a number of different states: Running means that the thread is actually running on a CPU, Runnable means that nothing is preventing the thread from running except, maybe the availability of a CPU for it to run on. All of the other states can be lumped into a category called blocked. A blocked thread is a thread that is waiting for something to happen before it can become runnable.
The operating system preempts running threads on a regular basis: Every so often (between 10 times per second and 100 times per second on most operating systems) the OS tags each running thread and says, "your turn is up, go to the back of the run queue' (i.e., change state from running to runnable). Then it lets whatever thread is at the head of the run queue use that CPU (i.e., become running again).
When your program calls Thread.yield(), it's saying to the operating system, "I still have work to do, but it might not be as important as the work that some other thread is doing. Please send me to the back of the run queue right now." If there is an available CPU for the thread to run on though, then it effectively will just keep running (i.e., the yield() call will immediately return).
When your program calls foobar.wait() on the other hand, it's saying to the operating system, "Block me until some other thread calls foobar.notify().
Yielding was first implemented on non-preemptive operating systems and, in non-preemptive threading libraries. On a computer with only one CPU, the only way that more than one thread ever got to run was when the threads explicitly yielded to one another.
Yielding also was useful for busy waiting. That's where a thread waits for something to happen by sitting in a tight loop, testing the same condition over and over again. If the condition depended on some other thread to do some work, the waiting thread would yield() each time around the loop in order to let the other thread do its work.
Now that we have preemption and multiprocessor systems and libraries that provide us with higher-level synchronization objects, there is basically no reason why an application programs would need to call yield() anymore.

wait is for waiting on a condition. This might not jump into the eye when looking at the method as it is entirely up to you to define what kind of condition it is. But the API tries to force you to use it correctly by requiring that you own the monitor of the object on which you are waiting, which is necessary for a correct condition check in a multi-threaded environment.
So a correct use of wait looks like:
synchronized(object) {
while( ! /* your defined condition */)
object.wait();
/* execute other critical actions if needed */
}
And it must be paired with another thread executing code like:
synchronized(object) {
/* make your defined condition true */)
object.notify();
}
In contrast Thread.yield() is just a hint that your thread might release the CPU at this point of time. It’s not specified whether it actually does anything and, regardless of whether the CPU has been released or not, it has no impact on the semantics in respect to the memory model. In other words, it does not create any relationship to other threads which would be required for accessing shared variables correctly.
For example the following loop accessing sharedVariable (which is not declared volatile) might run forever without ever noticing updates made by other threads:
while(sharedVariable != expectedValue) Thread.yield();
While Thread.yield might help other threads to run (they will run anyway on most systems), it does not enforce re-reading the value of sharedVariable from the shared memory. Thus, without other constructs enforcing memory visibility, e.g. decaring sharedVariable as volatile, this loop is broken.

The first difference is that yield() is a Thread method , wait() is at the origins Object method inheritid in thread as for all classes , that in the shape, in the background (using java doc)
wait()
Causes the current thread to wait until another thread invokes the notify() method or the notifyAll() method for this object. In other words, this method behaves exactly as if it simply performs the call wait(0).
yield()
A hint to the scheduler that the current thread is willing to yield its current use of a processor. The scheduler is free to ignore this hint.
and here you can see the difference between yield() and wait()

Yield(): When a running thread is stopped to give its space to another thread with a high priority, this is called Yield.Here the running thread changes to runnable thread.
Wait(): A thread is waiting to get resources from a thread to continue its execution.

I've been taught that conditions in concurrency do not necessarily need to be written in a loop, against what the oracle doc says

So basically I am learning a bit more serious concurrency (studying how things actually work, instead of just using random stuff if needed).
And my proffesor, when I asked him about this, said me that he and his colleagues hadn't been able to reproduce a spurious wake up, and believes that line is an old line not deleted (like, it was there, java got "better", it's not longer needed, the line is still there), and that is not the case.
Link:
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Condition.html
It's right below the point called:
Implementation Considerations
In his opinion, a condition that looked kind of like this:
lock.lock()
if (p>q) {
lock.newCondition().await
}
Would be perfectly fine, since he says a spurious wake up can not happen, it wouldn't be needed a loop:
lock.lock()
while (p>q) {
lock.newCondition().await
}
I am MORE than likely mixing things and understanding both the doc and my teacher the wrong way, but I do have spent some time trying to understand why each thing, and can't come with an "answer" of my own, I either believe one or the other (not like it matters, it's pure I-want-to-learn).
My teacher does spend time telling us how explaining concurrency in java it's pretty silly, but I didn't choose it either, so there's that.

Would be perfectly fine, since he says a spurious wake up can not happen, it wouldnt be needed a loop:
Your teacher is wrong for two reasons:
Spurious wakeups do happen. It may not happen on the architecture that they tested on but if you don't take it into account, when you move your application to a different piece of hardware or a different OS revision, you will see problems. It may also be that the spurious interrupts happen occasionally during an exceptional kernel event such as a signal getting delivered at precisely the wrong time. Again, your application may run fine in testing but when you move it into production with a lot higher load, the frequency of the exceptional event may increase...
The underlying problem is that certain native thread implementations may choose to wakeup all conditions associated with an application instead of the specific one that was notified. This is well documented in the javadocs for Object.wait():
As in the one argument version, interrupts and spurious wakeups are possible, and this method should always be used in a loop:
Here's one example of an architecture that has this limitation. I'll quote from this interesting blog entry:
Internally, wait is implemented as a call to the 'futex' system call. Each blocking system call on Linux returns abruptly when the process receives a signal -- because calling signal handler from kernel call is tricky. What if the signal handler calls some other system function? And a new signal arrives? It's easy to run out of kernel stack for a process. Exactly because each system call can be interrupted, when glibc calls any blocking function, like 'read', it does it in a loop, and if 'read' returns EINTR, calls 'read' again.
The while loop is also very important to protect against race conditions -- especially in multiple thread producer/consumer models. If you have multiple threads that are consuming from a queue (for example), a notification that there are items in the queue may wakeup a thread but by the time it is able to get the lock, another thread has already dequeued the item.
This is well documented on my page here with a sample program that demonstrates the race condition without the use of while.
Producer Consumer Thread Race Conditions
In your example, thread A may be waiting in await() while another thread B may be waiting to get the lock(). Thread C has the lock and is adding to the queue.
// B is here waiting for the lock
lock.lock()
while (p > q) {
// A is here waiting for the signal
lock.newCondition().await();
}
// dequeue
lock.unlock();
Then if the producer adds something to the queue and calls signal() the thread A moves from the WAIT state to the BLOCKED state to get the lock itself. But it may be behind thread B which is already waiting. Once the lock is released, thread B dequeues the element, not thread A. When thread A then gets a chance to dequeue, the queue is empty. Without the while loop, you can get out-of-bounds exceptions or other problems by trying to dequeue from an empty queue.
See my link for more explicit details of the race.

It is still necessary. Your professor is not necessarily incorrect, but has created a strawman argument to knock down.
There are two reasons why you must protect your conditions in a loop.
The first is spurious wake-up. Your professor seems to have been unable to reproduce this, and it may likely not be a problem on the platforms he tests on. This does not mean it is unreproduceable on all platforms.
The second is that between the times that you wake up and actually go to do the logic, the condition may no longer be true. You must guard against this potential race condition. This is also notoriously difficult to reproduce in the lab, and will probably only happen in bizarre circumstances in production.

Java multithreading in CPU load

I have a bit of an issue with an application running multiple Java threads.
The application runs a number of working threads that peek continuously at an input queue and if there are messages in the queue they pull them out and process them.
Among those working threads there is another verification thread scheduled to perform at a fixed period a check to see if the host (on which the application runs) is still in "good shape" to run the application. This thread updates an AtomicBoolean value which in turn is verified by the working thread before they start peeking to see if the host is OK.
My problem is that in cases with high CPU load the thread responsible with the verification will take longer because it has to compete with all the other threads. If the AtomicBoolean does not get updated after a certain period it is automatically set to false, causing me a nasty bottleneck.
My initial approach was to increase the priority of the verification thread, but digging into it deeper I found that this is not a guaranteed behavior and an algorithm shouldn't rely on thread priority to function correctly.
Anyone got any alternative ideas? Thanks!

Instead of peeking into a regular queue data structure, use the java.util.concurrent package's LinkedBlockingQueue.
What you can do is, run an pool of threads (you could use executer service's fixed thread pool, i.e., a number of workers of your choice) and do LinkedBlockingQueue.take().
If a message arrives at the queue, it is fed to one of the waiting threads (yeah, take does block the thread until there is something to be fed with).
Java API Reference for Linked Blocking Queue's take method
HTH.

One old school approach to throttling rate of work, that does not use a health check thread at all (and so by-passes these problems) is to block or reject requests to add to the queue if the queue is longer than say 100. This applies dynamic back pressure on to the clients generating the load, slowing them down when the worker threads are over loaded.
This approach was added to the Java 1.5 library, see java.util.concurrent.ArrayBlockingQueue. Its put(o) method blocks if the queue is full.

Are u using Executor framework (from Java's concurrency package)? If not give it a shot. You could try using ScheduledExecutorService for the verification thread.

More threads does not mean better performance. Usually if you have dual core, 2 threads gives best performance, 3 or more starts getting worse. Quad core should handle 4 threads best, etc. So be careful how much threads you use.
You can put the other threads to sleep after they perform their work, and allow other threads to do their part. I believe Thread.yield() will pause the current thread to give time to other threads.
If you want your thread to run continuously, I would suggest creating two main threads, thread A and B. Use A for the verification thread, and from B, create the other threads. Therefore thread A gets more execution time.

Seems you need to utilize Condition variables. Peeking will take cpu cycles.
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/Condition.html

Thread behaviour

From a Thread perspective, what is a block, wait and lock? Rather,is it necessary to have all these three in any operation? For example, in a producer-consumer pattern how this things are implemented.
Thanks in advance

A blocking operation is one that blocks the thread until the operation completes. Blocking a thread is the process of telling the thread scheduler (usually the operating system, although there are user-level thread libraries) not to run a thread until that thread is woken up. There are many kinds of blocking operations, and one example is file I/O. As with any other blocking operation, the method doesn't return until the relevant operation (in this case, file I/O) has completed.
A wait is a particular kind of blocking operation used for thread synchronization. Specifically, it says "please block the thread that called wait until some other thread wakes it up." In Java, wait is a method. The corresponding wake-up method is notify.
A lock is a higher-level abstraction that says "only allow a limited number of threads into this region of code." Most commonly, that limited number is 1, in which case a mutex (which I explain in plenty of detail in this SO answer) is the preferred locking primitive in a lower-level language like C. In Java, the most common locking primitive is called a monitor. There is a notion of owning an object's monitor (every object has a monitor), and waiting on a monitor, and waking up a thread that is waiting on a monitor. How do we accomplish this? You guessed it - we use the wait method to wait on a monitor, and notify to wake up one of the threads that is waiting on the monitor.
Now an answer that will probably sound a bit like Greek, given that you are just starting with concurrency: To implement the producer-consumer pattern, the most common strategy is to use two semaphores (plus a mutex to synchronize access to the buffer). A semaphore is usually implemented with a mutex, but is a higher-order construct because it allows counting some resource. So you keep one semaphore to count the number of items in the buffer, and one to count the number of empty spaces in the buffer. The producer waits on the empty space semaphore and adds items to the buffer whenever space becomes available, and the consumer waits on the items semaphore and consumes an item whenever an item becomes available.
Now I've defined what these things are, but I haven't really talked about how to use them. That, however, is worth several lectures in a college course, and is certainly too much for a StackOverflow answer. I'd recommend the concurrency lessons in the Java tutorials as a way to get started with threading. Also, look up college courses on the web. Many schools post notes publicly online, so with a little searching you can often find high-quality material.
EDIT: A description of the difference between wait and blocking I/O
Before you begin reading this, make sure you're familiar with what a thread is, and what a process is. I give an explanation in the first four paragraphs of this SO answer, and Wikipedia has a more detailed explanation (albeit with less historical context).
Each thread has one very important piece of information: an instruction pointer (there are other important pieces of information associated with each thread, but they aren't important now). The instruction pointer is a JVM-maintained pointer to the currently executing bytecode instruction. Every time you execute an instruction (each instruction is an abstract representation of a very simple operation, such as "call method foo on object x), the instruction pointer is moved forward to some "next instruction." To run your program, the JVM sets the instruction pointer to the beginning of main and keeps executing instructions and moving the instruction pointer forward until the program exits somehow.
A blocking operation stops the instruction pointer from moving forward until some event occurs to cause the instruction pointer to move forward again. Certainly the thread that initiated the blocking operation can't make this event happen, because that thread's instruction pointer isn't moving forward i.e. that thread is doing nothing.
Now, there are a lot of different kinds of blocking operations. One is blocking I/O. If you call System.out.println, for example, the println method doesn't return until the text is written out to the console. In this case, the instruction pointer stops somewhere inside System.out.println, and the operating system signals the thread to wake up whenever the console printing finishes. So the thread doesn't have to start its own instruction pointer moving again, but the method still returns just after the text is written to the console. So, at a very high level:
Thread 0 calls System.out.println("foo")
Thread 0's instruction pointer stops moving while the operating system writes "foo" to the console
When the operating system is done writing to the console, it notifies the JVM, and the JVM automatically starts moving thread 0's instruction pointer moving again. All of this happens without the programmer who writes System.out.println having to think about it.
Another completely separate kind of blocking operation is encapsulated in the Object.wait method. Whenever a thread calls Object.wait, that thread's instruction pointer stops moving, but instead of the operating system starting the movement of the instruction pointer again, another thread does the job. In this case, there is no external event that will cause the thread's instruction pointer to be restarted (as in the blocking I/O case), but there is an event internal to the program. As I said, another thread will start the instruction pointer moving again by calling Object.notify. So, at a very high level:
Thread 0 calls x.wait() on some object
Thread 0's instruction pointer stops moving
Thread 1 calls x.notify() on the same object x
Thread 0's instruction pointer starts moving again
Thread 0 and thread 1 are now executing concurrently
Notice that a lot more work has to go into writing wait/notify code correctly - the JVM and the operating system don't do all the work for you this time. They still actually do most of the work for you, but you actually have to think about calling wait and notify, and how they allow you to communicate between threads, implement locks, and more.
So there are two morals to this story. The first is that blocking I/O and wait are completely different beasts. In both cases, a thread is blocked, but in the blocking I/O case the thread is woken up automatically by the operating system, while in the wait case the thread has to rely on another thread calling notify in order to wake it up. The second is that concurrent programming is harder to reason about than serial programming. The toy examples I've put in this answer don't really do the second point justice.

No, you don't necessarily need a lock or a wait just because you're using threads. However, if you want the threads to exchange data, they are often useful.
Here's a good explanation with an example of the consumer producer model:
http://www.ase.md/~aursu/JavaThreadsSynchronization.html
Cheers!

Block : Prevent the Executing.
Wait : Suspends the current thread.
Lock : When you lock it others can't Use it.
Consider online purchase when a customer buys a Movie Ticket
As soon as he chooses the seat. Others won't be able to get those seat at the same time(Locking those seats).

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.