Working on a profiler of my own, I would like to explain what I see. There are some default threads which always appear, even in the simplest program:
DestroyJavaVM
Signal Dispatcher
Finalizer
Reference Handler
Although their names are quite self-documenting, I would like to get a little bit more information. It seems these threads are not documented, does someone know a source to dig for these information or even knows exactly what these threads do?
DestroyJavaVM is a thread that unloads the Java VM on program exit. Most of the time it should be waiting, until the apocalypse of your VM.
Signal Dispatcher is a thread that handles the native signals sent by the OS to your jvm.
The Finalizer thread pulls objects from the finalization queue and calls their finalize() methods.
Reference Handler is a high-priority thread that enqueues pending References. It's defined in java/lang/ref/Reference.java (as a nested class of java.lang.ref.Reference).
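If you want to see these threads from inside the program rather than from a profiler, something like the following sketch can help. It only uses Thread.getAllStackTraces(); the class name ThreadLister is made up for illustration. Which system threads show up depends on the JVM and version, and as far as I know DestroyJavaVM on HotSpot only shows up once main has returned while other non-daemon threads keep running, so treat the exact list as an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: lists the live threads the JVM is willing to report.
class ThreadLister {
    // Returns thread name -> state for every live thread reported by the JVM.
    static Map<String, Thread.State> snapshot() {
        Map<String, Thread.State> result = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            result.put(t.getName(), t.getState());
        }
        return result;
    }

    public static void main(String[] args) {
        // Typically prints entries such as Finalizer, Reference Handler and Signal Dispatcher.
        snapshot().forEach((name, state) -> System.out.println(name + " : " + state));
    }
}
```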
The definition of a blocking method is very clear. There is still something that puzzles me. In a Java program, if I create a thread and try to take() from an empty BlockingQueue, that thread goes into the WAITING state according to the debugger. This is as expected.
On the other hand, if I create a thread and call the accept() method of the ServerSocket class (this is also a blocking call according to the JavaDoc), I see that this thread is always in the RUNNING state.
I am expecting a blocking method to be parked with monitors in Java. If a method is blocking like ServerSocket::accept, how come the thread does not get past the accept line and still has the status RUNNING?
There's the concept of 'a blocking call' / 'a blocking method', as one might use in conversation, or as a tutorial might use, or even as a javadoc might use it.
Then there is the specific and extremely precisely defined java.lang.Thread.State value BLOCKED.
The two concepts do not align, as you've already figured out with your test. The BLOCKED state effectively means 'blocked by way of this specific list of mechanisms that can block' (mostly, waiting to acquire a monitor, i.e. what happens when you enter a synchronized(x) block or try to pick up again from an x.wait() call). In both cases the thread needs to become 'the thread' that owns the lock on the object x is pointing at, and if it can't do that because another thread holds it, then the thread's state becomes BLOCKED.
This is spelled out in the javadoc. Here's the quote:
A thread that is blocked waiting for a monitor lock is in this state.
('monitor lock' is JVM-ese for the mechanism that synchronized and obj.wait/notify/notifyAll work with, and nothing else).
Keep reading the full javadoc of that page, as the detailed descriptions of these states usually spell out precisely which methods can cause these states.
This lets you figure out that if you write this code:
synchronized (foo) {
    foo.wait();
}
then that thread goes through these states (in the worst case scenario):
RUNNABLE -> BLOCKED (another thread is in a synchronized(foo) block already).
BLOCKED -> RUNNABLE (that other thread is done)
RUNNABLE -> WAITING (foo.wait() is called, now waiting for a notify)
WAITING -> BLOCKED (we've been notified, but the thread still cannot continue until that monitor is picked up again, that's how wait and notify work).
BLOCKED -> RUNNABLE (got the lock on foo)
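Here is a small sketch that makes two of those states observable with Thread.getState(). The class and method names (StateDemo, awaitState, observe) are invented for this example, and it polls in a loop purely to keep the demo deterministic; it needs Java 9+ for Thread.onSpinWait().

```java
// Hypothetical demo: observe BLOCKED and WAITING on a thread that runs synchronized (foo) { foo.wait(); }
class StateDemo {
    // Polls until the thread reports the wanted state.
    static void awaitState(Thread t, Thread.State wanted) {
        while (t.getState() != wanted) {
            Thread.onSpinWait();
        }
    }

    // Returns { state while another thread holds foo's monitor, state after foo.wait() was called }.
    static Thread.State[] observe() {
        final Object foo = new Object();
        Thread worker = new Thread(() -> {
            synchronized (foo) {
                try {
                    foo.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        worker.setDaemon(true);

        Thread.State whileLockHeld;
        synchronized (foo) {
            worker.start();                            // worker cannot enter synchronized (foo) yet
            awaitState(worker, Thread.State.BLOCKED);
            whileLockHeld = worker.getState();
        }
        awaitState(worker, Thread.State.WAITING);      // worker got the monitor and called foo.wait()
        Thread.State afterWait = worker.getState();

        synchronized (foo) {
            foo.notifyAll();                           // let the worker finish
        }
        return new Thread.State[] { whileLockHeld, afterWait };
    }

    public static void main(String[] args) {
        for (Thread.State state : observe()) {
            System.out.println(state);
        }
    }
}
```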
So why is my I/O thing shown as running (RUNNABLE) then?
Unfortunately, I/O-related blocking is highly undefined behaviour.
However, I can explain a common scenario (i.e. what most combinations of OS, hardware, and JVM provider end up doing).
The reason for the undefined behaviour is the same as the reason for the RUNNABLE state: Java is supposed to run on a lot of hardware/operating system combos, and most of the I/O is, in the end, just stuff that java 'farms out' to the OS, and the OS does whatever the OS is going to do. Java is not itself managing the state of these threads, it just calls on the OS to do a thing, and then the OS ends up blocking, waiting, etc. Java doesn't need to manage it; not all OSes even allow java to attempt to update this state, and in any case there'd be no point whatsoever to it for the java process: it would just slow things down, and add to the pile of 'code that needs to be custom written for every OS that java should run on', and that's a pile you'd prefer remain quite small. The only benefit would be for you to write code that can programmatically inspect thread states... and that's more a job for agents, not for java code running inside the same VM.
But, as I said, undefined, mostly. Don't go relying on the fact that x.read() on a socket's InputStream will keep the thread in RUNNABLE state.
Similar story when you try to interrupt() a thread that is currently waiting in I/O mode. That means the I/O call that is currently waiting for data might exit immediately with some IOException (not InterruptedException, though, that is guaranteed; InterruptedException is checked, InputStream.read() isn't declared to throw it, therefore, it won't) - or, it might do nothing at all. Depends on OS, java version, hardware, vendor, etc.
So the thread states don’t match up with OS thread states. They are defined in https://docs.oracle.com/javase/8/docs/api/java/lang/Thread.State.html:
public static enum Thread.State
extends Enum<Thread.State>
A thread state. A thread can be in one of the following states:
NEW
A thread that has not yet started is in this state.
RUNNABLE
A thread executing in the Java virtual machine is in this state.
BLOCKED
A thread that is blocked waiting for a monitor lock is in this state.
WAITING
A thread that is waiting indefinitely for another thread to perform a particular action is in this state.
TIMED_WAITING
A thread that is waiting for another thread to perform an action for up to a specified waiting time is in this state.
TERMINATED
A thread that has exited is in this state.
A thread can be in only one state at a given point in time. These states are virtual machine states which do not reflect any operating system thread states.
When we say something is blocked or waiting, we have broad ideas about what that means. But blocked here doesn’t mean blocked on I/O, it doesn’t mean blocked trying to acquire a ReentrantLock, it specifically means blocked trying to acquire a monitor lock. That is why your socket accept call shows the thread as running: the definitions are very narrow. Read the rest of the linked Javadoc, it is extremely specific about what qualifies as a given state.
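You can check this yourself with a sketch like the one below. AcceptStateDemo and stateWhileAccepting are names made up for the example, and the 200 ms sleep is a crude way to give the thread time to reach accept(). On the JVMs I'd expect most people to run, this prints RUNNABLE, but as said above that is not something the spec promises.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical demo: what Thread.getState() reports for a thread sitting in ServerSocket.accept().
class AcceptStateDemo {
    static Thread.State stateWhileAccepting() {
        try (ServerSocket server = new ServerSocket(0)) { // port 0 = any free port
            Thread acceptor = new Thread(() -> {
                try {
                    server.accept(); // blocks until a client connects or the socket is closed
                } catch (IOException closedWhileWaiting) {
                    // closing the server socket ends the pending accept() call
                }
            });
            acceptor.setDaemon(true);
            acceptor.start();
            Thread.sleep(200); // crude: give the thread time to reach accept()
            return acceptor.getState();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("state during accept(): " + stateWhileAccepting());
    }
}
```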
I know that threads have a message queue and handlers are able to push runnables or messages to them, but when I profile my android application using Android Studio tools, there is a strange process:
android.os.MessageQueue.nativePollOnce
It uses the CPU more than all the other processes. What is it and how can I reduce the time that the CPU spends on it?
You can find the profiler result below.
Short answer:
The nativePollOnce method is used to "wait" till the next Message becomes available. If the time spent during this call is long, your main (UI) thread has no real work to do and waits for next events to process. There's no need to worry about that.
Explanation:
Because the "main" thread is responsible for drawing UI and handling various events, its Runnable has a loop which processes all these events.
The loop is managed by a Looper and its job is quite straightforward: it processes all Messages in the MessageQueue.
A Message is added to the queue, for example, in response to input events, as a frame rendering callback, or by your own Handler.post calls. Sometimes the main thread has no work to do (that is, no messages in the queue), which may happen e.g. just after it finishes rendering a single frame (the thread has just drawn one frame and is ready for the next one; it just waits for the proper time).
Two Java methods in the MessageQueue class are interesting to us: Message next() and boolean enqueueMessage(Message, long). Message next(), as its name suggests, takes and returns the next Message from the queue. If the queue is empty (and there's nothing to return), the method calls native void nativePollOnce(long, int), which blocks until a new message is added.
At this point you might ask how nativePollOnce knows when to wake up. That's a very good question. When a Message is added to the queue, the framework calls the enqueueMessage method, which not only inserts the message into the queue, but also calls native static void nativeWake(long) if there's a need to wake up the queue.
The core magic of nativePollOnce and nativeWake happens in the native (actually, C++) code. The native MessageQueue uses the Linux epoll mechanism, which allows monitoring a file descriptor for I/O events. nativePollOnce calls epoll_wait on a certain file descriptor, whereas nativeWake writes to that descriptor, which is one of the I/O operations epoll_wait waits for. The kernel then takes the epoll-waiting thread out of the waiting state and the thread proceeds with handling the new message.
If you're familiar with Java's Object.wait() and Object.notify() methods, you can imagine that nativePollOnce is a rough equivalent of Object.wait() and nativeWake of Object.notify(), except they're implemented completely differently: nativePollOnce uses epoll and Object.wait() uses the Linux futex call.
It's worth noticing that neither nativePollOnce nor Object.wait() waste CPU cycles, as when a thread enters either method, it becomes disabled for thread scheduling purposes (quoting the javadoc for the Object class). However, some profilers may mistakenly recognize epoll-waiting (or even Object-waiting) threads as running and consuming CPU time, which is incorrect. If those methods actually wasted CPU cycles, all idle apps would use 100% of the CPU, heating and slowing down the device.
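To make the wait/notify analogy concrete, here is a toy queue in plain Java. To be clear, this is not how Android's MessageQueue is implemented (that uses epoll in native code, as described above); TinyMessageQueue is an invented class that only mirrors the shape of next() and enqueueMessage().

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy analogue of the looper pattern; NOT Android's implementation.
class TinyMessageQueue {
    private final Queue<String> messages = new ArrayDeque<>();

    // Like next(): sleeps without burning CPU while the queue is empty.
    synchronized String next() throws InterruptedException {
        while (messages.isEmpty()) {
            wait(); // plays the role of nativePollOnce
        }
        return messages.remove();
    }

    // Like enqueueMessage(): adds the message and wakes a sleeping consumer.
    synchronized void enqueueMessage(String message) {
        messages.add(message);
        notifyAll(); // plays the role of nativeWake
    }

    public static void main(String[] args) throws InterruptedException {
        TinyMessageQueue queue = new TinyMessageQueue();
        Thread looper = new Thread(() -> {
            try {
                while (true) {
                    String message = queue.next();
                    if (message.equals("quit")) {
                        return;
                    }
                    System.out.println("handling " + message);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        looper.start();
        queue.enqueueMessage("draw frame");
        queue.enqueueMessage("quit");
        looper.join();
    }
}
```

A profiler that samples the looper thread while the queue is empty would find it inside next(), which is the same picture as seeing nativePollOnce at the top of the main thread.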
Conclusion:
You shouldn't worry about nativePollOnce. It just indicates that processing of all Messages has been finished and the thread waits for the next one. Well, that simply means you don't give too much work to your main thread ;)
So far what I have understood about the wait() and yield() methods is that yield() is called when the thread is not carrying out any task and lets the CPU execute some other thread. wait() is used when some thread is put on hold, and is usually used in the context of synchronization. However, I fail to understand the difference in their functionality and I'm not sure if what I have understood is right or wrong. Can someone please explain the difference between them (apart from the package they are present in)?
Aren't they both doing the same task - waiting so that other threads can execute?
Not even close, because yield() does not wait for anything.
Every thread can be in one of a number of different states: Running means that the thread is actually running on a CPU, Runnable means that nothing is preventing the thread from running except, maybe the availability of a CPU for it to run on. All of the other states can be lumped into a category called blocked. A blocked thread is a thread that is waiting for something to happen before it can become runnable.
The operating system preempts running threads on a regular basis: Every so often (between 10 times per second and 100 times per second on most operating systems) the OS tags each running thread and says, "your turn is up, go to the back of the run queue" (i.e., change state from running to runnable). Then it lets whatever thread is at the head of the run queue use that CPU (i.e., become running again).
When your program calls Thread.yield(), it's saying to the operating system, "I still have work to do, but it might not be as important as the work that some other thread is doing. Please send me to the back of the run queue right now." If there is an available CPU for the thread to run on though, then it effectively will just keep running (i.e., the yield() call will immediately return).
When your program calls foobar.wait() on the other hand, it's saying to the operating system, "Block me until some other thread calls foobar.notify()."
Yielding was first implemented on non-preemptive operating systems and in non-preemptive threading libraries. On a computer with only one CPU, the only way that more than one thread ever got to run was when the threads explicitly yielded to one another.
Yielding also was useful for busy waiting. That's where a thread waits for something to happen by sitting in a tight loop, testing the same condition over and over again. If the condition depended on some other thread to do some work, the waiting thread would yield() each time around the loop in order to let the other thread do its work.
Now that we have preemption and multiprocessor systems and libraries that provide us with higher-level synchronization objects, there is basically no reason why an application program would need to call yield() anymore.
wait is for waiting on a condition. This might not be obvious when looking at the method, as it is entirely up to you to define what kind of condition it is. But the API tries to force you to use it correctly by requiring that you own the monitor of the object on which you are waiting, which is necessary for a correct condition check in a multi-threaded environment.
So a correct use of wait looks like:
synchronized(object) {
    while( ! /* your defined condition */)
        object.wait();
    /* execute other critical actions if needed */
}
And it must be paired with another thread executing code like:
synchronized(object) {
    /* make your defined condition true */
    object.notify();
}
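Filled in with a concrete condition, the two fragments above could look like the following sketch. Gate, awaitOpen, open and demo are names invented here; the condition is simply a boolean field, and notifyAll() is used instead of notify() so that every waiting thread gets to re-check it.

```java
// Hypothetical example of the wait/notify pattern with an explicit condition.
class Gate {
    private boolean open = false; // the condition, guarded by this object's monitor

    // Waiting side: re-check the condition in a loop while holding the monitor.
    synchronized void awaitOpen() throws InterruptedException {
        while (!open) {
            wait();
        }
    }

    // Notifying side: change the condition and notify while holding the same monitor.
    synchronized void open() {
        open = true;
        notifyAll();
    }

    // Starts a waiter, opens the gate, and reports whether the waiter finished.
    static boolean demo() {
        Gate gate = new Gate();
        Thread waiter = new Thread(() -> {
            try {
                gate.awaitOpen();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        gate.open();
        try {
            waiter.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("waiter finished: " + demo());
    }
}
```

Because the waiter checks the flag before calling wait(), it does not matter whether open() runs before or after awaitOpen(); that is the point of tying the wait to a condition.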
In contrast, Thread.yield() is just a hint that your thread might release the CPU at this point in time. It’s not specified whether it actually does anything and, regardless of whether the CPU has been released or not, it has no impact on the semantics with respect to the memory model. In other words, it does not create any happens-before relationship to other threads, which would be required for accessing shared variables correctly.
For example the following loop accessing sharedVariable (which is not declared volatile) might run forever without ever noticing updates made by other threads:
while(sharedVariable != expectedValue) Thread.yield();
While Thread.yield might help other threads to run (they will run anyway on most systems), it does not enforce re-reading the value of sharedVariable from the shared memory. Thus, without other constructs enforcing memory visibility, e.g. declaring sharedVariable as volatile, this loop is broken.
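For completeness, a sketch of the repaired loop with the variable declared volatile. VisibilityDemo and waitForValue are made-up names; the yield() call is kept only to mirror the loop above, it is the volatile that makes the loop correct.

```java
// Hypothetical demo: the busy-wait loop becomes correct once the variable is volatile.
class VisibilityDemo {
    private static volatile int sharedVariable = 0; // volatile: writes become visible to readers

    // Spins until another thread has published expectedValue, then returns what was read.
    static int waitForValue(int expectedValue) {
        Thread writer = new Thread(() -> sharedVariable = expectedValue);
        writer.start();
        while (sharedVariable != expectedValue) {
            Thread.yield(); // only a scheduling hint; visibility comes from volatile
        }
        return sharedVariable;
    }

    public static void main(String[] args) {
        System.out.println("saw value " + waitForValue(42));
    }
}
```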
The first difference is that yield() is a method of Thread, whereas wait() is originally a method of Object, inherited by Thread as by every other class. That is the difference in form; as for behaviour, quoting the Javadoc:
wait()
Causes the current thread to wait until another thread invokes the notify() method or the notifyAll() method for this object. In other words, this method behaves exactly as if it simply performs the call wait(0).
yield()
A hint to the scheduler that the current thread is willing to yield its current use of a processor. The scheduler is free to ignore this hint.
And here you can see the difference between yield() and wait():
yield(): When a running thread steps aside to give its place to another thread with a high priority, this is called yielding. Here the running thread becomes a runnable thread.
wait(): A thread is waiting for another thread to provide the resources it needs to continue its execution.
From a Thread perspective, what is a block, wait and lock? Rather, is it necessary to have all three in any operation? For example, in a producer-consumer pattern, how are these things implemented?
Thanks in advance
A blocking operation is one that blocks the thread until the operation completes. Blocking a thread is the process of telling the thread scheduler (usually the operating system, although there are user-level thread libraries) not to run a thread until that thread is woken up. There are many kinds of blocking operations, and one example is file I/O. As with any other blocking operation, the method doesn't return until the relevant operation (in this case, file I/O) has completed.
A wait is a particular kind of blocking operation used for thread synchronization. Specifically, it says "please block the thread that called wait until some other thread wakes it up." In Java, wait is a method. The corresponding wake-up method is notify.
A lock is a higher-level abstraction that says "only allow a limited number of threads into this region of code." Most commonly, that limited number is 1, in which case a mutex (which I explain in plenty of detail in this SO answer) is the preferred locking primitive in a lower-level language like C. In Java, the most common locking primitive is called a monitor. There is a notion of owning an object's monitor (every object has a monitor), and waiting on a monitor, and waking up a thread that is waiting on a monitor. How do we accomplish this? You guessed it - we use the wait method to wait on a monitor, and notify to wake up one of the threads that is waiting on the monitor.
Now an answer that will probably sound a bit like Greek, given that you are just starting with concurrency: To implement the producer-consumer pattern, the most common strategy is to use two semaphores (plus a mutex to synchronize access to the buffer). A semaphore is usually implemented with a mutex, but is a higher-order construct because it allows counting some resource. So you keep one semaphore to count the number of items in the buffer, and one to count the number of empty spaces in the buffer. The producer waits on the empty space semaphore and adds items to the buffer whenever space becomes available, and the consumer waits on the items semaphore and consumes an item whenever an item becomes available.
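As a rough illustration of that two-semaphore strategy, here is a sketch using java.util.concurrent.Semaphore. BoundedBuffer and sumThroughBuffer are names I made up for the example, and a third, binary semaphore stands in for the mutex. In real Java code you would normally reach for a BlockingQueue instead of writing this yourself.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Semaphore;

// Hypothetical bounded buffer: one semaphore counts items, one counts free spaces.
class BoundedBuffer {
    private final Deque<Integer> buffer = new ArrayDeque<>();
    private final Semaphore items = new Semaphore(0);  // number of items in the buffer
    private final Semaphore spaces;                    // number of empty slots
    private final Semaphore mutex = new Semaphore(1);  // binary semaphore guarding the buffer

    BoundedBuffer(int capacity) {
        spaces = new Semaphore(capacity);
    }

    void put(int value) throws InterruptedException {
        spaces.acquire();              // wait for an empty slot
        mutex.acquire();
        try {
            buffer.addLast(value);
        } finally {
            mutex.release();
        }
        items.release();               // signal: one more item available
    }

    int take() throws InterruptedException {
        items.acquire();               // wait for an item
        mutex.acquire();
        int value;
        try {
            value = buffer.removeFirst();
        } finally {
            mutex.release();
        }
        spaces.release();              // signal: one more empty slot
        return value;
    }

    // Pushes 1..count through a buffer of capacity 2 and returns the sum the consumer saw.
    static int sumThroughBuffer(int count) {
        BoundedBuffer shared = new BoundedBuffer(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) {
                    shared.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < count; i++) {
                sum += shared.take();
            }
            producer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumThroughBuffer(10));
    }
}
```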
Now I've defined what these things are, but I haven't really talked about how to use them. That, however, is worth several lectures in a college course, and is certainly too much for a StackOverflow answer. I'd recommend the concurrency lessons in the Java tutorials as a way to get started with threading. Also, look up college courses on the web. Many schools post notes publicly online, so with a little searching you can often find high-quality material.
EDIT: A description of the difference between wait and blocking I/O
Before you begin reading this, make sure you're familiar with what a thread is, and what a process is. I give an explanation in the first four paragraphs of this SO answer, and Wikipedia has a more detailed explanation (albeit with less historical context).
Each thread has one very important piece of information: an instruction pointer (there are other important pieces of information associated with each thread, but they aren't important now). The instruction pointer is a JVM-maintained pointer to the currently executing bytecode instruction. Every time you execute an instruction (each instruction is an abstract representation of a very simple operation, such as "call method foo on object x"), the instruction pointer is moved forward to some "next instruction." To run your program, the JVM sets the instruction pointer to the beginning of main and keeps executing instructions and moving the instruction pointer forward until the program exits somehow.
A blocking operation stops the instruction pointer from moving forward until some event occurs to cause the instruction pointer to move forward again. Certainly the thread that initiated the blocking operation can't make this event happen, because that thread's instruction pointer isn't moving forward i.e. that thread is doing nothing.
Now, there are a lot of different kinds of blocking operations. One is blocking I/O. If you call System.out.println, for example, the println method doesn't return until the text is written out to the console. In this case, the instruction pointer stops somewhere inside System.out.println, and the operating system signals the thread to wake up whenever the console printing finishes. So the thread doesn't have to start its own instruction pointer moving again, but the method still returns just after the text is written to the console. So, at a very high level:
Thread 0 calls System.out.println("foo")
Thread 0's instruction pointer stops moving while the operating system writes "foo" to the console
When the operating system is done writing to the console, it notifies the JVM, and the JVM automatically starts moving thread 0's instruction pointer again. All of this happens without the programmer who writes System.out.println having to think about it.
Another completely separate kind of blocking operation is encapsulated in the Object.wait method. Whenever a thread calls Object.wait, that thread's instruction pointer stops moving, but instead of the operating system starting the movement of the instruction pointer again, another thread does the job. In this case, there is no external event that will cause the thread's instruction pointer to be restarted (as in the blocking I/O case), but there is an event internal to the program. As I said, another thread will start the instruction pointer moving again by calling Object.notify. So, at a very high level:
Thread 0 calls x.wait() on some object
Thread 0's instruction pointer stops moving
Thread 1 calls x.notify() on the same object x
Thread 0's instruction pointer starts moving again
Thread 0 and thread 1 are now executing concurrently
Notice that a lot more work has to go into writing wait/notify code correctly - the JVM and the operating system don't do all the work for you this time. They still actually do most of the work for you, but you actually have to think about calling wait and notify, and how they allow you to communicate between threads, implement locks, and more.
So there are two morals to this story. The first is that blocking I/O and wait are completely different beasts. In both cases, a thread is blocked, but in the blocking I/O case the thread is woken up automatically by the operating system, while in the wait case the thread has to rely on another thread calling notify in order to wake it up. The second is that concurrent programming is harder to reason about than serial programming. The toy examples I've put in this answer don't really do the second point justice.
No, you don't necessarily need a lock or a wait just because you're using threads. However, if you want the threads to exchange data, they are often useful.
Here's a good explanation with an example of the consumer producer model:
http://www.ase.md/~aursu/JavaThreadsSynchronization.html
Cheers!
Block: prevents execution from continuing.
Wait: suspends the current thread.
Lock: when you lock something, others can't use it.
Consider an online purchase where a customer buys a movie ticket.
As soon as he chooses his seats, others won't be able to get those seats at the same time (the seats are locked).
Is there a default timeout for threads waiting on a synchronised method in Java? Some threads in my app are not completing as expected. Is there any way to check whether threads have died as a result of timeouts?
The JLS does not specify any timeout for synchronized sections. It just mentions
While the executing thread owns the lock, no other thread may acquire the lock.
You can set a timeout on the join() method to make sure you don't wait forever.
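A minimal sketch of the join() timeout, with invented names (JoinTimeoutDemo, stillRunningAfter). Note that join only stops waiting after the timeout; it does not stop the other thread, so you check isAlive() afterwards. Also, a timeout of 0 means "wait forever".

```java
// Hypothetical helper around Thread.join(long).
class JoinTimeoutDemo {
    // Waits at most timeoutMillis for the worker; returns true if it is still running afterwards.
    static boolean stillRunningAfter(Thread worker, long timeoutMillis) {
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.isAlive();
    }

    public static void main(String[] args) {
        Thread slow = new Thread(() -> {
            try {
                Thread.sleep(5_000);
            } catch (InterruptedException ignored) {
                // nothing to clean up in this demo
            }
        });
        slow.setDaemon(true); // do not keep the JVM alive for the demo thread
        slow.start();
        System.out.println("still running: " + stillRunningAfter(slow, 100));
    }
}
```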
I'd have a look at the java.util.concurrent packages to see if there were new features added to help your situation.
I'd also recommend "Java Concurrency In Practice" by Brian Goetz. (I need to re-read it myself.)
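One thing java.util.concurrent does offer is a lock you can give up on: ReentrantLock.tryLock(timeout, unit). The sketch below uses that real API, but TimedLockDemo and runWithTimeout are names I invented; whether swapping synchronized for an explicit lock fits your code is a separate question.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical wrapper: unlike synchronized, tryLock can stop waiting after a timeout.
class TimedLockDemo {
    private final ReentrantLock lock = new ReentrantLock();

    // Runs the task under the lock, or gives up after the timeout; returns whether the task ran.
    boolean runWithTimeout(Runnable task, long timeoutMillis) {
        try {
            if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false; // somebody else held the lock for too long
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        try {
            task.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        TimedLockDemo demo = new TimedLockDemo();
        boolean ran = demo.runWithTimeout(() -> System.out.println("got the lock"), 100);
        System.out.println("task ran: " + ran);
    }
}
```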
If a method is waiting on a synchronization object it should never die, but it could be waiting for an awfully long time (as in, "forever") if something goes wrong. Perhaps your program is never releasing a lock on a resource?
Threads in Java don't just die suddenly. Either they are not progressing (blocked on a lock or infinite loop or similar), or if an exception is thrown and it is not handled, then the thread's execution will stop when the exception propagates to the top level (which should then print the exception's stack trace to System.err).
If your application is deadlocking, one way to find out the reason is to make a thread dump. The JVM can also itself detect simple deadlocks, in which case it will report them in the thread dump.
You can generate a thread dump under Linux by running kill -QUIT <pid> and under Windows by hitting Ctrl + Break in the console window. Or even simpler, use VisualVM, StackTrace or a similar tool.
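If you'd rather check from inside the application, the java.lang.management API exposes roughly the same information. This is only a sketch: DeadlockCheck and its method names are made up, while ThreadMXBean.findDeadlockedThreads() and dumpAllThreads(...) are the real calls. As far as I know, ThreadInfo.toString() may cut off deep stacks, so an external thread dump is still the more complete view.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for in-process deadlock checks and thread dumps.
class DeadlockCheck {
    // Number of threads the JVM currently considers deadlocked (0 if none).
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // returns null when no deadlock is found
        return ids == null ? 0 : ids.length;
    }

    // A textual dump of all live threads, including held monitors and synchronizers.
    static String dump() {
        StringBuilder text = new StringBuilder();
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(true, true)) {
            text.append(info);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
        System.out.println(dump());
    }
}
```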
I suggest you use kill -3 to see a thread dump, and then see what the problematic threads are.