How does the JVM collect a thread dump under the hood?

Please explain how the JVM collects a thread dump under the hood.
I don't understand how it collects stack traces of threads that are off-CPU (waiting on disk I/O or the network, or descheduled by involuntary context switches).
For example, Linux perf collects information only about on-CPU threads (those that are consuming CPU cycles).

I'll take HotSpot JVM as an example.
The JVM maintains the list of all Java threads: for each thread it has a corresponding VM structure. A thread can be in one of the following states depending on its execution context (HotSpot knows the current state of each thread, because it's responsible for switching states):
in_Java - a thread is executing Java code, either in the interpreter or in a JIT-compiled method;
in_vm - a thread is inside a VM runtime function;
in_native - a thread is running a native method in JNI context;
there are also transitional states, but let's skip them for simplicity.
An off-CPU thread can only be in one of two states:
in_native state: all socket I/O, disk I/O, and other blocking operations are performed only in native code;
in_vm state, when a thread is blocked on a VM mutex.
Whenever the JVM calls a native method or acquires a contended mutex, it stores the last Java frame pointer into the Thread structure.
Now the crucial part: HotSpot JVM obtains a thread dump only at a safepoint.
When you ask for a thread dump, the JVM requests a stop-the-world pause. All threads in in_Java state are stopped at the nearest safepoint, where the JVM knows how to walk the stack.
Threads in in_native state are not stopped, but they don't need to be. HotSpot knows their last Java frame, because the pointer is stored in the Thread structure. Knowing the top Java frame, the JVM can find its caller, then the caller of the caller, and so on.
What is important here is that the Java part of the stack is "frozen", no matter what the native method does. The top part of the stack (native) can change back and forth, while the bottom part (Java) remains immutable. It cannot change, since the JVM checks for a pending safepoint operation on every switch from in_native to in_Java: if a native method returns while the VM is running a stop-the-world operation, the current thread blocks until the operation ends.
So, getting a thread dump involves:
Stopping all in_Java and in_vm threads at a safepoint;
Walking through the global list of threads maintained by the JVM;
If a thread is running a native method, its top Java frame is stored in the Thread structure; if a thread is running Java code, its top frame corresponds to the currently executing Java method.
Each frame has a link to the previous frame, so given the top frame, the JVM can construct the whole stack trace to the bottom.
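The observable effect can be checked from plain Java. The following is a minimal sketch, not HotSpot internals: it parks a thread in Object.wait() (so it is off-CPU) and then reads its stack trace from another thread with Thread.getAllStackTraces(). The class name ThreadDumpSketch, the method dumpBlockedThread and the thread name "sleeper" are made up for this illustration.

```java
import java.util.Map;
import java.util.concurrent.CountDownLatch;

class ThreadDumpSketch {
    // Parks a helper thread off-CPU and returns its stack trace as seen
    // from the calling thread. All names here are illustrative only.
    static StackTraceElement[] dumpBlockedThread() {
        Object lock = new Object();
        CountDownLatch entered = new CountDownLatch(1);
        Thread sleeper = new Thread(() -> {
            synchronized (lock) {
                entered.countDown();
                try {
                    while (true) {
                        lock.wait(); // the thread consumes no CPU while it waits here
                    }
                } catch (InterruptedException e) {
                    // interrupted by the caller: let the thread finish
                }
            }
        }, "sleeper");
        sleeper.setDaemon(true);
        sleeper.start();
        try {
            entered.await();
            // Poll until the helper thread is really parked inside wait().
            while (sleeper.getState() != Thread.State.WAITING) {
                Thread.sleep(1);
            }
            Map<Thread, StackTraceElement[]> dump = Thread.getAllStackTraces();
            return dump.get(sleeper);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            sleeper.interrupt(); // release the helper thread
        }
    }

    public static void main(String[] args) {
        for (StackTraceElement frame : dumpBlockedThread()) {
            System.out.println("  at " + frame);
        }
    }
}
```

The trace of the waiting thread is available even though that thread is not running, which is consistent with the explanation above: the Java frames below the blocking call do not change while the thread is blocked.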

Related

The Internal Java Memory Model for Thread Stacks

I was reading an article about the internal Java memory model.
There is one point I want to ask about:
Each thread running in the Java virtual machine has its own thread stack. The thread stack contains information about what methods the thread has called to reach the current point of execution.
Why does each thread need to save information about which methods have been executed? If it's related to context switching, then (if I'm not wrong) the thread only needs to save information about the method that is currently being executed.
What is the actual need for saving information about already executed methods?

This is referring to the currently active methods. Note that there can be several methods in a thread active at the same time (A calls B calls C, ...). The stack does not contain information about methods that have already completed.
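A small sketch can make this visible. The class CallStackSketch and its methods a, b, c and finished are hypothetical names chosen for the example; only Thread.currentThread().getStackTrace() is a real API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CallStackSketch {
    static List<String> a() {
        finished();  // returns before b() is called, so its frame is already popped
        return b();
    }

    static List<String> b() {
        return c();
    }

    static void finished() {
        // does nothing; only here to show that completed calls leave no frame
    }

    // Returns the method names on the current thread's stack, innermost first.
    static List<String> c() {
        return Arrays.stream(Thread.currentThread().getStackTrace())
                .map(StackTraceElement::getMethodName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(a());
    }
}
```

The returned list contains c, b and a (each is still waiting for its callee to return), but not finished, because that call had already completed when the trace was taken.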

I think rephrasing this paragraph makes it clearer and easier to understand:
Each thread running in the Java virtual machine allocates some memory for its call stack. The call stack contains information about what methods the thread has called to reach the current point of execution.

Resuming a thread on another machine from the current JVM stack

On the JVM, I run a thread and at some point I block it. If I persist the JVM thread stack at this point, and all the objects I explicitly created in my code that it refers to (I assume they are all serializable), will it be feasible to use this data to resume the thread on another JVM?
Are there any frameworks/libraries out there that can help me or get me closer to doing such a thing?

Where is Thread Object created? Stack or Heap?

When I say something like:
Thread t1 = new Thread();
does it create it on a heap or a stack?

There is no way to allocate objects on the stack in Java.
The stack can only hold references and primitives, and only for local variables.
Note that starting a thread will create a new stack for that thread.

Thread t1 = new Thread();
tl;dr This allocates the Thread object on the heap; t1 is only a reference to it.
As each new thread comes into existence, it gets its own pc register (program counter) and Java stack. If the thread is executing a Java method (not a native method), the value of the pc register indicates the next instruction to execute. A thread's Java stack stores the state of Java (not native) method invocations for the thread. The state of a Java method invocation includes its local variables, the parameters with which it was invoked, its return value (if any), and intermediate calculations. The state of native method invocations is stored in an implementation-dependent way in native method stacks, as well as possibly in registers or other implementation-dependent memory areas.
The Java stack is composed of stack frames (or frames). A stack frame contains the state of one Java method invocation. When a thread invokes a method, the Java virtual machine pushes a new frame onto that thread's Java stack. When the method completes, the virtual machine pops and discards the frame for that method.
The Java virtual machine has no registers to hold intermediate data values. The instruction set uses the Java stack for storage of intermediate data values.
The figure shows a snapshot of a virtual machine instance in which three threads are executing. At the instant of the snapshot, threads one and two are executing Java methods. Thread three is executing a native method. It also shows the memory areas the Java virtual machine creates for each thread; these areas are private to the owning thread. No thread can access the pc register or Java stack of another thread.

In Java 8, escape analysis lets the JIT compiler avoid the heap allocation of an object (HotSpot does this through scalar replacement, keeping the object's fields on the stack or in registers). This occurs when an object is detected as not escaping the current method (after inlining has been performed). Note: this optimisation is available in Java 7, but I don't think it worked as well.
However, as soon as you call start() it will escape the current method so it must be placed on the heap.
When I say something like:
Thread t1 = new Thread();
does it create it on a heap or a stack?
It could place it on the stack, provided you don't use it to create a real thread, i.e. if you do
Thread t1 = new Thread(runnable);
t1.start();
It has to place it on the heap.

In Java, if one thread is killed, what happens to the other threads?

I want to know, in Java:
If the main thread is killed, what happens to its child threads?
If a child thread is killed, what happens to its siblings and the parent thread?
I read at the following link that, since threads share an address space, killing one thread can also affect other threads.
Below is a quote from it.
Threads are light weight processes that divide main flow of control into multiple flows and each flow of control/thread will execute independently. Activity of the process in a system is represented by threads. The process that has multiple threads is called as multi threaded. Each threads has its own thread ID ( Data Type Integer), register, program counter, stack, error no. Threads can communicate using shared memory within same process.
There are different advantages of using threads to mange and maintain the subtask of applications. When we are using threads than less system resources are used for context switching and increased the throughput of application. Threads also simplify the structure of program. There is no special mechanism for communication between tasks.
Threads also have some disadvantages for example threads are not reusable as they are dependent on a process and cannot be separated from the process. Threads are not isolated as they don't have their own address space. The error cause by the thread can kill the entire process or program because that error affects the entire memory space of all threads use in that process or program. Due to the shared resources by the threads with in the process can also affect the whole process or program when a resource damage by the thread. For concurrent read and write access to the memory thread will required synchronizations. Data of the process can easily damage by the thread through data race because all the threads with in the process have write access to same piece of data.
Can you please tell me whether what is said in the above quote applies to Java?

1) Nothing will happen to the "child threads"...
2) Nothing will happen to the "sibling threads"...
...with the following exception: If all remaining threads are daemon threads, the application will terminate (i.e., when there are only daemon threads left, these will be killed as well).
From the documentation of Thread:
[...] The Java Virtual Machine continues to execute threads until either of the following occurs:
The exit method of class Runtime has been called [...]
All threads that are not daemon threads have died, either by returning from the call to the run method or by throwing an exception that propagates beyond the run method.

Nothing, in both cases. Threads run independently of one another and there's no such thing as "parent" or "child" threads in this sense. The process will continue to run until there are no threads running in it.
A process is simply a container that contains some threads. The threads execute code. If there is one thread or more running inside a process container, the process will continue to exist. There's no symbiotic relationship between the threads, killing one will not kill another.
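Here is a minimal sketch of that independence, with made-up names (ThreadIndependenceSketch, siblingSurvivesFailure, the "failing" and "sibling" threads). One thread dies from an uncaught exception, and a second thread keeps running and finishes its work afterwards.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class ThreadIndependenceSketch {
    // Returns true if the sibling thread completed its work after the
    // other thread had died from an uncaught exception.
    static boolean siblingSurvivesFailure() {
        AtomicBoolean siblingDidWork = new AtomicBoolean(false);

        Thread failing = new Thread(() -> {
            throw new IllegalStateException("simulated failure");
        }, "failing");
        // Swallow the exception so the default handler prints no stack trace.
        failing.setUncaughtExceptionHandler((thread, error) -> { });

        Thread sibling = new Thread(() -> {
            try {
                failing.join();            // wait until the other thread is dead
                siblingDidWork.set(true);  // then keep working as usual
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "sibling");

        failing.start();
        sibling.start();
        try {
            sibling.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return siblingDidWork.get() && !failing.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("sibling survived: " + siblingSurvivesFailure());
    }
}
```

This only covers a thread that ends by throwing an exception. It does not model a crash in native code, which can still take down the whole process.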

Default threads like DestroyJavaVM, Reference Handler, Signal Dispatcher

Working on a profiler of my own, I would like to explain what I see. There are some default threads which always appear, even in the simplest program:
DestroyJavaVM
Signal Dispatcher
Finalizer
Reference Handler
Although their names are quite self-documenting, I would like to get a little more information. It seems these threads are not documented. Does someone know a source where I can dig up this information, or know exactly what these threads do?

DestroyJavaVM is a thread that unloads the Java VM on program exit. Most of the time it should be waiting, until the apocalypse of your VM.
Signal Dispatcher is a thread that handles the native signals sent by the OS to your JVM.
Finalizer is a thread that pulls objects from the finalization queue and calls their finalize() methods.
Reference Handler is a high-priority thread that enqueues pending References. It is defined in java/lang/ref/Reference.java.
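To see which of these threads exist in a given JVM, the live threads can be listed from Java. This is a rough sketch with hypothetical names (ListThreadsSketch, liveThreadNames). The exact set of threads printed depends on the JVM version and options, and DestroyJavaVM normally shows up only after the main thread has returned.

```java
import java.util.List;
import java.util.stream.Collectors;

class ListThreadsSketch {
    // Returns the names of all live threads visible to Java code, sorted.
    static List<String> liveThreadNames() {
        return Thread.getAllStackTraces().keySet().stream()
                .map(Thread::getName)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (String name : liveThreadNames()) {
            System.out.println(name);
        }
    }
}
```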
