Do Java threads have a tree structure? - java

I have several threads, and I'd like to create sub-threads in one of them. So I'd like to know whether Java threads have a tree structure, or whether the newly created sub-threads are just siblings of the other threads. Also, what is the resource allocation strategy when these threads compete for resources? Does the parent thread have higher priority?
Thanks.
Jeff

Threads do not strictly have a tree structure. However, you can use ThreadGroups to make a hierarchical nesting of Threads and other ThreadGroups.
I don't believe that there is a hard rule for thread priority inside groups, as each thread can be set separately.
Why do you need "sub-threads"? In my experience, Thread should be avoided in favor of ExecutorService. The service won't give you a hierarchy of stuff easily, but I'm having a hard time thinking of a case where the hierarchy would be useful.

There's no thread hierarchy, all threads are siblings of each other. There is no concept of "parent thread" or "child thread". You'll have to be more specific about what you mean by resource allocation strategy -- what resources are you referring to?
If it's memory (by far the most common type of resource), then the answer is that the memory allocator is thread safe: multiple simultaneous allocations happening on different threads will always work correctly. If you run out of memory, then some unlucky thread is going to get an OutOfMemoryError.
If you're referring to other types of resources, then it depends entirely on the implementation of the resource. It's either going to be thread-safe or non-thread-safe. If it's thread-safe, you can allocate the resources freely from multiple threads. If it's not thread-safe, then read the documentation for that type of resource.

There is no concept of "subthread" in the way you describe, certainly no (apparent) tree structure. You can adjust the priority of threads if you like, though the actual meaning of thread priority is essentially undefined (unless you go to RTSJ). CPU resource allocation, by and large, is managed by the OS scheduler... you can try to influence it with priorities, but again that's a dicey proposition.
If you are looking for a framework to break a problem into subcomponents and process them concurrently, look at JDK5/JDK6's java.util.concurrent package, with specific attention to Callable and the Executor framework. Also, the ForkJoin framework might match.
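As a rough sketch of what "break a problem into subcomponents" can look like with Callable and an ExecutorService: the class name ChunkedSum and the method sumTo are made up for illustration, and summing a range of integers merely stands in for real work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class, not part of any library.
class ChunkedSum {
    // Splits the range [1, n] into pieces, sums each piece on a pool thread,
    // and combines the partial results on the calling thread.
    static long sumTo(int n, int chunks) {
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            int size = (n + chunks - 1) / chunks; // ceiling division
            for (int start = 1; start <= n; start += size) {
                final int lo = start;
                final int hi = Math.min(n, start + size - 1);
                tasks.add(() -> {
                    long partial = 0;
                    for (int i = lo; i <= hi; i++) {
                        partial += i;
                    }
                    return partial;
                });
            }
            long total = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumTo(1000, 4)); // 500500
    }
}
```

The subtasks here are not "child threads" of anything; they are just units of work handed to whichever pool thread is free.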

The closest that there is to a thread hierarchy in Java is the ThreadGroup. By default, every application thread goes into the same ThreadGroup, but you can create new ThreadGroups and create new Threads within them. ThreadGroups are inherently hierarchical, and offer methods for doing things like enumerating the threads and child groups, interrupting all threads, setting a default uncaught exception handler, and so on.
Unfortunately, this doesn't do what you want to do. In particular, thread groups are not taken into account when allocating resources or scheduling.
There is a setMaxPriority method, but it only (indirectly) affects new threads or existing threads that change priority. Existing threads whose current priority is greater than the new "max" are not changed. So even this is not much use if you want to alter the priority of a number of threads.
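A small sketch of the nesting and of the setMaxPriority behaviour described above. The group and thread names are invented for the example, and it assumes the calling thread runs at normal priority (5) in a group whose maximum priority is at least 8, which is the usual default.

```java
// Hypothetical example class, not part of any library.
class GroupPriorityDemo {
    // Returns { priority of the existing thread after the cap is lowered,
    //           initial priority of a thread created after the cap,
    //           priority of the existing thread after it asks for 9 }.
    static int[] run() {
        ThreadGroup parent = new ThreadGroup("workers");
        ThreadGroup child = new ThreadGroup(parent, "io-workers"); // nested group
        Runnable noop = () -> { };

        Thread existing = new Thread(child, noop, "existing");
        existing.setPriority(8);

        // Lowers the cap for "workers" and, recursively, for "io-workers".
        parent.setMaxPriority(3);

        Thread createdLater = new Thread(child, noop, "created-later");
        int before = existing.getPriority();    // not touched by the new cap
        int fresh = createdLater.getPriority(); // capped at construction
        existing.setPriority(9);                // the request is clamped to the cap
        int after = existing.getPriority();
        return new int[] { before, fresh, after };
    }

    public static void main(String[] args) {
        int[] r = run();
        // Under the assumptions stated above this should print 8, 3, 3.
        System.out.println(r[0] + ", " + r[1] + ", " + r[2]);
    }
}
```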
(I understand that the primary motivation for thread groups was to enable things like suspending or killing a bunch [1] of related threads. But that use case went out of the window when the Sun engineers realized that suspending and killing threads was fundamentally unsafe ... and deprecated the Thread API methods for doing that.)
[1] Does anyone know the proper collective noun for a group of threads?

Threads inherit a default priority from their parent (the thread that created them); however, this is just a hint and is ignored in many cases on most OSes.
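The inheritance itself is easy to observe at the Java level, even if the OS ignores the value. A minimal sketch (the class and method names are invented, and it assumes the enclosing thread group allows priority 7, as the default group does):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class, not part of any library.
class InheritedPriorityDemo {
    // Runs a creator thread at priority 7 and reports the initial priority
    // of a thread that is constructed from inside it.
    static int childPriority() {
        AtomicInteger observed = new AtomicInteger(-1);
        Thread creator = new Thread(() -> {
            Thread child = new Thread(() -> { }); // never started, only inspected
            observed.set(child.getPriority());
        });
        creator.setPriority(7);
        creator.start();
        try {
            creator.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed.get();
    }

    public static void main(String[] args) {
        System.out.println(childPriority()); // 7 under the stated assumption
    }
}
```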
Threads are designed to share resources as much as possible, in as lightweight (i.e. simple) a way as possible. If you want to manage the resources of individual tasks, you need separate processes. However, these are neither simple nor lightweight, i.e. they are much more expensive in terms of both resources and code complexity.
What sort of problem are you trying to solve? It is likely for any problem there is a fairly simple solution.

Related

Confusion regarding the Blocking of "peer threads" when a user-level thread blocks

I was reading about differences between threads and processes, and literally everywhere online, one difference is commonly written without much explanation:
If a process gets blocked, remaining processes can continue execution.
If a user level thread gets blocked, all of its peer threads also get
blocked.
It doesn't make any sense to me. What would be the point of concurrency if a scheduler cannot switch between a blocked thread and a ready/runnable thread? The reason given is that since the OS doesn't differentiate between the various threads of a given parent process, it blocks all of them at once.
I find it very unconvincing, since all modern OS have thread control blocks with a thread ID, even if it is valid only within the memory space of the parent process. Like the example given in Galvin's Operating Systems book, I wouldn't want the thread which is handling my typing to be blocked if the spell checking thread cannot connect to some online dictionary, perhaps.
Either I am understanding this concept wrong, or all these websites have just copied some old list of thread differences over the years. Moreover, I cannot find this statement in books such as Galvin's or William Stallings' COA book, where threads are discussed.
These are the resources where I found the statements:
https://www.geeksforgeeks.org/difference-between-process-and-thread/
https://www.tutorialspoint.com/difference-between-process-and-thread
https://www.guru99.com/difference-between-process-and-thread.html
https://www.javatpoint.com/process-vs-thread
There is a difference between kernel-level and user-level threads. In simple words:
Kernel-level threads: Threads that are managed by the operating system, including scheduling. They are what is executed on the processor. That's what probably most of us think of threads.
User-level threads: Threads that are managed by the program itself. They are also called fibers or coroutines in some contexts. In contrast to kernel-level threads, they need to "yield the execution", i.e. switching from one user-level to another user-level thread is done explicitly by the program. User-level threads are mapped to kernel-level threads.
As user-level threads need to be mapped to kernel-level threads, you need to choose a suitable mapping. You could map each user-level thread to a separate kernel-level thread. You could also map many user-level threads to one kernel-level thread. In the latter mapping, you let multiple concurrent execution paths be executed by a single thread "as we know it". If one of those paths blocks (recall that user-level threads need to yield execution explicitly), then the executing kernel-level thread blocks, which causes all other paths assigned to it to be effectively blocked as well. I think this is what the statement refers to. FYI: In Java, user-level threads – the multithreading you do in your programs – are mapped to kernel-level threads by the JVM, i.e. the runtime system.
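The many-to-one effect can be imitated in plain Java, purely as an analogy: tasks queued on a single-thread executor play the role of user-level paths sharing one kernel-level thread. This is not how the JVM schedules ordinary Java threads, and the names ManyToOneDemo and "carrier" are invented for the sketch.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example class: an analogy for the N:1 mapping, nothing more.
class ManyToOneDemo {
    static List<String> run() {
        List<String> events = new CopyOnWriteArrayList<>();
        // One real thread carries every task, like one kernel thread
        // carrying several user-level execution paths.
        ExecutorService carrier = Executors.newSingleThreadExecutor();
        carrier.execute(() -> {
            events.add("A starts");
            sleepQuietly(200); // path A blocks, and so does the only carrier thread
            events.add("A ends");
        });
        carrier.execute(() -> events.add("B runs")); // ready, but has to wait for A
        carrier.shutdown();
        try {
            carrier.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return events;
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // [A starts, A ends, B runs]
    }
}
```

With a pool of two threads instead of one, "B runs" would no longer have to wait for A, which corresponds to the one-to-one mapping.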
Related stuff:
Understanding java's native threads and the jvm
What is the difference between a thread and a fiber?
What is difference between User space and Kernel space?
What is the difference between concurrent programming and parallel programming?
Implementing threads
Back in the early days of Java at least, user-level threads were called "green threads", to implement Java threading on OSes that didn't support native threading. There's still a Wiki article https://en.wikipedia.org/wiki/Green_threads which explains the origin and meaning.
(This was back when desktops/laptops were uniprocessor systems, with a single-core CPU in their 1 physical socket, and SMP machines mostly only existed as multi-socket.)
You're right, this was terrible, and once mainstream OSes grew up to support native threads, people mostly stopped doing this. For Java specifically, according to that article, "green threads" is the name of the original thread library for the Java programming language, released in version 1.1; green threads were abandoned in version 1.3 in favor of native threads.
So use Java version 1.3 or later if you don't want your spell-check thread to block your whole application. :P This is ancient history.
There is some scope for using non-blocking I/O and context-switching in user space when a system call reports that it would block, but usually it's better to let the kernel handle threads blocking and unblocking, so that's what normal modern systems do.
IIRC on Solaris there was also some use of an N:M model where N user-space threads might be handled by fewer than N kernel threads. This could mean having some "peer" threads (that share the same kernel thread) like in your quote without being fully terrible purely userspace green threads.
(i.e. only some of your total threads are sharing the same kernel thread.)
pthreads on Linux uses a 1:1 model where every software thread is a separate task for the kernel to schedule.
Google found https://flylib.com/books/en/3.19.1.51/1/ which defines those thread models and talks about them some, including the N:M hybrid model, and the N:1 user-space aka green threads model that needs to use non-blocking I/O if it wants to avoid blocking other threads. (e.g. do a user-space context switch if a system call returns EAGAIN or after queueing an async read or write.)
Okay, the other answers provide detailed information.
But to hit your main concern right in the middle:
the article puts that a bit wrongly, lacking the necessary context (see all the details in @akuzminykh's explanation of user-level threads and kernel-level threads)
what this means for a Java programmer: don't bother with those explanations. If one of your Java threads blocks (due to I/O etc.), that will have NO IMPACT on any of your other threads (unless, of course, you explicitly WANT it to, but then you'd have to explicitly use mechanisms for that)
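A minimal sketch of that independence, with invented names (IndependentBlockingDemo, "blocked", "worker"): one thread parks on a CountDownLatch while another finishes its computation, and the latch is only opened afterwards.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical example class, not part of any library.
class IndependentBlockingDemo {
    // Returns true if the worker finished while the other thread was still blocked.
    static boolean workerFinishedWhileOtherBlocked() {
        CountDownLatch gate = new CountDownLatch(1);
        AtomicBoolean blockedThreadDone = new AtomicBoolean(false);

        Thread blocked = new Thread(() -> {
            try {
                gate.await(); // blocks until the gate is opened
                blockedThreadDone.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "blocked");

        long[] sum = new long[1];
        Thread worker = new Thread(() -> {
            for (int i = 1; i <= 1_000_000; i++) {
                sum[0] += i;
            }
        }, "worker");

        blocked.start();
        worker.start();
        try {
            worker.join(5000);
            boolean workerDone = !worker.isAlive();
            boolean otherStillBlocked = !blockedThreadDone.get();
            gate.countDown(); // only now is the blocked thread released
            blocked.join(5000);
            return workerDone && otherStillBlocked && blockedThreadDone.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(workerFinishedWhileOtherBlocked()); // true
    }
}
```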
How do Threads get blocked in Java?
If you call sleep() or wait() etc, the Thread that currently executes that code (NOT the objects you call them on) will be blocked. These will get released on certain events: sleep will finish once the timer runs out or the thread gets interrupted by another, wait will release once it gets notified by another thread.
if you run into a synchronized(lockObj) block or method: this will release once the other thread occupying that lockObj releases it
closely related to that, if you enter ThreadGates, mutexes etc, all those 1000s of specialized classes for extended thread control like rendezvous etc
If you call a blocking I/O method, like block reading from InputStream etc: int amountOfBytesRead = read(buffer, offset, length), or String line = myBufferedReader.readLine();
in contrast to that, there are many non-blocking I/O operations, such as those offered by the java.nio ("New I/O") package, that return immediately, but whose return value may indicate that nothing was read or written yet
If the Garbage Collector does a quick cleanup cycle (which are usually so short you will not even notice, and the Threads get released automatically again)
if you call .parallelStream() functions for certain long-lasting lambda functions on streams (like myList.parallelStream().forEach(myConsumerAction)) that - if too complex or with too many elements - get handled by automated multithreading mechanisms (which you will not notice, because after the whole stuff is done, your calling thread will resume normally, just as if a normal method was called). See more here: https://www.baeldung.com/java-when-to-use-parallel-stream

Are all Java apps single-threaded unless the programmer explicitly creates Threads or implementations of Runnable?

I'm trying to understand how Java (and the JVM) create threads under the hood.
I read Java Concurrency in Practice, and I couldn't find a good explanation of whether or not all Java apps are, by default, single- or multi-threaded.
On the one hand, from the POV of a developer: I write a pile of sequential code without creating Thread instances or implementing Runnable anywhere. Do I need to synchronize anything? Should I be making double-sure my classes are thread-safe? If so, should I stop using POJOs that have mutable fields? I read that the JVM will create multiple threads under the hood for its own business. Is the JVM also creating threads to run my application without me explicitly creating those threads?
On the other hand: I write a pile of code in which I explicitly create Threads and Runnable implementations. Does the JVM spin off its own threads to "help" my multi-threaded code run faster?
It's entirely possible I'm not even thinking about the JVM's thread handling in the right way. But, I'm an entry-level Java developer, and I hate that I find this confusing.
On the one hand, from the POV of a developer: I write a pile of sequential code without creating Thread instances or implementing
Runnable anywhere. Do I need to synchronize anything? Should I be
making double-sure my classes are thread-safe? If so, should I stop
using POJOs that have mutable fields?
The straightforward answer is that no, you do not need to proactively make your objects thread-safe to protect them from concurrent access by threads you did not create.
Generally speaking, threads that interact concurrently with the code and classes you write [1] won't be created unless you do something yourself that is known to create threads, and then arrange to share an object instance between threads. Creating a Thread object is one example of creating a thread, but there are others. Here is a non-exhaustive list:
Using Executor or ExecutorService implementations which use threads (most of them).
Using a parallel Stream, e.g. by creating a stream with the parallelStream method.
Using a library method which creates threads behind the scenes.
So generally threads don't just pop out of nowhere but rather as a result of something you do. Even if a library creates threads that you don't know about, it doesn't matter for your concern because unless documented otherwise they will not be accessing your objects, or will use locking to ensure they access them in a serialized fashion (or the library is seriously broken).
So you generally don't need to worry about cross-thread synchronization except in places where you know threads are being used. So by default you don't need to make your objects thread-safe.
[1] I'm making this distinction about "interact with code you write" because a typical JVM will use several threads behind the scenes, even if you never create any yourself, for housekeeping tasks like garbage collection, calling finalizers, listening for JMX connections, whatever.
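To see those background threads for yourself, you can list the live threads from a program that never creates one. A small sketch (the class name LiveThreadNames is invented; the exact names printed depend on the JVM and version, so none are promised here):

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical example class, not part of any library.
class LiveThreadNames {
    // Collects the names of all live threads the JVM reports.
    static Set<String> names() {
        Set<String> names = new TreeSet<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Expect "main" plus whatever housekeeping threads this JVM exposes.
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

None of the extra threads listed this way touch your objects, which is why they don't force you to synchronize anything.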
Is the app code you wrote single- or multi-threaded? Unless you explicitly took steps to create new threads – for example, by doing something like Thread t = new Thread(); – your app code will be single-threaded.
Are there multiple threads running in a single JVM? Yes, always – there will be various things running in the background that have nothing to do with the code you wrote (like the garbage collector).
Should you guard against concurrency concerns? With a single-threaded app, there is no need. However, if your code itself creates one (or more) threads, or if your code is packaged up in some manner to be used by other app creators (maybe you've created a data structure for others to use in their code), then you might need to take steps for concurrency correctness. In the case of creating code for others to use, it's perfectly fine to declare in Javadoc that your code is not threadsafe. For example, ArrayList (https://docs.oracle.com/javase/7/docs/api/java/util/ArrayList.html) says "Note that this implementation is not synchronized" along with suggested workarounds.
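One of the workarounds the ArrayList Javadoc suggests is wrapping the list with Collections.synchronizedList. A minimal sketch of sharing such a list between threads you create yourself (the class name SynchronizedListDemo and its fill method are invented for the example):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class, not part of any library.
class SynchronizedListDemo {
    // Lets several threads append to one shared list and returns its final size.
    static int fill(int threads, int perThread) {
        // The wrapper serializes each individual call such as add().
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.add(i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 1000)); // 4000
    }
}
```

With a plain ArrayList in place of the wrapper, the same code has a data race and may lose elements or throw, which is exactly the situation the Javadoc note warns about.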

When should one create new Thread Groups

I was wondering, what are the advantages of assigning threads to a thread group instead of containing them all in one (The Main) group?
Assuming there are 10 or more constantly active threads, and a couple of threads being started every now and again as the application requires, how would one approach grouping these?
Thanks,
Adam.
There is no advantage at all. ThreadGroups are there for backward compatibility, but I've never seen them used.
Here's what Brian Goetz (author of Java Concurrency in Practice - the bible) said about them a long time ago:
The ThreadGroup class was originally intended to be useful in
structuring collections of threads into groups. However, it turns out
that ThreadGroup is not all that useful. You are better off simply
using the equivalent methods in Thread. ThreadGroup does offer one
useful feature not (yet) present in Thread: the uncaughtException()
method. When a thread within a thread group exits because it threw an
uncaught exception, the ThreadGroup.uncaughtException() method
is called. This gives you an opportunity to shut down the system, write
a message to a log file, or restart a failed service.
Threads now have an uncaught exception handler, so this single reason to use thread groups is no longer valid.
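For reference, a per-thread handler is installed with Thread.setUncaughtExceptionHandler. A minimal sketch (the class name, the thread name "flaky-worker" and the "boom" message are all invented for the example):

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical example class, not part of any library.
class UncaughtHandlerDemo {
    // Starts a thread that fails and returns what the handler recorded.
    static String run() {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread t = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "flaky-worker");
        // The handler runs on the dying thread, before it terminates.
        t.setUncaughtExceptionHandler((thread, error) ->
                seen.set(thread.getName() + ": " + error.getMessage()));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(run()); // flaky-worker: boom
    }
}
```

Thread.setDefaultUncaughtExceptionHandler does the same for every thread that has no handler of its own.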

Most efficient Java threading technique?

There seem to be a number of different ways in which one can create threads (Runnable vs Thread class) and also ThreadPools.
Are there any differences in terms of efficiency, and which are the most efficient (in terms of performance) techniques for creating and pooling threads in Java?
If you need to handle many short and frequent requests it is better to use a ThreadPool so you can reuse threads already open and assign them Runnable tasks.
But when you need to launch a thread for a single task, or to create a daemon thread that runs for the whole lifetime of the application or for a long, specific period, it can be better to create a single thread and let it terminate when you no longer need it.
At the end of the day, they're all relying on the same underlying Thread-based mechanism to actually do the work. That means that if you are asking "what is the most efficient way to start a single thread?" the answer is, create a Thread object and call start() on it, because any other method will take some other steps before it eventually creates a Thread object and calls start() on it.
That doesn't mean that this is the best way to spawn threads, it just means that it is the most low-level way to do it from Java code. What the other ways to create threads give you is different types of infrastructure to manage the underlying Threads, so your choice of method should depend on the amount and kind of infrastructure you need.
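To make the "infrastructure" point concrete, here is a sketch of what a pool adds on top of bare threads: many short tasks run on a small, reused set of threads. The class name PoolReuseDemo and its method are invented for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example class, not part of any library.
class PoolReuseDemo {
    // Runs `tasks` short tasks on a fixed pool and counts how many
    // distinct threads actually executed them.
    static int distinctThreadsUsed(int poolSize, int tasks) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> names.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return names.size();
    }

    public static void main(String[] args) {
        // 20 tasks, but never more than 2 threads are created for them.
        System.out.println(distinctThreadsUsed(2, 20));
    }
}
```

Calling new Thread(task).start() twenty times would instead create and tear down twenty threads, which is the cost the pool avoids for short, frequent tasks.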

What is the benefit of ThreadGroup in java over creating separate threads?

Many methods like stop(), resume(), suspend() etc are deprecated.
So is it useful to create threads using ThreadGroup?
Using ThreadGroup can be a useful diagnostic technique in big application servers with thousands of threads. If your threads are logically grouped together, then when you get a stack trace you can see which group the offending thread was part of (e.g. "Tomcat threads", "MDB threads", "thread pool X", etc), which can be a big help in tracking down and fixing the problem.
Don't use ThreadGroup for new code. Use the Executor stuff in java.util.concurrent instead.
Somewhat complementary to the answer provided (6 years ago or so): while the Concurrency API provides a lot of constructs, ThreadGroup might still be useful. It provides the following functionality:
Logical organisation of your threads (for diagnostic purposes).
You can interrupt() all the threads in the group. (Interrupting is perfectly fine, unlike suspend(), resume() and stop()).
You can set the maximum priority of the threads in the group. (not sure how widely useful is that, but there you have it).
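You can mark the ThreadGroup as a daemon group. (A daemon group is destroyed automatically once its last thread has stopped or its last subgroup has been destroyed. Note that this does not make its threads daemon threads: a new thread takes its daemon status from the thread that creates it.)
It allows you to override its uncaughtException method so that if one of the threads in the group throws an Exception, you have a callback to handle it.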
It provides you some extra tools such as getting the list of threads, how many active ones you have etc. Useful when having a group of worker threads, or some thread pool of some kind.
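As a sketch of the interrupt() item from the list above: the group name "sleepers", the class name and the worker logic are invented, and a latch is used only to make sure every worker is running before the group is interrupted.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class, not part of any library.
class GroupInterruptDemo {
    // Starts `workers` sleeping threads in one group, interrupts the group,
    // and returns how many threads observed the interruption.
    static int interruptAll(int workers) {
        ThreadGroup group = new ThreadGroup("sleepers");
        AtomicInteger interrupted = new AtomicInteger();
        CountDownLatch started = new CountDownLatch(workers);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(group, () -> {
                started.countDown();
                try {
                    Thread.sleep(60_000);
                } catch (InterruptedException e) {
                    interrupted.incrementAndGet();
                }
            }, "sleeper-" + i);
            threads[i].start();
        }
        try {
            started.await();
            group.interrupt(); // one call reaches every live thread in the group
            for (Thread t : threads) {
                t.join(5000);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return interrupted.get();
    }

    public static void main(String[] args) {
        System.out.println(interruptAll(3)); // 3
    }
}
```

An ExecutorService offers a comparable bulk operation through shutdownNow(), which interrupts its worker threads.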
The short answer is - no, not really. There's little if any benefit to using one.
To expand on that slightly, if you want to group worker threads together you're much better off using an ExecutorService. If you want to quickly count how many threads in a conceptual group are alive, you still need to check each Thread individually (as ThreadGroup.activeCount() is an estimation, meaning it's not useful if the correctness of your code depends on its output).
I'd go so far as to say that the only thing you'd get from one these days, aside from the semantic compartmentalisation, is that Threads constructed as part of a group will pick up the daemon flag and a sensible name based on their group. And that is merely a shortcut for filling in a few primitives in a constructor call (which typically you'd only have to write once anyway, since you're probably starting the threads in a loop and/or method call).
So - I really don't see any compelling reason to use one at all. I specifically tried to, a few months back, and failed.
EDIT - I suppose one potential use would be if you're running with a SecurityManager, and want to assert that only threads in the same group can interrupt each other. Even that's pretty borderline, as the default implementation always returns true for a Thread in any non-system thread group. And if you're implementing your own SecurityManager, you've got the possibility to have it make its decision on any other criteria (including the typical technique of storing Threads in collections as they get created).
Great answer from @skaffman. I want to add one more advantage:
Thread groups help you manipulate all the threads defined in them at once.
