Assume a control thread has access to a bunch of worker threads and to the objects those threads would wait on. If I have to start and stop what several of these threads are doing from this single control thread, which approach will have the greater impact on performance?
Wouldn't it just be better for example to kill it via interruption and just create a new one with the same Runnable?
Creating (actually start()-ing) a new thread is relatively expensive, so from a performance perspective it would be better to use wait / notify.
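A minimal sketch of the wait / notify approach, assuming a hypothetical PauseGate helper (the class and its method names are made up for illustration; they are not part of the JDK). Workers call awaitIfPaused() between units of work, and the control thread calls pause() and resume():

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper: lets a control thread pause and resume workers without killing them.
class PauseGate {
    private boolean paused;

    synchronized void pause() {
        paused = true;
    }

    synchronized void resume() {
        paused = false;
        notifyAll(); // wake every worker parked in awaitIfPaused()
    }

    // Workers call this between units of work.
    synchronized void awaitIfPaused() throws InterruptedException {
        while (paused) { // the loop guards against spurious wakeups
            wait();
        }
    }

    // Small demonstration: the worker stays parked until resume() is called.
    static boolean demo() {
        PauseGate gate = new PauseGate();
        AtomicBoolean passed = new AtomicBoolean(false);
        gate.pause();
        Thread worker = new Thread(() -> {
            try {
                gate.awaitIfPaused();
                passed.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        try {
            worker.join(100); // give the worker time to reach the gate
            boolean blockedWhilePaused = !passed.get();
            gate.resume();
            worker.join(2000);
            return blockedWhilePaused && passed.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The same thread is reused across pause and resume, so the cost of start()-ing a new thread is paid only once.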
Secondly, interrupt is not guaranteed to "stop" a thread. The thread may choose to ignore the interrupt ... or if it is purely CPU bound, it may not notice it at all.
There is also a third option: use an existing thread pool mechanism. For example, the ExecutorService API has various implementations that provide bounded and unbounded thread pools. These can take care of scaling up and down, and pool shutdown. You use them by submit(...)-ing tasks as Runnable instances, and you optionally get a Future that allows you to wait for the task's completion.
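As a rough sketch of the thread pool option (the class name PoolSketch and the pool size of 4 are illustrative assumptions, not something from the question), tasks can be submitted and waited on like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class: reuses pooled threads instead of starting a new Thread per task.
class PoolSketch {
    // Submits n small tasks to a fixed pool and returns the sum of their results.
    static int sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is an arbitrary choice
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int value = i;
                futures.add(pool.submit(() -> value * value)); // submitted as a Callable<Integer>
            }
            int sum = 0;
            for (Future<Integer> f : futures) {
                sum += f.get(); // blocks until that task has completed
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown(); // lets the worker threads exit once the queue is drained
        }
    }
}
```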
Finally, for most concurrent programming use-cases, there are standard classes that support the use-case, and it is better to use them rather than attempting to implement from scratch; e.g. using wait / notify directly. In your case, you probably need some kind of "barrier" mechanism: java.util.concurrent.Phaser might be the one that you need.
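To give an idea of what a Phaser-based design could look like, here is a sketch under the assumption that the control thread should release all workers one round at a time. The class name PhaserSketch and the counter are hypothetical; whether this matches your actual start/stop requirements is something only you can judge.

```java
import java.util.concurrent.Phaser;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: a control thread gates each round of work with a Phaser.
class PhaserSketch {
    // Runs `workers` threads through `rounds` phases and returns the number of work steps done.
    static int run(int workers, int rounds) {
        Phaser phaser = new Phaser(1); // the one initial party is the control thread
        AtomicInteger stepsDone = new AtomicInteger();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            phaser.register(); // one party per worker, registered before the worker starts
            threads[i] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    phaser.arriveAndAwaitAdvance(); // wait for the control thread's "go"
                    stepsDone.incrementAndGet();    // stands in for the real work of this round
                }
                phaser.arriveAndDeregister();
            });
            threads[i].start();
        }
        for (int r = 0; r < rounds; r++) {
            phaser.arriveAndAwaitAdvance(); // releases all workers for round r
        }
        phaser.arriveAndDeregister();
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining workers", e);
        }
        return stepsDone.get();
    }
}
```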
Threads are fairly independent from one another, and in most cases each thread will know better than the control thread when it's ready to terminate. Killing a thread is a very abrupt thing; it's much better to wait for the thread to terminate itself cleanly.
A few words about what I'm planning to do. I need to create some task executor, that will poll tasks from queue and just execute code in this task. And for this I need to implement some interrupt mechanism to enable user to stop this task.
So I see two possible solutions: 1. start a pool of threads and stop them by using .destroy() method of a thread. (I will not use any shared objects) 2. Use pool of separated processes and System.exit() or kill signal to process. Option 2. looks much safer for me as I can ensure that thread killing will not lead to any concurrency problems. But I'm not sure that it won't produce a big overhead.
Also I'm not sure about the JVM: if I use separate processes, each process will run in its own JVM, and that could bring a lot of overhead. Or not. So my question is this. Choosing a different language without a runtime for the worker processes is a possible option for me, but I still don't have enough experience with processes and don't know about the overhead.
start a pool of threads and stop them by using .destroy() method of a thread. (I will not use any shared objects)
You can't stop threads on modern VMs unless said thread is 'in on it'. destroy and friends do not actually do what you want, and they are unsafe. The right way is to call interrupt(). If the thread wants to annoy you and not actually stop in the face of an interrupt call, it can. The solution is to fix the code so that it doesn't do that anymore. Note that raising the interrupt flag is guaranteed to stop any blocking method that is specced to throw InterruptedException (sleep, wait, etc.), and on most OSes it will also cause any I/O call that is currently blocked to exit by throwing an IOException, but there is no guarantee of this.
Use pool of separated processes and System.exit() or kill signal to process.
Hella expensive; a VM is not a light thing to spin up; it'll have its own copy of all the classes (even something as simple as java.lang.String and company). 10 VMs is a stretch. Whereas 1000 threads is no problem.
And for this I need to implement some interrupt mechanism to enable user to stop this task.
The real problem is that this is very difficult to guarantee. But if you control the code that needs interrupting, then usually no big deal. Just use the interrupt() mechanism.
EDIT: In case you're wondering how to do the interrupt thing: Raising the interrupt flag on a thread just raises the flag; nothing else happens unless you write code that interacts with it, or call a method that does.
There are 3 main interactions:
All things that block and are declared to throw InterruptedException will lower the flag and throw InterruptedException. If the flag is up and you call Thread.sleep, that will immediately clear the flag and throw that exception without ever even waiting. Thus, catch that exception, and return/abort/break off the task.
Thread.interrupted() will lower the flag and return true (thus, it does so only once). Put this in your event loops. It's not public void run() {while (true) { ... }} or while (running) {} or whatnot, it's while (!Thread.interrupted()) or possibly while (running && !Thread.interrupted()).
Any other blocking method may or may not react; Java intentionally doesn't specify either way because it depends on OS and architecture. If they do (and many do), they can't throw InterruptedException, as e.g. FileInputStream.read isn't specced to throw it. They throw IOException with a message indicating an abort happened.
Ensure that these 3 code paths one way or another lead to a task that swiftly ends, and you have what you want: user-interruptible tasks.
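A small sketch of the first two code paths in one task (the class InterruptibleWorker and its stoppedCleanly field are hypothetical names used only for this illustration; the I/O path is not shown because its behaviour is platform dependent):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical task that ends swiftly when its thread is interrupted.
class InterruptibleWorker implements Runnable {
    final AtomicBoolean stoppedCleanly = new AtomicBoolean(false);

    @Override
    public void run() {
        try {
            // Path 2: poll (and clear) the flag in the event loop.
            while (!Thread.interrupted()) {
                // Path 1: a blocking call that is declared to throw InterruptedException.
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            // The throw already cleared the flag; fall through and end the task.
        } finally {
            stoppedCleanly.set(true); // release resources, record state, etc.
        }
    }

    // Starts the worker, interrupts it, and reports whether it ended.
    static boolean demo() {
        InterruptibleWorker worker = new InterruptibleWorker();
        Thread t = new Thread(worker);
        t.start();
        t.interrupt();
        try {
            t.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.stoppedCleanly.get() && !t.isAlive();
    }
}
```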
Executors framework
Java already provides a facility with your desired features, the Executors framework.
You said:
I need to create some task executor, that will poll tasks from queue and just execute code in this task.
The ExecutorService interface does just that.
Choose an implementation meeting your needs from the Executors class. For example, if you want to run your tasks in the sequence of their submission, use a single-threaded executor service. You have several others to choose from if you want other behavior.
ExecutorService executorService = Executors.newSingleThreadExecutor() ;
You said:
start a pool of threads
The executor service may be backed by a pool of threads.
ExecutorService executorService = Executors.newFixedThreadPool( 3 ) ; // Create a pool of exactly three threads to be used for any number of submitted tasks.
You said:
just execute code in this task
Define your task as a class implementing either Runnable or Callable. That means your class carries a run method, or a call method.
Runnable task = ( ) -> System.out.println( "Doing this work on a background thread. " + Instant.now() );
You said:
will poll tasks from queue
Submit your tasks to be run. You can submit many tasks, either of the same class or of different classes. The executor service maintains a queue of submitted tasks.
executorService.submit( task );
Optionally, you may capture the Future object returned.
Future< ? > future = executorService.submit( task );
That Future object lets you check to see if the task has finished or has been cancelled.
if( future.isDone() ) { … }
You said:
enable user to stop this task
If you want to cancel the task, call Future::cancel.
Pass true if you want to interrupt the task if it has already begun execution.
Pass false if you only want to cancel the task before it has begun execution.
future.cancel( true );
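Putting the pieces above together, a sketch of cancelling a task that has already begun running might look like this. The class name CancelSketch, the latches and the 60-second sleep are assumptions made for the demonstration only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical example: cancel a running task through its Future.
class CancelSketch {
    // Returns true if the task observed the interrupt caused by cancel(true).
    static boolean cancelRunningTask() {
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch interrupted = new CountDownLatch(1);
        try {
            Future<?> future = executorService.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(60_000); // stands in for long-running work
                } catch (InterruptedException e) {
                    interrupted.countDown(); // cancel(true) interrupted the worker thread
                }
            });
            started.await();     // make sure the task has begun executing
            future.cancel(true); // true = interrupt the thread running the task
            return future.isCancelled() && interrupted.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executorService.shutdownNow();
        }
    }
}
```

Note that the task still has to cooperate: cancel(true) only delivers an interrupt, so a task that ignores interrupts keeps running.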
You said:
looks much safer for me as I can ensure that thread killing will not lead to any concurrency problems.
Using the Executors framework, you would not be creating or killing any threads. The executor service implementation handles the threads. Your code never addresses the Thread class directly.
So no concurrency problems of that kind.
But you may have other concurrency problems if you share any resources across threads. I highly recommend reading Java Concurrency in Practice by Brian Goetz et al.
You said:
But I'm not sure that it won't produce a big overhead.
As the correct Answer by rzwitserloot explained, your approach would certainly create much more overhead than would the use of the Executors framework.
FYI, in the future Project Loom will bring virtual threads (fibers) to the Java platform. This will generally make background threading even faster, and will make practical having many thousands or even millions of non-CPU-bound tasks. Special builds available now on early-access Java 16.
ExecutorService executorService = newVirtualThreadExecutor() ;
executorService.submit( task ) ;
I have only two short-lived tasks to run in the background upon the start of the application. Would it make sense to use a thread for each task or an Executor, for instance, a single thread executor to submit these two tasks.
Does it make sense to create two threads that die quickly as opposed to having a single threaded executor waiting for tasks throughout the lifecycle of the application when there are none?
One big benefit of using a threadpool is that you avoid the scenario where you have some task that you perform repeatedly then, if something goes wrong with that task that causes the thread to hang, you're at risk of losing a thread every time the task happens, resulting in running the application out of threads. If your threads only run once on startup then it seems likely that risk wouldn't apply to your case.
You could still use Executor, but shut it down once your tasks have both run. It might be preferable to use Futures or a CompletionService over raw threads.
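For instance, a sketch of running two startup tasks and then releasing the pool (StartupTasks and the two task bodies are placeholders; your real tasks would go in their place):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: run two short startup tasks, then shut the executor down.
class StartupTasks {
    static List<String> runBoth() {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            tasks.add(() -> "cache warmed");  // placeholder for the first startup task
            tasks.add(() -> "config loaded"); // placeholder for the second startup task
            List<String> results = new ArrayList<>();
            // invokeAll blocks until both tasks are done and keeps the submission order.
            for (Future<String> f : executor.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during startup", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("startup task failed", e.getCause());
        } finally {
            executor.shutdown(); // no idle threads linger for the rest of the application's life
        }
    }
}
```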
If you do this more than once in your application, ThreadPoolExecutor is definitely worth a look.
One benefit is the pooling of threads. This relieves the runtime of having to create and destroy OS objects every time you need a thread. Additionally, you get control over the number of threads spawned (though this does not seem to be a big issue for you) and over which threads are running or done.
But if you actually really only spawn two threads over the runtime of your application, the executors may be oversized, but they are nevertheless very comfortable to work with.
Since Nathan added Futures, there is also Timer and TimerTask. Also very convenient for "Fire and Forget" type of background action :-).
I am creating thread pools like this:
ExecutorService workers = Executors.newCachedThreadPool();
Invoking each pool's tasks like this:
workers.invokeAll(tasks);
And after completion shutting those down like this:
workers.shutdown();
I have about 4 thread pools that do different procedures and those thread pools are being created from a servlet class.
What I want to do is shutdown all threads in those thread pools.
What is the cleanest way to achieve this?
Thanks
If all your worker tasks handle interrupts properly you could try to invoke:
workers.shutdownNow()
That call will typically send interrupts to all worker threads. However, proper interrupt handling is a bit implicit, and the method documentation says that only a best-effort attempt to stop the tasks is made. Hence, some JVM implementations might make a worse attempt than sending interrupts, which is why you might not want to trust this call.
You might want to look into other answers on how to gracefully ensure proper shutdown of threads and implement such a solution for all your worker tasks, to guarantee proper shutdown. For example, in this answer, Jack explains the typical solution of having a volatile field that you check in your workers. This field can be set from wherever you want to stop your tasks.
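A sketch of that volatile-flag idea combined with shutdownNow() as a backup (StoppableTask and ShutdownSketch are hypothetical names; the busy loop merely stands in for real work):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical worker task that checks a volatile stop flag as well as the interrupt flag.
class StoppableTask implements Runnable {
    private volatile boolean running = true;

    void stop() {
        running = false;
    }

    @Override
    public void run() {
        while (running && !Thread.currentThread().isInterrupted()) {
            Thread.onSpinWait(); // stands in for one small unit of work (Java 9+)
        }
    }
}

class ShutdownSketch {
    // Returns true if every worker thread terminated within the timeout.
    static boolean stopAll() {
        ExecutorService workers = Executors.newCachedThreadPool();
        StoppableTask first = new StoppableTask();
        StoppableTask second = new StoppableTask();
        workers.submit(first);
        workers.submit(second);
        first.stop();          // cooperative stop via the volatile flag
        second.stop();
        workers.shutdownNow(); // best-effort interrupt for anything that is blocked
        try {
            return workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```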
I know the thread pool is a good thing because it can reuse threads and thus save the cost of creating new threads. But my question is, are there any disadvantages of using a thread pool? In which situation is using a thread pool not as good as using just individual threads?
In which situation is using a thread pool not as good as using just individual threads?
The only time I can think of is when you have a single thread that only needs to do a single task for the life of your program. Something like a background thread attached to a permanent cache or something. That's about the only time I fork a thread directly as opposed to using an ExecutorService. Even then, using an Executors.newSingleThreadExecutor() would be fine. The overhead of the thread pool itself is maybe a bit more logic and some memory, but it is very hard to see a pressing downside.
Certainly anytime you need multiple threads to perform tasks, a thread pool is warranted. What the ExecutorService code does is reduce the amount of code you need to write to manage the threads. The improvements in readability and code maintainability are a big win.
A thread pool is suitable only when you use it for operations that take little time to complete. Thread pool threads are not suitable for long-running operations, as that can easily lead to thread starvation.
If you require your thread to have a specific priority, then threadpool thread is not suitable.
You have tasks that cause the thread to block for long periods of time. The thread pool has a maximum number of threads, so a large number of blocked thread pool threads might prevent tasks from starting.
You've got a bunch of different answers here. I think one reason for that is the question is incomplete. You are asking for "disadvantages of using a thread pool," but you didn't say, disadvantages compared to what?
A thread pool solves a particular problem. There are other problems where "thread" or "threads" is part of the solution, but "thread pool" is not. "Thread pool" usually is the answer, when the question is, how to achieve parallel execution of many, short-lived, CPU-intensive tasks, on a multi-processor system.
Threads are useful, even on a uni-processor, for other purposes. The first question I ask about any long-running thread, for example, is "what does it wait for." Threads are an excellent tool for organizing a program that has to wait for different kinds of event. You would not use a thread pool for that, though.
In addition to Gray's answer.
Another use case is when you use thread locals, use the thread as a key in some kind of hash table, or have a stateful custom Thread implementation. In this case you have to take care to clean up the state when a particular task has finished using the thread, even if it failed. Otherwise some surprises are possible: the next task that runs on a thread carrying leftover state can start to malfunction.
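To illustrate the cleanup, here is a sketch assuming a hypothetical CURRENT_USER thread local (all names are invented for the example). The finally block makes sure the pooled thread is handed back without leftover state:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example: clear thread-local state before the pooled thread is reused.
class ThreadLocalCleanup {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Returns true if a later task on the same pooled thread still sees the first task's state.
    static boolean staleStateVisible() {
        ExecutorService pool = Executors.newSingleThreadExecutor(); // one thread, so it is reused
        try {
            Runnable firstTask = () -> {
                CURRENT_USER.set("alice");
                try {
                    // ... task body that reads CURRENT_USER ...
                } finally {
                    CURRENT_USER.remove(); // runs even if the task body throws
                }
            };
            Callable<Boolean> secondTask = () -> CURRENT_USER.get() != null;
            pool.submit(firstTask).get();
            return pool.submit(secondTask).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Without the remove() call, the second task would find "alice" still attached to the thread.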
Thread pools of limited size are dangerous if the tasks running on them exchange information via blocking queues: this may cause thread starvation. A good rule is to never use blocking operations in tasks running on a thread pool.
Threads are better when you don't plan to stop using the thread, for instance in an infinite loop. Thread pools are best when doing many tasks that don't all happen at the same time. Especially when the tasks are short, the savings in overhead and the gain in clarity from reusing the same thread are greater.
It depends on the situation in which you are going to use the thread pool. For example, if your system does not need to perform tasks in parallel, a thread pool would be of no use. It would keep unnecessary threads ready for work that will never come. In such cases you can use a SingleThreadExecutor anyway. If you haven't already, read up on the Thread Pool Pattern; it may clarify things.
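The starvation risk can be reproduced in a few lines. In this sketch (StarvationSketch is a made-up name, and the pool size of 1 and the 200 ms timeout are chosen only to make the effect visible), a task blocks on work that is queued behind it in the same pool:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical example: a task starves because it blocks on work queued in its own bounded pool.
class StarvationSketch {
    // Returns true if the outer task timed out waiting for an inner task that could not start.
    static boolean outerStarves() {
        ExecutorService pool = Executors.newFixedThreadPool(1); // deliberately too small
        try {
            Callable<Boolean> outer = () -> {
                Callable<String> innerTask = () -> "done";
                Future<String> inner = pool.submit(innerTask); // queued behind the outer task
                try {
                    inner.get(200, TimeUnit.MILLISECONDS); // blocks the only worker thread
                    return false;
                } catch (TimeoutException e) {
                    return true; // without the timeout this wait would never end
                }
            };
            return pool.submit(outer).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }
}
```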
Could please somebody tell me a real life example where it's convenient to use this factory method rather than others?
newSingleThreadExecutor
public static ExecutorService newSingleThreadExecutor()
Creates an Executor that uses a single worker thread operating off an
unbounded queue. (Note however that if this single thread terminates
due to a failure during execution prior to shutdown, a new one will
take its place if needed to execute subsequent tasks.) Tasks are
guaranteed to execute sequentially, and no more than one task will be
active at any given time. Unlike the otherwise equivalent
newFixedThreadPool(1) the returned executor is guaranteed not to be
reconfigurable to use additional threads.
Thanks in advance.
Could please somebody tell me a real life example where it's convenient to use [the newSingleThreadExecutor() factory method] rather than others?
I assume you are asking about when you use a single-threaded thread-pool as opposed to a fixed or cached thread pool.
I use a single threaded executor when I have many tasks to run but I only want one thread to do it. This is the same as using a fixed thread pool of 1 of course. Often this is because we don't need them to run in parallel, they are background tasks, and we don't want to take too many system resources (CPU, memory, IO). I want to deal with the various tasks as Callable or Runnable objects so an ExecutorService is optimal but all I need is a single thread to run them.
For example, I have a number of timer tasks that I spring inject. I have two kinds of tasks and my "short-run" tasks run in a single thread pool. There is only one thread that executes them all even though there are a couple of hundred in my system. They do routine tasks such as checking for disk space, cleaning up logs, dumping statistics, etc. For the tasks that are time critical, I run in a cached thread pool.
Another example is that we have a series of partner integration tasks. They don't take very long and they run rather infrequently and we don't want them to compete with other system threads so they run in a single threaded executor.
A third example is that we have a finite state machine where each of the state mutators takes the job from one state to another and is registered as a Runnable in a single thread-pool. Even though we have hundreds of mutators, only one task is valid at any one point in time so it makes no sense to allocate more than one thread for the task.
Apart from the reasons already mentioned, you would want to use a single-threaded executor when you want ordering guarantees, i.e., you need to make sure that whatever tasks are being submitted will always execute in the order they were submitted.
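A quick sketch of that ordering guarantee (OrderingSketch is a hypothetical name; the tasks just record their own id):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example: tasks on a single-threaded executor run in submission order.
class OrderingSketch {
    static List<Integer> runInOrder(int n) {
        ExecutorService single = Executors.newSingleThreadExecutor();
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < n; i++) {
            final int id = i;
            single.execute(() -> seen.add(id)); // each task records its own id
        }
        single.shutdown(); // already queued tasks still run, in order
        try {
            if (!single.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return new ArrayList<>(seen);
    }
}
```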
The difference between Executors.newSingleThreadExecutor() and Executors.newFixedThreadPool(1) is small but can be helpful when designing a library API. If you expose the returned ExecutorService to users of your library and the library works correctly only when the executor uses a single thread (tasks are not thread safe), it is preferable to use Executors.newSingleThreadExecutor(). Otherwise the user of your library could break it by doing this:
ExecutorService e = myLibrary.getBackgroundTaskExecutor();
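((ThreadPoolExecutor)e).setMaximumPoolSize(10); // newer JDKs reject a core size above the maximum, so raise it first
((ThreadPoolExecutor)e).setCorePoolSize(10);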
, which is not possible for Executors.newSingleThreadExecutor().
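The difference can be checked directly. This sketch relies on the observation that, in OpenJDK, newFixedThreadPool returns a plain ThreadPoolExecutor while newSingleThreadExecutor returns a wrapper; the wrapper part is backed by the documented "not reconfigurable" guarantee, the former is an implementation detail. ReconfigureSketch is a made-up class name.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

// Hypothetical example: only the fixed pool can be cast and reconfigured.
class ReconfigureSketch {
    // Casts a fixed pool of size 1 and grows it; returns the new core pool size.
    static int reconfigureFixedPool() {
        ExecutorService e = Executors.newFixedThreadPool(1);
        try {
            ThreadPoolExecutor pool = (ThreadPoolExecutor) e; // works for the fixed pool in OpenJDK
            pool.setMaximumPoolSize(10); // raise the maximum before the core size
            pool.setCorePoolSize(10);
            return pool.getCorePoolSize();
        } finally {
            e.shutdown();
        }
    }

    // Reports whether the single-threaded executor could be cast the same way.
    static boolean singleThreadExecutorCanBeCast() {
        ExecutorService e = Executors.newSingleThreadExecutor();
        try {
            return e instanceof ThreadPoolExecutor; // false: a cast would throw ClassCastException
        } finally {
            e.shutdown();
        }
    }
}
```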
It is helpful when you need a lightweight service which only makes it convenient to defer task execution, and you want to ensure only one thread is used for the job.