Which is easier and more suitable for running things in another thread, notably so that the program waits for the result but doesn't lock up the UI?
There may also be a method that is better than either of these, but I don't know of one.
Thanks :)
Runnable represents the code to be executed.
Executor and its subclasses represent execution strategies.
This means that the former is actually consumed by the latter. What you probably meant is: between simple threads and executors, which is more suitable?
The answer to this question is basically: it depends.
Executors are sophisticated tools, which let you choose how many concurrent tasks may be running, and tune different aspects of the execution context. They also provide facilities to monitor the tasks' executions, by returning a token (called a Future or sometimes a promise) which lets the code requesting the task execution query for that task's completion.
Threads are a less elaborate (or more barebones) solution to executing code asynchronously. You can still have them return a Future by hand, or simply check if the thread is still running.
So depending on how much sophistication you require, you will pick one or the other: Executors for more streamlined requirements (many tasks to execute and monitor), Threads for one-shot or simpler situations.
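To make the comparison concrete, here is a minimal sketch of both styles. The class BackgroundWork and the method slowSquare are made up for illustration; they stand in for whatever slow work you have. With a plain thread you carry the result back yourself, while the executor hands you a Future.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BackgroundWork {
    // Hypothetical stand-in for a slow computation.
    static int slowSquare(int n) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return n * n;
    }

    // Plain thread: the result has to be handed back by hand.
    static int withThread(int n) {
        int[] result = new int[1];
        Thread worker = new Thread(() -> result[0] = slowSquare(n));
        worker.start();
        try {
            worker.join(); // blocks the caller until the worker is done
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    // Executor: submit returns a Future that can be polled or waited on.
    static int withExecutor(int n) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> job = () -> slowSquare(n);
            Future<Integer> future = executor.submit(job);
            return future.get(); // future.isDone() would be the non-blocking check
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(withThread(4));
        System.out.println(withExecutor(5));
    }
}
```

Note that join() and get() both block the calling thread, so on a UI thread you would rather check isDone() from a timer or have the task post its result back to the UI when it finishes.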
A few words about what I'm planning to do. I need to create some task executor, that will poll tasks from queue and just execute code in this task. And for this I need to implement some interrupt mechanism to enable user to stop this task.
So I see two possible solutions: 1. start a pool of threads and stop them by using .destroy() method of a thread. (I will not use any shared objects) 2. Use pool of separated processes and System.exit() or kill signal to process. Option 2. looks much safer for me as I can ensure that thread killing will not lead to any concurrency problems. But I'm not sure that it won't produce a big overhead.
Also I'm not sure about the JVM: if I use separate processes, each process will be using a separate JVM, and that can bring a lot of overhead. Or not. So my question is this. Choosing a different language without a runtime for the worker process is a possible option for me, but I still don't have enough experience with processes and don't know about the overhead.
start a pool of threads and stop them by using .destroy() method of a thread. (I will not use any shared objects)
You can't stop threads on modern VMs unless said thread is 'in on it'. destroy and friends do not actually do what you want, and this is unsafe. The right way is to call interrupt(). If the thread wants to annoy you and not actually stop in the face of an interrupt call, it can. The solution is to fix the code so that it doesn't do that anymore. Note that raising the interrupt flag is guaranteed to stop any sleeping method that is specced to throw InterruptedException (sleep, wait, etc.), and on most OSes it will also cause any I/O call that is currently frozen to exit by throwing an IOException, but there is no guarantee for this.
Use pool of separated processes and System.exit() or kill signal to process.
Hella expensive; a VM is not a light thing to spin up; it'll have its own copy of all the classes (even something as simple as java.lang.String and company). 10 VMs is a stretch. Whereas 1000 threads is no problem.
And for this I need to implement some interrupt mechanism to enable user to stop this task.
The real problem is that this is very difficult to guarantee. But if you control the code that needs interrupting, then usually no big deal. Just use the interrupt() mechanism.
EDIT: In case you're wondering how to do the interrupt thing: Raising the interrupt flag on a thread just raises the flag; nothing else happens unless you write code that interacts with it, or call a method that does.
There are 3 main interactions:
All things that block and are declared to throw InterruptedException will lower the flag and throw InterruptedException. If the flag is up and you call Thread.sleep, that will immediately clear the flag and throw that exception without ever even waiting. Thus, catch that exception, and return/abort/break off the task.
Thread.interrupted() will lower the flag and return true (thus, does so only once). Put this in your event loops. It's not public void run() {while (true) { ... }} or while (running) {} or whatnot, it's while (!Thread.interrupted()) or possibly while (running && !Thread.interrupted()).
Any other blocking method may or may not; java intentionally doesn't specify either way because it depends on OS and architecture. If they do (and many do), they can't throw InterruptedException, as e.g. FileInputStream.read isn't specced to throw it. They throw IOException with a message indicating an abort happened.
Ensure that these 3 code paths one way or another lead to a task that swiftly ends, and you have what you want: user-interruptible tasks.
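As a rough sketch of the first two code paths (the class name InterruptibleWorker and its fields are invented for this example), a task can poll the flag in its loop condition and also treat InterruptedException from a blocking call as a request to stop. The third path, I/O calls throwing IOException, is left out because its behaviour is platform dependent.

```java
import java.util.concurrent.atomic.AtomicLong;

class InterruptibleWorker implements Runnable {
    final AtomicLong iterations = new AtomicLong();
    volatile boolean stoppedByInterrupt = false;

    @Override
    public void run() {
        // Path 2: Thread.interrupted() is checked on every pass of the loop.
        while (!Thread.interrupted()) {
            iterations.incrementAndGet(); // stands in for one unit of real work
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                // Path 1: a blocking call noticed the flag (and cleared it), so stop.
                break;
            }
        }
        // The loop above only ends through an interrupt.
        stoppedByInterrupt = true;
    }

    // Starts the worker, interrupts it shortly after, and reports whether it ended.
    static boolean demo() {
        InterruptibleWorker worker = new InterruptibleWorker();
        Thread thread = new Thread(worker);
        thread.start();
        try {
            Thread.sleep(50);
            thread.interrupt();
            thread.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive() && worker.stoppedByInterrupt;
    }

    public static void main(String[] args) {
        System.out.println("stopped: " + demo());
    }
}
```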
Executors framework
Java already provides a facility with your desired features, the Executors framework.
You said:
I need to create some task executor, that will poll tasks from queue and just execute code in this task.
The ExecutorService interface does just that.
Choose an implementation meeting your needs from the Executors class. For example, if you want to run your tasks in the sequence of their submission, use a single-threaded executor service. You have several others to choose from if you want other behavior.
ExecutorService executorService = Executors.newSingleThreadExecutor() ;
You said:
start a pool of threads
The executor service may be backed by a pool of threads.
ExecutorService executorService = Executors.newFixedThreadPool( 3 ) ; // Create a pool of exactly three threads to be used for any number of submitted tasks.
You said:
just execute code in this task
Define your task as a class implementing either Runnable or Callable. That means your class carries a run method, or a call method.
Runnable task = ( ) -> System.out.println( "Doing this work on a background thread. " + Instant.now() );
You said:
will poll tasks from queue
Submit your tasks to be run. You can submit many tasks, either of the same class or of different classes. The executor service maintains a queue of submitted tasks.
executorService.submit( task );
Optionally, you may capture the Future object returned.
Future<?> future = executorService.submit( task );
That Future object lets you check to see if the task has finished or has been cancelled.
if( future.isDone() ) { … }
You said:
enable user to stop this task
If you want to cancel the task, call Future::cancel.
Pass true if you want to interrupt the task if it has already begun execution.
Pass false if you only want to cancel the task before it has begun execution.
future.cancel( true );
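Putting the pieces together, a small sketch of cancelling a task that is already running might look like this. The class CancelDemo and the two latches are only there to make the example observable; the point is that cancel( true ) arrives in the task as an interrupt, so the task has to react to it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class CancelDemo {
    // Returns true if the running task saw the interrupt and exited.
    static boolean cancelRunningTask() {
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch stopped = new CountDownLatch(1);
        Runnable task = () -> {
            started.countDown();
            try {
                Thread.sleep(60_000); // stands in for long-running work
            } catch (InterruptedException e) {
                stopped.countDown();  // cancel( true ) interrupted the worker thread
            }
        };
        try {
            Future<?> future = executorService.submit(task);
            started.await();                        // make sure the task has begun
            boolean accepted = future.cancel(true); // true: interrupt if running
            boolean exited = stopped.await(2, TimeUnit.SECONDS);
            return accepted && future.isCancelled() && exited;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executorService.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("cancelled: " + cancelRunningTask());
    }
}
```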
You said:
looks much safer for me as I can ensure that thread killing will not lead to any concurrency problems.
Using the Executors framework, you would not be creating or killing any threads. The executor service implementation handles the threads. Your code never addresses the Thread class directly.
So no concurrency problems of that kind.
But you may have other concurrency problems if you share any resources across threads. I highly recommend reading Java Concurrency in Practice by Brian Goetz et al.
You said:
But I'm not sure that it won't produce a big overhead.
As the correct Answer by rzwitserloot explained, your approach would certainly create much more overhead than would the use of the Executors framework.
FYI, in the future Project Loom will bring virtual threads (fibers) to the Java platform. This will generally make background threading even faster, and will make practical having many thousands or even millions of non-CPU-bound tasks. Special builds available now on early-access Java 16.
ExecutorService executorService = newVirtualThreadExecutor() ;
executorService.submit( task ) ;
Use case: tasks are generated in one thread, need to be distributed for computation to many threads and finally the generating task shall reap the results and mark the tasks as done.
I found the class ExecutorCompletionService which fits the use case nearly perfectly --- except that I see no good solution for non-idle waiting. Let me explain.
In principle my code would look like
while (true) {
    MyTask t = generateNextTask();
    if (t != null) {
        completionService.submit(t);
    }
    MyTask finished;
    while (null != (finished = completionService.poll())) {
        retireTask(finished);
    }
}
Both generateNextTask() and completionService.poll() may return null: the former if there are currently no new tasks available, the latter if no task has currently returned from the CompletionService.
In these cases, the loop degenerates into an ugly idle-wait. I could poll() with a timeout or add a Thread.sleep() for the double-null case, but I consider this a bad workaround, because it nevertheless wastes CPU and is not as responsive as possible, due to the wait.
Suppose I replace generateNextTask() by a poll() on a BlockingQueue, is there a good way to poll the queue as well as the CompletionService in parallel, to be woken up for work on whichever end something becomes available?
Actually this reminds me of Selector. Is something like it available for queues?
You should use CompletionService.take() to wait until the next task completes and retrieve its Future. poll() is the non-blocking version, returning null if no task is currently completed.
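For illustration, a minimal sketch of the blocking variant (the class TakeDemo and the squaring tasks are placeholders, not your MyTask): every take() sleeps until some task has completed, so there is no idle spinning.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class TakeDemo {
    // Submits `count` small tasks and collects their results with take().
    static List<Integer> squares(int count) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        CompletionService<Integer> completionService = new ExecutorCompletionService<>(pool);
        List<Integer> results = new ArrayList<>();
        try {
            for (int i = 1; i <= count; i++) {
                final int n = i;
                completionService.submit(() -> n * n);
            }
            for (int i = 0; i < count; i++) {
                // take() blocks until the next task has completed, in completion order.
                results.add(completionService.take().get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        // Completion order is not deterministic, so sort for a stable result.
        Collections.sort(results);
        return results;
    }

    public static void main(String[] args) {
        System.out.println(squares(4));
    }
}
```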
Also, your code seems to be inefficient, because you produce and consume tasks one at a time, instead of allowing multiple tasks to be processed in parallel. Consider having a different thread for task generation and for task results consumption.
-- Edit --
I think that given the constraints you mention in your comments, you can't achieve all your requirements.
Requiring the main thread to be producer and consumer, and disallowing any busy loop or timed loop, you can't avoid the scenario where a blocking wait for a task completion takes too long and no other task gets processed in the meanwhile.
Since you "can replace generateNextTask() by a poll() on a BlockingQueue", I assume incoming tasks can be put in a queue by some other thread, and the problem is, you cannot execute take() on 2 queues simultaneously. The solution is to simply put both incoming and finished tasks in the same queue. To differentiate, wrap them in objects of different types, and then check that type in the loop after take().
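A minimal sketch of the single-queue idea, with made-up names (MergedQueueDemo, Incoming, Finished) and a trivial doubling computation standing in for the real work. In this sketch the incoming tasks are queued up front instead of by another thread, to keep it short.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class MergedQueueDemo {
    interface Event {}

    // Wrapper for a task that still has to be computed.
    static final class Incoming implements Event {
        final int input;
        Incoming(int input) { this.input = input; }
    }

    // Wrapper for a task whose computation is done.
    static final class Finished implements Event {
        final int result;
        Finished(int result) { this.result = result; }
    }

    // Dispatches and retires `taskCount` tasks; returns the sum of their results.
    static int run(int taskCount) {
        BlockingQueue<Event> events = new LinkedBlockingQueue<>();
        ExecutorService workers = Executors.newFixedThreadPool(2);
        for (int i = 1; i <= taskCount; i++) {
            events.add(new Incoming(i)); // normally done by the producing thread
        }
        int retired = 0;
        int sum = 0;
        try {
            while (retired < taskCount) {
                Event event = events.take(); // the only blocking wait in the loop
                if (event instanceof Incoming) {
                    int input = ((Incoming) event).input;
                    // The worker reports completion through the same queue.
                    workers.execute(() -> events.add(new Finished(input * 2)));
                } else if (event instanceof Finished) {
                    sum += ((Finished) event).result;
                    retired++;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            workers.shutdown();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(3));
    }
}
```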
This solution works, but we can go further. You said you don't want to use 2 threads for handling tasks - then you can use zero threads. Let the wrappers implement Runnable and, instead of checking the type, just call take().run(). This way your thread becomes a single-threaded Executor. But we already have an Executor (CompletionService), can we use it? The problem is, handling of incoming and finished tasks should be done serially, not in parallel. So we need the SerialExecutor described in api/java/util/concurrent/Executor, which accepts Runnables and executes them serially, but on another executor. This way no thread is wasted.
And finally, you mentioned Selector as a possible solution. I must say, it is an outdated approach. Learn dataflow and actor computing. A nice introduction is here. Look at my Dataflow4java project: it has a MultiPortActorTest.java example, where the class Accum does what you need, with all the boilerplate of wrapper Runnables and serial executors hidden in the supporting library.
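For reference, here is a sketch of such a serial executor, closely following the example in the Executor Javadoc. The SerialExecutorDemo class around it is my own addition to show that tasks run one at a time and in submission order, even though the backing pool has four threads.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SerialExecutor implements Executor {
    private final Queue<Runnable> tasks = new ArrayDeque<>();
    private final Executor executor;
    private Runnable active;

    SerialExecutor(Executor executor) {
        this.executor = executor;
    }

    @Override
    public synchronized void execute(Runnable r) {
        tasks.add(() -> {
            try {
                r.run();
            } finally {
                scheduleNext(); // hand the next queued task to the backing executor
            }
        });
        if (active == null) {
            scheduleNext();
        }
    }

    private synchronized void scheduleNext() {
        if ((active = tasks.poll()) != null) {
            executor.execute(active);
        }
    }
}

class SerialExecutorDemo {
    // Runs `count` tasks through a SerialExecutor and returns the order they ran in.
    static List<Integer> demo(int count) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        SerialExecutor serial = new SerialExecutor(pool);
        List<Integer> seen = new ArrayList<>(); // safe: tasks never overlap
        CountDownLatch done = new CountDownLatch(count);
        for (int i = 0; i < count; i++) {
            final int n = i;
            serial.execute(() -> {
                seen.add(n);
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo(5));
    }
}
```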
What you need is a ListenableFuture from Guava. ListenableFutureExplained
I am working on an application that at some point starts a worker thread. This thread's behaviour will vary greatly depending on the parameters used to start it, but the following list of properties apply:
It will do some minor I/O operations
It will spend minor time in 3rd party libraries
It may create some worker threads for a certain subtask (these threads will not be reused after their task is finished)
It will spend most of its time crunching numbers (there are no blocking calls present)
Due to the possible long duration (5 minutes up to several hours, depending on the input), we want to be able to abort the calculation. If we choose to abort it, we no longer care about the output, and the thread is in fact wasting valuable resources as long as it keeps running. Since the code is under our control, the advised way is to use interrupts to indicate an abort.
While most examples on the web deal with a worker thread that is looping over some method, this is not the case for me (similar question here). There are also very few blocking calls in this work thread, in which case this article advises to manually check the interrupt flag. My question is: How to deal with this interrupt?
I see several options, but can't decide which is the most "clean" approach. Despite my practical example, I'm mainly interested in the "best practice" on how to deal with this.
Throw some kind of unchecked exception: this would kill the thread in a quick and easy way, but it reminds me of the ThreadDeath approach used by the deprecated Thread#stop() method, with all its related problems. I can see this approach being acceptable in owned code (due to the known logic flow), but not in library code.
Throw some kind of checked exception: this would kill the thread in a quick and easy way, and alleviates the ThreadDeath-like problems by enforcing programmers to deal with this event. However, it places a big burden on the code, requiring the exception to be mentioned everywhere. There is a reason not everything throws an InterruptedException.
Exit the methods with a "best result so far" or empty result. Because of the amount of classes involved, this will be a very hard task. If not enough care is taken, NullPointerExceptions might arise from empty results, leading to the same problems as point 1. Finding these causes would be next to impossible in large code bases.
I suggest you check Thread.currentThread().isInterrupted() periodically, at points where you know it is safe to stop, and stop if it is set.
You could do this in a method which checks this flag and throws a custom unchecked exception or error.
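Something along these lines, as a sketch only: the names Checkpoints, checkInterrupted and CalculationAbortedException are invented here, and how often you call the check depends on how quickly you need the abort to take effect.

```java
class Checkpoints {
    // Hypothetical unchecked exception used to unwind the calculation.
    static final class CalculationAbortedException extends RuntimeException {
        CalculationAbortedException(String message) {
            super(message);
        }
    }

    // Call this at points where it is safe to stop.
    static void checkInterrupted() {
        if (Thread.currentThread().isInterrupted()) {
            throw new CalculationAbortedException("calculation aborted");
        }
    }

    // Stand-in for number crunching without any blocking calls.
    static long crunch(long limit) {
        long sum = 0;
        for (long i = 0; i < limit; i++) {
            if ((i & 0xFFFF) == 0) {
                checkInterrupted(); // only every 65536 iterations, to keep it cheap
            }
            sum += i % 7;
        }
        return sum;
    }

    // Starts a very long calculation, interrupts it, and reports whether it aborted.
    static boolean demo() {
        boolean[] aborted = new boolean[1];
        Thread worker = new Thread(() -> {
            try {
                crunch(Long.MAX_VALUE);
            } catch (CalculationAbortedException e) {
                aborted[0] = true;
            }
        });
        worker.start();
        try {
            Thread.sleep(20);
            worker.interrupt();
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return aborted[0] && !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("aborted: " + demo());
    }
}
```

isInterrupted() does not clear the flag, so code further up the stack (or the executor running the task) can still see that an interrupt was requested.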
What about using an ExecutorService to execute the task? Check out the methods that let you specify a timeout, e.g.
ExecutorService executor = Executors.newSingleThreadExecutor();
executor.invokeAll(Arrays.asList(new Task()), 10, TimeUnit.MINUTES); // Timeout of 10 minutes.
executor.shutdown();
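Here Task has to implement Callable, since invokeAll accepts a collection of Callable tasks; an existing Runnable can be adapted with Executors.callable(...).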
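A self-contained sketch of the timeout behaviour, with a made-up class TimeoutDemo and a sleeping Callable in place of the real calculation. Tasks that have not completed when the timeout expires are cancelled, which shows up as isCancelled() on the returned Future.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class TimeoutDemo {
    // Runs one task that needs workMillis; returns true if the timeout cancelled it.
    static boolean timedOut(long workMillis, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Callable<String> task = () -> {
            Thread.sleep(workMillis); // stands in for the long calculation
            return "done";
        };
        try {
            List<Future<String>> futures = executor.invokeAll(
                    Collections.singletonList(task), timeoutMillis, TimeUnit.MILLISECONDS);
            // invokeAll returns once all tasks completed or the timeout expired.
            return futures.get(0).isCancelled();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("slow task cancelled: " + timedOut(5_000, 100));
        System.out.println("fast task cancelled: " + timedOut(10, 5_000));
    }
}
```

The cancellation is delivered as an interrupt, so a pure number-crunching task still has to check the interrupt flag itself to actually stop.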
Could somebody please tell me a real-life example where it's convenient to use this factory method rather than others?
newSingleThreadExecutor
public static ExecutorService newSingleThreadExecutor()
Creates an Executor that uses a single worker thread operating off an
unbounded queue. (Note however that if this single thread terminates
due to a failure during execution prior to shutdown, a new one will
take its place if needed to execute subsequent tasks.) Tasks are
guaranteed to execute sequentially, and no more than one task will be
active at any given time. Unlike the otherwise equivalent
newFixedThreadPool(1) the returned executor is guaranteed not to be
reconfigurable to use additional threads.
Thanks in advance.
Could somebody please tell me a real-life example where it's convenient to use [the newSingleThreadExecutor() factory method] rather than others?
I assume you are asking about when you use a single-threaded thread-pool as opposed to a fixed or cached thread pool.
I use a single threaded executor when I have many tasks to run but I only want one thread to do it. This is the same as using a fixed thread pool of 1 of course. Often this is because we don't need them to run in parallel, they are background tasks, and we don't want to take too many system resources (CPU, memory, IO). I want to deal with the various tasks as Callable or Runnable objects so an ExecutorService is optimal but all I need is a single thread to run them.
For example, I have a number of timer tasks that I inject with Spring. I have two kinds of tasks and my "short-run" tasks run in a single-thread pool. There is only one thread that executes them all even though there are a couple of hundred in my system. They do routine tasks such as checking for disk space, cleaning up logs, dumping statistics, etc. For the tasks that are time critical, I run in a cached thread pool.
Another example is that we have a series of partner integration tasks. They don't take very long and they run rather infrequently and we don't want them to compete with other system threads so they run in a single threaded executor.
A third example is that we have a finite state machine where each of the state mutators takes the job from one state to another and is registered as a Runnable in a single thread-pool. Even though we have hundreds of mutators, only one task is valid at any one point in time so it makes no sense to allocate more than one thread for the task.
Apart from the reasons already mentioned, you would want to use a single-threaded executor when you want ordering guarantees, i.e., you need to make sure that the submitted tasks always run in the order they were submitted.
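A tiny sketch of that guarantee (OrderingDemo is just an illustrative name): the tasks record their own number, and the recorded order matches the submission order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OrderingDemo {
    // Submits `count` numbered tasks and returns the order in which they ran.
    static List<Integer> runInOrder(int count) {
        ExecutorService single = Executors.newSingleThreadExecutor();
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < count; i++) {
            final int n = i;
            single.execute(() -> order.add(n));
        }
        single.shutdown(); // already submitted tasks still run
        try {
            single.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(order);
    }

    public static void main(String[] args) {
        System.out.println(runInOrder(5));
    }
}
```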
The difference between Executors.newSingleThreadExecutor() and Executors.newFixedThreadPool(1) is small but can be helpful when designing a library API. If you expose the returned ExecutorService to users of your library and the library works correctly only when the executor uses a single thread (tasks are not thread safe), it is preferable to use Executors.newSingleThreadExecutor(). Otherwise the user of your library could break it by doing this:
ExecutorService e = myLibrary.getBackgroundTaskExecutor();
((ThreadPoolExecutor)e).setCorePoolSize(10);
, which is not possible for Executors.newSingleThreadExecutor().
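To see the difference for yourself, a small check like the following can help. ReconfigureDemo is a made-up name, and the sketch raises the maximum pool size before the core size, because newer JDKs reject a core size above the current maximum.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

class ReconfigureDemo {
    // Returns true if the given executor could be reconfigured to use more threads.
    static boolean canReconfigure(ExecutorService e) {
        try {
            if (e instanceof ThreadPoolExecutor) {
                ThreadPoolExecutor pool = (ThreadPoolExecutor) e;
                pool.setMaximumPoolSize(10); // raise the maximum first
                pool.setCorePoolSize(10);
                return true;
            }
            // The single-thread executor is wrapped, so the cast is not possible.
            return false;
        } finally {
            e.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(canReconfigure(Executors.newFixedThreadPool(1)));
        System.out.println(canReconfigure(Executors.newSingleThreadExecutor()));
    }
}
```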
It is helpful when you need a lightweight service which only makes it convenient to defer task execution, and you want to ensure only one thread is used for the job.
This is an interview question, which means it should be doable in a short time.
I thought to ask here because I cannot figure out what to do if I were asked.
"Design and code a task scheduler that can take unsynchronized or synchronized tasks"
Please use your imagination/assumption and share your thoughts and comments.
This question is deliberately vague. It's supposed to show how good you are at designing and solving problems, what kind of assumptions you make, how you justify them, etc. There is no single good answer. It's a matter of approaching the problem.
That being said here is my take:
My scheduler can take arbitrary Runnable or Callable<V>, I will implement ScheduledExecutorService because it seems to be a good abstraction for the problem. I am using as many standard classes as I can to make API portable and easy to use.
By unsynchronized and synchronized I understand: tasks that are safe to run concurrently and tasks that require an exclusive lock. I.e. the scheduler is not allowed to run two synchronized tasks at the same time.
The distinction between synchronized and unsynchronized tasks will be made using a marker interface. An annotation is also fine, but harder to extract at runtime.
I won't give you the full implementation, but it'll probably wrap some standard ScheduledExecutorService with an additional synchronization for synchronized tasks. I think ConcurrentMap<Class, Semaphore> would do. Before running a task marked as synchronized I make sure no other synchronized task of the same type is running. I block and wait or reject (this can be configurable).
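A rough sketch of how that wrapping could look. It only covers one-shot schedule(...) and the "block and wait" variant, and all the names here (SyncAwareScheduler, SynchronizedTask, Probe) are invented for the example rather than taken from any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SyncAwareScheduler {
    // Hypothetical marker interface: tasks of the same class must not overlap.
    interface SynchronizedTask extends Runnable {}

    private final ScheduledExecutorService delegate = Executors.newScheduledThreadPool(4);
    private final ConcurrentMap<Class<?>, Semaphore> locks = new ConcurrentHashMap<>();

    ScheduledFuture<?> schedule(Runnable task, long delay, TimeUnit unit) {
        return delegate.schedule(wrap(task), delay, unit);
    }

    private Runnable wrap(Runnable task) {
        if (!(task instanceof SynchronizedTask)) {
            return task; // unsynchronized tasks run as they are
        }
        Semaphore permit = locks.computeIfAbsent(task.getClass(), c -> new Semaphore(1));
        return () -> {
            permit.acquireUninterruptibly(); // "block and wait" variant
            try {
                task.run();
            } finally {
                permit.release();
            }
        };
    }

    void shutdown() {
        delegate.shutdown();
    }

    // Test task that records how many instances were running at the same time.
    static final class Probe implements SynchronizedTask {
        static final AtomicInteger running = new AtomicInteger();
        static final AtomicInteger maxRunning = new AtomicInteger();

        @Override
        public void run() {
            maxRunning.accumulateAndGet(running.incrementAndGet(), Math::max);
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            running.decrementAndGet();
        }
    }

    // Schedules four Probe tasks at once and returns the observed maximum overlap.
    static int demo() {
        SyncAwareScheduler scheduler = new SyncAwareScheduler();
        List<ScheduledFuture<?>> futures = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            futures.add(scheduler.schedule(new Probe(), 0, TimeUnit.MILLISECONDS));
        }
        try {
            for (ScheduledFuture<?> future : futures) {
                future.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdown();
        }
        return Probe.maxRunning.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent synchronized tasks: " + demo());
    }
}
```

One trade-off of blocking inside the wrapper is that a waiting synchronized task occupies a pool thread; the "reject" variant would use tryAcquire() instead.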
I would use an ExecutorService as it's built in and does most of the things you would want. It doesn't care whether those tasks use synchronized or not.