Waiting for another thread in an ExecutorService scenario - Java

Suppose there are three tasks submitted to an executor service, and I want t2 to start running only after t1 finishes, and t3 only after t2. How can I achieve this kind of scenario with a thread pool?
If these were plain threads created with thread.start(), I could have waited using the join() method. But how do I handle the scenario above?

t1, t2 and t3 can implement the Callable interface, and from the call() method you can return some value.
Based on the return value, after t1 completes you can submit t2, and similarly for t3.
"Callable" is the answer for it.

You are confusing the notion of threads and what is executed on a thread. It doesn't matter when a thread "starts" in a thread pool, but when execution of your processing begins or continues. So the better statement is that you have 3 Callables or Runnables and you need one of them to wait for the other two before continuing. This is done using a CountDownLatch. Create a shared latch with a count of 2. Two of the Callables will call countDown() on the latch; the one that should wait will call await() (possibly with a timeout).
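A rough sketch of that latch arrangement, assuming the pool has enough threads that the waiting task cannot starve the other two (the task bodies are placeholders):

import java.util.concurrent.*;

public class LatchExample {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        CountDownLatch latch = new CountDownLatch(2);

        Runnable worker = () -> {
            // ... do some work ...
            latch.countDown();                      // signal completion
        };

        Runnable dependent = () -> {
            try {
                latch.await(10, TimeUnit.SECONDS);  // wait for both workers (with a timeout)
                // ... continue with work that needs the others to be finished ...
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        pool.submit(worker);
        pool.submit(worker);
        pool.submit(dependent);
        pool.shutdown();
    }
}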

Jobs submitted to an ExecutorService must be mutually independent. If you try to establish dependencies by waiting on Semaphores, CountDownLatches or similar, you run the risk of blocking the whole service when all available worker threads are executing jobs that wait for a job that has been submitted but is still behind the current jobs in the queue. You want to make sure you have more workers than possibly blocking jobs. In most cases, it is better to use more than one ExecutorService and submit each job of a dependent group to a different service.

A few options:
If this is the only scenario you have to deal with (t1->t2->t3), don't use a thread pool. Run the three tasks sequentially.
Use some inter-thread notification mechanism (e.g. BlockingQueue, CountDownLatch). This requires your tasks to hold a shared reference to the synchronization instrument you choose.
Wrap each dependent sequence in a new Runnable/Callable and submit it as a single task (sketched below). This approach is simple, but won't deal correctly with non-linear dependency topologies.
Every task that depends on another task should submit the other task for execution and wait for its completion. This is a generic approach for thread pools with dependencies, but it requires careful tuning to avoid possible deadlocks (running tasks may wait for tasks which don't have an available thread to run on; see my response here for a simple solution).
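Here is what the third option might look like: the whole t1 -> t2 -> t3 chain is wrapped into one Runnable, so the pool only ever sees a single task (the task bodies are placeholders):

import java.util.concurrent.*;

public class WrappedSequence {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);

        Runnable t1 = () -> System.out.println("t1");
        Runnable t2 = () -> System.out.println("t2");
        Runnable t3 = () -> System.out.println("t3");

        // The dependency t1 -> t2 -> t3 is encoded inside one task,
        // so no synchronization between pool threads is needed.
        pool.submit(() -> {
            t1.run();
            t2.run();
            t3.run();
        });

        pool.shutdown();
    }
}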

A couple of questions regarding Java ExecutorService newFixedThreadPool

Please note that I usually ask a question only after googling the issue more than 20 times, but I still can't understand this one, so I need your help.
Basically, I don't understand the exact usage of newFixedThreadPool.
Does newFixedThreadPool(10) mean having ten different threads? Or does it mean it can have 10 of the same thread? Or both?
I called submit() more than 20 times and it works.
Does submit() print a value? Or does it put threads into the ExecutorService?
Briefly, tasks are small units of code that could be executed in parallel (code sections). The threads (in a thread pool) are what execute them. You can think of the threads like workers and the tasks like jobs. Jobs can be done in parallel, and workers can work in parallel. Workers work on jobs.
So, to answer your questions:
newFixedThreadPool(int nThreads) creates a thread pool of nThreads threads that operate on the same input queue. nThreads is the maximum number of threads that can be running at any given time. Each thread can run a different task. With your example, you can be running up to 10 tasks at the same time. (The documentation can be found here with credit to #hovercraft-full-of-eels)
submit() pushes the given task into an event queue that is shared by the threads in the thread pool. Once a thread is available, it will take a task from the front of the queue and execute it. It shouldn't print anything, unless the Runnable you pass it has a print statement in it. However, the print statement may not be printed right when you submit the task! It will print once a thread is executing that particular task. (The documentation can be found here)
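A minimal sketch of the worker/job relationship described above, with a pool of 10 threads and 20 trivial tasks (all names here are illustrative):

import java.util.concurrent.*;

public class FixedPoolDemo {
    public static void main(String[] args) throws InterruptedException {
        // 10 worker threads sharing one task queue.
        ExecutorService pool = Executors.newFixedThreadPool(10);

        // Submit 20 jobs; at most 10 run at the same time,
        // the rest wait in the queue until a worker is free.
        for (int i = 0; i < 20; i++) {
            final int jobId = i;
            pool.submit(() -> System.out.println(
                    "job " + jobId + " on " + Thread.currentThread().getName()));
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}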
Just refer to the Java docs or the Java API description rather than googling it.
For your questions I have the comments below.
Question 1 ->
ExecutorService executorService = Executors.newFixedThreadPool(10);
First, an ExecutorService is created using the Executors.newFixedThreadPool() factory method. This creates a thread pool with 10 threads executing tasks.
The Executors.newFixedThreadPool API creates a thread pool that reuses a fixed number of threads, and these threads work on a shared unbounded queue.
At any point, at most nThreads threads will be active processing tasks.
If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available.
If any thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks. The threads in the pool will exist until the pool is explicitly shut down.
That is why it worked even when you submitted 20 tasks to this thread pool.
Internally it calls the following code:
public static ExecutorService newFixedThreadPool(int nThreads) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>());
}
Question 2 -> submit() submits a Runnable task for execution on the queue and returns a Future object representing that task. You can use the Future's get() method to check whether the submitted task has completed successfully: for a Runnable it will return null upon successful completion.
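A small sketch of that behaviour (the task body is a placeholder):

import java.util.concurrent.*;

public class SubmitDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(10);

        // submit(Runnable) returns a Future whose get() yields null
        // once the task has completed successfully.
        Future<?> future = pool.submit(() -> System.out.println("task ran"));
        Object result = future.get();                      // blocks until the task is done
        System.out.println("get() returned: " + result);   // prints: get() returned: null

        pool.shutdown();
    }
}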

Using a ScheduledThreadPool with parallel execution

Currently, I have an application that collects data every second and sends it to an API endpoint. To run every second, I am using a ScheduledThreadPoolExecutor that runs the thread which sends the data. The issue is that sending the data sometimes takes more than one second, and this causes the next set of data to be collected more than a second late. Is there any way this can be changed (or other libraries can be used) so that even if one thread has not finished sending the data, another thread can start running in parallel?
The usual way to deal with the desire for overlapping executions of the same scheduled task is to execute the (time-consuming) business logic of the task asynchronously.
In other words, when the once-per-second task is triggered, submit the real work to an ExecutorService (either the one you are using for the scheduled tasks or another one). This way, the scheduled task has already finished its own work (queueing the actual work) long before it is time for it to execute again.
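A sketch of that hand-off, assuming a one-second schedule and a separate worker pool; collectData() and sendData() are hypothetical stand-ins for the real work:

import java.util.concurrent.*;

public class AsyncSender {
    static Object collectData() { return "sample"; }       // placeholder for the real collection
    static void sendData(Object data) { /* slow API call */ }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        ExecutorService workers = Executors.newFixedThreadPool(4);

        // The scheduled task only queues the real work, so it finishes almost
        // immediately and the one-second cadence is preserved even when
        // sendData() takes longer than a second.
        scheduler.scheduleAtFixedRate(() -> {
            Object data = collectData();
            workers.submit(() -> sendData(data));
        }, 0, 1, TimeUnit.SECONDS);
    }
}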
Separate out the data collection and send tasks.
Do the data collection on a separate thread pool (or a scheduled single thread) and submit the collected data to another pool whose job is to publish it.
Assuming you are not concerned about out-of-order invocations of the "API endpoint", you can create the ScheduledThreadPoolExecutor with a corePoolSize > 1. In this way, every time the scheduler kicks in it will use the first available thread in the pool, and given a corePoolSize > 1 you would need several invocations to take more than 1s before you'd run out of threads.
For additional context: a ScheduledThreadPoolExecutor has a scheduling thread which checks for tasks and, on finding one, delegates the task to a worker thread from its internal pool. If the internal pool has a single thread (i.e. corePoolSize=1) then all tasks are executed serially and you cannot guarantee that the tasks will be executed every wait period (though you can be certain about ordering). If you want to insist on the tasks running on schedule and you are not concerned about ordering, then you can configure the pool with a corePoolSize which ensures that there is always an available thread in the 'worker' pool every time the scheduler finds a task.
Edit 1: if you are using scheduleAtFixedRate then the other answer, which refers to delegating the scheduled invocation to a separate thread pool, is an option. If you adopt this approach then corePoolSize=1 will be sufficient, since the 'worker' thread is then only responsible for delegating the task to a separate pool.

Queues for every thread in thread pool

As far as I know, thread pools (the java.util.concurrent Executor framework) provide a single queue of tasks for all threads in a pool. So I don't really know which thread will execute my task. But I need to have a queue of tasks assigned to every thread. How can I do that?
If you want only certain threads to execute certain tasks, a standard thread pool will not fit.
But you can use multiple thread pools with only one thread in each to solve your problem.
You should write your program so you don't need to know what thread executes a task. They are just anonymous worker threads.
However, if you really want to know anyway you can create a single threaded ExecutorService for each thread you want and then you will know which thread will execute a task.
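A small sketch of that idea, with two logical "queues", each backed by its own single-threaded executor (names are illustrative):

import java.util.concurrent.*;

public class PerThreadQueues {
    public static void main(String[] args) {
        // Each single-threaded executor is effectively one worker thread
        // with its own private task queue.
        ExecutorService queueA = Executors.newSingleThreadExecutor();
        ExecutorService queueB = Executors.newSingleThreadExecutor();

        queueA.submit(() -> System.out.println("task 1 for queue A"));
        queueA.submit(() -> System.out.println("task 2 for queue A")); // runs after task 1, on the same thread
        queueB.submit(() -> System.out.println("task 1 for queue B"));

        queueA.shutdown();
        queueB.shutdown();
    }
}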

Java: How thread pools map threads to runnables

Trying to wrap my head around Java concurrency and am having a tough time understanding the relationship between thread pools, threads, and the runnable "tasks" they are executing.
If I create a thread pool with, say, 10 threads, then do I have to pass the same task to each thread in the pool, or are the pooled threads literally just task-agnostic "worker drones" available to execute any task?
Either way, how does an Executor/ExecutorService assign the right task to the right thread?
Typically, thread pools are implemented with one producer-consumer queue that all of the pool threads wait on for tasks. The Executor does not have to assign tasks; all it has to do is push them onto the queue. Some thread, a 'task-agnostic worker drone', will pop a task, execute its run() method and, when complete, loop round to wait on the queue again for more work.
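A stripped-down sketch of that worker loop, meant only to illustrate the mechanism, not the real ThreadPoolExecutor internals:

import java.util.concurrent.*;

public class TinyPool {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

    TinyPool(int nThreads) {
        for (int i = 0; i < nThreads; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        Runnable task = queue.take();   // wait for work
                        task.run();                     // execute it, then loop for more
                    }
                } catch (InterruptedException e) {
                    // interrupted: let this worker die
                }
            });
            worker.start();
        }
    }

    void execute(Runnable task) {
        queue.add(task);    // the 'pool' just pushes the task onto the shared queue
    }
}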
If I create a thread pool with, say, 10 threads, then do I have to pass the same task to each thread in the pool, or are the pooled threads literally just task-agnostic "worker drones" available to execute any task?
More or less the latter. Any given task gets assigned to the next available thread.
Either way, how does an Executor/ExecutorService assign the right task to the right thread?
There is no such thing as the "right" thread. The task (i.e. the Runnable) needs to be designed so that it doesn't matter which thread runs it. This is not normally an issue ... assuming that your application properly synchronizes access / updates to data that is potentially used by more than one thread.

ForkJoinPool seems to waste a thread

I'm comparing two variations on a test program. Both are operating with a 4-thread ForkJoinPool on a machine with four cores.
In 'mode 1', I use the pool very much like an executor service. I toss a pile of tasks into ExecutorService.invokeAll. I get better performance than from an ordinary fixed thread executor service (even though there are calls to Lucene, that do some I/O, in there).
There is no divide-and-conquer here. Literally, I do
ExecutorService es = new ForkJoinPool(4);
es.invokeAll(collection_of_Callables);
In 'mode 2', I submit a single task to the pool, and in that task call ForkJoinTask.invokeAll to submit the subtasks. So, I have an object that inherits from RecursiveAction, and it is submitted to the pool. In the compute method of that class, I call invokeAll on a collection of objects from a different class that also inherits from RecursiveAction. For testing purposes, I submit only one of the first objects at a time. What I naively expected to see was all four threads busy, as the thread calling invokeAll would grab one of the subtasks for itself instead of just sitting and blocking. I can think of some reasons why it might not work that way.
Watching in VisualVM, in mode 2, one thread is pretty nearly always waiting. What I expect to see is the thread calling invokeAll immediately going to work on one of the invoked tasks rather than just sitting still. This is certainly better than the deadlocks that would result from trying this scheme with an ordinary thread pool, but still, what up? Is it holding one thread back in case something else gets submitted? And, if so, why not the same problem in mode 1?
So far I've been running this using the jsr166 jar added to java 1.6's boot class path.
ForkJoinTask.invokeAll forks all of the tasks except the first in the list. The first task it runs itself; then it joins the other tasks. Its thread is not released to the pool in any way, so what you see is that thread blocking while it waits for the other tasks to complete.
The classic use of invokeAll with a fork/join pool is to fork one task and compute another (in the executing thread). The thread that does not fork will join after it computes. The work stealing comes in while both tasks are computing. When each task computes, it is expected to fork its own subtasks (until some threshold is met).
I am not sure which invokeAll is being called in your RecursiveAction.compute(), but if it is the invokeAll which takes two RecursiveActions, it will fork one, compute the other and wait for the forked task to finish.
This is different than a plain executor service, because each task of an ExecutorService is simply a Runnable on a queue; there is no need for two tasks of an ExecutorService to know the outcome of one another. That, by contrast, is the primary use case of an FJ pool.
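A rough sketch of that fork-one, compute-the-other pattern, using a hypothetical recursive sum (a RecursiveTask rather than the question's RecursiveAction, so it can return a result):

import java.util.concurrent.*;

public class SumTask extends RecursiveTask<Long> {
    private final long from, to;

    SumTask(long from, long to) { this.from = from; this.to = to; }

    @Override
    protected Long compute() {
        if (to - from <= 1_000) {                 // threshold reached: do the work directly
            long sum = 0;
            for (long i = from; i < to; i++) sum += i;
            return sum;
        }
        long mid = (from + to) / 2;
        SumTask left = new SumTask(from, mid);
        SumTask right = new SumTask(mid, to);
        left.fork();                              // hand one half to the pool
        long rightResult = right.compute();       // compute the other half in this thread
        return left.join() + rightResult;         // join the forked half
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(4);
        System.out.println(pool.invoke(new SumTask(0, 1_000_000)));
    }
}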
