Trying to wrap my head around Java concurrency and am having a tough time understanding the relationship between thread pools, threads, and the runnable "tasks" they are executing.
If I create a thread pool with, say, 10 threads, then do I have to pass the same task to each thread in the pool, or are the pooled threads literally just task-agnostic "worker drones" available to execute any task?
Either way, how does an Executor/ExecutorService assign the right task to the right thread?
Typically, thread pools are implemented with one producer-consumer queue that all of the pool threads wait on for tasks. The Executor does not have to assign tasks; all it has to do is push them onto the queue. Some thread, a 'task-agnostic worker drone', will pop the task, execute its 'run()' method and, when complete, loop round to wait on the queue again for more work.
If I create a thread pool with, say, 10 threads, then do I have to pass the same task to each thread in the pool, or are the pooled threads literally just task-agnostic "worker drones" available to execute any task?
More or less the latter. Any given task gets assigned to the next available thread.
Either way, how does an Executor/ExecutorService assign the right task to the right thread?
There is no such thing as the "right" thread. The task (i.e. the Runnable) needs to be designed so that it doesn't matter which thread runs it. This is not normally an issue ... assuming that your application properly synchronizes access / updates to data that is potentially used by more than one thread.
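A minimal sketch of that arrangement, assuming a fixed pool of 10 threads and a batch of interchangeable tasks (the class name and task count are illustrative):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class WorkerDroneDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(10); // 10 task-agnostic workers

        // Submit 25 independent tasks; the pool's internal queue hands each one
        // to whichever worker thread becomes free next.
        for (int i = 0; i < 25; i++) {
            final int taskId = i;
            pool.submit(() -> System.out.println(
                    "task " + taskId + " ran on " + Thread.currentThread().getName()));
        }

        pool.shutdown();                            // stop accepting new tasks
        pool.awaitTermination(1, TimeUnit.MINUTES); // wait for queued tasks to drain
    }
}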
Looking for an approach to solve a multithreading problem.
I have N tasks, say 100. I need to run these 100 tasks using a limited number of threads, say 4. Each task is large, so I don't want to create all the tasks up front; each task should be created only when a free thread is available in the pool. Any recommended solution for this?
You could use a BlockingQueue to define the tasks. Have one thread create the tasks and add them to the queue using put, which blocks until there's space in the queue. Then have each worker thread just pull the next task off of the queue. The queue's blocking nature will basically force that first thread (that's defining the tasks) to not get too far ahead of the workers.
This is really just a case of the producer-consumer pattern, where the thing being produced and consumed is a request to do some work.
You'll need to specify some way for the whole thing to finish once all of the work is done. One way to do this is to put N "poison pills" on the queue, one per worker thread, when the generating thread has created all of the tasks. These are special tasks that just tell the worker thread to exit (rather than doing some work and then asking for the next item). Since each thread reads at most one poison pill (because it exits after reading it), and you put N poison pills in the queue, you ensure that each of your N worker threads sees exactly one poison pill.
Note that if the task-generating thread consumes resources, like a database connection to read tasks from, those resources will be held until all of the tasks have been generated -- which could be a while! That's not generally a good idea, so this approach isn't a good one in those cases.
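A rough sketch of that setup, assuming 4 worker threads, a bounded queue of capacity 10, and a no-op Runnable used as the poison pill (all names and numbers here are illustrative, not from the question):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedTaskPipeline {
    private static final Runnable POISON_PILL = () -> {};
    private static final int WORKERS = 4;

    public static void main(String[] args) {
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(10); // small buffer throttles the producer

        // Producer: creates tasks lazily; put() blocks whenever the buffer is full.
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) {
                    final int id = i;
                    queue.put(() -> System.out.println("task " + id + " on "
                            + Thread.currentThread().getName()));
                }
                for (int i = 0; i < WORKERS; i++) {
                    queue.put(POISON_PILL); // one pill per worker so every worker exits
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // Workers: take the next task, run it, and loop until they see the pill.
        for (int i = 0; i < WORKERS; i++) {
            new Thread(() -> {
                try {
                    while (true) {
                        Runnable task = queue.take();
                        if (task == POISON_PILL) {
                            return;
                        }
                        task.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
    }
}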
If you can get the number of active threads in the pool at a given point in time, you can solve your problem. To do that you can use ThreadPoolExecutor#getActiveCount. Once you have the number of active threads, you can decide whether or not to create another task.
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(5);
int active = executor.getActiveCount(); // approximate number of threads currently executing tasks
Note: ExecutorService does not provide a getActiveCount method; you have to use ThreadPoolExecutor. ThreadPoolExecutor#getActiveCount returns the approximate number of threads that are actively executing tasks.
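A hedged sketch of how that check might be used to throttle task creation (the class name, pool size, and sleep interval are arbitrary, and getActiveCount() is only approximate, so treat this as a heuristic rather than a hard guarantee):

import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ActiveCountThrottle {
    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(4);

        for (int i = 0; i < 100; i++) {
            // Wait until a worker looks free before building the next (expensive) task.
            while (executor.getActiveCount() >= executor.getCorePoolSize()) {
                Thread.sleep(50);
            }
            final int id = i;
            executor.submit(() -> System.out.println("running task " + id));
        }

        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.HOURS);
    }
}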
As far as I know, thread pools (the java.util.concurrent.Executor class) provide a queue of tasks for all threads in a pool. So I don't really know which thread will execute my task. But I need to have queues of tasks assigned to every thread. How can I do it?
If you want only certain threads to execute certain tasks, a standard thread pool will not fit.
But you can use multiple thread pools with only one thread in each to solve your problem.
You should write your program so you don't need to know what thread executes a task. They are just anonymous worker threads.
However, if you really want to know anyway, you can create a single-threaded ExecutorService for each thread you want, and then you will know which thread will execute a task.
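A small sketch of the single-threaded-executor-per-queue idea (the names queueA and queueB are illustrative):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PerThreadQueues {
    public static void main(String[] args) {
        // One single-threaded executor per logical "queue"; each has its own worker
        // thread and its own task queue, so you control exactly which thread runs what.
        ExecutorService queueA = Executors.newSingleThreadExecutor();
        ExecutorService queueB = Executors.newSingleThreadExecutor();

        queueA.submit(() -> System.out.println("A-task on " + Thread.currentThread().getName()));
        queueB.submit(() -> System.out.println("B-task on " + Thread.currentThread().getName()));
        queueA.submit(() -> System.out.println("another A-task, same thread as the first"));

        queueA.shutdown();
        queueB.shutdown();
    }
}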
Suppose there are three threads created using an executor service, and now I want t2 to start running after t1, and t3 to start running after t2. How do I achieve this kind of scenario in the case of a thread pool?
If these were normal threads created using thread.start(), I could have waited using the join() method. But how do I handle the above scenario?
t1, t2 and t3 can implement the Callable interface and return some value from the call() method.
Based on the return value, after t1 returns you can initiate t2, and similarly for t3, as sketched below.
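A minimal sketch of that approach, using Future.get() to wait for each result before submitting the next task (the tasks here are placeholders):

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SequentialSubmission {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);

        Callable<String> t1 = () -> "t1 done";
        Callable<String> t2 = () -> "t2 done";
        Callable<String> t3 = () -> "t3 done";

        Future<String> f1 = pool.submit(t1);
        System.out.println(f1.get());        // blocks until t1 finishes

        Future<String> f2 = pool.submit(t2); // only submitted after t1 completed
        System.out.println(f2.get());

        Future<String> f3 = pool.submit(t3); // only submitted after t2 completed
        System.out.println(f3.get());

        pool.shutdown();
    }
}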
"Callable" is the answer for it
You are confusing the notion of threads with what is executed on a thread. It doesn't matter when a thread "starts" in a thread pool, but when execution of your processing begins or continues. So the better statement is that you have 3 Callables or Runnables and you need one of them to wait for the other two before continuing. This is done using a CountDownLatch. Create a shared latch with a count of 2. Two of the Callables will call countDown() on the latch; the one that should wait will call await() (possibly with a timeout).
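A small sketch of that latch arrangement, assuming a pool of 3 threads so the waiting task cannot starve the other two (see also the caveat in the next answer):

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LatchExample {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        CountDownLatch latch = new CountDownLatch(2); // waiter proceeds after two countDown() calls

        Runnable worker = () -> {
            System.out.println("work done on " + Thread.currentThread().getName());
            latch.countDown();
        };

        Runnable waiter = () -> {
            try {
                // block until both workers have counted down (or the timeout expires)
                if (latch.await(30, TimeUnit.SECONDS)) {
                    System.out.println("both prerequisites finished, continuing");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        pool.submit(worker);
        pool.submit(worker);
        pool.submit(waiter);
        pool.shutdown();
    }
}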
Jobs submitted to an ExecutorService must be mutually independent. If you try to establish dependencies by waiting on Semaphores, CountDownLatches or similar, you run the risk of blocking the whole service when all available worker threads are executing jobs that wait for a job that has been submitted but is behind the current jobs in the queue. You want to make sure you have more workers than possible blocking jobs. In most cases, it is better to use more than one ExecutorService and submit each job of a dependent group to a different service.
A few options:
If this is the only scenario you have to deal with (t1->t2->t3), don't use a thread pool. Run the three tasks sequentially.
Use some inter-thread notification mechanism (e.g. BlockingQueue, CountDownLatch). This requires your tasks to hold a shared reference to the synchronization instrument you choose.
Wrap any dependence sequence with a new runnable/callable to be submitted as a single task (see the sketch after this list). This approach is simple, but won't deal correctly with non-linear dependency topologies.
Every task that depends on another task should submit the other task for execution and wait for its completion. This is a generic approach for thread pools with dependencies, but it requires careful tuning to avoid possible deadlocks (running tasks may wait for tasks which don't have an available thread to run on; see my response here for a simple solution).
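For the third option, a minimal sketch, assuming the three pieces of work are plain Runnables (the names are illustrative):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class WrappedSequence {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);

        Runnable t1 = () -> System.out.println("t1");
        Runnable t2 = () -> System.out.println("t2 after t1");
        Runnable t3 = () -> System.out.println("t3 after t2");

        // The whole dependent chain becomes one task, so the pool never has to
        // coordinate between them: they simply run back to back on one worker.
        pool.submit(() -> {
            t1.run();
            t2.run();
            t3.run();
        });

        pool.shutdown();
    }
}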
So I have my Android Bluetooth application that has its host and clients. The problem is, because I am making multiple connections, I need a thread to handle each connection. That's all milk'n'cookies, so I thought I'd stick all the threads in an array. A little research says a better method of doing this is using a thread pool, but I can't seem to get my head around how that works. Also, is it actually even possible to hold threads in an array?
A thread pool is built around the idea that, since creating threads over and over again is time-consuming, we should try to recycle them as much as possible. Thus, a thread pool is a collection of threads that execute jobs but are not destroyed when they finish a job; instead they "return to the pool" and either take another job or sit idle if there is nothing to do.
Usually the underlying implementation is a thread-safe queue in which the programmer puts jobs and a bunch of threads managed by the implementation keep polling (I'm not implying busy-spinning necessarily) the queue for work.
In Java a thread pool is represented by the ExecutorService interface, and the Executors factory class can create one that is:
fixed - create a thread pool with a fixed number of threads
cached - dynamically creates and destroys threads as needed
single - a pool with a single thread
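In code, those three flavours come from the Executors factory class (a minimal sketch; the variable names are illustrative):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolFlavours {
    public static void main(String[] args) {
        ExecutorService fixed  = Executors.newFixedThreadPool(4);     // exactly 4 worker threads
        ExecutorService cached = Executors.newCachedThreadPool();     // grows and shrinks with demand
        ExecutorService single = Executors.newSingleThreadExecutor(); // one worker, one queue

        fixed.shutdown();
        cached.shutdown();
        single.shutdown();
    }
}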
Note that, since thread pool threads operate in the manner described above (i.e. are recycled), in the case of a fixed thread pool it is not recommended to have jobs that do blocking I/O operations, since the threads taking those jobs will be effectively removed from the pool until they finish the job and thus you may have deadlocks.
As for the array of threads, it's as simple as creating any object array:
Thread[] threads = new Thread[10]; // array of 10 threads
I'm comparing two variations on a test program. Both are operating with a 4-thread ForkJoinPool on a machine with four cores.
In 'mode 1', I use the pool very much like an executor service. I toss a pile of tasks into ExecutorService.invokeAll. I get better performance than from an ordinary fixed thread executor service (even though there are calls to Lucene, that do some I/O, in there).
There is no divide-and-conquer here. Literally, I do
ExecutorService es = new ForkJoinPool(4);
es.invokeAll(collection_of_Callables);
In 'mode 2', I submit a single task to the pool, and in that task call ForkJoinTask.invokeAll to submit the subtasks. So, I have an object that inherits from RecursiveAction, and it is submitted to the pool. In the compute method of that class, I call invokeAll on a collection of objects from a different class that also inherits from RecursiveAction. For testing purposes, I submit the first objects only one at a time. What I naively expected to see was all four threads busy, as the thread calling invokeAll would grab one of the subtasks for itself instead of just sitting and blocking. I can think of some reasons why it might not work that way.
Watching in VisualVM, in mode 2, one thread is pretty nearly always waiting. What I expect to see is the thread calling invokeAll immediately going to work on one of the invoked tasks rather than just sitting still. This is certainly better than the deadlocks that would result from trying this scheme with an ordinary thread pool, but still, what's up? Is it holding one thread back in case something else gets submitted? And, if so, why doesn't the same problem show up in mode 1?
So far I've been running this using the jsr166 jar added to java 1.6's boot class path.
ForkJoinTask.invokeAll forks all tasks but the first in the list; the first task it runs itself, and then it joins the other tasks. Its thread is not released to the pool in any way. So what you see is that thread blocking until the other tasks are complete.
The classic use of invokeAll for a Fork Join pool is to fork one task and compute another (in that executing thread). The thread that does not fork will join after it computes. The work stealing comes in with both tasks computing. When each task computes, it is expected to fork its own subtasks (until some threshold is met).
I am not sure which invokeAll is being called in your RecursiveAction.compute(), but if it is the invokeAll that takes two RecursiveActions, it will fork one, compute the other, and wait for the forked task to finish.
This is different from a plain executor service, because each task of an ExecutorService is simply a Runnable on a queue; there is no need for two tasks of an ExecutorService to know the outcome of one another. Handling such dependencies is the primary use case of an FJ pool.
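To make that classic pattern concrete, here is a small sketch of the fork-one/compute-one idiom the answer describes, using a hypothetical SumTask rather than the asker's Lucene code:

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Classic divide-and-conquer: each task splits its range, forks one half and
// computes the other, so every worker keeps generating work that can be stolen.
public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long[] data;
    private final int from, to;

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {           // small enough: do the work directly
            long sum = 0;
            for (int i = from; i < to; i++) sum += data[i];
            return sum;
        }
        int mid = (from + to) / 2;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                            // hand one half to the pool
        long rightResult = right.compute();     // compute the other half in this thread
        return left.join() + rightResult;       // join the forked half
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        java.util.Arrays.fill(data, 1L);
        long total = new ForkJoinPool(4).invoke(new SumTask(data, 0, data.length));
        System.out.println(total);              // prints 1000000
    }
}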