I've been working on a file-loading method for my project. I'm currently using a pool of worker threads with a queue of all the IO tasks. A timer runs every 50 ms, checks whether the queue is empty and, if not, executes all the tasks in it. The tasks are added to the queue from various threads that CANNOT be delayed by IO or by waiting for the worker threads to complete a task.
Is there an alternative solution for what I'm trying to achieve? Something like a wait that only applies to the worker threads, rather than to the threads attempting to put IO tasks into the queue.
Edit: I'd also like to avoid constantly creating new Thread objects; it seems to have a pretty big impact on my application's performance.
Looking for an approach to solve a multithreading problem.
I have N tasks, say 100. I need to run these 100 tasks using a limited number of threads, say 4. The tasks are huge, so I don't want to create all of them up front. Each task should be created only when a free thread is available from the pool. Any recommended solution for this?
You could use a BlockingQueue to define the tasks. Have one thread create the tasks and add them to the queue using put, which blocks until there's space in the queue. Then have each worker thread just pull the next task off of the queue. The queue's blocking nature will basically force that first thread (that's defining the tasks) to not get too far ahead of the workers.
This is really just a case of the producer-consumer pattern, where the thing being produced and consumed is a request to do some work.
You'll need to specify some way for the whole thing to finish once all of the work is done. One way to do this is to put N "poison pills" on the queue when the generating thread has created all of the tasks. These are special tasks that just tell the worker thread to exit (rather than doing some work and then asking for the next item). Since each thread can only read at most one poison pill (because it exits after it reads it), and you put N poison pills in the queue, you'll ensure that each of your N threads will see exactly one poison pill.
Note that if the task-generating thread consumes resources, like a database connection to read tasks from, those resources will be held until all of the tasks have been generated -- which could be a while! That's not generally a good idea, so this approach isn't a good one in those cases.
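A minimal sketch of that setup, assuming the tasks are plain Runnables and using a shared sentinel Runnable as the poison pill (the names and queue capacity here are illustrative):
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ProducerConsumerSketch {
    // Sentinel task: a worker that takes this exits instead of doing work.
    private static final Runnable POISON_PILL = () -> { };
    private static final int WORKERS = 4;

    public static void main(String[] args) throws InterruptedException {
        // Small capacity keeps the producer from getting far ahead of the workers.
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(10);

        // Worker threads: pull tasks until they see the poison pill.
        for (int i = 0; i < WORKERS; i++) {
            new Thread(() -> {
                try {
                    while (true) {
                        Runnable task = queue.take();
                        if (task == POISON_PILL) {
                            return; // each worker consumes exactly one pill
                        }
                        task.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }

        // Producer: put() blocks when the queue is full, so tasks are only
        // created as the workers free up space.
        for (int i = 0; i < 100; i++) {
            int id = i;
            queue.put(() -> System.out.println("task " + id));
        }

        // One poison pill per worker signals shutdown.
        for (int i = 0; i < WORKERS; i++) {
            queue.put(POISON_PILL);
        }
    }
}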
If you can get the number of active threads in the pool at a given point in time, you can solve your problem. To do that you can use ThreadPoolExecutor#getActiveCount. Once you have the number of active threads, you can decide whether or not to create a task.
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(5);
int active = executor.getActiveCount(); // approximate number of threads currently running tasks
Note: ExecutorService does not provide a getActiveCount method; you have to use ThreadPoolExecutor. ThreadPoolExecutor#getActiveCount returns the approximate number of threads that are actively executing tasks.
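A small sketch of how that check could gate task creation, assuming a pool of 4 threads and simple polling (getActiveCount is only an approximation, so this is a heuristic rather than a guarantee):
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

public class GatedSubmission {
    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(4);

        for (int i = 0; i < 100; i++) {
            // Only build the next (large) task once a worker looks idle.
            while (executor.getActiveCount() >= executor.getCorePoolSize()) {
                Thread.sleep(10); // crude polling; tune or replace as needed
            }
            int taskId = i;
            executor.execute(() -> System.out.println("Running task " + taskId));
        }

        executor.shutdown();
    }
}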
Is there any thread pool implementation that also allows the calling thread to be used for execution?
Some background - I have a service that needs to call lots of dependent services (and do some work with their results). My service is massively parallel and might use up to 1000 threads serving concurrent requests (really, I'm not kidding).
A common pattern for parallel processing is, of course, a shared pool of background threads that is used to farm out the work from the main thread. It also has a fundamental problem of exhaustion: if each of the 1000 service threads submits a long-running request, it's extremely easy to completely exhaust the pool's capacity.
Another classic solution is to use a private thread pool for each of the service threads. It's not very appealing, since I won't be able to make these private pools large enough.
So my idea is to use a special type of a thread pool executor that runs tasks in the calling thread and opportunistically uses the background thread pool to run tasks if it has free capacity. This way I can guarantee that the calling thread will make some progress in any case, even if the background pool is exhausted.
Does anybody know of such thread pool implementation?
Though it isn't very clear from the question, it sounds like the threads are mostly blocked waiting for responses from other services. This isn't a very productive use of those threads, and a large number of threads often causes the scheduler to operate inefficiently.
Alternatively, you can think about using asynchronous sockets with completion handlers. This avoids blocking I/O and instead calls handlers in your code when I/O events occur on the channel.
This ultimately means that you can massively reduce the number of threads in your application, which should improve performance.
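For illustration, a rough sketch of that style using java.nio.channels.AsynchronousSocketChannel; the host, port and request below are just placeholders:
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;

public class AsyncCallSketch {
    public static void main(String[] args) throws Exception {
        AsynchronousSocketChannel channel = AsynchronousSocketChannel.open();
        channel.connect(new InetSocketAddress("example.com", 80), null, new CompletionHandler<Void, Void>() {
            @Override
            public void completed(Void result, Void att) {
                // Connected without blocking a thread; send a minimal request.
                ByteBuffer request = ByteBuffer.wrap(
                        "GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n"
                                .getBytes(StandardCharsets.US_ASCII));
                channel.write(request, null, new CompletionHandler<Integer, Void>() {
                    @Override
                    public void completed(Integer written, Void a) {
                        // Request sent; read the response asynchronously as well.
                        ByteBuffer response = ByteBuffer.allocate(8192);
                        channel.read(response, response, new CompletionHandler<Integer, ByteBuffer>() {
                            @Override
                            public void completed(Integer bytes, ByteBuffer buf) {
                                System.out.println("Received " + bytes + " bytes without tying up a thread");
                            }
                            @Override
                            public void failed(Throwable exc, ByteBuffer buf) {
                                exc.printStackTrace();
                            }
                        });
                    }
                    @Override
                    public void failed(Throwable exc, Void a) {
                        exc.printStackTrace();
                    }
                });
            }
            @Override
            public void failed(Throwable exc, Void att) {
                exc.printStackTrace();
            }
        });

        Thread.sleep(5_000); // keep the demo JVM alive long enough for the handlers to run
    }
}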
Another approach is to place a task queue between the calling thread(s) and the thread pool. Every request is placed on the queue, and the workers process the queued tasks in turn. When a task is complete, a notification is sent back to the calling thread.
Using this mechanism, you can always ensure that tasks will eventually be processed.
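A sketch of that arrangement, using a bounded queue in front of a fixed pool and a CompletableFuture as the completion notification. As an aside, ThreadPoolExecutor.CallerRunsPolicy (used here as the rejection handler) makes the submitting thread run a task itself when the queue is full, which is close to the fallback behaviour the question asks about. The pool size, queue capacity and method names are illustrative:
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueuedCalls {
    // A fixed pool of 8 workers fed by a bounded queue of pending requests.
    private static final ExecutorService POOL = new ThreadPoolExecutor(
            8, 8, 0L, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<>(1_000),
            new ThreadPoolExecutor.CallerRunsPolicy()); // queue full => caller runs the task itself

    static CompletableFuture<String> callDependency(String request) {
        return CompletableFuture.supplyAsync(() -> {
            // placeholder for the real call to the dependent service
            return "response for " + request;
        }, POOL);
    }

    public static void main(String[] args) {
        callDependency("req-1")
                .thenAccept(resp -> System.out.println("notified: " + resp)); // completion notification
        POOL.shutdown();
    }
}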
I have only two short-lived tasks to run in the background upon the start of the application. Would it make sense to use a thread for each task, or an Executor, for instance a single-thread executor, to submit these two tasks?
Does it make sense to create two threads that die quickly as opposed to having a single threaded executor waiting for tasks throughout the lifecycle of the application when there are none?
One big benefit of using a thread pool shows up when you have some task that you perform repeatedly: if something goes wrong with that task and causes its thread to hang, you're at risk of losing a thread every time the task runs, eventually running the application out of threads. If your threads only run once on startup, it seems likely that risk wouldn't apply to your case.
You could still use an Executor, but shut it down once both of your tasks have run. It might be preferable to use Futures or a CompletionService over raw threads.
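For example, a minimal sketch of that idea (the two startup tasks are of course placeholders):
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class StartupTasks {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();

        // The two hypothetical short-lived startup tasks.
        Future<?> first = executor.submit(() -> System.out.println("loading config"));
        Future<?> second = executor.submit(() -> System.out.println("warming cache"));

        // Shut down right away; tasks that were already submitted still run.
        executor.shutdown();

        // Optionally wait for them (and surface any exceptions) before continuing.
        first.get();
        second.get();
    }
}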
If you do this more than once in your application, ThreadPoolExecutor is definitely worth a look.
One benefit is the pooling of threads: it relieves the runtime from creating and destroying OS-level thread objects every time you need a thread. Additionally, you get control over the number of threads spawned (though that doesn't seem to be the big issue for you) and over which threads are running or done.
But if you really only spawn two threads over the lifetime of your application, an executor may be oversized; it is nevertheless very comfortable to work with.
Since Nathan already mentioned Futures, there are also Timer and TimerTask, which are likewise very convenient for "fire and forget" background actions :-).
This question already has answers here: When should we use Java's Thread over Executor?
In Java, both of the following code snippets can be used to quickly spawn a new thread to run some task.
This one uses Thread:
new Thread(new Runnable() {
    @Override
    public void run() {
        // TODO: Code goes here
    }
}).start();
And this one uses an Executor:
Executors.newSingleThreadExecutor().execute(new Runnable() {
    @Override
    public void run() {
        // TODO: Code goes here
    }
});
Internally, what is the difference between these two snippets, and which one is the better approach?
In case it matters, I'm developing for Android.
Now I think I was actually looking for use cases of newSingleThreadExecutor(). Exactly this was asked and answered in this question:
Examples of when it is convenient to use Executors.newSingleThreadExecutor()
Your second example is strange: creating an executor just to run one task is not good usage. The point of having the executor is that you can keep it around for the duration of your application and submit tasks to it. It will work, but you're not getting the benefits of having the executor.
The executor can keep a pool of threads handy that it reuses for incoming tasks, so each task doesn't have to spin up a new thread; or, if you pick the single-thread variant, it can enforce that the tasks are executed in sequence and don't overlap. With the executor you can better separate the individual tasks being performed from the technical implementation of how the work is done.
With the first approach where you create a thread, if something goes wrong with your task in some cases the thread can get leaked; it gets hung up on something, never finishes its task, and the thread is lost to the application and anything else using that JVM. Using an executor can put an upper bound on the number of threads you lose to this kind of error, so at least your application degrades gracefully and doesn't impair other applications using the same JVM.
Also, with the thread approach each thread you create has to be kept track of separately (so that, for instance, you can interrupt them once it's time to shut down the application); with the executor you can shut the executor down once and let it handle its threads itself.
The second approach, using an ExecutorService, is definitely the better one.
ExecutorService determines how you want your tasks to run concurrently. It decouples the Runnables (or Callables) from their execution.
When using Thread, you couple the tasks with how you want them to be executed, giving you much less flexibility.
Also, ExecutorService gives you a better way of tracking your tasks and getting a return value via Future, while Thread's start method just runs without giving any information back. Thread therefore encourages you to code side effects in the Runnable, which may make the overall execution harder to understand and debug.
Also, a Thread is a costly resource, and an ExecutorService can handle thread lifecycles for you, reusing threads to run new tasks or creating new ones depending on the strategy you defined. For instance, Executors.newSingleThreadExecutor() creates a ThreadPoolExecutor with only one thread that executes the tasks passed to it sequentially, while Executors.newFixedThreadPool(8) creates a ThreadPoolExecutor with 8 threads, allowing a maximum of 8 tasks to run in parallel.
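A quick sketch of that difference, assuming a task that returns a value; the Future both carries the result and rethrows any exception the task threw:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(8);

        // submit() accepts a Callable, so the task can return a value.
        Future<Integer> answer = pool.submit(() -> 6 * 7);

        // get() blocks until the task completes and rethrows any exception it threw.
        System.out.println("result = " + answer.get());

        pool.shutdown();
    }
}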
You already have three answers, but I think this question deserves one more because none of the others talk about thread pools and the problem that they are meant to solve.
A thread pool (e.g., java.util.concurrent.ThreadPoolExecutor) is meant to reduce the number of threads that are created and destroyed by a program.
Some programs need to continually create and destroy new tasks that will run in separate threads. One example is a server that accepts connections from many clients, and spawns a new task to serve each one.
Creating a new thread for each new task is expensive; in many programs, the cost of creating the thread can be significantly higher than the cost of performing the task. Instead of letting a thread die after it has finished one task, wouldn't it be better to use the same thread over again to perform the next one?
That's what a thread pool does: It manages and re-uses a controlled number of worker threads, to perform your program's tasks.
Your two examples show two different ways of creating a single thread that will perform a single task, but there's no context. How much work will that task perform? How long will it take?
The first example is a perfectly acceptable way to create a thread that will run for a long time: a thread that must exist for the entire lifetime of the program, or a thread that performs a task so big that the cost of creating and destroying the thread is not significant.
Your second example makes no sense, though, because it creates a thread pool just to execute one Runnable. Creating a thread pool for one Runnable (or worse, for each new task) completely defeats the purpose of the thread pool, which is to re-use threads.
P.S.: If you are writing code that will become part of some larger system, and you are worried about the "right way" to create threads, then you probably should also learn what problem the java.util.concurrent.ThreadFactory interface was meant to solve.
Google is your friend.
According to the documentation of ThreadPoolExecutor:
Thread pools address two different problems: they usually provide improved performance when executing large numbers of asynchronous tasks, due to reduced per-task invocation overhead, and they provide a means of bounding and managing the resources, including threads, consumed when executing a collection of tasks. Each ThreadPoolExecutor also maintains some basic statistics, such as the number of completed tasks.
The first approach is suitable if I want to spawn a single background task, and for small applications.
I prefer the second approach for a controlled thread-execution environment. If I use a single-thread ThreadPoolExecutor, I am sure that only one thread will be running at a time, even if I submit more tasks to the executor. Such cases tend to come up in large enterprise applications, where the threading logic is not exposed to other modules and you want to control the number of concurrently running threads. So the second approach is preferable if you are designing enterprise or large-scale applications.
If I create 1K threads and launch them at the same time using a latch, once the threads complete my process ends.
What I want to do is, as the thread ends, start up another thread to work on the same task (or somehow get the same thread to continue processing with the same task again).
Scenario:
I want to start 1K threads, and don't want the performance penalty of starting another 1K threads when they finish processing.
The threads simply make an HTTP URL connection to http://www.example.com/some/page
What I want to do is continuously run for x seconds, and always have 1K threads running.
I don't want to use an executor for this, both to learn how to do it without one and because I believe that, since the executor framework separates tasks from threads, it doesn't guarantee how many threads are running at the same time.
You'll have to do it in the Runnable itself. Create a simple loop surrounding your actions.
If you want them all to synchronize at a certain point on every iteration, use a CyclicBarrier with 1000 parties and have each thread call await at the end of each iteration (a CountDownLatch with count 1000 would only work for a single rendezvous, since it can't be reset).
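A bare-bones sketch of the looping approach, assuming a 30-second run against the URL from the question (no barrier shown; add one if the iterations need to stay in lockstep):
import java.net.HttpURLConnection;
import java.net.URL;

public class LoadLoop {
    private static final long DURATION_MILLIS = 30_000; // assumed value for "x seconds"

    public static void main(String[] args) {
        long deadline = System.currentTimeMillis() + DURATION_MILLIS;

        for (int i = 0; i < 1000; i++) {
            new Thread(() -> {
                // The loop keeps the same thread working instead of dying after one request,
                // so the 1K threads are only ever started once.
                while (System.currentTimeMillis() < deadline) {
                    try {
                        HttpURLConnection conn = (HttpURLConnection)
                                new URL("http://www.example.com/some/page").openConnection();
                        conn.getResponseCode(); // perform the request
                        conn.disconnect();
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }).start();
        }
    }
}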
Apache JMeter is a free performance-testing tool that you can easily configure to test URLs in multiple threads. It can also distribute the tests to have, e.g., 10 clients running 100 threads each instead.
Use a loop in your run() method.
As best I can tell, you want to have a large number of server threads, and have them execute a piece of work from a list, then come back and wait for another piece of work to be specified (or work on another already-present piece in the list).
This is what you use a queue for. Probably a BlockingQueue is the simplest form to use that will suit your purposes, and there are several implementations of this in the JDK.