I apologize in advance if this is a basic question, but I'm new to the material. I've got a piece of software that is kicked off by user's submitting jobs through a website. Because the software is itself designed to capitalize on parallel processing, all I want to do is queue up these jobs so they can kick off one after the other. To do this, I've tried to capitalize on the Executor framework built into Java. The code I've developed is:
public JobManager()
{
mcpExecutor = Executors.newSingleThreadExecutor();
}
public Future<MatlabProcessResults> startProcess(inputs)
{
MyProcess myProcess = new MyProcess(inputs);
Future<MyProcessResults> future = mcpExecutor.submit(myProcess);
Long newKey = System.currentTimeMillis();
futures.putIfAbsent(newKey, future);
}
Where startProcess is run every time the "submit" button is pressed. Now, the description of the newSingleThreadExecutor reads:
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.) Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
This led me to think that it would take multiple tasks, queue them, and only run one instance of the software at a time. As you might suspect, I'm writing because it doesn't do that. It is starting as many tasks as I submit in parallel (I know, the opposite problem of what most people probably want to do). Any help on this issue is much appreciated, and thank you in advance.
Related
Hello everyone and thanks for the time you dedicated in advance!
I have a problem I'm not sure how to approach in Java. Let's say I have a user interface that creates events that have to be "executed" in a specific time in the future, which may vary from a couple of minutes in the future to several days.
I have though of creating a class (let's say EventHandler) that implementa Runnable, and then a ConcurrentLinkedList that stores those instantiations ordered by the time they should be executed, from least in advance to most in advance. After that, a thread that peeks the queue, and If system time is greater than expected execution time, start the process.
Problem is, aside of concurrency problems associated with list, that the peek thread consumes CPU time. So I was wondering if there is a more elegant solution, considering there may be hundreds of events scheduled in a single second interval. Also, I'm using Hibernate with MongoDb to store stuff, if that affects at all.
Thank you!
EDIT: I'm open to all other solutions you may think of, as long as it solves the "queue events and execute them in the time they are set to"
A straightforward approach is to use ScheduledThreadPoolExecutor.
int corePoolSize = 1;// for sequential execution of tasks
// for parallel execution use Runtime.getRuntime().availableProcessors();
ScheduledThreadPoolExecutor executor = ScheduledThreadPoolExecutor(corePoolSize);
...
Runnable command1 = ...
executor.schedule(command1, 2, TimeUnit.MINUTES); // execute in couple of minutes
Runnable command2 = ...
executor.schedule(command2, 7, TimeUnit.DAYS); // execute in 7 days
This question already has answers here:
When should we use Java's Thread over Executor?
(7 answers)
Closed 7 years ago.
In Java, both of the following code snippets can be used to quickly spawn a new thread for running some task-
This one using Thread-
new Thread(new Runnable() {
#Override
public void run() {
// TODO: Code goes here
}
}).start();
And this one using Executor-
Executors.newSingleThreadExecutor().execute(new Runnable(){
#Override
public void run() {
// TODO: Code goes here
}
});
Internally, what is the difference between this two codes and which one is a better approach?
Just in case, I'm developing for Android.
Now I think, I was actually looking for use-cases of newSingleThreadExecutor(). Exactly this was asked in this question and answered-
Examples of when it is convenient to use Executors.newSingleThreadExecutor()
Your second example is strange, creating an executor just to run one task is not a good usage. The point of having the executor is so that you can keep it around for the duration of your application and submit tasks to it. It will work but you're not getting the benefits of having the executor.
The executor can keep a pool of threads handy that it can reuse for incoming tasks, so that each task doesn't have to spin up a new thread, or if you pick the singleThread one it can enforce that the tasks are done in sequence and not overlap. With the executor you can better separate the individual tasks being performed from the technical implementation of how the work is done.
With the first approach where you create a thread, if something goes wrong with your task in some cases the thread can get leaked; it gets hung up on something, never finishes its task, and the thread is lost to the application and anything else using that JVM. Using an executor can put an upper bound on the number of threads you lose to this kind of error, so at least your application degrades gracefully and doesn't impair other applications using the same JVM.
Also with the thread approach each thread you create has to be kept track of separately (so that for instance you can interrupt them once it's time to shutdown the application), with the executor you can shut the executor down once and let it handle its threads itself.
The second using an ExecutorService is definitely the best approach.
ExecutorService determines how you want your tasks to run concurrently. It decouples the Runnables (or Callables) from their execution.
When using Thread, you couple the tasks with how you want them to be executed, giving you much less flexibility.
Also, ExecutorService gives you a better way of tracking your tasks and getting a return value with Future while the start method from Thread just run without giving any information. Thread therefore encourages you to code side-effects in the Runnable which may make the overall execution harder to understand and debug.
Also Thread is a costly resource and ExecutorService can handle their lifecycle, reusing Thread to run a new tasks or creating new ones depending on the strategy you defined. For instance: Executors.newSingleThreadExecutor(); creates a ThreadPoolExecutor with only one thread that can sequentially execute the tasks passed to it while Executors.newFixedThreadPool(8)creates a ThreadPoolExecutor with 8 thread allowing to run a maximum of 8 tasks in parallel.
You already have three answers, but I think this question deserves one more because none of the others talk about thread pools and the problem that they are meant to solve.
A thread pool (e.g., java.util.concurrent.ThreadPoolExecutor) is meant to reduce the number of threads that are created and destroyed by a program.
Some programs need to continually create and destroy new tasks that will run in separate threads. One example is a server that accepts connections from many clients, and spawns a new task to serve each one.
Creating a new thread for each new task is expensive; In many programs, the cost of creating the thread can be significantly higher than the cost of performing the task. Instead of letting a thread die after it has finished one task, wouldn't it be better to use the same thread over again to perform the next one?
That's what a thread pool does: It manages and re-uses a controlled number of worker threads, to perform your program's tasks.
Your two examples show two different ways of creating a single thread that will perform a single task, but there's no context. How much work will that task perform? How long will it take?
The first example is a perfectly acceptable way to create a thread that will run for a long time---a thread that must exist for the entire lifetime of the program, or a thread that performs a task so big that the cost of creating and destroying the thread is not significant.
Your second example makes no sense though because it creates a thread pool just to execute one Runnable. Creating a thread pool for one Runnable (or worse, for each new task) completely defeats the purpose of the thread-pool which is to re-use threads.
P.S.: If you are writing code that will become part of some larger system, and you are worried about the "right way" to create threads, then you probably should also learn what problem the java.util.concurrent.ThreadFactory interface was meant to solve.
Google is your friend.
According to documentation of ThreadPoolExecutor
Thread pools address two different problems: they usually provide
improved performance when executing large numbers of asynchronous
tasks, due to reduced per-task invocation overhead, and they provide a
means of bounding and managing the resources, including threads,
consumed when executing a collection of tasks. Each ThreadPoolExecutor
also maintains some basic statistics, such as the number of completed
tasks.
First approach is suitable for me if I want to spawn single background processing and for small applications.
I will prefer second approach for controlled thread execution environment. If I use ThreadPoolExecutor, I am sure that 1 thread will be running at time , even If I submit more threads to executor. Such cases are tend to happen if you consider large enterprise application, where threading logic is not exposed to other modules. In large enterprise application , you want to control the number of concurrent running threads. So second approach is more pereferable if you are designing enterprise or large scale applications.
I try to work with Java's FutureTask, Future, Runnable, Callable and ExecutorService types.
What is the best practice to compose those building blocks?
Given that I have multiple FutureTasks and and I want to execute them in sequence.
Ofcourse I could make another FutureTask which is submitting / waiting for result for each subtask in sequence, but I want to avoid blocking calls.
Another option would be to let those subtasks invoke a callback when they complete, and schedule the next task in the callback. But going that route, how to I create a proper outer FutureTask object which also handles exceptions in the subtask without producing that much of a boilerplate?
Do I miss something here?
Very important thing, though usually not described in tutorials:
Runnables to be executed on an ExecutorService should not block. This is because each blocking switches off a working thread, and if ExecutorService has limited number of working threads, there is a risk to fall into deadlock (thread starvation), and if ExecutorService has unlimited number of working threads, then there is a risk to run out of memory. Blocking operations in the tasks simply destroy all advantages of ExecutorService, so use blocking operations on usual threads only.
FutureTask.get() is blocking operation, so can be used on ordinary threads and not from an ExecutorService task. That is, it cannot serve as a building block, but only to deliver result of execution to the master thread.
Right approach to build execution from tasks is to start next task when all input data for the next task is ready, so that the task do not have to block waiting for input data. So you need a kind of a gate which stores intermediate results and starts new task when all arguments have arrived. Thus tasks do not bother explicitly to start other tasks. So a gate, which consists of input sockets for arguments and a Runnable to compute them, can be considered as a right building block for computations on ExcutorServices.
This approach is called dataflow or workflow (if gates cannot be created dynamically).
Actor frameworks like Akka use this approach but are limited in the fact that an actor is a gate with single input socket.
I have written a true dataflow library published at https://github.com/rfqu/df4j.
I tried to do something similar with a ScheduledFuture, trying to cause a delay before things were displayed to the user. This is what I come up with, simply use the same ScheduledFuture for all your 'delays'. The code was:
public static final ScheduledExecutorService scheduler = Executors
.newScheduledThreadPool(1);
public ScheduledFuture delay = null;
delay = scheduler.schedule(new Runnable() {
#Override
public void run() {
//do something
}
}, 1000, TimeUnit.MILLISECONDS);
delay = scheduler.schedule(new Runnable() {
#Override
public void run() {
//do something else
}
}, 2000, TimeUnit.MILLISECONDS);
Hope this helps
Andy
The usual approach is to:
Decide about ExecutorService (which type, how many threads).
Decide about the task queue (for how long it could be non-blocking).
If you have some external code that waits for the task result:
* Submit tasks as Callables (this is non blocking as long as you do not run out of the queue).
* Call get on the Future.
If you want some actions to be taken automatically after the task is finished:
You can submit as Callables or Runnables.
Just add that you need to do at the end as the last code inside the task. Use
Activity.runOnUIThread these final actions need to modify GUI.
Normally, you should not actively check when you can submit one more task or schedule callback in order just to submit them. The thread queue (blocking, if preferred) will handle this for you.
I am creating a http proxy server in java. I have a class named Handler which is responsible for processing the requests and responses coming and going from web browser and to web server respectively. I have also another class named Copy which copies the inputStream object to outputStream object . Both these classes implement Runnable interface. I would like to use the concept of Thread pooling in my design, however i don't know how to go about that! Any hint or idea would be highly appreciated.
I suggest you look at Executor and ExecutorService. They add a lot of good stuff to make it easier to use Thread pools.
...
#Azad provided some good information and links. You should also buy and read the book Java Concurrency in Practice. (often abbreviated as JCiP) Note to stackoverflow big-wigs - how about some revenue link to Amazon???
Below is my brief summary of how to use and take advantage of ExecutorService with thread pools. Let's say you want 8 threads in the pool.
You can create one using the full featured constructors of ThreadPoolExecutor, e.g.
ExecutorService service = new ThreadPoolExecutor(8,8, more args here...);
or you can use the simpler but less customizable Executors factories, e.g.
ExecutorService service = Executors.newFixedThreadPool(8);
One advantage you immediately get is the ability to shutdown() or shutdownNow() the thread pool, and to check this status via isShutdown() or isTerminated().
If you don't care much about the Runnable you wish to run, or they are very well written, self-contained, never fail or log any errors appropriately, etc... you can call
execute(Runnable r);
If you do care about either the result (say, it calculates pi or downloads an image from a webpage) and/or you care if there was an Exception, you should use one of the submit methods that returns a Future. That allows you, at some time in the future, check if the task isDone() and to retrieve the result via get(). If there was an Exception, get() will throw it (wrapped in an ExecutionException). Note - even of your Future doesn't "return" anything (it is of type Void) it may still be good practice to call get() (ignoring the void result) to test for an Exception.
However, this checking the Future is a bit of chicken and egg problem. The whole point of a thread pool is to submit tasks without blocking. But Future.get() blocks, and Future.isDone() begs the questions of which thread is calling it, and what it does if it isn't done - do you sleep() and block?
If you are submitting a known chunk of related of tasks simultaneously, e.g., you are performing some big mathematical calculation like a matrix multiply that can be done in parallel, and there is no particular advantage to obtaining partial results, you can call invokeAll(). The calling thread will then block until all the tasks are complete, when you can call Future.get() on all the Futures.
What if the tasks are more disjointed, or you really want to use the partial results? Use ExecutorCompletionService, which wraps an ExecutorService. As tasks get completed, they are added to a queue. This makes it easy for a single thread to poll and remove events from the queue. JCiP has a great example of an web page app that downloads all the images in parallel, and renders them as soon as they become available for responsiveness.
I hope below will help you:,
class Executor
An object that executes submitted Runnable tasks. This interface provides a way of decoupling task submission from the mechanics of how each task will be run, including details of thread use, scheduling, etc. An Executor is normally used instead of explicitly creating threads. For example, rather than invoking new Thread(new(RunnableTask())).start() for each of a set of tasks, you might use:
Executor executor = anExecutor;
executor.execute(new RunnableTask1());
executor.execute(new RunnableTask2());
...
class ScheduledThreadPoolExecutor
A ThreadPoolExecutor that can additionally schedule commands to run after a given delay, or to execute periodically. This class is preferable to Timer when multiple worker threads are needed, or when the additional flexibility or capabilities of ThreadPoolExecutor (which this class extends) are required.
Delayed tasks execute no sooner than they are enabled, but without any real-time guarantees about when, after they are enabled, they will commence. Tasks scheduled for exactly the same execution time are enabled in first-in-first-out (FIFO) order of submission.
and
Interface ExecutorService
An Executor that provides methods to manage termination and methods that can produce a Future for tracking progress of one or more asynchronous tasks.
An ExecutorService can be shut down, which will cause it to stop accepting new tasks. After being shut down, the executor will eventually terminate, at which point no tasks are actively executing, no tasks are awaiting execution, and no new tasks can be submitted.
Edited:
you can find example to use Executor and ExecutorService herehereand here Question will be useful for you.
I want to know how to execute more than one job in Eclipse at a time. I want to run more than one job concurrently in RCP.
A Job more or less wraps a Thread and all started (scheduled) job instances run in parallel.
Suggestion for further reading:
On the Job: the Eclipse Jobs API
Use Threading to run more than one job at a time.
Thread th = new Thread() {
public void run() {
//Here is a thread that you can use wherever you want in your code
}
};
th.start();
See Eclipse RCP: Only one Job runs at a time?
Jobs can optionally finish their execution asynchronously (in another thread) by returning a result status of ASYNC_FINISH. Jobs that finish asynchronously must specify the execution thread by calling setThread, and must indicate when they are finished by calling the method done.
A few years later, you now have the opposite issue, which is to limit the number of concurrent jobs.
That is why Eclipse 4.5M4 will include now (Q4 2014) a way to Support for Job Groups with throttling.
See bug 432049:
Eclipse provides a simple Jobs API to perform different tasks in parallel and in asynchronous fashion. One limitation of the Eclipse Jobs is that there is no easy way to limit the number of worker threads being used to execute jobs.
This may lead to a thread pool explosion when many jobs are scheduled in quick succession. Due to that it’s easy to use Jobs to perform different unrelated tasks in parallel, but hard to implement thousands of Jobs co-operating to complete a single large task.
Eclipse currently supports the concept of Job Families, which provides one way of grouping with support for join, cancel, sleep, and wakeup operations on the whole family.
To address all these issue we would like to propose a simple way to group a set of Eclipse Jobs that are responsible for pieces of the same large task.
The API would support throttling, join, cancel, combined progress and error reporting for all of the jobs in the group and the job grouping functionality can be used to rewrite performance critical algorithms to use parallel execution of cooperating jobs.
You can see the implementation in this commit 26471fa