I was wondering which would be the most efficient approach to implement some kind of background task in Java (I guess that would be some kind of non-blocking thread). To be more precise: I have some Java code, and at some point I need to execute a long-running operation. I would like to execute that operation in the background so that the rest of the program can continue executing, and when the task completes, just update some specific object. This change would then be detected by other components.
You want to make a new thread; depending on how long the method needs to be, you can make it inline:
// some code
new Thread(new Runnable() {
    @Override
    public void run() {
        // do stuff in this thread
    }
}).start();
Or just make a new class:
public class MyWorker extends Thread {
    @Override
    public void run() {
        // do stuff in this thread
    }
}
// some code
new MyWorker().start();
You should use Thread Pools,
http://java.sun.com/docs/books/tutorial/essential/concurrency/pools.html
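A minimal sketch of the thread-pool suggestion, assuming a hypothetical slowComputation() stands in for the long-running operation and an AtomicReference plays the role of the "specific object" that other components watch. All class and method names here are illustrative and do not come from the linked tutorial.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class BackgroundTaskSketch {
    // Shared holder that other components can read; null means "not finished yet".
    static final AtomicReference<String> result = new AtomicReference<>();

    // Hypothetical stand-in for the long-running operation.
    static String slowComputation() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        return "done";
    }

    // Starts the work on a pool thread, lets the caller continue, then waits
    // for the pool to drain so that the example has a deterministic end.
    static String runInBackground() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> result.set(slowComputation()));
        // ... the calling thread is free to do other work here ...
        pool.shutdown(); // no new tasks; already submitted tasks still run
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(runInBackground());
    }
}
```

In a real program you would normally keep one pool alive for the lifetime of the application instead of creating and shutting one down per task.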
Naive idea: you might be able to create a Thread, give it a low priority, and loop over:
doing a little bit of work
using yield or sleep to let other threads work in parallel
That would depend on what you actually want to do in your thread.
Yes, you're going to want to spin the operation off onto its own thread. Adding new threads can be a little dangerous if you aren't careful and aware of what that means and how resources will interact. Here is a good introduction to threads to help you get started.
Make a thread. Mark it as a daemon by calling setDaemon(true) before start(). The JVM exits when the only threads still running are daemon threads.
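A small sketch of the daemon-thread idea, under the assumption that losing unfinished background work at shutdown is acceptable (the JVM stops daemon threads abruptly when it exits). The helper name newDaemon is made up for this example.

```java
public class DaemonSketch {
    // Builds (but does not start) a daemon thread around the given work.
    static Thread newDaemon(Runnable work) {
        Thread t = new Thread(work);
        t.setDaemon(true); // must be called before start()
        return t;
    }

    public static void main(String[] args) {
        Thread background = newDaemon(() -> {
            while (true) { // never finishes on its own
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    return;
                }
            }
        });
        background.start();
        // main returns here; the JVM does not wait for the daemon thread
        System.out.println("main is done");
    }
}
```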
Related
I'm trying to test a method that does its work in a separate thread. Simplified, it looks like this:
public void methodToTest()
{
    Thread thread = new Thread()
    {
        @Override
        public void run() {
            Clazz.i = 2;
        }
    };
    thread.start();
}
In my unit test I want to test that Clazz.i == 2, but I can't do this because I think that the assert is run before the thread changes the value. I thought of using another thread to test it and then use join to wait but it still doesn't work.
SSCCE:
@Test
public void sscce() throws InterruptedException
{
    Thread thread = new Thread()
    {
        @Override
        public void run() {
            methodToTest();
        }
    };
    thread.start();
    thread.join();
    assertEquals(2, Clazz.i);
}
public static class Clazz
{
public static int i = 0;
}
I think this is because the test main code creates a thread that is waiting (joined) to the 2nd thread, but the 2nd thread doesn't do the work, it creates another thread to do the work and then finishes, which continues the first thread, while the third thread does the Clazz.i = 2 after the assertion.
How can I make it so that the first thread waits for the thread that it starts as well as any threads that that thread starts?
Without a reference to the thread created in methodToTest, you cannot, quite simply. Java provides no mechanism for finding "threads that were spawned during this particular time period" (and even if it did, it would arguably be an ugly mechanism to use).
As I see it, you have two choices:
Make methodToTest wait for the thread it spawns. Of course, if you explicitly want this to be an asynchronous action, then you can't very well do that.
Return the newly created thread from methodToTest, so that any callers can choose to wait for it if they so wish.
It may be noted that the second choice can be formulated in a few different ways. You could, for instance, return some abstract Future-like object rather than a thread, if you want to leave methodToTest free to use various ways of doing asynchronous work. You could perhaps also define some global task pool in which all your asynchronous tasks are required to run, and then wait for all tasks in the pool to finish before checking the assertion. Such a task pool could take the form of an ExecutorService, or a ThreadGroup, or any number of other forms. They all do the same thing in the end, but may be more or less suited to your environment. The main point is that you have to explicitly give the caller access to the newly created thread, in some manner or another.
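One way the "return a Future-like object" variant might look, as a sketch only. The static field i plays the role of Clazz.i from the question, and the single-thread executor and the callAndWait helper are assumptions made for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class FutureReturningSketch {
    static volatile int i = 0; // plays the role of Clazz.i

    // Daemon worker so that the example JVM can exit without an explicit shutdown().
    static final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    // Still asynchronous, but hands the caller a handle to wait on.
    static Future<?> methodToTest() {
        return executor.submit(() -> { i = 2; });
    }

    // What a test would do: wait for completion, then read the value.
    static int callAndWait() {
        try {
            methodToTest().get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("task did not complete", e);
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(callAndWait());
    }
}
```

A unit test can then assert on the value returned by callAndWait() instead of racing against the worker thread.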
Since your threads seem to be performing different operations, you can use a CountDownLatch to solve your problem.
Declare a CountDownLatch in the main thread and pass this latch object to the other threads. Call await() in the main thread and count the latch down in the other threads.
In Main thread: ( first thread)
CountDownLatch latch = new CountDownLatch(2);
/* Create Second thread and pass the latch. Pass the same latch from second
thread to third thread when you are creating third thread */
try {
latch.await();
} catch (InterruptedException e) {
e.printStackTrace();
}
Pass this latch to second and third threads and use countdown in these threads
In second and third threads,
try {
    // add your business logic, i.e. the run() method implementation
} finally {
    // countDown() throws no checked exception; the finally block makes sure
    // the latch is released even if the business logic fails
    latch.countDown();
}
Have a look at this article for a better understanding.
The ExecutorService invokeAll() API is another good solution.
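The latch approach can be sketched as follows, assuming the same three-thread shape as the question: a first thread that waits, a second thread that only spawns, and a third thread that does the work. The class name and the AtomicInteger standing in for Clazz.i are illustrative.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class LatchSketch {
    static final AtomicInteger value = new AtomicInteger(0);

    // The "second" thread: starts a "third" thread and shares the latch with it.
    static void startWorkers(CountDownLatch latch) {
        new Thread(() -> {
            try {
                new Thread(() -> {
                    try {
                        value.set(2); // the work done by the third thread
                    } finally {
                        latch.countDown();
                    }
                }).start();
            } finally {
                latch.countDown();
            }
        }).start();
    }

    // The "first" thread: blocks until both workers have counted down.
    static int runAndWait() {
        CountDownLatch latch = new CountDownLatch(2);
        startWorkers(latch);
        try {
            if (!latch.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return value.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndWait());
    }
}
```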
You can't unit-test functionality that the unit does not provide.
You're saying that you want to verify that methodToTest() eventually sets Clazz.i=2, but what does "eventually" mean? Your methodToTest() function does not provide its caller with any way to know when Clazz.i has been set. The reason you're having a hard time figuring out how to test the feature is because your module does not provide that feature.
This might be a good time for you to read up on Test Driven Development (TDD). That's where you write the tests first, and then you write code that makes the tests pass. Writing the tests first helps you to paint a clearer picture of whatever it is that you want the module to do.
It also has a side benefit: If you practice strict TDD (i.e., if you never write any module code except to make a test pass), then your test coverage will be 100%.
And, that leads to another side benefit: If you have 100% test coverage, then you can refactor without fear because if you break anything at all, your unit tests will tell you so.
I have a program that creates hundreds of instances of a class, each of which listens to another thread which simply fires an event on a regular timed schedule (so that they all perform at the same speed). What I'd like is for each of the hundreds of instances to be its own thread, so that when an event is fired, they can all work in parallel. What makes sense to me is to have these classes extend the Thread class and then have this code inside them...
public class IteratorStepListener implements StepEventListener {
public void actionPerformed(ActionEvent e) {
start();
}
}
public void run() {
doStuff();
}
This doesn't seem to work though. Clearly I'm not understanding something basic here. What's the proper way to do this?
Okay, first thing: overcome the notion that your hundreds of threads will run in parallel. At the very best, they will run concurrently, ie, time-sliced. As you get into the hundreds of threads, you will see the bearings on the scheduling algorithm start to glow; in the thousands they'll smoke and eventually seize up, and you'll get no more threads.
Now, that said, we don't have nearly enough code to understand what you're really doing, but one thing I note is that you don't seem to be making new Threads. Remember that a thread is an object; the canonical way to start a thread is
Thread t = new Thread(r); // r is some Runnable
t.start(); // start() runs r on the new thread; calling run() directly would execute it on the current thread
What it looks like is that you're trying to run() the same thread over and over again; this way lies madness. Have a look at Wiki on Event Driven Programming. If you really want to have a separate thread for handling each event, you'll want a scheme something like this (pseudocode):
processEvents: function
    eventQueue: queue of Events
    event: implements Runnable
    -- something produces events and puts them on the queue
    loop -- forever
    do
        Event ev := eventQueue.front
        new Thread(ev).start();
    od
end -- processEvents
It sounds like the event is going to be fired more than once... but you can't start the same thread more than once.
It sounds like your listener should implement the interface but start a thread directly in actionPerformed (or better, use an Executor so that it could use a thread pool). So instead of your current implementation, you could use:
// Assuming the listener implements runnable; you may want to
// delegate that to a separate class for separation of concerns.
public void actionPerformed(ActionEvent e) {
new Thread(this).start();
}
or
public void actionPerformed(ActionEvent e) {
executor.execute(this);
}
What I'd like is for each of the hundreds of instances to be its own thread, so that when an event is fired, they can all work in parallel.
I don't think this is a good approach.
Unless you have hundreds of processors, the threads cannot possibly all work in parallel. You'll end up with the threads running one at a time per processor, or being time-sliced across the processors.
Each thread actually ties down a significant slice of the JVM's resources, even when inactive. IIRC, the default stack size is about 1 Mbyte.
The example code in your question shows the event calling start() on the thread. Unfortunately, you can only call start() on a thread once. Once the thread has terminated it cannot be restarted.
A better approach would be to create an executor with a bounded thread pool, and have each event cause a new task to be submitted to the executor. Something like this:
ThreadPoolExecutor executor = new ThreadPoolExecutor(corePoolSize, maxPoolSize,
keepAliveTime, timeUnit, workQueue);
...
public class IteratorStepListener implements StepEventListener, Runnable {
public void actionPerformed(ActionEvent e) {
executor.submit(this);
}
public void run() {
doStuff();
}
}
You can't use threads like that in Java. This is because Java threads map directly to underlying OS threads (at least on the JVM implementations that I'm aware of), and OS threads can't scale like that. As a rule of thumb, you want to keep the total number of threads in an app to around a hundred. A few hundred is probably OK. A few thousand usually gets problematic, depending on the hardware you are using.
The use of threads like you described is a valid implementation strategy in languages like Erlang for example. Meanwhile, if you are stuck with Java this time, creating a shared thread pool and submitting your tasks to this instead of allowing all tasks to run concurrently might be a good alternative. In this case, you can choose a suitable number of threads (best number depends on the nature of the task. If you have no idea, number of CPU core available times 2 is a good start), and have that number of tasks run concurrently.
If you absolutely need all tasks to proceed concurrently, it could get a little complicated, but that's doable as well.
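As a rough sketch of the shared-pool alternative: the pool size follows the "number of cores times 2" starting point mentioned above, and the counter increment is a stand-in for each listener's doStuff(). The names fireEvent and SharedPoolSketch are invented for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedPoolSketch {
    // Runs one task per listener on a small shared pool and returns how many completed.
    static int fireEvent(int listeners) {
        int threads = Runtime.getRuntime().availableProcessors() * 2;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();
        for (int n = 0; n < listeners; n++) {
            pool.execute(completed::incrementAndGet); // stand-in for doStuff()
        }
        pool.shutdown(); // queued tasks still run; no new ones are accepted
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(fireEvent(500));
    }
}
```

Hundreds of tasks are queued, but only a handful of threads ever exist at once.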
I know that it is not possible to restart a used Java Thread object, but I don't find an explanation why this is not allowed; even if it is guaranteed that the thread has finished (see example code below).
I don't see why start() (or at least a restart()) method should not be able to somehow reset the internal states - whatever they are - of a Thread object to the same values they have when the Thread object is freshly created.
Example code:
class ThreadExample {
public static void main(String[] args){
Thread myThread = new Thread(){
public void run() {
for(int i=0; i<3; i++) {
try{ sleep(100); }catch(InterruptedException ie){}
System.out.print(i+", ");
}
System.out.println("done.");
}
};
myThread.start();
try{ Thread.sleep(500); }catch(InterruptedException ie){}
System.out.println("Now myThread.run() should be done.");
myThread.start(); // <-- causes java.lang.IllegalThreadStateException
} // main
} // class
I know that it is not possible to restart a used Java Thread object, but I don't find an explanation why this is not allowed; even if it is guaranteed that the thread has finished (see example code below).
My guesstimate is that Threads might be directly tied (for efficiency or other constraints) to actual native resources that might be restartable in some operating systems, but not in others. If the designers of the Java language had allowed Threads to be restarted, they might have limited the number of operating systems on which the JVM can run.
Come to think of it, I cannot think of an OS that allows a thread or process to be restarted once it is finished or terminated. When a process completes, it dies. You want another one, you restart it. You never resurrect it.
Beyond the issues of efficiency and limitations imposed by the underlying OS, there is the issue of analysis and reasoning. You can reason about concurrency when things are either immutable or have a discrete, finite life-time. Just like state machines, they have to have a terminal state. Is it started, waiting, finished? Things like that cannot be easily reasoned about if you allow Threads to resurrect.
You also have to consider the implications of resurrecting a thread. You would have to recreate its stack and its state; is it safe to resurrect? Can you resurrect a thread that ended abnormally? Etc.
Too hairy, too complex. All that for insignificant gains. Better to keep Threads as non-resurrectable resources.
I'd pose the question the other way round - why should a Thread object be restartable?
It's arguably much easier to reason about (and probably implement) a Thread that simply executes its given task exactly once and is then permanently finished. To restart threads would require a more complex view on what state a program was in at a given time.
So unless you can come up with a specific reason why restarting a given Thread is a better option than just creating a new one with the same Runnable, I'd posit that the design decision is for the better.
(This is broadly similar to an argument about mutable vs final variables - I find the final "variables" much easier to reason about and would much rather create multiple new constant variables rather than reuse existing ones.)
Because they didn't design it that way. From a clarity standpoint, that makes sense to me. A Thread represents a thread of execution, not a task. When that thread of execution has completed, it has done its work, and it would just muddy things if it were to start at the top again.
A Runnable on the other hand represents a task, and can be submitted to many Threads as many times as you like.
Why don't you want to create a new Thread? If you're concerned about the overhead of creating your MyThread object, make it a Runnable and run it with a new Thread(myThread).start();
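To illustrate the task-versus-thread distinction, here is a minimal sketch in which one Runnable is executed by several freshly created Thread objects. The counter and the method name runRepeatedly are assumptions for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RunnableReuseSketch {
    // Runs the same Runnable on 'times' fresh Thread objects, one after another.
    static int runRepeatedly(int times) {
        AtomicInteger runs = new AtomicInteger();
        Runnable task = runs::incrementAndGet; // the task, reusable as often as needed
        for (int n = 0; n < times; n++) {
            Thread t = new Thread(task); // a new thread of execution each time
            t.start();
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println(runRepeatedly(3));
    }
}
```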
Java Threads follow a lifecycle based on the State Diagram below. Once the thread is in a final state, it is over. That is simply the design.
You can kind of get around this, either by using a java.util.concurrent.ThreadPoolExecutor, or manually by having a thread that calls Runnable.run() on each Runnable that it is given, not actually exiting when it is finished.
It's not exactly what you were asking about, but if you are worried about thread construction time then it can help solve that problem. Here's some example code for the manual method:
import java.util.LinkedList;
import java.util.Queue;

public class ReusableThread extends Thread {
    private final Queue<Runnable> runnables = new LinkedList<Runnable>();
    // volatile so that a stop request from another thread is seen by run()
    private volatile boolean running = true;

    @Override
    public void run() {
        while (running) {
            Runnable r = null;
            try {
                synchronized (runnables) {
                    // Re-check 'running' so that stopProcessing() can end the wait
                    while (running && runnables.isEmpty()) runnables.wait();
                    r = runnables.poll();
                }
            } catch (InterruptedException ie) {
                // Ignore it; call stopProcessing() to end the loop
            }
            if (r != null) {
                r.run();
            }
        }
    }

    public void stopProcessing() {
        running = false;
        synchronized (runnables) {
            runnables.notify();
        }
    }

    public void addTask(Runnable r) {
        synchronized (runnables) {
            runnables.add(r);
            runnables.notify();
        }
    }
}
Obviously, this is just an example. It would need to have better error-handling code, and perhaps more tuning available.
If you are concerned with the overhead of creating a new Thread object then you can use executors.
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
public class Testes {
public static void main(String[] args) {
Executor executor = Executors.newSingleThreadExecutor();
executor.execute(new Testes.A());
executor.execute(new Testes.A());
executor.execute(new Testes.A());
}
public static class A implements Runnable{
public void run(){
System.out.println(Thread.currentThread().getId());
}
}
}
Running this you will see that the same thread is used for all Runnable objects.
A Thread is not a thread. A thread is an execution of your code. A Thread is an object that your program uses to create, and manage the life-cycle of, a thread.
Suppose you like playing tennis. Suppose you and your friend play a really awesome set. How would your friend react if you said, "That was incredible, let's play it again." Your friend might think you were nuts. It doesn't make sense even to talk about playing the same set again. If you play again you're playing a different set.
A thread is an execution of your code. It doesn't make sense to even talk about "re-using" a thread of execution, for the same reason that it makes no sense to talk about re-playing the same set in tennis. Even if another execution of your code executes all the same statements in the same order, it's still a different execution.
Andrzej Doyle asked, "Why would you want to re-use a Thread?" Why indeed? If a Thread object represents a thread of execution, an ephemeral thing that you can't even talk about re-using, then why would you want or expect the Thread object to be re-usable?
I've been searching for the same solution you seem to be looking for, and I resolved it this way. When a mousePressed event occurs you can terminate the thread and also reuse the variable, but it needs to be re-initialized with a new Thread, as you can see below.
class MouseHandler extends MouseAdapter{
public void mousePressed(MouseEvent e) {
if(th.isAlive()){
th.interrupt();
th = new Thread();
}
else{
th.start();
}
}
}
I've got a question related but not identical to my first question ever here:
Java: what happens when a new Thread is started from a synchronized block?
Is it a common practice to create and start() a new Thread when you're holding a lock?
Would that be a code smell?
Basically I've got a choice between doing this:
public synchronized void somemethod() {
    // every time I find a callback to be notified I start a thread
    Thread t = new Thread(new Runnable() {
        public void run() {
            notifySomeCallback();
        }
    });
    t.start();
    ...
    // lengthy stuff performed here, keeping the lock held
    ...
}
or this:
public void somemethod() {
    // create a list of callbacks to be notified
    synchronized (this) {
        // potentially add callbacks
        ...
        // lengthy stuff performed here, keeping the lock held
        ...
    }
    // notify the callbacks without holding a lock and once
    // we know the lock has been released
}
I think the latter is better but I wanted to know if there
are cases where the first option would be ok? Do you sometimes
do that? Have you seen it done?
You should always hold a lock for as short a time as possible. So only the resource that is potentially referenced from multiple threads should be locked, and only for the short period in which the resource could be left 'corrupt' (e.g. while the writer thread is updating it).
Don't spin off a thread for every little thing which needs to be done. In the case of your callback threads, have 1 callback thread work off a queue of things to do.
You are aware that the two code snippets will result in different execution orders.
The first one will run the callbacks asynchronously, while the lengthy stuff is being performed. The second one will finish doing the lengthy stuff first and then call the callbacks.
Which one is better depends on what the callbacks need to do. It might well be a problem if they need lengthy stuff to be done first.
Who is waiting on the lock?
If the callbacks need the lock to run, it makes little sense to fire them, while you still hold the lock. All they would do is just wait for lengthy stuff to be done anyway.
Also, in the first snippet, you have one thread per callback. The second snippet is not explicit, but if you have only one thread for all of them, this is another difference
(whether the callbacks run simultaneously or in sequence). If they all need the same lock, you might as well run them in sequence.
If you want to run many callbacks with one or more threads, consider using an Executor instead of managing the threads yourself. Makes it very easy to configure an appropriate number of threads.
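A sketch of the second option from the question (collect under the lock, notify after releasing it). The class, the state counter and the callback list are hypothetical; callbacks run sequentially on the calling thread here, though they could just as well be handed to an Executor.

```java
import java.util.ArrayList;
import java.util.List;

public class CallbackOutsideLockSketch {
    private final List<Runnable> callbacks = new ArrayList<>();
    private int state;

    public synchronized void addCallback(Runnable callback) {
        callbacks.add(callback);
    }

    public void someMethod() {
        List<Runnable> toNotify;
        synchronized (this) {
            state++;                               // the work that needs the lock
            toNotify = new ArrayList<>(callbacks); // snapshot while still locked
        }
        // The lock is released here, so slow callbacks do not block
        // other threads that need it.
        for (Runnable callback : toNotify) {
            callback.run();
        }
    }

    public synchronized int getState() {
        return state;
    }

    public static void main(String[] args) {
        CallbackOutsideLockSketch s = new CallbackOutsideLockSketch();
        s.addCallback(() -> System.out.println("state is now " + s.getState()));
        s.someMethod();
    }
}
```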
It depends on whether or not you want the callbacks to be executed concurrently with the lengthy stuff or not. If we are talking about a Swing GUI, option 1 is not good, because you shouldn't do Swing operations in several concurrent threads, so I propose the following:
public void somemethod() {
    Thread t = new Thread(new Runnable() {
        public void run() {
            doLengthyStuff();
        }
    });
    t.start();
    // notify the callbacks
}
What do you think is the best way for obtaining the results of the work of a thread? Imagine a Thread which does some calculations, how do you warn the main program the calculations are done?
You could poll every X milliseconds for some public variable called "job finished" or something by the way, but then you'll receive the results later than when they would be available... the main code would be losing time waiting for them. On the other hand, if you use a lower X, the CPU would be wasted polling so many times.
So, what do you do to be aware that the Thread, or some Threads, have finished their work?
Sorry if it looks similar to this other question, that's probably the reason for the eben answer, I suppose. What I meant was running lots of threads and know when all of them have finished, without polling them.
I was thinking more along the lines of sharing the CPU load between multiple CPUs using batches of Threads, and knowing when a batch has finished. I suppose it can be done with Future objects, but that blocking get method looks a lot like a hidden lock, not something I like.
Thanks everybody for your support. Although I also liked the answer by erickson, I think saua's the most complete, and the one I'll use in my own code.
Don't use low-level constructs such as threads, unless you absolutely need the power and flexibility.
You can use an ExecutorService such as the ThreadPoolExecutor to submit() Callables. This will return a Future object.
Using that Future object you can easily check if it's done and get the result (including a blocking get() if it's not yet done).
Those constructs will greatly simplify the most common threaded operations.
I'd like to clarify about the blocking get():
The idea is that you want to run some tasks (the Callables) that do some work (calculation, resource access, ...) where you don't need the result right now. You can just depend on the Executor to run your code whenever it wants (if it's a ThreadPoolExecutor then it will run whenever a free Thread is available). Then at some point in time you probably need the result of the calculation to continue. At this point you're supposed to call get(). If the task already ran at that point, then get() will just return the value immediately. If the task didn't complete, then the get() call will wait until the task is completed. This is usually desired since you can't continue without the task's result anyway.
When you don't need the value to continue, but would like to know about it if it's already available (possibly to show something in the UI), then you can easily call isDone() and only call get() if that returns true.
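A small sketch of submit(), isDone() and get() working together. The summing task is a made-up stand-in for "some calculation", and the class and method names are illustrative.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureSketch {
    // Computes 1 + 2 + ... + n on a worker thread and returns the result.
    static int sumTo(int n) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> task = () -> {
                int sum = 0;
                for (int k = 1; k <= n; k++) sum += k;
                return sum;
            };
            Future<Integer> future = executor.submit(task);
            // Non-blocking check; useful for something like a progress indicator.
            if (!future.isDone()) {
                System.out.println("still computing...");
            }
            return future.get(); // blocks only if the task has not finished yet
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumTo(100));
    }
}
```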
You could create a listener interface that the main program implements, which is called by the worker once it has finished executing its work.
That way you do not need to poll at all.
Here is an example interface:
/**
* Listener interface to implement to be called when work has
* finished.
*/
public interface WorkerListener {
public void workDone(WorkerThread thread);
}
Here is an example of the actual thread which does some work and notifies its listeners:
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
/**
* Thread to perform work
*/
public class WorkerThread implements Runnable {
private List listeners = new ArrayList();
private List results;
public void run() {
// Do some long running work here
try {
// Sleep to simulate long running task
Thread.sleep(5000);
} catch (InterruptedException e) {
e.printStackTrace();
}
results = new ArrayList();
results.add("Result 1");
// Work done, notify listeners
notifyListeners();
}
private void notifyListeners() {
for (Iterator iter = listeners.iterator(); iter.hasNext();) {
WorkerListener listener = (WorkerListener) iter.next();
listener.workDone(this);
}
}
public void registerWorkerListener(WorkerListener listener) {
listeners.add(listener);
}
public List getResults() {
return results;
}
}
And finally, the main program which starts up a worker thread and registers a listener to be notified once the work is done:
import java.util.Iterator;
import java.util.List;
/**
* Class to simulate a main program
*/
public class MainProg {
public MainProg() {
WorkerThread worker = new WorkerThread();
// Register anonymous listener class
worker.registerWorkerListener(new WorkerListener() {
public void workDone(WorkerThread thread) {
System.out.println("Work done");
List results = thread.getResults();
for (Iterator iter = results.iterator(); iter.hasNext();) {
String result = (String) iter.next();
System.out.println(result);
}
}
});
// Start the worker thread
Thread thread = new Thread(worker);
thread.start();
System.out.println("Main program started");
}
public static void main(String[] args) {
MainProg prog = new MainProg();
}
}
Polling, a.k.a. busy waiting, is not a good idea. As you mentioned, busy waiting wastes CPU cycles and can cause your application to appear unresponsive.
My Java is rough, but you want something like the following:
If one thread has to wait for the output of another thread you should make use of a condition variable.
final Lock lock = new ReentrantLock();
final Condition cv = lock.newCondition();
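The thread interested in the output of the other thread should acquire the lock and call cv.await() (not cv.wait(), which is the unrelated Object monitor method). This releases the lock and blocks the current thread. When the worker thread is finished working, it should acquire the same lock and call cv.signal(). This unblocks the waiting thread, allowing it to inspect the output of the worker thread. Because await() can return spuriously, call it in a loop that re-checks whether the output is actually ready.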
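A minimal sketch of waiting on a condition variable with java.util.concurrent.locks. Note that on a Condition the blocking call is await(), and both await() and signal() must be called while holding the lock. The result field and the publish/awaitResult method names are assumptions for this example.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class ConditionSketch {
    private final Lock lock = new ReentrantLock();
    private final Condition done = lock.newCondition();
    private Integer result; // guarded by lock; null until the worker finishes

    // Called by the worker thread when its output is ready.
    void publish(int value) {
        lock.lock();
        try {
            result = value;
            done.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Called by the interested thread; blocks without burning CPU.
    int awaitResult() throws InterruptedException {
        lock.lock();
        try {
            while (result == null) { // loop guards against spurious wakeups
                done.await();
            }
            return result;
        } finally {
            lock.unlock();
        }
    }

    static int demo() {
        ConditionSketch box = new ConditionSketch();
        new Thread(() -> box.publish(42)).start();
        try {
            return box.awaitResult();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```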
As an alternative to the concurrency API as described by Saua (and if the main thread doesn't need to know when a worker thread finishes) you could use the publish/subscribe pattern.
In this scenario the child Thread/Runnable is given a listener that knows how to process the result and which is called back to when child Thread/Runnable completes.
Your scenario is still a little unclear.
If you are running a batch job, you may want to use invokeAll. This will block your main thread until all the tasks are complete. There is no "busy waiting" with this approach, where the main thread would waste CPU polling the isDone method of a Future. While this method returns a list of Futures, they are already "done". (There's also an overloaded version that can timeout before completion, which might be safer to use with some tasks.) This can be a lot cleaner than trying to gather up a bunch of Future objects yourself and trying to check their status or block on their get methods individually.
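For the batch case, invokeAll might be used roughly like this. The squaring tasks and the pool size of 4 are arbitrary choices for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class InvokeAllSketch {
    // Squares 1..count on a pool and returns the total.
    static int sumOfSquares(int count) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int k = 1; k <= count; k++) {
                final int n = k;
                tasks.add(() -> n * n);
            }
            int total = 0;
            // invokeAll returns only after every task has completed.
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                total += f.get(); // does not block: the future is already done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10));
    }
}
```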
If this is an interactive application, with tasks sporadically spun off to be executed in the background, using a callback as suggested by nick.holt is a great approach. Here, you submit a Runnable. The run method invokes the callback with the result when it's been computed. With this approach, you may discard the Future returned by submit, unless you want to be able to cancel running tasks without shutting down the whole ExecutorService.
If you want to be able to cancel tasks or use the timeout capabilities, an important thing to remember is that tasks are canceled by calling interrupt on their thread. So, your task needs to check its interrupted status periodically and abort as needed.
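Cooperative cancellation could look something like the sketch below. The busy loop is a placeholder for real work done in steps, and the latch only exists so that the example cancels a task that is known to be running; both are assumptions of this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class CancelSketch {
    // Returns true if the task was cancelled and noticed the interrupt itself.
    static boolean cancelRunningTask() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);
        Future<?> future = executor.submit(() -> {
            started.countDown();
            long steps = 0;
            // Poll the interrupted status between steps of work.
            while (!Thread.currentThread().isInterrupted()) {
                steps++; // placeholder for one step of real work
            }
            sawInterrupt.set(true);
        });
        try {
            started.await();     // make sure the task is actually running
            future.cancel(true); // true = interrupt the thread running the task
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return future.isCancelled() && sawInterrupt.get();
    }

    public static void main(String[] args) {
        System.out.println(cancelRunningTask());
    }
}
```

A task that blocks in sleep() or wait() instead of looping would see the cancellation as an InterruptedException and should abort from its catch block.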
Subclass Thread, and give your class a method that returns the result. When the method is called, if the result hasn't been created yet, then join() with the Thread. When join() returns, your Thread's work will be done and the result should be available; return it.
Use this only if you actually need to fire off an asynchronous activity, do some work while you're waiting, and then obtain the result. Otherwise, what's the point of a Thread? You might as well just write a class that does the work and returns the result in the main thread.
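The Thread-subclass-with-join() idea can be sketched like this. The summing loop stands in for the real calculation and the names are invented. Note that join() also returns immediately on a thread that was never started, so the caller has to call start() first.

```java
public class ResultThreadSketch extends Thread {
    private int result; // written by this thread, read by callers after join()

    @Override
    public void run() {
        int sum = 0;
        for (int k = 1; k <= 10; k++) sum += k; // stand-in for the real calculation
        result = sum;
    }

    // Blocks until the thread has finished, then returns its result.
    public int getResult() throws InterruptedException {
        join(); // returns immediately if the thread has already terminated
        return result;
    }

    static int demo() {
        ResultThreadSketch worker = new ResultThreadSketch();
        worker.start();
        // ... the caller can do other work here while the thread runs ...
        try {
            return worker.getResult();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```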
Another approach would be a callback: have your constructor take an argument that implements an interface with a callback method that will be called when the result is computed. This will make the work completely asynchronous. But if you at all need to wait for the result at some point, I think you're still going to need to call join() from the main thread.
As noted by saua: use the constructs offered by java.util.concurrent. If you're stuck with a pre-1.5 (or 5.0) JRE, you might resort to rolling your own, but you're still better off using a backport: http://backport-jsr166.sourceforge.net/