I am running a multi-threaded Java web application on Apache Tomcat 6. Instead of using the new Thread(); anti-pattern, I leave thread instantiation to Tomcat (see code below).
I have noticed over the last few days that the web application gets slower and slower. After restarting the servlet container, everything is back to normal.
Since I am not terminating threads after they finish processing (I don't know whether I have to, or whether the garbage collector will destroy them), I am guessing that this is the cause of the performance loss.
The code basically looks like this:
Custom Server Listener (I added this to web.xml):
public class MyTaskRunner implements ServletContextListener {
public static final ExecutorService EXECUTOR_SERVICE = Executors.newFixedThreadPool(10000);
public void contextDestroyed(ServletContextEvent sce) {
EXECUTOR_SERVICE.shutdownNow();
}
public void contextInitialized(ServletContextEvent sce) {
}
}
Thread instantiation:
for (final Object foo : bar) {
    MyTaskRunner.EXECUTOR_SERVICE.submit(new Runnable() {
        public void run() {
            doSomethingWith(foo);
        }
    });
}
So, is there anything special that I have to do after run() has finished?
Some basic facts about threads and GC:
While a thread is running, it will not be garbage collected.
When a thread is terminated, its stack is deleted and its Thread object is removed from the ThreadGroup data structures. The remainder of the thread's state is subject to the normal reachability rules.
You don't need to do anything special to make a thread terminate. It happens when the run method call ends, either because it returned, or because it exited with an (uncaught) exception.
Now to your particular problem. Many things could be causing your performance degradation. For instance:
Some of your threads may be getting stuck; e.g. waiting on a lock, waiting for a notify that isn't going to arrive, or just stuck in an infinite CPU loop.
You may have too many threads running (doing useful, or semi-useful things), and the slowdown may be a result of lock contention, CPU resource starvation, or thrashing.
You may have a memory leak of some kind, and the slowdown may be a symptom of your application starting to run out of heap space.
To figure out what is going on, you are going to need to do some performance monitoring:
Look at the OS level stats to see if the application is thrashing.
Use a memory profiler to see if your application is short of memory. If it is, see if there is a memory leak, and track it down.
Use a performance profiler to see if there are particular hotspots, or if there is a lot of lock contention.
I agree with the comment. A thread pool size of 10000 is dangerously large. It should be a small constant multiplied by the number of cores available to your application.
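A minimal sketch of that sizing advice, outside of any servlet container. The class name BoundedPool, the multiplier of 2 and the timeout values are assumptions made for illustration; the right multiplier depends on how much time your tasks spend blocked and should be found by measurement.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BoundedPool {
    // Hypothetical multiplier; tune it by measuring your own workload.
    static final int THREADS_PER_CORE = 2;

    // A small constant multiplied by the number of available cores.
    static int poolSize(int cores) {
        return Math.max(1, cores) * THREADS_PER_CORE;
    }

    static ExecutorService newPool() {
        int cores = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(poolSize(cores));
    }

    // Orderly shutdown: stop accepting work, wait, then interrupt stragglers.
    // Returns true if the pool terminated within the allotted time.
    static boolean shutdown(ExecutorService pool, long timeoutMillis) {
        pool.shutdown();
        try {
            if (pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            pool.shutdownNow();
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = newPool();
        for (int i = 0; i < 100; i++) {
            pool.execute(() -> { /* doSomethingWith(foo) would go here */ });
        }
        System.out.println("terminated cleanly: " + shutdown(pool, 5000));
    }
}
```

In a webapp, something like shutdown(...) would be called from contextDestroyed. Tasks beyond the pool size simply wait in the executor's queue, so a small pool does not mean submitted work is lost.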
Are you registering the context listener in web.xml correctly?
Do your run methods end at some point, or do they keep running indefinitely?
The ExecutorService will simply call Thread.start for you, so you are still starting threads from inside your webapp. That is not inherently a terrible thing as long as you shut them down properly. No matter what technique you use, the execution of your run() method will not be truncated.
new Thread(); anti-pattern
where do you people get that this is an anti-pattern?
Anyway, 10000 threads is far too many. Each thread takes up memory and the OS has its own limits, so, to be safe, try not to exceed a couple of thousand threads.
Connect jconsole (jstack might be enough) to see what your threads are doing: whether they're just sleeping, or waiting for something.
And it's not Tomcat that is managing the threads; it's the ExecutorService inside Java.
Related
I'm wondering why there seems to be no support for thread level 'shutdown hooks', which run when a specific thread terminates; not when the JVM terminates.
So let's say someone wrote a simple thread with a run method with pseudocode like this (intentionally leaving out thread interrupt here for now...):
public void run(){
ServerSocket serverSocket=new ServerSocket(port);
while(!isStopRequested){
Socket socket=serverSocket.accept();
processRequest(socket);
}
runShutdownLogic();
}
public void stopServer(){
isStopRequested=true;
//interrupt thread potentially, see below
}
This thread could die in a few ways:
1. Someone calls stopServer, followed by either...
a. the serverSocket.accept accepting one last socket and returning
b. an interrupt sent to interrupt serverSocket.accept
2. An exception is thrown.
3. Someone kills the thread, directly or through an executor service.
4. The JVM goes down.
In any of these cases we want to run the shutdownLogic method. Let's say it does something more than just close the ServerSocket: some interaction with an external source that is important to do no matter how the thread shuts down.
As I understand it this is not very easy to do; in fact it seems hard enough that I feel I must be missing some basic threading feature. The 1a case is simple and works as is. The 1b case works so long as the developer doesn't swallow InterruptedExceptions, something that is done way too often but is easy enough to avoid if you know what an InterruptedException is.
In case of an exception you need to move the shutdown method into a finally block.
In cases 3 & 4, though, this gets harder. For 3 I think threads can be killed 'nicely', with an interrupt that one can catch, check to see whether it's a kill request, and then force an exit of the code, but this requires even more intelligent handling of an InterruptedException that most people improperly swallow; plus it would get ugly fast if this check has to be done in dozens of locations that can throw interrupts. You can't do much for a hard kill, but no one expects proper shutdown logic for a hard kill so that's fine.
For a JVM shutdown... I don't actually know the exact method by which the threads are killed. I assume a kill signal is sent to the threads with a timeout before a hard kill; I'd have to research it more. If you want to be safe you can add a shutdown hook, but there is no guarantee of the order in which shutdown hooks are run, and trying to add shutdown hooks for each thread requires careful writing of the hooks to ensure you don't stall or stop the JVM shutdown with a deadlock or unexpected exception in the hook....
If instead of a thread like the one above I have a thread with a finite, but potentially long, processing time, without any waits, it gets even harder since I can't listen for an InterruptedException to know that I need to give up on my thread's processing and run the shutdown logic immediately.
Basically, it seems like a different method is needed to handle each way a thread can terminate, and it needs to be done for every thread. And in the case of high-CPU threads without waits I still don't know how to guarantee that a proper shutdown occurs if the thread (not the whole JVM) is killed midway through...
Is there not a simpler solution to all of this? For instance the equivalent of a thread level shutdown hook which will run when that specific thread is being killed, regardless of how it dies; even if JVM itself is not shutting down? Is there some reason a thread level shutdownhook is not possible or dangerous to support, assuming that such doesn't exist.
At least one of the reasons is that there really is not a safe and clean mechanism, which is also why Thread.stop() is deprecated. By creating a (seemingly) simple mechanism for it, people might think that it's a simple issue and use it wildly.
The same issue exists for finalizers and shutdownhooks. They're not reliable, so it's not a good idea to let developers think that it's a normal tool that they're supposed to use.
Yes, Java provides such a mechanism. Simply use a try/finally construction in your run() method, either in your Thread subclass or in your Runnable if you are using a Runnable:
public void run() {
    try {
        doBody();
    }
    finally {
        doThreadShutdown();
    }
}
This should take care of all of the cases that you are looking for, including normal shutdown of the virtual machine, since the virtual machine shuts down only after all nondaemon threads exit. Exceptions would be hard stop of the thread, hard kill of the virtual machine, or if the thread is a daemon thread and the virtual machine exits.
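To illustrate the try/finally approach for a CPU-bound thread with no blocking calls, here is a rough sketch in which the worker polls its own interrupt flag and the finally block stands in for the shutdown logic. The class name FinallyCleanup and the helper method are made up for this example (Java 8 or later for the lambda).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class FinallyCleanup {
    // Starts a busy worker, interrupts it, and reports whether its
    // finally block (the "thread shutdown logic") was executed.
    static boolean cleanupRanAfterInterrupt() {
        final AtomicBoolean cleanedUp = new AtomicBoolean(false);
        final CountDownLatch started = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            try {
                started.countDown();
                long iterations = 0;
                // No blocking calls here, so no InterruptedException is ever
                // thrown; the loop has to check the interrupt flag itself.
                while (!Thread.currentThread().isInterrupted()) {
                    iterations++;
                }
            } finally {
                cleanedUp.set(true); // stands in for doThreadShutdown()
            }
        });

        worker.start();
        try {
            started.await();    // make sure the worker is inside its try block
            worker.interrupt(); // cooperative stop request
            worker.join();      // wait until the finally block has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return cleanedUp.get();
    }

    public static void main(String[] args) {
        System.out.println("cleanup ran: " + cleanupRanAfterInterrupt());
    }
}
```

The same structure works when the body blocks instead of spinning: the blocking call throws InterruptedException, and the finally block still runs on the way out.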
I have a controller class that, in the course of its operation, uses an executor it maintains to perform tasks. If I just let the gc clean up the controller when it goes out of scope, the JVM doesn't seem to die. I'm assuming that this is because the default executor doesn't time out, or times out after a long while.
Given that the executor should never be shut down while the controller is still accessible, and that the executor will not be used after the controller is garbage-collected, would it be a safe/acceptable use of a finalizer to use:
@Override
public void finalize() {
executor.shutdown();
}
I ask because every discussion about finalizers seems to boil down to, "do not use unless very specific circumstances that I'm not going to elaborate on."
The reason the executor doesn't die is that the thread isn't a daemon thread, so it keeps the JVM alive. See Thread.setDaemon(boolean).
This isn't a good time to use a finalizer. Finalizers should only be used to clean up native resources (eg resources accessed via JNI).
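As a sketch of the setDaemon route, an executor can be given a ThreadFactory that marks its workers as daemon threads, so an idle pool no longer keeps the JVM alive. The class and method names below are invented for the example, and the pool size is arbitrary.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class DaemonExecutor {
    // Builds a fixed pool whose worker threads are daemon threads.
    static ExecutorService newDaemonPool(int threads) {
        ThreadFactory factory = r -> {
            Thread t = Executors.defaultThreadFactory().newThread(r);
            t.setDaemon(true); // must be set before the thread is started
            return t;
        };
        return Executors.newFixedThreadPool(threads, factory);
    }

    // Runs a probe task on the pool and reports whether it ran on a daemon thread.
    static boolean workerIsDaemon() {
        ExecutorService pool = newDaemonPool(1);
        try {
            Callable<Boolean> probe = () -> Thread.currentThread().isDaemon();
            return pool.submit(probe).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker is daemon: " + workerIsDaemon());
    }
}
```

The trade-off is that daemon threads can be cut off in the middle of a task when the JVM exits. If tasks must run to completion, an explicit close or shutdown method on the controller that calls executor.shutdown() is the more predictable option.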
If I have the following dummy code:
public static void main(String[] args) {
TestRunnable test1 = new TestRunnable();
TestRunnable test2 = new TestRunnable();
Thread thread1 = new Thread(test1);
Thread thread2 = new Thread(test2);
thread1.start();
thread2.start();
}
public static class TestRunnable implements Runnable {
@Override
public void run() {
while(true) {
//bla bla
}
}
}
In my current program I have a similar structure, i.e. two threads executing the same run() method. But for some reason only thread 1 is given CPU time, i.e. thread 2 never gets a chance to run. Is this because thread 2 waits while thread 1 is in its while loop?
I'm not exactly sure, if a thread is in a while loop is it "blocking" other threads? I would think so, but not 100% sure so it would be nice to know if anyone could inform me of what actually is happening here.
EDIT
Okay, just tried to make a really simple example again and now both threads are getting CPU time. However this is not the case in my original program. Must be some bug somewhere. Looking into that now. Thanks to everyone for clearing it up, at least I got that knowledge.
There is no guarantee by the JVM that it will halt a busy thread to give other threads some CPU.
It's good practice to call Thread.yield();, or if that doesn't work call Thread.sleep(100);, inside your busy loop to let other threads have some CPU.
At some point a modern operating system will preempt the current context and switch to another thread - however, it will also (being a rather dumb thing overall) turn the CPU into a toaster: this small "busy loop" could be computing a checksum, and it would be a shame to make that run slow!
For this reason, it is often advisable to sleep/yield manually - even sleep(0) [1] - which will yield execution of the thread before the OS decides to take control. In practice, for the given empty-loop code, this would result in a change from 99% CPU usage to 0% CPU usage when yielding manually. (Actual figures will vary based on the "work" that is done each loop, etc.)
[1] The minimum time of yielding a thread/context varies based on OS and configuration, which is why it isn't always desirable to yield - but then again Java and "real-time" generally don't go in the same sentence.
The OS is responsible for scheduling the thread. That changed a couple of years ago. It differs between operating systems (Windows, Linux, etc.) and depends heavily on the number of CPUs and the code that is running. If the code does not include some waiting functionality, like Thread.yield() or a synchronized block with a wait() call on the monitor, it's likely that the CPU will keep the thread running for a long time.
Having a machine with multiple CPUs will improve the parallelism of your application, but it is bad practice to write code inside a thread's run() method that doesn't let other threads run in a multi-threaded environment.
The actual thread scheduling should be handled by the OS and not Java. This means that each Thread should be given equal running time (although not in a predictable order). In your example, each thread will spin and do nothing while it is active. You can actually see this happening if inside the while loop you do System.out.println(this.toString()). You should see each thread printing itself out while it can.
Why do you think one thread is dominating?
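If you want to check this on your own machine, a rough experiment along these lines counts loop iterations per thread instead of printing. The class name BusyThreads and the 500 ms duration are arbitrary choices for the sketch, and the actual counts will differ from run to run.

```java
import java.util.concurrent.atomic.AtomicLong;

public class BusyThreads {
    // Lets two busy-looping threads run for the given time and returns
    // how many iterations each of them managed to complete.
    static long[] spinFor(long millis) {
        final AtomicLong[] counters = { new AtomicLong(), new AtomicLong() };
        Thread[] threads = new Thread[counters.length];

        for (int i = 0; i < threads.length; i++) {
            final AtomicLong counter = counters[i];
            threads[i] = new Thread(() -> {
                // Busy loop with no sleep or yield, like while(true) in the question,
                // except that it stops when the thread is interrupted.
                while (!Thread.currentThread().isInterrupted()) {
                    counter.incrementAndGet();
                }
            });
            threads[i].setDaemon(true); // never keep the JVM alive for this test
            threads[i].start();
        }

        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        for (Thread t : threads) {
            t.interrupt();
        }
        return new long[] { counters[0].get(), counters[1].get() };
    }

    public static void main(String[] args) {
        long[] counts = spinFor(500);
        System.out.println("thread 1 iterations: " + counts[0]);
        System.out.println("thread 2 iterations: " + counts[1]);
    }
}
```

On a preemptive OS both counts should be well above zero, even on a single core. If one thread in a real program never makes progress, it is more likely blocked on a lock or waiting on something than starved by the other loop.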
I saw the following example on the internet:
public class TwoThreads {
public static class Thread1 extends Thread {
public void run() {
System.out.println("A");
System.out.println("B");
}
}
public static class Thread2 extends Thread {
public void run() {
System.out.println("1");
System.out.println("2");
}
}
public static void main(String[] args) {
new Thread1().start();
new Thread2().start();
}
}
My question is :
It is guaranteed that "A" will be printed before "B" and "1" will be printed before "2", but is it possible that "1" will be printed twice in succession by another thread? In this piece of code we have at least 3 threads (1 main and 2 created). Can we imagine that the scheduler runs one thread from new Thread1().start();, gives up immediately after System.out.println("1");, and then runs another thread from Thread1().start(); that prints "1" again?
I am using the NetBeans IDE, and it seems that running such a program always leads to the same first result, so it seems there is something going on with caching. From my understanding you deal with that by declaring volatile variables. Can it be done here, and how? If not, then what is the solution for caching?
Today's computers mostly have 2 processors, and still we find that many multi-threaded programs on the net use more than 2 threads! Doesn't this make the program heavy and slow?
1) There is no guarantee in what order the threads will proceed.
2) The order is not randomized, either, though. So if you run the program under identical (or very similar) conditions, it will probably yield the same thread interleaving. If you need to have a certain behaviour (including randomized behaviour) you need to synchronize things yourself.
3) A CPU with two cores can only run two threads at the same time, but most threads spend most of their time not actually using the CPU, but waiting for something like I/O or user interaction. So you can gain a lot from having more than two threads (only two can concurrently compute, but hundreds can concurrently wait). Take a look at node.js, a recently popular alternative to multi-threaded programming that achieves great throughput for concurrent requests while having only a single thread of execution.
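As one example of synchronizing things yourself (point 2), the sketch below forces the letters to come before the digits by joining the first thread before starting the second. The class name and the use of a StringBuffer to collect output are assumptions made so the result can be inspected; join() is only one of several ways to impose an order.

```java
public class OrderedThreads {
    // Runs the two "printing" threads strictly one after the other
    // and returns what they appended.
    static String runInOrder() {
        final StringBuffer out = new StringBuffer(); // StringBuffer is thread-safe

        Thread letters = new Thread(() -> out.append("A").append("B"));
        Thread digits = new Thread(() -> out.append("1").append("2"));

        try {
            letters.start();
            letters.join();  // wait until the first thread has finished...
            digits.start();  // ...before the second one is even started
            digits.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(runInOrder()); // prints "AB12"
    }
}
```

This removes all concurrency between the two threads, which is the price of a fixed order. Without the first join(), "A" still comes before "B" and "1" before "2", but how the two pairs interleave is up to the scheduler.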
Answer to your 1/2 question:
Although threads run in parallel, the code inside a thread's run method is always executed sequentially.
Answer to your question 3: you can tune your application best if the number of processors equals the number of threads, but this is not the whole truth: if a thread is waiting on some blocking operation, performance will be suboptimal, since another thread could have run during that time.
No. You are not synchronizing your threads in any way, so the exact execution order will be at the mercy of the scheduler. Given how your threads are implemented, I don't see how you could ever have "1" (or "A") printed twice by a single thread.
What caching? And what variables? Your example code has no variables, and therefore nothing that would be appropriate to use with the volatile keyword. It's quite likely that on a given machine running this program will always produce the same result. As noted in #1, you're at the mercy of the scheduler. If the scheduler always behaves the same way, you'll always get the same result. Caching has nothing to do with it.
That depends upon what the threads are doing. If every thread has enough work to load one CPU core to 100%, then yes, having more threads than you have CPU cores is pointless. However, this is very rarely the case. Many threads will spend most of their time sleeping, or waiting for I/O to complete, or otherwise doing things that are not demanding enough to fully load a CPU core. In such a case there's no problem whatsoever with having more threads than CPU cores. In fact, multithreading predates mainstream multicore CPUs, and even back in the days when none of us had more than one CPU core it was still extremely beneficial to be able to have more than one thread.
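A small experiment makes the point about waiting threads concrete. In this sketch, Thread.sleep stands in for a blocking I/O call, and the class name, task count and sleep time are all arbitrary. Twenty tasks that each wait 100 ms finish in roughly the time of one wait when every task has its own thread, regardless of the core count.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class WaitingThreads {
    // Runs `tasks` tasks that each just wait `sleepMillis`, one thread per
    // task, and returns the total wall-clock time in milliseconds.
    static long elapsedMillis(int tasks, long sleepMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        CountDownLatch done = new CountDownLatch(tasks);
        long start = System.nanoTime();

        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(sleepMillis); // stand-in for waiting on I/O
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }

        try {
            done.await(); // block until every task has finished waiting
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Run one after another, the waits alone would add up to 2000 ms.
        System.out.println("elapsed ms: " + elapsedMillis(20, 100));
    }
}
```

If the tasks were pure computation instead of waiting, the same experiment would show little benefit from having more threads than cores.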
What's the proper way for a Java command line application to do background work without hogging resources? Should it use sleep() in the loop or is there a more elegant/efficient way?
Some heuristics:
Don't attempt to make scheduling decisions in your application. The operating system's scheduler is way better than yours will be. Let it do its job.
Don't poll if you don't have to. For instance, instead of sleeping n seconds, then waking up to check a non-blocked socket, block on the socket. This second strategy plays better with the operating system's scheduler.
Don't use an enormous heap if you don't have to, and try not to allocate enormous chunks of memory at one time. A thrashing application tends to have a negative effect on system performance.
Use buffered I/O. Always. If you think you need non-buffered I/O, be absolutely sure you're right. (You're probably wrong.)
Don't spawn a lot of threads. Threads are surprisingly expensive; beyond a certain point, more threads will reduce your application's performance profile. If you have lots of work to do concurrently, learn and use java.util.concurrent.
Of course, this is just a starter list...
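To illustrate the "don't poll" heuristic with something from java.util.concurrent, here is a sketch of a worker that blocks on a BlockingQueue instead of sleeping and re-checking. The class name, the STOP sentinel and the upper-casing "work" are all placeholders invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BlockingWorker {
    // Hypothetical sentinel ("poison pill") that tells the worker to stop.
    private static final String STOP = "__STOP__";

    // Hands the tasks to a background thread through a queue and
    // returns the results once the worker has processed all of them.
    static List<String> process(List<String> tasks) {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        final List<String> results = new ArrayList<>();

        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String task = queue.take(); // blocks while the queue is empty; no CPU burned
                    if (STOP.equals(task)) {
                        break;
                    }
                    results.add(task.toUpperCase()); // placeholder for real work
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        try {
            for (String task : tasks) {
                queue.put(task);
            }
            queue.put(STOP);
            worker.join(); // after join(), the worker's writes to results are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add("a");
        input.add("b");
        System.out.println(process(input)); // prints [A, B]
    }
}
```

Because take() parks the thread until an item arrives, there is no sleep interval to tune: the worker reacts as soon as work shows up and uses no CPU while idle.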
I'd only use sleep() if there's no work to be done. For example, if you're doing something like polling a task queue periodically and there's nothing there, sleep for a while then check again, etc.
If you're just trying to make sure you don't hog the CPU but you're still doing real work, you could call Thread.yield() periodically. That will relinquish control of the CPU and let other threads run, but it won't put you to sleep. If other processes don't need the CPU you'll get control back and continue to do your work.
You can also set your thread to a low priority:
myThread.setPriority(Thread.MIN_PRIORITY);
As Ishmael said, don't do this in your main thread. Create a "worker thread" instead. That way your UI (GUI or CLI) will still be responsive.
There are several ways. I would use ExecutorService... for example:
ExecutorService service = Executors.newCachedThreadPool();
Callable<Result> task = new Callable<Result>() {
public Result call() throws Exception {
// code which will be run on background thread
}
};
Future<Result> future = service.submit(task);
// Next line wait until background task is complete
// without killing CPU. Of course, you can do something
// different here and check your 'future' later.
//
// Note also that future.get() may throw various exceptions too,
// you'll need to handle them properly
Result resultFromBackgroundThread = future.get();
This is Java 5 code, ExecutorService, Callable, Future and similar are in java.util.concurrent package.
One place to start is to make sure that only those resources are being used and no other objects (so that they become garbage collected).
Placing sleep() in a single-threaded application is only going to halt the current thread. If you need data to be processed in the background while information is still being presented to the user, it is best to put the background process in a separate thread.