I'll try to be short.
I need a number of threads to open sockets (each thread opens one socket) and make HTTP requests. I am new to multithreading and I don't know if this is possible, since each thread must keep running until its request is finished (I think).
[edit after comments]
I don't know if this is possible, since the currently running thread can be suspended before the response is fetched.
Thanks for any help.
It sounds like a Thread pool is what you need.
There is a section in the Java Concurrency Tutorial about them.
(This is pretty heavy stuff for a beginner though)
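As a rough sketch of the thread pool idea (not the only way to do it): each request becomes a task, and a fixed pool of worker threads runs the tasks. The class name ParallelFetch and the helpers httpGet and runAll are made up for this illustration; only ExecutorService, Executors, Callable, Future and HttpURLConnection come from the JDK.

```java
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelFetch {

    // Hypothetical helper: wraps one HTTP GET in a task that returns the status code.
    public static Callable<Integer> httpGet(String address) {
        return () -> {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(address).toURL().openConnection();
            try {
                return conn.getResponseCode(); // blocks this worker until the response arrives
            } finally {
                conn.disconnect();
            }
        };
    }

    // Runs all tasks on a fixed-size pool and returns the results in submission order.
    public static <T> List<T> runAll(List<Callable<T>> tasks, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<T> results = new ArrayList<>();
            for (Future<T> f : pool.invokeAll(tasks)) { // waits until every task is done
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

To fetch real pages you would build the task list with httpGet("http://...") for each address and pass it to runAll with a pool size of your choice.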
Yep, definitely possible.
In response to your further query
The fact that a thread is suspended doesn't stop it from receiving data over a socket. If any data arrives while the thread is suspended, it is queued until the thread resumes.
What do you mean by "suspended"? If you refer to the context switching between threads, then you have some holes in your understanding of multithreading. It is the same as multitasking in an OS: you're running Word and Explorer at the same time on your machine, and one application doesn't die when the other needs to run. Instead, the operating system puts one process/thread into a wait by saving all its state, then retrieves all the state for the next thread and sets it in motion. This goes back and forth so fast that they seem to run at the same time, but on a single-processor machine only one thread really runs at any specific time.
The thread itself doesn't "know" this. Only if it continuously runs in a tight loop checking the time will it notice that the time jerks: the time increases smoothly for some milliseconds, then suddenly jumps forward, and then runs smoothly again for another set of milliseconds. The jump is when another thread was running. Each such period of smooth running is called a time slice, or quantum. But if the thread doesn't need the processor, e.g. when it waits for I/O, then the OS takes the processor back before the time slice is over.
The thread exits (dies) when you exit/return from the run() method - not before.
For fetching multiple HTTP connections, multi threading is ideal: The thread will use most of the time waiting for incoming bytes on the network - and while it waits, the OS knows this and sticks the thread into "IO wait", instead running other threads in the mean time (or just wastes away cycles if no thread needs to run, e.g. everyone is waiting for IO - or in these days, the processor throttles down).
Yes, what you describe is very typical of Java programs that retrieve data via HTTP.
Yes, this is possible.
Look here: http://andreas-hess.info/programming/webcrawler/index.html
or google for "java multi thread web crawler"
Related
I am developing a Java application that has two threads:
A producer thread that feeds an ArrayBlockingQueue at a frequency of 10 kHz (it is really C code called through JNI).
A consumer thread that takes data from the queue, using the take method, and then processes it (you can't assume the processing time is always the same). Because I am using the take method, this thread can block if no data is available in the queue.
I would like to know how I can monitor or profile the consumer thread to find out how much time it spends waiting or blocked.
I am not interested in answers such as taking times with System.currentTimeMillis() and computing differences. I want to know how to analyze the whole thread life and sum up how much time it has spent in every thread state, if this is possible.
How do you do this kind of monitoring?
Thanks in advance!
Any decent Java Profiler can separate statistics by thread, even the otherwise rather basic JVisualVM included with the JDK. Here's a screenshot of JVisualVM watching itself:
The same information can also be displayed in a table:
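Besides a profiler, the JDK's management API can report per-thread wait and block totals from inside the application. The sketch below assumes the JVM supports thread contention monitoring; the class ThreadTimes and its method names are invented for this example. As far as I know, this API only accumulates waited and blocked time, not a total for every thread state.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadTimes {

    private static final ThreadMXBean MX = ManagementFactory.getThreadMXBean();

    // Turns on wait/block timing if the JVM supports it; call once, before the threads start.
    public static boolean enableContentionMonitoring() {
        if (!MX.isThreadContentionMonitoringSupported()) {
            return false;
        }
        MX.setThreadContentionMonitoringEnabled(true);
        return true;
    }

    // One-line summary for a live thread; returns null if the thread is not alive.
    public static String summarize(Thread t) {
        ThreadInfo info = MX.getThreadInfo(t.getId());
        if (info == null) {
            return null;
        }
        // Times are in milliseconds and are -1 when contention monitoring is disabled.
        return info.getThreadName()
                + " state=" + info.getThreadState()
                + " waitedCount=" + info.getWaitedCount()
                + " waitedMs=" + info.getWaitedTime()
                + " blockedCount=" + info.getBlockedCount()
                + " blockedMs=" + info.getBlockedTime();
    }
}
```

Calling summarize(consumerThread) periodically, or just before the consumer finishes, gives the accumulated figures without instrumenting the processing code itself.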
I have a bit of an issue with an application running multiple Java threads.
The application runs a number of working threads that peek continuously at an input queue and if there are messages in the queue they pull them out and process them.
Among those working threads there is another verification thread scheduled to perform at a fixed period a check to see if the host (on which the application runs) is still in "good shape" to run the application. This thread updates an AtomicBoolean value which in turn is verified by the working thread before they start peeking to see if the host is OK.
My problem is that under high CPU load the thread responsible for the verification takes longer because it has to compete with all the other threads. If the AtomicBoolean does not get updated within a certain period, it is automatically set to false, causing me a nasty bottleneck.
My initial approach was to increase the priority of the verification thread, but digging into it deeper I found that this is not a guaranteed behavior and an algorithm shouldn't rely on thread priority to function correctly.
Anyone got any alternative ideas? Thanks!
Instead of peeking into a regular queue data structure, use the java.util.concurrent package's LinkedBlockingQueue.
What you can do is run a pool of threads (you could use the executor service's fixed thread pool, i.e., a number of workers of your choice) and have each of them call LinkedBlockingQueue.take().
If a message arrives at the queue, it is fed to one of the waiting threads (yeah, take does block the thread until there is something to be fed with).
Java API Reference for Linked Blocking Queue's take method
HTH.
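A minimal sketch of that arrangement, assuming string messages and a stop marker to end the workers (the class QueueWorkers, the method process and the STOP sentinel are inventions of this example, not part of the queue API):

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class QueueWorkers {

    // Sentinel telling a worker to stop; compared by identity, so no real message can match it.
    private static final String STOP = new String("STOP");

    // Starts `workers` threads blocked in take(), feeds them `messages`,
    // and returns how many messages were processed.
    public static int process(List<String> messages, int workers) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        String msg = queue.take();    // blocks without spinning until a message arrives
                        if (msg == STOP) {
                            return;
                        }
                        processed.incrementAndGet();  // stand-in for the real processing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        queue.addAll(messages);
        for (int i = 0; i < workers; i++) {
            queue.add(STOP);                          // one stop marker per worker
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }
}
```

Because the workers sleep inside take() while the queue is empty, they do not compete for the CPU the way a peeking loop does.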
One old-school approach to throttling the rate of work, which does not use a health-check thread at all (and so bypasses these problems), is to block or reject requests to add to the queue if the queue is longer than, say, 100. This applies dynamic back pressure to the clients generating the load, slowing them down when the worker threads are overloaded.
This approach was added to the Java 1.5 library, see java.util.concurrent.ArrayBlockingQueue. Its put(o) method blocks if the queue is full.
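A small sketch of both flavours of back pressure on a bounded queue: offer rejects immediately when the queue is full, while put blocks the caller. The wrapper class BoundedInbox and its method names are hypothetical; only the ArrayBlockingQueue calls are from the JDK.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedInbox {

    private final BlockingQueue<String> queue;

    public BoundedInbox(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
    }

    // Rejecting variant: returns false at once when the queue is full.
    public boolean tryAccept(String request) {
        return queue.offer(request);
    }

    // Blocking variant: the caller is held here until a worker frees a slot.
    public void accept(String request) throws InterruptedException {
        queue.put(request);
    }

    // Called by worker threads; blocks while the queue is empty.
    public String next() throws InterruptedException {
        return queue.take();
    }

    public int size() {
        return queue.size();
    }
}
```

With a capacity of 2, the third tryAccept call returns false until a worker has taken something out.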
Are you using the Executor framework (from Java's concurrency package)? If not, give it a shot. You could try using a ScheduledExecutorService for the verification thread.
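Roughly, the verification could be scheduled like this. It is only a sketch: HealthMonitor, its methods and the check parameter are made-up names, and the real host check would replace the BooleanSupplier. Note that scheduling does not by itself guarantee the check runs on time under heavy CPU load.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

class HealthMonitor {

    private final AtomicBoolean healthy = new AtomicBoolean(false);
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final BooleanSupplier check;   // placeholder for the real host check
    private final long periodMillis;

    public HealthMonitor(BooleanSupplier check, long periodMillis) {
        this.check = check;
        this.periodMillis = periodMillis;
    }

    // Runs the check immediately and then once every period.
    public void start() {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                healthy.set(check.getAsBoolean());
            } catch (RuntimeException e) {
                healthy.set(false);        // an uncaught exception would cancel all future runs
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Worker threads read this before peeking at the queue.
    public boolean isHealthy() {
        return healthy.get();
    }

    public void stop() {
        scheduler.shutdownNow();
    }
}
```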
More threads do not necessarily mean better performance. For CPU-bound work, a dual-core machine usually performs best with 2 threads, and 3 or more start to make things worse. A quad core should handle 4 threads best, and so on. So be careful how many threads you use.
You can put the other threads to sleep after they perform their work, which allows other threads to do their part. I believe Thread.yield() hints to the scheduler that the current thread is willing to give up the processor to other threads.
If you want your thread to run continuously, I would suggest creating two main threads, thread A and B. Use A for the verification thread, and from B, create the other threads. Therefore thread A gets more execution time.
Seems you need to utilize Condition variables. Peeking will take cpu cycles.
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/Condition.html
I'm making a simple server that will spawn multiple threads to handle multiple clients. I was wondering the proper way to shut down and close all the various streams and threads when the server is terminated.
I added a shutdownHook that runs a method that tells the server to shutdown. The server, in turn, broadcasts the shutdown call to all of the threads it has opened, which sets a "isClosed" boolean in each thread to true.
What I'm expecting is that each thread, when reaching the end of the run() method and looping up again, hits the while(!isClosed) conditional, thereby properly terminating themselves by closing all the proper sockets/streams and returning.
However, I don't know if this will properly close everything since the program should terminate after the shutdownhook completes. It completes fairly early since all it does is propagate the closing message. Does this mean that some threads won't get enough time to properly close?
If so, would the best method be to have the shutdownhook manually close every thread, ensuring that they have closed, before returning?
You are correct that the threads will likely not have enough time to terminate properly if the server is terminated. However, depending on what you're trying to do, this may or may not be a problem. If there is no cleanup work needed, then you probably do not need to worry about it because having the threads abruptly terminate will cause no issues.
However, if there is cleanup work that needs to be done (such as writing to a database), then you need something else. The best way to do this (in Java) is using an Executor/ExecutorService and related items (http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/Executors.html). Your problem is addressed well by these, plus you get some nice freebies such as thread pool management so that scaling is much easier. If you spawn a new thread for every client you will have big problems when you try to scale later because you can't be creating a million threads per minute, for example.
Using the Executor stuff is a bit of an adjustment if you're used to using raw threads, but it is worth the research. Good luck!
Using an ExecutorService is the modern way of doing it. It takes so much of the fiddly bits away from the code.
Here is a good place to start.
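For the shutdown part specifically, a common pattern with an ExecutorService looks roughly like the sketch below. The class and method names (PoolShutdown, shutdownGracefully) are made up, and the grace period is whatever suits your server.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

class PoolShutdown {

    // Returns true if every submitted task finished within the grace period.
    public static boolean shutdownGracefully(ExecutorService pool, long graceMillis) {
        pool.shutdown();                  // stop accepting new tasks, let queued ones finish
        try {
            if (pool.awaitTermination(graceMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            pool.shutdownNow();           // interrupt workers that are still running
            return false;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Client-handling tasks still need to react to interruption (or to their socket being closed) for shutdownNow to have any effect on them.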
The shutdownHook happens too late in the cycle to be useful that way. It is expected to complete quickly and the JVM is already on the way down, which could take existing threads with it if they are daemons.
I would just set a read timeout on the connection threads of say 15-30 seconds. If the timeout happens (SocketTimeoutException), close the socket and exit the thread. The clients will have to cope with dropped connections of course, but they have to do that already. Then when you want to shutdown, just stop accepting new connections (e.g. close the ServerSocket and have its accept thread cope correctly with the resulting exception). When all the existing connection threads have exited, the JVM will exit, and that should really take no longer than the timeout period plus the length of the longest transaction. Make sure the connection threads aren't daemons.
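A sketch of the per-connection side of this, assuming a handler that simply counts incoming bytes (TimedConnection and serve are hypothetical names; the real request handling would replace the counting):

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

class TimedConnection {

    // Reads from the client until EOF or until nothing arrives for `timeoutMillis`.
    // Returns the number of bytes read; the socket is closed before returning.
    public static long serve(Socket client, int timeoutMillis) throws IOException {
        long total = 0;
        try (Socket s = client) {
            s.setSoTimeout(timeoutMillis);   // read() throws SocketTimeoutException when idle
            InputStream in = s.getInputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;                  // stand-in for handling the request
            }
        } catch (SocketTimeoutException e) {
            // Idle for too long: try-with-resources has already closed the socket,
            // so the connection thread can simply return and exit.
        }
        return total;
    }
}
```

Each connection thread would call serve from its run() method, so an idle client can hold a thread for at most one timeout period.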
If you don't mind clients getting chopped off in mid-transaction, just call System.exit().
Have you considered making your threads daemon threads?
Just add t.setDaemon(true); before calling the start method of the thread.
If these threads should end when the program ends, then making them daemons will kill them once all the other non-daemon threads have ended.
Threads that are used in a thread pool are a good example of threads that should be daemons,
and I really think this can be useful for you.
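In code, that is just the following (DaemonExample and startDaemon are names made up for the illustration). Keep in mind that daemon threads are stopped abruptly when the JVM exits, so they do not get a chance to close their sockets or streams.

```java
class DaemonExample {

    // Creates and starts a background thread that will not keep the JVM alive.
    public static Thread startDaemon(Runnable work) {
        Thread t = new Thread(work, "background-worker");
        t.setDaemon(true);   // must be set before start(); afterwards it throws IllegalThreadStateException
        t.start();
        return t;
    }
}
```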
I have a server program which polls a database for new requests. I want this polling to be done at 1-minute intervals, so I've set up a Thread.sleep() in the program's while loop.
The problem is that whenever this program is supposed to "sleep" the CPU consumption goes up drastically (viz. about 25 - 30%).
Paradoxically, when the program is not dormant and is busy processing requests , the CPU consumption drops to 0.4%.
I read online and found out that there are performance hits associated with Thread.sleep, but I could not find any viable alternative (Object.wait requires notification on an object, something which I feel is useless in my scenario).
The main loop (when there are no new requests) doesn't do anything, here is a skeleton of all that is being done when the CPU consumption is 25%
-> poll
-> No new records ?
-> Sleep
->repeat
Check what the CPU consumption is for individual CPU cores. If you are using a 4-core machine, maybe one thread is going rogue and is eating up one core (25%). This usually happens when the thread is in a tight loop.
You could use Object.wait with a timeout (which indeed the Timer class does), but my bet is that it won't make any difference. Both Thread.sleep and Object.wait change the thread's state to not runnable. Although it depends on your JVM implementation etc., the thread shouldn't consume that much CPU in such a situation. So my bet is that there is some bug at work.
Another thing you can do is taking a thread dump and see what the thread is doing when this happens. Use kill -3 on a Linux box, or use ctrl+break on the java console window if you are using Windows. Then, examine the thread dump that is dumped to the standard output. Then you can be sure if the thread was actually sleeping or was doing something else.
As many people pointed out, Thread.sleep should and actually does help with dropping the CPU usage drastically.
I omitted certain facts from my original question as I thought they were not relevant.
The main thread was the producer, there was another thread running asynchronously which was the consumer. It turns out that the "sleep" on this thread was inside some weird condition that wasn't getting triggered properly. So the loop on that thread was never sleeping.
Once the sleep thing was eliminated I went ahead and analyzed it closely to realize the problem.
Inline Java IDE hint states, "Invoking Thread.sleep in loop can cause performance problems." I can find no elucidation elsewhere in the docs re. this statement.
Why? How? What other method might there be to delay execution of a thread?
It is not that Thread.sleep in a loop itself is a performance problem, but it is usually a hint that you are doing something wrong.
while (!goodToGoOnNow()) {
    Thread.sleep(1000);
}
Use Thread.sleep only if you want to suspend your thread for a certain amount of time. Do not use it if you want to wait for a certain condition.
For this situation, you should use wait/notify instead or some of the constructs in the concurrency utils packages.
Polling with Thread.sleep should be used only when waiting for conditions external to the current JVM (for example waiting until another process has written a file).
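For comparison, here is a minimal wait/notify version of waiting for a condition set by another thread in the same JVM. The class ReadyFlag and its method names are invented for this sketch; the boolean plays the role of goodToGoOnNow().

```java
class ReadyFlag {

    private boolean ready = false;

    // Blocks until markReady() has been called; no polling and no sleep.
    public synchronized void awaitReady() throws InterruptedException {
        while (!ready) {     // loop guards against spurious wakeups
            wait();          // releases the lock while waiting
        }
    }

    // Called by the thread that makes the condition true.
    public synchronized void markReady() {
        ready = true;
        notifyAll();         // wakes every thread blocked in awaitReady()
    }

    public synchronized boolean isReady() {
        return ready;
    }
}
```

The waiting thread wakes up as soon as markReady() is called, instead of up to a second later as in the sleep loop.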
It depends on whether the wait is dependent on another thread completing work, in which case you should use guarded blocks, or the high-level concurrency classes introduced in Java 1.5. I recently had to fix some CircularByteBuffer code that used Thread sleeps instead of guarded blocks. With the previous method, there was no way to ensure proper concurrency. If you just want the thread to sleep as a game might, pausing execution in the core game loop for a certain amount of time so that other threads have a good period in which to execute, Thread.sleep(..) is perfectly fine.
It depends on why you're putting it to sleep and how often you run it.
I can think of several alternatives that could apply in different situations:
Let the thread die and start a new one later (creating threads can be expensive too)
Use Thread.join() to wait for another thread to die
Use Thread.yield() to allow another thread to run
Let the thread run but set it to a lower priority
Use wait() and notify()
http://www.jsresources.org/faq_performance.html
1.6. What precision can I expect from Thread.sleep()?
The fundamental problem with short sleeps is that a call to sleep finishes the current scheduling time slice. Only after all other threads/process finished, the call can return.
For the Sun JDK, Thread.sleep(1) is reported to be quite precise on Windows. For Linux, it depends on the timer interrupt of the kernel. If the kernel is compiled with HZ=1000 (the default on alpha), the precision is reported to be good. For HZ=100 (the default on x86) it typically sleeps for 20 ms.
Using Thread.sleep(millis, nanos) doesn't improve the results. In the Sun JDK, the nanosecond value is just rounded to the nearest millisecond. (Matthias)
Why? Because of context switching (part of the OS CPU scheduling).
How? Calling Thread.sleep(t) moves the current thread from the running queue to the waiting queue. After the time t has elapsed, the thread is moved from the waiting queue to the ready queue, and it then takes some time before the CPU picks it up and runs it.
Solution: call Thread.sleep(t*10) once instead of calling Thread.sleep(t) inside a loop of 10 iterations ...
I have faced this problem before when waiting for an asynchronous process to return a result.
Thread.sleep is a problem in multi-threaded scenarios. It tends to oversleep. This is because internally it rearranges its priority and yields to other long-running processes (threads).
A newer approach is to use the ScheduledExecutorService interface or the ScheduledThreadPoolExecutor class, both introduced in Java 5.
Reference: http://download.oracle.com/javase/1,5.0/docs/api/java/util/concurrent/ScheduledExecutorService.html
It might NOT be a problem, it depends.
In my case, I use Thread.sleep() to wait for a couple of seconds before another reconnect attempt to an external process. I have a while loop for this reconnect logic that runs until it reaches the maximum number of attempts. So in my case, Thread.sleep() is purely for timing purposes and not for coordinating among multiple threads, so it's perfectly fine.
You can configure in your IDE how this warning should be handled.
I suggest looking into the CountDownLatch class. There are quite a few trivial examples out there online. Back when I just started multithreaded programming they were just the ticket for replacing a "sleeping while loop".
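One more trivial example, as a sketch: the caller waits on a CountDownLatch until all workers have finished, rather than sleeping in a loop and re-checking. LatchExample and runAndWait are names made up here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class LatchExample {

    // Starts `workers` threads and waits, without polling, until all have finished
    // or the timeout expires. Returns true if every worker counted down in time.
    public static boolean runAndWait(int workers, Runnable work, long timeoutMillis) {
        CountDownLatch done = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                try {
                    work.run();
                } finally {
                    done.countDown();   // count down even if the work throws
                }
            }).start();
        }
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```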