Multithreading and Thread pools - Need Design suggestion

I want to implement something like this:
1. A background process that runs forever.
2. The background process checks the database for requests in the pending state. If any are found, it assigns a separate thread to process each request, so one thread per request. At most 10 threads should be running at any point in time. Once a thread has finished execution, the status of the request is updated to something like "completed".
My code outline looks something like this:
public class SimpleDaemon {
private static final int MAXTHREADS = 10;
public static void main(String[] args) {
ExecutorService executor = Executors.newFixedThreadPool(MAXTHREADS);
RequestService requestService = null; //init code omitted
while(true){
List<Request> pending = requestService.findPendingRequests();
List<Future<MyAppResponse>> completed = new ArrayList<Future<MyAppResponse>>(pending.size());
for (Request req:pending) {
Callable<MyAppResponse> worker = new MyCallable(req);
Future<MyAppResponse> submit = executor.submit(worker);
completed.add(submit);
}
// Now retrieve the result
for (Future<MyAppResponse> future : completed) {
try {
// get() blocks until the task finishes; assumes MyAppResponse exposes getRequestId()
requestService.updateStatus(future.get().getRequestId());
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
}
try {
Thread.sleep(10000); // Sleep sometime
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
}
Can anyone spend some time reviewing this and suggest improvements or optimizations (from a multithreading perspective)? Thanks.

Using a max threads of ten seems somewhat arbitrary. Is this the maximum available connections to your database?
I'm a little confused as to why you are purposefully introducing latency into your applications. Why aren't pending requests submitted to the Executor immediately?
The task submitted to the Executor could then update the RequestService, or you could have a separate worker Thread belonging to the RequestService which calls poll on a BlockingQueue of Future<MyAppResponse>.
You have no shutdown/termination strategy. Nothing indicates that main is run on a Thread that is set to Daemon. Even if it is, the worker threads of Executors.newFixedThreadPool come from the default thread factory, which always creates non-daemon threads, so the JVM will not exit until the executor is shut down. And if the application is killed instead, it could shut down with live connections to the database, no? Isn't that bad?
If the thread isn't really a Daemon, then you need to handle that InterruptedException and treat it as an indication that you are being asked to exit the application.
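One way the shutdown strategy described above could look is sketched below. It is only an illustration: the class name PollingDaemon, the counter standing in for real request processing, and the 5-second grace period are assumptions, not part of the original code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PollingDaemon {
    // Hypothetical stand-in for "requests processed so far".
    static final AtomicInteger processed = new AtomicInteger();

    // Polls until the calling thread is interrupted, then shuts the pool down.
    static void runUntilInterrupted(ExecutorService executor, long pollMillis) {
        boolean interrupted = false;
        try {
            while (true) {
                // Placeholder for "find pending requests and submit one task each".
                executor.execute(processed::incrementAndGet);
                Thread.sleep(pollMillis); // throws if this thread is or becomes interrupted
            }
        } catch (InterruptedException e) {
            interrupted = true; // treat the interrupt as a request to exit
        } finally {
            shutdownGracefully(executor);
            if (interrupted) {
                Thread.currentThread().interrupt(); // restore the flag for callers
            }
        }
    }

    static void shutdownGracefully(ExecutorService executor) {
        executor.shutdown(); // stop accepting new tasks, let running ones finish
        try {
            if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
                executor.shutdownNow(); // interrupt tasks that are still running
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(10);
        Thread poller = new Thread(() -> runUntilInterrupted(executor, 100));
        poller.start();
        Thread.sleep(350);
        poller.interrupt(); // stands in for an external shutdown signal
        poller.join();
        System.out.println("pool terminated: " + executor.isTerminated());
    }
}
```

The interrupt flag is restored only after the pool has been drained, because awaitTermination would otherwise throw immediately on a thread whose flag is still set.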

Your calls to requestService appear to be single-threaded, so any long-running query prevents already completed requests from being marked as completed.
Unless the updateStatus has to be called in a specific order, I suggest you call this as part of your query in MyCallable. This could simplify your code and allow results to be processed as they become available.

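Be prepared for executor.submit() to throw a RejectedExecutionException. With Executors.newFixedThreadPool the work queue is unbounded, so this only happens once the executor has been shut down, but it becomes a real concern as soon as you switch to a pool with a bounded queue.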
You'd probably be better off using an ExecutorCompletionService rather than an ExecutorService because the former can tell you when a task completes.
I strongly recommend reading Brian Goetz's book "Java Concurrency in Practice".
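To illustrate the ExecutorCompletionService suggestion, here is a minimal sketch. The class name CompletionOrderDemo and the sleep-based tasks are assumptions made for the example; in the question's code each task would be a MyCallable and the result a MyAppResponse.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionOrderDemo {
    // Submits one task per delay and returns the delays in the order the tasks finished.
    static List<Long> finishOrder(long... delaysMillis)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(10);
        CompletionService<Long> completion = new ExecutorCompletionService<>(pool);
        try {
            for (long delay : delaysMillis) {
                completion.submit(() -> {
                    Thread.sleep(delay); // stands in for processing one request
                    return delay;
                });
            }
            List<Long> finished = new ArrayList<>();
            for (int i = 0; i < delaysMillis.length; i++) {
                // take() blocks until the next task, whichever it is, has completed
                finished.add(completion.take().get());
            }
            return finished;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(finishOrder(300, 100, 200));
    }
}
```

Because take() hands back futures in completion order, a slow request no longer delays the status update of requests that finished earlier.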

Related

Handling the Hanging Tasks [duplicate]

This is just an example to explain my problem.
I am using an ExecutorService with 20 active threads and a maximum of 75K queued items.
In my case, a normal task should not take more than 10 seconds; if it takes longer, there is some problem with the task.
If all the threads are hung due to problematic tasks, my RejectionHandler would restart the entire service.
I have two questions here:
1. I do not like the idea of restarting the service. If there is a way to detect a hanging thread and restart just that thread, that would be great. I have gone through a couple of articles on handling hung threads with ThreadManager but have not found anything for ExecutorService.
2. I am very interested in Executors.newCachedThreadPool(), because on peak days we are heavily loaded with incoming tasks, and on other days there are very few. Any suggestions would be greatly appreciated.
public class HangingThreadTest {
// ExecutorService executorService = Executors.newCachedThreadPool()
private static ExecutorService executorService = new ThreadPoolExecutor(10,
20, 5L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(75000));
public static void main(String... arg0) {
for (int i = 0; i < 50000; i++) {
executorService.submit(new Task());
}
}
}
/**
* Task to be completed
*/
class Task implements Runnable {
private static int count = 0;
@Override
public void run() {
count++;
if (count%5 == 0) {
try {
System.out.println("Hanging Thread task that needs to be reprocessed: "
+ Thread.currentThread().getName()+" count: "+count);
Thread.sleep(11000);
} catch (InterruptedException e) {
// Do something
}
}
else{
System.out.println("Normal Thread: "
+ Thread.currentThread().getName()+" count: "+count);
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
//Do something
}
}
}
}

There is no built-in mechanism in the Executors framework that terminates a thread once it has been running for longer than a threshold.
But we can achieve this with some extra code, as below:
Get the Future object returned by executorService.submit(...):
Future<?> f = executorService.submit(new Task());
Call the get method on this Future with a timeout so that it waits only for the threshold interval for the task to complete. The example below waits for only 2 seconds:
try {
f.get(2, TimeUnit.SECONDS);
} catch (TimeoutException e) {
f.cancel(true);
} catch (Exception e) {}
The code above waits 2 seconds for the task to complete and throws a TimeoutException if it has not completed in that time. We can then call cancel(true) on the Future, which sets the interrupt flag on the thread that is executing the task.
The final change is in the Task class: at the necessary points (application dependent), check whether the interrupt flag has been set using Thread.currentThread().isInterrupted(). If it is set, do the necessary clean-up and return from the run method immediately. Blocking calls such as Thread.sleep respond on their own by throwing InterruptedException. The critical piece is to identify the points in your Task class where you want to check for the interrupted flag.
This makes the thread available for processing the next task.
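The steps above can be put together roughly as follows. This is a sketch under assumptions: the class name TimeoutCancelDemo, the busy-loop task and the latch used to wait for the worker are illustrative and do not come from the original post.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

class TimeoutCancelDemo {
    // Returns true if the task noticed the interrupt and cleaned up after being cancelled.
    static boolean cancelAfter(long timeoutMillis) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicBoolean cleanedUp = new AtomicBoolean(false);
        CountDownLatch exited = new CountDownLatch(1);
        Future<?> f = pool.submit(() -> {
            try {
                // Simulated long-running work that polls the interrupt flag between steps.
                while (!Thread.currentThread().isInterrupted()) {
                    busyStep();
                }
                cleanedUp.set(true); // clean-up before returning early
            } finally {
                exited.countDown();
            }
        });
        try {
            f.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            f.cancel(true); // sets the interrupt flag on the worker thread
        }
        exited.await(5, TimeUnit.SECONDS); // wait for the worker to leave run()
        pool.shutdown();
        return cleanedUp.get();
    }

    // One small unit of non-blocking work.
    static void busyStep() {
        Math.sqrt(System.nanoTime());
    }

    public static void main(String[] args) throws Exception {
        System.out.println("task cleaned up: " + cancelAfter(200));
    }
}
```

Note that calling get with a timeout blocks the submitting thread, so with many tasks the timeout checks would normally run on a separate supervising thread.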

You may have a look at this article, it was very helpful for me before when I was facing the same problem : Java Hanging Thread Detection

How can I make a thread sleep for a while and then start working again?

I have the following code:
public void run()
{
try
{
logger.info("Looking for new tasks to fetch... ");
// definitions ..
for(Task t: tasks)
{
logger.info(" Task " + t.getId() + " is being fetched ");
// processing ... fetching task info from db using some methods
}
Thread.sleep(FREQUENCY);
//t.start();
} catch (Exception e)
{
logger.info("FetcherThread interrupted: "+e.getMessage());
}
}
I'm trying to make the thread sleep for a specific time, FREQUENCY, and then work again. When I execute this code in Eclipse, the thread works only once, then nothing happens and the process terminates. If I uncomment the statement t.start(), I get "FetcherThread interrupted: null".
Can anyone tell me where I'm going wrong?
N.B.: I want the thread to keep running all the time, but fetching periodically (say every 5 minutes).

You're missing any sort of loop in that code.
It seems that the thread is actually doing what you tell it to do: it runs all the tasks, then sleeps for a bit - then it has no more work to do, and so exits. There are several ways to address this, in ascending order of complexity and correctness:
The simple (and naive) way to address this is to wrap the try-catch block in an infinite loop (while(true) { ... }). This way after the thread finishes sleeping, it will loop back to the top and process all the tasks again.
However this isn't ideal, as it's basically impossible to stop the thread. A better approach is to declare a boolean field (e.g. volatile boolean running = true;, volatile so that a change made by another thread is visible to the loop), and change the loop to while(running). This way, you have a way to make the thread terminate (e.g. expose a method that sets running to false). See Sun's Why is Thread.stop() deprecated article for a longer explanation of this.
And taking a step further back, you may be trying to do this at too low a level. Sleeping and scheduling isn't really part of the job of your Runnable. The actual solution I would adopt is to strip out the sleeping, so that you have a Runnable implementation that processes all the tasks and then terminates. Then I would create a ScheduledExecutorService, and submit the "vanilla" runnable to the executor - this way it's the job of the executor to run the task periodically.
The last solution is ideal from an engineering perspective. You have a class that simply runs the job once and exits - this can be used in other contexts whenever you want to run the job, and composes very well. You have an executor service whose job is the scheduling of arbitrary tasks - again, you can pass different types of Runnable or Callable to this in future, and it will do the scheduling bit just as well. And possibly the best part of all, is that you don't have to write any of the scheduling stuff yourself, but can use a class in the standard library which specifically does this all for you (and hence is likely to have the majority of bugs already ironed out, unlike home-grown concurrency code).
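A minimal sketch of that last approach is shown below. The class name PeriodicFetchDemo and the counter that stands in for "fetch all tasks once" are assumptions for illustration only.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PeriodicFetchDemo {
    // Runs a one-shot job every periodMillis for about totalMillis; returns how often it ran.
    static int runFor(long periodMillis, long totalMillis) throws InterruptedException {
        AtomicInteger runs = new AtomicInteger();
        // The "vanilla" runnable: does its work once and returns, no loop and no sleep.
        Runnable fetchOnce = runs::incrementAndGet;

        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(fetchOnce, 0, periodMillis, TimeUnit.MILLISECONDS);

        Thread.sleep(totalMillis);  // the application would do other things here
        scheduler.shutdown();       // by default this also cancels the periodic task
        scheduler.awaitTermination(1, TimeUnit.SECONDS);
        return runs.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("runs: " + runFor(100, 550));
    }
}
```

One caveat worth knowing: if the runnable throws an exception, scheduleAtFixedRate suppresses all later executions, so the job should catch and log its own failures.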

Task scheduling has first-class support in Java, don't reinvent it. In fact, there are two implementations: Timer (old-school) and ScheduledExecutorService (new). Read up on them and design your app around them.

Try executing the task on a different thread.

You need some kind of loop to repeat your workflow. How shall the control flow get back to the fetching part?

You can put the code inside a loop (for example a while loop).
while(condition) // you can make it while(true) if you want it to run infinitely.
{
for(Task t: tasks)
{
logger.info(" Task " + t.getId() + " is being fetched ");
// processing ... fetching task info from db using some methods
}
Thread.sleep(FREQUENCY);
}
What's happening in your case is that the thread runs the task loop once, sleeps for some time, and then exits.

Put the thread in a loop as others have mentioned here.
I would like to add that calling Thread.start more than once on the same Thread object is illegal, and that is why you get an exception.
If you would like to spawn multiple threads, create one Thread object per thread you want to start.
See http://docs.oracle.com/javase/6/docs/api/java/lang/Thread.html#start()

public void run()
{
while (keepRunning) {
try
{
logger.info("Looking for new tasks to fetch... ");
// definitions ..
for(Task t: tasks)
{
logger.info(" Task " + t.getId() + " is being fetched ");
// processing ... fetching task info from db using some methods
t.start();
}
Thread.sleep(FREQUENCY);
} catch (Exception e) {
keepRunning = false;
logger.info("FetcherThread interrupted: "+e.getMessage());
}
}
}
Add a member called keepRunning (declared volatile, so that the change is visible across threads) to your thread class and implement an accessor method for setting it to false (from wherever you need to stop the thread from executing the tasks).
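For reference, a compact and self-contained version of the flag-based loop might look like the sketch below. The class name StoppableFetcher, the stop() method and the cycle counter are hypothetical names chosen for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class StoppableFetcher implements Runnable {
    // volatile: a write from another thread must be seen by the loop below
    private volatile boolean keepRunning = true;
    private final long frequencyMillis;
    private final AtomicInteger cycles = new AtomicInteger();

    StoppableFetcher(long frequencyMillis) {
        this.frequencyMillis = frequencyMillis;
    }

    public void stop() {
        keepRunning = false;
    }

    public int cycles() {
        return cycles.get();
    }

    @Override
    public void run() {
        while (keepRunning) {
            cycles.incrementAndGet(); // placeholder for "fetch and process the tasks"
            try {
                Thread.sleep(frequencyMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the flag and exit the loop
                return;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableFetcher fetcher = new StoppableFetcher(100);
        Thread thread = new Thread(fetcher);
        thread.start();
        Thread.sleep(350);
        fetcher.stop();
        thread.join();
        System.out.println("cycles: " + fetcher.cycles());
    }
}
```

After stop() the loop ends once the current sleep finishes; interrupting the thread as well makes it exit immediately.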

You need to put the sleep in an infinite loop (or within some condition specifying until when you want to sleep). As of now the sleep method is invoked at the end of the run method, and the behavior you observe is correct.
The following demo code will print "Sleep" on the console after sleeping for a second. Hope it helps.
import java.util.concurrent.TimeUnit;
public class Test implements Runnable {
/**
* @param args
*/
public static void main(String[] args) {
Test t = new Test();
Thread thread = new Thread(t);
thread.start();
}
public void run() {
try {
// logger.info("Looking for new tasks to fetch... ");
// definitions ..
// for(Task t: tasks)
// {
// logger.info(" Task " + t.getId() + " is being fetched ");
// // processing ... fetching task info from db using some methods
// }
while (true) { // your condition here
TimeUnit.SECONDS.sleep(1);
System.out.println("Sleep");
}
// t.start();
} catch (Exception e) {
// logger.info("FetcherThread interrupted: "+e.getMessage());
}
}
}

You could try ScheduledExecutorService (Javadoc).
And use its scheduleAtFixedRate method, which:
Creates and executes a periodic action that becomes enabled first after the given initial delay, and subsequently with the given period; that is executions will commence after initialDelay then initialDelay+period, then initialDelay + 2 * period, and so on.

How should I execute external commands using multithreading in Java?

I want to run an external program N times, waiting for its output each time and processing it. Since running sequentially is too slow, I tried multithreading.
The code looks like this:
public class ThreadsGen {
public static void main(String[] pArgs) throws Exception {
for (int i =0;i < N ; i++ )
{
new TestThread().start();
}
}
static class TestThread extends Thread {
public void run() {
String cmd = "programX";
String arg = "exArgs";
Process pr;
try {
pr = new ProcessBuilder(cmd,arg).start();
pr.waitFor(); // blocks this worker thread, not the main thread
} catch (IOException e1) {
e1.printStackTrace();
} catch (InterruptedException e) {
e.printStackTrace();
}
//process output files from programX.
//...
}
}
}
However, it seems to me that only one thread is running at a time (by checking CPU usage).
What I want is to have all threads working (except the one that is waiting for programX to finish). What's wrong with my code?
Is it because pr.waitFor(); makes the main thread wait on each subthread?

The waitFor() calls are not your problem here (and are actually causing the spawned Threads to wait on the completion of the spawned external programs rather than the main Thread to wait on the spawned Threads).
There are no guarantees around when Java will start the execution of Threads. It is quite likely, therefore, that if the external program(s) that you are running finish quickly then some of the Threads running them will complete before all the programs are launched.
Also note that CPU usage is not necessarily a good guide to concurrent execution as your Java program is doing nothing but waiting for the external programs to complete. More usefully you could look at the number of executed programs (using ps or Task Manager or whatever).
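As an illustration of running the external program from a bounded pool and counting what actually executed, here is a hedged sketch. The class name ParallelCommands is an assumption, java -version merely stands in for programX, and ProcessBuilder.Redirect.DISCARD requires Java 9 or later.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelCommands {
    // Runs the command `times` times on `threads` workers and returns the exit codes.
    static List<Integer> runAll(List<String> command, int times, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> jobs = new ArrayList<>();
            for (int i = 0; i < times; i++) {
                jobs.add(() -> runOnce(command));
            }
            List<Integer> exitCodes = new ArrayList<>();
            for (Future<Integer> f : pool.invokeAll(jobs)) { // blocks until every job is done
                exitCodes.add(f.get());
            }
            return exitCodes;
        } finally {
            pool.shutdown();
        }
    }

    // Starts one external process and waits for it on the calling worker thread.
    static int runOnce(List<String> command) throws InterruptedException {
        try {
            Process p = new ProcessBuilder(command)
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD) // avoid blocking on unread output
                    .start();
            return p.waitFor();
        } catch (IOException e) {
            return -1; // the command could not be started
        }
    }

    public static void main(String[] args) throws Exception {
        // "java -version" is only a placeholder for programX and its arguments.
        System.out.println(runAll(Arrays.asList("java", "-version"), 4, 4));
    }
}
```

Each waitFor() call blocks only its own worker thread, so up to `threads` external programs run at the same time while the main thread waits in invokeAll.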

Isn't yours the same problem as in this thread: How to wait for all threads to finish, using ExecutorService?

On FutureTask, finally and TimeoutExceptions in Java

I'm trying to understand how to ensure that a specific action completes in a certain amount of time. It seems like a simple job for Java's new java.util.concurrent library. However, this task claims a connection to the database, and I want to be sure that it properly releases that connection upon timeout.
So, to call the service:
int resultCount = -1;
ExecutorService executor = null;
try {
executor = Executors.newSingleThreadExecutor();
FutureTask<Integer> task = new CopyTask<Integer>();
executor.execute(task);
try {
resultCount = task.get(2, TimeUnit.MINUTES);
} catch (Exception e) {
LOGGER.fatal("Migrate Events job crashed.", e);
task.cancel(true);
return;
}
} finally {
if (executor != null) {
executor.shutdown();
}
}
The task itself simply wraps a Callable; here is the call method:
@Override
public Integer call() throws Exception {
Session session = null;
try {
session = getSession();
... execute sql against sesssion ...
}
} finally {
if (session != null) {
session.release();
}
}
}
So, my question for those who've made it this far is: is session.release() guaranteed to be called if the task fails due to a TimeoutException? I suspect that it is not, but I would love to be proven wrong.
Thanks
edit: The problem I'm having is that occasionally the SQL in question does not finish due to weird DB problems. So what I want to do is simply close the connection, let the DB roll back the transaction, get some rest and reattempt this at a later time. So I'm treating the get(...) as if it were killing the thread. Is that wrong?

When you call task.get() with a timeout, that timeout only applies to the attempt to obtain the results (in your current thread), not the calculation itself (in the worker thread). Hence your problem here; if a worker thread gets into some state from which it will never return, then the timeout simply ensures that your polling code will keep running but will do nothing to affect the worker.
Your call to task.cancel(true) in the catch block is what I was initially going to suggest, and this is good coding practice. Unfortunately this only sets a flag on the thread that may/should be checked by well-behaved long-running, cancellable tasks, but it doesn't take any direct action on the other thread. If the SQL executing methods don't declare that they throw InterruptedException, then they most likely aren't going to check this flag and aren't going to be interruptible via the typical Java mechanism.
Really all of this comes down to the fact that the code in the worker thread must support some mechanism of stopping itself if it's run for too long. Supporting the standard interrupt mechanism is one way of doing this; checking some boolean flag intermittently, or other bespoke alternatives, would work too. However there is no guaranteed way to cause another thread to return (short of Thread.stop, which is deprecated for good reason). You need to coordinate with the running code to signal it to stop in a way that it will notice.
In this particular case, I expect there are probably some parameters you could set on the DB connection so that the SQL calls will time out after a given period, meaning that control returns to your Java code (probably with some exception) and so the finally block gets called. If not, i.e. there's no way to make the database call (such as PreparedStatement.execute()) return control after some predetermined time, then you'll need to spawn an extra thread within your Callable that can monitor a timeout and forcibly close the connection/session if it expires. This isn't very nice though and your code will be a lot cleaner if you can get the SQL calls to cooperate.
(So ironically despite you supplying a good amount of code to support this question, the really important part is the bit you redacted: "... execute sql against sesssion ..." :-))
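The "extra thread that monitors a timeout and forcibly closes the session" idea could be sketched as below. Everything here is hypothetical: FakeSession is a stand-in whose runQuery() blocks until close() is called, which imitates a hung SQL call, and it is not a real database API. With real JDBC, setting Statement.setQueryTimeout is usually the cleaner first option.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class WatchdogDemo {
    // Hypothetical stand-in for a DB session whose query hangs until the session is closed.
    static class FakeSession implements AutoCloseable {
        private final CountDownLatch closed = new CountDownLatch(1);

        void runQuery() throws Exception {
            closed.await(); // simulates a SQL call that never returns on its own
            throw new IllegalStateException("session closed while query was running");
        }

        @Override
        public void close() {
            closed.countDown();
        }
    }

    // Returns true if the watchdog had to abort the query.
    static boolean queryWithTimeout(FakeSession session, long timeoutMillis) {
        ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();
        Runnable closeSession = session::close;
        ScheduledFuture<?> killer =
                watchdog.schedule(closeSession, timeoutMillis, TimeUnit.MILLISECONDS);
        try {
            session.runQuery();
            return false; // finished before the timeout
        } catch (Exception e) {
            return true;  // control came back because the session was closed under the query
        } finally {
            killer.cancel(false);   // no effect if the watchdog already fired
            session.close();        // release happens on every path
            watchdog.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("aborted: " + queryWithTimeout(new FakeSession(), 200));
    }
}
```

The point of the design is that the blocked thread is never killed; it is the resource underneath it that gets closed, so the blocked call fails and the finally block runs in the normal way.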

You cannot forcibly stop a thread from the outside; an interrupt is only a request that the running code has to notice. The timeout will therefore have no effect on the code down in the JDBC layer (perhaps even over in JNI-land somewhere). Presumably the SQL work will eventually end and session.release() will happen, but that may be long after the end of your timeout.

The finally block will eventually execute.
When your task takes longer than 2 minutes, a TimeoutException is thrown, but the actual thread continues to perform its work and will eventually execute the finally block. Even if you cancel the task and force an interrupt, the finally block will be called.
Here's a small example based on your code. You can test these situations:
public static void main(String[] args) {
int resultCount = -1;
ExecutorService executor = null;
try {
executor = Executors.newSingleThreadExecutor();
FutureTask<Integer> task = new FutureTask<Integer>(new Callable<Integer>() {
@Override
public Integer call() throws Exception {
try {
Thread.sleep(10000);
return 1;
} finally {
System.out.println("FINALLY CALLED!!!");
}
}
});
executor.execute(task);
try {
resultCount = task.get(1000, TimeUnit.MILLISECONDS);
} catch (Exception e) {
System.out.println("Migrate Events job crashed: " + e.getMessage());
task.cancel(true);
return;
}
} finally {
if (executor != null) {
executor.shutdown();
}
}
}

Your example says:
copyRecords.cancel(true);
I assume this was meant to say:
task.cancel(true);
Your finally block will be called assuming that the contents of the try block are interruptible. Some operations are (like wait()), some operations are not (like InputStream#read()). It all depends on the operation that the code is blocking on when the task is interrupted.

producer/consumer work queues

I'm wrestling with the best way to implement my processing pipeline.
My producers feed work to a BlockingQueue. On the consumer side, I poll the queue, wrap what I get in a Runnable task, and submit it to an ExecutorService.
while (!isStopping())
{
String work = workQueue.poll(1000L, TimeUnit.MILLISECONDS);
if (work == null)
{
break;
}
executorService.execute(new Worker(work)); // needs to block if no threads!
}
This is not ideal; the ExecutorService has its own queue, of course, so what's really happening is that I'm always fully draining my work queue and filling the task queue, which slowly empties as the tasks complete.
I realize that I could queue tasks at the producer end, but I'd really rather not do that - I like the indirection/isolation of my work queue being dumb strings; it really isn't any business of the producer what's going to happen to them. Forcing the producer to queue a Runnable or Callable breaks an abstraction, IMHO.
But I do want the shared work queue to represent the current processing state. I want to be able to block the producers if the consumers aren't keeping up.
I'd love to use Executors, but I feel like I'm fighting their design. Can I partially drink the Kool-Aid, or do I have to gulp it? Am I being wrong-headed in resisting queueing tasks? (I suspect I could set up ThreadPoolExecutor to use a 1-task queue and override its execute method to block rather than reject-on-queue-full, but that feels gross.)
Suggestions?

"I want the shared work queue to represent the current processing state."
Try using a shared BlockingQueue and have a pool of Worker threads taking work items off of the Queue.
"I want to be able to block the producers if the consumers aren't keeping up."
Both ArrayBlockingQueue and LinkedBlockingQueue support bounded queues such that they will block on put when full. Using the blocking put() methods ensures that producers are blocked if the queue is full.
Here is a rough start. You can tune the number of workers and queue size:
public class WorkerTest<T> {
private final BlockingQueue<T> workQueue;
private final ExecutorService service;
public WorkerTest(int numWorkers, int workQueueSize) {
workQueue = new LinkedBlockingQueue<T>(workQueueSize);
service = Executors.newFixedThreadPool(numWorkers);
for (int i=0; i < numWorkers; i++) {
service.submit(new Worker<T>(workQueue));
}
}
public void produce(T item) {
try {
workQueue.put(item);
} catch (InterruptedException ex) {
Thread.currentThread().interrupt();
}
}
private static class Worker<T> implements Runnable {
private final BlockingQueue<T> workQueue;
public Worker(BlockingQueue<T> workQueue) {
this.workQueue = workQueue;
}
@Override
public void run() {
while (!Thread.currentThread().isInterrupted()) {
try {
T item = workQueue.take();
// Process item
} catch (InterruptedException ex) {
Thread.currentThread().interrupt();
break;
}
}
}
}
}

"find an available existing worker thread if one exists, create one if necessary, kill them if they go idle."
Managing all those worker states is as unnecessary as it is perilous. I would create one monitor thread that constantly runs in the background, whose only task is to fill up the queue and spawn consumers... why not make the worker threads daemons so they die as soon as they complete? If you attach them all to one ThreadGroup you can dynamically re-size the pool... for example:
for (int i = 0; i < queue.size() && threadGroup.activeCount() < UPPER_LIMIT; i++) {
spawnDaemonWorkers(queue.poll());
}

You could have your consumer execute Runnable::run directly instead of starting up a new thread. Combine this with a blocking queue with a maximum size and I think you will get what you want. Your consumer becomes a worker that executes tasks inline based on the work items on the queue. Consumers will only dequeue items as fast as they process them, so your producer will block when your consumers stop consuming.
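A closely related built-in option, named plainly because it is not exactly what the answer above describes, is ThreadPoolExecutor.CallerRunsPolicy: with a small bounded task queue, a saturated pool makes the submitting thread run the task itself, which slows down further polling of the work queue. The sketch below assumes a pool of 2 threads, a queue of 2 and a 10 ms sleep in place of real work; the class name BackPressureDemo is made up for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BackPressureDemo {
    // Submits `items` tasks to a deliberately small pool and returns how many completed.
    static int process(int items) throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(2),        // small, bounded task queue
                new ThreadPoolExecutor.CallerRunsPolicy()); // when saturated, run on the submitter
        for (int i = 0; i < items; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(10); // stands in for processing one work item
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.incrementAndGet();
            });
        }
        pool.shutdown(); // tasks already accepted still run to completion
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed: " + process(20));
    }
}
```

Because the submitting loop is itself busy whenever the pool is saturated, it stops draining the shared work queue, and a bounded work queue then blocks the producers in turn.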
