Is there such thing as 'too much cleanup' when handling thread interrupts?


This great article about best practices for handling interrupts mentions the following:
Sometimes it is necessary to do some amount of cleanup before propagating the exception. In this case, you can catch InterruptedException, perform the cleanup, and then rethrow the exception.
He then goes on to give an example of a method that catches InterruptedException, does a couple lines of cleanup, and then propagates the exception onward.
His small example makes perfect sense, but let's say I have a much longer interruptible method, whose task is not so simple, and it must be performed atomically. In other words, the amount of 'cleanup' it would need to perform when interrupted is substantial. Is this acceptable? If so, could I be cheeky and just catch the interrupt, perform all of the method's normal workflow (pretend it's 'cleanup'), and then propagate the interrupt at the very end?
In other words, I get that it's important to properly handle and propagate interrupts; my question is, how important is it to handle interrupts in a timely manner, and what counts as 'timely'?
Here's the example (real world) scenario where I'm coming from: I have a thread listening to a message queue; handling each message involves multiple HTTP calls and expensive DB operations, and, as currently (unfortunately) designed, these operations must all be performed atomically. Can I define my thread's interrupt-handling behavior to be: 'when interrupted, finish everything you're doing as normal before propagating the interrupt', or is this stretching the definition of 'cleanup' a little too much?

I don't think there is any useful notion of "too little" or "too much" cleanup. Certainly there is no general way to decide that you have done too little or too much.
Specifically ...
Can I define my thread's interrupt-handling behavior to be: 'when interrupted, finish everything you're doing as normal before propagating the interrupt', or is this stretching the definition of 'cleanup' a little too much?
There's no definite answer to this. If it makes sense (e.g. this behaviour is required), then it would be correct to do that. Whether you call it "cleanup" or not is irrelevant.
On the other hand, one of the common use-cases of Java interrupts is to signal to some part of the application to stop whatever it is doing because, for example:
the server is shutting down, or
the requested action is taking too long, or
the client that made the request has "gone away", and there is no other reason to complete the request.
In such cases, "finish everything as normal" may be the wrong strategy, especially if that is going to be expensive. (Or it may be the right strategy; for example, if there is no reliable way to back out of the sequence of actions that need to be done atomically.)
In short ... we can't tell you whether this is the right thing to do.
In other words, I get that it's important to properly handle and propagate interrupts; my question is, how important is it to handle interrupts in a timely manner, and what counts as 'timely'?
Again, these are questions that only make sense (and can only be answered) in the context of your application. It depends ...
But I don't think that this (cleanup) is restricted to interrupts. Consider the example in the article:
public class PlayerMatcher {
private PlayerSource players;
public PlayerMatcher(PlayerSource players) {
this.players = players;
}
public void matchPlayers() throws InterruptedException {
Player playerOne, playerTwo;
try {
while (true) {
playerOne = playerTwo = null;
// Wait for two players to arrive and start a new game
playerOne = players.waitForPlayer(); // could throw IE
playerTwo = players.waitForPlayer(); // could throw IE
startNewGame(playerOne, playerTwo);
}
}
catch (InterruptedException e) {
// If we got one player and were interrupted, put that player back
if (playerOne != null)
players.addFirst(playerOne);
// Then propagate the exception
throw e;
}
}
}
What happens (for example) if waitForPlayer or startNewGame could throw some other exception (checked or unchecked)? In that case, you could end up with lost players ... just like if you had an InterruptedException.
My point ... is that if you are concerned about making the code resilient in general (or "atomic") then it would be better to use a finally block to do the recovery; e.g.
finally {
// Make sure that we *always* put the players back
if (playerOne != null)
players.addFirst(playerOne);
if (playerTwo != null)
players.addFirst(playerTwo);
}
And if you need to do atomic operations that also entail changing state outside of the JVM and/or "the application" ... then even finally is not enough. There are some situations where code in a finally block won't be executed; e.g. if the JVM crashes or is terminated by System.exit(). (This is @EJP's point ...)

The approach you describe is sometimes referred to as "entering 'lame duck' mode", wherein you'll finish what you've already started but won't accept or initiate any new work.
It's fine as long as you document it, so that callers know what to expect. Encountering an InterruptedException means that some upstream caller wants to terminate the thread's activity, but safety trumps responsiveness. If you believe that these operations must all complete together (to the best of your ability), and stopping the unit of work with only part of it done would violate some requirement, then you are within your rights to adhere to those requirements and put them above the implied requirement for timely cooperation with an interruption request.
Ideally, you'd cease any further progress with the transaction and attempt to roll back what you've already completed. However, there's subtlety in that design; you could be far enough along that just finishing the transaction would be faster than rolling back your nearly-complete accomplishments.
Again, the key here is documentation. If you document the behavior and find that your callers complain, then you have to push back on the competing requirement for transactional atomicity.

The end of that same article discusses "noncancelable tasks", which finish what they're doing (even if it may take a long time) before responding to the interruption. It sounds like that's what you have.
You don't necessarily have to abort what you're doing immediately, but you should set a flag to remember that an interrupt was requested, then re-throw the InterruptedException later when the atomic work is done.
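The "remember the interrupt, finish, then report it" idea can be sketched as follows. This is a minimal illustration, not the article's exact code: the class and method names are hypothetical, and the blocking call is a plain BlockingQueue.take() standing in for the atomic work. Here the interrupt is reported by restoring the thread's interrupt status; a method that declares InterruptedException could throw it at that point instead.

```java
import java.util.concurrent.BlockingQueue;

final class NoncancelableTake {
    /** Takes one element, deferring any interrupt until the take has completed. */
    static <T> T takeUninterruptibly(BlockingQueue<T> queue) {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    return queue.take();
                } catch (InterruptedException e) {
                    // Remember the request, but keep going until the work is done.
                    interrupted = true;
                }
            }
        } finally {
            if (interrupted) {
                // Hand the interrupt back so callers further up can still see it.
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

Callers can then check Thread.interrupted() (or hit the next blocking call) and shut down once the atomic unit of work is complete.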

No. What if you get interrupted again? If you need a method to be atomic you have much bigger problems than propagating exceptions.

Related

What's the point of using Future without multithreading?

I've inherited some code and there is nobody of the original developers left. The code uses heavily CompletableFuture, and it's the first time I use it, so I'm still trying to wrap my head around it. As I understand it, a (Completable)Future is typically used with some multithreading mechanism that will allow us to do some other thing while a time consuming task is executing, and then simply fetch its result via the Future. As in the javadoc:
interface ArchiveSearcher { String search(String target); }
class App {
ExecutorService executor = ...
ArchiveSearcher searcher = ...
void showSearch(final String target) throws InterruptedException {
Future<String> future = executor.submit(new Callable<String>() {
public String call() {
return searcher.search(target);
}});
displayOtherThings(); // do other things while searching
try {
displayText(future.get()); // use future
} catch (ExecutionException ex) { cleanup(); return; }
}
}
However, in this application that I've inherited, the following pattern that doesn't use any multithreading appears a bunch of times:
public Object serve(Object input) throws ExecutionException, InterruptedException {
CompletableFuture<Object> result = delegate1(input);
return result.get();
}
private CompletableFuture<Object> delegate1(Object input) {
// Do things
return delegate2(input);
}
private CompletableFuture<Object> delegate2(Object input) {
return CompletableFuture.completedFuture(new Object());
}
To me, this is equivalent to:
public Object serve(Object input) {
Object result = delegate1(input);
return result;
}
private Object delegate1(Object input) {
// Do things
return delegate2(input);
}
private Object delegate2(Object input) {
return new Object();
}
Of course the code is much more complex, and returns exceptionallyCompletedFuture in case of error, but there is no Callable, no Runnable, no Executor, no supplyAsync(), no sign of multithreading. What am I missing? What's the point of using a Future in a single-threaded context?

Futures are critical for situations where there is asynchronous programming. One of the biggest advantages of asynchronous programming is it allows you to write very efficient code with a single thread.
Furthermore, futures tend to be an all-or-nothing proposition. If you want to write asynchronous code you have to do so from top to bottom, even if not every method does something asynchronous.
For example, consider you want to write a single threaded HTTP server like twisted or express. The top level of your server (very liberal pseudocode here) might look something like:
while (true) {
if (serverSocket.ready()) {
connection = serverSocket.accept();
futures.add(server.serve(connection));
}
for (Future future : futures) {
if (future.isDone()) {
Object result = future.get();
sendResult(result);
}
}
//Some kind of select-style wait here
}
There is only one thread but any time an operation happens that would normally require a wait (reading from database, file, reading in the request, etc.) it uses futures and doesn't block the one thread so you have a highly performant single threaded HTTP server.
Now, imagine what would happen if the highest level of your application was like the above and at some point some request at a very low level had to read something from a file. That file read would generate a future. If all of your middle layers in between didn't handle futures then you would have to block and it would defeat the purpose. This is why I say futures tend to be all-or-nothing.
So my guess is either:
Your friend does something asynchronous currently and you haven't caught it yet (does he ever read from a file or database or anything? If so, is he blocking?).
He was planning on someday doing something asynchronous and wanted to plan for it.
He spent a lot of time in other asynchronous frameworks and grew to like the style even if he isn't using it correctly.

Yes, for now there is no multithreading used in that code. It looks like the intention was to write single-threaded code in such a way that, if a developer later decides to use multithreading, only the delegate2() method needs to be modified.

ExecutorService implementations typically manage threads. I've used the ThreadPoolExecutor, which does exactly that. You commented out which ExecutorService your code uses.

The main point of asynchronous code is to defer the continuation code.
The most common scenario is I/O, where instead of waiting for an operation to finish, you say "do your thing and notify me when you're finished", or more commonly, "do your thing and do this when you're finished".
This doesn't imply threads at all. Reading from any device, be it a network card or a hard drive, usually has some sort of signal or interrupt sent from the device to the CPU. You could use the CPU in the meantime. The "notify me" is more common in lower-level code, where you implement a dispatching loop or scheduler; the "do this" is more common in higher-level code, where you use an established library or framework that dispatches and/or schedules for you.
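To make the "do this when you're finished" style concrete, here is a small sketch that uses only one thread. The class name and the strings are made up for illustration; the point is only that the continuation is registered first and runs later, when whoever owns the future completes it.

```java
import java.util.concurrent.CompletableFuture;

final class ContinuationDemo {
    static String run() {
        StringBuilder log = new StringBuilder();
        CompletableFuture<String> response = new CompletableFuture<>();

        // "Do your thing and do this when you're finished": nothing runs yet.
        response.thenApply(String::length)
                .thenAccept(n -> log.append("length=").append(n));
        log.append("registered;");

        // Later, the dispatching code completes the future on the same thread,
        // which triggers the continuation registered above.
        response.complete("hello");
        return log.toString();
    }
}
```

Calling ContinuationDemo.run() returns "registered;length=5", showing that the continuation was deferred until completion without any second thread.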
Less common scenarios include deferring execution without blocking a thread (think of a timer versus Thread.sleep()) and splitting work. Actually, splitting work is very common with multiple threads, where you can improve performance with a bit of overhead, but not so much with a single thread, where the overhead is just, well, overhead.
The code you provide as an example that just builds completed CompletableFutures, whether successfully or exceptionally, is a part of the overhead of asynchronous code that isn't really asynchronous. That is, you must still follow a defined async style, which in this case requires a small amount of memory allocation for results, even if you can provide results immediately.
This may become noticeable on thousands of calls per second, or hundreds of calls per second per thread with dozens of threads.
Sometimes, you can optimize by having predefined completed futures for e.g. null, 0, 1, -1, an empty array/list/stream, or any other very common or even fixed result you may have specifically in your domain. A similar approach is to cache a wrapping future, not just the result, while the result remains the same. But I suggest you first profile before going this way, you may end up optimizing prematurely something that most probably is not a bottleneck.
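A sketch of the predefined-completed-future idea, assuming a hypothetical lookup method whose most common result is an empty list. The names are illustrative only, and sharing a future like this is only safe if no caller mutates it (for example via obtrudeValue).

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class CachedFutures {
    // One shared, already-completed future for the very common "nothing found" case.
    private static final CompletableFuture<List<String>> EMPTY =
            CompletableFuture.completedFuture(Collections.<String>emptyList());

    // Hypothetical lookup: returns the shared future instead of allocating a new one.
    static CompletableFuture<List<String>> findTags(String key) {
        if (key.isEmpty()) {
            return EMPTY;
        }
        return CompletableFuture.completedFuture(Collections.singletonList(key));
    }
}
```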

java try finally unlock idiom

The Javadoc and some answers (Threads - Why a Lock has to be followed by try and finally) state that:
In most cases, the following idiom should be used:
Lock l = ...;
l.lock();
try {
// access the resource protected by this lock
} finally {
l.unlock();
}
I have seen examples of this idiom in standard Java libraries.
This is my example of using it. Fields acc1 and acc2 represent the well-known example of bank accounts. The main constraint is on the sum of the two account values: it should be 0.
public class Main {
int acc1;
int acc2;
ReadWriteLock lock = new ReentrantReadWriteLock();
public int readSum() {
lock.readLock().lock();
try {
return acc1 + acc2;
} finally {
lock.readLock().unlock();
}
}
public void transfer() {
lock.writeLock().lock();
try {
acc1--; // constraint is corrupted
// exception thrown here
acc2++; // constraint is regained
} finally {
lock.writeLock().unlock();
}
}
}
I understand using the idiom in the read case: if an exception is thrown in the read method, other threads can still read/write a consistent resource. But if an exception is thrown in the write method, read methods can read an inconsistent resource.
Why is reading inconsistent values preferable to waiting on a lock forever?
Why do the authors of the Java libraries prefer this behavior?

You can roll back or show a warning to the user, but you can do nothing if the program is blocked.
I agree with your point about data consistency. It's dangerous to unlock in the finally block without any rollback operation or warning.
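One way to combine the unlock idiom with a rollback is sketched below. It reuses the acc1/acc2 fields from the question; the failMidway parameter is a hypothetical switch that only exists to simulate the exception, and restoring a snapshot is just one possible rollback strategy.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class Accounts {
    private int acc1;
    private int acc2;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    int readSum() {
        lock.readLock().lock();
        try {
            return acc1 + acc2;
        } finally {
            lock.readLock().unlock();
        }
    }

    void transfer(boolean failMidway) {
        lock.writeLock().lock();
        // Snapshot taken under the write lock, so it is consistent.
        int oldAcc1 = acc1;
        int oldAcc2 = acc2;
        try {
            acc1--; // constraint is temporarily broken
            if (failMidway) {
                throw new IllegalStateException("simulated failure");
            }
            acc2++; // constraint is restored
        } catch (RuntimeException e) {
            // Roll back before the lock is released, then let the caller know.
            acc1 = oldAcc1;
            acc2 = oldAcc2;
            throw e;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The finally block still guarantees the unlock; the catch block is what protects readers from seeing the broken invariant.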

You are mixing up different concepts here. There is:
Locking, and the preventing of dead-locks, and then
Another dimension, lets call it "data integrity".
The point: those two are basically orthogonal. The fact that you are addressing one of them doesn't magically resolve the other for you!
Even when you look at your own example, you find that you put a try/finally there. But there is no catch there!
at all cost
Meaning: if some exception is thrown, that exception is still thrown, and some catcher will have to deal with it.
In other words: if your "locked" code can run into exceptions, then it is your responsibility to handle that in the way that makes the most sense.
And, from a "systems" view: when you got an indefinite lock, that will sooner or later degrade your whole system. If you run into that exception once, then you will run into it more often. So, chances are, that you will run out of threads/locks soon; and your whole application will be affected. That is exactly the kind of problem that can take down a distributed infrastructure - one component stopping to process incoming requests.
Long story short: infinite lock-waiting is something that you want to prevent, because it can seriously impact the ability of your application to function!
Finally: of course, this is about balancing different requirements. But, as an example: assume your online shop loses the information that you just deleted an item from your shopping cart. Yes, that is annoying for the customer. But compare that to the whole online shopping application no longer handling requests because of locking issues. Which problem will hurt your business more?

Catch and logging for irrelevant operations

I've a method to save timing information of each operation:
public void queueTimerInfo(long start, long end, String msg) {
try {
timer.queue(start, end, msg);
} catch (InterruptedException e) {
Logger.info(e.getMessage());
}
}
I call the above method after each operation. What matters is the operation itself, whereas the timing is just a secondary task. That's why I decided not to do anything when the method fails, except logging it.
But I was always told that logging without managing the exception is a bad practice. So how should I rewrite the above code?

If you know the consequences, i.e. the timer.queue() call might be interrupted and not queue the data, and you can live with that, then it is OK to ignore the Exception. As with most rules, you need to know when to break them.
However, I would document your decision with a comment in the catch block, so that whoever maintains the code later knows that not handling the Exception was not an oversight, but a deliberate decision.

But I was always told that logging without managing the exception is a bad practice
What does "managing" mean? Rethrowing them? Blindly following steps 1, 2, 3 because "zOMG an exception was thrown!!111"?
If you blindly follow best practices and other sorts of advice regardless of your context, then you'll probably end up with really problematic and awkward decisions. Don't do things just because it's best practice. Acknowledge the best practices, but at the same time make sure they actually make sense in your situation.
Ask yourself: does that exception make a difference? Does it break a contract? Does it change the flow of your application? Do you absolutely not want that to happen and if it does, then the situation is truly exceptional and you should really deal with it somehow?
If it doesn't make a difference and so on, then just logging it is perfectly acceptable. It really comes down to your context and to the significance of your exception.
LE: Of course, as Thomas suggests, you may want to document your decision.

InterruptedException is special because it does not signal an error.
When a method declares an InterruptedException, it tells you that it is a blocking method which can be cancelled by interrupting its thread.
Brian Goetz explains:
When a method throws InterruptedException, it is telling you that if
the thread executing the method is interrupted, it will make an
attempt to stop what it is doing and return early and indicate its
early return by throwing InterruptedException. Well-behaved blocking
library methods should be responsive to interruption and throw
InterruptedException so they can be used within cancelable activities
without compromising responsiveness.
You should do one of the following:
not catch the exception, and add a throws InterruptedException clause
catch it, do the cleanup, and rethrow it
when you can't throw it (e.g. in a Runnable), call Thread.currentThread().interrupt();
If you just swallow the exception you will compromise responsiveness of your application.
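The third option can be sketched like this. The class name is hypothetical and Thread.sleep() is only a stand-in for a blocking call such as timer.queue(...); the relevant part is the catch block.

```java
final class TimingRecorder implements Runnable {
    @Override
    public void run() {
        try {
            // Stand-in for a blocking call that can throw InterruptedException.
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            // run() cannot declare the exception, so restore the interrupt
            // status instead of swallowing it; code further up can then react.
            Thread.currentThread().interrupt();
        }
    }
}
```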

Can you wrap it and rethrow? And then your common exception handler can take care of logging and reporting. And provide context with it if possible.

Thread.sleep() in a while loop

I notice that NetBeans is warning me about using Thread.sleep() in a while loop in my Java code, so I've done some research on the subject. It seems primarily the issue is one of performance, where your while condition may become true while the counter is still sleeping, thus wasting wall-clock time as you wait for the next iteration. This all makes perfect sense.
My application has a need to contact a remote system and periodically poll for the state of an operation, waiting until the operation is complete before sending the next request. At the moment the code logically does this:
String state = get state via RPC call
while (!state.equals("complete")) {
Thread.sleep(10000); // Wait 10 seconds
state = {update state via RPC call}
}
Given that the circumstance is checking a remote operation (which is a somewhat expensive process, in that it runs for several seconds), is this a valid use of Thread.sleep() in a while loop? Is there a better way to structure this logic? I've seen some examples where I could use a Timer class, but I fail to see the benefit, as it still seems to boil down to the same straightforward logic above, but with a lot more complexity thrown in.
Bear in mind that the remote system in this case is neither under my direct control, nor is it written in Java, so changing that end to be more "cooperative" in this scenario is not an option. My only option for updating my application's value for state is to create and send an XML message, receive a response, parse it, and then extract the piece of information I need.
Any suggestions or comments would be most welcome.

Unless your remote system can issue an event or otherwise notify you asynchronously, I don't think the above is at all unreasonable. You need to balance your sleep() time vs. the time/load that the RPC call makes, but I think that's the only issue and the above doesn't seem of concern at all.

Without being able to change the remote end to provide a "push" notification that it is done with its long-running process, that's about as well as you're going to be able to do. As long as the Thread.sleep time is long compared to the cost of polling, you should be OK.
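For reference, the polling loop from the question could be written out roughly as below. This is only a sketch: the Supplier stands in for the RPC call, and the maxAttempts bound is an added assumption (the question's loop polls indefinitely) so that a stuck remote operation cannot block the caller forever.

```java
import java.util.function.Supplier;

final class StatePoller {
    /** Returns true once the state is "complete", false on timeout or interrupt. */
    static boolean pollUntilComplete(Supplier<String> fetchState,
                                     long intervalMillis,
                                     int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if ("complete".equals(fetchState.get())) {
                return true;
            }
            if (attempt == maxAttempts) {
                break; // no point sleeping after the last attempt
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Treat an interrupt as a request to stop polling.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```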

You should (almost) never use sleep, since it's very inefficient and it's not good practice. Always use locks and condition variables where threads signal each other. See Mike Dahlin's Coding Standards for Programming with Threads.
A template is:
public class Foo{
private Lock lock;
private Condition c1;
private Condition c2;
public Foo()
{
lock = new SimpleLock();
c1 = lock.newCondition();
c2 = lock.newCondition();
...
}
public void doIt()
{
lock.lock(); // acquire before the try, so finally never unlocks a lock we do not hold
try{
...
while(...){
c1.awaitUninterruptibly();
}
...
c2.signal();
}
finally{
lock.unlock();
}
}
}

Java while loop and Threads! [duplicate]

I have a program that continually polls the database for change in value of some field. It runs in the background and currently uses a while(true) and a sleep() method to set the interval. I am wondering if this is a good practice? And, what could be a more efficient way to implement this? The program is meant to run at all times.
Consequently, the only way to stop the program is by issuing a kill on the process ID. The program could be in the middle of a JDBC call. How could I go about terminating it more gracefully? I understand that the best option would be to devise some kind of exit strategy by using a flag that will be periodically checked by the thread. But, I am unable to think of a way/condition of changing the value of this flag. Any ideas?

I am wondering if this is a good practice?
No. It's not good. Sometimes, it's all you've got, but it's not good.
And, what could be a more efficient way to implement this?
How do things get into the database in the first place?
The best change is to fix programs that insert/update the database to make requests which go to the database and to your program. A JMS topic is good for this kind of thing.
The next best change is to add a trigger to the database to enqueue each insert/update event into a queue. The queue could feed a JMS topic (or queue) for processing by your program.
The fall-back plan is your polling loop.
Your polling loop, however, should not trivially do work. It should drop a message into a queue for some other JDBC process to work on. A termination request is another message that can be dropped into the JMS queue. When your program gets the termination message, it absolutely must be finished with the prior JDBC request and can stop gracefully.
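The termination-message idea can be sketched in plain Java with a BlockingQueue standing in for the JMS queue. The sentinel value and class name are assumptions made for the example; a real JMS setup would use its own message type for this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

final class QueueWorker {
    // Hypothetical sentinel meaning "finish up and stop".
    static final String TERMINATE = "__TERMINATE__";

    /** Handles messages in order and stops cleanly at the termination message. */
    static List<String> drainUntilTerminated(BlockingQueue<String> queue) {
        List<String> handled = new ArrayList<>();
        try {
            while (true) {
                String message = queue.take();
                if (TERMINATE.equals(message)) {
                    break; // the previous message was fully handled already
                }
                handled.add(message); // stand-in for the JDBC work
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled;
    }
}
```

Because the termination request travels through the same queue as the work, the worker is never stopped in the middle of a message.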
Before doing any of this, look at ESB solutions. Sun's JCAPS or TIBCO already have this. An open source ESB like Mulesource or Jitterbit may have this functionality already built and tested.

This is really too big an issue to answer completely in this format. Do yourself a favour and go buy Java Concurrency in Practice. There is no better resource for concurrency on the Java 5+ platform out there. There are whole chapters devoted to this subject.
On the subject of killing your process during a JDBC call, that should be fine. I believe there are issues with interrupting a JDBC call (in that you can't?) but that's a different issue.

As others have said, the fact that you have to poll is probably indicative of a deeper problem with the design of your system... but sometimes that's the way it goes, so...
If you'd like to handle "killing" the process a little more gracefully, you could install a shutdown hook which is called when you hit Ctrl+C:
volatile boolean stop = false;
Runtime.getRuntime().addShutdownHook(new Thread("shutdown thread") {
public void run() {
stop = true;
}
});
then periodically check the stop variable.
A more elegant solution is to wait on an event:
boolean stop = false;
final Object event = new Object();
Runtime.getRuntime().addShutdownHook(new Thread("shutdown thread") {
public void run() {
synchronized(event) {
stop = true;
event.notifyAll();
}
}
});
// ... and in your polling loop ...
synchronized(event) {
while(!stop) {
// ... do JDBC access ...
try {
// Wait 30 seconds, but break out as soon as the event is fired.
event.wait(30000);
}
catch(InterruptedException e) {
// Log a message and exit. Never ignore interrupted exception.
break;
}
}
}
Or something like that.

Note that a Timer (or similar) would be better in that you could at least reuse it and let it deal with all of the details of sleeping, scheduling, exception handling, etc...
There are many reasons your app could die. Don't focus on just the one.
If it's even theoretically possible for your JDBC work to leave things in a half-correct state, then you have a bug you should fix. All of your DB work should be in a transaction. It should go or not go.

This is Java. Move your processing to a second thread. Now you can
Read from stdin in a loop. If someone types "QUIT", set the while flag to false and exit.
Create an AWT or Swing frame with a STOP button.
Pretend you are a Unix daemon and create a server socket. Wait for someone to open the socket and send "QUIT". (This has the added bonus that you can change the sleep to a select with timeout.)
There must be hundreds of variants on this.

Set up a signal handler for SIGTERM that sets a flag telling your loop to exit its next time through.

Regarding the question "The program could be in the middle of a JDBC call. How could I go about terminating it more gracefully?" - see How can I abort a running jdbc transaction?
Note that using a poll with sleep() is rarely the correct solution - implemented improperly, it can end up hogging CPU resources (the JVM thread-scheduler ends up spending inordinate amount of time sleeping and waking up the thread).

I've created a Service class in my current company's utility library for these kinds of problems:
public class Service implements Runnable {
private boolean shouldStop = false;
public synchronized void stop() {
shouldStop = true;
notify();
}
private synchronized boolean shouldStop() {
return shouldStop;
}
public void run() {
setUp();
while (!shouldStop()) {
doStuff();
sleep(60 * 1000);
}
}
private synchronized void sleep(long delay) {
try {
wait(delay);
} catch (InterruptedException ie1) {
/* ignore. */
}
}
}
Of course this is far from complete but you should get the gist. This will enable you to simply call the stop() method when you want the program to stop and it will exit cleanly.

If that's your application and you can modify it, you can:
Make it read a file
Read for the value of a flag.
When you want to kill it, you just modify the file and the application will exit gracefully.
No need to work any harder than that.

You could make the field a compound value that includes (conceptually) a process-ID and a timestamp. [Better yet, use two or more fields.] Start a thread in the process that owns access to the field, and have it loop, sleeping and updating the timestamp. Then a polling process that is waiting to own access to the field can observe that the timestamp has not updated in some time T (which is much greater than the time of the updating loop's sleep interval) and assume that the previously-owning process has died.
But this is still prone to failure.
In other languages, I always try to use flock() calls to synchronize on a file. Not sure what the Java equivalent is. Get real concurrency if you at all possibly can.

I'm surprised nobody mentioned the interrupt mechanism built into Java. It is meant to solve the problem of stopping a thread. All other solutions have at least one flaw, which is why this mechanism was implemented in the Java concurrency library.
You can stop a thread by calling interrupt() on it, but there are other ways that threads get interrupted. When this happens to a thread blocked in a method such as sleep(), an InterruptedException is thrown. That's why you have to handle it when calling sleep(), for example. That's where you can do cleanup and end gracefully, like closing the database connection.
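A small sketch of stopping a polling thread through interruption. Everything here is illustrative: the latch only exists so the demo interrupts the worker after its first poll, and the AtomicBoolean stands in for real cleanup such as closing a database connection.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

final class InterruptiblePoller {
    /** Starts a polling thread, interrupts it, and reports whether it cleaned up. */
    static boolean runAndInterrupt() {
        CountDownLatch firstPoll = new CountDownLatch(1);
        AtomicBoolean cleanedUp = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    firstPoll.countDown(); // stand-in for one database poll
                    Thread.sleep(60_000);  // wait for the next polling interval
                }
            } catch (InterruptedException e) {
                // The interrupt is the stop request: fall through to cleanup.
            } finally {
                cleanedUp.set(true);       // e.g. close the database connection
            }
        });
        worker.start();

        try {
            firstPoll.await();
            worker.interrupt();            // ask the worker to stop
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return cleanedUp.get() && !worker.isAlive();
    }
}
```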

Java 9 has another "potential" answer to this: Thread.onSpinWait():
Indicates that the caller is momentarily unable to progress, until the occurrence of one or more actions on the part of other activities. By invoking this method within each iteration of a spin-wait loop construct, the calling thread indicates to the runtime that it is busy-waiting. The runtime may take action to improve the performance of invoking spin-wait loop constructions.
See JEP 285 for more details.
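For illustration, a spin-wait loop using Thread.onSpinWait() might look like the sketch below (Java 9 or later; the class name is made up). Note that this hint targets very short busy-waits between threads; it is probably not a substitute for sleeping between database polls that are seconds apart.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class SpinWaitDemo {
    /** Busy-waits until another thread sets the flag, hinting the runtime while spinning. */
    static boolean waitForFlag() {
        AtomicBoolean ready = new AtomicBoolean(false);
        Thread setter = new Thread(() -> ready.set(true));
        setter.start();

        while (!ready.get()) {
            // Tell the runtime we are in a spin-wait loop.
            Thread.onSpinWait();
        }
        return ready.get();
    }
}
```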

I think you should poll it with a TimerTask instead.
My computer runs a while loop 1,075,566 times in 10 seconds.
That's about 107,557 times per second.
How often do you truly need to poll? A TimerTask runs at most about 1,000 times per second, because you schedule it with a period given in milliseconds. If you are content with that, you strain your CPU roughly 108 times less with that task.
If you would be happy polling once per second, that is 108 * 1000 = 108,000 times less strain. That also means you could check 108,000 values with the same CPU strain as your one while loop, because you don't make the CPU check as often. Remember that the CPU has a clock rate; mine is 3,600,000,000 hertz (cycles per second).
If your goal is to keep it updated for a user, you can run a check each time the user logs in (or let them ask for an update manually); that would put practically no strain on the CPU.
You can also use Thread.sleep(milliseconds); to lower the strain of the polling thread you were using (as it won't be polling as often).
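A minimal sketch of polling with java.util.Timer, under the assumption that each tick would perform one poll. The latch and the three-tick limit are there only to keep the example finite; a long-running program would leave the timer scheduled.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class TimerPolling {
    /** Schedules a repeating task and waits until it has run three times. */
    static boolean pollThreeTimes(long periodMillis) {
        CountDownLatch ticks = new CountDownLatch(3);
        Timer timer = new Timer("poller", true); // daemon thread

        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                ticks.countDown(); // stand-in for one poll of the database
            }
        }, 0L, periodMillis);

        try {
            return ticks.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.cancel(); // stop scheduling further polls
        }
    }
}
```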
