Are Thread.stop and friends ever safe in Java?


The stop(), suspend(), and resume() methods in java.lang.Thread are deprecated because they are unsafe. The Oracle-recommended workaround is to use Thread.interrupt(), but that approach doesn't work in all cases. For example, if you call a library method that doesn't explicitly or implicitly check the interrupted flag, you have no choice but to wait for the call to finish.
So, I'm wondering if it is possible to characterize situations where it is (provably) safe to call stop() on a Thread. For example, would it be safe to stop() a thread that did nothing but call find(...) or match(...) on a java.util.regex.Matcher?
(If there are any Oracle engineers reading this ... a definitive answer would be really appreciated.)
EDIT: Answers that simply restate the mantra that you should not call stop() because it is deprecated, unsafe, whatever are missing the point of this question. I know that it is genuinely unsafe in the majority of cases, and that if there is a viable alternative you should always use that instead.
This question is about the subset of cases where it is safe. Specifically, what is that subset?

Here's my attempt at answering my own question.
I think that the following conditions should be sufficient for a single thread to be safely stopped using Thread.stop():
The thread execution must not create or mutate any state (i.e. Java objects, class variables, external resources) that might be visible to other threads in the event that the thread is stopped.
The thread execution must not use notify to signal any other thread during its normal execution.
The thread must not start or join other threads, or interact with them using stop, suspend or resume.
(The term thread execution above covers all application-level code and all library code that is executed by the thread.)
The first condition means that a stopped thread will not leave any external data structures or resources in an inconsistent state. This includes data structures that it might be accessing (reading) within a mutex. The second condition means that a stoppable thread cannot leave some other thread waiting. But it also forbids use of any synchronization mechanism other than simple object mutexes.
A stoppable thread must have a way to deliver the results of each computation to the controlling thread. These results are created / mutated by the stoppable thread, so we simply need to ensure that they are not visible following a thread stop. For example, the results could be assigned to private members of the Thread object and "guarded" with a flag that is atomically set by the thread to say it is "done".
EDIT: These conditions are pretty restrictive. For example, for a "regex evaluator" thread to be safely stopped, we must guarantee that the regex engine does not mutate any externally visible state. The problem is that it might do, depending on how you implement the thread!
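A minimal sketch of the guarded-result idea, under stated assumptions: the names RegexWorker, resultOrNull and runAndWait are made up for illustration, and the sketch deliberately never calls Thread.stop(). It only shows how a result can stay invisible to the controller until a volatile "done" flag has been set.

```java
import java.util.regex.Pattern;

class GuardedResultDemo {
    /** Hypothetical worker: computes a regex match and publishes it via a flag. */
    static final class RegexWorker extends Thread {
        private final String regex;
        private final String input;
        private boolean matched;         // written only by this thread
        private volatile boolean done;   // set last; publishes 'matched'

        RegexWorker(String regex, String input) {
            this.regex = regex;
            this.input = input;
        }

        @Override
        public void run() {
            boolean m = Pattern.compile(regex).matcher(input).matches();
            matched = m;
            done = true; // volatile write: 'matched' is visible once 'done' is
        }

        /** Returns the result, or null if the worker never finished. */
        Boolean resultOrNull() {
            return done ? Boolean.valueOf(matched) : null;
        }
    }

    /** Runs a worker and waits up to timeoutMillis for it to finish. */
    static Boolean runAndWait(String regex, String input, long timeoutMillis) {
        RegexWorker worker = new RegexWorker(regex, input);
        worker.start();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.resultOrNull();
    }

    public static void main(String[] args) {
        System.out.println(runAndWait("a+b", "aaab", 5000));
    }
}
```

A controller that gave up on the worker would simply see null from resultOrNull() and discard the worker object, so any half-written result never escapes.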
1) The Pattern.compile(...) methods might update a static cache of compiled patterns, and if they did they would (should) use a mutex to do it. (Actually, the OpenJDK 6.0 version doesn't cache Patterns, but Sun might conceivably change this.)
2) If you try to avoid 1) by compiling the regex in the control thread and supplying a pre-instantiated Matcher, then the regex thread does mutate externally visible state.
In the first case, we would probably be in trouble. For example, suppose that a HashMap was used to implement the cache and that the thread was stopped while the HashMap was being reorganized.
In the second case, we would be OK provided that the Matcher had not been passed to some other thread, and provided that the controller thread didn't try to use the Matcher after stopping the regex matcher thread.
So where does this leave us?
Well, I think I have identified conditions under which threads are theoretically safe to stop. I also think that it is theoretically possible to statically analyse the code of a thread (and the methods it calls) to see if these conditions will always hold. But, I'm not sure if this is really practical.
Does this make sense? Have I missed something?
EDIT 2
Things get a bit more hairy when you consider that the code that we might be trying to kill could be untrusted:
We can't rely on "promises"; e.g. annotations on the untrusted code that it is either killable, or not killable.
We actually need to be able to stop the untrusted code from doing things that would make it unkillable ... according to the identified criteria.
I suspect that this would entail modifying JVM behaviour (e.g. implementing runtime restrictions on what threads are allowed to lock or modify), or a full implementation of the Isolates JSR. That's beyond the scope of what I was considering as "fair game".
So let's rule the untrusted code case out for now. Or at least, acknowledge that malicious code can do things to render itself not safely killable, and put that problem to one side.

The lack of safety comes from the idea of critical sections:
Take mutex
do some work, temporarily while we work our state is inconsistent
// all consistent now
Release mutex
If you blow away the thread and it happened to be in a critical section, then the object is left in an inconsistent state, which means it is not safely usable from that point.
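The critical-section problem can be sketched in a few lines. This is an illustration only: the Accounts class is hypothetical, and an ordinary exception thrown mid-method stands in for the asynchronous ThreadDeath that stop() would deliver. The point it shows is that the monitor is released as the exception propagates, while the half-finished update stays behind.

```java
class BrokenInvariantDemo {
    /** Two balances whose sum should always be 100 outside the critical section. */
    static final class Accounts {
        private int a = 100;
        private int b = 0;

        synchronized void transfer(int amount, boolean dieMidway) {
            a -= amount;                       // state is now inconsistent
            if (dieMidway) {
                // Stand-in for an asynchronous ThreadDeath arriving here.
                throw new IllegalStateException("simulated stop()");
            }
            b += amount;                       // consistent again
        }

        synchronized int total() {
            return a + b;
        }
    }

    /** Returns the total after a transfer that is aborted halfway through. */
    static int totalAfterAbortedTransfer() {
        Accounts accounts = new Accounts();
        try {
            accounts.transfer(30, true);
        } catch (IllegalStateException simulated) {
            // The monitor was released as the exception propagated.
        }
        return accounts.total(); // the lock is free, but the invariant is broken
    }

    public static void main(String[] args) {
        System.out.println(totalAfterAbortedTransfer()); // prints 70
    }
}
```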
For it to be safe to kill the thread you need to understand the entire processing of whatever is being done in that thread, to know that there are no such critical sections in the code. If you are using library code, then you may not be able to see the source and know that it's safe. Even if it's safe today it may not be tomorrow.
(Very contrived) Example of possible unsafety. We have a linked list, it's not cyclic. All the algorithms are really zippy because we know it's not cyclic. During our critical section we temporarily introduce a cycle. We then get blown away before we emerge from the critical section. Now all the algorithms using the list loop forever. No library author would do that surely! How do you know? You cannot assume that code you use is well written.
In the example you point to, it's surely possible to write the required functionality in an interruptible way. More work, but possible to be safe.
I'll take a flyer: there is no documented subset of Objects and methods that can be used in cancellable threads, because no library author wants to make the guarantees.

Maybe there's something I don't know, but as java.sun.com said, it is unsafe because anything this thread is handling is at serious risk of being damaged. Other objects, connections, opened files... for obvious reasons, like "don't shut down your Word without saving first".
For this find(...) example, I don't really think it would be a catastrophe to simply kick it away with an unsubtle .stop()...

A concrete example would probably help here. If anyone can suggest a good alternative to the following use of stop I'd be very interested. Re-writing java.util.regex to support interruption doesn't count.
import java.util.regex.*;
import java.util.*;

public class RegexInterruptTest {
    private static class BadRegexException extends RuntimeException { }

    // The statements below need an enclosing method in order to compile.
    public static void main(String[] args) {
        final Thread mainThread = Thread.currentThread();
        TimerTask interruptTask = new TimerTask() {
            public void run() {
                System.out.println("Stopping thread.");
                // Doesn't work:
                // mainThread.interrupt();
                // Does work but is deprecated and nasty.
                // Note: Thread.stop(Throwable) throws UnsupportedOperationException
                // since Java 8 and was removed in Java 11.
                mainThread.stop(new BadRegexException());
            }
        };
        Timer interruptTimer = new Timer(true);
        interruptTimer.schedule(interruptTask, 2000L);
        String s = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab";
        String exp = "(a+a+){1,100}";
        Pattern p = Pattern.compile(exp);
        Matcher m = p.matcher(s);
        try {
            System.out.println("Match: " + m.matches());
            interruptTimer.cancel();
        } catch (BadRegexException bre) {
            System.out.println("Oooops");
        } finally {
            System.out.println("All over");
        }
    }
}
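One possible alternative, offered as a sketch rather than a definitive answer: leave java.util.regex alone and wrap the input in a CharSequence that checks the interrupt flag each time a character is read. The names InterruptibleCharSequence, MatchAbortedException and matchesOrNull are invented here. The approach assumes that the matcher reads its input through CharSequence.charAt, which is how the JDK's engine behaves in practice but is not something the API documentation promises.

```java
import java.util.regex.Pattern;

class InterruptibleRegexDemo {
    /** Unchecked exception used to abandon a match; a made-up name. */
    static final class MatchAbortedException extends RuntimeException { }

    /** Wraps the input and checks the interrupt flag on every character read. */
    static final class InterruptibleCharSequence implements CharSequence {
        private final CharSequence inner;

        InterruptibleCharSequence(CharSequence inner) {
            this.inner = inner;
        }

        @Override
        public char charAt(int index) {
            if (Thread.currentThread().isInterrupted()) {
                throw new MatchAbortedException();
            }
            return inner.charAt(index);
        }

        @Override
        public int length() {
            return inner.length();
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            return new InterruptibleCharSequence(inner.subSequence(start, end));
        }

        @Override
        public String toString() {
            return inner.toString();
        }
    }

    /** Returns the match result, or null if the match was abandoned. */
    static Boolean matchesOrNull(String regex, String input) {
        try {
            return Pattern.compile(regex)
                    .matcher(new InterruptibleCharSequence(input))
                    .matches();
        } catch (MatchAbortedException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(matchesOrNull("a+b", "aaab"));
        Thread.currentThread().interrupt();   // pretend a timer interrupted us
        System.out.println(matchesOrNull("a+b", "aaab"));
        Thread.interrupted();                 // clear the flag again
    }
}
```

With this wrapper, the timer task in the test above could call mainThread.interrupt() instead of stop(), and the match would be abandoned at the next character read.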

There are ways to use Thread.stop() in a relatively stable way without leaking memory or file descriptors (FDs are exceptionally leak prone on *NIX), but you should rely on it only if you are forced to manage 3rd party code. Never use it to achieve the result if you have control over the code itself.
I use Thread.stop along with interrupt() and some more hacks, like adding custom logging handlers to re-throw the trapped ThreadDeath, adding an UncaughtExceptionHandler, running in your own ThreadGroup (sync over 'em), etc...
But that deserves an entire new topic.
But in this case it's the Java Designers telling you; and they're more authoritative on their language than either of us :)
Just a note: quite a few of them are pretty clueless

If my understanding is right, the problem has to do with synchronization locks not being released as the generated ThreadDeath propagates up the stack.
Taking that for granted, it's inherently unsafe because you can never know whether or not any "inner method call" you happened to be in at the very moment stop() was invoked and took effect was holding some synchronization lock, and then what the Java engineers say is, seemingly, unequivocally right.
What I personally don't understand is why it should be impossible to release any synchronization lock as this particular type of exception propagates up the stack, thereby passing all the '}' method/synchronization block delimiters, which do cause any locks to be released for any other type of exception.
I have a server written in java, and if the administrator of that service wants a "cold shutdown", then it is simply NECESSARY to be able to stop all running activity no matter what. Consistency of any object's state is not a concern because all I'm trying to do is to EXIT. As fast as I can.

There is no safe way to kill a thread.
Nor is there a subset of situations where it is safe. Even if it is working 100% while testing on Windows, it may corrupt JVM process memory under Solaris or leak thread resources under Linux.
One should always remember that underneath the Java Thread there is a real, native, unsafe thread.
That native thread works with native, low-level, data and control structures. Killing it may leave those native data structures in an invalid state, without a way to recover.
There is no way for the Java machine to take all possible consequences into account, as the thread may allocate/use resources not only within the JVM process, but within the OS kernel as well.
In other words, if the native thread library doesn't provide a safe way to kill() a thread, Java cannot provide any guarantees better than that. And all native implementations known to me state that killing a thread is a dangerous business.

All forms of concurrency control can be provided by the Java synchronization primitives by constructing more complex concurrency controls that suit your problem.
The reasons for deprecation are clearly given in the link you provide. If you're willing to accept the reasons why, then feel free to use those features.
However, if you choose to use those features, you also accept that support for those features could stop at any time.
Edit: I'll reiterate the reason for deprecation as well as how to avoid them.
Since the only danger is that objects that can be referenced by the stopped thread could be corrupted, simply clone the String before you pass it to the Thread. If no shared objects exist, the threat of corrupted objects in the program outside the stopped Thread is no longer there.

Related

Is there a way to know all possible places in code where the system may interchange threads

I'm reading a book called "Java Concurrency In Practice" and in the first chapter the following code is demonstrated as thread unsafe
public class UnsafeSequence {
private int value;
/** Returns a unique value. */
public int getNext() {
return value++;
}
}
So if two threads run this code we can get unwanted results because they will interchange in different steps such as reading, modifying and writing the value. Is this determined only by OS, or do threads switch between each other on different "bytecode commands" for example? Is there any way to know all possible places where threads might switch from one to another, not just for this code but in general?

As several comments note, no. Two things you can do:
Write your classes in a thread-safe manner, so that thread scheduling isn't an issue.
Use concurrency support to prevent issues.
Keep reading the book.
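As a rough illustration of the "use concurrency support" advice, here is one way the counter could be made safe regardless of where threads are switched. The class name SafeSequence and the hammer helper are made up for this sketch; the book itself may present a different fix.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SafeSequenceDemo {
    /** Thread-safe counterpart of UnsafeSequence; the name is made up here. */
    static final class SafeSequence {
        private final AtomicInteger value = new AtomicInteger();

        int getNext() {
            return value.getAndIncrement(); // atomic read-modify-write
        }
    }

    /** Calls getNext() from several threads and returns the next value afterwards. */
    static int hammer(int threads, int callsPerThread) {
        SafeSequence seq = new SafeSequence();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    seq.getNext();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seq.getNext();
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 10_000)); // prints 40000
    }
}
```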

Is there any way to know all possible places where threads might switch from one to another, not just for this code but in general?
This question is a bit vague. Let me split it up into two parts:
Two threads can wander over the same piece of code and happily interleave, except:
inside atomic operations (including complex operations inside of thread-safe classes)
inside guarded blocks (e.g. using a synchronized block, lock, semaphore, or some other memory fence)
Threads can switch all the time, which is 100% up to the OS. In theory a thread might even never get a chance to be 'scheduled in' again if the OS decides so. Threads may die spuriously (e.g. killed in ProcessExplorer). You never know when a thread will be stopped in its tracks (suspended), but you do know that if it happens inside an atomic operation, no other thread will enter that code until the suspended thread resumes and completes the operation.

It happens whenever the system scheduler feels like. It has nothing to do with the JVM if the JVM only passes that scheduling to the native processor.

Why would you use both a boolean AND interrupt() to signal thread termination?

I was reading about ways to end a thread in https://docs.oracle.com/javase/7/docs/technotes/guides/concurrency/threadPrimitiveDeprecation.html, which is a link I found from the question How do you kill a thread in Java?
In the first link, they first discuss using a volatile variable to signal to the thread that you want to terminate. The thread is supposed to check this variable and cease operation if the variable has a value that means cease (e.g. if it is null). So to terminate the thread, you would set that variable to null.
Then they discuss adding interrupts to help with threads that block for long periods of time. They give the following example of a stop() method that sets the volatile variable (waiter) to null and then ALSO throws an interrupt.
public void stop() {
Thread moribund = waiter;
waiter = null;
moribund.interrupt();
}
I am just wondering, why would you need both? Why not ONLY use interrupt(), and then handle it properly? It seems redundant to me.

(First part of this is in general, arguably I was not paying attention to the specifics of the question. Skip to the end for the part that addresses the techspec discussed in the question.)
There is no good technical reason. This is partly about human limitations and partly about confusing api design.
First consider application developers’ priority is creating working code that solves business problems. Thoroughly learning low level apis like this gets lost in the rush to get work done.
Second there’s a tendency when you’re learning things to get to a good enough state and leave it there. Things like threading and exception handling are back-of-the-book topics that get neglected. Interruption is a back of the book topic for threading books.
Now imagine a codebase worked on by multiple people with varying skill levels and attention to detail, who may forget that throwing InterruptedException from wait and sleep resets the interrupt flag, or that interrupted isn’t the same as isInterrupted, or that InterruptedIOException resets the interrupt flag too. If you have a catch block that catches IOException, you may miss that InterruptedIOException is a subclass of IOException, and you could be missing out on restoring the interrupt flag. Probably people in a hurry decided: to hell with it, I can’t count on this interrupted flag.
Is it right? No.
The hand rolled flag doesn’t help with short circuiting wait or sleep the way interruption does.
The Java 5 concurrency tools expect tasks to use interruption for cancellation. Executors expect tasks submitted to them to check for interruption in order to quit gracefully. Your tasks may use other components, like a blocking queue. That queue needs to be able to respond to interruption, and the thing using it needs to be aware of the interruption. The hand rolled flag doesn’t work for this since the Java 5 classes can’t know about it.
Having to use interruption because you’re using tools that expect it, but not having confidence in the flag value due to unmanageable technicalities, would lead to this kind of code.
(end of rant, now actually responding to the specific techspec example)
OK, looking at this techguide article in particular. Three things stand out:
1) it's not making use of interruption at all, except to cut the sleep time short. Then it just squelches the exception, and doesn't bother to restore the flag or check it. You could use the InterruptedException to terminate by catching it outside the while loop, but that's not what this does. This seems like a strange way to do this.
2) as the example is fleshed out it becomes clear the flag is being used to turn the waiting on and off. Somebody might use interruption for this but it's not idiomatic. So having a flag is ok here.
3) this is a toy example in a techguide. Not all the Oracle content is as authoritative as the techspecs or the API documentation. Some tutorials have misstatements or are incomplete. It might be the reason the code was written like this was that the author figured readers would not be familiar with how interruption worked and it was better to minimize its usage. Technical writers have been known to make choices like that.
If I rewrote this to use interruption I would still keep the flag; I'd use interrupt to terminate, and use the flag for suspend/resume functionality.
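A sketch of that split, with interruption used for termination and a flag used only for suspend/resume. This is not the techguide's code; PausableWorker and its method names are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class PausableWorkerDemo {
    static final class PausableWorker implements Runnable {
        private final Object lock = new Object();
        private boolean paused;                       // guarded by 'lock'
        final AtomicInteger ticks = new AtomicInteger();

        void pause() {
            synchronized (lock) {
                paused = true;
            }
        }

        void unpause() {
            synchronized (lock) {
                paused = false;
                lock.notifyAll();
            }
        }

        @Override
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    synchronized (lock) {
                        while (paused) {
                            lock.wait();              // the flag only pauses
                        }
                    }
                    ticks.incrementAndGet();
                    Thread.sleep(5);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // interruption terminates
            }
        }
    }

    /** Starts a worker, pauses it, then interrupts it; true if it terminated. */
    static boolean interruptStopsPausedWorker() {
        PausableWorker worker = new PausableWorker();
        Thread t = new Thread(worker);
        t.start();
        worker.pause();
        t.interrupt();
        try {
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(interruptStopsPausedWorker());
    }
}
```

Because wait() and sleep() both respond to interruption, the worker terminates whether it is paused, sleeping, or between iterations when interrupt() arrives.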

Please see this documentation.
Your thread should check thread.isInterrupted() status, for example:
Thread myThread = new Thread() {
    @Override
    public void run() {
        while (!this.isInterrupted()) {
            System.out.println("I'm working");
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                // We would also like to exit from the thread
                return;
            }
        }
    }
};
And when you would like to stop the thread, you should invoke
myThread.interrupt();
Besides that, we can use the static method Thread.interrupted(), which also checks the status but then clears it, so you have to invoke myThread.interrupt() again to set the status again. But I don't recommend using the Thread.interrupted() method.
This approach helps gracefully stop the thread.
So I also do not see any reasons to use an additional volatile variable.
The same behavior can be reached via an additional volatile flag as in @lexicore's answer, but I think it is redundant code.
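The difference between isInterrupted() and the static Thread.interrupted() mentioned in this answer can be seen in a small sketch (the class and method names are made up for illustration):

```java
class InterruptFlagDemo {
    /** Returns the three observations made after interrupting the current thread. */
    static boolean[] observe() {
        Thread self = Thread.currentThread();
        self.interrupt();                         // set the flag
        boolean first = self.isInterrupted();     // true, flag left set
        boolean second = Thread.interrupted();    // true, flag cleared
        boolean third = self.isInterrupted();     // false
        return new boolean[] { first, second, third };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(observe())); // prints [true, true, false]
    }
}
```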

@NathanHughes is completely right, but I am going to rephrase his long answer in a few words: third-party code.
This is not just about "dumb junior developers": at some point in an application's life you will be using lots of third-party libraries. Those libraries will not gentlemanly respect your assumptions about concurrency. Sometimes they will silently reset the interruption flag. Using a separate flag solves that.
No, you can not get rid of third-party code and call it a day.
Suppose that you are writing a library, for example ThreadPoolExecutor. Some of the code inside your library needs to handle interruption… or even unconditionally reset the interruption flag. Why? Because the previous Runnable is done, and a new Runnable is on the way. Unfortunately, at that point there may be a stale interruption flag that was aimed… wait, whom was it for again? It could have been addressed to the previous Runnable or to the new (not yet running) Runnable; there is no way to tell. And this is why you add an isCancelled method to FutureTask, and unconditionally reset the interruption flag before executing a new Runnable. Sounds familiar?
Thread#interrupt is completely detached from the actual work units running on that thread, so adding an additional flag is necessary. And once you have started doing so, you have to do that all the way, up to the outermost and down to the innermost work unit. In effect, running unknown user-supplied code on your threads makes Thread#interrupted unreliable (even if Thread#interrupt still works fine!).
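To make the argument concrete, here is a small sketch in which a stand-in for third-party code swallows interruption, so only the separate flag reliably ends the task. CancellableTask and sloppyLibraryCall are hypothetical names; no real library is being modelled.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class CancelFlagDemo {
    static final class CancellableTask implements Runnable {
        private final AtomicBoolean cancelled = new AtomicBoolean();
        private volatile Thread runner;

        void cancel() {
            cancelled.set(true);           // survives a swallowed interrupt
            Thread t = runner;
            if (t != null) {
                t.interrupt();             // still useful to cut blocking short
            }
        }

        /** Stand-in for third-party code that swallows interruption. */
        private static void sloppyLibraryCall() {
            try {
                Thread.sleep(20);
            } catch (InterruptedException ignored) {
                // the flag is cleared here and never restored
            }
        }

        @Override
        public void run() {
            runner = Thread.currentThread();
            while (!cancelled.get()) {
                sloppyLibraryCall();
            }
        }
    }

    /** Starts the task, cancels it, and reports whether its thread ended. */
    static boolean cancelStopsTask() {
        CancellableTask task = new CancellableTask();
        Thread t = new Thread(task);
        t.start();
        task.cancel();
        try {
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(cancelStopsTask());
    }
}
```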

How to Junit test a synchronized object not accessed by two threads at same time?

Is there any way I can make a Junit test to make sure that a synchronized object (in my case HashMap in synchronized block) is not accessed by two threads simultaneously? e.g. forcing two threads to try to access and having exception thrown.
Thanks for your help!

The best framework I've seen to help with thread testing is Thread Weaver. At the very least it offers some deterministic way of thread scheduling, and a limited (yet useful) way of trying to find race conditions.
You can even code up some more intricate thread scheduling scenarios, but those tests will inevitably be white box tests. Still, those can have their use too.

Is there any way I can make a Junit test to make sure that a synchronized object (in my case HashMap in synchronized block) is not accessed by two threads simultaneously?
I'm not sure there is a testing framework to test this but you can certainly write some code that tries to access the protected HashMap over and over with many threads. Unfortunately this is very hard to do reliably since, as @Bohemian mentions, there is no way to be sure how threads run and access the map, especially in concert.
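Such a hammering test might look roughly like the sketch below (the class name and the run helper are made up). As the answer says, a passing run does not prove the locking is correct; it only gives a race a chance to show up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

class SynchronizedMapStress {
    /** Many threads insert distinct keys under one lock; returns the final size. */
    static int run(int threads, int keysPerThread) {
        Map<Integer, Integer> map = new HashMap<>();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int base = i * keysPerThread;
            workers[i] = new Thread(() -> {
                try {
                    start.await();               // release all threads together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int k = 0; k < keysPerThread; k++) {
                    synchronized (map) {
                        map.put(base + k, k);
                    }
                }
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        synchronized (map) {
            return map.size();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(8, 5_000)); // prints 40000
    }
}
```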
e.g. forcing two threads to try to access and having exception thrown. Thanks for your help!
Yeah this won't happen for 2 reasons. As mentioned, there is no "forcing" of threads. You just don't have that level of control. Also, threads do not throw exceptions because of synchronization problems unless you are doing something other than synchronized(hashMap) { ... }. When a thread is holding the lock on the map, other threads will block until it releases the lock. This is hard to detect and control. If you add code to do the detection and thread control then you get into a Heisenberg situation where you will be affecting the thread behavior because of your monitoring code.
Testing proper synchronization is very difficult and often impossible to do. Reviewing code with other developers to make sure that your HashMap is fully synchronized every time it is used may be more productive.
Lastly, if you are worried about the HashMap then you should maybe consider moving to ConcurrentHashMap or Collections.synchronizedMap(new HashMap). These take care of the synchronization and protection of the actual map for you, although they don't handle race conditions if you are making multiple map calls within one operation. Btw, Hashtable is considered an old class and should not be used.
Hope this helps.
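On the caveat about multiple map calls within one operation: a get() followed by a put() is two separate calls even on a ConcurrentHashMap, so updates can be lost between them. A minimal sketch of doing the same work as one atomic call (the class name and the "hits" key are invented for the example):

```java
import java.util.concurrent.ConcurrentHashMap;

class AtomicCompoundOpDemo {
    /** Increments one counter from several threads and returns the final count. */
    static int countConcurrently(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    // merge() performs the read-modify-write as one atomic step.
                    counts.merge("hits", 1, Integer::sum);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 10_000)); // prints 40000
    }
}
```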

Essentially, you can't, because you can't control when threads are scheduled, let alone coordinate them to test a particular behaviour.
Secondly, not all build servers are multi-threaded (I got bitten by this only a couple of days ago - cheap AWS instances have only 1 CPU), so you can't rely on even having more than one thread available to test with.
Try to refactor your code so the locking part is separated from your application and test that logic in isolation... if you can.

As I understand, you have a code similar to this one
synchronized(myHashMap) {
...
}
... which means that a thread acquires the lock provided by myHashMap when it enters synchronized block and all other threads that try to enter the same block have to wait, i.e. no other thread can acquire the same lock.
Is there any way I can make a Junit test to make sure that a synchronized object (in my case HashMap in synchronized block) is not accessed by two threads simultaneously?
Knowing the above, why would you do that? If you still want to try then you might want to take a look at this answer.
And last, but not least: I'd recommend that you use ConcurrentHashMap rather than the older synchronized Hashtable.

Are Java native method calls atomic?

I need a certain operation to be atomic.
Here 'atomic' means "once the method has started execution, it shouldn't be interrupted until it finishes, and the thread scheduler also should not move the thread back to the Runnable state once it is in the Running state".
All methods are native methods and multiple threads are running concurrently to execute these methods.
First question: Is native method execution atomic in nature?
Second question: To achieve the atomicity, can we use the Java concurrent/lock APIs? If yes, please provide an example/link if possible.
Thanks.

If I understand the question right, you are asking: is it possible to ensure that a certain thread will always be running on the CPU until it is finished? To be clear, it should not be replaced by another thread during that time.
The answer is: even if it were possible (and I don't think it is), this would have nothing to do with atomicity. If you have multiple CPUs, multiple threads can still access and change the same data while all are truly running uninterrupted at the same time. Therefore, even with your definition of "atomicity", you would still end up with concurrency problems.
If you just want thread-safety in the conventional sense, then yes, you can use java-locks etc. to achieve that, even if you call native methods. see http://journals.ecs.soton.ac.uk/java/tutorial/native1.1/implementing/sync.html for an example of java thread synchronization from within native methods.
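For thread safety in the conventional sense, wrapping the call in a lock might look like the sketch below. Since no real native library is available here, nativeLikeUpdate is a plain-Java stand-in for a native method; all names are assumptions made for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

class GuardedNativeCallDemo {
    private final ReentrantLock lock = new ReentrantLock();
    private int state;

    /** Plain-Java stand-in for a native method that touches shared state. */
    private void nativeLikeUpdate() {
        state++;
    }

    void update() {
        lock.lock();
        try {
            nativeLikeUpdate();   // only one thread at a time gets here
        } finally {
            lock.unlock();        // always released, even on exceptions
        }
    }

    int state() {
        lock.lock();
        try {
            return state;
        } finally {
            lock.unlock();
        }
    }

    /** Calls update() from several threads and returns the final state. */
    static int run(int threads, int callsPerThread) {
        GuardedNativeCallDemo demo = new GuardedNativeCallDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    demo.update();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return demo.state();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // prints 40000
    }
}
```

The thread holding the lock can still be descheduled at any time; the lock only guarantees that no other thread enters the guarded section meanwhile.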

To prevent an interrupt you must use the machine code instructions cli and sti to disable and enable interrupts (on x86/x64). It is not something you can even do in plain C.
This is so low level that it is rarely done, largely because it's rarely very useful. The main concern is usually the behaviour of interaction with other threads, which is why atomic is defined this way for most use cases, not atomic in terms of interrupts.

Your definition of atomic is non-standard. This is a more standard definition.
Are native methods atomic (by your non standard definition)? Since I could at any point during the operation of a native method, yank the power cord and interrupt execution, I would say that native methods are incorrigibly non atomic.
Can we use the java concurrent lock APIs to achieve atomicity (non standard) in native methods? As stated previously, native methods are incorrigibly non atomic. Nothing can help.

What is the benefit of ThreadGroup in java over creating separate threads?

Many methods like stop(), resume(), suspend() etc are deprecated.
So is it useful to create threads using ThreadGroup?

Using ThreadGroup can be a useful diagnostic technique in big application servers with thousands of threads. If your threads are logically grouped together, then when you get a stack trace you can see which group the offending thread was part of (e.g. "Tomcat threads", "MDB threads", "thread pool X", etc), which can be a big help in tracking down and fixing the problem.

Don't use ThreadGroup for new code. Use the Executor stuff in java.util.concurrent instead.
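As a rough sketch of what "the Executor stuff" can look like when the goal is to name a set of workers and stop them together (the class name, pool size and thread names are arbitrary choices for the example):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ExecutorInsteadOfGroupDemo {
    /** Starts three long-running tasks, then shuts the pool down; true if it terminated. */
    static boolean runAndShutDown() {
        AtomicInteger counter = new AtomicInteger();
        // The thread factory gives every worker a recognisable name.
        ExecutorService pool = Executors.newFixedThreadPool(3, r -> {
            Thread t = new Thread(r, "worker-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        });
        for (int i = 0; i < 3; i++) {
            pool.submit(() -> {
                try {
                    Thread.sleep(60_000);                 // pretend to be long-running work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();   // exit promptly
                }
            });
        }
        pool.shutdownNow();                               // interrupts the running tasks
        try {
            return pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndShutDown());
    }
}
```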

This is somewhat complementary to the answer provided (6 years ago or so). While the Concurrency API provides a lot of constructs, the ThreadGroup might still be useful. It provides the following functionality:
Logical organisation of your threads (for diagnostic purposes).
You can interrupt() all the threads in the group. (Interrupting is perfectly fine, unlike suspend(), resume() and stop()).
You can set the maximum priority of the threads in the group. (Not sure how widely useful that is, but there you have it.)
You can set the ThreadGroup as a daemon. (So all new threads added to it will be daemon threads.)
It allows you to override its uncaughtExceptionHandler so that if one of the threads in the group throws an Exception, you have a callback to handle it.
It provides you some extra tools such as getting the list of threads, how many active ones you have etc. Useful when having a group of worker threads, or some thread pool of some kind.
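Two of the features listed above, the group-wide interrupt() and the overridable uncaughtException callback, can be sketched as follows. CountingGroup and the thread names are made up for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadGroupDemo {
    /** Group that counts uncaught exceptions instead of printing them. */
    static final class CountingGroup extends ThreadGroup {
        final AtomicInteger failures = new AtomicInteger();

        CountingGroup(String name) {
            super(name);
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            failures.incrementAndGet();
        }
    }

    /** True if the failure was reported and the group interrupt ended the sleeper. */
    static boolean run() {
        CountingGroup group = new CountingGroup("workers");
        Thread sleeper = new Thread(group, () -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // interrupted via the group: just exit
            }
        }, "sleeper");
        Thread failing = new Thread(group, () -> {
            throw new IllegalStateException("boom");
        }, "failing");
        sleeper.start();
        failing.start();
        try {
            failing.join();
            group.interrupt();      // interrupts every live thread in the group
            sleeper.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return group.failures.get() == 1 && !sleeper.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```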

The short answer is - no, not really. There's little if any benefit to using one.
To expand on that slightly, if you want to group worker threads together you're much better off using an ExecutorService. If you want to quickly count how many threads in a conceptual group are alive, you still need to check each Thread individually (as ThreadGroup.activeCount() is an estimation, meaning it's not useful if the correctness of your code depends on its output).
I'd go so far as to say that the only thing you'd get from one these days, aside from the semantic compartmentalisation, is that Threads constructed as part of a group will pick up the daemon flag and a sensible name based on their group. And you would be using this as a shortcut for filling in a few primitives in a constructor call (which typically you'd only have to write once anyway, since you're probably starting the threads in a loop and/or method call).
So - I really don't see any compelling reason to use one at all. I specifically tried to, a few months back, and failed.
EDIT - I suppose one potential use would be if you're running with a SecurityManager, and want to assert that only threads in the same group can interrupt each other. Even that's pretty borderline, as the default implementation always returns true for a Thread in any non-system thread group. And if you're implementing your own SecurityManager, you've got the possibility to have it make its decision on any other criteria (including the typical technique of storing Threads in collections as they get created).

Great answer from @skaffman. I want to add one more advantage:
Thread groups help you manipulate all the threads defined in them at once.
