Java: Monitor the acquisition of Locks

I am currently experimenting a lot with Java's security mechanisms in order to understand how best to execute untrusted code in a sandbox. One of the things you want to protect against is infinite loops, which is why you ideally want to run the untrusted code in its own thread. Now, of course, malicious code could, for example, do some heavy processing that results in a hanging thread. Essentially the only way to get rid of such a thread is Java's deprecated Thread.stop() mechanism. The main reason this is problematic is that all of the locks held by the thread are released as it unwinds, so the objects those locks were protecting may be left in a corrupted, half-updated state.
The question is: with Java's SecurityManager and custom classloaders I am able to track, for example, which classes can be loaded and which system resources can be accessed. Is there a way to be informed when code acquires locks, and potentially to prohibit it (for example, by defining a callback that is invoked before the current thread enters a synchronized block)?

If you are already using a custom classloader, you could inspect the bytecode of each class before loading it and detect whether it contains an instruction that grabs a lock (monitorenter). Note that synchronized methods do not contain monitorenter; they are marked with the ACC_SYNCHRONIZED flag instead, so that flag has to be checked as well.
You should also consider that locks released by stop() are only a problem if they were acquired on shared objects that other code might also lock. If you can keep such objects inaccessible to the "evil" thread, you're safe.
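Scanning for monitorenter needs a class-file parser or a bytecode library, which is beyond a short snippet. As a smaller complement, here is a minimal sketch that uses plain reflection to report methods declared as synchronized. It assumes the class has already been loaded, and the class and method names (SynchronizedMethodScan, Sample) are made up for illustration. It does not see synchronized blocks inside method bodies.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

final class SynchronizedMethodScan {

    /** Returns the names of the declared methods that carry the synchronized modifier. */
    static List<String> synchronizedMethods(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (Modifier.isSynchronized(m.getModifiers())) {
                names.add(m.getName());
            }
        }
        return names;
    }

    /** Hypothetical sample class used only to exercise the scan. */
    static final class Sample {
        synchronized void locked() { }

        // Uses a synchronized block: this compiles to monitorenter/monitorexit
        // and is NOT reported by the reflection-based scan above.
        void blockLocked() {
            synchronized (this) { }
        }

        void plain() { }
    }
}
```

Calling SynchronizedMethodScan.synchronizedMethods(SynchronizedMethodScan.Sample.class) reports only the method declared with the synchronized modifier, which is why the bytecode check is still needed for blocks.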

Related

Possibility of exception after locking lock but before try-finally

I'm wondering if given code like
lock.lock();
try {
    count++;
} finally {
    lock.unlock();
}
is there any chance that the executing thread can somehow be terminated after executing the lock method but before entering the try-finally block? That would result in a lock that is taken but never freed. Is there some line in the Java/JVM specification that gives us any assurance that, if the code is written using this idiom, there is no chance of leaving a lock locked forever?
My question is inspired by an answer to the C#-related question "Can Monitor.Enter throw an exception?", which references two posts on MSDN
https://blogs.msdn.microsoft.com/ericlippert/2007/08/17/subtleties-of-c-il-codegen/
https://blogs.msdn.microsoft.com/ericlippert/2009/03/06/locks-and-exceptions-do-not-mix/
about problems with code like
Monitor.Enter(...)
try
{
    ...
}
finally
{
    Monitor.Exit(..)
}
namely that, in the case of C#, there is a small chance that execution never reaches the try-finally block, depending on the machine code generated by the JIT.
I'm aware that this may be seen as a contrived question or nitpicking, but my curiosity got the better of me.

Is there some line in Java/JVM specification that gives us any
assurance that if the code is written using that idiom there is no
chance of leaving locked-forever lock?
First of all, I want to note that Java has structured and unstructured locks. Structured locks are those where the locked code is enclosed in some structure (a block) and the lock is released automatically at the end of that structure. These are synchronized methods and synchronized blocks. In Java you get a monitorenter (https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.monitorenter) instruction directly in the bytecode only when you use a synchronized block. Apart from the NullPointerException it throws for a null reference, a failing monitorenter instruction is a fatal error that will cause the JVM to stop. With a synchronized method there are no monitorenter and monitorexit instructions in the compiled bytecode; instead the method is marked for the JVM as synchronized, and the JVM does the job itself. So in this case too, if something goes wrong, it is a fatal JVM error. So here the answer to your question is no, because synchronized blocks and methods are handled natively by the JVM, and a failure there will bring down the whole JVM.
Now let's talk about unstructured locks. These are locks where you have to take care of locking and unlocking yourself by calling the lock's methods directly. This gives you the freedom to build more complex constructions such as chain (hand-over-hand) locking. Again the answer to your question is no: it is entirely possible for an exception to be thrown in the lock method, and it is also entirely possible to end up with a livelock or deadlock. All of this is possible because unstructured locking is a purely programmatic Java concept; the JVM knows nothing about unstructured locks. Lock in Java is an interface (https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Lock.html), and you can use the out-of-the-box implementations such as ReentrantLock and ReentrantReadWriteLock, or write your own. In real life, if you are using ReentrantLock for example, there is almost no chance of getting an exception there. Even static analyzers will tell you that there is no point in ReentrantLock where an exception (checked or unchecked) can be thrown. What is possible is getting an Error (https://docs.oracle.com/javase/7/docs/api/java/lang/Error.html) there. And again we arrive at a fatal JVM failure, after which you won't need any locks. Instead of the monitorenter bytecode, ReentrantLock and almost all other out-of-the-box Java locks use AbstractQueuedSynchronizer (https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/AbstractQueuedSynchronizer.html). So you can be sure that it is completely programmatic and that the JVM knows almost nothing about it.
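To make the idiom from the question concrete, here is a minimal sketch (the class name GuardedSection is made up) showing that an ordinary exception thrown inside the try block does not leave a ReentrantLock held. It says nothing about an asynchronous Thread.stop() arriving between lock() and try; it only covers exceptions raised by the guarded code itself.

```java
import java.util.concurrent.locks.ReentrantLock;

final class GuardedSection {
    private final ReentrantLock lock = new ReentrantLock();

    /** Runs the body while holding the lock; finally releases it even if the body throws. */
    void run(Runnable body) {
        // Acquire outside the try: if lock() itself failed, there would be nothing to unlock.
        lock.lock();
        try {
            body.run();
        } finally {
            lock.unlock();
        }
    }

    /** True if any thread currently holds the lock. */
    boolean isLocked() {
        return lock.isLocked();
    }
}
```

After run(...) propagates an exception from the body, isLocked() reports false again, because the finally block ran during unwinding.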
Now from an architecture perspective. If some implementation throws an unexpected exception inside the lock method and the lock is still available for further use afterwards, it may be better to end up with a lock that is held forever than with a lock whose internal state is broken. Such a lock is no longer safe to use, and nobody can guarantee that further locking will be correct, because you already have at least one precedent of incorrect behaviour. Any unexpected exception in a lock should be treated as an issue whose root cause needs deep investigation before the lock is used again. A lock that stays held will prevent other threads from using it and, more importantly, the system will preserve its correct state. Then of course one day somebody will m... Concurrent computations in general are mostly about correctness.
Now about this question:
is there any chance that executing thread can be somehow terminated
after executing lock method but before entering try-finally block?
The answer to this is, of course, yes. You can even suspend the thread holding the lock, or just call sleep, so that other threads won't be able to acquire it. This is how locking algorithms work, and there is nothing we can do about it. It would be classified as a bug. Lock-free algorithms for two or more threads are not vulnerable to such cases. Concurrent programming is not a simple thing. There are a lot of things you have to think about during design, and even then you won't avoid every failure.
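The point about a stalled lock holder can be sketched as follows. This is only an illustration with made-up names (HeldLockDemo): one thread takes a ReentrantLock and then blocks while still owning it, and a non-blocking tryLock() from another thread fails for as long as that is the case.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

final class HeldLockDemo {

    /** Returns what tryLock() reports on the calling thread while another thread holds the lock. */
    static boolean tryLockWhileHeldElsewhere() {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch acquired = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                acquired.countDown();
                release.await();          // the holder blocks while still owning the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        holder.start();

        try {
            acquired.await();             // wait until the holder really owns the lock
            boolean got = lock.tryLock(); // non-blocking attempt from this thread
            if (got) {
                lock.unlock();
            }
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();          // let the holder finish and unlock
        }
    }
}
```

Using tryLock() (or its timed variant) is one way for the other threads to notice the situation instead of blocking indefinitely in lock().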

Why can't I directly access (and lock) the implicit lock that Objects use for synchronized block

rephrased for clarity
I would like to be able to mix the use of the synchronized block with more explicit locking by calling lock and release methods directly when appropriate. That would give me the syntactic sugar of synchronized(myObject) when I can get away with it, while still letting me call myObject.lock and myObject.unlock directly for those times when a synchronized block is not flexible enough to do what I need.
I know that every single Object implicitly has a reentrant lock built into it, which is used by the synchronized block. Effectively, the lock method is called on the Object's internal reentrant lock every time one enters a synchronized block, and unlock is called on the same reentrant lock when one leaves the synchronized block. It would seem easy enough to let one manually lock and unlock this implicit reentrant lock, thus allowing a mix of synchronized blocks and explicit locking.
However, as far as I know there is no way to do this. And because of the way synchronized blocks work, I don't believe there is a convenient way to mix them with explicit locking otherwise. It seems as if this would be rather convenient, and easily added by extending the Object API with lock/unlock methods.
The question I have is, why doesn't this exist? I'm certain there is a reason, but I don't know what it is. I thought the issue might be encapsulation, the same reason you don't want to do synchronized(this). However, if I am already calling synchronized(myObject), then by definition anyone who knows about myObject can likewise synchronize on it and cause a deadlock if done foolishly. The question of encapsulation comes down to who can access the object you synchronized on, regardless of whether you use a synchronized block or manually lock the object; at least as I see it. So is there some other advantage to not allowing one to manually lock an object?

The lock of an object is tightly tied to the instance itself. The structure of synchronized blocks and methods is very strict. If you, as a programmer, could interfere with the system (the virtual machine), it could cause serious problems:
You could release a lock that was acquired by a synchronized block
You could acquire a lock that another synchronized block will release
You could create more lock entries than exits
You could create more lock exits than entries
There are even specific bytecodes defined for the lock and release operations. If there were a "method" for this lock/unlock operation, it would have to be compiled to these bytecodes. So it really is a low-level operation, very different from other Java object-level implementations.
Synchronisation is a very strong contract. I think that the designers of the JLS did not want to allow the possibility of breaking this contract.
Chapter 17 of the JLS describes the expected behaviour in more detail.
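What the language does expose about the implicit lock is a read-only query, Thread.holdsLock(Object). The following small sketch (the class name MonitorOwnership is made up) shows that ownership follows the block structure exactly, including re-entrant nesting, with entries and exits always paired by the compiler.

```java
final class MonitorOwnership {

    /** Returns { heldBefore, heldInside, heldInsideNested, heldAfter } for a fresh monitor. */
    static boolean[] probe() {
        Object monitor = new Object();

        boolean before = Thread.holdsLock(monitor);
        boolean inside;
        boolean nested;
        synchronized (monitor) {
            inside = Thread.holdsLock(monitor);
            synchronized (monitor) {      // re-entrant: the same thread may enter again
                nested = Thread.holdsLock(monitor);
            }
        }
        boolean after = Thread.holdsLock(monitor);

        return new boolean[] { before, inside, nested, after };
    }
}
```

When a lock has to be acquired in one method and released in another, java.util.concurrent.locks.ReentrantLock is the usual tool, at the price of managing the pairing yourself.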

Are Java native method calls atomic?

I need a certain operation to be atomic.
Here "atomic" means: once the method has started executing, it shouldn't be interrupted until it finishes, and the thread scheduler should not move the thread back to the Runnable state once it is in the Running state.
All the methods are native methods, and multiple threads are running concurrently to execute them.
First question: is the execution of native methods atomic in nature?
Second question: to achieve atomicity, can we use the Java concurrent/lock APIs? If yes, please provide an example or link if possible.
Thanks.

If I understand the question correctly, you are asking: can I make sure that a certain thread keeps running on the CPU until it is finished? To be clear, it should not be replaced by another thread during that time.
The answer is: even if it were possible (and I don't think it is), this would have nothing to do with atomicity. If you have multiple CPUs, multiple threads can still access and change the same data while all of them are truly running uninterrupted at the same time. Therefore, even with your definition of "atomicity", you would still end up with concurrency problems.
If you just want thread safety in the conventional sense, then yes, you can use Java locks etc. to achieve that, even if you call native methods. See http://journals.ecs.soton.ac.uk/java/tutorial/native1.1/implementing/sync.html for an example of Java thread synchronization from within native methods.
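As a rough sketch of the conventional approach, the class below serializes calls to a non-thread-safe operation with a synchronized block. The unsafeAdd method is a plain Java stand-in for a native method (a real native method would need a native library), and all names here are invented for the example.

```java
final class GuardedCalls {
    private final Object guard = new Object();
    private long total;                   // not safe to update concurrently on its own

    // Stand-in for a native method; assumed not to be thread-safe by itself.
    private void unsafeAdd(long delta) {
        total += delta;
    }

    /** Only one thread at a time may be inside the guarded call. */
    void add(long delta) {
        synchronized (guard) {
            unsafeAdd(delta);
        }
    }

    long total() {
        synchronized (guard) {
            return total;
        }
    }

    /** Runs the given number of threads, each calling add(1) repeatedly, and returns the total. */
    static long runDemo(int threads, int callsPerThread) {
        GuardedCalls calls = new GuardedCalls();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    calls.add(1);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return calls.total();
    }
}
```

The threads can still be preempted at any time; the lock only guarantees that no two of them are inside the guarded call at once, which is the standard meaning of atomic with respect to other threads.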

To prevent an interrupt you must use the machine code instructions cli and sti to disable and enable interrupts (on x86/x64). This is not something you can do even in plain C.
It is so low level that it is rarely done, largely because it is rarely very useful. The main concern is usually how the operation interacts with other threads, which is why "atomic" is defined in those terms for most use cases rather than in terms of interrupts.

Your definition of atomic is non-standard. This is a more standard definition.
Are native methods atomic (by your non-standard definition)? Since I could, at any point during the execution of a native method, yank the power cord and interrupt execution, I would say that native methods are incorrigibly non-atomic.
Can we use the Java concurrent lock APIs to achieve (non-standard) atomicity in native methods? As stated previously, native methods are incorrigibly non-atomic. Nothing can help.

Should I synchronize to avoid visiblity issues when using java.util.concurrent classes?

When using any of the java.util.concurrent classes, do I still need to synchronize access to the instance to avoid visibility issues between different threads?
Elaborating the question a bit more
When using an instance of a java.util.concurrent class, is it possible that one thread modifies the instance (e.g., puts an element in a concurrent hash map) and a subsequent thread doesn't see the modification?
My question arises from the fact that the Java Memory Model allows threads to cache values instead of fetching them directly from memory if the access to the value is not synchronized.

For the memory consistency properties of the java.util.concurrent package, you can check the package's Javadoc:
The methods of all classes in java.util.concurrent and its subpackages extend these guarantees to higher-level synchronization. In particular:
Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.
[...]
Actions prior to "releasing" synchronizer methods such as Lock.unlock, Semaphore.release, and CountDownLatch.countDown happen-before actions subsequent to a successful "acquiring" method such as Lock.lock, Semaphore.acquire, Condition.await, and CountDownLatch.await on the same synchronizer object in another thread.
[...]
So the classes in this package take care of the concurrency themselves, using a set of classes for thread control (Lock, Semaphore, etc.). These classes handle the happens-before logic programmatically, e.g. by managing a FIFO queue of waiting threads and blocking and releasing the current and subsequent threads (e.g. by parking and unparking them with LockSupport).
Therefore (theoretically) you don't need to synchronize the statements that access these classes, because the classes control concurrent access programmatically.
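The first quoted guarantee can be illustrated with a hand-off through a BlockingQueue. In this minimal sketch (class and field names are invented), the producer writes a plain, non-volatile field and then places the object in the queue; the consumer that takes the object out is guaranteed to see that write without any additional synchronization.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class HandOffDemo {

    /** Deliberately plain holder: no volatile, no synchronized. */
    static final class Message {
        int payload;
    }

    /** Publishes a Message from a producer thread and returns the payload seen by the caller. */
    static int handOff(int value) {
        BlockingQueue<Message> queue = new LinkedBlockingQueue<>();

        Thread producer = new Thread(() -> {
            Message message = new Message();
            message.payload = value;      // happens-before the add below
            queue.add(message);
        });
        producer.start();

        try {
            Message received = queue.take();  // happens-after the producer's add
            return received.payload;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The guarantee covers writes made before the object was placed into the collection; fields changed by the producer afterwards would need their own synchronization.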

Because ConcurrentHashMap (for example) is designed to be used from a concurrent context, you don't need to synchronise it further. In fact, doing so could undermine the optimisations it introduces.
For example, Collections.synchronizedMap(...) is one way to make a map thread safe; as I understand it, it works essentially by wrapping every call in the synchronized keyword. Something like ConcurrentHashMap, on the other hand, creates synchronized "buckets" across the elements in the collection, giving finer-grained concurrency control and therefore less lock contention under heavy usage. It may also not lock on reads, for example. If you wrap this again with some synchronised access, you could undermine that. Obviously, with the wrapper approach you have to be careful that all access to the collection is synchronised, which is another advantage of the newer library; you don't have to worry (as much!).
The java.util.concurrent collections may implement their thread safety via synchronized, in which case the language specification guarantees visibility. They may also implement things without using locks. I'm not as clear on this, but I assume the same visibility would be in place there.
If you're seeing what looks like lost updates in your code, it may just be a race condition. Something like ConcurrentHashMap gives you the most recently completed write on a read; a write that is still in progress on another thread may not be visible yet. It's often a trade-off between accuracy and performance.
The point is: the java.util.concurrent classes are meant to do exactly this, so I'd be confident that they ensure visibility, and the use of volatile and/or additional synchronisation shouldn't be needed.
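The lost-update race mentioned above typically comes from doing a separate get followed by a put. As a small sketch (MergeCounter is a made-up name), ConcurrentHashMap.merge performs the read-modify-write as one atomic step, so no extra locking is needed for a simple counter:

```java
import java.util.concurrent.ConcurrentHashMap;

final class MergeCounter {

    /** Increments one shared counter from several threads and returns the final value. */
    static int count(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];

        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    // Atomic per key. A separate get() followed by put() would not be,
                    // and could lose updates even though each call is thread-safe.
                    hits.merge("page", 1, Integer::sum);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return hits.getOrDefault("page", 0);
    }
}
```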

Are Thread.stop and friends ever safe in Java?

The stop(), suspend(), and resume() methods in java.lang.Thread are deprecated because they are unsafe. The workaround Oracle recommends is to use Thread.interrupt(), but that approach doesn't work in all cases. For example, if you are calling a library method that doesn't explicitly or implicitly check the interrupted flag, you have no choice but to wait for the call to finish.
So, I'm wondering if it is possible to characterize situations where it is (provably) safe to call stop() on a Thread. For example, would it be safe to stop() a thread that did nothing but call find(...) or match(...) on a java.util.regex.Matcher?
(If there are any Oracle engineers reading this ... a definitive answer would be really appreciated.)
EDIT: Answers that simply restate the mantra that you should not call stop() because it is deprecated, unsafe, or whatever are missing the point of this question. I know that it is genuinely unsafe in the majority of cases, and that if there is a viable alternative you should always use that instead.
This question is about the subset cases where it is safe. Specifically, what is that subset?

Here's my attempt at answering my own question.
I think that the following conditions should be sufficient for a single thread to be safely stopped using Thread.stop():
1) The thread execution must not create or mutate any state (i.e. Java objects, class variables, external resources) that might be visible to other threads in the event that the thread is stopped.
2) The thread execution must not use notify to any other thread during its normal execution.
3) The thread must not start or join other threads, or interact with them using stop, suspend or resume.
(The term thread execution above covers all application-level code and all library code that is executed by the thread.)
The first condition means that a stopped thread will not leave any external data structures or resources in an inconsistent state. This includes data structures that it might be accessing (reading) within a mutex. The second condition means that a stoppable thread cannot leave some other thread waiting. But it also forbids the use of any synchronization mechanism other than simple object mutexes.
A stoppable thread must have a way to deliver the results of each computation to the controlling thread. These results are created / mutated by the stoppable thread, so we simply need to ensure that they are not visible following a thread stop. For example, the results could be assigned to private members of the Thread object and "guarded" with a flag that is atomically set by the thread to say it is "done".
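The "guarded result" idea might look roughly like this. It is only a sketch with invented names (SumWorker): the worker touches nothing but its own locals until the very end, then stores the result in a private field and sets a volatile flag. The controlling thread reads the result only when the flag is set, so a worker that was stopped part-way simply never publishes anything.

```java
final class SumWorker extends Thread {
    private final int limit;
    private long result;                  // only meaningful once done is true
    private volatile boolean done;        // the volatile write publishes result

    SumWorker(int limit) {
        this.limit = limit;
    }

    @Override
    public void run() {
        long sum = 0;                     // all intermediate state is thread-local
        for (int i = 1; i <= limit; i++) {
            sum += i;
        }
        result = sum;                     // written before the flag ...
        done = true;                      // ... so a reader that sees done also sees result
    }

    /** Returns the result, or null if the worker never finished. */
    Long resultIfDone() {
        return done ? Long.valueOf(result) : null;
    }

    /** Runs a worker to completion and returns what it published. */
    static Long compute(int limit) {
        SumWorker worker = new SumWorker(limit);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return worker.resultIfDone();
    }
}
```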
EDIT: These conditions are pretty restrictive. For example, for a "regex evaluator" thread to be safely stopped, we must guarantee that the regex engine does not mutate any externally visible state. The problem is that it might do so, depending on how you implement the thread!
1) The Pattern.compile(...) methods might update a static cache of compiled patterns, and if they did they would (should) use a mutex to do it. (Actually, the OpenJDK 6.0 version doesn't cache Patterns, but Sun might conceivably change this.)
2) If you try to avoid 1) by compiling the regex in the control thread and supplying a pre-instantiated Matcher, then the regex thread does mutate externally visible state.
In the first case, we would probably be in trouble. For example, suppose that a HashMap was used to implement the cache and that the thread was stopped while the HashMap was being reorganized.
In the second case, we would be OK provided that the Matcher had not been passed to some other thread, and provided that the controller thread didn't try to use the Matcher after stopping the regex matcher thread.
So where does this leave us?
Well, I think I have identified conditions under which threads are theoretically safe to stop. I also think that it is theoretically possible to statically analyse the code of a thread (and the methods it calls) to see if these conditions will always hold. But, I'm not sure if this is really practical.
Does this make sense? Have I missed something?
EDIT 2
Things get a bit more hairy when you consider that the code that we might be trying to kill could be untrusted:
We can't rely on "promises"; e.g. annotations on the untrusted code that it is either killable, or not killable.
We actually need to be able to stop the untrusted code from doing things that would make it unkillable ... according to the identified criteria.
I suspect that this would entail modifying JVM behaviour (e.g. implementing runtime restrictions on what threads are allowed to lock or modify), or a full implementation of the Isolates JSR. That's beyond the scope of what I was considering as "fair game".
So let's rule the untrusted code case out for now. Or at least, acknowledge that malicious code can do things to render itself not safely killable, and put that problem to one side.

The lack of safety comes from the idea of critical sections
Take mutex
    do some work; temporarily, while we work, our state is inconsistent
    // all consistent now
Release mutex
If you blow away the thread and it happened to be in a critical section, then the object is left in an inconsistent state, which means it is not safely usable from that point on.
For it to be safe to kill the thread you need to understand the entire processing of whatever is being done in that thread, to know that there are no such critical sections in the code. If you are using library code, then you may not be able to see the source and know that it's safe. Even if it's safe today it may not be tomorrow.
(Very contrived) Example of possible unsafety. We have a linked list, it's not cyclic. All the algorithms are really zippy because we know it's not cyclic. During our critical section we temporarily introduce a cycle. We then get blown away before we emerge from the critical section. Now all the algorithms using the list loop forever. No library author would do that surely! How do you know? You cannot assume that code you use is well written.
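A less contrived version of the same problem, as a sketch with made-up names: two balances whose sum is meant to stay constant. Inside the critical section the invariant is briefly broken. If the thread were stopped at the marked point, the mutex would be released during unwinding and every later reader would see money that has vanished.

```java
final class Accounts {
    private final Object mutex = new Object();
    private long a;
    private long b;

    Accounts(long a, long b) {
        this.a = a;
        this.b = b;
    }

    /** Moves amount from a to b; a + b is the invariant that other code relies on. */
    void transfer(long amount) {
        synchronized (mutex) {
            a -= amount;
            // Invariant broken here: the amount has left a but not yet reached b.
            // An asynchronous stop at this point would unlock the mutex and
            // leave the object in exactly this state.
            b += amount;
            // All consistent again.
        }
    }

    long total() {
        synchronized (mutex) {
            return a + b;
        }
    }
}
```

In normal operation total() never changes, and nothing in the class's public API tells a caller whether stopping a thread inside transfer() is survivable.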
In the example you point to, it's surely possible to write the required functionality in an interruptible way. It is more work, but it can be made safe.
I'll take a flyer: there is no documented subset of Objects and methods that can be used in cancellable threads, because no library author wants to make the guarantees.
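For comparison, this is roughly what "written in an interruptible way" means for code you control. It is a minimal sketch (InterruptibleLoop is an invented name): the worker polls its interrupt flag at a point where its state is consistent and returns on its own, so no locks are torn away from it.

```java
import java.util.concurrent.CountDownLatch;

final class InterruptibleLoop {

    /** Starts a busy worker, interrupts it, and reports whether it terminated. */
    static boolean startAndCancel() {
        CountDownLatch running = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            running.countDown();
            long iterations = 0;
            // Cooperative cancellation: check the flag at a safe point in every iteration.
            while (!Thread.currentThread().isInterrupted()) {
                iterations++;
            }
        });
        worker.start();

        try {
            running.await();              // make sure the worker is actually running
            worker.interrupt();           // request cancellation
            worker.join(5000);            // give it time to notice and exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }
}
```

The limitation raised in the question still applies: this only works when the long-running code is yours to modify, or when it already checks the flag.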

Maybe there's something I don't know, but as java.sun.com said, it is unsafe because anything this thread is handling is at serious risk of being damaged: other objects, connections, open files... for obvious reasons, like "don't shut down your Word without saving first".
For this find(...) example, I don't really think it would be a catastrophe to simply kick it away with a ruthless .stop()...

A concrete example would probably help here. If anyone can suggest a good alternative to the following use of stop I'd be very interested. Re-writing java.util.regex to support interruption doesn't count.
import java.util.regex.*;
import java.util.*;

public class RegexInterruptTest {
    private static class BadRegexException extends RuntimeException { }

    public static void main(String[] args) {
        final Thread mainThread = Thread.currentThread();
        TimerTask interruptTask = new TimerTask() {
            public void run() {
                System.out.println("Stopping thread.");
                // Doesn't work:
                // mainThread.interrupt();
                // Does work but is deprecated and nasty.
                // (stop(Throwable) throws UnsupportedOperationException since
                // Java 8 and was removed in Java 11.)
                mainThread.stop(new BadRegexException());
            }
        };
        // Daemon timer that fires the stop after two seconds.
        Timer interruptTimer = new Timer(true);
        interruptTimer.schedule(interruptTask, 2000L);
        // Nested quantifiers plus a non-matching tail cause catastrophic backtracking.
        String s = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab";
        String exp = "(a+a+){1,100}";
        Pattern p = Pattern.compile(exp);
        Matcher m = p.matcher(s);
        try {
            System.out.println("Match: " + m.matches());
            interruptTimer.cancel();
        } catch(BadRegexException bre) {
            System.out.println("Oooops");
        } finally {
            System.out.println("All over");
        }
    }
}
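One workaround that is sometimes suggested for this specific case, and which does not come from this thread, is to leave java.util.regex alone and wrap the input instead: a CharSequence whose charAt checks the interrupt flag and throws. The sketch below uses invented names (InterruptibleCharSequence, MatchAbortedException) and assumes the regex engine reads its input through charAt, which is how the OpenJDK implementation behaves as far as I know but is not something the API promises.

```java
import java.util.regex.Pattern;

final class InterruptibleCharSequence implements CharSequence {

    /** Thrown from charAt when the matching thread has been interrupted. */
    static final class MatchAbortedException extends RuntimeException {
        MatchAbortedException() {
            super("regex match aborted by interrupt");
        }
    }

    private final CharSequence inner;

    InterruptibleCharSequence(CharSequence inner) {
        this.inner = inner;
    }

    @Override
    public char charAt(int index) {
        // The matcher calls this constantly while backtracking, so the check is hit quickly.
        if (Thread.currentThread().isInterrupted()) {
            throw new MatchAbortedException();
        }
        return inner.charAt(index);
    }

    @Override
    public int length() {
        return inner.length();
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new InterruptibleCharSequence(inner.subSequence(start, end));
    }

    @Override
    public String toString() {
        return inner.toString();
    }

    /** Convenience helper: matches regex against input through the interruptible wrapper. */
    static boolean matches(String regex, CharSequence input) {
        return Pattern.compile(regex).matcher(new InterruptibleCharSequence(input)).matches();
    }
}
```

With this wrapper the timer task could call mainThread.interrupt() instead of stop(...), and the match would end with MatchAbortedException in the matching thread. The interrupt flag stays set afterwards, so the catching code should clear it with Thread.interrupted() if it wants to carry on.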

There are ways to use Thread.stop() in a relatively stable manner without leaking memory or file descriptors (FDs are exceptionally leak-prone on *NIX), but you should rely on it only if you are forced to manage third-party code. Never use it to achieve the result if you have control over the code itself.
I use Thread.stop along with interrupt() and some more hacks, such as adding custom logging handlers to re-throw a trapped ThreadDeath, adding an UncaughtExceptionHandler, running in your own ThreadGroup (and synchronizing over it), etc...
But that deserves an entire new topic.
But in this case it's the Java Designers telling you; and
they're more authoritative on their language than either of us :)
Just a note: quite a few of them are pretty clueless

If my understanding is right, the problem has to do with synchronization locks not being released as the generated ThreadDeath propagates up the stack.
Taking that for granted, it's inherently unsafe because you can never know whether any "inner method call" you happened to be in at the very moment stop() was invoked and took effect was holding some synchronization lock, and then what the Java engineers say is, seemingly, unequivocally right.
What I personally don't understand is why it should be impossible to release any synchronization lock as this particular type of exception propagates up the stack, thereby passing all the '}' method/synchronization block delimiters, which do cause any locks to be released for any other type of exception.
I have a server written in java, and if the administrator of that service wants a "cold shutdown", then it is simply NECESSARY to be able to stop all running activity no matter what. Consistency of any object's state is not a concern because all I'm trying to do is to EXIT. As fast as I can.

There is no safe way to kill a thread.
Nor is there a subset of situations where it is safe. Even if it works 100% of the time while testing on Windows, it may corrupt JVM process memory under Solaris or leak thread resources under Linux.
One should always remember that underneath the Java Thread there is a real, native, unsafe thread.
That native thread works with native, low-level data and control structures. Killing it may leave those native data structures in an invalid state, with no way to recover.
There is no way for the Java machine to take all possible consequences into account, as the thread may allocate and use resources not only within the JVM process but within the OS kernel as well.
In other words, if the native thread library doesn't provide a safe way to kill() a thread, Java cannot provide any guarantees better than that. And all the native implementations known to me state that killing a thread is a dangerous business.

All forms of concurrency control can be provided by the Java synchronization primitives by constructing more complex concurrency controls that suit your problem.
The reasons for deprecation are clearly given in the link you provide. If you're willing to accept the reasons why, then feel free to use those features.
However, if you choose to use those features, you also accept that support for those features could stop at any time.
Edit: I'll reiterate the reason for the deprecation as well as how to avoid the problem.
Since the only danger is that objects that can be referenced by the stopped thread could be corrupted, simply clone the String before you pass it to the Thread. If no shared objects exist, the threat of corrupted objects in the program outside the stopped Thread is no longer there.
