Is it safe to cache and reuse instances of java.lang.invoke.MethodHandle?
I checked the JavaDoc and couldn't find anything about thread safety.
Yes, it should be perfectly safe to share MethodHandle objects between threads.
Note that the API documentation says the following about it:
Method handles are immutable and have no visible state. Of course, they can be bound to underlying methods or data which exhibit state. With respect to the Java Memory Model, any method handle will behave as if all of its (internal) fields are final variables. This means that any method handle made visible to the application will always be fully formed. This is true even if the method handle is published through a shared variable in a data race.
MethodHandle is an abstraction for code invocation, not for managing the state behind the code. The reasoning for thread safety is therefore that it depends on the target method actually being executed, not on the MethodHandle object itself.
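For illustration, here is a minimal sketch of the caching pattern in question, assuming a handle to String.toUpperCase() looked up once and stored in a static final field (the HandleCache class and upper() method are made up):

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;

    public class HandleCache {
        // Looked up once; the final field safely publishes the handle to all threads.
        private static final MethodHandle TO_UPPER;

        static {
            try {
                TO_UPPER = MethodHandles.lookup().findVirtual(
                        String.class, "toUpperCase", MethodType.methodType(String.class));
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        public static String upper(String s) throws Throwable {
            return (String) TO_UPPER.invokeExact(s);
        }
    }

Any number of threads can call upper() concurrently; the handle itself carries no mutable state, so no additional synchronization is needed.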
Related
In the book Java Concurrency in Practice by Brian Goetz et al.:
If you do not ensure that publishing the shared reference happens-before another thread loads that shared reference, then the write of the reference to the new object can be reordered (from the perspective of the thread consuming the object) with writes to its fields. In that case, another thread could see an up-to-date value for the object reference but out-of-date values for some or all of that object's state - a partially constructed object.
Does this mean the following: in the thread publishing the object, the write of the reference to the new object is not reordered with writes to its fields; the write to its fields happens before the write of the reference. However, that publishing thread may flush the updated reference to main memory before it flushes the updated object fields. Therefore, the thread consuming the object may see a non-null reference for the object, yet see outdated values for the object fields? And in that sense, the operations are reordered for the consuming thread.
Yes.
The answer to your question is right there in the paragraph that you quoted, and you seem to echo the answer in your question.
One comment though: You said that, "[the] publishing thread may flush the updated reference to main memory before it flushes the updated object fields." If you're talking about Java code, then it's best to stick with what is written in the Java Language Specification (JLS).
The JLS tells you how a Java program is allowed to behave. It says nothing about "main memory" or "caches" or "flushing." It only says that without explicit synchronization, the updates that one thread performs in a certain order on two or more variables may seem to have happened in a different order when viewed from the perspective of some other thread. How or why that can happen is "implementation details."
in the thread publishing the object, the write of the reference to the new object is not reordered with writes to its fields; the write to its fields happens before the write of the reference.
Yes. Within one single thread this process happens in program order, which establishes happens-before: "If x and y are actions of the same thread and x comes before y in program order, then hb(x, y)." (https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.5). We may rephrase you a bit: "the write of the reference to the new object is not reordered with writes to its fields" means that, as far as the publishing thread itself can observe, the fields are written before the reference, so if that same thread reads the reference it is guaranteed to see the field values it wrote.
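To make that concrete, here is a minimal sketch of the scenario (the Holder class, the plain static field, and the value 42 are all made up) showing what the consuming thread is allowed to observe when no happens-before edge exists:

    class Holder {
        int value;                 // deliberately not final and not volatile
        Holder(int value) { this.value = value; }
    }

    class Publisher {
        static Holder holder;      // plain field: unsafe publication

        static void publish() {    // runs on thread A
            holder = new Holder(42);   // in A's program order the field write precedes the reference write
        }

        static void consume() {    // runs on thread B
            Holder h = holder;
            if (h != null) {
                // Without a happens-before edge, B may legally print 0 (the default value)
                // instead of 42, i.e. observe a partially constructed object.
                System.out.println(h.value);
            }
        }
    }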
the thread consuming the object may see a non-null reference for the object, yet see outdated values for the object fields?
Yes, it may, when you publish the object in an unsafe manner, without appropriate happens-before (HB) edges implemented with memory barriers. Strictly speaking, in the absence of the HB edges/memory barriers you get undefined behavior: the other thread can see/read almost anything (except out-of-thin-air (OoTA) values, which are explicitly forbidden by the JMM). Safe publication makes all the values written before the publication visible to all readers that observe the published object. There are a few popular and simple ways to make the publication safe (see the sketch after this list):
Publish the reference through a properly locked field (https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4.5)
Use the static initializer to do the initializing stores (http://docs.oracle.com/javase/specs/jls/se8/html/jls-12.html#jls-12.4)
Publish the reference via a volatile field (https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4.5), or, as a consequence of this rule, via the AtomicXxx classes
Initialize the value into a final field, which leads to the freeze action (https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.5).
You can also use other actions that produce happens-before edges, such as Thread.start(), but my day-to-day favorites are:
final fields for immutable data
volatile/AtomicXxx fields and locks (explicit synchronized blocks/ReadWriteLock, the implicit locks inside BlockingQueue) for mutable data.
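For concreteness, here is a hedged sketch of two of those options applied to the Holder example above (the names are illustrative only):

    class SafePublisher {
        // Option: volatile field. The write to 'holder' happens-before any read that
        // sees the new reference, so readers that see it also see value == 42.
        static volatile Holder holder;

        static void publish() {
            holder = new Holder(42);
        }
    }

    class ImmutableHolder {
        // Option: final field. The freeze action at the end of the constructor guarantees
        // that any thread which sees the reference also sees the final field's value,
        // even if the ImmutableHolder is published through a data race.
        final int value;
        ImmutableHolder(int value) { this.value = value; }
    }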
I am trying to understand at a high level how Java's concurrent API is built using AbstractQueuedSynchronizer as a building block. I didn't see any use of synchronized or wait() + notify() inside this class, so how is it possible to achieve thread-safe code?
I did see Unsafe CAS operations used to achieve some atomicity, but that alone does not seem enough for fully thread-safe code.
The Unsafe class is not as well documented as classes publicly exposed by the JDK, so not all guarantees its methods make are obvious.
However, if you look at the latest source code of AbstractQueuedSynchronizer, you will see that it now uses VarHandle whose methods are well documented. For compareAndSet the documentation says:
Atomically sets the value of a variable to the newValue with the memory semantics of setVolatile(java.lang.Object...) if the variable's current value, referred to as the witness value, == the expectedValue, as accessed with the memory semantics of getVolatile(java.lang.Object...).
This means there will be no lost updates: of two threads racing on the same CAS, only one will succeed in updating the value and the other will fail (and can retry). And the volatile access semantics give you the needed memory visibility guarantees.
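As a hedged illustration of those semantics (this is not AQS's actual code), here is a simple retry-loop counter built on VarHandle.compareAndSet: of any set of racing threads, only one can move the field from the witness value to the new value on each iteration, and the volatile semantics make the winning update visible to the others:

    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.VarHandle;

    public class CasCounter {
        private volatile int count;

        private static final VarHandle COUNT;
        static {
            try {
                COUNT = MethodHandles.lookup().findVarHandle(CasCounter.class, "count", int.class);
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        public int increment() {
            for (;;) {
                int current = count;                        // volatile read
                int next = current + 1;
                if (COUNT.compareAndSet(this, current, next)) {
                    return next;                            // our CAS won
                }
                // CAS failed: another thread updated count first, so retry with the fresh value
            }
        }
    }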
Recently, I was asked in an interview why wait, notify, and notifyAll are used. I explained them.
After that they asked me to assume an application is always single-threaded: are these methods really required? My answer was no.
Then they asked why wait, notify, and notifyAll are designed as methods on the Object class. Why doesn't Java have an interface containing these methods, so that whichever class wants them can implement it? I was stuck and couldn't reason about this design. Can anyone please shed light on it?
The JVM uses OS-level threads, which means that each concrete JVM on each concrete OS handles threads differently. These methods are not only implemented in the Object class, they are also marked native, which roughly means they are implemented in the system layer of the JVM.
And if those methods were in some interface, that would mean anybody could redefine them.
wait, notify, and notifyAll are not just ordinary methods or synchronization utilities; more than that, they are a communication mechanism between two threads in Java. And the Object class is the correct place to make them available to every object, since this mechanism is not exposed through any Java keyword the way synchronized is. Remember that synchronized and wait/notify are two different areas; don't confuse them as being the same or related. synchronized provides mutual exclusion and ensures the thread safety of a Java class (e.g. against race conditions), while wait and notify are a communication mechanism between two threads.
Then they asked why wait, notify, and notifyAll are designed as methods on the Object class. Why doesn't Java have an interface containing these methods, so that whichever class wants them can implement it?
All of these methods are implemented in native code and they integrate closely with the synchronized block that wraps them. They are part of the Java language definition and have specific behaviors that programmers rely on. It would not be appropriate for them just to be interface methods that any object would implement.
When a thread calls obj.wait() on another object, it doesn't have to worry about the implementation of wait. It needs to make sure it holds the monitor lock on that object so it can make critical updates to it or to other storage; if the wait method were implemented by the object itself, that object could violate the language requirements and, for example, allow multiple threads into the protected block at the same time. A thread can synchronize on and call wait/notify/notifyAll on any object without having to worry about whether that object has implemented those methods appropriately. By making them final methods on Object, the behavior works the same regardless of the object type or local implementation.
Also, as I mentioned, wait/notify/notifyAll are integrated closely with the surrounding synchronized block. When a thread is blocked in wait(), the surrounding synchronized lock is released so that other threads can get access to the protected block. This coordination would not be possible if wait() were just a simple method call without deeper language support.
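Here is a minimal guarded-block sketch of that coordination (the SimpleBlockingQueue class is made up): wait() must be called while holding the object's monitor, releases the monitor while the thread is parked, and reacquires it before returning:

    import java.util.ArrayDeque;
    import java.util.Deque;

    class SimpleBlockingQueue<T> {
        private final Deque<T> items = new ArrayDeque<>();

        public synchronized void put(T item) {
            items.addLast(item);
            notifyAll();                 // wake any threads blocked in take()
        }

        public synchronized T take() throws InterruptedException {
            while (items.isEmpty()) {
                wait();                  // releases this object's monitor while waiting
            }
            return items.removeFirst();
        }
    }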
This reminds me of my other answer here: Concept behind putting wait(), notify() methods in Object class
It was a design goal from the start that Java programs would be multithreaded. Remember, the plan was for Java to make embedded programming less intimidating; the whole server-side web application thing (which led to the commoditization of Sun's core business) was an accident.
Since the goal was to enable creating embedded applications that would talk to other devices, it had to be multithreaded in order to be network-friendly and event-driven. But writing efficient multithreaded servers wasn't high on the list for Java.
Java didn't have ReentrantLock or non-blocking I/O for a long time. Initially the main data structures available were Vector, Hashtable, and StringBuffer (all of which synchronized all their public methods). From that choice it seems the goal was to be good enough, as opposed to being as efficient as possible. Later it was clear Java needed to be more efficient for the server-application use case, and 1.2 introduced unsynchronized equivalents of Vector and Hashtable. This seemed like an afterthought, a course adjustment made once it was apparent Java had a new role it wasn't previously designed for.
If Java had stayed in the niche it was created for, then possibly intrinsic locks might have been adequate. It seems the initial plan was for intrinsic locks only, so the lock might as well be wired into Object.
What is the use of ThreadLocal when a thread normally works on a variable by keeping it in its local cache?
That would mean thread1 does not know the value of the same variable in thread2, even if no ThreadLocal is used.
With multiple threads, although you have to do work to make sure you read the "most recent" value of a variable, you expect there to be effectively one variable per instance (assuming we're talking about instance fields here). You might read an out-of-date value unless you're careful, but basically you've got one variable.
With ThreadLocal, you're explicitly wanting to have one value per thread that reads the variable. That's typically for the sake of context. For example, a web server with some authentication layer might set a thread-local variable early in request handling so that any code within the execution of that request can access the authentication details, without needing any explicit reference to a context object. So long as all the handling is done on the one thread, and that's the only thing that thread does, you're fine.
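Here is a hedged sketch of that per-request pattern; the AuthContext class and its methods are invented for illustration:

    public final class AuthContext {
        private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

        // Called once at the start of request handling, on the request's thread.
        public static void set(String user) {
            CURRENT_USER.set(user);
        }

        // Any code running later on the same thread can read the value
        // without an explicit context parameter being passed around.
        public static String currentUser() {
            return CURRENT_USER.get();
        }

        // Important with thread pools: clear the slot when the request ends,
        // otherwise the next request handled by this thread sees stale data.
        public static void clear() {
            CURRENT_USER.remove();
        }
    }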
A thread doesn't have to keep variables in its local cache -- it's just that it's allowed to, unless you tell it otherwise.
So:
If you want to force a thread to share its state with other threads, you have to use synchronization of some sort (including synchronized blocks, volatile variables, etc).
If you want to prevent a thread from sharing its state with other threads, you have to use ThreadLocal (assuming the object that holds the variable is known to multiple threads -- if it's not, then everything is thread-local anyway!).
It's kind of a global variable for the thread itself, so that any code running in the thread can access it directly. (A "really" global variable can be accessed by any code running in the "process"; we could call it ProcessLocal:)
Are global variables bad? Maybe; they should be avoided if we can. But sometimes we have no choice: we cannot pass the object through method parameters, and ThreadLocal proves useful in many designs without causing too much trouble.
ThreadLocal is useful when an object is not thread-safe but you want to avoid synchronizing access to it. Each thread then stores the data in its own thread-local storage. By default, data is shared between threads.
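The classic example is java.text.SimpleDateFormat, which is not thread-safe; here is a small sketch that gives each thread its own instance instead of synchronizing (the DateFormats wrapper is made up):

    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class DateFormats {
        // One SimpleDateFormat per thread; no locking is needed because instances are never shared.
        private static final ThreadLocal<SimpleDateFormat> FORMAT =
                ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

        public static String format(Date date) {
            return FORMAT.get().format(date);
        }
    }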
Is java.lang.reflect.Method thread safe?
Profiling result of my program showed that Class.getMethod() took considerable computing time when called many times, a little more than I expected.
I can call this once and store the resulting method somewhere easily accessible.
But then, multiple web worker threads will use the stored Method object concurrently.
Is this safe?
A Method is safe to use across multiple threads provided you don't change the Method's state after making it available to those threads. For example, if one thread calls setAccessible(true) and another calls setAccessible(false) on the same Method, the result is not thread-safe; however, there is rarely a good reason to do that.
In short, Method.setAccessible() is not technically thread-safe, but you should be able to use the Method in a thread-safe way.
Java classes are guaranteed to be defined only once per ClassLoader instance, so you can safely assume that the definition, including methods and their signatures will not change through time, so you can safely "cache" them for use by multiple threads.
However, keep in mind that classes with the same fully qualified name (package + class name) can be defined differently by separate ClassLoader instances.
The class definition isn't going to change, so unless you are loading different classes in different threads (from separate libraries, say), the Method object should be thread safe. (Of course, whether the method itself being called by reflection is thread-safe is a different question.)
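As a hedged sketch of the caching being discussed (String.length() stands in for whatever method you are actually looking up): resolve the Method once, publish it through a static final field, and call invoke() from any worker thread:

    import java.lang.reflect.Method;

    public class MethodCache {
        // Resolved once and safely published via the final field; using it is thread-safe
        // as long as nobody mutates the Method (e.g. by toggling setAccessible) afterwards.
        private static final Method LENGTH;

        static {
            try {
                LENGTH = String.class.getMethod("length");
            } catch (NoSuchMethodException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        public static int lengthOf(String s) throws ReflectiveOperationException {
            return (int) LENGTH.invoke(s);
        }
    }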