What is the best way to test value visibility between threads?
class X {
private volatile Object ref;
public Object getRef() {
return ref;
}
public void setRef(Object newRef) {
this.ref = newRef;
}
}
The class X exposes a reference to the ref object. If concurrent threads read and write the object reference, every thread has to see the latest object that was set. The volatile modifier should do that. The implementation here is just an example; it could also be a synchronized or lock-based implementation.
Now I'm looking for a way to write a test that informs me when the value visibility is not as specified (older values were read).
It's okay if the test burns some CPU cycles.
The JLS says what you are supposed to do in order to get guaranteed consistent execution in an application involving "inter-thread actions". If you don't do these things, the execution may be inconsistent. But whether it actually will be inconsistent depends on the JVM you are using, the hardware that you are using, the application, the input data, and ... whatever else might be happening on the machine when you run the application.
I cannot see what your proposed test would tell you. If the test shows inconsistent executions, it would confirm the wisdom of doing synchronization properly. But if running the test a few times shows only (apparently) consistent executions, this doesn't tell you that executions are always going to be consistent.
Example:
Suppose that you run your tests on (say) JDK 1.4.2 (rev 12) / Linux / 32bit with the 'client' JVM and options x, y, z running on a single processor machine. And that after running the test 1000 times, you observe that it does not seem to make any difference if you leave out the volatile. What have you actually learned in that scenario?
You have NOT learned that it really makes no difference. If you change the test to use more threads, etc., you may get a different answer. If you run the test a few more thousand or million or billion times, you might get a different answer.
You have NOT learned anything about what might happen on other platforms; e.g. different Java version, different hardware, or different system load conditions.
You have NOT learned if it is safe to leave out the volatile keyword.
You only learn something if the test shows a difference. And the only thing that you learn is that synchronization is important ... which is what all of the textbooks etc. have been telling you all along :-)
Bottom line: this is the worst kind of black box testing. It gives you no real insight as to what is going on inside the box. To get that insight you need to 1) understand the Memory Model and 2) deeply analyze the native code emitted by the JIT compiler (on multiple platforms ...)
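To make the limitation concrete, here is a minimal sketch of the kind of stress test being discussed. It assumes a single writer that stores strictly increasing Long values into X and a reader that counts every time it sees a value older than one it has already seen. The names VisibilityStress and countViolations are made up for illustration. With the volatile field the count must be zero; without volatile a zero count would prove nothing, which is exactly the point made above.

```java
import java.util.concurrent.atomic.AtomicLong;

// Same shape as the class X from the question.
class X {
    private volatile Object ref;
    public Object getRef() { return ref; }
    public void setRef(Object newRef) { this.ref = newRef; }
}

// Hypothetical helper: counts how often the reader observes a value going "backwards".
class VisibilityStress {
    static long countViolations(int writes) {
        final X x = new X();
        x.setRef(0L);
        final AtomicLong violations = new AtomicLong();

        Thread reader = new Thread(() -> {
            long last = 0;
            // Spin until the final value written by the writer has been observed.
            while (last < writes) {
                long seen = (Long) x.getRef();
                if (seen < last) {
                    violations.incrementAndGet(); // an older value was read
                }
                last = Math.max(last, seen);
            }
        });
        reader.start();

        // Single writer: strictly increasing values.
        for (long i = 1; i <= writes; i++) {
            x.setRef(i);
        }
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return violations.get();
    }
}
```

A nonzero result would show a real problem. A zero result only says that no problem showed up on this run, on this JVM and on this hardware.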
If I understand correctly, then you want a test-case that passes if the variable is defined as volatile and fails if not.
However, I think there is no reliable way to do this. Depending on the implementation of the JVM, concurrent access may work correctly even without volatile.
So a unit test will work correctly when volatile is specified but it still might work correctly without volatile.
Wow, that's much tougher than I initially thought.
I might be completely off, but how about this?
class Wrapper {
private X x = new X();
private volatile Object volatileRef;
private final Object setterLock = new Object();
private final Object getterLock = new Object();
public Object getRef() {
synchronized(getterLock) {
Object refFromX = x.getRef();
if (refFromX != volatileRef) {
// FAILURE CASE!
}
return refFromX;
}
}
public void setRef(Object ref) {
synchronized(setterLock) {
volatileRef = ref;
x.setRef(ref);
}
}
}
Could this help?
Of course, you will have to create many Threads to hit this wrapper, hoping for the bad case to appear.
How about this?
public class XTest {
@Test
public void testRefIsVolatile() {
Field field = null;
try {
field = X.class.getDeclaredField("ref");
} catch (SecurityException e) {
e.printStackTrace();
Assert.fail(e.getMessage());
} catch (NoSuchFieldException e) {
e.printStackTrace();
Assert.fail(e.getMessage());
}
Assert.assertNotNull("Ref field", field);
Assert.assertTrue("Is Volatile", Modifier.isVolatile(field
.getModifiers()));
}
}
So basically you want this scenario: one thread writes the variable, while another reads it at the same time, and you want to ensure that the variable read has the correct value, right?
Well, I don't think you can use unit testing for that, because you can't ensure the right environment. That is done by the JVM, by how it schedules instructions. Here's what I would do. Use a debugger. Start one thread to write the data and put a breakpoint on the line that does this. Start the second thread and have it read the data, also stopping at that point. Now, step the first thread to execute the code that writes, and then read with the second one. In your example, you won't achieve anything with this, because read and write are single instructions. But usually, if these operations are more complex, you can alternate the execution of the two threads and see if everything is consistent.
This will take some time, because it's not automated. But I wouldn't go and write a unit test that tries reading and writing a lot of times, hoping to catch that case where it fails, because you wouldn't achieve anything. The role of a unit test is to assure you that the code you wrote is working as expected. But in this case, if the test passes, you're not assured of anything. Maybe it was just lucky and the conflict didn't appear on this run. And that defeats the purpose.
For example:
class Main {
public boolean hasBeenUpdated = false;
public void updateMain(){
this.hasBeenUpdated = true;
/*
alternative:
if(!hasBeenUpdated){
this.hasBeenUpdated = true;
}
*/
}
public void persistUpdate(){
this.hasBeenUpdated = false;
}
}
public Main instance = new Main();
instance.updateMain();
instance.updateMain();
instance.updateMain();
Does instance.hasBeenUpdated get updated 3 times in memory?
The reason I ask this is because I hoped to use a boolean("hasBeenUpdated") as a flag, and this could theoretically be "changed" many, many times, before I call "instance.persistUpdate()".
Does the JVM's JIT see this and perform an optimization?
JIT will collapse redundant statements only when it can PROVE that removing the code will not change the behavior. For example, if you did this:
int i;
i = 1;
i = 1;
i = 1;
The first two assignments are provably redundant, and the JIT could eliminate them. If instead it's
int i;
i = someMethodReturningInt();
i = someMethodReturningInt();
i = someMethodReturningInt();
the JIT has no way of knowing what someMethodReturningInt() does, including whether it has any side effects, so it must invoke the method 3 times. Whether or not it actually stores any but the final value is immaterial, as the code would behave the same either way. (Declaring volatile int i; instead would force it to store each value.)
Of course, if you're doing other things in between the method invocations, then it will be forced to perform the assignment.
The whole topic is part of the more general "happens-before" and "happens-after" concepts documented in the language and JVM specifications.
Optimization is NEVER supposed to change the behavior of a program, except possibly to reduce its runtime. There have been instances where bugs in the optimizer inadvertently did introduce errors, but these have been few and far between. In general you don't need to worry about whether optimization will break your code.
It can perform an optimization, yes.
As a matter of fact, it can issue a single write, or a single call to updateMain. All three calls can be collapsed into one.
But for that to happen, JIT has to prove that nothing else breaks, or more specifically that code does not break the JMM rules. In this specific case, as far as I understand it, it does not.
Given the choice is between JVM code that implements
move new value to variable
and
compare new value with current value of variable
if not the same
move new value to variable
the JVM would have to be fairly nutty to implement it the latter way. That's a pessimization, not an optimization.
The JVM to a large extent relies on the real machine to do simple operations, and real machines store values in memory when you tell them to store values in memory.
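As a small illustration of why the number of physical stores does not matter here, the following sketch uses the flag from the question in a single thread. DirtyFlag is a made-up name, and persistUpdate returning the previous state is an assumption added only so the result can be inspected. Whether the JIT issues one store or three, the observable outcome is the same.

```java
// Hypothetical dirty-flag holder modelled on the Main class from the question.
class DirtyFlag {
    private boolean hasBeenUpdated = false;

    // Unconditional store: cheaper than compare-then-store, and safe to repeat.
    void updateMain() {
        hasBeenUpdated = true;
    }

    // Assumption for illustration: report whether anything was dirty, then reset.
    boolean persistUpdate() {
        boolean wasDirty = hasBeenUpdated;
        hasBeenUpdated = false;
        return wasDirty;
    }
}
```

If the flag is ever read from a different thread than the one that sets it, the visibility rules discussed elsewhere in this thread apply, and the field would need to be volatile or accessed under a lock.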
I've encountered this code in a book. It states NoVisibility could loop forever because the value of ready might never become visible to the reader thread.
I'm confused by this statement. In order for the loop to run forever, ready must always be false, which is the default value. This means it must fail at executing ready = true; because the reader thread will always read the ready variable from memory. The assignment happens in the CPU, so it must have some problem flushing the data back to main memory. I think I need an explanation of a situation in which it can fail, or I may have missed some other part.
public class NoVisibility {
private static boolean ready;
private static int number;
private static class ReaderThread extends Thread {
public void run() {
while (!ready)
Thread.yield();
System.out.println(number);
}
}
public static void main(String[] args) {
new ReaderThread().start();
number = 42;
ready = true;
}
}
Your understanding is flawed. You are assuming that Java will behave intuitively here. In fact, it may not. And, indeed, the Java Language specification allows non-intuitive behavior if you don't follow the rules.
To be more specific, in your example it is NOT GUARANTEED that the second thread will see the results of the first thread's assignment to ready [1]. This is due to such things as:
The compiler caching the value of ready in a register in the first or second thread.
The compiler not including instructions to force the write to be flushed from one core's memory cache to main memory, or similar.
If you want a guarantee that the second thread will see the result of the write then either reads and writes of ready by the two threads must be (properly) synchronized, or the ready variable must be declared to be volatile.
So ...
This means it must fail at executing ready = true; because the reader thread will always read the ready variable from memory.
is incorrect. The "because" is not guaranteed by the Java language specification in this example.
Yes. It is nonintuitive. Relying on your intuition based on your understanding of single-threaded programs is not reliable. If you want to understand what is and is not guaranteed, please study the specification of the "Java Memory Model" in Section 17.4 of the JLS.
In short, the book is correct.
[1] It might see the results immediately, or after a short or long delay. Or it might never see them. And the behavior is liable to vary from one system to the next, and with versions of the Java platform. So your program that (by luck) works all of the time on one system may not always work on another system.
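For comparison, here is a minimal sketch of the same program with ready declared volatile, which is one of the two fixes named above. The class name VisibleFlag and the method publishAndRead are made up for illustration. Because the write to number happens before the volatile write to ready, and the reader's volatile read of ready happens before its read of number, the reader is guaranteed to see 42.

```java
// Hypothetical variant of NoVisibility with a volatile flag.
class VisibleFlag {
    private static volatile boolean ready;
    private static int number;

    static int publishAndRead() {
        ready = false;
        final int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!ready) {
                Thread.yield();
            }
            seen[0] = number; // ordered after the volatile read of ready
        });
        reader.start();

        number = 42;   // ordinary write ...
        ready = true;  // ... published by the volatile write

        try {
            reader.join(); // join makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```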
The value of ready may be updated, but the other thread may never know about it. That is where you need volatile variables! A thread assumes that the variable is used only by that one thread, so it reads the value from the stack that it created.
private static volatile boolean ready;
What volatile does is tell your program to read from memory, not from the stack.
Actually, what the JVM does is translate:
while(flag){...}
to:
if(flag){
while(true){...}
}
The stack is created when the thread is created. It collects the values of the variables in order to use them later.
This is what I have understood; correct me if I am wrong!
I have a thread issue in my code that should not be happening, but is. So I'm trying to come up with a workaround. I will try to explain my problem with code as simple as I can make it, because the code in which I'm experiencing the issue is big and complicated.
So, in short, the code:
...................
..................
void createAndRunThreads(){
List<Path> pathList = readPath(); // read paths from DB
for(Path p : pathList){
RunJob rj = new RunJob(p);
Thread t = new Thread(rj);
t.start();
}
}
class RunJob implements Runnable {
private Path path;
private ExecuteJob execJob;
public RunJob(Path path){
this.path = path;
this.execJob = new ExecuteJob();
}
public void run() {
execJob.execute(path);
}
}
class ExecuteJob {
private static Job curentExecutingJob;
public void execute(Path path){
// Here every thread should get a different job list from the others, but this is not happening,
// so eventually two threads execute the same Job at once and it gets messy.
List<Job> jobList = getJobsFromPath(path);
for(Job job : jobList) {
curentExecutingJob=job;
// Workaround I'm trying: if another thread tries to run the same job, it has to wait on the lock (I don't know if this is possible).
synchronized(curentExecutingJob){
if(job.getStatus().equals("ready")){
// do some initialization
// and database changes
job.run();
}
}
}
}
}
So my concern is whether this is going to work. I don't know if the object used as the lock is compared by memory (it needs to be the exact same object) or by equals (so I would have to implement equals on it).
What happens when the static curentExecutingJob member holds one object in the first thread, which locks on it (in the synchronized block), and a second thread changes that value and tries to enter the synchronized block? (My hope is that thread 2 will continue executing, and the only time it would be blocked is when it gets the same Job from the DB that the first thread got previously.)
I don't know if this approach can work or makes sense.
Two threads are running the following code, which is inside a method:
1 Job j = getJobByIdFromDB(1111);
2 if(j.status.equals("ready")){
3 do stuff
4 make database changes
5 j.run();
6 j.state="running";
7 }
ThreadA is stopped by the JVM at line 3; its state changes from running to runnable and it is put back in the pool to wait.
ThreadB is then given a chance by the JVM and executes lines 1, 2, 3, 4, 5, 6, which I don't want to happen. I want the first thread that enters the code at lines 2, 3 to finish before any of the other threads has a chance to enter the same code.
The problem with accomplishing this is that the two threads execute the example method on different instances, so synchronizing the whole method won't work. I also have other code that is executed in this method, and I don't want that to be synchronized too.
So is there a solution for my problem?
Also, if I use synchronized(this.class){} it will lose the benefits and the point of multithreading.
The problem is that the 'currentExecutingJob' is defined as static, meaning that all instances of ExecuteJob share the same 'instance' of this variable. In addition, you are setting the value of this variable outside of a synchronization block, which means that each thread will set it in an uncontrolled way. Your following synchronization block should have no practical impact whatsoever.
Given the way your sample code is written, it appears to me that you don't need any static variables and you don't need any synchronization, as there are no resources shared across multiple threads.
However, your comments in the code indicate that you want to prevent two threads from executing the same job at the same time. Your code does not achieve this, as there is no comparison of running jobs to see if the same job is running, and even if there was a comparison, your getJobsFromPath() would need to construct a job list such that the same object instance is reused when two threads/paths encounter the same 'job'.
I don't see any of this in your code.
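One way to get the behavior described in the question is to make sure that equal jobs always map to the same lock object. The sketch below is only an illustration under stated assumptions: jobs are identified by a long ID, their status is kept in memory, and JobGate, register, runIfReady and raceForSameJob are made-up names. It keeps one lock object per job ID in a ConcurrentHashMap and does the check-and-claim of the status inside the lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical gate that lets only one thread claim a given job.
class JobGate {
    // Equal job IDs always resolve to the *same* lock instance.
    private final ConcurrentMap<Long, Object> locks = new ConcurrentHashMap<>();
    private final ConcurrentMap<Long, String> status = new ConcurrentHashMap<>();

    void register(long jobId) {
        status.put(jobId, "ready");
    }

    // Runs the work only if the job is still "ready"; returns whether this caller ran it.
    boolean runIfReady(long jobId, Runnable work) {
        Object lock = locks.computeIfAbsent(jobId, id -> new Object());
        synchronized (lock) {
            if (!"ready".equals(status.get(jobId))) {
                return false; // somebody else already claimed this job
            }
            status.put(jobId, "running");
        }
        work.run(); // outside the lock, so threads working on other jobs are never blocked
        return true;
    }

    // Demo: several threads race for the same job; returns how many times it ran.
    static int raceForSameJob(int threadCount) {
        JobGate gate = new JobGate();
        gate.register(1111L);
        AtomicInteger executions = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> gate.runIfReady(1111L, executions::incrementAndGet));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return executions.get();
    }
}
```

Threads handling different job IDs use different locks and do not block each other. If the job status really lives in a database shared by several processes, an in-memory lock like this is not enough and the claim would have to be made atomic at the database level.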
Can't comment so I'll put it as an answer. Sorry.
The block
synchronized(curentExecutingJob)
will synchronize on the object curentExecutingJob (in your terms, by memory). If you synchronize on another object otherExecutingJob with curentExecutingJob.equals(otherExecutingJob) == true, the two synchronized statements will not influence each other.
To your question/problem: It would be helpful if you describe what getJobsFromPath is doing or should do and what you actually want to do and what your problem actually is. It's not really clear to me.
I saw that your code checks the status of the job to see whether it is ready or not; I don't think this is a feasible approach.
You can use the Callable interface instead of Runnable.
Here is a detailed example which may help you:
Java Concurrency
Is there a way to tell, for a Java object, which Thread (or null) currently owns its monitor? Or at least a way to tell if the current thread owns it?
I've found out some answers myself. To test if the current thread holds the monitor, Thread.holdsLock exists!
if (!Thread.holdsLock(data)) {
throw new RuntimeException(); // complain
}
This is really fast (sub-microsecond) and has been available since 1.4.
To test in general which thread (or thread ID) holds the lock, it's possible to do this with java.lang.management classes (thanks @amicngh).
public static long getMonitorOwner(Object obj) {
if (Thread.holdsLock(obj)) return Thread.currentThread().getId();
for (java.lang.management.ThreadInfo ti :
java.lang.management.ManagementFactory.getThreadMXBean()
.dumpAllThreads(true, false)) {
for (java.lang.management.MonitorInfo mi : ti.getLockedMonitors()) {
if (mi.getIdentityHashCode() == System.identityHashCode(obj)) {
return ti.getThreadId();
}
}
}
return 0;
}
There are a few caveats with this:
It's a little slow (~½ millisecond in my case and presumably increases linearly with the number of threads).
It requires Java 1.6, and a VM for which ThreadMXBean.isObjectMonitorUsageSupported() is true, so it's less portable.
It requires the "monitor" security permission so presumably wouldn't work from a sandboxed applet.
Turning the thread ID into a Thread object, if you need to, is a bit non-trivial, as I imagine you'd have to use Thread.enumerate and then loop through to find out which one has the ID, but this has theoretical race conditions because by the time you call enumerate, that thread might not exist any more, or a new thread might have appeared which has the same ID.
But if you only want to test the current thread, Thread.holdsLock works great! Otherwise, implementations of java.util.concurrent.locks.Lock may provide more information and flexibility than ordinary Java monitors (thanks @user1252434).
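A minimal sketch of the Thread.holdsLock check, showing the result outside and inside a synchronized block. HoldsLockDemo and its method names are made up for illustration.

```java
// Hypothetical demo of Thread.holdsLock on a private monitor object.
class HoldsLockDemo {
    private final Object data = new Object();

    // Called without holding the monitor: holdsLock reports false.
    boolean heldOutside() {
        return Thread.holdsLock(data);
    }

    // Called while holding the monitor: holdsLock reports true.
    boolean heldInside() {
        synchronized (data) {
            return Thread.holdsLock(data);
        }
    }
}
```

This only answers the question for the calling thread; it says nothing about whether some other thread holds the monitor.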
The monitor of a Java object is internal to the JVM, and you cannot really play with it.
If you know that the object is locked, you can try to obtain the monitor again - if you can get it, it means that you're locking the object from your thread (because java locks are recursive - you can lock twice from the same thread).
The problem is that you cannot try to synchronize.
You can use the unsafe object to do that.
Unsafe has a tryMonitorEnter() method that does just that. See Unsafe.
Unsafe might be able to actually help you get the thread that holds the monitor, but I don't know how to do that...
Instead of using synchronized, you might want to take a look at ReentrantLock, especially its methods getOwner() and isHeldByCurrentThread(). Note that getOwner() is protected, so you need a subclass to call it. ReentrantLock takes a bit more discipline to use, though, since you explicitly have to unlock() it, preferably in a finally block.
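Since getOwner() is protected in ReentrantLock, one way to use it is through a small subclass. The sketch below assumes that approach; OwnerAwareLock and owner() are made-up names.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical subclass that exposes the protected getOwner() method.
class OwnerAwareLock extends ReentrantLock {
    // Returns the thread currently holding the lock, or null if it is free.
    Thread owner() {
        return getOwner();
    }
}
```

Typical usage pairs lock() with unlock() in a finally block:

```java
OwnerAwareLock lock = new OwnerAwareLock();
lock.lock();
try {
    // lock.owner() is the current thread here,
    // and lock.isHeldByCurrentThread() is true
} finally {
    lock.unlock();
}
```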
In Java 1.6 you can use the java.lang.management API (ThreadMXBean) to get this information.
ThreadMXBean tBean = ManagementFactory.getThreadMXBean();
ThreadInfo[] threadInfo = tBean.getThreadInfo(tBean.getAllThreadIds(), true, true);
So I've been reading on concurrency and have some questions along the way (the guide I followed, though I'm not sure if it's the best source):
1. Processes vs. threads: Is the difference basically that a process is the program as a whole while a thread can be a (small) part of a program?
2. I am not exactly sure why there is an interrupted() method and an InterruptedException. Why should the interrupted() method even be used? It seems to me that Java just adds an extra layer of indirection.
3. For synchronization (and specifically the example in that link), how does adding the synchronized keyword even fix the problem? I mean, if Thread A gives back its incremented c and Thread B gives back the decremented c and stores it in some other variable, I am not exactly sure how the problem is solved. I may be answering my own question, but is it supposed to be assumed that each thread terminates after returning an answer? And if that is the case, why would adding synchronized make a difference?
4. I read (from some random PDF) that if you call start() on two threads one after the other, you cannot guarantee that the first thread will run before the second. How would you guarantee it, though?
5. In synchronized statements, I am not completely sure what the point is of adding synchronized within the method. What is wrong with leaving it out? Is it because one expects both to mutate separately, but to be obtained together? Why not just have the two non-synchronized?
6. Is volatile just a keyword for variables, and is it synonymous with synchronized?
7. In the deadlock problem, how does synchronized even help the situation? What makes this situation different from starting two threads that change a variable?
8. Moreover, where is the "wait"/lock for the other person to bowBack? I would have thought that bow() was blocked, not bowBack().
I'll stop here because I think if I went any further without these questions answered, I will not be able to understand the later lessons.
Answers:
1. Yes, a process is an operating system process that has an address space, a thread is a unit of execution, and there can be multiple units of execution in a process.
2. The interrupt() method and InterruptedException are generally used to wake up threads that are waiting, to either have them do something or terminate.
3. Synchronizing is a form of mutual exclusion or locking, something very standard and required in computer programming. Google these terms and read up on that and you will have your answer.
4. True, this cannot be guaranteed. You would need some mechanism involving synchronization that the threads use to make sure they run in the desired order. This would be specific to the code in the threads.
5. See answer to #3.
6. Volatile is a way to make sure that a particular variable can be properly shared between different threads. It is necessary on multi-processor machines (which almost everyone has these days) to make sure the value of the variable is consistent between the processors. It is effectively a way to synchronize a single value.
7. Read about deadlocking in more general terms to understand this. Once you understand mutual exclusion and locking, you will be able to understand how deadlocks can happen.
8. I have not read the materials that you read, so I don't understand this one. Sorry.
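As an illustration of the ordering mechanism mentioned in the answer about start() order, here is a minimal sketch that uses a CountDownLatch so that the second thread's work always runs after the first thread's work, regardless of which thread is started first. This is just one possible mechanism, and OrderedStart and runInOrder are made-up names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo: force "first" to do its work before "second".
class OrderedStart {
    static List<String> runInOrder() {
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch firstDone = new CountDownLatch(1);

        Thread second = new Thread(() -> {
            try {
                firstDone.await();      // block until the first thread signals
                order.add("second");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread first = new Thread(() -> {
            order.add("first");
            firstDone.countDown();      // release the second thread
        });

        second.start(); // deliberately started first; the latch still enforces the order
        first.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return order;
    }
}
```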
I find that the examples used to explain synchronization and volatility are contrived and difficult to understand the purpose of. Here are my preferred examples:
Synchronized:
private Value value;
public void setValue(Value v) {
value = v;
}
public void doSomething() {
if(value != null) {
doFirstThing();
int val = value.getInt(); // Will throw NullPointerException if another
// thread calls setValue(null);
doSecondThing(val);
}
}
The above code is perfectly correct if run in a single-threaded environment. However with even 2 threads there is the possibility that value will be changed in between the check and when it is used. This is because the method doSomething() is not atomic.
To address this, use synchronization:
private Value value;
private Object lock = new Object();
public void setValue(Value v) {
synchronized(lock) {
value = v;
}
}
public void doSomething() {
synchronized(lock) { // Prevents setValue being called by another thread.
if(value != null) {
doFirstThing();
int val = value.getInt(); // Cannot throw NullPointerException.
doSecondThing(val);
}
}
}
Volatile:
private boolean running = true;
// Called by Thread 1.
public void run() {
while(running) {
doSomething();
}
}
// Called by Thread 2.
public void stop() {
running = false;
}
To explain this requires knowledge of the Java Memory Model. It is worth reading about in depth, but the short version for this example is that Threads have their own copies of variables which are only sync'd to main memory on a synchronized block and when a volatile variable is reached. The Java compiler (specifically the JIT) is allowed to optimise the code into this:
public void run() {
while(true) { // Will never end
doSomething();
}
}
To prevent this optimisation you can set a variable to be volatile, which forces the thread to access main memory every time it reads the variable. Note that this is unnecessary if you are using synchronized statements as both keywords cause a sync to main memory.
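A minimal sketch of the corrected version with a volatile flag, assuming the worker just counts iterations in place of doSomething(). StoppableWorker and stopsPromptly are made-up names for illustration.

```java
// Hypothetical worker whose loop can be stopped reliably from another thread.
class StoppableWorker implements Runnable {
    private volatile boolean running = true; // volatile: the loop must re-read it each time
    private long iterations;                 // only touched by the worker thread

    @Override
    public void run() {
        while (running) {
            iterations++; // stands in for doSomething()
        }
    }

    public void stop() {
        running = false;
    }

    // Demo: start a worker, ask it to stop, and report whether it terminated.
    static boolean stopsPromptly() {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker);
        thread.start();
        worker.stop();
        try {
            thread.join(5000); // generous timeout; the worker should exit almost immediately
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !thread.isAlive();
    }
}
```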
I haven't addressed your questions directly as Francis did so. I hope these examples can give you an idea of the concepts in a better way than the examples you saw in the Oracle tutorial.