The part of the language specification dedicated to the Java Memory Model (JMM) mentions "execution trace" a lot.
For example right from the start:
A memory model describes, given a program and an execution trace of that program, whether the execution trace is a legal execution of the program. The Java programming language memory model works by examining each read in an execution trace and checking that the write observed by that read is valid according to certain rules.
But I cannot find there any description/definition of this term.
So, what is "execution trace" exactly according to the JMM, and what exactly does it consist of?
References to specific places in the language specification text are most welcome.
You're right; it's not very clear. They also refer to it as "program trace", and simply "trace" on its own.
The following is a quote:
Consider, for example, the example program traces shown in Table 17.4-A.
Table 17.4-A.
Thread 1    | Thread 2
B = 1;      | A = 2;
r2 = A;     | r1 = B;
So, it's simply an ordered list of statements, per thread, representing one possible permutation of how the statements may be executed (since statements may be reordered). A trace may be valid or invalid within the JMM; traces are used to exemplify what is legal and what is not.
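To make this concrete, here is a minimal sketch of the two-thread program from the table as quoted above, written as runnable Java. The class name Table174A and the helper runOnce are made up for illustration; they are not from the specification. Each run follows one particular trace, and the values that end up in r1 and r2 depend on which one it was.

```java
// Sketch of the two-thread program from the quoted table.
// A and B are deliberately plain (non-volatile) shared fields.
class Table174A {
    static int A, B;

    // Runs both threads once and returns {r1, r2}.
    static int[] runOnce() {
        A = 0;
        B = 0;
        final int[] r = new int[2];
        Thread t1 = new Thread(() -> { B = 1; r[1] = A; }); // Thread 1: B = 1; r2 = A;
        Thread t2 = new Thread(() -> { A = 2; r[0] = B; }); // Thread 2: A = 2; r1 = B;
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r;
    }

    public static void main(String[] args) {
        int[] r = runOnce();
        // Which pair is printed depends on the trace this run happened to follow.
        System.out.println("r1 = " + r[0] + ", r2 = " + r[1]);
    }
}
```

Because A and B are plain fields, the only thing that can be said for certain is that r1 is 0 or 1 and r2 is 0 or 2.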
This is not a full-fledged answer, but I think this is worth mentioning.
Even if we don't know what an "execution trace" is in detail, we can deduce which information it should provide.
Let's read the first paragraph of 17.4. Memory Model:
A memory model describes, given a program and an execution trace of that program, whether the execution trace is a legal execution of the program. The Java programming language memory model works by examining each read in an execution trace and checking that the write observed by that read is valid according to certain rules.
This means that "a program" (i.e. source code) and "an execution trace" should provide all the information required to determine whether the program execution is legal.
The information is described in 17.4.6. Executions.
I'm not going to copy-paste it here because it's too long.
I'll try to explain it in simple words instead:
a program consists of statements, each statement consists of (possibly nested) expressions evaluated in some order
an execution of a thread can be represented as a sequence of actions: one action per every simple expression
an execution of a program is several threads executing in parallel
an execution trace should provide information about actions performed during the program execution, i.e. it should provide the following information:
all executed actions: a sequence of actions per every thread
Note: the JMM only cares about so called inter-thread actions (17.4.2. Actions):
An inter-thread action is an action performed by one thread that can be detected or directly influenced by another thread
Inter-thread action kinds:
read/write
volatile read/write
lock/unlock
various special and synthetic actions (e.g. thread start/stop, etc.)
for every action it should store:
thread id
action kind
what expression in the source code it corresponds to
for write and volatile write: the written value
for read and volatile read: the write action, which provided the value
for lock/unlock: the monitor being locked/unlocked
various relations with other actions (e.g. position in a so-called synchronization order for synchronization actions)
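The list above could be modelled roughly as follows. This is only a sketch of my reading of it: the names TraceSketch, Action, Kind and exampleTrace are hypothetical, nothing like them appears in the specification, and the monitor for lock/unlock as well as the relations between actions are left out to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the per-action information listed above.
class TraceSketch {
    enum Kind { READ, WRITE, VOLATILE_READ, VOLATILE_WRITE, LOCK, UNLOCK }

    static final class Action {
        final int threadId;     // which thread performed the action
        final Kind kind;        // action kind
        final String source;    // the source expression it corresponds to
        final Integer value;    // written value (writes only), otherwise null
        final Action seenWrite; // write that supplied the value (reads only), otherwise null

        Action(int threadId, Kind kind, String source, Integer value, Action seenWrite) {
            this.threadId = threadId;
            this.kind = kind;
            this.source = source;
            this.value = value;
            this.seenWrite = seenWrite;
        }
    }

    // One possible trace of: Thread 1 { B = 1; r2 = A; }  Thread 2 { A = 2; r1 = B; }
    static List<Action> exampleTrace() {
        List<Action> trace = new ArrayList<>();
        Action writeB = new Action(1, Kind.WRITE, "B = 1", 1, null);
        Action writeA = new Action(2, Kind.WRITE, "A = 2", 2, null);
        trace.add(writeB);
        trace.add(writeA);
        trace.add(new Action(1, Kind.READ, "r2 = A", null, writeA)); // r2 observes the write of 2
        trace.add(new Action(2, Kind.READ, "r1 = B", null, writeB)); // r1 observes the write of 1
        return trace;
    }

    public static void main(String[] args) {
        for (Action a : exampleTrace()) {
            String seen = a.seenWrite == null ? "" : " sees [" + a.seenWrite.source + "]";
            System.out.println("thread " + a.threadId + " " + a.kind + " " + a.source + seen);
        }
    }
}
```

The point of the seenWrite field is that the memory model judges each read by the write it observed, so that link has to be part of the trace.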
According to the Java Memory Model (JMM):
A program is correctly synchronized if and only if all sequentially consistent executions are free of data races.
If a program is correctly synchronized, then all executions of the program will appear to be sequentially consistent (§17.4.3).
I don't see how the fact that no SC execution has data races guarantees that every execution has no data races (which means that every execution is SC).
Is there a proof of that?
What I found:
A blog post by Jeremy Manson (one of the authors of the JMM).
The following paragraph might mean that the guarantee is provided by causality (but I don't see how):
So there is the intuition behind the model. When you want to justify the fact that a read returns the value of a write, you can do so if:
a) That write happened before it, or
b) You have already justified the write.
The way that the model works is that you start with an execution that has all of the actions justified by property a), and you start justifying actions iteratively. So, in the second example above, you create successive executions that justify 0, then 2, then 3, then 4, then 1.
This definition has the lovely property of being reasonably intuitive and also guaranteeing SC for DRF programs.
Foundations of the C++ Concurrency Memory Model.
This article describes C++ memory model (which has similarities with the JMM).
Section 8 of the article has a proof of a similar guarantee for C++:
THEOREM 8.1. If a program allows a type 2 data race in a consistent execution, then there exists a sequentially consistent execution, with two conflicting actions, neither of which happens before the other.8
In effect, we only need to look at sequentially consistent executions in order to determine whether there is a data race in a consistent execution.
[...]
8 The latter is essentially the condition used in [25] to define “correctly synchronized” for Java.
Unfortunately, I'm not sure this proof holds for the JMM because the following doesn't work:
Consider the longest prefix P of <T that contains no data race. Note that each load in P must see a store that precedes it in either the synchronization or happens-before orders.
It seems to me that the above doesn't work in JMM because causality allows a read to return a later store.
The proof is in The Java Memory Model by J.Manson, W.Pugh and S.Adve:
9.2.1 Correctly synchronized programs exhibit only sequentially consistent behaviors
Lemma 2. Consider an execution E of a correctly synchronized program P that is legal under the Java memory model. If, in E, each read sees a write that happens-before it, E has sequentially consistent behavior.
Proof:
Since the execution is legal under the memory model, the execution is happens-before consistent and synchronization order consistent.
A topological sort on the happens-before edges of the actions in an execution gives a total order consistent with program order and synchronization order. Let r be the first read in E that doesn’t see the most recent conflicting write w in the sorted order but instead sees w′. Let the topological sort of E be αw′βwγrδ.
Let αw′βwγr′δ′ be the topological sort of an execution E′. E′ is obtained exactly as E, except that instead of r, it performs the action r′, which is the same as r except that it sees w; δ′ is any sequentially consistent completion of the program such that each read sees the previous conflicting write.
The execution E′ is sequentially consistent, and it is not the case that w′ ‒hb→ w ‒hb→ r, so P is not correctly synchronized.
Thus, no such r exists and the program has sequentially consistent behavior. □
Theorem 3. If an execution E of a correctly synchronized program is legal under the Java memory model, it is also sequentially consistent.
Proof: By Lemma 2, if an execution E is not sequentially consistent, there must be a read r that sees a write w such that w does not happen-before r. The read must be committed, because otherwise it would not be able to see a write that does not happen-before it. There may be multiple reads of this sort; if so, let r be the first such read that was committed. Let Eᵢ be the execution that was used to justify committing r.
The relative happens-before order of committed actions and actions being committed must remain the same in all executions considering the resulting set of committed actions. Thus, if we don’t have w ‒hb→ r in E, then we didn’t have w ‒hb→ r in Eᵢ when we committed r.
Since r was the first read to be committed that doesn’t see a write that happens-before it, each committed read in Eᵢ must see a write that happens-before it. Non-committed reads always sees writes that happens-before them. Thus, each read in Eᵢ sees a write that happens-before it, and there is a write w in Eᵢ that is not ordered with respect to r by happens-before ordering.
A topological sort of the actions in Eᵢ according to their happens-before edges gives a total order consistent with program order and synchronization order. This gives a total order for a sequentially consistent execution in which the conflicting accesses w and r are not ordered by happens-before edges. However, Lemma 2 shows that executions of correctly synchronized programs in which each read sees a write that happens-before it must be sequentially consistent. Therefore, this program is not correctly synchronized. This is a contradiction. □
This paragraph is from the JVM specification:
A Java Virtual Machine may permit a small but bounded amount of execution to occur before an asynchronous exception is thrown. This delay is permitted to allow optimized code to detect and throw these exceptions at points where it is practical to handle them while obeying the semantics of the Java programming language.
I'm having trouble understanding the second part, i.e. the reason the JVM lets the thread run for some time before stopping it.
Let’s recall the definition of asynchronous exceptions:
Most exceptions occur synchronously as a result of an action by the thread in which they occur. An asynchronous exception, by contrast, can potentially occur at any point in the execution of a program.
So when an exception occurs as a result of an action, you simply know that, e.g. when executing an athrow instruction an exception will occur unconditionally, when executing an integer division, the divisor may be zero, or when accessing an object member, the reference may be null. This is a limited set of actions and the optimizer tries its best to reduce it further, using code analysis to prove that the divisor can not be zero, resp. the reference can not be null at specific code locations. Otherwise, it must insert a check for the erroneous condition to generate and handle an exception if necessary. But only at these specific code locations.
In contrast, the asynchronous exception can occur at every code location and may require an explicit check of the “did another thread call stop on my thread since the last check” kind. You don’t want such checks to be performed after every instruction, as that would imply spending more time on such checks than on the actual work.
Hence, it is allowed to perform more than one instruction until the next check, as long as it is guaranteed that the time to reach the next check will be bounded, so this will rule out backward branches with an unpredictable number of iterations without a check. Also keep in mind that in optimized code, there might be uncommitted actions, e.g. the values of modified variables are held in CPU registers. So even after detecting that an asynchronous exception occurred, the code must commit these pending actions, e.g. write back these values to the shared memory, before leaving the code to respond to the exception.
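As an analogy in plain Java (not a description of how any particular JVM implements this internally), the trade-off looks roughly like the sketch below. The names PollingSketch, runUntilStopped and CHECK_INTERVAL are made up, and the interval is an arbitrary value. The loop only looks at the stop flag every CHECK_INTERVAL iterations, so the reaction to a stop request is delayed, but by a bounded amount of work.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class PollingSketch {
    static final int CHECK_INTERVAL = 1024; // arbitrary illustrative value

    // Does "work" until a stop request is noticed; returns the iterations completed.
    static long runUntilStopped(AtomicBoolean stopRequested) {
        long iterations = 0;
        while (true) {
            iterations++; // stand-in for the real work of the loop body
            // Poll only occasionally: cheap, but the delay stays bounded.
            if (iterations % CHECK_INTERVAL == 0 && stopRequested.get()) {
                return iterations;
            }
        }
    }

    // Starts a worker, requests a stop from another thread, and reports where it stopped.
    static long demo() {
        AtomicBoolean stop = new AtomicBoolean(false);
        long[] result = new long[1];
        Thread worker = new Thread(() -> result[0] = runUntilStopped(stop));
        worker.start();
        stop.set(true);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("stopped after " + demo() + " iterations");
    }
}
```

The worker always stops at a multiple of CHECK_INTERVAL, i.e. at a point the code itself chose, never in the middle of a block of work.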
I am experimenting with a game mechanic in which players can run scripts on in-game computers. Script execution will be resource limited at a gameplay level to some amount of instructions per tick.
The following proof-of-concept demonstrates a basic level of sandboxing and throttling of arbitrary user code. It successfully runs ~250 instructions of poorly crafted 'user input' and then discards the coroutine. Unfortunately, the Java process never terminates. A little investigation shows that the LuaThread created by LuaJ for the coroutine is hanging around forever.
SandboxTest.java:
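import org.luaj.vm2.Globals;
import org.luaj.vm2.LuaValue;
import org.luaj.vm2.lib.jse.JsePlatform;
public class SandboxTest {
    public static void main(String[] args) {
        Globals globals = JsePlatform.debugGlobals();
        LuaValue chunk = globals.loadfile("res/test.lua");
        chunk.call();
    }
}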
res/test.lua:
function sandbox(fn)
    -- read script and set the environment
    f = loadfile(fn, "t")
    debug.setupvalue(f, 1, {print = print})
    -- create a coroutine and have it yield every 50 instructions
    local co = coroutine.create(f)
    debug.sethook(co, coroutine.yield, "", 50)
    -- demonstrate stepped execution, 5 'ticks'
    for i = 1, 5 do
        print("tick")
        coroutine.resume(co)
    end
end
sandbox("res/badfile.lua")
res/badfile.lua:
while 1 do
    print("", "badfile")
end
The docs suggest that a coroutine that is considered unresumable will be garbage collected and an OrphanedThread exception will be thrown, signalling the LuaThread to end - but this is never happening. My question is in two parts:
Am I doing something fundamentally wrong to cause this behaviour?
If not, how should I handle this situation? From the source it appears that if I can get a reference to the LuaThread in Java I may be able to forcibly abandon it by issuing an interrupt(). Is this a good idea?
Reference: Lua / Java / LuaJ - Handling or Interrupting Infinite Loops and Threads
EDIT: I have posted a bug report over at the LuaJ SourceForge. It discusses the underlying issue (threads not being garbage collected as in the Lua spec) and suggests some ways to work around it.
It seems to be a limitation of LuaJ. I submitted a ticket earlier this year on Sourceforge as I see you've also done. The LuaThread class doesn't store references to the Java threads it creates, so you can't interrupt() those threads without modifying the LuaJ core to expose them:
new Thread(this, "Coroutine-"+(++coroutine_count)).start();
It may be dangerous to interrupt those threads without adding appropriate cleanup code to LuaJ.
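For what it's worth, this is what interrupting a parked thread looks like in plain Java. It is only a sketch and contains no LuaJ code: InterruptSketch, startParkedThread and interruptAndJoin are made-up names, and it assumes the thread is blocked in Object.wait(), which may or may not match what a LuaJ coroutine thread is doing at that moment.

```java
class InterruptSketch {
    // Starts a thread that blocks on a monitor and is never notified.
    static Thread startParkedThread(final Object lock) {
        Thread t = new Thread(() -> {
            synchronized (lock) {
                try {
                    while (true) {
                        lock.wait(); // nobody ever calls notify() in this sketch
                    }
                } catch (InterruptedException e) {
                    // interrupt() is the only way out: fall through and let run() end
                }
            }
        }, "Coroutine-sketch");
        t.start();
        return t;
    }

    // Interrupts the thread and reports whether it has terminated.
    static boolean interruptAndJoin(Thread t) {
        t.interrupt();
        try {
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        Thread parked = startParkedThread(new Object());
        System.out.println("terminated: " + interruptAndJoin(parked));
    }
}
```

Without the interrupt, this non-daemon thread would keep the JVM alive indefinitely, which is the symptom described in the question. The catch block is where any cleanup would have to go, and that is the part the library does not give you today.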
Documentation that you provided for OrphanedThread also tells us that scope is the defining condition:
"Error sublcass that indicates a lua thread that is no longer referenced has been detected. The java thread in which this is thrown should correspond to a LuaThread being used as a coroutine that could not possibly be resumed again because there are no more references to the LuaThread with which it is associated. Rather than locking up resources forever, this error is thrown, and should fall through all the way to the thread's Thread.run() method."
Your code example doesn't cause all LuaThread references to disappear, so you shouldn't expect an exception to be thrown. The CoroutineLib documentation says: "Coroutines that are yielded but never resumed to complete their execution may not be collected by the garbage collector", so, if I'm not mistaken, an OutOfMemoryError should actually be expected from the code you listed on SourceForge. LuaThread:52 also states: "Applications should not catch OrphanedThread, because it can break the thread safety of luaj.", which is yet another obstacle.
There also seem to be differences between empty and non-empty while loops in Lua/J. IIRC, empty loops (while true do end) don't obey all coroutine hook/tick rules. Because no actions occur in an empty loop, there's no opportunity for certain hooks to occur (I need to test this again so please correct me otherwise!).
A forked version of LuaJ with the functionality we're looking for is used in the ComputerCraft mod for Minecraft, though it's designed only for the mod and isn't open source.
I was trying to learn the Java Memory Model, but still cannot understand how people use it in practice.
I know that many just rely on appropriate memory barriers (as described in the Cookbook), but in fact the model itself does not operate in such terms.
The model introduces different orders defined on a set of actions and defines so called "well-formed executions".
Some people are trying to explain the memory model restrictions using one of such orders, namely "happens-before", but it seems like the order, at least by itself, does not define acceptable execution:
It should be noted that the presence of a happens-before relationship between two actions does not necessarily imply that they have to take place in that order in an implementation. If the reordering produces results consistent with a legal execution, it is not illegal
My question is: how can one verify in practice that certain code or a certain change can lead to an "illegal execution" (according to the model)?
To be more concrete, let's consider a very simple example:
public class SomeClass {
    private int a;
    private int b;
    public void someMethod() {
        a = 2; // 1
        b = 3; // 2
    }
    // other methods
}
It's clear that within the thread w(a = 2) happens before w(b = 3) according to the program order.
How can compiler/optimizer be sure that reordering 1 and 2 won't produce an "illegal execution" (strictly in terms of the model) ? And why if we set b to be volatile it will ?
Are you asking about how the VM/JIT analyzes the bytecode flow? That's far too broad to answer; entire research papers have been written about that. And what the VM actually implements may change from release to release.
Or is the question simply about which rules of the memory model govern what is "legal"? For the executing thread, the memory model already makes the strong guarantee that every action on a given thread appears to happen in program order for that thread. That means a reordering is legal if the JIT determines, by whatever method(s) it implements, that it produces the same observable result(s).
The presence of actions that establish happens-before guarantees with respect to other threads (such as volatile accesses) simply adds more constraints to the legal reorderings.
Simplified, it can be memorized like this: everything that happened earlier in program order also appears to have (already) happened to other threads once a happens-before-establishing action is executed.
For your example that means: in the case of non-volatile (a, b), only the guarantee "appears to happen in program order" (to the executing thread) needs to be upheld. So any reordering of the writes to (a, b) is legal; even delaying them until they are actually read (e.g. holding the value in a CPU register and bypassing main memory) would be valid. The JIT could even omit writing the members at all if it detects they are never actually read before the object goes out of scope (and, to be precise, there is also no finalizer using them).
Making b volatile in your example changes the constraints in that other threads reading b would also be guaranteed to see the last update of a because it happened before the write to b. Again simplified, happens-before actions extend some of the perceived ordering guarantees from the executing thread to other threads.
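A small sketch of that guarantee, using static fields so it fits in one class. The class name VolatilePublish and the method readAfterSeeingB are made up for illustration; only someMethod mirrors the question. The reader spins until it sees b == 3 and only then reads a.

```java
class VolatilePublish {
    static int a;          // plain field
    static volatile int b; // volatile field

    // Writer: same statement order as someMethod() in the question.
    static void someMethod() {
        a = 2; // 1
        b = 3; // 2 (volatile write)
    }

    // Returns the value of a observed after the reader has seen b == 3.
    static int readAfterSeeingB() {
        a = 0;
        b = 0;
        Thread writer = new Thread(VolatilePublish::someMethod);
        writer.start();
        while (b != 3) { // volatile read; spins until the write is visible
            Thread.yield();
        }
        int seenA = a; // the write a = 2 happens-before this read
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seenA;
    }

    public static void main(String[] args) {
        System.out.println("a = " + readAfterSeeingB());
    }
}
```

If b were not volatile, the spin loop would not be guaranteed to terminate, and even if it did, nothing would tie the read of a to the earlier write.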
It seems you are making the common mistake of thinking too much about low level aspects of the JMM in isolation. Regarding your question “how people use it in practice”, if you are talking about an application programmer, (s)he will use it in practice by not thinking about memory barriers or possible reorderings all the time.
Regarding your example:
public void someMethod() {
    a = 2; // 1
    b = 3; // 2
}
Given a and b are non-final, non-volatile.
It's clear that within the thread w(a = 2) happens before w(b = 3) according to the program order. How can compiler/optimizer be sure that reordering 1 and 2 won't produce an "illegal execution" (strictly in terms of the model) ?
Here, it backfires that you are focusing on re-ordering in isolation. First of all, the resulting code (of HotSpot optimization, JIT compilation, etc.) does not need to write the values to the heap memory at all. It might hold the new values in CPU registers and use them from there in subsequent operations of the same thread. Only when reaching a point where these changes have to be made visible to other threads do they have to be written to the heap, which may happen in arbitrary order.
But if, for example, the caller of the method enters an infinite loop after calling this method, the values don’t have to be written ever.
And why if we set b to be volatile it will ?
Declaring b as volatile does not guarantee that a and b are written. This is another mistake which arises from focusing on memory barriers.
Let’s go more abstract:
Suppose you have two concurrent actions, A and B. For concurrent execution in Java, there are several perfectly valid behaviors, including:
A might be executed entirely before B
B might be executed entirely before A
All or parts of A and B run in parallel
In the case where B is executed entirely before A, there is no sense in having a write barrier in A and a read barrier in B; B will still not notice any activities of A. You can draw your conclusions about different parallel scenarios from this starting point.
This is where the happens-before relationship comes into play: a write of a value to a volatile variable happens before a read of that value from that variable by another thread. If the read operation is executed before the write operation, the reading thread will not see the value and hence there’s no happens-before relationship and so there is no statement about the other variables we can make.
To stay with your example with b being volatile: if, and only if, a reading thread reads b and sees the value 3, it is guaranteed to see the value 2 (or an even more recent value if there are other writes) for a on subsequent reads.
So if a JVM can prove that there will never be a read operation on b seeing the written value, maybe because the entire instance we are modifying will never be seen by another thread, there is no happens-before relationship to be ever established, in other words, b being volatile has no impact on the allowed code transformations in this case, i.e. it might be reordered as well, or even never written to the heap at all.
So the bottom line is that it is not useful to look at a small piece of code and ask whether it will allow reordering or whether it will contain a memory barrier. This might not even be answerable as the answer might change depending on how the code is actually used. Only if your view is wide enough to see how threads will interact when accessing the data, and you can safely deduce whether a happens-before relationship will be established, can you start drawing conclusions about the correct working of the code. As you found out by yourself, correct working does not imply that you know whether reordering will happen or not on the lowest level.
I need help understanding threads in Java.
A thread is a thread of execution in a program. The Java Virtual Machine allows an application to have multiple threads of execution running concurrently.
What do we mean when we say that Java aims to be ‘Threaded’
This means that various operations can and should be executed concurrently. This can be achieved by using threads. You can either use the "low level" thread API (Thread, Runnable) or a higher level API (Timer, Executors).
I hope this is enough to start googling and learning. I'd recommend starting with the low level threading API to understand how to work with threads and synchronization. Then move on to the facilities of the concurrency package introduced in Java 1.5. Do not start with the higher level API: you need the low level to understand what happens behind the scenes when you submit a task to an executor.
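To give an idea of the difference, here is a minimal sketch that does the same small job both ways. The class and method names (ThreadApis, sumWithThread, sumWithExecutor) are just made up for this example.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadApis {
    // Low level: create, start and join a Thread yourself.
    static int sumWithThread(int n) {
        int[] result = new int[1];
        Thread t = new Thread(() -> {
            int sum = 0;
            for (int i = 1; i <= n; i++) {
                sum += i;
            }
            result[0] = sum; // visible to the caller after join()
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    // Higher level: hand the task to an executor and wait on the Future.
    static int sumWithExecutor(int n) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> future = pool.submit(() -> {
                int sum = 0;
                for (int i = 1; i <= n; i++) {
                    sum += i;
                }
                return sum;
            });
            return future.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown(); // otherwise the pool thread keeps the JVM alive
        }
    }

    public static void main(String[] args) {
        System.out.println(sumWithThread(10));
        System.out.println(sumWithExecutor(10));
    }
}
```

In the first version you manage the thread and the result hand-off yourself; in the second the executor owns the thread and the Future carries the result back.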
Threads are a popular way to implement concurrency in languages. Java has them. That's what it means.
"Java is threaded" means that Java could execute two or more jobs at the same time.
If you want to learn more about that look at Oracle Java concurrency tutorial: http://docs.oracle.com/javase/tutorial/essential/concurrency/
What do we mean when we say that Java aims to be ‘Threaded’
Well, literally we don't say that, because calling a runtime environment "threaded" means something rather different; see http://en.wikipedia.org/wiki/Threaded_code. (And note that that page takes care to distinguish between "threaded" and "multi-threaded"!)
In fact, we describe Java as being a language that supports "Multi-threaded" programming. The quotation in your question is a succinct description of what that means. A more long-winded description is as follows.
A program normally executes statements in sequence. So for example:
int i = 1;
i = i + j;
if (i < 10) {
    ...
}
In the above, the statements are executed one after another in sequence.
A thing that controls the execution of statements like that is called a "thread of control" or (more commonly) a thread. You can think of it as an automaton that executes statements one after another, and that is only capable of doing one at a time. It keeps a record of the state of the local variables and the procedure calls. (It typically uses a stack and a set of private registers to do this ... but that's an implementation detail.)
In a multi-threaded program, there are potentially many of these automatons, each executing a different sequence of statements (using its own stack and registers). Each thread is potentially able to communicate with other threads (by observing shared objects, etc.) and can synchronize with them in various ways and for various reasons.
Depending on the hardware (and the operating system), the threads may either all run on the same processor, or they may (at different times) run on different processors. It is typically a combination of the two, and it is typically up to the operating system to decide which of the threads that can do work is allowed to run. (This is handled by the thread scheduler.)
From a Java perspective, multi-threaded programming is implemented at the low level using the Thread class, synchronized methods and blocks, and the Object level wait and notify methods. Higher level APIs provide standard building blocks for solving common problems.
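As a small illustration of those low level pieces working together, here is a sketch of a one-slot mailbox built from synchronized methods plus wait and notifyAll. The class Mailbox and its demo method are invented for this example; in real code one of the higher level building blocks (a blocking queue, for instance) would normally be used instead.

```java
class Mailbox {
    private Integer slot; // guarded by the Mailbox's own monitor

    // Blocks while the slot is full, then stores the value.
    synchronized void put(int value) throws InterruptedException {
        while (slot != null) {
            wait();
        }
        slot = value;
        notifyAll(); // wake up a thread waiting in take()
    }

    // Blocks while the slot is empty, then removes and returns the value.
    synchronized int take() throws InterruptedException {
        while (slot == null) {
            wait();
        }
        int value = slot;
        slot = null;
        notifyAll(); // wake up a thread waiting in put()
        return value;
    }

    // One producer thread sends 1, 2, 3; the calling thread receives and sums them.
    static int demo() {
        Mailbox box = new Mailbox();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 3; i++) {
                    box.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < 3; i++) {
                sum += box.take();
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("sum = " + demo());
    }
}
```

The two threads communicate through the shared Mailbox object and synchronize on its monitor, which is the kind of interaction described above.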