What is the best way to simulate java.lang.Thread? - java

I'm developing the transformer for Java 6*1) that performs a kind of partial evaluation but let's consider, for simplicity, abstract-syntax-tree interpretation of a Java program.
How to simulate the Thread's behavior by an interpreted program?
At the moment I have in mind the following:
AstInterpreter should implement java.lang.Runnable. It also should rewrite every new-instance-expression of the java.lang.Thread (or its sub-class) replacing the Thread's target (java.lang.Runnable) with the new AstInterpreter instance:
EDIT: more complex examples provided.
EDIT 2: remark 1.
Target program:
class PrintDemo {
public void printCount(){
try {
for(int i = 5; i > 0; i--) {
System.out.println("Counter --- " + i );
}
} catch (Exception e) {
System.out.println("Thread interrupted.");
}
}
}
class ThreadDemo extends Thread {
private Thread t;
private String threadName;
PrintDemo PD;
ThreadDemo( String name, PrintDemo pd){
threadName = name;
PD = pd;
}
public void run() {
synchronized(PD) {
PD.printCount();
}
System.out.println("Thread " + threadName + " exiting.");
}
public void start ()
{
System.out.println("Starting " + threadName );
if (t == null)
{
t = new Thread (this, threadName);
t.start ();
}
}
}
public class TestThread {
public static void main(String args[]) {
PrintDemo PD = new PrintDemo();
ThreadDemo T1 = new ThreadDemo( "Thread - 1 ", PD );
ThreadDemo T2 = new ThreadDemo( "Thread - 2 ", PD );
T1.start();
T2.start();
// wait for threads to end
try {
T1.join();
T2.join();
} catch( Exception e) {
System.out.println("Interrupted");
}
}
}
program 1 (ThreadTest - bytecode interpreted):
new Thread( new Runnable() {
public void run(){
ThreadTest.main(new String[0]);
}
});
program 2 (ThreadTest - AST interpreted):
final com.sun.source.tree.Tree tree = parse("ThreadTest.java");
new Thread( new AstInterpreter() {
public void run(){
interpret( tree );
}
public void interpret(com.sun.source.tree.Tree javaExpression){
//...
}
});
Does the resulting program 2 simulate the Thread's behavior of the initial program 1 correctly?
1) Currently, source=8 / target=8 scheme is accepted.

I see two options:
Option 1: JVM threads. Every time the interpreted program calls Thread.start you also call Thread.start and start another thread with another interpreter. This is simple, saves you from having to implement locks and other things, but you get less control.
Option 2: simulated threads. Similar to how multitasking is implemented on uniprocessors - using time slicing. You have to implement locks and sleeps in the interpreter, and track the simulated threads to know which threads are ready to run, which have finished, which are blocked, etc.
You can execute instructions of one thread until it blocks or some time elapses or some instruction count is reached, and then find another thread which may run now and switch to running that thread. In the context of operating systems this is called process scheduling - you may want to study this topic for inspiration.

You can't do partial evaluation sensibly using a classic interpreter that computes with actual values. You need symbolic values.
For partial evaluation, what you want is to compute the symbolic program state at each program point, and then simplify the program point based on the state known at that program point. You start your partial evaluation process by writing down what you know about the state when the program starts.
If you decorated each program point with its full symbolic state and kept them all around at once, you'd run out of memory fast. So a more practical approach is to enumerate all control flow paths through a method using a depth-first search along the control flow paths, computing symbolic state as you go. When this search backtracks, it throws away the symbolic state for the last node on the current path being explored. Now your saved state is linear in the size of the depth of the flow graph, which is often pretty shallow in a method. (When a method calls another, just extend the control flow path to include the call).
To handle runnables, you have to model the interleavings of the computations in the separate runnables. Interleaving the (enormous) state of two threads will get huge fast. The one thing that might save you here is most state computed by a thread is completely local to that thread, thus is by definition invisible to another thread, and you don't have to worry about interleaving that part of the state. So we are left with simulating interleaving of state seen by both two threads, along with simulation of the local states of each thread.
You can model this interleaving by implied but simulated parallel forks in the control flow: at each simulated step, either one thread makes one step progress, or the other (generalize to N threads). What you get is a new state for each program point for each fork; the actual state for the program point is disjunction of the states generated by this process for each state.
You can simplify the actual state disjunction by taking "disjunctions" of properties of individual properties. For instance, if you know that one thread sets x to a negative number at a particular program point, and another sets it to a positive number at that same point, you can summarize the state of x as "not zero". You'll need a pretty rich type system to model possible value characterizations, or you can live with an impoverished one that computes disjunctions of properties of a variable conservatively as "don't know anything".
This scheme assumes that memory accesses are atomic. They often aren't in real code so you sort of have to model that, too. Probably best to have the interpreter simply complain your program has a race condition if you end up with conflicting read and write operations to a memory location from two threads at the "same" step. A race condition doesn't make your program wrong, but only really clever code use races in ways that aren't broken.
If this scheme is done right, when one thread A makes a call on a synchronous method on an object already in use by another thread B, you can stop interleaving A with B until B leaves the synchronous method.
If there is never interference between threads A and B over the same abstract object, you can remove the synchronized declaration from the object declaration. I think this was your original goal
All this isn't easy to organize, and it is likely very expensive time/spacewise to run. Trying to draw up an example of all this pretty laborious, so I won't do it here.
Model checkers https://en.wikipedia.org/wiki/Model_checking do a very similar thing in terms of generating the "state space", and have similar time/space troubles. If you want to know more about how to manage state do this, I'd read the literature on this.

Related

Why does java Thread execute after the independent code?

I am new to java and learning multi-threading. I wrote the following code.
class BackgroundTask implements Runnable {
private int counter = 0;
public int getCounter() {
return counter;
}
public void setCounter(int counter) {
this.counter = counter;
}
#Override
public void run() {
System.out.println("Thread started");
while (true) {
this.setCounter(this.getCounter() + 1);
}
}
}
public class Main {
public static void main(String[] args) {
BackgroundTask bgTask = new BackgroundTask();
Thread bgCount = new Thread(bgTask);
try {
bgCount.start();
System.out.println("counter in background is running");
bgCount.interrupt();
System.out.println(bgTask.getCounter());
} catch (Exception e) {
System.out.println(e.getMessage());
}
}
}
Output of the code:
counter in background is running
0
Thread started
Q.1 why is bgCount.start() executing after the print statement when it is written before it?
Q.2 Why did the thread start after calling the getCounter() method?
Edit: Thanks everyone for all the cool answers, now I understand the concept of threads.
When you do things in two different threads, they are unsynchronized unless you force some kind of synchronization. One thread might execute first or the other thread might or they might interleave in unpredictable ways. That's kind of the point of threads.
Explanation
Q.1 why is bgCount.start() executing after the print statement when it is written before it?
It is not, this is an incorrect conclusion. start() is executed exactly when you have written it. It just does not necessarily mean that the thread immediatly starts. The OS scheduler determines when to start a thread. The only guarantee that you get is that it will eventually start, so maybe now, maybe in a hour, maybe next year. Obviously, in practice it will usually be almost instant. But that does not mean it is executed as the very next thing.
Q.2 Why did the thread start after calling the getCounter() method?
As explained before, the OS scheduler decides. You just had bad luck. Seriously, you have no control over this and should not do any assumptions on this.
Multi-threading
If you have multiple threads and they have a series of operations to execute, the OS scheduler is completely free to decide how to interleave the operations. That also means that not interleaving anything is valid as well.
Lets take a look at an example with thread A and B that have 2 operations to execute each. The following orders of executions are all valid outcomes of the scheduler:
A.1
A.2
B.1
B.2
A.1
B.1
A.2
B.2
A.1
B.1
B.2
A.2
B.1
B.2
A.1
A.2
B.1
A.1
B.2
A.2
B.1
A.1
A.2
B.2
So you must not make any assumptions about when a thread starts and especially not about in which order operations are executed in regards to other threads. It might be fully interleaved, might be sequential, might be partially interleaved, everything could happen.
If you want to take control over the mechanism, the correct tool is synchronization. With that you can tell that you want to wait for a certain thing to happen first in another thread before you continue. A very simple example for your above code would be to wait until bgCount is fully done before continuing to print the count. You can do so by using join():
bgCount.start();
System.out.println("counter in background is running");
bgCount.join(); // waiting until down
System.out.println(bgTask.getCounter());
However, if you do it like that, you defeated the purpose of having a thread in the first place. There is no benefit in computing something in-parallel if you completely block the other thread and wait. Then it is basically just like executing something in the ordinary sequential way.

Memory Consistency - happens-before relationship in Java [duplicate]

This question already has answers here:
How to understand happens-before consistent
(5 answers)
Closed 4 years ago.
While reading Java docs on Memory Consistency errors. I find points related to two actions that creates happen - before relationship:
When a statement invokes Thread.start(), every statement that has a
happens-before relationship with that statement also has a
happens-before relationship with every statement executed by the new
thread. The effects of the code that led up to the creation of the
new thread are visible to the new thread.
When a thread terminates and causes a Thread.join() in another thread
to return, then all the statements executed by the terminated
thread have a happens-before relationship with all the statements
following the successful join. The effects of the code in the thread
are now visible to the thread that performed the join.
I am not able to understand their meaning. It would be great if someone explain it with a simple example.
Modern CPUs don't always write data to memory in the order it was updated, for example if you run the pseudo code (assuming variables are always stored to memory here for simplicity);
a = 1
b = a + 1
...the CPU may very well write b to memory before it writes a to memory. This isn't really a problem as long as you run things in a single thread, since the thread running the code above will never see the old value of either variable once the assignments have been made.
Multi threading is another matter, you'd think the following code would let another thread pick up the value of your heavy computation;
a = heavy_computation()
b = DONE
...the other thread doing...
repeat while b != DONE
nothing
result = a
The problem though is that the done flag may be set in memory before the result is stored to memory, so the other thread may pick up the value of memory address a before the computation result is written to memory.
The same problem would - if Thread.start and Thread.join didn't have a "happens before" guarantee - give you problems with code like;
a = 1
Thread.start newthread
...
newthread:
do_computation(a)
...since a may not have a value stored to memory when the thread starts.
Since you almost always want the new thread to be able to use data you initialized before starting it, Thread.start has a "happens before" guarantee, that is, data that has been updated before calling Thread.start is guaranteed to be available to the new thread. The same thing goes for Thread.join where data written by the new thread is guaranteed to be visible to the thread that joins it after termination.
It just makes threading much easier.
Consider this:
static int x = 0;
public static void main(String[] args) {
x = 1;
Thread t = new Thread() {
public void run() {
int y = x;
};
};
t.start();
}
The main thread has changed field x. Java memory model does not guarantee that this change will be visible to other threads if they are not synchronized with the main thread. But thread t will see this change because the main thread called t.start() and JLS guarantees that calling t.start() makes the change to x visible in t.run() so y is guaranteed to be assigned 1.
The same concerns Thread.join();
Thread visibility problems may occur in a code that isn't properly synchronized according to the java memory model. Due to compiler & hardware optimizations, writes by one thread aren't always visible by reads of another thread. The Java Memory Model is a formal model that makes the rules of "properly synchronized" clear, so that programmers can avoid thread visibility problems.
Happens-before is a relation defined in that model, and it refers to specific executions. A write W that is proven to be happens-before a read R is guaranteed to be visible by that read, assuming that there's no other interfering write (i.e. one with no happens-before relation with the read, or one happening between them according to that relation).
The simplest kind of happens-before relation happens between actions in the same thread. A write W to V in thread P happens-before a read R of V in the same thread, assuming that W comes before R according to the program order.
The text you are referring to states that thread.start() and thread.join() also guarantee happens-before relationship. Any action that happens-before thread.start() also happens before any action within that thread. Similarly, actions within the thread happen before any actions that appear after thread.join().
What's the practical meaning of that? If for example you start a thread and wait for it to terminate in a non-safe manner (e.g. a sleep for a long time, or testing some non-synchronized flag), then when you'll try reading the data modifications done by the thread, you may see them partially, thus having the risk of data inconsistencies. The join() method acts as a barrier that guarantees that any piece of data published by the thread is visible completely and consistently by the other thread.
According to oracle document, they define that The happens-before relationship is simply a guarantee that memory writes by one specific statement are visible to another specific statement.
package happen.before;
public class HappenBeforeRelationship {
private static int counter = 0;
private static void threadPrintMessage(String msg){
System.out.printf("[Thread %s] %s\n", Thread.currentThread().getName(), msg);
}
public static void main(String[] args) {
threadPrintMessage("Increase counter: " + ++counter);
Thread t = new Thread(new CounterRunnable());
t.start();
try {
t.join();
} catch (InterruptedException e) {
threadPrintMessage("Counter is interrupted");
}
threadPrintMessage("Finish count: " + counter);
}
private static class CounterRunnable implements Runnable {
#Override
public void run() {
threadPrintMessage("start count: " + counter);
counter++;
threadPrintMessage("stop count: " + counter);
}
}
}
Output will be:
[Thread main] Increase counter: 1
[Thread Thread-0] start count: 1
[Thread Thread-0] stop count: 2
[Thread main] Finish count: 2
Have a look output, line [Thread Thread-0] start count: 1 shows that all counter changes before invocation of Thread.start() are visible in Thread's body.
And line [Thread main] Finish count: 2 indicates that all changes in Thread's body are visible to main thread that calls Thread.join().
Hope it can help you clearly.

Kill an uncooperative thread in Java

Following piece is from a JUnit testcase that tests 4 different implementations of Sorter. It invokes the only method Sorter has viz sort().
I want to kill the sorting process if it takes longer than say 2 seconds (Because I don't care for any implementation that takes longer than 2 seconds to sort() say 500000 Integers).
I'm new the Java multi-threading and after looking at all other threads ( How to kill a java thread? and a few others) on SO, I figured following as solution to my problem. Question is, would it work consistently, or could there be any issues? I don't care abt the array or it's contents as reset() would reset it's contents.
Reason why I call it uncooperative is because s.sort() is out of my control.
protected E[] arr;
#Test
public void testSortTArray() {
boolean allOk = true;
for (Sorter s : TestParams.getSorters()) {
System.out.println("Testing: " + s.getName() + " with " + arrayLenToTestWith + " elems of type "
+ classOfElemType.getName());
reset();
long startTime = System.nanoTime();
MyThread test = new MyThread(s, arr);
test.start();
try {
test.join(TestParams.getTimeThreshold());
} catch (InterruptedException e) {
e.printStackTrace();
}
if (test.isAlive())
test.interrupt();
if (!test.isInterrupted()) {
System.out.println("Time taken: " + ((System.nanoTime() - startTime) / (1000000)) + "ms");
if (!isSorted(arr)) {
allOk = false;
System.err.println(s.getName() + " didn't sort array.");
}
} else {
allOk = false;
System.err.println(s.getName() + " took longer than .");
}
}
assertTrue("At least one algo didn't sort the array.", allOk);
}
public class MyThread extends Thread {
private Sorter s;
private E[] arr;
public MyThread(Sorter s, E[] arr) {
this.s = s;
this.arr = arr;
}
#Override
public void run() {
s.sort(arr);
}
}
--- edit: answer ---
Based on comments from everyone:
No. What I'm doing is not safe as Thread.interrupt() will not suspend the thread, it'll just set it's interrupted state, which if not checked by the thread's run() implementation, is useless.
In this case the next Sorter's sort() would be called on the same array (which is still being sorted by the old "interrupted" thread), thus making things unsafe.
One option is to create a separate Process instead of a Thread. A Process can be killed.
Obviously the parameter passing isn't easy in this case as it involves some IPC.
As you may have seen from the other questions you mention, it isn't possible to reliably stop a Java thread without its cooperation, because interrupt() ony works if the thread tests for it (deliberately or inadvertently).
However, it is possible to kill a process. If you spawn each sorting algorithm in a separate process, then you can kill it forcibly.
The downside is that interacting with the process is significantly harder than interacting with a thread, since you don't have shared variables.
Without a thread's cooperation, there is no reliable and safe way to stop it. With a thread's cooperation, you can interrupt or stop a thread using the mechanism it supports. Threads just don't provide this kind of isolation ... you have to use multiple processes.
This may be a case for Thread.stop(). Do read the disclaimer in the javadoc, though, in particular:
Deprecated. This method is inherently unsafe. Stopping a thread with Thread.stop causes it to unlock all of the monitors that it has locked (as a natural consequence of the unchecked ThreadDeath exception propagating up the stack). If any of the objects previously protected by these monitors were in an inconsistent state, the damaged objects become visible to other threads, potentially resulting in arbitrary behavior. Many uses of stop should be replaced by code that simply modifies some variable to indicate that the target thread should stop running. The target thread should check this variable regularly, and return from its run method in an orderly fashion if the variable indicates that it is to stop running. If the target thread waits for long periods (on a condition variable, for example), the interrupt method should be used to interrupt the wait.
would it work consistently, or could there be any issues?
It would work except that you need to handle the thread interrupt correctly. thread.interrupt() will only work if the sort method supports it. I suspect that the method will not be calling Thread.sleep(), wait(), or other such methods. Therefore it needs to test to see if it has been interrupted as it does its processing:
while (!Thread.currentThread().isInterrupted()) {
// do sort stuff
}
If it doesn't do that then interrupting the thread will not stop the processing. I would certainly add another test.join(); after the interrupt to make sure that the thread finishes before you start another sort operation.

How to share the variable between two threads in java?

I have a loop that doing this:
WorkTask wt = new WorkTask();
wt.count = count;
Thread a = new Thread(wt);
a.start();
When the workTask is run, the count will wt++ ,
but the WorkTask doesn't seems change the count number, and between the thread, the variable can't share within two thread, what did I wrote wrong? Thanks.
Without seeing the code for WorkThread it's hard to pin down the problem, but most likely you are missing synchronization between the two threads.
Whenever you start a thread, there are no guarantees on whether the original thread or the newly created thread runs first, or how they are scheduled. The JVM/operating system could choose to run the original thread to completion and then start running the newly created thread, run the newly created thread to completion and then switch back to the original thread, or anything in between.
In order to control how the threads run, you have to synchronize them explicitly. There are several ways to control the interaction between threads - certainly too much to describe in a single answer. I would recommend the concurrency trail of the Java tutorials for a broad overview, but in your specific case the synchronization mechanisms to get you started will probably be Thread.join and the synchronized keyword (one specific use of this keyword is described in the Java tutorials).
Make the count variable static (it looks like each thread has its own version of the variable right now) and use a mutex to make it thread safe (ie use the synchronized instruction)
From your description I came up with the following to demonstrate what I perceived as your issue. This code, should output 42. But it outputs 41.
public class Test {
static class WorkTask implements Runnable {
static int count;
#Override
public void run() {
count++;
}
}
public static void main(String... args) throws Exception {
WorkTask wt = new WorkTask();
wt.count = 41;
Thread a = new Thread(wt);
a.start();
System.out.println(wt.count);
}
}
The problem is due to the print statement running before thread had a chance to start.
To cause the current thread ( the thread that is going to read variable count ) to wait until the thread finishes, add the following after starting thre thread.
a.join();
If you are wishing to get a result back from a thread, I would recommend you to use Callable
interface and an ExecutorSercive to submit it. e.g:
Future future = Executors.newCachedThreadPool().submit
(new Callable<Interger>()
{
int count = 1000;
#Override public Integer call() throws Exception
{
//here goes the operations you want to be executed concurrently.
return count + 1; //Or whatever the result is.
}
}
//Here goes the operations you need before the other thread is done.
System.out.println(future.get()); //Here you will retrieve the result from
//the other thread. if the result is not ready yet, the main thread
//(current thread) will wait for it to finish.
this way you don't have to deal with the synchronization problems and etc.
you can see further about this in Java documentations:
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/package-summary.html

Do threads work with respect to their respective priority number?

Why would the compiler print 2 a and then 2 b or vice versa when giving the priority to Thread a to start? Shouldn't thread b wait for thread a to finish in order to start? Can someone please explain how does it work?
public class Test1 extends Thread{
static int x = 0;
String name;
Test1(String n) {
name = n;
}
public void increment() {
x = x+1;
System.out.println(x + " " + name);
}
public void run() {
this.increment();
}
}
public class Main {
public static void main(String args[]) {
Test1 a = new Test1("a");
Test1 b = new Test1("b");
a.setPriority(3);
b.setPriority(2);
a.start();
b.start();
}
}
Giving priorities is not a job for the compiler. It is the OS scheduler to schedule and give CPU time (called quantum) to threads.
The scheduler further tries to run as much threads at once as possible, based on the available number of CPUs. In today's multicore systems, more often than not more than one core are available.
If you want for a thread to wait for another one, use some synchronizing mechanism.
Shouldn't thread b wait for thread a to finish in order to start?
No. The priority does not block the thread execution. It only tells the JVM to execute the thread "in preference to threads with lower priority". This does imply a wait.
Since your code is so trivial, there is nothing to wait for. Any of the two threads is run.
Why would the compiler print 2 a and then 2 b?
Luck of the draw. "Priority" means different things on different operating systems, but in general, it's always part of how the OS decides which thread gets to run and which one must wait when there's not enough CPUs available to run them both at the same time. If you computer has two or more idle CPUs when you start that program, then everybody gets to run. Priority doesn't matter in that case, and its just a race to see which one gets to the println(...) call first.
The a thread in your example has an advantage because the program doesn't call b.start() until after the a.start() method returns, but how big that advantage actually is depends on the details of the OS thread scheduling algorithm. The a thread could get a huge head start (like, it actually finishes before b even starts), or it could be a near-trivial head start.

Categories

Resources