I was asked this question in an interview today.
"When we create a thread with pthread_create() (POSIX Threads), the thread starts on its own. Why do we need to explicitly call start() in Java? Why doesn't Java start the thread when we create an instance of it?"
I drew a blank, and the interviewer was short on time, so in the end he couldn't explain the reason to me.
In Java, not starting the thread right away leads to a better API. You can set properties on the thread (daemon, priority) one at a time instead of having to pass all of them to the constructor.
If the thread started right away, it would need a constructor like
public Thread(Runnable target, String name, ThreadGroup threadGroup, int priority, boolean daemon, ClassLoader contextClassLoader, long stackSize)
to allow setting all these parameters before the thread started. The daemon property can't be set after the thread has started.
I'm guessing that the POSIX API takes a struct with all the thread properties in the call to pthread_create(), so it makes sense to start the thread right away.
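To make the "configure, then start" point concrete, here is a minimal sketch. The class and method names (ConfigureThenStart, runConfigured, daemonRejectedAfterStart) are made up for illustration; only the java.lang.Thread calls are real API.

```java
import java.util.concurrent.CountDownLatch;

// Sketch only: shows configuring a Thread between construction and start().
class ConfigureThenStart {

    /** Configures a thread before starting it and returns what the thread saw about itself. */
    static String runConfigured() {
        final String[] seen = new String[1];
        Thread worker = new Thread(() -> {
            Thread self = Thread.currentThread();
            seen[0] = self.getName() + " daemon=" + self.isDaemon();
        });
        // Nothing is running yet, so the setters can be called one by one.
        worker.setName("configured-worker");
        worker.setDaemon(true);
        worker.setPriority(Thread.MIN_PRIORITY);
        worker.start();
        try {
            worker.join(); // join() also makes the write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    /** Returns true if setDaemon() is rejected once the thread is alive. */
    static boolean daemonRejectedAfterStart() {
        CountDownLatch release = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                release.await(); // keep the thread alive while we try to reconfigure it
            } catch (InterruptedException ignored) {
                // nothing to clean up in this sketch
            }
        });
        t.start();
        try {
            t.setDaemon(true);
            return false;
        } catch (IllegalThreadStateException expected) {
            return true;
        } finally {
            release.countDown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runConfigured());
        System.out.println("setDaemon rejected after start: " + daemonRejectedAfterStart());
    }
}
```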
There are many reasons, but I'll give you a few:
The thread itself might start executing before the constructor has returned the instance.
The context classloader MUST be set properly before the thread runs (see the previous point).
Extra configuration, such as priority, should be set before starting the thread.
pthreads takes a pointer to already-initialized structure(s). Since a java.lang.Thread cannot be fully initialized by the end of its constructor (see the points above), calling the native pthread_create straight from the constructor to execute the code makes no sense.
I hope you get the idea.
Thread in Java has a static method currentThread(). Does that mean that only a single thread is going to run during the smallest time slice? That would be a bit counter-intuitive to me. Can someone clarify? Thanks.
No.
The name "current thread" is an unfortunate historical artifact. It dates back to a time when computers had only one CPU. Its purpose is to return the identity of the thread that calls it; in Java, that is a reference to the caller's own Thread object. In other words, it's the method a thread can call to find itself.
Back in the day, it was easily implemented by simply returning the one thread that was currently running, because that was the only thread that could have called it.
Now that we have multi-processing systems, "current thread" is ambiguous because there could be four or sixteen or more than a hundred "current" threads. But the method always returns the thread that called it in any case.
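A small sketch of that behavior: several threads call Thread.currentThread() and each one gets its own Thread object back. The class name CurrentThreadDemo and the method selfCheck are invented for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: every thread that calls currentThread() gets a reference to itself.
class CurrentThreadDemo {

    /** Starts `count` threads; each records whether currentThread() is its own Thread object. */
    static Map<String, Boolean> selfCheck(int count) {
        Map<String, Boolean> sawItself = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[count];
        for (int i = 0; i < count; i++) {
            final int index = i;
            threads[i] = new Thread(() -> {
                Thread me = Thread.currentThread();
                sawItself.put(me.getName(), me == threads[index]);
            }, "worker-" + i);
        }
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return sawItself;
    }

    public static void main(String[] args) {
        System.out.println(selfCheck(4));
    }
}
```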
When a thread from a thread pool is reused, we see the thread-local values left over from that thread's last execution.
I understand that a ThreadLocal is part of the Thread, so it gets reused along with the thread in the pool. But my problem is that I do not want to see thread-local values set during the last execution (a valid use case for many people).
Is there a better way of clearing thread-local values when the thread goes back to the pool after the current execution?
There is no public API for clearing all of a thread's ThreadLocals at once. Nevertheless, ThreadPoolExecutor has a hook method where we can clear the ThreadLocals in a Thread using reflection before execution:
public class ThreadPoolExecutor
    protected void beforeExecute(Thread t, Runnable r) {
        ... set t.threadLocals and t.inheritableThreadLocals fields to null using reflection
    }
    ...
public class Thread
    ...
    ThreadLocal.ThreadLocalMap threadLocals;
    ThreadLocal.ThreadLocalMap inheritableThreadLocals;
    ...
You can clear the thread local map for a thread using reflection. This will clear all thread locals no matter where they are.
Ideally, you should write your code so your thread locals are not stateful like this. Stateful thread locals also make unit testing much harder.
It should be the responsibility of the "thread user" to clear the ThreadLocal - that is, the code that put the ThreadLocal in place shall also clear it. Use e.g. a try-finally-block to make sure this happens.
However, you can hack what you want! Seriously, it is not difficult - you just introspect your way into the Thread object/class, find the Map where the ThreadLocals reside, and clear it.
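The try-finally approach mentioned above could look roughly like this. It is only a sketch: CONTEXT, ThreadLocalCleanup and runTwoTasks are hypothetical names, and a single-thread executor is used so that both tasks are guaranteed to run on the same pooled thread.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch only: the code that sets a ThreadLocal also removes it in a finally block.
class ThreadLocalCleanup {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    /** Runs two tasks on the same pooled thread; the second reports any leftover value. */
    static String[] runTwoTasks() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> first = pool.submit(() -> {
                CONTEXT.set("request-1");
                try {
                    return CONTEXT.get();
                } finally {
                    CONTEXT.remove(); // clear before the thread returns to the pool
                }
            });
            Future<String> second = pool.submit(() -> {
                String leftover = CONTEXT.get();
                return leftover == null ? "clean" : leftover;
            });
            return new String[] { first.get(), second.get() };
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] results = runTwoTasks();
        System.out.println(results[0] + " / " + results[1]);
    }
}
```

Without the remove() call, the second task would see "request-1" because it runs on the same thread.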
Is it safe to start a thread pool in an object's constructor? I know that you shouldn't start a thread from a constructor, something about the "this" pointer escaping (I don't exactly understand this, but will do some more searches to try and figure it out).
The code would look something like this:
private ExecutorService pool;

public handler()
{
    pool = Executors.newCachedThreadPool();
}

public void queueInstructionSet(InstructionSet set)
{
    pool.submit(new Runnable that handles this instruction set);
}
If that doesn't work, I could just make this class a Runnable and start it in a new thread. However, that seems like it would add an unnecessary thread to the program where it doesn't really need one.
Thanks.
EDIT:
Thanks for the replies everyone, they definitely helped make sense of this.
As for the code, in my mind it makes sense that this constructor creates the thread pool, but let me explain what specifically this code is doing, because I may be thinking about this in a weird way.
The entire point of this object is to take "Instruction Sets" objects, and act on them accordingly. The instruction sets come from clients connected to a server. Once a full instruction set is received from a client, that instruction set is sent to this object (handler) for processing.
This handler object contains a reference to every object that an instruction set can act upon. It will submit the instruction set to a thread pool, which will find which object this instruction set wants to interact with, and then handle the instruction set on that object.
I could handle the instruction set object in the IO server, but my thinking is that having a separate class for it makes the entire code more readable, as each class focuses on doing only one specific thing.
Thoughts? Advice?
Thanks
Your sample code doesn't let "this" escape at all. It's reasonably safe to start a new thread in a constructor (and even use this as the Runnable, which you don't in this example) so long as you're sure that you've already initialized the object as far as the new thread will need it. For example, setting a final field which the new thread will rely on after starting the thread would be a really bad idea :)
Basically letting the "this" reference escape is generally nasty, but not universally so. There are situations in which it's safe. Just be careful.
Having said that, making a constructor start a thread might be seen as doing too much within the constructor. It's hard to say whether it's appropriate in this case or not - we don't know enough about what your code is doing.
EDIT: Yup, having read the extra information, I think this is okay. You should probably have a method to shut down the thread pool as well.
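For what it's worth, a shutdown method along those lines might look like the sketch below. InstructionHandler, handledCount and the use of a plain Runnable as a stand-in for your InstructionSet are my assumptions, not your actual code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: a handler that owns its pool and can shut it down.
class InstructionHandler {
    private final ExecutorService pool;
    private final AtomicInteger handled = new AtomicInteger();

    InstructionHandler() {
        // Only creates the pool; no task runs yet, so "this" does not escape here.
        pool = Executors.newCachedThreadPool();
    }

    /** Queues one unit of work (a Runnable stands in for an instruction set). */
    void queueInstructionSet(Runnable instructionSet) {
        pool.submit(() -> {
            instructionSet.run();
            handled.incrementAndGet();
        });
    }

    /** Stops accepting work and waits for queued tasks; returns true if all finished in time. */
    boolean shutdown(long timeoutMillis) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int handledCount() {
        return handled.get();
    }

    public static void main(String[] args) {
        InstructionHandler handler = new InstructionHandler();
        for (int i = 0; i < 3; i++) {
            handler.queueInstructionSet(() -> { });
        }
        System.out.println("finished: " + handler.shutdown(5000)
                + ", handled: " + handler.handledCount());
    }
}
```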
I agree with Jon.
Furthermore, let me point out that you're not actually starting any actions on the thread pool in the constructor. You're instantiating the thread pool, but it has no tasks to run at that point. Therefore, as written, nothing is going to start operating on this instance before it finishes construction.
It sounds like the thread pool would be owned and used by the object; threads wouldn't be passed out of the object. If that's the case, it shouldn't be an issue.
Constructors create an object and initialize its state. I can't imagine a use case where long-running processes are required to do so.
I can see where an object might interact with a thread pool to accomplish a task, but I don't see the need for that object to own the thread pool.
More details might help.
I think it's OK to start a thread pool in the constructor of the object as long as that object fully manages the lifetime of that thread pool.
If you go this path, you will have to work extra hard to provide the following guarantees:
If your constructor throws any exception (both Runtime and checked), you must have cleanup code in the constructor that shuts down the thread pool. If you don't do this and create a thread pool with non-daemon threads then, for example, a little console program that uses your object may stay up forever, leaking valuable system resources.
You need to provide what I call a destructor method, similar to close in Java I/O. I usually call it releaseResources. Notice that finalize is not a substitute for this method, because it is called by the GC, and for an object with a reasonably small memory footprint it may never be called.
When using this object, follow this pattern:
MyThreadPoolContainer container =
    new MyThreadPoolContainer( ... args to initialize the object... );
try
{
    methodThatUsesContainer( container );
}
finally
{
    container.releaseResources( );
}
Document that the object's constructor allocates limited resources and that the destructor method has to be called explicitly to prevent leaks.
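As a rough sketch of the first guarantee (cleanup when the constructor fails), assuming a hypothetical failDuringInit flag that simulates an initialization error:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch only: the constructor shuts the pool down again if later initialization fails.
class MyThreadPoolContainer {
    private final ExecutorService pool;

    MyThreadPoolContainer(boolean failDuringInit) {
        pool = Executors.newFixedThreadPool(2);
        try {
            // Stand-in for the rest of the initialization, which may throw.
            if (failDuringInit) {
                throw new IllegalStateException("simulated initialization failure");
            }
        } catch (RuntimeException e) {
            pool.shutdownNow(); // do not leave the pool behind when construction fails
            throw e;
        }
    }

    /** The "destructor method": callers invoke it from a finally block. */
    void releaseResources() {
        pool.shutdown();
    }

    boolean isReleased() {
        return pool.isShutdown();
    }

    public static void main(String[] args) {
        MyThreadPoolContainer container = new MyThreadPoolContainer(false);
        try {
            System.out.println("released before use: " + container.isReleased());
        } finally {
            container.releaseResources();
        }
        System.out.println("released after use: " + container.isReleased());
    }
}
```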
I am building a console Sudoku Solver where the main objective is raw speed.
I now have a ManagerThread that starts WorkerThreads to compute the neighbors of each cell. So one WorkerThread is started for each cell right now. How can I re-use an existing thread that has completed its work?
The Thread Pool Pattern seems to be the solution, but I don't understand what to do to prevent the thread from dying once its job has been completed.
P.S.: I do not expect to gain much performance for this particular task; I just want to experiment with how multi-threading works before applying it to the more complex parts of the code.
Thanks
Have a look at the Java SE provided java.util.concurrent API. You can create a threadpool using Executors#newFixedThreadPool() and you can submit tasks using the ExecutorService methods. No need to reinvent your own threadpool. Also see the Sun tutorial on the subject.
When using a thread pool (java.util.concurrent), you never actually instantiate a thread yourself; instead you pass Runnables to the thread pool.
You don't need to worry about the thread life-cycle. Just do whatever work you need to do in the Runnable and let it return when it's done.
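A minimal sketch of that idea: many small tasks are handed to a fixed pool and the pool's few threads are reused for all of them. PoolReuseDemo and sumOfSquares are made-up names, and squaring a number stands in for the per-cell work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch only: taskCount tasks share poolSize reusable threads.
class PoolReuseDemo {

    /** Submits one task per number 1..taskCount and adds up the results. */
    static int sumOfSquares(int taskCount, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 1; i <= taskCount; i++) {
                final int n = i;
                tasks.add(() -> n * n); // placeholder for the real per-cell computation
            }
            int total = 0;
            // invokeAll() blocks until every task has completed.
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 81 tasks (one per Sudoku cell) on 4 threads.
        System.out.println(sumOfSquares(81, 4));
    }
}
```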
Have a look at using CyclicBarrier for synchronization: http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/CyclicBarrier.html
Well, if I had to code this logic myself instead of using a package like Quartz from OpenSymphony, I would do the following:
I'd have a WorkerThread which extends Thread. This class would also have a private property called runnable, of type Runnable. This property holds a reference to the code you'd like to execute. Give it a public setter.
The thread's main code starts by running the runnable you initialized it with and then switches to a wait state. Before doing that, it signals to the pool manager that it has finished and can be returned to the pool. The next time you need a thread, you pick one from the pool, call setRunnable, which sets the runnable property, and then wake up the thread. It goes back to work in its infinite loop: execute the runnable, then return to the wait state.
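Here is one way the worker described above could be sketched with wait/notifyAll. It is not a full pool: there is no pool manager, ReusableWorker, shutdown and demo are names I made up, and a job that throws would kill the worker. For real code, the java.util.concurrent executors are the safer choice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: one thread that waits for jobs instead of dying after the first one.
class ReusableWorker extends Thread {
    private final Object lock = new Object();
    private Runnable runnable; // pending or running job, guarded by lock
    private boolean stopped;   // guarded by lock

    /** Hands the next job to this worker, waiting first if it is still busy. */
    void setRunnable(Runnable next) throws InterruptedException {
        synchronized (lock) {
            while (runnable != null) {
                lock.wait();
            }
            runnable = next;
            lock.notifyAll(); // wake the worker up
        }
    }

    /** Lets the worker finish any pending job and then exit. */
    void shutdown() {
        synchronized (lock) {
            stopped = true;
            lock.notifyAll();
        }
    }

    @Override
    public void run() {
        while (true) {
            Runnable job;
            synchronized (lock) {
                while (runnable == null && !stopped) {
                    try {
                        lock.wait(); // the "wait state" between jobs
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                if (runnable == null) {
                    return; // stopped and nothing left to do
                }
                job = runnable;
            }
            job.run();
            synchronized (lock) {
                runnable = null;  // mark this worker as idle again
                lock.notifyAll(); // wake up a caller blocked in setRunnable()
            }
        }
    }

    /** Runs three jobs on one worker and returns the thread name each job saw. */
    static List<String> demo() {
        List<String> ranOn = Collections.synchronizedList(new ArrayList<>());
        ReusableWorker worker = new ReusableWorker();
        worker.setName("reusable-worker");
        worker.start();
        try {
            for (int i = 0; i < 3; i++) {
                worker.setRunnable(() -> ranOn.add(Thread.currentThread().getName()));
            }
            worker.shutdown();
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ranOn;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```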
I am writing a Java program which tracks threads as they are created in a program and is then supposed to perform some work as each Thread terminates.
I don't see any 'thread termination hooks' out there in the javadoc.
Currently the only way I can think of to achieve my requirement is to hold on to the thread objects and query its 'state' at repeated intervals.
Is there any better way to do this?
Edit:
I cannot wrap the runnable or modify the runnable in any way.
My code uses runtime instrumentation and just detects that a thread is created and gets a reference to the Thread object.
The runnable is already running at this point.
You can use the join() method.
EDIT
If your main thread must not block until the threads have terminated, you can create a secondary thread which starts the threads and then waits for them with the join() method.
I see four possible methods.
Use your own Thread subclass with an overridden run() method. Add a finally block for thread termination.
Use a Runnable with similar decoration, perhaps as a wrapper around the supplied Runnable. A variant of this is to subclass Thread in order to apply this wrapper at construction time.
Create a 2nd thread to join() on the real thread and thus detect its termination.
Use instrumentation to rewrite the Thread.run() method as above.
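The third option (a second thread that join()s on the real one) needs no changes to the watched thread's Runnable. A minimal sketch, assuming the target thread has already been started; TerminationWatcher, watch and demo are hypothetical names:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch only: a daemon "watcher" thread that fires a callback when the target terminates.
class TerminationWatcher {

    /** Starts a daemon thread that joins on target and then runs onExit. */
    static Thread watch(Thread target, Runnable onExit) {
        Thread watcher = new Thread(() -> {
            try {
                // Assumes target was already started; join() returns at once for an unstarted thread.
                target.join();
                onExit.run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "watcher-of-" + target.getName());
        watcher.setDaemon(true); // the watcher must not keep the JVM alive
        watcher.start();
        return watcher;
    }

    /** Returns true if the callback fired after a short-lived thread ended. */
    static boolean demo() {
        CountDownLatch notified = new CountDownLatch(1);
        Thread target = new Thread(() -> { }, "short-lived");
        target.start();
        watch(target, notified::countDown);
        try {
            return notified.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("termination detected: " + demo());
    }
}
```

The cost is one extra (mostly sleeping) thread per watched thread.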
Just poking around in the (sun 1.5) source code for java.lang.Thread and sun.misc.VM, there is a field in thread called threadStatus. It is a private int and its values map to the enum java.lang.Thread.State. I have not verified this, nor determined how quickly it occurs if it does, but when a thread eventually terminates, this value will be set to java.lang.Thread.State.TERMINATED.
With this relatively simple condition to detect, I think it would be fairly straightforward to inject a field interceptor on threadStatus to fire an event when the field is set to a specific target value.
You could write a decorator for Runnable which calls a termination hook and wrap your thread code in it when you create the threads.
If you added a try/finally block to each run method, the code inside would be executed when each thread completed. Let the thread be responsible for its own clean-up.
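A decorator with a try/finally could be sketched like this. HookedRunnable and its onExit callback are names invented for the example, and it only helps when you control how the threads are created.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: wraps a Runnable so that a hook runs when it finishes, normally or not.
class HookedRunnable implements Runnable {
    private final Runnable delegate;
    private final Runnable onExit;

    HookedRunnable(Runnable delegate, Runnable onExit) {
        this.delegate = delegate;
        this.onExit = onExit;
    }

    @Override
    public void run() {
        try {
            delegate.run();
        } finally {
            onExit.run(); // runs even if delegate throws
        }
    }

    /** Runs one normal and one failing task; returns how often the hook fired. */
    static int demo() {
        AtomicInteger exits = new AtomicInteger();
        Thread ok = new Thread(new HookedRunnable(() -> { }, exits::incrementAndGet));
        Thread failing = new Thread(new HookedRunnable(() -> {
            throw new RuntimeException("simulated failure");
        }, exits::incrementAndGet));
        failing.setUncaughtExceptionHandler((thread, error) -> { }); // keep the demo output quiet
        ok.start();
        failing.start();
        try {
            ok.join();
            failing.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return exits.get();
    }

    public static void main(String[] args) {
        System.out.println("hook fired " + demo() + " times");
    }
}
```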
AspectJ could help you do this if you needed to inject code into third-party compiled code, but apparently it doesn't work on standard Java class libraries.
Looks like there's a whitepaper on doing this here, but there's no telling if it's practical. I think you have to pay for it.
http://portal.acm.org/citation.cfm?doid=1411732.1411754
You could download OpenJDK, put the hook in yourself, compile a custom JRE and ship that with your application :)
As you say, there are no thread termination hooks. You have to code them yourself; call some method on a controller at the end of the run() method of your Runnables (AFAIK subclassing Thread is considered bad practice, you should implement Runnable and create a Thread with that Runnable as its target).
You can also implement an UncaughtExceptionHandler to know if a thread terminated abnormally due to an exception, in which case your controller's method won't be called.
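A short sketch of the UncaughtExceptionHandler part; CrashReporting and demo are made-up names, while setUncaughtExceptionHandler is the real java.lang.Thread method.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch only: detects that a thread died because of an uncaught exception.
class CrashReporting {

    /** Starts a thread that throws and returns what the handler recorded. */
    static String demo() {
        AtomicReference<String> report = new AtomicReference<>();
        Thread crasher = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "crasher");
        // The handler is invoked on the dying thread, before it terminates.
        crasher.setUncaughtExceptionHandler(
                (thread, error) -> report.set(thread.getName() + ": " + error.getMessage()));
        crasher.start();
        try {
            crasher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return report.get();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the handler only covers abnormal termination; a thread that returns normally from run() never triggers it.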
If you run on Java 1.5 you can probably do it using java.lang.instrument and the -javaagent option to the JVM.
Redefine the run method on the thread object so that it calls your code. You already seem to use instrumentation, so it should be available. As it modifies bytecode at runtime, you should be fine.
That said, it is hard to provide a more specific and detailed answer, since your question lacks at least the JVM version and the main frameworks in use (think spring-aop, jboss-aop, etc.).