Is it safe to start a thread pool in an object's constructor? I know that you shouldn't start a thread from a constructor, because of something about the "this" pointer escaping (I don't fully understand this yet, but I will do some more searching to try to figure it out).
The code would look something like this:
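import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Handler
{
    private final ExecutorService pool;

    public Handler()
    {
        pool = Executors.newCachedThreadPool();
    }

    public void queueInstructionSet(InstructionSet set)
    {
        // Placeholder task: a Runnable that handles this instruction set.
        pool.submit(() -> { /* handle set here */ });
    }
}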
If that doesn't work, I could just make this class a Runnable and start it in a new thread. However, that seems like it would add an unnecessary thread to the program where it doesn't really need one.
Thanks.
EDIT:
Thanks for the replies everyone, they definitely helped make sense of this.
As for the code, in my mind it makes sense that this constructor creates the thread pool, but let me explain what specifically this code is doing, because I may be thinking about this in a weird way.
The entire point of this object is to take "Instruction Set" objects and act on them accordingly. The instruction sets come from clients connected to a server. Once a full instruction set is received from a client, that instruction set is sent to this object (handler) for processing.
This handler object contains a reference to every object that an instruction set can act upon. It will submit the instruction set to a thread pool, which will find which object this instruction set wants to interact with, and then handle the instruction set on that object.
I could handle the instruction set object in the IO server, but my thinking is that having a separate class for it makes the entire code more readable, as each class focuses on doing only one specific thing.
Thoughts? Advice?
Thanks
Your sample code doesn't let "this" escape at all. It's reasonably safe to start a new thread in a constructor (and even use this as the Runnable, which you don't in this example) so long as you're sure that you've already initialized the object as far as the new thread will need it. For example, setting a final field which the new thread will rely on after starting the thread would be a really bad idea :)
Basically letting the "this" reference escape is generally nasty, but not universally so. There are situations in which it's safe. Just be careful.
Having said that, making a constructor start a thread might be seen as doing too much within the constructor. It's hard to say whether it's appropriate in this case or not - we don't know enough about what your code is doing.
EDIT: Yup, having read the extra information, I think this is okay. You should probably have a method to shut down the thread pool as well.
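A minimal sketch of what such a shutdown method could look like. The class name InstructionHandler, the String stand-in for the question's InstructionSet type, the handled counter, and the 5-second timeout are all assumptions made for illustration, not part of the original code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the question's handler class.
class InstructionHandler {
    private final ExecutorService pool;
    private final AtomicInteger handled = new AtomicInteger();

    InstructionHandler() {
        // Creating the pool runs no task, so "this" does not escape here.
        pool = Executors.newCachedThreadPool();
    }

    // The String argument stands in for the question's InstructionSet type.
    void queueInstructionSet(String instructionSet) {
        pool.submit(() -> { handled.incrementAndGet(); });
    }

    // Stops accepting new work and waits for queued tasks to finish.
    // Returns true if the pool terminated within the (arbitrary) timeout.
    boolean shutdown() {
        pool.shutdown();
        try {
            return pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int handledCount() {
        return handled.get();
    }
}
```

The server would call shutdown() once, when it stops accepting clients, so that the pool's non-daemon threads do not keep the JVM alive.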
I agree with Jon.
Furthermore, let me point out that you're not actually starting any actions on the thread pool in the constructor. You're instantiating the thread pool, but it has no tasks to run at that point. Therefore, as written, nothing will start operating on this instance before it finishes construction.
It sounds like the thread pool would be owned and used by the object; threads wouldn't be passed out of the object. If that's the case, it shouldn't be an issue.
Constructors create an object and initialize its state. I can't imagine a use case where long-running processes are required to do so.
I can see where an object might interact with a thread pool to accomplish a task, but I don't see the need for that object to own the thread pool.
More details might help.
I think it's OK to start a thread pool in the constructor of the object as long as that object fully manages the lifetime of that thread pool.
If you go this path, you will have to work extra hard to provide the following guarantees:
If your constructor throws any exception (either runtime or checked), you must have cleanup code in the constructor that shuts down the thread pool. If you don't do this and create a thread pool with non-daemon threads then, for example, a little console program that uses your object may stay up forever, leaking valuable system resources.
You need to provide what I call a destructor method, similar to close in Java I/O. I usually call it releaseResources. Notice that finalize is not a substitute for this method, because it is called by the GC, and for an object with a reasonably small memory footprint it may never be called.
When using this object, follow this pattern:
MyThreadPoolContainer container =
new MyThreadPoolContainer( ... args to initialize the object... );
try
{
methodThatUsesContainer( container );
}
finally
{
container.releaseResources( );
}
Document that the object's constructor allocates limited resources and that the destructor method has to be called explicitly to prevent leaking them.
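The guarantees listed above could be sketched roughly as follows. The failInConstructor flag and the initialize helper are hypothetical and exist only to simulate a constructor that fails after the pool was created; implementing AutoCloseable is an additional assumption that lets callers use try-with-resources instead of the explicit try/finally.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical container that owns its thread pool.
class MyThreadPoolContainer implements AutoCloseable {
    private final ExecutorService pool;

    MyThreadPoolContainer(boolean failInConstructor) {
        pool = Executors.newFixedThreadPool(2);
        try {
            initialize(failInConstructor);
        } catch (RuntimeException | Error e) {
            // Constructor is about to fail: shut the pool down before rethrowing.
            pool.shutdownNow();
            throw e;
        }
    }

    // Stand-in for the rest of the initialization, which may throw.
    private static void initialize(boolean fail) {
        if (fail) {
            throw new IllegalStateException("simulated initialization failure");
        }
    }

    // The "destructor method" from the answer.
    void releaseResources() {
        pool.shutdown();
    }

    @Override
    public void close() {
        releaseResources();
    }

    boolean isReleased() {
        return pool.isShutdown();
    }
}
```

With this shape, `try (MyThreadPoolContainer c = new MyThreadPoolContainer(false)) { ... }` releases the pool even if the body throws.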
Related
I am considering extending java.util.Timer and completely overriding all public methods, to use a different implementation. The one "problem" I see is that Timer instantiates and starts a Thread in its constructor, which I cannot use because it is "private". So I would like to not waste the "resources" used up by that Thread. I see at least one thing I could do, which is to call super.cancel() directly in the subclass constructor, thereby immediately closing the thread.
My question is: When are the "resources" of a java.lang.Thread allocated and released?
Allocation: At instance instantiation, or at call of start()?
Release: At "end of run()" or at instance GC time?
If it's JVM implementation specific, I'd like to know how the Oracle JVM does it?
Generally, when you instantiate an object, memory is allocated for it, and a Thread object is no exception; the object itself does not use a lot of memory, though. The expensive resources, chiefly the native thread and its stack, are allocated when you call start(), which is also when run() begins executing on the new thread. On HotSpot, those native resources are released once run() has returned, and the Thread object and anything else that is otherwise unreferenced will eventually be reclaimed by the garbage collector. So, if you ask me, I think your approach of stopping the Thread is good: this way only the Thread object remains in memory, along with any other objects you still reference.
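As a small illustration of the lifecycle described above, the sketch below only observes Thread.getState(); it does not measure native memory, and the class name ThreadLifecycleDemo is made up for this example.

```java
class ThreadLifecycleDemo {
    // Returns the thread's state before start() and after join().
    static Thread.State[] observeStates() {
        Thread t = new Thread(() -> { /* no work */ });
        // Before start(): just a heap object, no thread is running yet.
        Thread.State before = t.getState();
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // After an uninterrupted join(): run() has returned.
        Thread.State after = t.getState();
        return new Thread.State[] { before, after };
    }
}
```

A thread that was never started stays in the NEW state, and one whose run() has returned is TERMINATED, which matches the allocation and release points discussed in the answer.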
In my app I need to execute different future tasks.
My call would be something like
public Item getTaskResult(){
//creating the task object named task
Executors.newCachedThreadPool().execute(task);
....
}
Is it wrong to just call Executors.newCachedThreadPool() ?
Should I keep a reference to it? Am I wasting resources by doing it this way?
You should probably have only one CachedThreadPool in your whole application. That way the resources associated with the pool are shared, and you also benefit from better thread reuse.
Creating a thread pool every time is a costly operation. Therefore create it once and use it as much as you want.
Take it this way: In your house, would you want to create a new swimming pool every time you need to swim? Create just one CachedThreadPool and use it.
The worst problem with your demonstrated code is that it has a resource leak. The thread pool will not be automatically shut down, nor its threads killed, just because it has become unreachable. You may observe your thread count growing without bound, until finally you get an OutOfMemoryError: unable to create new native thread.
You can legally submit a task to a new thread pool and immediately call shutdown on it. This will work correctly, although it is not the most performant option.
On a different level of approaching this issue, thread pools are not designed to be used in such an ephemeral fashion. You are degrading the pool to what a raw Thread instance would do, where the main point of using a thread pool is... well, pooling threads, which are expensive system resources. This is why a global singleton is the preferred approach to using the Executor Service.
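A minimal sketch of the shared-pool approach, assuming a hypothetical holder class SharedPool and a placeholder task that just doubles its input; the question's Item type and task object are not reproduced here.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical application-wide holder for a single pool.
class SharedPool {
    static final ExecutorService POOL = Executors.newCachedThreadPool();

    // Submits a placeholder task to the shared pool and waits for its result.
    static int getTaskResult(int input) {
        Future<Integer> future = POOL.submit(() -> input * 2);
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

Every caller reuses the same threads, and the application calls SharedPool.POOL.shutdown() once when it exits.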
When a thread from a thread pool is reused, we get the thread-local variable's value from the thread's last execution.
I understand that a ThreadLocal is part of the Thread, so it is reused along with the thread when we use a thread pool. But my problem is that I do not want to see the thread-local value set in the last execution (a valid use case for many people).
Is there any better way of clearing thread-local values when a thread goes back to the pool after its current execution?
There is no way to clear all of a thread's ThreadLocals using the public API. Nevertheless, ThreadPoolExecutor has a hook method, beforeExecute, in which we can clear the ThreadLocals of the Thread using reflection before execution:
public class ThreadPoolExecutor
protected void beforeExecute(Thread t, Runnable r) {
... set t.threadLocals and t.inheritableThreadLocals fields to null using reflection
}
...
public class Thread
...
ThreadLocal.ThreadLocalMap threadLocals;
ThreadLocal.ThreadLocalMap inheritableThreadLocals;
...
You can clear the thread local map for a thread using reflection. This will clear all thread locals no matter where they are.
Ideally, you should write your code so your thread locals are not stateful like this. Stateful thread locals make unit testing much harder.
It should be the responsibility of the "thread user" to clear the ThreadLocal - that is, the code that put the ThreadLocal in place shall also clear it. Use e.g. a try-finally-block to make sure this happens.
However, you can hack what you want! Seriously, it is not difficult - you just introspect your way into the Thread object/class, find the Map where the ThreadLocals reside, and clear it.
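The try-finally approach recommended above can be sketched as follows, without any reflection. The names ThreadLocalCleanupDemo and CONTEXT and the value "request-1" are invented for the example; a single-thread executor is used so that both tasks are guaranteed to run on the same pooled thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalCleanupDemo {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs two tasks on the same pooled thread and returns what the second one sees.
    static String valueSeenBySecondTask() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(() -> {
                CONTEXT.set("request-1");
                try {
                    // ... work that reads CONTEXT goes here ...
                } finally {
                    // The code that set the value also clears it.
                    CONTEXT.remove();
                }
            }).get();
            Callable<String> read = CONTEXT::get;
            return pool.submit(read).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the first task removes its value in the finally block, the second task reads the ThreadLocal's initial value (null here) rather than a leftover from the previous execution.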
This is homework.
I do not want the solution, just a small number of links or ideas.
Simply speaking what I want to do is,
Simple example :
public class Example
{
public void method()
{
int x = doThat();
//Call other methods which do not depend on x
return;
}
}
doThat() is a method that is known to be time consuming, which results in my program blocking until the results are back. I want to use other methods of this object, but the program is frozen until doThat() is finished. Those other methods do not necessarily have to be invoked from the method() used in this example; they may be invoked from outside the object.
I thought about using threads, but if I have a huge number of objects (1000+) this probably won't be very efficient (correct me if I am wrong, please). I guess that if I use threads I have to use one thread per object?
Is there any other way besides threads that can make the invoking object not block when calling doThat(); ? If threading is the only way, could you provide a link ?
Knowing questions like that get downvoted I will accept any downvotes. But please just a link would be more than great.
Thanks in advance. I hope question is inline with the rules.
I'd also use threads for this, but I simply wanted to add that it would probably be interesting to look at java.util.concurrent.Executors (to create thread pools as you have a number of objects) and the java.util.concurrent.Future and java.util.concurrent.Callable classes which will allow you to launch threads that can return a value.
Take a look at the concurrency tutorial for more info.
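Since the question asked for pointers rather than a solution, the following is only a rough sketch of how Executors, Callable, and Future fit together. The class name BackgroundExample, the pool size of 4, the 50 ms sleep, and the return value 42 are all placeholders.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BackgroundExample {
    // One small pool shared by all instances, instead of one thread per object.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(4);

    // Stand-in for the slow method from the question.
    private int doThat() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 42;
    }

    // Starts doThat() on a pool thread and returns immediately.
    Future<Integer> startDoThat() {
        Callable<Integer> task = this::doThat;
        return POOL.submit(task);
    }

    // Does other work first and blocks only when the result is needed.
    int method() {
        Future<Integer> pending = startDoThat();
        // ... call other methods which do not depend on x ...
        try {
            return pending.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    static void shutdown() {
        POOL.shutdown();
    }
}
```

With 1000+ objects, all of them share the four pool threads; tasks that cannot run yet simply wait in the pool's queue.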
I recommend that you create a class that implements Runnable, whose run method does what doThat() does in your sample. Then you can invoke it in a separate Thread in a simple way. Java's Thread class has a constructor that takes a Runnable. Use the start and join methods (calling run directly would execute the code on the current thread instead of a new one).
Cheers
Matthias
Of course threads are the only solution to handle some jobs in backgrounds, but
you are not forced to use a thread just for a single operation to be performed.
You can use only one thread that maintains a queue of operations to be performed, in a way that every call to the method doThat adds a new entry into the queue.
Maybe some design patterns like "Strategy" can help you to generalize the concept of operation to be performed, in order to store "operation objects" into the thread's queue.
You want to perform several things concurrently, so using threads is indeed the way to go. The Java tutorial concurrency lesson will probably help you.
1000 concurrent threads will impose a heavy memory load, because a certain amount of stack memory is reserved for each thread (typically 512 KB to 1 MB by default on HotSpot, adjustable with -Xss). If, however, you can somehow make sure there will be only one thread running at a time, you can still take the thread-per-object approach. This would require you to ensure that doThat() is only called if the thread produced by a former invocation on another object has already finished.
If you cannot ensure that easily, the other approach would be to construct one worker thread which reads from a double ended queue which object to work on. The doThat() method would then just add this to the end of the queue, from which the worker thread will later extract it. You have to properly synchronize when accessing any data structure from concurrent threads. And the main thread should somehow notify the worker thread of the condition, that it will not add any more objects to the queue, so the worker thread can cleanly terminate.
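The single-worker design described above might look roughly like this. It swaps the double-ended queue for a java.util.concurrent.LinkedBlockingQueue, which handles the synchronization, and uses a sentinel job (STOP) as the "no more work" notification; the class name QueueWorker and its method names are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueWorker {
    // Sentinel job that tells the worker no more work will arrive.
    private static final Runnable STOP = () -> { };

    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(this::drain);

    void start() {
        worker.start();
    }

    // Worker loop: take jobs in FIFO order until the sentinel is seen.
    private void drain() {
        try {
            for (Runnable job = queue.take(); job != STOP; job = queue.take()) {
                job.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Called by other threads; returns immediately.
    void enqueue(Runnable job) {
        queue.add(job);
    }

    // Signals the end of input and waits for the worker to terminate.
    void finish() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In this arrangement doThat() would only call enqueue(...), so the caller never blocks on the slow work itself.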
The stop(), suspend(), and resume() methods in java.lang.Thread are deprecated because they are unsafe. The Oracle-recommended workaround is to use Thread.interrupt(), but that approach doesn't work in all cases. For example, if you are calling a library method that doesn't explicitly or implicitly check the interrupted flag, you have no choice but to wait for the call to finish.
So, I'm wondering if it is possible to characterize situations where it is (provably) safe to call stop() on a Thread. For example, would it be safe to stop() a thread that did nothing but call find(...) or match(...) on a java.util.regex.Matcher?
(If there are any Oracle engineers reading this ... a definitive answer would be really appreciated.)
EDIT: Answers that simply restate the mantra that you should not call stop() because it is deprecated, unsafe, whatever, are missing the point of this question. I know that it is genuinely unsafe in the majority of cases, and that if there is a viable alternative you should always use that instead.
This question is about the subset cases where it is safe. Specifically, what is that subset?
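For contrast with stop(), here is a minimal sketch of the cooperative, interrupt-based approach mentioned in the question. It only works because the loop itself checks the interrupted flag, which is exactly what uncooperative library code does not do. The class name InterruptibleWorker and the 5-second join timeout are arbitrary choices for this example.

```java
class InterruptibleWorker {
    // Starts a looping thread, interrupts it, and reports whether it exited.
    static boolean stopsOnInterrupt() {
        Thread t = new Thread(() -> {
            // Cooperative cancellation: poll the flag between units of work.
            while (!Thread.currentThread().isInterrupted()) {
                // ... one small unit of work per iteration ...
            }
        });
        t.start();
        t.interrupt();
        try {
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive();
    }
}
```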
Here's my attempt at answering my own question.
I think that the following conditions should be sufficient for a single thread to be safely stopped using Thread.stop():
1. The thread execution must not create or mutate any state (i.e. Java objects, class variables, external resources) that might be visible to other threads in the event that the thread is stopped.
2. The thread execution must not use notify to any other thread during its normal execution.
3. The thread must not start or join other threads, or interact with them using stop, suspend or resume.
(The term thread execution above covers all application-level code and all library code that is executed by the thread.)
The first condition means that a stopped thread will not leave any external data structures or resources in an inconsistent state. This includes data structures that it might be accessing (reading) within a mutex. The second condition means that a stoppable thread cannot leave some other thread waiting. But it also forbids use of any synchronization mechanism other than simple object mutexes.
A stoppable thread must have a way to deliver the results of each computation to the controlling thread. These results are created / mutated by the stoppable thread, so we simply need to ensure that they are not visible following a thread stop. For example, the results could be assigned to private members of the Thread object and "guarded" with a flag that is set atomically by the thread to say it is "done".
EDIT: These conditions are pretty restrictive. For example, for a "regex evaluator" thread to be safely stopped, we must guarantee that the regex engine does not mutate any externally visible state. The problem is that it might do, depending on how you implement the thread!
1) The Pattern.compile(...) methods might update a static cache of compiled patterns, and if they did they would (should) use a mutex to do it. (Actually, the OpenJDK 6.0 version doesn't cache Patterns, but Sun might conceivably change this.)
2) If you try to avoid 1) by compiling the regex in the control thread and supplying a pre-instantiated Matcher, then the regex thread does mutate externally visible state.
In the first case, we would probably be in trouble. For example, suppose that a HashMap was used to implement the cache and that the thread was interrupted while the HashMap was being reorganized.
In the second case, we would be OK provided that the Matcher had not been passed to some other thread, and provided that the controller thread didn't try to use the Matcher after stopping the regex matcher thread.
So where does this leave us?
Well, I think I have identified conditions under which threads are theoretically safe to stop. I also think that it is theoretically possible to statically analyse the code of a thread (and the methods it calls) to see if these conditions will always hold. But, I'm not sure if this is really practical.
Does this make sense? Have I missed something?
EDIT 2
Things get a bit more hairy when you consider that the code that we might be trying to kill could be untrusted:
We can't rely on "promises"; e.g. annotations on the untrusted code that it is either killable, or not killable.
We actually need to be able to stop the untrusted code from doing things that would make it unkillable ... according to the identified criteria.
I suspect that this would entail modifying JVM behaviour (e.g. implementing runtime restrictions what threads are allowed to lock or modify), or a full implementation of the Isolates JSR. That's beyond the scope of what I was considering as "fair game".
So let's rule the untrusted code case out for now. Or at least, acknowledge that malicious code can do things to render itself not safely killable, and put that problem to one side.
The lack of safety comes from the idea of critical sections:
Take mutex
do some work, temporarily while we work our state is inconsistent
// all consistent now
Release mutex
If you blow away the thread and it happened to be in a critical section, then the object is left in an inconsistent state, which means it is not safely usable from that point on.
For it to be safe to kill the thread you need to understand the entire processing of whatever is being done in that thread, to know that there are no such critical sections in the code. If you are using library code, then you may not be able to see the source and know that it's safe. Even if it's safe today it may not be tomorrow.
(Very contrived) Example of possible unsafety. We have a linked list, it's not cyclic. All the algorithms are really zippy because we know it's not cyclic. During our critical section we temporarily introduce a cycle. We then get blown away before we emerge from the critical section. Now all the algorithms using the list loop forever. No library author would do that surely! How do you know? You cannot assume that code you use is well written.
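A smaller, equally contrived illustration of such a critical section, using a hypothetical Accounts class whose invariant is that the two balances always sum to 100:

```java
class Accounts {
    private final Object lock = new Object();
    private int a = 100;
    private int b = 0;

    void transfer(int amount) {
        synchronized (lock) {
            a -= amount;   // invariant a + b == 100 is temporarily broken here
            b += amount;   // and restored here
        }
    }

    int total() {
        synchronized (lock) {
            return a + b;
        }
    }
}
```

If the thread running transfer were stopped between the two assignments, the monitor would be released as the ThreadDeath error unwound the stack, and every later caller of total() would see a sum other than 100. Nothing in the class can detect or repair that.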
In the example you point to, it's surely possible to write the required functionality in an interruptible way. More work, but possible to be safe.
I'll take a flyer: there is no documented subset of Objects and methods that can be used in cancellable threads, because no library author wants to make the guarantees.
Maybe there's something I don't know, but as java.sun.com said, it is unsafe because anything this thread is handling is at serious risk of being damaged. Other objects, connections, opened files... for obvious reasons, like "don't shut down your Word without saving first".
For this find(...) example, I don't really think it would be a catastrophe to simply kick it away with an unsubtle .stop()...
A concrete example would probably help here. If anyone can suggest a good alternative to the following use of stop I'd be very interested. Re-writing java.util.regex to support interruption doesn't count.
import java.util.regex.*;
import java.util.*;
public class RegexInterruptTest {
private static class BadRegexException extends RuntimeException { }
public static void main(String[] args) {
final Thread mainThread = Thread.currentThread();
TimerTask interruptTask = new TimerTask() {
public void run() {
System.out.println("Stopping thread.");
// Doesn't work:
// mainThread.interrupt();
// Does work but is deprecated and nasty
// (on Java 8+ stop(Throwable) throws UnsupportedOperationException; it was removed in Java 11)
mainThread.stop(new BadRegexException());
}
};
Timer interruptTimer = new Timer(true);
interruptTimer.schedule(interruptTask, 2000L);
String s = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab";
String exp = "(a+a+){1,100}";
Pattern p = Pattern.compile(exp);
Matcher m = p.matcher(s);
try {
System.out.println("Match: " + m.matches());
interruptTimer.cancel();
} catch(BadRegexException bre) {
System.out.println("Oooops");
} finally {
System.out.println("All over");
}
}
}
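One commonly suggested alternative, which is not from the original post, is to leave java.util.regex alone and instead wrap the input in a CharSequence whose charAt checks the interrupted flag. The matcher reads its input through charAt, so a plain mainThread.interrupt() from the timer task then aborts the match. The sketch below is only that; the class name InterruptibleCharSequence, the static matches helper, and the choice of IllegalStateException are assumptions.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper; it is not part of java.util.regex.
class InterruptibleCharSequence implements CharSequence {
    private final CharSequence inner;

    InterruptibleCharSequence(CharSequence inner) {
        this.inner = inner;
    }

    @Override
    public char charAt(int index) {
        // Called by the matcher for each character it examines.
        if (Thread.currentThread().isInterrupted()) {
            throw new IllegalStateException("regex evaluation interrupted");
        }
        return inner.charAt(index);
    }

    @Override
    public int length() {
        return inner.length();
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new InterruptibleCharSequence(inner.subSequence(start, end));
    }

    @Override
    public String toString() {
        return inner.toString();
    }

    // Convenience helper: matches the whole input against the regex.
    static boolean matches(String regex, CharSequence input) {
        return Pattern.compile(regex)
                .matcher(new InterruptibleCharSequence(input))
                .matches();
    }
}
```

The interrupted flag stays set after the exception is thrown, so the caller should clear it with Thread.interrupted() if it intends to carry on.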
There are ways to use Thread.stop() relatively stably without leaking memory or file descriptors (FDs are exceptionally leak prone on *NIX), but you should rely on it only if you are forced to manage third-party code. Never use it to achieve the result if you have control over the code itself.
I use Thread.stop() along with interrupt() and some more hacks, such as adding custom logging handlers to re-throw the trapped ThreadDeath, adding an uncaught exception handler, running the code in its own ThreadGroup (and synchronizing over them), etc.
But that deserves an entire new topic.
But in this case it's the Java designers telling you; and they're more authoritative on their language than either of us :)
Just a note: quite a few of them are pretty clueless
If my understanding is right, the problem has to do with synchronization locks not being released as the generated ThreadDeath propagates up the stack.
Taking that for granted, it's inherently unsafe because you can never know whether any "inner method call" you happened to be in at the very moment stop() was invoked and took effect was holding some synchronization lock, and then what the Java engineers say is, seemingly, unequivocally right.
What I personally don't understand is why it should be impossible to release any synchronization lock as this particular type of exception propagates up the stack, thereby passing all the '}' method/synchronization block delimiters, which do cause any locks to be released for any other type of exception.
I have a server written in java, and if the administrator of that service wants a "cold shutdown", then it is simply NECESSARY to be able to stop all running activity no matter what. Consistency of any object's state is not a concern because all I'm trying to do is to EXIT. As fast as I can.
There is no safe way to kill a thread.
Nor is there a subset of situations where it is safe. Even if it works 100% of the time while testing on Windows, it may corrupt JVM process memory under Solaris or leak thread resources under Linux.
One should always remember that underneath the Java Thread there is a real, native, unsafe thread.
That native thread works with native, low-level, data and control structures. Killing it may leave those native data structures in an invalid state, without a way to recover.
There is no way for Java machine to take all possible consequences into account, as the thread may allocate/use resources not only within JVM process, but within the OS kernel as well.
In other words, if the native thread library doesn't provide a safe way to kill() a thread, Java cannot provide any guarantees better than that. And all native implementations known to me state that killing a thread is a dangerous business.
All forms of concurrency control can be provided by the Java synchronization primitives by constructing more complex concurrency controls that suit your problem.
The reasons for deprecation are clearly given in the link you provide. If you're willing to accept the reasons why, then feel free to use those features.
However, if you choose to use those features, you also accept that support for those features could stop at any time.
Edit: I'll reiterate the reason for deprecation as well as how to avoid them.
Since the only danger is that objects that can be referenced by the stopped thread could be corrupted, simply clone the String before you pass it to the Thread. If no shared objects exist, the threat of corrupted objects in the program outside the stopped Thread is no longer there.