Is it possible to configure ForkJoinPool to use 1 execution thread?
I am executing code that invokes Random inside a ForkJoinPool. Every time it runs, I end up with different runtime behavior, making it difficult to investigate regressions.
I would like the codebase to offer "debug" and "release" modes. "debug" mode would configure Random with a fixed seed, and ForkJoinPool with a single execution thread. "release" mode would use system-provided Random seeds and use the default number of ForkJoinPool threads.
I tried configuring ForkJoinPool with a parallelism of 1, but it uses 2 threads (main and a second worker thread). Any ideas?
So, it turns out I was wrong.
When you configure a ForkJoinPool with parallelism set to 1, only one thread executes the tasks. The main thread is blocked on ForkJoin.get(). It doesn't actually execute any tasks.
That said, it turns out that it is really tricky providing deterministic behavior. Here are some of the problems I had to correct:
ForkJoinPool was executing tasks using different worker threads (with different names) if the worker thread became idle long enough. For example, if the main thread got suspended on a debugging breakpoint, the worker thread would become idle and shut down. When I would resume execution, ForkJoinThread would spin up a new worker thread with a different name. To solve this, I had to provide a custom ForkJoinWorkerThreadFactory implementation that ensures only one thread runs at a time, and that its name is hard-coded. I also had ensure that my code was returning the same Random instance even if a worker thread shut down and came back again.
Collections with non-deterministic iteration order such as HashMap or HashSet led to elements grabbing random numbers in a different order on every run. I corrected this by using LinkedHashMap and LinkedHashSet.
Objects with non-deterministic hashCode() implementations, such as Enum.hashCode(). I forget what problems this caused but I corrected it by calculating the hashCode() myself instead of relying on the built-in method.
Here is a sample implementation of ForkJoinWorkerThreadFactory:
class MyForkJoinWorkerThread extends ForkJoinWorkerThread
{
MyForkJoinWorkerThread(ForkJoinPool pool)
{
super(pool);
// Change thread name after ForkJoinPool.registerWorker() does the same
setName("DETERMINISTIC_WORKER");
}
}
ForkJoinWorkerThreadFactory factory = new ForkJoinWorkerThreadFactory()
{
private WeakReference<Thread> currentWorker = new WeakReference<>(null);
#Override
public synchronized ForkJoinWorkerThread newThread(ForkJoinPool pool)
{
// If the pool already has a live thread, wait for it to shut down.
Thread thread = currentWorker.get();
if (thread != null && thread.isAlive())
{
try
{
thread.join();
}
catch (InterruptedException e)
{
log.error("", e);
}
}
ForkJoinWorkerThread result = new MyForkJoinWorkerThread(pool);
currentWorker = new WeakReference<>(result);
return result;
}
};
Main thread is always the first thread your application will create. So when you create a ForkJoinPool with parallelism of 1, you are creating another thread. Effectively there will be two threads in the application now ( because you created a pool of threads ).
If you need only one thread that is Main, you can execute your code in sequence ( and not in parallel at all ).
Related
Assuming that I have a ForkJoinPool setup with degree of parallelism n, and that I call a parallel computation like this:
workpool.submit(
() -> {
objects.values().parallelStream().forEach(obj -> {
obj.foo();
});
});
I do this to ensure that the threads spawned there are created inside the workpool (I have different components of the system that need to be isolated). Now assume that the thread in which this is called, is also executing inside this workpool, and I do:
Future<?> wait = workpool.submit(
() -> {
objects.values().parallelStream().forEach(obj -> {
obj.foo();
});
});
wait.get()
1) Am I blocking a thread in the ForkJoinPool? If I were to have n threads all block on futures, while trying to schedule a task in the workpool, would this lead to deadlock? It's not clear to me whether the "maximum degree of parallellism" in the ForkJoinPool means that (if there are n non-blocked tasks), there will always be n threads executing, or whether there is a fixed number of threads,regardless of whether there are blocked. What if I use wait.join() insteadwait.join instead (I do not need checked exceptions as any exception thrown in this code will already generate a runtimeexception. If I understand correctly, join() will allow threads to execute queued tasks while waiting)
2) Am I still getting the benefit of the light-weight forkjoin tasks of the parallel stream if I am creating a runnable "wrapper" class by doing () -> {}
3) Is there any downside/upside to using this instead (assuming that the .join() does indeed implement the work-stealing behaviour that I think it does):
CompletableFuture.supplyAsync(this::mylambdafunction, workpool)
.thenAccept(this::mynextfunction);
Response to point 1: It's difficult to know whether your code will block without seeing the actual method implementations. One approach to dealing with blocking code is to increase the number of threads in the forkjoin threadpool. Normally, the number of threads in a forkjoin thread is n+1 for compute intensive tasks where n=number of processors. Or if you have I/O blocking you could use a ManagedBlocker.
Response to point 2: Yes
Response for point 3: The obvious upside to your completableFuture code is that the thenAccept is non blocking. So control will immediately go past your CompletableFuture block to the next statement without waiting whereas in the earlier code you wrote with a ForkJoin pool the wait.get() will block till you obtain an answer and won't proceed till then.
I have multiple threads accessing an external resource – a broswer. But only one thread can access it at a time. So, I am using a semaphore to synchronise them. However, one thread, which takes input from the GUI and then access the browser for the results, should have priority over other threads and I am not sure how to use a semaphore to achieve it.
I was thinking that every thread after acquiring the semaphore checks if there is the priority thread waiting in the queue and if yes, then it releases it and waits again. Only the priority thread doesn't release it once it is acquired.
Is this a good solution or is there anything else in Java API I could use?
There're no synchronization primitives in Java that would allow you to prioritise one thread over others in the manner you want.
But you could use another approach to solving your problem. Instead of synchronizing threads, make them produce small tasks (for instance, Runnable objects) and put those tasks into a PriorityBlockingQueue with tasks from the GUI thread having the highest priority. A single working thread will poll tasks from this queue and execute them. That would guarantee both mutual exclusion and prioritization.
There're special constructors in ThreadPoolExecutor that accept blocking queues. So, all you need is such an executor with a single thread provided with your PriorityBlockingQueue<Runnable>. Then submit your tasks to this executor and it will take care of the rest.
Should you decide to choose this approach, this post might be of interest to you: How to implement PriorityBlockingQueue with ThreadPoolExecutor and custom tasks
Here's a simple, no frills answer. This is similar to how a read/write lock works, except that every locker has exclusive access (normally all readers proceed in parallel). Note that it does not use Semaphore because that is almost always the wrong construct to use.
public class PrioLock {
private boolean _locked;
private boolean _priorityWaiting;
public synchronized void lock() throws InterruptedException {
while(_locked || _priorityWaiting) {
wait();
}
_locked = true;
}
public synchronized void lockPriority() throws InterruptedException {
_priorityWaiting = true;
try {
while(_locked) {
wait();
}
_locked = true;
} finally {
_priorityWaiting = false;
}
}
public synchronized void unlock() {
_locked = false;
notifyAll();
}
}
You would use it like one of the Lock types in java.util.concurrent:
Normal threads:
_prioLock.lock();
try {
// ... use resource here ...
} finally {
_prioLock.unlock();
}
"Priority" thread:
_prioLock.lockPriority();
try {
// ... use resource here ...
} finally {
_prioLock.unlock();
}
UPDATE:
Response to comment regarding "preemptive" thread interactions:
In the general sense, you cannot do that. you could build custom functionality which added "pause points" to the locked section which would allow a low priority thread to yield to a high priority thread, but that would be fraught with peril.
The only thing you could realistically do is interrupt the working thread causing it to exit the locked code block (assuming that your working code responded to interruption). This would allow a high priority thread to proceed quicker at the expense of the low priority thread losing in progress work (and you might have to implement rollback logic as well).
in order to implement this you would need to:
record the "current thread" when locking succeeds.
in lockPriority(), interrupt the "current thread" if found
implement the logic between the lock()/unlock() (low priority) calls so that:
it responds to interruption in a reasonable time-frame
it implements any necessary "rollback" code when interrupted
potentially implement "retry" logic outside the lock()/unlock() (low priority) calls in order to re-do any work lost when interrupted
You are mixing up concepts here.
Semaphores are just one of the many options to "synchronize" the interactions of threads. They have nothing to do with thread priorities and thread scheduling.
Thread priorities, on the other hand are a topic on its own. You have means in Java to affect them; but the results of such actions heavily depend on the underlying platform/OS; and the JVM implementation itself. In theory, using those priorities is easy, but as said; reality is more complicated.
In other words: you can only use your semaphore to ensure that only one thread is using your queue at one point in time. It doesn't help at all with ensuring that your GUI-reading thread wins over other threads when CPU cycles become a problem. But if your lucky, the answer to your problem will be simple calls to setPriority(); using different priorities.
I am having an issue ending threads once my program my has finished. I run a threaded clock object and it works perfectly but I need to end all threads when the time ´==´ one hour that bit seems to work I just need to know how to end them. Here is an example of the code I have and this is the only thing that runs in the run method apart from one int defined above this code.
#Override
public void run()
{
int mins = 5;
while(clock.getHour() != 1)
{
EnterCarPark();
if(clock.getMin() >= mins)
{
System.out.println("Time: " + clock.getTime() + " " + entryPoint.getRoadName() + ": " + spaces.availablePermits() + " Spaces");
mins += 5;
}
}
}
But when you keep watching the threads that are running in the debug mode of netbeans they keep running after an hour has passed not sure how to fix this. I have tried the interrupt call but it seems to do nothing.
There are two ways to stop a thread in a nice way, and one in an evil way.
For all you need access to the object of the thread (or in the first case a Runnable class that is executed on that thread).
So your first task is to make sure you can access a list of all threads you want to stop. Also notice that you need to make sure you are using threadsafe communication when dealing with objects used by several threads!
Now you have the following options
Interrupt mechanisme
Call Thread.interrupt() on each thread. This will throw an InterruptedException on the thread if you are in a blocking function. Otherwise it will only set the isInterrupted() flag, so you have to check this as well. This is a very clean and versatile way that will try to interrupt blocking functions by this thread. However many people don't understand how to nicely react to the InterruptedException, so it could be more prone to bugs.
isRunning flag
Have a boolean 'isRunning' in your thread. The while loop calls a function 'stopRunning()' that sets this boolean to false. In your thread you periodically read this boolean and stop execution when it is set to false.
This boolean needs to be threadsafe, this could be done by making it volatile (or using synchronized locking).
This also works well when you have a Runnable, which is currently the advised way of running tasks on Threads (because you can easily move Runnables to Threadpools etc.
Stop thread (EVIL)
A third and EVIL and deprecated way is to call Thread.stop(). This is very unsafe and will likely lead to unexpected behavior, don't do this!
Make sure that the loop inside every thread finishes - if it does in all the threads, it does not make sense that there are prints in the output. Just note that what you are checking in each loop condition check if the current hour is not 1 PM, not if an hour has not passed.
Also, your threads garbage collected, which means that the Garbage Collector is responsible for their destruction after termination - but in that case they should not output anything.
A volatile variable shared by all the Threads should help to achieve the goal. The importance of a volatile variable is that each of the Threads will not cache or have local copy but will need to directly read from the main memory. Once it is updated, the threads will get the fresh data.
public class A{
public static volatile boolean letThreadsRun = true;
}
// inside your Thread class
#Override
public void run()
{ // there will come a point when A.letThreadsRun will be set to false when desired
while(A.letThreadsRun)
{
}
}
If two threads are both reading and writing to a shared variable, then
using the volatile keyword for that is not enough. You need to use
synchronization in that case to guarantee that the reading and writing
of the variable is atomic.
Here are links that may help you to grasp the concept:
http://tutorials.jenkov.com/java-concurrency/volatile.html
http://java.dzone.com/articles/java-volatile-keyword-0
If these threads are still running after your main program has finished, then it may be appropriate to set them as daemon threads. The JVM will exit once all non-daemon threads have finished, killing all remaining daemon threads.
If you start the threads like:
Thread myThread = new MyThread();
myThread.start();
Then daemon-izing them is as simple as:
Thread myThread = new MyThread();
myThread.setDaemon(true);
myThread.start();
It's a bad practice to externally terminate threads or to rely on external mechanisms like kill for proper program termination. Threads should always be designed to self-terminate and not leave resources (and shared objects) in a potentially indeterminate state. Every time I have encountered a thread that didn't stop when it was supposed to, it was always a programming error. Go check your code and then step through the run loop in a debugger.
Regarding your thread, it should self-terminate when the hour reaches 1, but if it is below or above 1, it will not terminate. I would make sure that clock's hour count reaches one if minutes go past 59 and also check that it doesn't somehow skip 1 and increment off in to the sunset, having skipped the only tested value. Also check that clock.getHour() is actually returning the hour count instead of a dummy value or something grossly incorrect.
Have you considered using an ExecutorService ? It behaves more predictably and avoids the overhead of thread creation. My suggestion is that you wrap your while loop within one and set a time limit of 1 hr.
Using Thread.interrupt() will not stop the thread from running, it merely sends a signal to you thread. It's our job to listen for this signal and act accordingly.
Thread t = new Thread(new Runnable(){
public void run(){
// look for the signal
if(!Thread.interrupted()){
// keep doing whatever you're doing
}
}
});
// After 1 hour
t.interrupt();
But instead of doing all this work, consider using an ExecutorService. You can use Executors class with static methods to return different thread pools.
Executors.newFixedThreadPool(10)
creates a fixed thread pool of size 10 and any more jobs will go to queue for processing later
Executors.newCachedThreadPool()
starts with 0 threads and creates new threads and adds them to pool on required basis if all the existing threads are busy with some task. This one has a termination strategy that if a thread is idle for 60 seconds, it will remove that thread from the pool
Executors.newSingleThreadExecutor()
creates a single thread which will feed from a queue, all the tasks that're submitted will be processed one after the other.
You can submit your same Runnable tasks to your thread pool. Executors also has methods to get pools to which you can submit scheduled tasks, things you want to happen in future
ExecutorService service = Executors.newFixedThreadPool(10);
service.execute(myRunnableTask);
Coming to your question, when you use thread pools, you have an option to shut down them after some time elapsed like this
service.shutdown();
service.awaitTermination(60, TimeUnit.MINUTES);
Few things to pay attention
shutdown() Initiates an orderly shutdown in which previously submitted tasks are executed, but no new tasks will be accepted. Invocation has no additional effect if already shut down.
awaitTermination() is waiting for the state of the executor to go to TERMINATED. But first the state must go to SHUTDOWN if shutdown() is called or STOP if shutdownNow() is called.
In my Android project I had a lot of places where I need to run some code asynchronously (a web request, call to db etc.). This is not long running tasks (maximum a few seconds).
Until now I was doing this kind of stuff with creating a new thread, passing it a new runnable with the task. But recently I have read an article about threads and concurrency in Java and understood that creating a new Thread for every single task is not a good decision.
So now I have created a ThreadPoolExecutor in my Application class which holds 5 threads.
Here is the code:
public class App extends Application {
private ThreadPoolExecutor mPool;
#Override
public void onCreate() {
super.onCreate();
mPool = (ThreadPoolExecutor)Executors.newFixedThreadPool(5);
}
}
And also I have a method to submit Runnable tasks to the executor:
public void submitRunnableTask(Runnable task){
if(!mPool.isShutdown() && mPool.getActiveCount() != mPool.getMaximumPoolSize()){
mPool.submit(task);
} else {
new Thread(task).start();
}
}
So when I want to run an asynchronous task in my code I get the instance of App and call the submitRunnableTask method passing the runnable to it. As you can see, I also check, if the thread pool has free threads to execute my task, if not, I create a new Thread (I don't think that this will happen, but in any case... I don't want my task to wait in a queue and slow down the app).
In the onTerminate callback method of Application I shutdown the pool.
So my question is the following: Is this kind of pattern better then creating new Threads in code? What pros and cons my new approach has? Can it cause problems that I am not aware off yet? Can you advice me something better than this to manage my asynchronous tasks?
P.S. I have some experience in Android and Java, but I am far from being a concurrency guru ) So may be there are aspects that I don't understand well in this kind of questions. Any advice will be appreciated.
This answer assumes your tasks are short
Is this kind of pattern better then creating new Threads in code?
It's better, but it's still far from ideal. You are still creating threads for short tasks. Instead you just need to create a different type of thread pool - for example by Executors.newScheduledThreadPool(int corePoolSize).
What's the difference in behaviour?
A FixedThreadPool will always have a set of threads to use and if all threads are busy, a new task will be put into a queue.
A (default) ScheduledThreadPool, as created by the Executors class, has a minimum thread pool that it keeps, even when idle. If all threads are busy when a new task comes in, it creates a new thread for it, and disposes of the thread 60 seconds after it is done, unless it's needed again.
The second one can allow you to not create new threads by yourself. This behaviour can be achieved without the "Scheduled" part, but you will then have to construct the executor yourself. The constructor is
public ThreadPoolExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit,
BlockingQueue<Runnable> workQueue)
The various options allow you to fine-tune the behaviour.
If some tasks are long...
And I mean long. As in most of your application lifetime (Realtime 2-way connection? Server port? Multicast listener?). In that case, putting your Runnable in an executor is detrimental - standard executors are not designed to cope with it, and their performance will deteriorate.
Think about your fixed thread pool - if you have 5 long-running tasks, then any new task will spawn a new thread, completely destroying any possible gains of the pool. If you use a more flexible executor - some threads will be shared, but not always.
The rule of thumb is
If it's a short task - use an executor.
If it's a long task - make sure your executor can handle it (i.e. it either doesn't have a max pool size, or enough max threads to deal with 1 more thread being gone for a while)
If it's a parallel process that needs to always run alongside your main thread - use another Thread.
To answer your question — Yes, using Executor is better than creating new threads because:
Executor provides a selection of different thread pools. It allows re-use of already existing threads which increases performance as thread creation is an expensive operation.
In case a thread dies, Executor can replace it with a new thread without affecting the application.
Changes to multi-threading policies are much easier, as only the Executor implementation needs to be changed.
Based on the comment of Ordous I have modified my code to work with only one pool.
public class App extends Application {
private ThreadPoolExecutor mPool;
#Override
public void onCreate() {
super.onCreate();
mPool = new ThreadPoolExecutor(5, Integer.MAX_VALUE, 1, TimeUnit.MINUTES, new SynchronousQueue<Runnable>());
}
}
public void submitRunnableTask(Runnable task){
if(!mPool.isShutdown() && mPool.getActiveCount() != mPool.getMaximumPoolSize()){
mPool.submit(task);
} else {
new Thread(task).start(); // Actually this should never happen, just in case...
}
}
So, I hope this can be useful to someone else, and if more experienced people have some comments on my approach, I will very appreciate their comments.
In the below code the answer is always Started 0 1 2 3 Complete. Im just wondering how it is possible.
If someone could help with the consistency of the output, it would be nice
public class TestOne extends Thread {
/**
* #param args
*/
public static void main(String[] args)throws Exception {
// TODO Auto-generated method stub
Thread t = new Thread(new TestOne());
t.start();
System.out.println("started");
t.join();
System.out.println("Complete");
}
public void run(){
for(int i=0;i<4;i++){
System.out.println(i);
}
}
Most likely you're getting the same results because, most of the time, the main thread starts a new thread then, before that new thread has a chance to print anything, the main thread prints its started message. The join in the main thread then forces it to wait until the other thread has finished, then it prints Complete.
You have a race condition here. The instant you start the second thread, it's indeterministic as to which order the lines will be output (other than the complete line which is made deterministic by virtue of the wait call, as previously mentioned).
But a race condition doesn't guarantee that you'll get different results on multiple runs, only that it is possible. You still shouldn't rely on that behaviour, of course.
For example, the following code:
public class TestOne extends Thread {
public static void main (String[] args) throws Exception {
Thread t = new Thread (new TestOne());
t.start();
Thread.sleep (1000); // <- Added this.
System.out.println ("Started");
t.join();
System.out.println ("Complete");
}
public void run() {
for (int i = 0; i < 4; i++) {
System.out.println (i);
}
}
}
will output:
0
1
2
3
Started
Complete
Although, even then, the order is not guaranteed as it may take more than a second for the thread to "warm up" - sleep is rarely a good solution to race conditions. I've just used it here for illustrative purposes.
When you say Thread t = new Thread(); nothing special happens because you are not creating a "Thread" per se. It is just an object that has the "characteristics" to become a thread of execution. For the object to be a "Thread" you have to call t.start(); and this is where the magic lies.
When you say t.start() the JVM creates a new stack for the newly created thread to run. It associates that stack to the thread object in question. And then makes it available for scheduling. Basically this means that it queues the thread in the JVM scheduler and in the next time slice it is also available for execution. The JVM actually does a lot more than this, my answer is just oversimplified for your example.
Invariably, meanwhile all these thread + stack creation, your main thread has the opportunity to move to its next instruction which in your case is System.out.println("started");.
Theoretically, what you say is true, "Started" could come anywhere in between 0, 1, 2, 3. However in reality, since your t.start() is an "expensive" method, it takes some time to complete, during which the main thread generally gets the chance to execute its next instruction.
If you want to know the details of t.start(); look into the Java source code.
Clearly, you are seeing a consistent result (and this particular result) because on your machine the call to the child thread's run method is consistently happening after the println in the main thread.
Why is it consistent?
Well, simply because your platform's native thread library is behaving in a consistent fashion!
Typical modern Java virtual machine implementations use the host operating system's native thread support to implement Java threads, and to perform Java thread scheduling. On your machine, the native thread implementation appears to be consistently allowing the current thread to return from the Thread.start() call immediately and keep executing.
However, it is not guaranteed that this will always happen. For instance, if the machine was heavily loaded and the main thread had just about exhausted its current timeslice, it could get descheduled during or immediately after the start call, allowing the new thread to run first.
Furthermore, on another platform the normal scheduler behaviour could be different. The scheduler could consistently cause the current thread to deschedule and let the new one go first. Or it could happen "randomly".
The JVM and Java library specs deliberately do not specify which thread "goes first" precisely to allow for differences in the thread implementation ... and variation due to differences in hardware, system load and so on.
Bottom line - the apparent consistency you are seeing is an "artifact", and you shouldn't rely on it, especially if you want your application to work on a wide range of JVMs.
Simply put, the scheduling of the started thread with respect to the main thread is JVM implementation dependent. But that being said, most implementations, if not all, will leave the starting thread running to the completion of its timeslice, until it blocks on something, or until it is preempted by a higher priority thread (if preemptive scheduling is being used, also JVM implementation dependent).
The Java spec just does not say very much which is very specific about threading, deliberately to grant JVM writers the most scope for their implementation.
t.join() means "block until thread t has finished", which explains the "Completed" being last.
EDIT: In answer to question re "Started"...
The call to Thread.start() means "please schedule this thread to run", and it will start when java feels like starting it. The chance of that happening between t.start() the println() is platform dependant, but small on systems I've used (thx to # Stephen C for platform info).
This code however outputs 0, Started, 1, 2, 3, Completed:
Thread t = new Thread(new TestOne());
t.start();
try
{
Thread.sleep(100); // Added sleep between t.start() and println()
}
catch (InterruptedException e)
{
e.printStackTrace();
}
System.out.println("Started");
t.join();
System.out.println("Complete");
When you start a thread it takes time (nothing comes for free or occurs instantly)
The thread which is already running will almost always be quicker to print out than a thread which has just started.
If you really want to the order to be different every time you can use run() instead of start() This works because you only have one thread.