I have written a simple program, that is intended to start a few threads. The threads should then pick a integer n from an integer array, use it to wait n and return the time t the thread waited back into an array for the results.
If one thread finishes it's task, it should pick the next one, that has not yet being assigned to another thread.
Of course: The order in the arrays has to be maintained, so that integers and results match.
My code runs smoothly as far I see.
However I use one line of code block I find in particular unsatisfying and hope there is a good way to fix this without changing too much:
while(Thread.activeCount() != 1); // first evil line
I kinda abuse this line to make sure all my threads finish getting all the tasks done, before I access my array with the results. I want to do that to prevent ill values, like 0.0, Null Pointer Exception... etc. (in short anything that would make an application with an actual use crash)
Any sort of constructive help is appreciated. I am also not sure, if my code still runs smoothly for very very long arrays of tasks for the threads, for example the results no longer match the order of the integer.
Any constructive help is appreciated.
First class:
public class ThreadArrayWriterTest {
int[] repitions;
int len = 0;
double[] timeConsumed;
public boolean finished() {
synchronized (repitions) {
return len <= 0;
}
}
public ThreadArrayWriterTest(int[] repitions) {
this.repitions = repitions;
this.len = repitions.length;
timeConsumed = new double[this.len];
}
public double[] returnTimes(int[] repititions, int numOfThreads, TimeConsumer timeConsumer) {
for (int i = 0; i < numOfThreads; i++) {
new Thread() {
public void run() {
while (!finished()) {
len--;
timeConsumed[len] = timeConsumer.returnTimeConsumed(repititions[len]);
}
}
}.start();
}
while (Thread.activeCount() != 1) // first evil line
;
return timeConsumed;
}
public static void main(String[] args) {
long begin = System.currentTimeMillis();
int[] repitions = { 3, 1, 3, 1, 2, 1, 3, 3, 3 };
int numberOfThreads = 10;
ThreadArrayWriterTest t = new ThreadArrayWriterTest(repitions);
double[] times = t.returnTimes(repitions, numberOfThreads, new TimeConsumer());
for (double d : times) {
System.out.println(d);
}
long end = System.currentTimeMillis();
System.out.println("Total time of execution: " + (end - begin));
}
}
Second class:
public class TimeConsumer {
double returnTimeConsumed(int repitions) {
long before = System.currentTimeMillis();
for (int i = 0; i < repitions; i++) {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
long after = System.currentTimeMillis();
double ret = after - before;
System.out.println("It takes: " + ret + "ms" + " for " + repitions + " runs through the for-loop");
return ret;
}
}
The easiest way to wait for all threads to complete is to keep a Collection of them and then call Thread.join() on each one in turn.
In addition to .join() you can use ExecutorService to manage pools of threads,
An Executor that provides methods to manage termination and methods
that can produce a Future for tracking progress of one or more
asynchronous tasks.
An ExecutorService can be shut down, which will cause it to reject new
tasks. Two different methods are provided for shutting down an
ExecutorService. The shutdown() method will allow previously submitted
tasks to execute before terminating, while the shutdownNow() method
prevents waiting tasks from starting and attempts to stop currently
executing tasks. Upon termination, an executor has no tasks actively
executing, no tasks awaiting execution, and no new tasks can be
submitted. An unused ExecutorService should be shut down to allow
reclamation of its resources.
Method submit extends base method Executor.execute(Runnable) by
creating and returning a Future that can be used to cancel execution
and/or wait for completion. Methods invokeAny and invokeAll perform
the most commonly useful forms of bulk execution, executing a
collection of tasks and then waiting for at least one, or all, to
complete.
ExecutorService executorService = Executors.newFixedThreadPool(maximumNumberOfThreads);
CompletionService completionService = new ExecutorCompletionService(executorService);
for (int i = 0; i < numberOfTasks; ++i) {
completionService.take();
}
executorService.shutdown();
Plus take a look at ThreadPoolExecutor
Since java provides more advanced threading API with concurrent package, You should have look into ExecutorService, which simplifies thread management mechanism.
Simple to solution to your problem.
Use Executors API to create thread pool
static ExecutorService newFixedThreadPool(int nThreads)
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue.
Use invokeAll to wait for all tasks to complete.
Sample code:
ExecutorService service = Executors.newFixedThreadPool(10);
List<MyCallable> futureList = new ArrayList<MyCallable>();
for ( int i=0; i<12; i++){
MyCallable myCallable = new MyCallable((long)i);
futureList.add(myCallable);
}
System.out.println("Start");
try{
List<Future<Long>> futures = service.invokeAll(futureList);
for(Future<Long> future : futures){
try{
System.out.println("future.isDone = " + future.isDone());
System.out.println("future: call ="+future.get());
}
catch(Exception err1){
err1.printStackTrace();
}
}
}catch(Exception err){
err.printStackTrace();
}
service.shutdown();
Refer to this related SE question for more details on achieving the same:
wait until all threads finish their work in java
Related
I am creating program to run a logic in parallel with help of thread. The logic has to be done in a loop of a list and that list may contain 1000 rows to process , so i started each thread in a loop and once thread size reaches 5, i will call thread.join for all the 5 started thread.
So i think that this will start the 2nd set of 5 threads only after the first set of 5 threads are completed? is my understanding correct? i am kind of new to threads.
So my need is to start the 6th thread when any one of the previous set of 5 threads in completed. or start 6th and 7th when any 2 in the previous set of threads are completed.
My code
public static void executeTest(){
for (int i = 0; i < rows1; i++) {
RunnableInterface task = new New RunnableInterface(params);
Thread thread = new Thread(task);
thread.start();
threads.add(thread);
if ((threads.size() % 5 == 0) || ((i == rows1-1) && (threads.size() < 5))) {
waitForThreads(threads);
}
}
}
private static void waitForThreads(List<Thread> threads) {
for (Thread thread : threads) {
try {
thread.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
threads.clear();
}
What kind of modification do i need to do to achieve above mentioned results
You can use an ExecutorService with unbounded input queue and fixed number of threads like this:
ExecutorService exec = Executors.newFixedThreadPool(5);
// As many times as needed:
for(int i = 0; i< 100_000; i++) {
Runnable myrunnable = () -> {}; // new RunnableInterface(params)
exec.submit(myrunnable);
}
exec.shutdown();
exec.awaitTermination(1, TimeUnit.DAYS);
In JDK19 the executor service is AutoClosable, so you can simplify to:
try (ExecutorService exec = Executors.newFixedThreadPool(5)) {
...
}
When in doubt, break it down into smaller, simpler functions:
public static void executeTest(){
for (int i = 0; i < rows1; i++) {
startFiveThreads(i);
awaitFiveThreads(i);
}
}
private static void startFiveThreads(int i) {
...
}
private static void awaitFiveThreads(int i) {
...
}
But seriously? I'm giving you a literal answer to your question—a better way to implement your solution. #DuncG showed you a better solution for your problem. Definitely go with what DuncG said, but maybe remember what I said for next time you are trying to implement some tricky/complex algorithm.
I need a group of threads to run at the same time, and then another group of threads after that. For example, 10 threads start working, and then 10 or 15 other threads.
Of course, the first approach I've tried was to create a loop.
while (true) {
for (int i = 0; i < 10; i++) {
Thread thread = new Thread(
new Runnable() {
#Override
public void run() {
System.out.println("hi");
}
});
thread.start();
}
}
But the problem is when scenario like this happens: imagine if in first iteration, 8 threads finished their tasks, and 2 threads take longer time. The next 10 threads won't start until all 8 + 2 (completed and not completed) threads finish. while, I want an approach where 8 threads get replaced by 8 of waiting to start threads.
Bare Threads
It can be done using bare Thread and Runnable without diving into more advance technologies.
For that, you need to perform the following steps:
define your task (provide an implementation of the Runnable interface);
generate a collection of Threads creating based on this task);
start every thread;
invoke join() on every of these thread (note that firstly we need to start all threads).
That's how it might look like:
public static void main(String[] args) throws InterruptedException {
Runnable task = () -> System.out.println("hi");
int counter = 0;
while (true) {
System.out.println("iteration: " + counter++);
List<Thread> threads = new ArrayList<>();
for (int i = 0; i < 10; i++) {
threads.add(new Thread(task));
}
for (Thread thread : threads) {
thread.start();
}
for (Thread thread : threads) {
thread.join();
}
Thread.currentThread().sleep(1000);
}
}
Instead of managing your Threads manually, it definitely would be wise to look at the facilities provided by the implementations of the ExecutorService interfaces.
Things would be a bit earthier if you use Callable interface for your task instead of Runnable. Callable is more handy in many cases because it allows obtaining the result from the worker-thread and also propagating an exception if thing went wrong (as opposed run() would force you to catch every checked exception). If you have in mind something more interesting than printing a dummy message, you might find Callable to be useful for your purpose.
ExecutorService.invokeAll() + Callable
ExecutorService has a blocking method invokeAll() which expects a collection of the callable-tasks and return a list of completed Future objects when all the tasks are done.
To generate a light-weight collection of repeated elements (since we need to fire a bunch of identical tasks) we can use utility method Collections.nCopies().
Here's a sample code which repeatedly runs a dummy task:
ExecutorService executor = Executors.newWorkStealingPool();
while (true) {
executor.invokeAll(Collections.nCopies(10, () -> {
System.out.println("hi");
return true;
}));
}
To make sure that it does what expected, we can add a counter of iterations and display it on the console and Thread.currentThread().sleep() to avoid cluttering the output very fast (for the same reason, the number of tasks reduced to 3):
public static void main(String[] args) throws InterruptedException {
ExecutorService executor = Executors.newWorkStealingPool();
int counter = 0;
while (true) {
System.out.println("iteration: " + counter++);
executor.invokeAll(Collections.nCopies(3, () -> {
System.out.println("hi");
return true;
}));
Thread.currentThread().sleep(1000);
}
}
Output:
iteration: 0
hi
hi
hi
iteration: 1
hi
hi
hi
... etc.
CompletableFuture.allOf().join() + Runnable
Another possibility is to use CompletableFuture API, and it's method allOf() which expects a varargs of submitted tasks in the form CompletableFuture and return a single CompletableFuture which would be completed when all provided arguments are done.
In order to synchronize the execution of the tasks with the main thread, we need to invoke join() on the resulting CompletableFuture instance.
That's how it might be implemented:
public static void main(String[] args) throws InterruptedException {
ExecutorService executor = Executors.newWorkStealingPool();
Runnable task = () -> System.out.println("hi");
int counter = 0;
while (true) {
System.out.println("iteration: " + counter++);
CompletableFuture.allOf(
Stream.generate(() -> task)
.limit(3)
.map(t -> CompletableFuture.runAsync(t, executor))
.toArray(CompletableFuture<?>[]::new)
).join();
Thread.currentThread().sleep(1000);
}
}
Output:
iteration: 0
hi
hi
hi
iteration: 1
hi
hi
hi
... etc.
ScheduledExecutorService
I suspect you might interested in scheduling these tasks instead of running them reputedly. If that's the case, have a look at ScheduledExecutorService and it's methods scheduleAtFixedRate() and scheduleWithFixedDelay().
For adding tasks to threads and replacing them you can use ExecutorService. You can create it by using:
ExecutorService executor = Executors.newFixedThreadPool(10);
This is a java concurrency question. 10 jobs need to be done, each of them will have 32 worker threads. Worker thread will increase a counter . Once the counter is 32, it means this job is done and then clean up counter map. From the console output, I expect that 10 "done" will be output, pool size is 0 and counterThread size is 0.
The issues are :
most of time, "pool size: 0 and countThreadMap size:3" will be
printed out. even those all threads are gone, but 3 jobs are not
finished yet.
some time, I can see nullpointerexception in line 27. I have used ConcurrentHashMap and AtomicLong, why still have concurrency
exception.
Thanks
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.atomic.AtomicLong;
public class Test {
final ConcurrentHashMap<Long, AtomicLong[]> countThreadMap = new ConcurrentHashMap<Long, AtomicLong[]>();
final ExecutorService cachedThreadPool = Executors.newCachedThreadPool();
final ThreadPoolExecutor tPoolExecutor = ((ThreadPoolExecutor) cachedThreadPool);
public void doJob(final Long batchIterationTime) {
for (int i = 0; i < 32; i++) {
Thread workerThread = new Thread(new Runnable() {
#Override
public void run() {
if (countThreadMap.get(batchIterationTime) == null) {
AtomicLong[] atomicThreadCountArr = new AtomicLong[2];
atomicThreadCountArr[0] = new AtomicLong(1);
atomicThreadCountArr[1] = new AtomicLong(System.currentTimeMillis()); //start up time
countThreadMap.put(batchIterationTime, atomicThreadCountArr);
} else {
AtomicLong[] atomicThreadCountArr = countThreadMap.get(batchIterationTime);
atomicThreadCountArr[0].getAndAdd(1);
countThreadMap.put(batchIterationTime, atomicThreadCountArr);
}
if (countThreadMap.get(batchIterationTime)[0].get() == 32) {
System.out.println("done");
countThreadMap.remove(batchIterationTime);
}
}
});
tPoolExecutor.execute(workerThread);
}
}
public void report(){
while(tPoolExecutor.getActiveCount() != 0){
//
}
System.out.println("pool size: "+ tPoolExecutor.getActiveCount() + " and countThreadMap size:"+countThreadMap.size());
}
public static void main(String[] args) throws Exception {
Test test = new Test();
for (int i = 0; i < 10; i++) {
Long batchIterationTime = System.currentTimeMillis();
test.doJob(batchIterationTime);
}
test.report();
System.out.println("All Jobs are done");
}
}
Let’s dig through all the mistakes of thread related programming, one man can make:
Thread workerThread = new Thread(new Runnable() {
…
tPoolExecutor.execute(workerThread);
You create a Thread but don’t start it but submit it to an executor. It’s a historical mistake of the Java API to let Thread implement Runnable for no good reason. Now, every developer should be aware, that there is no reason to treat a Thread as a Runnable. If you don’t want to start a thread manually, don’t create a Thread. Just create the Runnable and pass it to execute or submit.
I want to emphasize the latter as it returns a Future which gives you for free what you are attempting to implement: the information when a task has been finished. It’s even easier when using invokeAll which will submit a bunch of Callables and return when all are done. Since you didn’t tell us anything about your actual task, it’s not clear whether you can let your tasks simply implement Callable (may return null) instead of Runnable.
If you can’t use Callables or don’t want to wait immediately on submission, you have to remember the returned Futures and query them at a later time:
static final ExecutorService cachedThreadPool = Executors.newCachedThreadPool();
public static List<Future<?>> doJob(final Long batchIterationTime) {
final Random r=new Random();
List<Future<?>> list=new ArrayList<>(32);
for (int i = 0; i < 32; i++) {
Runnable job=new Runnable() {
public void run() {
// pretend to do something
LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(r.nextInt(10)));
}
};
list.add(cachedThreadPool.submit(job));
}
return list;
}
public static void main(String[] args) throws Exception {
Test test = new Test();
Map<Long,List<Future<?>>> map=new HashMap<>();
for (int i = 0; i < 10; i++) {
Long batchIterationTime = System.currentTimeMillis();
while(map.containsKey(batchIterationTime))
batchIterationTime++;
map.put(batchIterationTime,doJob(batchIterationTime));
}
// print some statistics, if you really need
int overAllDone=0, overallPending=0;
for(Map.Entry<Long,List<Future<?>>> e: map.entrySet()) {
int done=0, pending=0;
for(Future<?> f: e.getValue()) {
if(f.isDone()) done++;
else pending++;
}
System.out.println(e.getKey()+"\t"+done+" done, "+pending+" pending");
overAllDone+=done;
overallPending+=pending;
}
System.out.println("Total\t"+overAllDone+" done, "+overallPending+" pending");
// wait for the completion of all jobs
for(List<Future<?>> l: map.values())
for(Future<?> f: l)
f.get();
System.out.println("All Jobs are done");
}
But note that if you don’t need the ExecutorService for subsequent tasks, it’s much easier to wait for all jobs to complete:
cachedThreadPool.shutdown();
cachedThreadPool.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
System.out.println("All Jobs are done");
But regardless of how unnecessary the manual tracking of the job status is, let’s delve into your attempt, so you may avoid the mistakes in the future:
if (countThreadMap.get(batchIterationTime) == null) {
The ConcurrentMap is thread safe, but this does not turn your concurrent code into sequential one (that would render multi-threading useless). The above line might be processed by up to all 32 threads at the same time, all finding that the key does not exist yet so possibly more than one thread will then be going to put the initial value into the map.
AtomicLong[] atomicThreadCountArr = new AtomicLong[2];
atomicThreadCountArr[0] = new AtomicLong(1);
atomicThreadCountArr[1] = new AtomicLong(System.currentTimeMillis());
countThreadMap.put(batchIterationTime, atomicThreadCountArr);
That’s why this is called the “check-then-act” anti-pattern. If more than one thread is going to process that code, they all will put their new value, being confident that this was the right thing as they have checked the initial condition before acting but for all but one thread the condition has changed when acting and they are overwriting the value of a previous put operation.
} else {
AtomicLong[] atomicThreadCountArr = countThreadMap.get(batchIterationTime);
atomicThreadCountArr[0].getAndAdd(1);
countThreadMap.put(batchIterationTime, atomicThreadCountArr);
Since you are modifying the AtomicInteger which is already stored into the map, the put operation is useless, it will put the very array that it retrieved before. If there wasn’t the mistake that there can be multiple initial values as described above, the put operation had no effect.
}
if (countThreadMap.get(batchIterationTime)[0].get() == 32) {
Again, the use of a ConcurrentMap doesn’t turn the multi-threaded code into sequential code. While it is clear that the only last thread will update the atomic integer to 32 (when the initial race condition doesn’t materialize), it is not guaranteed that all other threads have already passed this if statement. Therefore more than one, up to all threads can still be at this point of execution and see the value of 32. Or…
System.out.println("done");
countThreadMap.remove(batchIterationTime);
One of the threads which have seen the 32 value might execute this remove operation. At this point, there might be still threads not having executed the above if statement, now not seeing the value 32 but producing a NullPointerException as the array supposed to contain the AtomicInteger is not in the map anymore. This is what happens, occasionally…
After creating your 10 jobs, your main thread is still running - it doesn't wait for your jobs to complete before it calls report on the test. You try to overcome this with the while loop, but tPoolExecutor.getActiveCount() is potentially coming out as 0 before the workerThread is executed, and then the countThreadMap.size() is happening after the threads were added to your HashMap.
There are a number of ways to fix this - but I will let another answer-er do that because I have to leave at the moment.
First and once more, thanks to all that already answered my question. I am not a very experienced programmer and it is my first experience with multithreading.
I got an example that is working quite like my problem. I hope it could ease our case here.
public class ThreadMeasuring {
private static final int TASK_TIME = 1; //microseconds
private static class Batch implements Runnable {
CountDownLatch countDown;
public Batch(CountDownLatch countDown) {
this.countDown = countDown;
}
#Override
public void run() {
long t0 =System.nanoTime();
long t = 0;
while(t<TASK_TIME*1e6){ t = System.nanoTime() - t0; }
if(countDown!=null) countDown.countDown();
}
}
public static void main(String[] args) {
ThreadFactory threadFactory = new ThreadFactory() {
int counter = 1;
#Override
public Thread newThread(Runnable r) {
Thread t = new Thread(r, "Executor thread " + (counter++));
return t;
}
};
// the total duty to be divided in tasks is fixed (problem dependent).
// Increase ntasks will mean decrease the task time proportionally.
// 4 Is an arbitrary example.
// This tasks will be executed thousands of times, inside a loop alternating
// with serial processing that needs their result and prepare the next ones.
int ntasks = 4;
int nthreads = 2;
int ncores = Runtime.getRuntime().availableProcessors();
if (nthreads<ncores) ncores = nthreads;
Batch serial = new Batch(null);
long serialTime = System.nanoTime();
serial.run();
serialTime = System.nanoTime() - serialTime;
ExecutorService executor = Executors.newFixedThreadPool( nthreads, threadFactory );
CountDownLatch countDown = new CountDownLatch(ntasks);
ArrayList<Batch> batches = new ArrayList<Batch>();
for (int i = 0; i < ntasks; i++) {
batches.add(new Batch(countDown));
}
long start = System.nanoTime();
for (Batch r : batches){
executor.execute(r);
}
// wait for all threads to finish their task
try {
countDown.await();
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
long tmeasured = (System.nanoTime() - start);
System.out.println("Task time= " + TASK_TIME + " ms");
System.out.println("Number of tasks= " + ntasks);
System.out.println("Number of threads= " + nthreads);
System.out.println("Number of cores= " + ncores);
System.out.println("Measured time= " + tmeasured);
System.out.println("Theoretical serial time= " + TASK_TIME*1000000*ntasks);
System.out.println("Theoretical parallel time= " + (TASK_TIME*1000000*ntasks)/ncores);
System.out.println("Speedup= " + (serialTime*ntasks)/(double)tmeasured);
executor.shutdown();
}
}
Instead of doing the calculations, each batch just waits for some given time. The program calculates the speedup, that would allways be 2 in theory but can get less than 1 (actually a speed down) if the 'TASK_TIME' is small.
My calculations take at the top 1 ms and are commonly faster. For 1 ms I find a little speedup of around 30%, but in practice, with my program, I notice a speed down.
The structure of this code is very similar to my program, so if you could help me to optimise the thread handling I would be very grateful.
Kind regards.
Below, the original question:
Hi.
I would like to use multithreading on my program, since it could increase its efficiency considerably, I believe. Most of its running time is due to independent calculations.
My program has thousands of independent calculations (several linear systems to solve), but they just happen at the same time by minor groups of dozens or so. Each of this groups would take some miliseconds to run. After one of these groups of calculations, the program has to run sequentially for a little while and then I have to solve the linear systems again.
Actually, it can be seen as these independent linear systems to solve are inside a loop that iterates thousands of times, alternating with sequential calculations that depends on the previous results. My idea to speed up the program is to compute these independent calculations in parallel threads, by dividing each group into (the number of processors I have available) batches of independent calculation. So, in principle, there isn't queuing at all.
I tried using the FixedThreadPool and CachedThreadPool and it got even slower than serial processing. It seems to takes too much time creating new Treads each time I need to solve the batches.
Is there a better way to handle this problem? These pools I've used seem to be proper for cases when each thread takes more time instead of thousands of smaller threads...
Thanks!
Best Regards!
Thread pools don't create new threads over and over. That's why they're pools.
How many threads were you using and how many CPUs/cores do you have? What is the system load like (normally, when you execute them serially, and when you execute with the pool)? Is synchronization or any kind of locking involved?
Is the algorithm for parallel execution exactly the same as the serial one (your description seems to suggest that serial was reusing some results from previous iteration).
From what i've read: "thousands of independent calculations... happen at the same time... would take some miliseconds to run" it seems to me that your problem is perfect for GPU programming.
And i think it answers you question. GPU programming is becoming more and more popular. There are Java bindings for CUDA & OpenCL. If it is possible for you to use it, i say go for it.
I'm not sure how you perform the calculations, but if you're breaking them up into small groups, then your application might be ripe for the Producer/Consumer pattern.
Additionally, you might be interested in using a BlockingQueue. The calculation consumers will block until there is something in the queue and the block occurs on the take() call.
private static class Batch implements Runnable {
CountDownLatch countDown;
public Batch(CountDownLatch countDown) {
this.countDown = countDown;
}
CountDownLatch getLatch(){
return countDown;
}
#Override
public void run() {
long t0 =System.nanoTime();
long t = 0;
while(t<TASK_TIME*1e6){ t = System.nanoTime() - t0; }
if(countDown!=null) countDown.countDown();
}
}
class CalcProducer implements Runnable {
private final BlockingQueue queue;
CalcProducer(BlockingQueue q) { queue = q; }
public void run() {
try {
while(true) {
CountDownLatch latch = new CountDownLatch(ntasks);
for(int i = 0; i < ntasks; i++) {
queue.put(produce(latch));
}
// don't need to wait for the latch, only consumers wait
}
} catch (InterruptedException ex) { ... handle ...}
}
CalcGroup produce(CountDownLatch latch) {
return new Batch(latch);
}
}
class CalcConsumer implements Runnable {
private final BlockingQueue queue;
CalcConsumer(BlockingQueue q) { queue = q; }
public void run() {
try {
while(true) { consume(queue.take()); }
} catch (InterruptedException ex) { ... handle ...}
}
void consume(Batch batch) {
batch.Run();
batch.getLatch().await();
}
}
class Setup {
void main() {
BlockingQueue<Batch> q = new LinkedBlockingQueue<Batch>();
int numConsumers = 4;
CalcProducer p = new CalcProducer(q);
Thread producerThread = new Thread(p);
producerThread.start();
Thread[] consumerThreads = new Thread[numConsumers];
for(int i = 0; i < numConsumers; i++)
{
consumerThreads[i] = new Thread(new CalcConsumer(q));
consumerThreads[i].start();
}
}
}
Sorry if there are any syntax errors, I've been chomping away at C# code and sometimes I forget the proper java syntax, but the general idea is there.
If you have a problem which does not scale to multiple cores, you need to change your program or you have a problem which is not as parallel as you think. I suspect you have some other type of bug, but cannot say based on the information given.
This test code might help.
Time per million tasks 765 ms
code
ExecutorService es = Executors.newFixedThreadPool(4);
Runnable task = new Runnable() {
#Override
public void run() {
// do nothing.
}
};
long start = System.nanoTime();
for(int i=0;i<1000*1000;i++) {
es.submit(task);
}
es.shutdown();
es.awaitTermination(10, TimeUnit.SECONDS);
long time = System.nanoTime() - start;
System.out.println("Time per million tasks "+time/1000/1000+" ms");
EDIT: Say you have a loop which serially does this.
for(int i=0;i<1000*1000;i++)
doWork(i);
You might assume that changing to loop like this would be faster, but the problem is that the overhead could be greater than the gain.
for(int i=0;i<1000*1000;i++) {
final int i2 = i;
ex.execute(new Runnable() {
public void run() {
doWork(i2);
}
}
}
So you need to create batches of work (at least one per thread) so there are enough tasks to keep all the threads busy, but not so many tasks that your threads are spending time in overhead.
final int batchSize = 10*1000;
for(int i=0;i<1000*1000;i+=batchSize) {
final int i2 = i;
ex.execute(new Runnable() {
public void run() {
for(int i3=i2;i3<i2+batchSize;i3++)
doWork(i3);
}
}
}
EDIT2: RUnning atest which copied data between threads.
for (int i = 0; i < 20; i++) {
ExecutorService es = Executors.newFixedThreadPool(1);
final double[] d = new double[4 * 1024];
Arrays.fill(d, 1);
final double[] d2 = new double[4 * 1024];
es.submit(new Runnable() {
#Override
public void run() {
// nothing.
}
}).get();
long start = System.nanoTime();
es.submit(new Runnable() {
#Override
public void run() {
synchronized (d) {
System.arraycopy(d, 0, d2, 0, d.length);
}
}
});
es.shutdown();
es.awaitTermination(10, TimeUnit.SECONDS);
// get a the values in d2.
for (double x : d2) ;
long time = System.nanoTime() - start;
System.out.printf("Time to pass %,d doubles to another thread and back was %,d ns.%n", d.length, time);
}
starts badly but warms up to ~50 us.
Time to pass 4,096 doubles to another thread and back was 1,098,045 ns.
Time to pass 4,096 doubles to another thread and back was 171,949 ns.
... deleted ...
Time to pass 4,096 doubles to another thread and back was 50,566 ns.
Time to pass 4,096 doubles to another thread and back was 49,937 ns.
Hmm, CachedThreadPool seems to be created just for your case. It does not recreate threads if you reuse them soon enough, and if you spend a whole minute before you use new thread, the overhead of thread creation is comparatively negligible.
But you can't expect parallel execution to speed up your calculations unless you can also access data in parallel. If you employ extensive locking, many synchronized methods, etc you'll spend more on overhead than gain on parallel processing. Check that your data can be efficiently processed in parallel and that you don't have non-obvious synchronizations lurkinb in the code.
Also, CPUs process data efficiently if data fully fit into cache. If data sets of each thread is bigger than half the cache, two threads will compete for cache and issue many RAM reads, while one thread, if only employing one core, may perform better because it avoids RAM reads in the tight loop it executes. Check this, too.
Here's a psuedo outline of what I'm thinking
class WorkerThread extends Thread {
Queue<Calculation> calcs;
MainCalculator mainCalc;
public void run() {
while(true) {
while(calcs.isEmpty()) sleep(500); // busy waiting? Context switching probably won't be so bad.
Calculation calc = calcs.pop(); // is it pop to get and remove? you'll have to look
CalculationResult result = calc.calc();
mainCalc.returnResultFor(calc,result);
}
}
}
Another option, if you're calling external programs. Don't put them in a loop that does them one at a time or they won't run in parallel. You can put them in a loop that PROCESSES them one at a time, but not that execs them one at a time.
Process calc1 = Runtime.getRuntime.exec("myCalc paramA1 paramA2 paramA3");
Process calc2 = Runtime.getRuntime.exec("myCalc paramB1 paramB2 paramB3");
Process calc3 = Runtime.getRuntime.exec("myCalc paramC1 paramC2 paramC3");
Process calc4 = Runtime.getRuntime.exec("myCalc paramD1 paramD2 paramD3");
calc1.waitFor();
calc2.waitFor();
calc3.waitFor();
calc4.waitFor();
InputStream is1 = calc1.getInputStream();
InputStreamReader isr1 = new InputStreamReader(is1);
BufferedReader br1 = new BufferedReader(isr1);
String resultStr1 = br1.nextLine();
InputStream is2 = calc2.getInputStream();
InputStreamReader isr2 = new InputStreamReader(is2);
BufferedReader br2 = new BufferedReader(isr2);
String resultStr2 = br2.nextLine();
InputStream is3 = calc3.getInputStream();
InputStreamReader isr3 = new InputStreamReader(is3);
BufferedReader br3 = new BufferedReader(isr3);
String resultStr3 = br3.nextLine();
InputStream is4 = calc4.getInputStream();
InputStreamReader isr4 = new InputStreamReader(is4);
BufferedReader br4 = new BufferedReader(isr4);
String resultStr4 = br4.nextLine();
What is a way to simply wait for all threaded process to finish? For example, let's say I have:
public class DoSomethingInAThread implements Runnable{
public static void main(String[] args) {
for (int n=0; n<1000; n++) {
Thread t = new Thread(new DoSomethingInAThread());
t.start();
}
// wait for all threads' run() methods to complete before continuing
}
public void run() {
// do something here
}
}
How do I alter this so the main() method pauses at the comment until all threads' run() methods exit? Thanks!
You put all threads in an array, start them all, and then have a loop
for(i = 0; i < threads.length; i++)
threads[i].join();
Each join will block until the respective thread has completed. Threads may complete in a different order than you joining them, but that's not a problem: when the loop exits, all threads are completed.
One way would be to make a List of Threads, create and launch each thread, while adding it to the list. Once everything is launched, loop back through the list and call join() on each one. It doesn't matter what order the threads finish executing in, all you need to know is that by the time that second loop finishes executing, every thread will have completed.
A better approach is to use an ExecutorService and its associated methods:
List<Callable> callables = ... // assemble list of Callables here
// Like Runnable but can return a value
ExecutorService execSvc = Executors.newCachedThreadPool();
List<Future<?>> results = execSvc.invokeAll(callables);
// Note: You may not care about the return values, in which case don't
// bother saving them
Using an ExecutorService (and all of the new stuff from Java 5's concurrency utilities) is incredibly flexible, and the above example barely even scratches the surface.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
public class DoSomethingInAThread implements Runnable
{
public static void main(String[] args) throws ExecutionException, InterruptedException
{
//limit the number of actual threads
int poolSize = 10;
ExecutorService service = Executors.newFixedThreadPool(poolSize);
List<Future<Runnable>> futures = new ArrayList<Future<Runnable>>();
for (int n = 0; n < 1000; n++)
{
Future f = service.submit(new DoSomethingInAThread());
futures.add(f);
}
// wait for all tasks to complete before continuing
for (Future<Runnable> f : futures)
{
f.get();
}
//shut down the executor service so that this thread can exit
service.shutdownNow();
}
public void run()
{
// do something here
}
}
instead of join(), which is an old API, you can use CountDownLatch. I have modified your code as below to fulfil your requirement.
import java.util.concurrent.*;
class DoSomethingInAThread implements Runnable{
CountDownLatch latch;
public DoSomethingInAThread(CountDownLatch latch){
this.latch = latch;
}
public void run() {
try{
System.out.println("Do some thing");
latch.countDown();
}catch(Exception err){
err.printStackTrace();
}
}
}
public class CountDownLatchDemo {
public static void main(String[] args) {
try{
CountDownLatch latch = new CountDownLatch(1000);
for (int n=0; n<1000; n++) {
Thread t = new Thread(new DoSomethingInAThread(latch));
t.start();
}
latch.await();
System.out.println("In Main thread after completion of 1000 threads");
}catch(Exception err){
err.printStackTrace();
}
}
}
Explanation:
CountDownLatch has been initialized with given count 1000 as per your requirement.
Each worker thread DoSomethingInAThread will decrement the CountDownLatch, which has been passed in constructor.
Main thread CountDownLatchDemo await() till the count has become zero. Once the count has become zero, you will get below line in output.
In Main thread after completion of 1000 threads
More info from oracle documentation page
public void await()
throws InterruptedException
Causes the current thread to wait until the latch has counted down to zero, unless the thread is interrupted.
Refer to related SE question for other options:
wait until all threads finish their work in java
Avoid the Thread class altogether and instead use the higher abstractions provided in java.util.concurrent
The ExecutorService class provides the method invokeAll that seems to do just what you want.
Consider using java.util.concurrent.CountDownLatch. Examples in javadocs
Depending on your needs, you may also want to check out the classes CountDownLatch and CyclicBarrier in the java.util.concurrent package. They can be useful if you want your threads to wait for each other, or if you want more fine-grained control over the way your threads execute (e.g., waiting in their internal execution for another thread to set some state). You could also use a CountDownLatch to signal all of your threads to start at the same time, instead of starting them one by one as you iterate through your loop. The standard API docs have an example of this, plus using another CountDownLatch to wait for all threads to complete their execution.
As Martin K suggested java.util.concurrent.CountDownLatch seems to be a better solution for this. Just adding an example for the same
public class CountDownLatchDemo
{
public static void main (String[] args)
{
int noOfThreads = 5;
// Declare the count down latch based on the number of threads you need
// to wait on
final CountDownLatch executionCompleted = new CountDownLatch(noOfThreads);
for (int i = 0; i < noOfThreads; i++)
{
new Thread()
{
#Override
public void run ()
{
System.out.println("I am executed by :" + Thread.currentThread().getName());
try
{
// Dummy sleep
Thread.sleep(3000);
// One thread has completed its job
executionCompleted.countDown();
}
catch (InterruptedException e)
{
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}.start();
}
try
{
// Wait till the count down latch opens.In the given case till five
// times countDown method is invoked
executionCompleted.await();
System.out.println("All over");
}
catch (InterruptedException e)
{
e.printStackTrace();
}
}
}
If you make a list of the threads, you can loop through them and .join() against each, and your loop will finish when all the threads have. I haven't tried it though.
http://docs.oracle.com/javase/8/docs/api/java/lang/Thread.html#join()
Create the thread object inside the first for loop.
for (int i = 0; i < threads.length; i++) {
threads[i] = new Thread(new Runnable() {
public void run() {
// some code to run in parallel
}
});
threads[i].start();
}
And then so what everyone here is saying.
for(i = 0; i < threads.length; i++)
threads[i].join();
You can do it with the Object "ThreadGroup" and its parameter activeCount:
As an alternative to CountDownLatch you can also use CyclicBarrier e.g.
public class ThreadWaitEx {
static CyclicBarrier barrier = new CyclicBarrier(100, new Runnable(){
public void run(){
System.out.println("clean up job after all tasks are done.");
}
});
public static void main(String[] args) {
for (int i = 0; i < 100; i++) {
Thread t = new Thread(new MyCallable(barrier));
t.start();
}
}
}
class MyCallable implements Runnable{
private CyclicBarrier b = null;
public MyCallable(CyclicBarrier b){
this.b = b;
}
#Override
public void run(){
try {
//do something
System.out.println(Thread.currentThread().getName()+" is waiting for barrier after completing his job.");
b.await();
} catch (InterruptedException e) {
e.printStackTrace();
} catch (BrokenBarrierException e) {
e.printStackTrace();
}
}
}
To use CyclicBarrier in this case barrier.await() should be the last statement i.e. when your thread is done with its job. CyclicBarrier can be used again with its reset() method. To quote javadocs:
A CyclicBarrier supports an optional Runnable command that is run once per barrier point, after the last thread in the party arrives, but before any threads are released. This barrier action is useful for updating shared-state before any of the parties continue.
The join() was not helpful to me. see this sample in Kotlin:
val timeInMillis = System.currentTimeMillis()
ThreadUtils.startNewThread(Runnable {
for (i in 1..5) {
val t = Thread(Runnable {
Thread.sleep(50)
var a = i
kotlin.io.println(Thread.currentThread().name + "|" + "a=$a")
Thread.sleep(200)
for (j in 1..5) {
a *= j
Thread.sleep(100)
kotlin.io.println(Thread.currentThread().name + "|" + "$a*$j=$a")
}
kotlin.io.println(Thread.currentThread().name + "|TaskDurationInMillis = " + (System.currentTimeMillis() - timeInMillis))
})
t.start()
}
})
The result:
Thread-5|a=5
Thread-1|a=1
Thread-3|a=3
Thread-2|a=2
Thread-4|a=4
Thread-2|2*1=2
Thread-3|3*1=3
Thread-1|1*1=1
Thread-5|5*1=5
Thread-4|4*1=4
Thread-1|2*2=2
Thread-5|10*2=10
Thread-3|6*2=6
Thread-4|8*2=8
Thread-2|4*2=4
Thread-3|18*3=18
Thread-1|6*3=6
Thread-5|30*3=30
Thread-2|12*3=12
Thread-4|24*3=24
Thread-4|96*4=96
Thread-2|48*4=48
Thread-5|120*4=120
Thread-1|24*4=24
Thread-3|72*4=72
Thread-5|600*5=600
Thread-4|480*5=480
Thread-3|360*5=360
Thread-1|120*5=120
Thread-2|240*5=240
Thread-1|TaskDurationInMillis = 765
Thread-3|TaskDurationInMillis = 765
Thread-4|TaskDurationInMillis = 765
Thread-5|TaskDurationInMillis = 765
Thread-2|TaskDurationInMillis = 765
Now let me use the join() for threads:
val timeInMillis = System.currentTimeMillis()
ThreadUtils.startNewThread(Runnable {
for (i in 1..5) {
val t = Thread(Runnable {
Thread.sleep(50)
var a = i
kotlin.io.println(Thread.currentThread().name + "|" + "a=$a")
Thread.sleep(200)
for (j in 1..5) {
a *= j
Thread.sleep(100)
kotlin.io.println(Thread.currentThread().name + "|" + "$a*$j=$a")
}
kotlin.io.println(Thread.currentThread().name + "|TaskDurationInMillis = " + (System.currentTimeMillis() - timeInMillis))
})
t.start()
t.join()
}
})
And the result:
Thread-1|a=1
Thread-1|1*1=1
Thread-1|2*2=2
Thread-1|6*3=6
Thread-1|24*4=24
Thread-1|120*5=120
Thread-1|TaskDurationInMillis = 815
Thread-2|a=2
Thread-2|2*1=2
Thread-2|4*2=4
Thread-2|12*3=12
Thread-2|48*4=48
Thread-2|240*5=240
Thread-2|TaskDurationInMillis = 1568
Thread-3|a=3
Thread-3|3*1=3
Thread-3|6*2=6
Thread-3|18*3=18
Thread-3|72*4=72
Thread-3|360*5=360
Thread-3|TaskDurationInMillis = 2323
Thread-4|a=4
Thread-4|4*1=4
Thread-4|8*2=8
Thread-4|24*3=24
Thread-4|96*4=96
Thread-4|480*5=480
Thread-4|TaskDurationInMillis = 3078
Thread-5|a=5
Thread-5|5*1=5
Thread-5|10*2=10
Thread-5|30*3=30
Thread-5|120*4=120
Thread-5|600*5=600
Thread-5|TaskDurationInMillis = 3833
As it's clear when we use the join:
The threads are running sequentially.
The first sample takes 765 Milliseconds while the second sample takes 3833 Milliseconds.
Our solution to prevent blocking other threads was creating an ArrayList:
val threads = ArrayList<Thread>()
Now when we want to start a new thread we most add it to the ArrayList:
addThreadToArray(
ThreadUtils.startNewThread(Runnable {
...
})
)
The addThreadToArray function:
#Synchronized
fun addThreadToArray(th: Thread) {
threads.add(th)
}
The startNewThread funstion:
fun startNewThread(runnable: Runnable) : Thread {
val th = Thread(runnable)
th.isDaemon = false
th.priority = Thread.MAX_PRIORITY
th.start()
return th
}
Check the completion of the threads as below everywhere it's needed:
val notAliveThreads = ArrayList<Thread>()
for (t in threads)
if (!t.isAlive)
notAliveThreads.add(t)
threads.removeAll(notAliveThreads)
if (threads.size == 0){
// The size is 0 -> there is no alive threads.
}
The problem with:
for(i = 0; i < threads.length; i++)
threads[i].join();
...is, that threads[i + 1] never can join before threads[i].
Except the "latch"ed ones, all solutions have this lack.
No one here (yet) mentioned ExecutorCompletionService, it allows to join threads/tasks according to their completion order:
public class ExecutorCompletionService<V>
extends Object
implements CompletionService<V>
A CompletionService that uses a supplied Executor to execute tasks. This class arranges that submitted tasks are, upon completion, placed on a queue accessible using take. The class is lightweight enough to be suitable for transient use when processing groups of tasks.
Usage Examples.
Suppose you have a set of solvers for a certain problem, each returning a value of some type Result, and would like to run them concurrently, processing the results of each of them that return a non-null value, in some method use(Result r). You could write this as:
void solve(Executor e, Collection<Callable<Result>> solvers) throws InterruptedException, ExecutionException {
CompletionService<Result> cs = new ExecutorCompletionService<>(e);
solvers.forEach(cs::submit);
for (int i = solvers.size(); i > 0; i--) {
Result r = cs.take().get();
if (r != null)
use(r);
}
}
Suppose instead that you would like to use the first non-null result of the set of tasks, ignoring any that encounter exceptions, and cancelling all other tasks when the first one is ready:
void solve(Executor e, Collection<Callable<Result>> solvers) throws InterruptedException {
CompletionService<Result> cs = new ExecutorCompletionService<>(e);
int n = solvers.size();
List<Future<Result>> futures = new ArrayList<>(n);
Result result = null;
try {
solvers.forEach(solver -> futures.add(cs.submit(solver)));
for (int i = n; i > 0; i--) {
try {
Result r = cs.take().get();
if (r != null) {
result = r;
break;
}
} catch (ExecutionException ignore) {}
}
} finally {
futures.forEach(future -> future.cancel(true));
}
if (result != null)
use(result);
}
Since: 1.5 (!)
Assuming use(r) (of Example 1) also asynchronous, we had a big advantage. #