So I have an ExecutorService successfully blocking and running linearly right now. My trouble is, I am trying to add a status update and I can't figure out how to get Futures to settle one-item at a time. It seems that by the time the first item in my Future<> is ready so is the last. I'm hoping to find a place where I can know how many tasks my executorService has remaining/total so I can calculate a simple percentage indicator. Please note I intend on recycling my Executor and don't want to shut it down.
ExecutorService updateService = Executors.newSingleThreadExecutor();
Callable<String> callHour = () -> {
//doStuff, unaware of total number of hourCalls
return "done";
};
private void startMe(int hours){
List<Future<String>> futureHours;
List<Callable<String>> hourCalls = new ArrayList<>(hours);
for (int hour = 0; hour < hours; ++hour) {
hourCalls.add(callHour); //queue list (not running yet)
}
try {
//executes queue and blocks thread
futureHours = updateService.invokeAll(hourCalls);
futureHours.get(0).get();//performs blocking
} catch (Exception e) {
e.printStackTrace();
}
}
}
There are two things at work here.
Firstly, if we take a look at the documentation of ExecutorService#invokeAll(...), we see that it returns
[...] a list of Futures holding their status and results when all complete. [...]
(emphasis added by me)
You most probably want to use ExecutorService#submit(...) instead.
Secondly, you have no guarantee that the task coupled to futureHours.get(0) is executed first. I would suggest using Future#isDone() with some additional logic:
private void startMe(int hours) {
[...]
try {
[...]
ArrayList<Future<String>> futureHoursDone = new ArrayList<>();
final int numTasks = futureHours.size();
int done = 0;
double percentageDone = 0.0d;
while (futureHours.isEmpty() == false) {
for (int index = 0; index < futureHours.size(); ++index) {
Future<String> futureHour = futureHours.get(index);
if (futureHour.isDone()) {
futureHours.remove(index);
futureHoursDone.add(futureHour);
--index;
++done;
percentageDone = done / (double) numTasks;
}
}
}
} catch (Exception e) {
// TODO: don't forget to HCF (https://en.wikipedia.org/wiki/Halt_and_Catch_Fire) :)
e.printStackTrace();
}
}
(This is a rough sketch. To make the progress, i.e. percentage, visible to the outside, you would have to make it an attribute and accessible through, e.g., some getter)
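As an alternative to polling isDone(), here is a minimal sketch that hands tasks to the executor one at a time and reports a percentage as each one completes. It assumes an ExecutorCompletionService wrapped around the existing executor; the class name ProgressSketch, the method runWithProgress and the trivial callHour body are made up for illustration. The executor is only shut down at the end of main so the demo can exit; in your application you would keep it alive and reuse it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ProgressSketch {
    // Submits `hours` tasks and reports progress each time one of them finishes.
    // Returns the reported percentages so a caller can inspect them.
    static List<Integer> runWithProgress(ExecutorService executor, int hours) {
        CompletionService<String> completion = new ExecutorCompletionService<>(executor);
        Callable<String> callHour = () -> "done"; // stand-in for the real work
        for (int hour = 0; hour < hours; hour++) {
            completion.submit(callHour); // queued individually, not via invokeAll
        }
        List<Integer> reported = new ArrayList<>();
        try {
            for (int done = 1; done <= hours; done++) {
                completion.take().get(); // blocks until the next task has completed
                int percent = 100 * done / hours;
                reported.add(percent);
                System.out.println(percent + "% done");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return reported;
    }

    public static void main(String[] args) {
        ExecutorService updateService = Executors.newSingleThreadExecutor();
        try {
            runWithProgress(updateService, 4);
            runWithProgress(updateService, 2); // the same executor is reused
        } finally {
            updateService.shutdown(); // only so the demo JVM can exit
        }
    }
}
```

Because the total is known at submission time and take() returns futures in completion order, the percentage does not depend on which task happens to finish first.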
I have a problem which I would like to solve using Java's ExecutorService and Future classes. I am currently taking many samples from a function that is very expensive for me to compute (each sample can take several minutes) using a for loop. I have a class FunctionEvaluator that evaluates this function for me and this class is quite expensive to instantiate, since it contains a lot of internal memory, so I have made this class easily reusable with some internal counters and a reset() method. So my current situation looks like this:
int numSamples = 100;
int amountOfData = 1000000;
double[] data = new double[amountOfData];//Data comes from somewhere...
double[] results = new double[numSamples];
//a lot of memory contained inside the FunctionEvaluator class,
//expensive to initialise
FunctionEvaluator fe = new FunctionEvaluator();
for(int i=0; i<numSamples; i++) {
results[i] = fe.sampleAt(i, data);//very expensive computation
}
but I would like to get some multithreading going to speed things up. It should be easy enough, because while each sample will share whatever is inside of data, it is a read-only operation and each sample is independent of any other. Now I wouldn't be having any trouble with this since I've used Java's Future and ExecutorService before, but never in a context where the Callable had to be re-used. So in general, how would I go about setting this scenario up given that I can afford to run n instantiations of FunctionEvaluator? Something (very roughly) like this:
int numSamples = 100;
int amountOfData = 1000000;
int N = 10;
double[] data = new double[amountOfData];//Data comes from somewhere...
double[] results = new double[numSamples];
//a lot of memory contained inside the FunctionEvaluator class,
//expensive to initialise
FunctionEvaluator[] fe = new FunctionEvaluator[N];
for(int i=0; i<numSamples; i++) {
//Somehow add available FunctionEvaluators to an ExecutorService
//so that N FunctionEvaluators can run in parallel. When a
//FunctionEvaluator is finished, reset then compute a new sample
//until numSamples samples have been taken.
}
Any help would be greatly appreciated! Many thanks.
EDIT
So here is a toy example (which doesn't work :P). In this case the "expensive function" that I want to sample is just squaring an integer and the "expensive to instantiate class" that does it for me is called CallableComputation:
In TestConc.java:
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
public class TestConc {
public static void main(String[] args) {
SquareCalculator squareCalculator = new SquareCalculator();
int numFunctionEvaluators = 2;
int numSamples = 10;
ExecutorService executor = Executors.newFixedThreadPool(2);
CallableComputation c1 = new CallableComputation(2);
CallableComputation c2 = new CallableComputation(3);
CallableComputation[] callables = new CallableComputation[numFunctionEvaluators];
Future<Integer>[] futures = (new Future[numFunctionEvaluators]);
int[] results = new int[numSamples];
for(int i=0; i<numFunctionEvaluators; i++) {
callables[i] = new CallableComputation(i);
futures[i] = executor.submit(callables[i]);
}
futures[0] = executor.submit(c1);
futures[1] = executor.submit(c2);
for(int i=numFunctionEvaluators; i<numSamples; ) {
for(int j=0; j<futures.length; j++) {
if(futures[j].isDone()) {
try {
results[i] = futures[j].get();
}
catch (InterruptedException e) {
e.printStackTrace();
}
catch (ExecutionException e) {
e.printStackTrace();
}
callables[j].set(i);
System.out.printf("Function evaluator %d given %d\n", j, i+1);
executor.submit(callables[j]);
i++;
}
}
}
executor.shutdown();
try {
executor.awaitTermination(1, TimeUnit.MINUTES);
}
catch (InterruptedException e) {
e.printStackTrace();
}
for (int i=0; i<results.length; i++) {
System.out.printf("res%d=%d, ", i, results[i]);
}
System.out.println();
}
private static boolean areDone(Future<Integer>[] futures) {
for(int i=0; i<futures.length; i++) {
if(!futures[i].isDone()) {
return false;
}
}
return true;
}
private static void printFutures(Future<Integer>[] futures) {
for (int i=0; i<futures.length; i++) {
System.out.printf("f%d=%s | ", i, futures[i].isDone()?"done" : "not done");
}System.out.printf("\n");
}
}
In CallableComputation.java:
import java.util.concurrent.Callable;
public class CallableComputation implements Callable<Integer>{
int input = 0;
public CallableComputation(int input) {
this.input = input;
}
public void set(int i) {
input = i;
}
@Override
public Integer call() throws Exception {
System.out.printf("currval=%d\n", input);
Thread.sleep(500);
return input * input;
}
}
In Java8:
double[] result = IntStream.range(0, numSamples)
.parallel()
.mapToDouble(i->fe.sampleAt(i, data))
.toArray();
The question asks how to execute heavy computational functions in parallel while loading as many CPUs as possible.
Excerpt from the Parallelism tutorial:
Parallel computing involves dividing a problem into subproblems,
solving those problems simultaneously (in parallel, with each
subproblem running in a separate thread), and then combining the
results of the solutions to the subproblems. Java SE provides the
fork/join framework, which enables you to more easily implement
parallel computing in your applications. However, with this framework,
you must specify how the problems are subdivided (partitioned). With
aggregate operations, the Java runtime performs this partitioning and
combining of solutions for you.
The actual solution includes:
IntStream.range will generate the stream of integers from 0 (inclusive) to numSamples (exclusive).
parallel() will split the stream and execute it with all available CPUs on the box.
mapToDouble() will convert the stream of integers to a stream of doubles by applying the lambda expression that does the actual work.
toArray() is a terminal operation that will aggregate the result and return it as an array.
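To make the steps above concrete, here is a small self-contained sketch. The static sampleAt method is a hypothetical stand-in for FunctionEvaluator.sampleAt and is assumed to be thread safe (it only reads data and uses local variables); the class name is made up.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class ParallelSamplingSketch {
    // Hypothetical stand-in for the expensive FunctionEvaluator.sampleAt call.
    // It only reads `data`, so calling it from several threads at once is safe.
    static double sampleAt(int i, double[] data) {
        return data[i % data.length] * i;
    }

    static double[] sampleAll(int numSamples, double[] data) {
        return IntStream.range(0, numSamples)     // 0 .. numSamples-1
                .parallel()                        // spread over the common fork/join pool
                .mapToDouble(i -> sampleAt(i, data))
                .toArray();                        // result[i] still belongs to sample i
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0};
        System.out.println(Arrays.toString(sampleAll(6, data)));
    }
}
```

Note that toArray() keeps the encounter order of the range even when the stream runs in parallel, so no extra bookkeeping is needed to match results to sample indices.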
No special code change is required: you can submit the same Callable again and again without any issue. Also, since creating an instance of FunctionEvaluator is expensive, you can use a single instance, provided sampleAt is thread safe. One way to achieve that is to use only method-local variables and never modify any of the arguments passed in while any thread is running.
Please find a quick example below:
Code Snippet:
ExecutorService executor = Executors.newFixedThreadPool(2);
Callable<String> task1 = new Callable<String>(){public String call(){System.out.println(Thread.currentThread()+"currentThread");return null;}};
executor.submit(task1);
executor.submit(task1);
executor.shutdown();
You can wrap each FunctionEvaluator's actual work in a Callable/Runnable, create a fixed thread pool (which comes with a queue), and then simply submit the target Callable/Runnable to the pool.
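If sampleAt cannot be made thread safe, one way to reuse exactly N evaluators is to keep the idle ones in a BlockingQueue and let every task borrow one. This is only a sketch under assumptions: Evaluator is a hypothetical stand-in for FunctionEvaluator (with the reset() and sampleAt() methods mentioned in the question), and the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class EvaluatorPoolSketch {
    // Hypothetical stand-in for the expensive-to-create FunctionEvaluator.
    static class Evaluator {
        void reset() { /* clear internal counters */ }
        double sampleAt(int i, double[] data) { return data[i] * data[i]; }
    }

    // Assumes data.length >= numSamples for this toy sampleAt.
    static double[] sampleAll(int numSamples, int n, double[] data) {
        BlockingQueue<Evaluator> idle = new ArrayBlockingQueue<>(n);
        for (int k = 0; k < n; k++) {
            idle.add(new Evaluator()); // pay the construction cost only n times
        }
        ExecutorService executor = Executors.newFixedThreadPool(n);
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (int i = 0; i < numSamples; i++) {
                final int index = i;
                futures.add(executor.submit(() -> {
                    Evaluator fe = idle.take(); // borrow an idle evaluator
                    try {
                        fe.reset();
                        return fe.sampleAt(index, data);
                    } finally {
                        idle.put(fe); // hand it back for the next task
                    }
                }));
            }
            double[] results = new double[numSamples];
            for (int i = 0; i < numSamples; i++) {
                results[i] = futures.get(i).get(); // futures are in submission order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        double[] data = {0, 1, 2, 3, 4};
        System.out.println(java.util.Arrays.toString(sampleAll(5, 2, data)));
    }
}
```

Since the pool has n threads and the queue holds n evaluators, take() never has to wait long; the queue mainly guarantees that no evaluator is used by two tasks at once.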
I would like to get some multithreading going to speed things up.
Sounds like a good idea, but your code is massively overcomplicated. @Pavel has a dead simple Java 8 solution, but even without Java 8 you can make it a lot easier.
All you need to do is to submit the jobs into the executor and then call get() on each one of the Futures that are returned. A Callable class is not needed although it does make the code a lot cleaner. But you certainly don't need the arrays which are a bad pattern anyway because a typo can easily generate out-of-bounds exceptions. Stick to collections or Java 8 streams.
ExecutorService executor = Executors.newFixedThreadPool(2);
List<Future<Integer>> futureList = new ArrayList<Future<Integer>>();
for (int i = 0; i < numSamples; i++ ) {
// start the jobs running in the background
futureList.add(executor.submit(new CallableComputation(i)));
}
// shutdown executor if done submitting tasks, submitted jobs will keep running
executor.shutdown();
for (Future<Integer> future : futureList) {
// this will wait for the future to finish, it also throws some exceptions
Integer result = future.get();
// add result to a collection or something here
}
In my app there are two phases: one downloads some big data and the other manipulates it.
So I created two classes that implement Runnable, ImageDownloader and ImageManipulator, and they share a downloadedImagesBlockingQueue:
public class ImageDownloader implements Runnable {
private ArrayBlockingQueue<ImageBean> downloadedImagesBlockingQueue;
private ArrayBlockingQueue<String> imgUrlsBlockingQueue;
public ImageDownloader(ArrayBlockingQueue<String> imgUrlsBlockingQueue, ArrayBlockingQueue<ImageBean> downloadedImagesBlockingQueue) {
this.downloadedImagesBlockingQueue = downloadedImagesBlockingQueue;
this.imgUrlsBlockingQueue = imgUrlsBlockingQueue;
}
@Override
public void run() {
while (!this.imgUrlsBlockingQueue.isEmpty()) {
try {
String imgUrl = this.imgUrlsBlockingQueue.take();
ImageBean imageBean = doYourThing(imgUrl);
this.downloadedImagesBlockingQueue.add(imageBean);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
}
public class ImageManipulator implements Runnable {
private ArrayBlockingQueue<ImageBean> downloadedImagesBlockingQueue;
private AtomicInteger capacity;
public ImageManipulator(ArrayBlockingQueue<ImageBean> downloadedImagesBlockingQueue,
AtomicInteger capacity) {
this.downloadedImagesBlockingQueue = downloadedImagesBlockingQueue;
this.capacity = capacity;
}
@Override
public void run() {
while (capacity.get() > 0) {
try {
ImageBean imageBean = downloadedImagesBlockingQueue.take(); // <- HERE I GET THE DEADLOCK
capacity.decrementAndGet();
} catch (InterruptedException e) {
e.printStackTrace();
}
// ....
}
}
}
public class Main {
public static void main(String[] args) {
String[] imageUrls = new String[]{"url1", "url2"};
int capacity = imageUrls.length;
ArrayBlockingQueue<String> imgUrlsBlockingQueue = initImgUrlsBlockingQueue(imageUrls, capacity);
ArrayBlockingQueue<ImageBean> downloadedImagesBlockingQueue = new ArrayBlockingQueue<>(capacity);
ExecutorService downloaderExecutor = Executors.newFixedThreadPool(3);
for (int i = 0; i < 3; i++) {
Runnable worker = new ImageDownloader(imgUrlsBlockingQueue, downloadedImagesBlockingQueue);
downloaderExecutor.execute(worker);
}
downloaderExecutor.shutdown();
ExecutorService manipulatorExecutor = Executors.newFixedThreadPool(3);
AtomicInteger manipulatorCapacity = new AtomicInteger(capacity);
for (int i = 0; i < 3; i++) {
Runnable worker = new ImageManipulator(downloadedImagesBlockingQueue, manipulatorCapacity);
manipulatorExecutor.execute(worker);
}
manipulatorExecutor.shutdown();
while (!downloaderExecutor.isTerminated() && !manipulatorExecutor.isTerminated()) {
}
}
}
The deadlock happens in this scenario:
t1 checks capacity; it's 1.
t2 checks; it's 1.
t3 checks; it's 1.
t2 takes, sets capacity to 0, continues with the flow and eventually exits.
t1 and t3 are now deadlocked, because nothing will be added to the downloadedImagesBlockingQueue any more.
Eventually I want something like this: when the capacity is reached && the queue is empty, break out of the "while" loop and terminate gracefully.
Using "is the queue empty" as the only condition won't work, because the queue starts out empty until some ImageDownloader puts an imageBean into it.
There are a couple of things you can do to prevent deadlock:
Use a LinkedBlockingQueue which has a capacity
Use offer to add to the queue which does not block
Use drainTo or poll to take items from the queue which are not blocking
There are also some tips you might want to consider:
Use a ThreadPool:
final ExecutorService executorService = Executors.newFixedThreadPool(4);
If you use a fixed-size ThreadPool, you can add "poison pills" to the queue once you have finished adding data, one per thread in the pool, and check for them when you poll.
Using a ThreadPool is as simple as this:
final ExecutorService executorService = Executors.newFixedThreadPool(4);
final Future<?> result = executorService.submit(new Runnable() {
@Override
public void run() {
}
});
There is also the lesser-known ExecutorCompletionService, which abstracts this whole process.
You don't need the capacity in your consumer. It's currently read and updated from multiple threads, which causes the synchronization issue.
initImgUrlsBlockingQueue creates the url blocking queue with capacity number of URL items. (Right?)
ImageDownloader consumes the imgUrlsBlockingQueue and produces images. It terminates when all the URLs are downloaded or, if capacity means the number of images that should be downloaded (because some downloads may fail), when it has added capacity images.
Before ImageDownloader terminates, it adds a marker to the downloadedImagesBlockingQueue. A BlockingQueue does not accept null elements, so use a dedicated instance, for example static final ImageBean marker = new ImageBean().
Every ImageManipulator drains the queue using the following construct, and when it sees the marker, it adds it to the queue again and terminates.
// use identity comparison
while ((imageBean = downloadedImagesBlockingQueue.take()) != marker) {
// process image
}
downloadedImagesBlockingQueue.add(marker);
Note that a BlockingQueue promises that each of its method calls is atomic. However, if you check the capacity first and then consume an element based on it, that group of actions is not atomic.
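Put together, the marker approach could look roughly like the following sketch. It assumes a single producer that adds the marker once after the last real item, an unbounded LinkedBlockingQueue so that re-adding the marker never blocks, and a toy ImageBean class; all names here are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoisonPillSketch {
    // Toy stand-in for the ImageBean from the question.
    static class ImageBean {
        final String url;
        ImageBean(String url) { this.url = url; }
    }

    // Dedicated end-of-stream marker, recognised by identity (not equals).
    static final ImageBean MARKER = new ImageBean("marker");

    // Runs `consumers` manipulators over `items` images; returns how many were processed.
    static int run(int items, int consumers) {
        BlockingQueue<ImageBean> queue = new LinkedBlockingQueue<>();
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(consumers);
        for (int c = 0; c < consumers; c++) {
            pool.execute(() -> {
                try {
                    ImageBean bean;
                    while ((bean = queue.take()) != MARKER) {
                        processed.incrementAndGet(); // "process" the image
                    }
                    queue.put(MARKER); // pass the marker on to the next consumer
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            for (int i = 0; i < items; i++) {
                queue.put(new ImageBean("url" + i)); // the producer side
            }
            queue.put(MARKER); // signals that nothing else will arrive
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println("processed " + run(5, 3) + " images");
    }
}
```

Every consumer blocks in take() without busy waiting, and each one leaves the loop as soon as it has seen the marker, so no shared counter is needed.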
Well, I used some of the suggested features, but this is the complete solution for me: one that does not busy-wait and instead waits until the Downloader notifies it.
public ImageManipulator(LinkedBlockingQueue<ImageBean> downloadedImagesBlockingQueue,
LinkedBlockingQueue<ImageBean> manipulatedImagesBlockingQueue,
AtomicInteger capacity,
ManipulatedData manipulatedData,
ReentrantLock downloaderReentrantLock,
ReentrantLock manipulatorReentrantLock,
Condition downloaderNotFull,
Condition manipulatorNotFull) {
this.downloadedImagesBlockingQueue = downloadedImagesBlockingQueue;
this.manipulatedImagesBlockingQueue = manipulatedImagesBlockingQueue;
this.capacity = capacity;
this.downloaderReentrantLock = downloaderReentrantLock;
this.manipulatorReentrantLock = manipulatorReentrantLock;
this.downloaderNotFull = downloaderNotFull;
this.manipulatorNotFull = manipulatorNotFull;
this.manipulatedData = manipulatedData;
}
@Override
public void run() {
while (capacity.get() > 0) {
downloaderReentrantLock.lock();
if (capacity.get() > 0) { //checks if the value is updated.
ImageBean imageBean = downloadedImagesBlockingQueue.poll();
if (imageBean != null) { // will be null if no downloader finished is work (successfully downloaded or not)
capacity.decrementAndGet();
if (capacity.get() == 0) { //signal all the manipulators to wake up and stop waiting for downloaded images.
downloaderNotFull.signalAll();
}
downloaderReentrantLock.unlock();
if (imageBean.getOriginalImage() != null) { // the downloader will set it null iff it failes to download it.
// business logic
}
manipulatedImagesBlockingQueue.add(imageBean);
signalAllPersisters(); // signal the persisters (which has the same lock/unlock as this manipulator.
} else {
try {
downloaderNotFull.await(); //manipulator will wait for downloaded image - downloader will signalAllManipulators (same as signalAllPersisters() here) when an imageBean will be inserted to queue.
downloaderReentrantLock.unlock();
} catch (InterruptedException e) {
logger.log(Level.ERROR, e.getMessage(), e);
}
}
}
}
logger.log(Level.INFO, "Manipulator: " + Thread.currentThread().getId() + " Ended Gracefully");
}
private void signalAllPersisters() {
manipulatorReentrantLock.lock();
manipulatorNotFull.signalAll();
manipulatorReentrantLock.unlock();
}
For full flow you can check this project on my github: https://github.com/roy-key/image-service/
Your issue is that you are trying to use a counter to track queue elements and aren't composing operations that need to be atomic. You are doing check, take, decrement. This allows the queue size and counter to desynchronize, and your threads block forever. It would be better to write a synchronization primitive that is 'closeable' so that you don't have to keep an associated counter. However, a quick fix would be to get and decrement the counter atomically:
while (capacity.getAndDecrement() > 0) {
try {
ImageBean imageBean = downloadedImagesBlockingQueue.take();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
In this case if there are 3 threads and only one element left in the queue then only one thread will atomically decrement the counter and see that it can take without blocking. Both other threads will see 0 or <0 and break out of the loop.
You also need to make all of your class instance variables final so that they have the correct memory visibility. You should also determine how you are going to handle interrupts rather than relying on the default print trace template.
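Here is a small runnable sketch of the quick fix, with the interrupt handling made explicit. It assumes that exactly `items` elements are eventually put on the queue, so every thread that claims a slot with getAndDecrement() is guaranteed to find an element. The class name and the String payload are placeholders.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class CountedConsumerSketch {
    // Starts `consumers` threads that together take exactly `items` elements.
    // Returns the number of elements actually taken.
    static int run(int items, int consumers) {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        final AtomicInteger remaining = new AtomicInteger(items);
        final AtomicInteger taken = new AtomicInteger();

        Runnable consumer = () -> {
            // Claim a slot first; only a thread that claimed one calls take().
            while (remaining.getAndDecrement() > 0) {
                try {
                    queue.take();
                    taken.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt visible
                    return;
                }
            }
        };

        Thread[] threads = new Thread[consumers];
        for (int i = 0; i < consumers; i++) {
            threads[i] = new Thread(consumer);
            threads[i].start();
        }
        try {
            for (int i = 0; i < items; i++) {
                queue.put("image-" + i); // the producer side
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return taken.get();
    }

    public static void main(String[] args) {
        System.out.println("taken: " + run(5, 3));
    }
}
```

The counter may go below zero once all threads have made their final failed claim, which is harmless here because it is only compared against 0.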
I have written a simple program that is intended to start a few threads. Each thread should pick an integer n from an integer array, wait n, and write the time t it waited into an array of results.
When a thread finishes its task, it should pick the next one that has not yet been assigned to another thread.
Of course, the order in the arrays has to be maintained so that integers and results match.
My code runs smoothly as far as I can see.
However, there is one line of code that I find particularly unsatisfying, and I hope there is a good way to fix it without changing too much:
while(Thread.activeCount() != 1); // first evil line
I kind of abuse this line to make sure all my threads have finished all their tasks before I access my array of results. I want to do that to prevent bad values such as 0.0, a NullPointerException, etc. (in short, anything that would make a real application crash).
I am also not sure whether my code still runs correctly for very long arrays of tasks, for example whether the results could stop matching the order of the integers.
Any constructive help is appreciated.
First class:
public class ThreadArrayWriterTest {
int[] repitions;
int len = 0;
double[] timeConsumed;
public boolean finished() {
synchronized (repitions) {
return len <= 0;
}
}
public ThreadArrayWriterTest(int[] repitions) {
this.repitions = repitions;
this.len = repitions.length;
timeConsumed = new double[this.len];
}
public double[] returnTimes(int[] repititions, int numOfThreads, TimeConsumer timeConsumer) {
for (int i = 0; i < numOfThreads; i++) {
new Thread() {
public void run() {
while (!finished()) {
len--;
timeConsumed[len] = timeConsumer.returnTimeConsumed(repititions[len]);
}
}
}.start();
}
while (Thread.activeCount() != 1) // first evil line
;
return timeConsumed;
}
public static void main(String[] args) {
long begin = System.currentTimeMillis();
int[] repitions = { 3, 1, 3, 1, 2, 1, 3, 3, 3 };
int numberOfThreads = 10;
ThreadArrayWriterTest t = new ThreadArrayWriterTest(repitions);
double[] times = t.returnTimes(repitions, numberOfThreads, new TimeConsumer());
for (double d : times) {
System.out.println(d);
}
long end = System.currentTimeMillis();
System.out.println("Total time of execution: " + (end - begin));
}
}
Second class:
public class TimeConsumer {
double returnTimeConsumed(int repitions) {
long before = System.currentTimeMillis();
for (int i = 0; i < repitions; i++) {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
long after = System.currentTimeMillis();
double ret = after - before;
System.out.println("It takes: " + ret + "ms" + " for " + repitions + " runs through the for-loop");
return ret;
}
}
The easiest way to wait for all threads to complete is to keep a Collection of them and then call Thread.join() on each one in turn.
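A minimal sketch of that idea follows. On top of join() it also assumes an AtomicInteger for handing out array indices, so that no two threads claim the same task; the multiplication stands in for the timed work, and all names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class JoinSketch {
    static double[] process(int[] repetitions, int numThreads) {
        double[] results = new double[repetitions.length];
        AtomicInteger next = new AtomicInteger(0); // next unassigned index
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < numThreads; t++) {
            Thread worker = new Thread(() -> {
                int index;
                // getAndIncrement hands each index to exactly one thread
                while ((index = next.getAndIncrement()) < repetitions.length) {
                    results[index] = repetitions[index] * 10.0; // stand-in for the timed work
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // returns only once that worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results; // all writes are visible after the joins
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(process(new int[]{3, 1, 2}, 4)));
    }
}
```

Each result is written to the same index its input came from, so the order of the two arrays matches regardless of which thread handled which task.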
In addition to .join(), you can use ExecutorService to manage pools of threads. From its documentation:
An Executor that provides methods to manage termination and methods
that can produce a Future for tracking progress of one or more
asynchronous tasks.
An ExecutorService can be shut down, which will cause it to reject new
tasks. Two different methods are provided for shutting down an
ExecutorService. The shutdown() method will allow previously submitted
tasks to execute before terminating, while the shutdownNow() method
prevents waiting tasks from starting and attempts to stop currently
executing tasks. Upon termination, an executor has no tasks actively
executing, no tasks awaiting execution, and no new tasks can be
submitted. An unused ExecutorService should be shut down to allow
reclamation of its resources.
Method submit extends base method Executor.execute(Runnable) by
creating and returning a Future that can be used to cancel execution
and/or wait for completion. Methods invokeAny and invokeAll perform
the most commonly useful forms of bulk execution, executing a
collection of tasks and then waiting for at least one, or all, to
complete.
ExecutorService executorService = Executors.newFixedThreadPool(maximumNumberOfThreads);
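CompletionService completionService = new ExecutorCompletionService(executorService);
for (int i = 0; i < numberOfTasks; ++i) {
completionService.submit(task); // task: a hypothetical Callable that does the work
}
for (int i = 0; i < numberOfTasks; ++i) {
completionService.take(); // blocks until the next submitted task has completed
}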
executorService.shutdown();
Plus take a look at ThreadPoolExecutor
Since Java provides a more advanced threading API in the java.util.concurrent package, you should have a look at ExecutorService, which simplifies thread management.
A simple solution to your problem:
Use Executors API to create thread pool
static ExecutorService newFixedThreadPool(int nThreads)
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue.
Use invokeAll to wait for all tasks to complete.
Sample code:
ExecutorService service = Executors.newFixedThreadPool(10);
List<MyCallable> futureList = new ArrayList<MyCallable>();
for ( int i=0; i<12; i++){
MyCallable myCallable = new MyCallable((long)i);
futureList.add(myCallable);
}
System.out.println("Start");
try{
List<Future<Long>> futures = service.invokeAll(futureList);
for(Future<Long> future : futures){
try{
System.out.println("future.isDone = " + future.isDone());
System.out.println("future: call ="+future.get());
}
catch(Exception err1){
err1.printStackTrace();
}
}
}catch(Exception err){
err.printStackTrace();
}
service.shutdown();
Refer to this related SE question for more details on achieving the same:
wait until all threads finish their work in java
Good day,
I am writing a program where a method is called for each line read from a text file. As each call of this method is independent of any other line read I can call them on parallel. To maximize cpu usage I use a ExecutorService where I submit each run() call. As the text file has 15 million lines, I need to stagger the ExecutorService run to not create too many jobs at once (OutOfMemory exception). I also want to keep track of the time each submitted run has been running as I have seen that some are not finishing. The problem is that when I try to use the Future.get method with timeout, the timeout refers to the time since it got into the queue of the ExecutorService, not since it started running, if it even started. I would like to get the time since it started running, not since it got into the queue.
The code looks like this:
ExecutorService executorService= Executors.newFixedThreadPool(ncpu);
line = reader.readLine();
long start = System.currentTimeMillis();
HashMap<MyFut,String> runs = new HashMap<MyFut, String>();
HashMap<Future, MyFut> tasks = new HashMap<Future, MyFut>();
while ( (line = reader.readLine()) != null ) {
String s = line.split("\t")[1];
final String m = line.split("\t")[0];
MyFut f = new MyFut(s, m);
tasks.put(executorService.submit(f), f);
runs.put(f, line);
while (tasks.size()>ncpu*100){
try {
Thread.sleep(100);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Iterator<Future> i = tasks.keySet().iterator();
while(i.hasNext()){
Future task = i.next();
if (task.isDone()){
i.remove();
} else {
MyFut fut = tasks.get(task);
if (fut.elapsed()>10000){
System.out.println(line);
task.cancel(true);
i.remove();
}
}
}
}
}
private static class MyFut implements Runnable{
private long start;
String copy;
String id2;
public MyFut(String m, String id){
super();
copy=m;
id2 = id;
}
public long elapsed(){
return System.currentTimeMillis()-start;
}
@Override
public void run() {
start = System.currentTimeMillis();
// do something...
}
}
As you can see I try to keep track of how many jobs I have sent and if a threshold is passed I wait a bit until some have finished. I also try to check if any of the jobs is taking too long to cancel it, keeping in mind which failed, and continue execution. This is not working as I hoped. 10 seconds execution for one task is much more than needed (I get 1000 lines done in 70 to 130s depending on machine and number of cpu).
What am I doing wrong? Shouldn't the run method in my Runnable class be called only when some Thread in the ExecutorService is free and starts working on it? I get a lot of results that take more than 10 seconds. Is there a better way to achieve what I am trying?
Thanks.
If you are using Future, I would recommend changing Runnable to Callable and returning the task's total execution time as the result. Below is sample code:
import java.util.concurrent.Callable;
public class MyFut implements Callable<Long> {
String copy;
String id2;
public MyFut(String m, String id) {
super();
copy = m;
id2 = id;
}
@Override
public Long call() throws Exception {
long start = System.currentTimeMillis();
//do something...
long end = System.currentTimeMillis();
return (end - start);
}
}
You are making your work harder than it needs to be. Java’s framework provides everything you want; you only have to use it.
Limiting the number of pending work items works by using a bounded queue, but the ExecutorService returned by Executors.newFixedThreadPool() uses an unbounded queue. The policy to wait once the bounded queue is full can be implemented via a RejectedExecutionHandler. The entire thing looks like this:
static class WaitingRejectionHandler implements RejectedExecutionHandler {
public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
try {
executor.getQueue().put(r);// block until capacity available
} catch(InterruptedException ex) {
throw new RejectedExecutionException(ex);
}
}
}
public static void main(String[] args)
{
final int nCPU=Runtime.getRuntime().availableProcessors();
final int maxPendingJobs=100;
ExecutorService executorService=new ThreadPoolExecutor(nCPU, nCPU, 1, TimeUnit.MINUTES,
new ArrayBlockingQueue<Runnable>(maxPendingJobs), new WaitingRejectionHandler());
// start flooding the `executorService` with jobs here
That’s all.
Measuring the elapsed time within a job is quite easy as it has nothing to do with multi-threading:
long startTime=System.nanoTime();
// do your work here
long elapsedTimeSoFar = System.nanoTime()-startTime;
But maybe you don’t need it anymore once you are using the bounded queue.
By the way the Future.get method with timeout does not refer to the time since it got into the queue of the ExecutorService, it refers to the time of invoking the get method itself. In other words, it tells how long the get method is allowed to wait, nothing more.
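The following sketch illustrates that last point. The 500 ms sleep and the 50 ms timeout are arbitrary values chosen for the demonstration, and the class name is made up. The timed get() gives up waiting, but the task itself is neither cancelled nor restarted, so a later get() still returns its result.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class GetTimeoutSketch {
    static String demo() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = executor.submit(() -> {
                Thread.sleep(500); // simulated slow job
                return "done";
            });
            try {
                future.get(50, TimeUnit.MILLISECONDS); // wait at most 50 ms from now
                return "finished early";
            } catch (TimeoutException e) {
                // Only the waiting was abandoned; the task keeps running.
                return "timed out, then " + future.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If you want a job to stop after the timeout, you have to call future.cancel(true) yourself, and the job must react to interruption.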
I read in a few posts that using JUnit to test concurrency is not ideal but I have no choice for now. I have just encountered an exception that I can't explain.
I run a test where, in summary:
I submit 1000 runnables to an executor
each runnable adds an element to a list
I wait for the executor termination
JUnit tells me the list only has 999 elements
no exception is printed in the runnable catch block
What could cause that behavior?
Note: I only get the exception from time to time. The code has some non related stuff but I left it there in case I missed something. XXXQuery is an enum.
public void testConcurrent() throws InterruptedException {
final int N_THREADS = 1000;
final XXXData xxxData = new AbstractXXXDataImpl();
final List<QueryResult> results = new ArrayList<>();
ExecutorService executor = Executors.newFixedThreadPool(N_THREADS);
for (int i = 0; i < N_THREADS; i++) {
final int j = i;
executor.submit(new Runnable() {
@Override
public void run() {
try {
results.add(xxxData.get(XXXQuery.values()[j % XXXQuery.values().length]));
} catch (Exception e) {
System.out.println(e);
}
}
});
}
executor.shutdown();
executor.awaitTermination(10, TimeUnit.SECONDS);
assertEquals(N_THREADS, results.size());
}
You cannot add to the results ArrayList in your Runnable.run() method in multiple threads without synchronizing around it.
The assertion failed message is showing that although N_THREADS calls to add() were made, the ArrayList got fewer entries because of concurrency race conditions.
I would use a final array instead of a list. Something like:
final QueryResult[] results = new QueryResult[N_THREADS];
for (int i = 0; i < N_THREADS; i++) {
...
public void run() {
results[j] = data.get(Query.values()[j % Query.values().length]);
}
Also, I don't quite get the XXXQuery.values() but I'd pull that into a variable above the loop unless it is changing.
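If you would rather keep a list than switch to an array, a sketch of the synchronized alternative is below. It assumes Collections.synchronizedList as the wrapper; the class name, the pool size of 8 and the Integer payload are placeholders for the real test.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SynchronizedListSketch {
    // Adds `tasks` elements from pool threads and returns the final list size.
    static int addConcurrently(int tasks) {
        // Every add() on the wrapper is synchronized on the list itself.
        List<Integer> results = Collections.synchronizedList(new ArrayList<>());
        ExecutorService executor = Executors.newFixedThreadPool(8);
        for (int i = 0; i < tasks; i++) {
            final int j = i;
            executor.execute(() -> results.add(j));
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results.size();
    }

    public static void main(String[] args) {
        System.out.println(addConcurrently(1000));
    }
}
```

Unlike the array version, the order of elements in the list depends on thread scheduling, so use it only when the position of each result does not matter.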