How to read multiple files using thread pool? - java

I want to read multiple files using a thread pool, but I failed.
@Test
public void test2() throws IOException {
    String dir = "/tmp/acc2tid2999928854413665054";
    int[] shardIds = new int[]{1, 2};
    ExecutorService executorService = Executors.newFixedThreadPool(2);
    for (int id : shardIds) {
        executorService.submit(() -> {
            try {
                System.out.println(Files.readAllLines(Paths.get(dir, String.valueOf(id)), Charset.forName("UTF-8")));
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
    }
}
Above is a simple example I wrote, but it does not do what I want.
System.out.println(Files.readAllLines(
        Paths.get(dir, String.valueOf(id)), Charset.forName("UTF-8")));
This line never runs, and there are no warnings. Why?

You are submitting tasks for execution and then ending the test without waiting for them to complete. ExecutorService::submit schedules the task to run at some point in the future and returns immediately. Your for-loop therefore submits the two tasks and ends, and the test method returns before the tasks have had time to complete.
Try calling ExecutorService::shutdown after the for-loop to tell the executor that all tasks have been submitted, then use ExecutorService::awaitTermination to block until the tasks are complete.
For example:
@Test
public void test2() throws InterruptedException {
    String dir = "/tmp/acc2tid2999928854413665054";
    int[] shardIds = new int[]{1, 2};
    ExecutorService executorService = Executors.newFixedThreadPool(2);
    for (int id : shardIds) {
        executorService.submit(
            () -> {
                try {
                    System.out.println(Files.readAllLines(Paths.get(dir, String.valueOf(id)), Charset.forName("UTF-8")));
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
    }
    executorService.shutdown(); // no new tasks; already submitted tasks still run
    executorService.awaitTermination(60, TimeUnit.SECONDS); // wait up to 1 minute for the tasks to complete
}
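awaitTermination returns a boolean that tells you whether the pool actually terminated before the timeout. The following is a minimal sketch of a helper that checks it and falls back to shutdownNow; the class and method names (AwaitPoolSketch, shutdownAndWait, runAndCount) are made up for illustration and are not part of the answer above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class AwaitPoolSketch {

    // Shuts the pool down and waits; returns true only if every task finished in time.
    static boolean shutdownAndWait(ExecutorService pool, long timeout, TimeUnit unit)
            throws InterruptedException {
        pool.shutdown();                          // accept no new tasks
        if (pool.awaitTermination(timeout, unit)) {
            return true;                          // all submitted tasks completed
        }
        pool.shutdownNow();                       // timed out: interrupt what is still running
        return false;
    }

    // Submits `tasks` trivial jobs and reports how many had run once the wait returned.
    static int runAndCount(int tasks) throws InterruptedException {
        AtomicInteger completed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                completed.incrementAndGet();
            });
        }
        shutdownAndWait(pool, 10, TimeUnit.SECONDS);
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed: " + runAndCount(5));
    }
}
```

In a test, asserting on the boolean result makes a timeout visible instead of letting the test pass silently with unfinished tasks.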

Related

ScheduledExecutorService waits for a task to complete: do pending tasks pile up and ultimately interrupt the main thread?

I am curious about my new implementation using ScheduledExecutorService, in which the task is expected to finish within the 100 ms period (with a 0 ms initial delay). If the system is under load and a run takes, say, 550 ms, would ScheduledExecutorService maintain a queue for the 4 pending runs and then start them as soon as the first one finishes? And if the second execution takes 560 ms, would that add another 4 tasks to its queue?
I could not find documentation on this, or I might be overlooking it. I want to make sure that a pile-up of such executions cannot cause a leak or an overflow.
For example, in the code below, could the main thread ever fail?
private static ScheduledExecutorService consumerThreadPool = Executors.newSingleThreadScheduledExecutor();

public static void main(String[] args) throws Exception {
    consumerThreadPool.scheduleAtFixedRate(() -> performTask(), 0, 1, TimeUnit.MILLISECONDS);
}

private static void performTask() {
    try {
        Thread.sleep(550);
    } catch (InterruptedException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
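Executions that overrun the next scheduled time do not pile up. The scheduleAtFixedRate documentation states that if an execution takes longer than its period, subsequent executions may start late but will not execute concurrently. In the standard ScheduledThreadPoolExecutor the periodic task is re-queued only after the current run finishes, so there is at most one pending execution and the queue cannot overflow. You can verify this easily with System.out.println by varying the sleep between 500 ms and 5000 ms: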
public static void main(final String[] args) throws InterruptedException, ExecutionException
{
    var executor = Executors.newScheduledThreadPool(1);
    var count = new AtomicInteger();
    Runnable task = () -> {
        String desc = "run("+((System.currentTimeMillis() / 1000) % 60)+") "+Thread.currentThread().getName()+" count "+count.incrementAndGet();
        System.out.println(desc);
        if(count.get() == 50)
            throw new RuntimeException("I give up!");
        try
        {
            Thread.sleep(2500);
        }
        catch (InterruptedException e)
        {
            System.out.println("Thread "+Thread.currentThread().getName()+" INTERRUPTED");
        }
    };
    var future = executor.scheduleAtFixedRate(task, 5000, 1000, TimeUnit.MILLISECONDS);
    System.out.println("Calling future.get() ...");
    try {
        var res = future.get();
        System.out.println("future.get()="+res);
    }
    catch(Exception e)
    {
        System.err.println("There was an exception:" +e);
        // Decide between "continue" or "throw e" here
        // ...
    }
    executor.shutdownNow();
    System.out.println("shutdown complete");
}
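The same behaviour can be checked by counting runs instead of reading log lines. The sketch below is only an illustration (FixedRateOverrunSketch and countRuns are hypothetical names): it schedules a task that takes far longer than its period and counts how many executions start in a fixed window. The count tracks the task duration rather than the period, so missed periods do not accumulate as queued work.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedRateOverrunSketch {

    // Runs a task lasting taskMillis at a fixed rate of periodMillis for roughly
    // windowMillis, then returns how many executions were started.
    static int countRuns(long periodMillis, long taskMillis, long windowMillis)
            throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        scheduler.scheduleAtFixedRate(() -> {
            runs.incrementAndGet();
            try {
                Thread.sleep(taskMillis);             // simulate a slow task
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // shutdownNow() interrupted us
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        Thread.sleep(windowMillis);
        scheduler.shutdownNow();                      // cancels the periodic task
        scheduler.awaitTermination(1, TimeUnit.SECONDS);
        return runs.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // A 10 ms period leaves room for about 50 slots in 500 ms,
        // but each run takes 100 ms, so only a handful of runs start.
        System.out.println("runs started: " + countRuns(10, 100, 500));
    }
}
```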

Executor Does Not Return 10 Future Objects

I have an executor service with a Thread Pool of 10 and I expected that I would get 10 print out statements separated by three seconds, but I only receive one print out statement. I passed 10 as the parameter so I was expecting 10 threads to be running. How can I retrieve the 10 future objects?
public class Demo {
    private static final ExecutorService executor = Executors.newFixedThreadPool(10);

    public static void main (String[] args) throws ExecutionException, InterruptedException {
        ArrayList futureObjects = new ArrayList();
        Callable<Integer> task = () -> {
            try {
                TimeUnit.SECONDS.sleep(3);
                return 123;
            }
            catch (InterruptedException e) {
                throw new IllegalStateException("task interrupted", e);
            }
        };
        System.out.println("Before execution of threads");
        Future<Integer> future = executor.submit(task);
        Integer result = future.get();
        futureObjects.add(future.get());
        System.out.println("result: " + result);
        for(Object futures : futureObjects ){
            System.out.println("Futures in ArrayList: " + futures);
        }
    }
}
The output I get is:
Before execution of threads
result: 123
Futures in ArrayList: 123
You have submitted only one task to the thread pool, so only one task was executed and returned a result.
To keep the pool's threads busy, you need to submit multiple tasks (using Option 1 or Option 2 below).
Here is an updated version of the code:
Option 1: ExecutorService.invokeAll():
private static final ExecutorService executor = Executors.newFixedThreadPool(10);

public static void main (String[] args) throws ExecutionException, InterruptedException {
    Callable<Integer> task = () -> {
        try {
            TimeUnit.MILLISECONDS.sleep(100);
            return 123;
        }
        catch (InterruptedException e) {
            throw new IllegalStateException("task interrupted", e);
        }
    };
    List<Callable<Integer>> callables = new ArrayList<>();
    callables.add(task);
    callables.add(task);
    callables.add(task);
    callables.add(task);
    //Add other tasks
    System.out.println("Before execution of threads");
    // invokeAll blocks until every task in the list has completed
    List<Future<Integer>> futures = executor.invokeAll(callables);
    for(Future<Integer> future : futures ){
        System.out.println("Futures in ArrayList: " + future.get());
    }
    executor.shutdown(); // release the pool threads so the JVM can exit
}
Option 2: ExecutorService.submit():
private static final ExecutorService executor = Executors.newFixedThreadPool(10);

public static void main (String[] args) throws ExecutionException, InterruptedException {
    Callable<Integer> task = () -> {
        try {
            TimeUnit.MILLISECONDS.sleep(100);
            return 123;
        }
        catch (InterruptedException e) {
            throw new IllegalStateException("task interrupted", e);
        }
    };
    List<Callable<Integer>> callables = new ArrayList<>();
    callables.add(task);
    callables.add(task);
    callables.add(task);
    callables.add(task);
    //Add other tasks
    List<Future<Integer>> futures = new ArrayList<>();
    System.out.println("Before execution of threads");
    // submit returns immediately; keep each Future so the result can be read later
    for(Callable<Integer> callable : callables) {
        futures.add(executor.submit(callable));
    }
    for(Future<Integer> future : futures ){
        System.out.println("Futures in ArrayList: " + future.get());
    }
    executor.shutdown(); // release the pool threads so the JVM can exit
}
See the ExecutorService API documentation (invokeAll and submit) for details.
The executor created here can run up to 10 tasks in parallel, but each submitted task is executed only once.
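To get ten Future objects, ten tasks have to be submitted. A minimal sketch of that, with a hypothetical TenFuturesSketch class and a 100 ms sleep standing in for real work:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class TenFuturesSketch {

    // Submits n tasks, keeps one Future per task, and returns the results in submission order.
    static List<Integer> runTasks(int n) throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(10);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int id = i;
                // The lambda returns a value, so it is submitted as a Callable<Integer>.
                futures.add(executor.submit(() -> {
                    TimeUnit.MILLISECONDS.sleep(100);
                    return id;
                }));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get());            // blocks until that task is done
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runTasks(10));
    }
}
```

Because all submissions happen before the first get(), the ten tasks sleep concurrently on the ten pool threads instead of one after another.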

Java 7: How to execute parallel tasks in batches?

I have three web-service calls that can run in parallel. Hence, I'm using a fixed pool of 3 threads to run them.
Now I want to process a couple more web-service calls, that can run in parallel, but only after the first three calls are processed.
How can I batch them? I want the ones inside a batch to run in parallel. And every batch only runs after the previous batch is completed.
So far I am only working with three services. How can I batch them and start using another 2 services?
ExecutorService peopleDataTaskExecutor = Executors.newFixedThreadPool(3);
Future<Collection<PeopleInterface>> task1 = null;
if (condition) {
    task1 = peopleDataTaskExecutor.submit(buildTask1Callable(mycontext));
}
Future<Map<String, Task2Response>> task2 = peopleDataTaskExecutor.submit(buildTask2Callable(mycontext));
Future<Map<String, Task3Response>> task3 = null;
task3 = peopleDataTaskExecutor.submit(buildTask3Callable(mycontext));
peopleDataTaskExecutor.shutdown();
try {
    peopleDataTaskExecutor.awaitTermination(10, TimeUnit.SECONDS);
} catch (InterruptedException e) {
}
Collection<PeopleInterface> task1Data = null;
try {
    task1Data = task1 != null ? task1.get() : null;
} catch (InterruptedException | ExecutionException e) {
}
Map<String, Task2Response> task2Data = null;
try {
    task2Data = task2.get();
} catch (InterruptedException | ExecutionException e) {
}
Map<String, Task3Response> task3Data = null;
if (task3 != null) {
    try {
        task3Data = task3.get();
    } catch (InterruptedException | ExecutionException e) {
    }
}
The easiest way to execute batches sequentially is to use the invokeAll() method. It accepts a collection of tasks, submits them to the executor and waits until completion (or until a timeout expires). Here's a simple example that executes three batches sequentially. Each batch contains three tasks running in parallel:
import java.util.*;
import java.util.concurrent.*;

public class Program {
    static class Task implements Callable<Integer> {
        private static final Random rand = new Random();
        private final int no;

        Task(int no) {
            this.no = no;
        }

        @Override
        public Integer call() throws Exception {
            Thread.sleep(rand.nextInt(5000));
            System.out.println("Task " + no + " finished");
            return no;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        processBatch(executor, 1);
        processBatch(executor, 2);
        processBatch(executor, 3);
        executor.shutdown();
    }

    private static void processBatch(ExecutorService executor, int batchNo) throws InterruptedException {
        Collection<Callable<Integer>> batch = new ArrayList<>();
        batch.add(new Task(batchNo * 10 + 1));
        batch.add(new Task(batchNo * 10 + 2));
        batch.add(new Task(batchNo * 10 + 3));
        // Blocks until all three tasks of this batch are done
        List<Future<Integer>> futures = executor.invokeAll(batch);
        System.out.println("Batch " + batchNo + " processed");
    }
}
You can use those Futures in the processBatch() method to check the completion state of each task (whether it executed successfully or terminated because of an exception), obtain its return value, and so on.
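As a rough sketch of that last point, the helper below runs one batch with the timed variant of invokeAll and classifies each Future as succeeded, failed, or cancelled by the timeout. BatchResultsSketch, runBatch, and demo are hypothetical names, and the three sample tasks are stand-ins for real web-service calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class BatchResultsSketch {

    // Runs one batch and returns {succeeded, failed, cancelled}.
    static int[] runBatch(ExecutorService executor, List<Callable<Integer>> batch, long timeoutMillis)
            throws InterruptedException {
        int succeeded = 0, failed = 0, cancelled = 0;
        // Tasks still running when the timeout expires are cancelled by invokeAll.
        List<Future<Integer>> futures = executor.invokeAll(batch, timeoutMillis, TimeUnit.MILLISECONDS);
        for (Future<Integer> future : futures) {
            if (future.isCancelled()) {
                cancelled++;
                continue;
            }
            try {
                future.get();                 // returns at once: the task is already done
                succeeded++;
            } catch (ExecutionException e) {
                failed++;                     // e.getCause() is what the task threw
            }
        }
        return new int[] {succeeded, failed, cancelled};
    }

    static int[] demo() throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        List<Callable<Integer>> batch = new ArrayList<>();
        batch.add(() -> 1);                                              // succeeds
        batch.add(() -> { throw new IllegalStateException("boom"); });  // fails
        batch.add(() -> { Thread.sleep(30_000); return 3; });           // too slow
        try {
            return runBatch(executor, batch, 1_000);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = demo();
        System.out.println("succeeded=" + r[0] + " failed=" + r[1] + " cancelled=" + r[2]);
    }
}
```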

Stop ExecutorService threads when one thread fails. & return the exception

If any of the submitted threads throws an exception, the exception is not returned.
I want to write code for my project such that, if any thread's execution fails, the exception is thrown right there and all running and scheduled threads are stopped.
ExecutorService executorService = Executors.newFixedThreadPool(10);
for (int i = 0; i < 10; i++) {
    Thread t = new Thread(new MyObject());
    executorService.submit(t);
}
I wrote MyObject like this:
public class MyObject implements Runnable {
    public void run() {
        throw new NullPointerException("Sample NullPointerException");
    }
}
Is this the correct implementation for my goal? Any pointers would be appreciated.
Here is an approach you can consider. It uses Callable tasks instead of Thread objects.
public static void main(String[] args) {
    ExecutorService executorService = Executors.newFixedThreadPool(10);
    Set<Future<Void>> futureSet = new HashSet<Future<Void>>();
    for (int i = 0; i < 9; i++) {
        CallableTask1 task = new CallableTask1();
        futureSet.add(executorService.submit(task));
    }
    CallableTask2 task2 = new CallableTask2();
    futureSet.add(executorService.submit(task2));
    boolean flag = false;
    for (Future<Void> future : futureSet ) {
        try {
            future.get();
        } catch (InterruptedException e) {
            System.out.println("Interrupted");
        } catch (ExecutionException e) {
            // the task threw; e.getCause() is the original exception
            System.out.println("Exception thrown from the thread");
            flag = true;
            break;
        }
    }
    if(flag) {
        // cancel(true) interrupts tasks that are still running
        for (Future<Void> future : futureSet) {
            future.cancel(true);
        }
    }
    executorService.shutdown(); // let the pool threads exit so the JVM can stop
}
I am using two classes here to demonstrate that this works. When one task throws an exception, the task that would otherwise run forever is stopped as well.
class CallableTask1 implements Callable<Void> {
    @Override
    public Void call() throws Exception {
        throw new NullPointerException("Sample NullPointerException");
    }
}

class CallableTask2 implements Callable<Void> {
    @Override
    public Void call() throws Exception {
        while (true){
            System.out.println("THIS IS RUNNING");
            Thread.sleep(5000);
        }
    }
}
This approach has its own limitations. Because future.get() is called on the futures one after another, a failure is only noticed when the loop reaches the failing task's future.
Best case: the first future.get() throws, and the other tasks are cancelled.
Worst case: the last future.get() throws, and by then all the other tasks have already finished.
Optimization: identify the tasks that can throw an exception and wait only on those before cancelling all the others.
If your run method contains a while loop, the best approach is to share a flag and break out of the loop when it is set.
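One way around the sequential get() limitation is a different technique from the answer above: ExecutorCompletionService hands back futures in completion order, so the first failure is seen as soon as it happens, wherever the task sits in the submission order. The following is a sketch under that assumption; FailFastSketch, runFailFast, and demo are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FailFastSketch {

    // Submits all tasks and consumes them in completion order. On the first failure,
    // cancels every other task and returns the cause; returns null if all succeed.
    static Throwable runFailFast(ExecutorService pool, List<Callable<Void>> tasks)
            throws InterruptedException {
        CompletionService<Void> completion = new ExecutorCompletionService<>(pool);
        List<Future<Void>> futures = new ArrayList<>();
        for (Callable<Void> task : tasks) {
            futures.add(completion.submit(task));
        }
        for (int i = 0; i < tasks.size(); i++) {
            try {
                completion.take().get();      // the next task to finish, whichever it is
            } catch (ExecutionException e) {
                for (Future<Void> future : futures) {
                    future.cancel(true);      // interrupts tasks that are still running
                }
                return e.getCause();
            }
        }
        return null;
    }

    static Throwable demo() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Callable<Void>> tasks = new ArrayList<>();
        tasks.add(() -> { Thread.sleep(60_000); return null; });          // long-running, submitted first
        tasks.add(() -> { throw new IllegalStateException("boom"); });   // fails immediately
        try {
            return runFailFast(pool, tasks);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("first failure: " + demo());
    }
}
```

Cancellation still relies on the tasks responding to interruption, for example by blocking in Thread.sleep or checking the interrupt flag in their loop.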

How to properly multi-thread a collection of independent tasks?

I'm using this code to divide up a few hundred tasks between different CPU cores.
final List<Throwable> errors = Collections.synchronizedList(Lists.<Throwable>newArrayList());
final ExecutorService pool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
for (...) {
    pool.execute(new Runnable() {
        @Override
        public void run() {
            try {
                // TASK HERE
            } catch (Throwable e) {
                errors.add(e);
            }
        }
    });
}
pool.shutdown();
try {
    pool.awaitTermination(1000, TimeUnit.DAYS); // wait "indefinitely"
} catch (InterruptedException e) {
    throw new RuntimeException(e);
}
if (!errors.isEmpty()) throw Exceptions.wrap(errors.get(0)); // TODO multi-exception
It works, but it's not nice.
There is no version of awaitTermination without timeout, which is what I want.
I need to do my own error collecting.
What is the proper/common way to do this?
The point of a thread pool is to reuse threads. You should create it on application startup, outside of your code that creates tasks, and inject it. There is no need to shut down the pool after adding tasks. You do that when your application is shutting down.
To run a collection of tasks, use ExecutorService.invokeAll. To get the results afterwards, call get on each of the returned Futures. It will rethrow any exception that the task threw, so you can collect it afterwards.
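A minimal sketch of that invokeAll approach, assuming the tasks can be expressed as Callable<Void>. The names InvokeAllErrorsSketch, runAll, and demoErrorCount are invented for this example. invokeAll blocks until every task is done, so no awaitTermination timeout is involved, and each failure surfaces as the cause of an ExecutionException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class InvokeAllErrorsSketch {

    // Runs every task on the given (long-lived) pool and returns whatever the tasks threw.
    static List<Throwable> runAll(ExecutorService pool, List<Callable<Void>> tasks)
            throws InterruptedException {
        List<Throwable> errors = new ArrayList<>();
        for (Future<Void> future : pool.invokeAll(tasks)) {   // waits for all tasks
            try {
                future.get();
            } catch (ExecutionException e) {
                errors.add(e.getCause());                     // the task's own exception
            }
        }
        return errors;
    }

    static int demoErrorCount() throws InterruptedException {
        // In an application this pool would be created once at startup and injected.
        ExecutorService pool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            final int n = i;
            tasks.add(() -> {
                if (n % 5 == 0) {
                    throw new IllegalArgumentException("task " + n + " failed");
                }
                return null;
            });
        }
        try {
            return runAll(pool, tasks).size();
        } finally {
            pool.shutdown();   // only because this demo owns the pool
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("errors collected: " + demoErrorCount());
    }
}
```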
You can use a future to do the error handling:
final List<Future<?>> futures = new ArrayList<>();
for (int i = 0; i < 5; i++) {
    futures.add(pool.submit(new Runnable() {
        @Override
        public void run() {
            // TASK HERE
        }
    }));
}
for (Future<?> f : futures) {
    try {
        f.get();
    } catch (ExecutionException e) {
        // something bad happened in your runnable; e.getCause() is the original exception
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the interrupt flag and stop waiting
        break;
    }
}
// when you are done with the executor
pool.shutdown();
try {
    pool.awaitTermination(1000, TimeUnit.DAYS); // wait "indefinitely"
} catch (InterruptedException e) {
    throw new RuntimeException(e);
}
I think you need to submit each Runnable, get a Future back, and then call get() on each Future.
When you call get(), it either returns normally (with null for a Runnable, since a Runnable produces no result) or throws an ExecutionException wrapping the exception the task encountered.
