Parallel Processing of DTO in Java 7

Parallel Processing of DTO in Java 7 - java

I have requirement where need to process and map the DTOs with the values in for loop as below. Each of the mapping method here consumes nearly 10 minutes to complete its business logic and hence creating performance delay. I am working to refine the algorithms of business logic. However, please let me know if each of these mapping methods can be parallel processed to increase performance.
Since application is compatible only with Java 7 I cannot use streams of java 8.
for(Portfolio pf : portfolio) {
mapAddress(pf);
mapBusinessUnit(pf);
mapRelationShipDetails(pf)
--
--
--
}

You could split portfolios to different threads using either Runnable or Callable.
For example:
public class PortfolioService implements Callable<List<Portfolio>>
{
List<Portfolio> portfolios;
public PortfolioService(List<Portfolio> portfolios)
{
this.portfolios = portfolios;
}
public List<Portfolio> call()
{
for(Portfolio pf : portfolios) {
mapAddress(pf);
mapBusinessUnit(pf);
...
}
return portfolios;
}
}
However, this needs some modifications in your main class. I am using Callable here, since I don't know if you want to do something with all of these mapped Portfolios afterwards. However, if you want to let the threads do all of the work and don't need any return, use Runnable and modify the code.
1) You have to get your amount of cores:
int threads = Runtime.getRuntime().availableProcessors();
2) Now you split the workload per thread
// determine the average workload per thread
int blocksize = portfolios.size()/threads;
// doesn't always get all entries
int overlap = portfolios.size()%threads;
3) Start an ExecutorService, make a list of Future Elements, make reminder variable for old index of array slice
ExecutorService exs = Executors.newFixedThreadPool(threads);
List<Future<List<Portfoilio>>> futures = new ArrayList();
int oldIndex = 0;
4) Start threads
for(int i = 0; i<threads; i++)
{
int actualBlocksize = blocksize;
if(overlap != 0){
actualBlocksize++;
overlap--;
}
futures.add(exs.submit(new PortfolioService(portfolios.subList(oldIndex,actualBlocksize));
oldIndex = actualBlocksize;
}
5) Shutdown the ExecutorService and await it's termination
exs.shutdown();
try {exs.awaitTermination(6, TimeUnit.HOURS);}
catch (InterruptedException e) { }
6) do something with the future, if you want / have to.

Related

Execution of Tasks in ExecutorService without Thread pauses

I have a thread pool with 8 threads
private static final ExecutorService SERVICE = Executors.newFixedThreadPool(8);
My mechanism emulating the work of 100 user (100 Tasks):
List<Callable<Boolean>> callableTasks = new ArrayList<>();
for (int i = 0; i < 100; i++) { // Number of users == 100
callableTasks.add(new Task(client));
}
SERVICE.invokeAll(callableTasks);
SERVICE.shutdown();
The user performs the Task of generating a document.
Get UUID of Task;
Get Task status every 10 seconds;
If Task is ready get document.
public class Task implements Callable<Boolean> {
private final ReportClient client;
public Task(ReportClient client) {
this.client = client;
}
#Override
public Boolean call() {
final var uuid = client.createDocument(documentId);
GetStatusResponse status = null;
do {
try {
Thread.sleep(10000); // This stop current thread, but not a Task!!!!
} catch (InterruptedException e) {
return Boolean.FALSE;
}
status = client.getStatus(uuid);
} while (Status.PENDING.equals(status.status()));
final var document = client.getReport(uuid);
return Boolean.TRUE;
}
}
I want to give the idle time (10 seconds) to another task. But when the command Thread.sleep(10000); is called, the current thread suspends its execution. First 8 Tasks are suspended and 92 Tasks are pending 10 seconds. How can I do 100 Tasks in progress at the same time?

The Answer by Yevgeniy looks correct, regarding Java today. You want to have your cake and eat it too, in that you want a thread to sleep before repeating a task but you also want that thread to do other work. That is not possible today, but may be in the future.
Project Loom
In current Java, a Java thread is mapped directly to a host OS thread. In all common OSes such as macOS, BSD, Linux, Windows, and such, when code executing in a host thread blocks (stops to wait for sleep, or storage I/O, or network I/O, etc.) the thread too blocks. The blocked thread suspends, and the host OS generally runs another thread on that otherwise unused core. But the crucial point is that the suspended thread performs no further work until your blocking call to sleep returns.
This picture may change in the not-so-distant future. Project Loom seeks to add virtual threads to the concurrency facilities in Java.
In this new technology, many Java virtual threads are mapped to each host OS thread. Juggling the many Java virtual threads is managed by the JVM rather than by the OS. When the JVM detects a virtual thread’s executing code is blocking, that virtual thread is "parked", set aside by the JVM, with another virtual thread swapped out for execution on that "real" host OS thread. When the other thread returns from its blocking call, it can be reassigned to a "real" host OS thread for further execution. Under Project Loom, the host OS threads are kept busy, never idled while any pending virtual thread has work to do.
This swapping between virtual threads is highly efficient, so that thousands, even millions, of threads can be running at a time on conventional computer hardware.
Using virtual threads, your code will indeed work as you had hoped: A blocking call in Java will not block the host OS thread. But virtual threads are experimental, still in development, scheduled as a preview feature in Java 19. Early-access builds of Java 19 with Loom technology included are available now for you to try. But for production deployment today, you'll need to follow the advice in the Answer by Yevgeniy.
Take my coverage here with a grain of salt, as I am not an expert on concurrency. You can hear it from the actual experts, in the articles, interviews, and presentations by members of the Project Loom team including Ron Pressler and Alan Bateman.

EDIT: I just posted this answer and realized that you seem to be using that code to emulate real user interactions with some system. I would strongly recommend just using a load testing utility for that, rather than trying to come up with your own. However, in that case just using a CachedThreadPool might do the trick, although probably not a very robust or scalable solution.
Thread.sleep() behavior here is working as intended: it suspends the thread to let the CPU execute other threads.
Note that in this state a thread can be interrupted for a number of reasons unrelated to your code, and in that case your Task returns false: I'm assuming you actually have some retry logic down the line.
So you want two mutually exclusive things: on the one hand, if the document isn't ready, the thread should be free to do something else, but should somehow return and check that document's status again in 10 seconds.
That means you have to choose:
You definitely need that once-every-10-seconds check for each document - in that case, maybe use a cachedThreadPool and have it generate as many threads as necessary, just keep in mind that you'll carry the overhead for numerous threads doing virtually nothing.
Or, you can first initiate that asynchronous document creation process and then only check for status in your callables, retrying as needed.
Something like:
public class Task implements Callable<Boolean> {
private final ReportClient client;
private final UUID uuid;
// all args constructor omitted for brevity
#Override
public Boolean call() {
GetStatusResponse status = client.getStatus(uuid);
if (Status.PENDING.equals(status.status())) {
final var document = client.getReport(uuid);
return Boolean.TRUE;
} else {
return Boolean.FALSE; //retry next time
}
}
}
List<Callable<Boolean>> callableTasks = new ArrayList<>();
for (int i = 0; i < 100; i++) {
var uuid = client.createDocument(documentId); //not sure where documentId comes from here in your code
callableTasks.add(new Task(client, uuid));
}
List<Future<Boolean>> results = SERVICE.invokeAll(callableTasks);
// retry logic until all results come back as `true` here
This assumes that createDocument is relatively efficient, but that stage can be parallelized just as well, you just need to use a separate list of Runnable tasks and invoke them using the executor service.
Note that we also assume that the document's status will indeed eventually change to something other than PENDING, and that might very well not be the case. You might want to have a timeout for retries.

In your case, it seems like you need to check if a certain condition is met every x seconds. In fact, from your code the document generation seems asynchronous and what the Task keeps doing after that is just is waiting for the document generation to happen.
You could launch every document generation from your Thread-Main and use a ScheduledThreadPoolExecutor to verify every x seconds whether the document generation has been completed. At that point, you retrieve the result and cancel the corresponding Task's scheduling.
Basically, one ConcurrentHashMap is shared among the thread-main and the Tasks you've scheduled (mapRes), while the other, mapTask, is just used locally within the thread-main to keep track of the ScheduledFuture returned by every Task.
public class Main {
public static void main(String[] args) {
ScheduledThreadPoolExecutor pool = (ScheduledThreadPoolExecutor) Executors.newScheduledThreadPool(8);
//ConcurrentHashMap shared among the submitted tasks where each Task updates its corresponding outcome to true as soon as the document has been produced
ConcurrentHashMap<Integer, Boolean> mapRes = new ConcurrentHashMap<>();
for (int i = 0; i < 100; i++) {
mapRes.put(i, false);
}
String uuid;
ScheduledFuture<?> schedFut;
//HashMap containing the ScheduledFuture returned by scheduling each Task to cancel their repetition as soon as the document has been produced
Map<String, ScheduledFuture<?>> mapTask = new HashMap<>();
for (int i = 0; i < 100; i++) {
//Starting the document generation from the thread-main
uuid = client.createDocument(documentId);
//Scheduling each Task 10 seconds apart from one another and with an initial delay of i*10 to not start all of them at the same time
schedFut = pool.scheduleWithFixedDelay(new Task(client, uuid, mapRes), i * 10, 10000, TimeUnit.MILLISECONDS);
//Adding the ScheduledFuture to the map
mapTask.put(uuid, schedFut);
}
//Keep checking the outcome of each task until all of them have been canceled due to completion
while (!mapTasks.values().stream().allMatch(v -> v.isCancelled())) {
for (Integer key : mapTasks.keySet()) {
//Canceling the i-th task scheduling if:
// - Its result is positive (i.e. its verification is terminated)
// - The task hasn't been canceled already
if (mapRes.get(key) && !mapTasks.get(key).isCancelled()) {
schedFut = mapTasks.get(key);
schedFut.cancel(true);
}
}
//... eventually adding a sleep to check the completion every x seconds ...
}
pool.shutdown();
}
}
class Task implements Runnable {
private final ReportClient client;
private final String uuid;
private final ConcurrentHashMap mapRes;
public Task(ReportClient client, String uuid, ConcurrentHashMap mapRes) {
this.client = client;
this.uuid = uuid;
this.mapRes = mapRes;
}
#Override
public void run() {
//This is taken form your code and I'm assuming that if it's not pending then it's completed
if (!Status.PENDING.equals(client.getStatus(uuid).status())) {
mapRes.replace(uuid, true);
}
}
}
I've tested your case locally, by emulating a scenario where n Tasks wait for a folder with their same id to be created (or uuid in your case). I'll post it right here as a sample in case you'd like to try something simpler first.
public class Main {
public static void main(String[] args) {
ScheduledThreadPoolExecutor pool = (ScheduledThreadPoolExecutor) Executors.newScheduledThreadPool(2);
ConcurrentHashMap<Integer, Boolean> mapRes = new ConcurrentHashMap<>();
for (int i = 0; i < 16; i++) {
mapRes.put(i, false);
}
ScheduledFuture<?> schedFut;
Map<Integer, ScheduledFuture<?>> mapTasks = new HashMap<>();
for (int i = 0; i < 16; i++) {
schedFut = pool.scheduleWithFixedDelay(new MyTask(i, mapRes), i * 20, 3000, TimeUnit.MILLISECONDS);
mapTasks.put(i, schedFut);
}
while (!mapTasks.values().stream().allMatch(v -> v.isCancelled())) {
for (Integer key : mapTasks.keySet()) {
if (mapRes.get(key) && !mapTasks.get(key).isCancelled()) {
schedFut = mapTasks.get(key);
schedFut.cancel(true);
}
}
}
pool.shutdown();
}
}
class MyTask implements Runnable {
private int num;
private ConcurrentHashMap mapRes;
public MyTask(int num, ConcurrentHashMap mapRes) {
this.num = num;
this.mapRes = mapRes;
}
#Override
public void run() {
System.out.println("Task " + num + " is checking whether the folder exists: " + Files.exists(Path.of("./" + num)));
if (Files.exists(Path.of("./" + num))) {
mapRes.replace(num, true);
}
}
}

CompletableFuture: Await percentage complete

I am writing identical data in parallel to n nodes of a distributed system.
When n% of these nodes have been written to successfully, the remaining writes to the other nodes are unimportant as n% guarantees replication between the other nodes.
Java's CompletableFuture seems to have very close to what I want eg:
CompletableFuture.anyOf()
(Returns when the first future is complete) - avoids waiting unnecessarily, but returns too soon as I require n% completions
CompletableFuture.allOf()
(Returns when all futures complete) - avoids returning too soon but waits unnecessarily for 100% completion
I am looking for a way to return when a specific percentage of futures have completed.
For example if I supply 10 futures, return when 6 or 60% of these complete successfully.
For example, Bluebird JS has this feature with
Promise.some(promises, countThatNeedToComplete)
I was wondering if I could do something similar with TheadExecutor or vanilla CompletableFuture in java

I believe you can achieve what you want using only what's already provided by CompletableFuture, but you'll have to implement additional control to know how many future tasks were already completed and, when you reach the number/percentage that you need, cancel the remaining tasks.
Below is a class to illustrate the idea:
public class CompletableSome<T>
{
private List<CompletableFuture<Void>> tasks;
private int tasksCompleted = 0;
public CompletableSome(List<CompletableFuture<T>> tasks, int percentOfTasksThatMustComplete)
{
int minTasksThatMustComplete = tasks.size() * percentOfTasksThatMustComplete / 100;
System.out.println(
String.format("Need to complete at least %s%% of the %s tasks provided, which means %s tasks.",
percentOfTasksThatMustComplete, tasks.size(), minTasksThatMustComplete));
this.tasks = new ArrayList<>(tasks.size());
for (CompletableFuture<?> task : tasks)
{
this.tasks.add(task.thenAccept(a -> {
// thenAccept will be called right after the future task is completed. At this point we'll
// check if we reached the minimum number of nodes needed. If we did, then complete the
// remaining tasks since they are no longer needed.
tasksCompleted++;
if (tasksCompleted >= minTasksThatMustComplete)
{
tasks.forEach(t -> t.complete(null));
}
}));
}
}
public void execute()
{
CompletableFuture.allOf(tasks.toArray(new CompletableFuture<?>[0])).join();
}
}
You would use this class as in the example below:
public static void main(String[] args)
{
int numberOfNodes = 4;
// Create one future task for each node.
List<CompletableFuture<String>> nodes = new ArrayList<>();
for (int i = 1; i <= numberOfNodes; i++)
{
String nodeId = "result" + i;
nodes.add(CompletableFuture.supplyAsync(() -> {
try
{
// Sleep for some time to avoid all tasks to complete before the count is checked.
Thread.sleep(100 + new Random().nextInt(500));
}
catch (InterruptedException e)
{
e.printStackTrace();
}
// The action here is just to print the nodeId, you would make the actual call here.
System.out.println(nodeId + " completed.");
return nodeId;
}));
}
// Here we're saying that just 75% of the nodes must be called successfully.
CompletableSome<String> tasks = new CompletableSome<>(nodes, 75);
tasks.execute();
}
Please note that with this solution you could end up executing more tasks than the minimum required -- for instance, when two or more nodes respond almost simultaneously, you may reach the minimum required count when the first node responds, but there will be no time to cancel the other tasks. If that's an issue, then you'd have to implement even more controls.

Make expensive for loop multithreaded, Java

I have a problem which I would like to solve using Java's ExecutorService and Future classes. I am currently taking many samples from a function that is very expensive for me to compute (each sample can take several minutes) using a for loop. I have a class FunctionEvaluator that evaluates this function for me and this class is quite expensive to instantiate, since it contains a lot of internal memory, so I have made this class easily reusable with some internal counters and a reset() method. So my current situation looks like this:
int numSamples = 100;
int amountOfData = 1000000;
double[] data = new double[amountOfData];//Data comes from somewhere...
double[] results = new double[numSamples];
//a lot of memory contained inside the FunctionEvaluator class,
//expensive to intialise
FunctionEvaluator fe = new FunctionEvaluator();
for(int i=0; i<numSamples; i++) {
results[i] = fe.sampleAt(i, data);//very expensive computation
}
but I would like to get some multithreading going to speed things up. It should be easy enough, because while each sample will share whatever is inside of data, it is a read-only operation and each sample is independent of any other. Now I wouldn't be having any trouble with this since I've used Java's Future and ExecutorService before, but never in a context where the Callable had to be re-used. So in general, how would I go about setting this scenario up given that I can afford to run n instantiations of FunctionEvaluator? Something (very roughly) like this:
int numSamples = 100;
int amountOfData = 1000000;
int N = 10;
double[] data = new double[amountOfData];//Data comes from somewhere...
double[] results = new double[numSamples];
//a lot of memory contained inside the FunctionEvaluator class,
//expensive to intialise
FunctionEvaluator[] fe = new FunctionEvaluator[N];
for(int i=0; i<numSamples; i++) {
//Somehow add available FunctionEvaluators to an ExecutorService
//so that N FunctionEvaluators can run in parallel. When a
//FunctionEvaluator is finished, reset then compute a new sample
//until numSamples samples have been taken.
}
Any help would be greatly appreciated! Many thanks.
EDIT
So here is a toy example (which doesn't work :P). In this case the "expensive function" that I want to sample is just squaring an integer and the "expensive to instantiate class" that does it for me is called CallableComputation:
In TestConc.java:
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
public class TestConc {
public static void main(String[] args) {
SquareCalculator squareCalculator = new SquareCalculator();
int numFunctionEvaluators = 2;
int numSamples = 10;
ExecutorService executor = Executors.newFixedThreadPool(2);
CallableComputation c1 = new CallableComputation(2);
CallableComputation c2 = new CallableComputation(3);
CallableComputation[] callables = new CallableComputation[numFunctionEvaluators];
Future<Integer>[] futures = (new Future[numFunctionEvaluators]);
int[] results = new int[numSamples];
for(int i=0; i<numFunctionEvaluators; i++) {
callables[i] = new CallableComputation(i);
futures[i] = executor.submit(callables[i]);
}
futures[0] = executor.submit(c1);
futures[1] = executor.submit(c2);
for(int i=numFunctionEvaluators; i<numSamples; ) {
for(int j=0; j<futures.length; j++) {
if(futures[j].isDone()) {
try {
results[i] = futures[j].get();
}
catch (InterruptedException e) {
e.printStackTrace();
}
catch (ExecutionException e) {
e.printStackTrace();
}
callables[j].set(i);
System.out.printf("Function evaluator %d given %d\n", j, i+1);
executor.submit(callables[j]);
i++;
}
}
}
executor.shutdown();
try {
executor.awaitTermination(1, TimeUnit.MINUTES);
}
catch (InterruptedException e) {
e.printStackTrace();
}
for (int i=0; i<results.length; i++) {
System.out.printf("res%d=%d, ", i, results[i]);
}
System.out.println();
}
private static boolean areDone(Future<Integer>[] futures) {
for(int i=0; i<futures.length; i++) {
if(!futures[i].isDone()) {
return false;
}
}
return true;
}
private static void printFutures(Future<Integer>[] futures) {
for (int i=0; i<futures.length; i++) {
System.out.printf("f%d=%s | ", i, futures[i].isDone()?"done" : "not done");
}System.out.printf("\n");
}
}
In CallableComputation.java:
import java.util.concurrent.Callable;
public class CallableComputation implements Callable<Integer>{
int input = 0;
public CallableComputation(int input) {
this.input = input;
}
public void set(int i) {
input = i;
}
#Override
public Integer call() throws Exception {
System.out.printf("currval=%d\n", input);
Thread.sleep(500);
return input * input;
}
}

In Java8:
double[] result = IntStream.range(0, numSamples)
.parallel()
.mapToDouble(i->fe.sampleAt(i, data))
.toArray();
The question asks how to execute heavy computational functions in parallel by loading as many CPU as possible.
Exert from the Parallelism tutorial:
Parallel computing involves dividing a problem into subproblems,
solving those problems simultaneously (in parallel, with each
subproblem running in a separate thread), and then combining the
results of the solutions to the subproblems. Java SE provides the
fork/join framework, which enables you to more easily implement
parallel computing in your applications. However, with this framework,
you must specify how the problems are subdivided (partitioned). With
aggregate operations, the Java runtime performs this partitioning and
combining of solutions for you.
The actual solution includes:
IntStream.range will generate the stream of integers from 0 to numSamples.
parallel() will split the stream and execute it will all available CPU on the box.
mapToDouble() will convert the stream of integers to the stream of doubles by applying the lamba expression that will do actual work.
toArray() is a terminal operation that will aggregate the result and return it as an array.

There is no special code change required, you can use the same Callable again and again without any issue. Also, to improve efficiency, as you are saying, creating an instance of FunctionEvaluator is expensive, you can use only one instance and ensure that sampleAt is thread safe. One option is, maybe you can use all function local variables and don't modify any of the passing argument at any point of time while any of the thread is running
Please find a quick example below:
Code Snippet:
ExecutorService executor = Executors.newFixedThreadPool(2);
Callable<String> task1 = new Callable<String>(){public String call(){System.out.println(Thread.currentThread()+"currentThread");return null;}}
executor.submit(task1);
executor.submit(task1);
executor.shutdown();
Please find the screenshot below:

You can wrap each FunctionEvaluator's actual work as a Callable/Runnanle, then using a fixdThreadPool with a queue, then you just need to sumbit the target callable/runnable to the threadPool.

I would like to get some multithreading going to speed things up.
Sounds like a good idea but your code is massively over complex. #Pavel has a dead simple Java 8 solution but even without Java 8 you can make it a lot easier.
All you need to do is to submit the jobs into the executor and then call get() on each one of the Futures that are returned. A Callable class is not needed although it does make the code a lot cleaner. But you certainly don't need the arrays which are a bad pattern anyway because a typo can easily generate out-of-bounds exceptions. Stick to collections or Java 8 streams.
ExecutorService executor = Executors.newFixedThreadPool(2);
List<Future<Integer>> futureList = new ArrayList<Future<Integer>>();
for (int i = 0; i < numSamples; i++ ) {
// start the jobs running in the background
futureList.add(executor.subject(new CallableComputation(i));
}
// shutdown executor if done submitting tasks, submitted jobs will keep running
executor.shutdown();
for (Future<Integer> future : futureList) {
// this will wait for the future to finish, it also throws some exceptions
Integer result = future.get();
// add result to a collection or something here
}

Massive tasks alternative pattern for Runnable or Callable

For massive parallel computing I tend to use executors and callables. When I have thousand of objects to be computed I feel not so good to instantiate thousand of Runnables for each object.
So I have two approaches to solve this:
I. Split the workload into a small amount of x-workers giving y-objects each. (splitting the object list into x-partitions with y/x-size each)
public static <V> List<List<V>> partitions(List<V> list, int chunks) {
final ArrayList<List<V>> lists = new ArrayList<List<V>>();
final int size = Math.max(1, list.size() / chunks + 1);
final int listSize = list.size();
for (int i = 0; i <= chunks; i++) {
final List<V> vs = list.subList(Math.min(listSize, i * size), Math.min(listSize, i * size + size));
if(vs.size() == 0) break;
lists.add(vs);
}
return lists;
}
II. Creating x-workers which fetch objects from a queue.
Questions:
Is creating thousand of Runnables really expensive and to be avoided?
Is there a generic pattern/recommendation how to do it by solution II?
Are you aware of a different approach?

Creating thousands of Runnable (objects implementing Runnable) is not more expensive than creating a normal object.
Creating and running thousands of Threads can be very heavy, but you can use Executors with a pool of threads to solve this problem.

As for the different approach, you might be interested in java 8's parallel streams.

Combining various answers here :
Is creating thousand of Runnables really expensive and to be avoided?
No, it's not in and of itself. It's how you will make them execute that may prove costly (spawning a few thousand threads certainly has its cost).
So you would not want to do this :
List<Computation> computations = ...
List<Thread> threads = new ArrayList<>();
for (Computation computation : computations) {
Thread thread = new Thread(new Computation(computation));
threads.add(thread);
thread.start();
}
// If you need to wait for completion:
for (Thread t : threads) {
t.join();
}
Because it would 1) be unnecessarily costly in terms of OS ressource (native threads, each having a stack on the heap), 2) spam the OS scheduler with a vastly concurrent workload, most certainly leading to plenty of context switchs and associated cache invalidations at the CPU level 3) be a nightmare to catch and deal with exceptions (your threads should probably define an Uncaught exception handler, and you'd have to deal with it manually).
You'd probably prefer an approach where a finite Thread pool (of a few threads, "a few" being closely related to your number of CPU cores) handles many many Callables.
List<Computation> computations = ...
ExecutorService pool = Executors.newFixedSizeThreadPool(someNumber)
List<Future<Result>> results = new ArrayList<>();
for (Computation computation : computations) {
results.add(pool.submit(new ComputationCallable(computation));
}
for (Future<Result> result : results {
doSomething(result.get);
}
The fact that you reuse a limited number threads should yield a really nice improvement.
Is there a generic pattern/recommendation how to do it by solution II?
There are. First, your partition code (getting from a List to a List<List>) can be found inside collection tools such as Guava, with more generic and fail-proofed implementations.
But more than this, two patterns come to mind for what you are achieving :
Use the Fork/Join Pool with Fork/Join tasks (that is, spawn a task with your whole list of items, and each task will fork sub tasks with half of that list, up to the point where each task manages a small enough list of items). It's divide and conquer. See: http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ForkJoinTask.html
If your computation were to be "add integers from a list", it could look like (there might be a boundary bug in there, I did not really check) :
public static class Adder extends RecursiveTask<Integer> {
protected List<Integer> globalList;
protected int start;
protected int stop;
public Adder(List<Integer> globalList, int start, int stop) {
super();
this.globalList = globalList;
this.start = start;
this.stop = stop;
System.out.println("Creating for " + start + " => " + stop);
}
#Override
protected Integer compute() {
if (stop - start > 1000) {
// Too many arguments, we split the list
Adder subTask1 = new Adder(globalList, start, start + (stop-start)/2);
Adder subTask2 = new Adder(globalList, start + (stop-start)/2, stop);
subTask2.fork();
return subTask1.compute() + subTask2.join();
} else {
// Manageable size of arguments, we deal in place
int result = 0;
for(int i = start; i < stop; i++) {
result +=i;
}
return result;
}
}
}
public void doWork() throws Exception {
List<Integer> computation = new ArrayList<>();
for(int i = 0; i < 10000; i++) {
computation.add(i);
}
ForkJoinPool pool = new ForkJoinPool();
RecursiveTask<Integer> masterTask = new Adder(computation, 0, computation.size());
Future<Integer> future = pool.submit(masterTask);
System.out.println(future.get());
}
Use Java 8 parallel streams in order to launch multiple parallel computations easily (under the hood, Java parallel streams can fall back to the Fork/Join pool actually).
Others have shown how this might look like.
Are you aware of a different approach?
For a different take at concurrent programming (without explicit task / thread handling), have a look at the actor pattern. https://en.wikipedia.org/wiki/Actor_model
Akka comes to mind as a popular implementation of this pattern...

#Aaron is right, you should take a look into Java 8's parallel streams:
void processInParallel(List<V> list) {
list.parallelStream().forEach(item -> {
// do something
});
}
If you need to specify chunks, you could use a ForkJoinPool as described here:
void processInParallel(List<V> list, int chunks) {
ForkJoinPool forkJoinPool = new ForkJoinPool(chunks);
forkJoinPool.submit(() -> {
list.parallelStream().forEach(item -> {
// do something with each item
});
});
}
You could also have a functional interface as an argument:
void processInParallel(List<V> list, int chunks, Consumer<V> processor) {
ForkJoinPool forkJoinPool = new ForkJoinPool(chunks);
forkJoinPool.submit(() -> {
list.parallelStream().forEach(item -> processor.accept(item));
});
}
Or in shorthand notation:
void processInParallel(List<V> list, int chunks, Consumer<V> processor) {
new ForkJoinPool(chunks).submit(() -> list.parallelStream().forEach(processor::accept));
}
And then you would use it like:
processInParallel(myList, 2, item -> {
// do something with each item
});
Depending on your needs, the ForkJoinPool#submit() returns an instance of ForkJoinTask, which is a Future and you may use it to check for the status or wait for the end of your task.
You'd most probably want the ForkJoinPool instantiated only once (not instantiate it on every method call) and then reuse it to prevent CPU choking if the method is called multiple times.

Is creating thousand of Runnables really expensive and to be avoided?
Not at all, the runnable/callable interfaces have only one method to implement each, and the amount of "extra" code in each task depends on the code you are running. But certainly no fault of the Runnable/Callable interfaces.
Is there a generic pattern/recommendation how to do it by solution II?
Pattern 2 is more favorable than pattern 1. This is because pattern 1 assumes that each worker will finish at the exact same time. If some workers finish before other workers, they could just be sitting idle since they only are able to work on the y/x-size queues you assigned to each of them. In pattern 2 however, you will never have idle worker threads (unless the end of the work queue is reached and numWorkItems < numWorkers).
An easy way to use the preferred pattern, pattern 2, is to use the ExecutorService invokeAll(Collection<? extends Callable<T>> list) method.
Here is an example usage:
List<Callable<?>> workList = // a single list of all of your work
ExecutorService es = Executors.newCachedThreadPool();
es.invokeAll(workList);
Fairly readable and straightforward usage, and the ExecutorService implementation will automatically use solution 2 for you, so you know that each worker thread has their use time maximized.
Are you aware of a different approach?
Solution 1 and 2 are two common approaches for generic work. Now, there are many different implementation available for you choose from (such as java.util.Concurrent, Java 8 parallel streams, or Fork/Join pools), but the concept of each implementation is generally the same. The only exception is if you have specific tasks in mind with non-standard running behavior.

Java concurrency counter not properly clean up

This is a java concurrency question. 10 jobs need to be done, each of them will have 32 worker threads. Worker thread will increase a counter . Once the counter is 32, it means this job is done and then clean up counter map. From the console output, I expect that 10 "done" will be output, pool size is 0 and counterThread size is 0.
The issues are :
most of time, "pool size: 0 and countThreadMap size:3" will be
printed out. even those all threads are gone, but 3 jobs are not
finished yet.
some time, I can see nullpointerexception in line 27. I have used ConcurrentHashMap and AtomicLong, why still have concurrency
exception.
Thanks
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.atomic.AtomicLong;
public class Test {
final ConcurrentHashMap<Long, AtomicLong[]> countThreadMap = new ConcurrentHashMap<Long, AtomicLong[]>();
final ExecutorService cachedThreadPool = Executors.newCachedThreadPool();
final ThreadPoolExecutor tPoolExecutor = ((ThreadPoolExecutor) cachedThreadPool);
public void doJob(final Long batchIterationTime) {
for (int i = 0; i < 32; i++) {
Thread workerThread = new Thread(new Runnable() {
#Override
public void run() {
if (countThreadMap.get(batchIterationTime) == null) {
AtomicLong[] atomicThreadCountArr = new AtomicLong[2];
atomicThreadCountArr[0] = new AtomicLong(1);
atomicThreadCountArr[1] = new AtomicLong(System.currentTimeMillis()); //start up time
countThreadMap.put(batchIterationTime, atomicThreadCountArr);
} else {
AtomicLong[] atomicThreadCountArr = countThreadMap.get(batchIterationTime);
atomicThreadCountArr[0].getAndAdd(1);
countThreadMap.put(batchIterationTime, atomicThreadCountArr);
}
if (countThreadMap.get(batchIterationTime)[0].get() == 32) {
System.out.println("done");
countThreadMap.remove(batchIterationTime);
}
}
});
tPoolExecutor.execute(workerThread);
}
}
public void report(){
while(tPoolExecutor.getActiveCount() != 0){
//
}
System.out.println("pool size: "+ tPoolExecutor.getActiveCount() + " and countThreadMap size:"+countThreadMap.size());
}
public static void main(String[] args) throws Exception {
Test test = new Test();
for (int i = 0; i < 10; i++) {
Long batchIterationTime = System.currentTimeMillis();
test.doJob(batchIterationTime);
}
test.report();
System.out.println("All Jobs are done");
}
}

Let’s dig through all the mistakes of thread related programming, one man can make:
Thread workerThread = new Thread(new Runnable() {
…
tPoolExecutor.execute(workerThread);
You create a Thread but don’t start it but submit it to an executor. It’s a historical mistake of the Java API to let Thread implement Runnable for no good reason. Now, every developer should be aware, that there is no reason to treat a Thread as a Runnable. If you don’t want to start a thread manually, don’t create a Thread. Just create the Runnable and pass it to execute or submit.
I want to emphasize the latter as it returns a Future which gives you for free what you are attempting to implement: the information when a task has been finished. It’s even easier when using invokeAll which will submit a bunch of Callables and return when all are done. Since you didn’t tell us anything about your actual task, it’s not clear whether you can let your tasks simply implement Callable (may return null) instead of Runnable.
If you can’t use Callables or don’t want to wait immediately on submission, you have to remember the returned Futures and query them at a later time:
static final ExecutorService cachedThreadPool = Executors.newCachedThreadPool();
public static List<Future<?>> doJob(final Long batchIterationTime) {
final Random r=new Random();
List<Future<?>> list=new ArrayList<>(32);
for (int i = 0; i < 32; i++) {
Runnable job=new Runnable() {
public void run() {
// pretend to do something
LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(r.nextInt(10)));
}
};
list.add(cachedThreadPool.submit(job));
}
return list;
}
public static void main(String[] args) throws Exception {
Test test = new Test();
Map<Long,List<Future<?>>> map=new HashMap<>();
for (int i = 0; i < 10; i++) {
Long batchIterationTime = System.currentTimeMillis();
while(map.containsKey(batchIterationTime))
batchIterationTime++;
map.put(batchIterationTime,doJob(batchIterationTime));
}
// print some statistics, if you really need
int overAllDone=0, overallPending=0;
for(Map.Entry<Long,List<Future<?>>> e: map.entrySet()) {
int done=0, pending=0;
for(Future<?> f: e.getValue()) {
if(f.isDone()) done++;
else pending++;
}
System.out.println(e.getKey()+"\t"+done+" done, "+pending+" pending");
overAllDone+=done;
overallPending+=pending;
}
System.out.println("Total\t"+overAllDone+" done, "+overallPending+" pending");
// wait for the completion of all jobs
for(List<Future<?>> l: map.values())
for(Future<?> f: l)
f.get();
System.out.println("All Jobs are done");
}
But note that if you don’t need the ExecutorService for subsequent tasks, it’s much easier to wait for all jobs to complete:
cachedThreadPool.shutdown();
cachedThreadPool.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
System.out.println("All Jobs are done");
But regardless of how unnecessary the manual tracking of the job status is, let’s delve into your attempt, so you may avoid the mistakes in the future:
if (countThreadMap.get(batchIterationTime) == null) {
The ConcurrentMap is thread safe, but this does not turn your concurrent code into sequential one (that would render multi-threading useless). The above line might be processed by up to all 32 threads at the same time, all finding that the key does not exist yet so possibly more than one thread will then be going to put the initial value into the map.
AtomicLong[] atomicThreadCountArr = new AtomicLong[2];
atomicThreadCountArr[0] = new AtomicLong(1);
atomicThreadCountArr[1] = new AtomicLong(System.currentTimeMillis());
countThreadMap.put(batchIterationTime, atomicThreadCountArr);
That’s why this is called the “check-then-act” anti-pattern. If more than one thread is going to process that code, they all will put their new value, being confident that this was the right thing as they have checked the initial condition before acting but for all but one thread the condition has changed when acting and they are overwriting the value of a previous put operation.
} else {
AtomicLong[] atomicThreadCountArr = countThreadMap.get(batchIterationTime);
atomicThreadCountArr[0].getAndAdd(1);
countThreadMap.put(batchIterationTime, atomicThreadCountArr);
Since you are modifying the AtomicInteger which is already stored into the map, the put operation is useless, it will put the very array that it retrieved before. If there wasn’t the mistake that there can be multiple initial values as described above, the put operation had no effect.
}
if (countThreadMap.get(batchIterationTime)[0].get() == 32) {
Again, the use of a ConcurrentMap doesn’t turn the multi-threaded code into sequential code. While it is clear that the only last thread will update the atomic integer to 32 (when the initial race condition doesn’t materialize), it is not guaranteed that all other threads have already passed this if statement. Therefore more than one, up to all threads can still be at this point of execution and see the value of 32. Or…
System.out.println("done");
countThreadMap.remove(batchIterationTime);
One of the threads which have seen the 32 value might execute this remove operation. At this point, there might be still threads not having executed the above if statement, now not seeing the value 32 but producing a NullPointerException as the array supposed to contain the AtomicInteger is not in the map anymore. This is what happens, occasionally…

After creating your 10 jobs, your main thread is still running - it doesn't wait for your jobs to complete before it calls report on the test. You try to overcome this with the while loop, but tPoolExecutor.getActiveCount() is potentially coming out as 0 before the workerThread is executed, and then the countThreadMap.size() is happening after the threads were added to your HashMap.
There are a number of ways to fix this - but I will let another answer-er do that because I have to leave at the moment.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.