Parallel for loop with specific number of threads - java

What is the best way to implement a parallel for loop with a specified number of threads?
Like this:
int maxThreads=5;
int currentThreads=0;
for(int i = 0; i < 10000; i++){
if(currentThreads<maxThreads){
start thread......
}else{
wait...
}
}

I would first create a ForkJoinPool with a fixed number of threads:
final ForkJoinPool forkJoinPool = new ForkJoinPool(numThreads);
Now simply execute a parallel stream operation in a task:
forkJoinPool.submit(() -> {
IntStream.range(0, 10_000)
.parallel()
.forEach(i -> {
//do stuff
});
});
Obviously this example just translates your code literally. I would recommend using the Stream API to its fullest rather than simply looping over [0, 10,000).
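For example, a minimal sketch of that (the per-index work is a placeholder, and it relies on the same behaviour as above, namely that the parallel stream runs inside the pool that executes the submitted task):
// needs java.util.List, java.util.concurrent.ForkJoinPool, java.util.stream.*
final ForkJoinPool forkJoinPool = new ForkJoinPool(numThreads);
List<Integer> results = forkJoinPool.submit(() ->
        IntStream.range(0, 10_000)
                 .parallel()
                 .map(i -> i * i)          // stand-in for the real per-element work
                 .boxed()
                 .collect(Collectors.toList()))
        .join();                           // blocks until the pooled task completes
This way the pipeline produces a result instead of relying on forEach side effects.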

Related

Java increase and decrease number of threads in interval

I want to create an API load test where the number of parallel users (threads) increases to the pool size and after a while decreases. Right now I have a test where all threads start at once.
// Thread pool - how many threads at once we will use as concurrent USERS
ExecutorService executor = Executors.newFixedThreadPool(Integer.parseInt(prop.getProperty("threadPool")));
// numberOfRequests - How many requests we want to send in total
int numberOfRequests = Integer.parseInt(prop.getProperty("numberOfRequests"));
CountDownLatch latch = new CountDownLatch(numberOfRequests);
List<PostCallableData> tasks = IntStream.range(0, numberOfRequests).mapToObj(i -> {
return new PostCallableData("Thread ", branch, 4, 5, latch);
}).collect(Collectors.toList());
List<Future<List<Integer>>> futures = executor.invokeAll(tasks);
latch.await();
executor.shutdown();
List<List<Integer>> results = futures.stream()
.map(future -> {
try {
return future.get();
} catch (Exception e) {
throw new RuntimeException(e);
}
})
.collect(Collectors.toList());
My goal is to start with one thread and add the next one after an interval (configurable by a variable) up to the maximum pool size, e.g. add one thread every 30 s until there are 30 threads.
Keep that level for e.g. 45 minutes (also a variable) or for a given number of requests, and then decrease the number of threads by one every 30 seconds.
List<PostCallableData> tasks = IntStream.range(0, numberOfRequests).mapToObj(i -> {
return new PostCallableData("Thread ", branch, 4, 5, latch);
}).collect(Collectors.toList());
List<Future<List<Integer>>> futures = executor.invokeAll(tasks);
Ideally the lines above will be replaced by some sort of random action - I want Post/Get/Delete actions to be run in this test.
What do I need to do to have it increase and decrease gradually?
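One possible direction (a rough, untested sketch, not an authoritative answer: it reuses numberOfRequests from the code above and leaves the actual Post/Get/Delete work as a placeholder) is to gate the workers with a Semaphore and let a scheduler add one permit per interval up to the maximum, then withdraw the permits again after the hold period:
// needs java.util.concurrent.*
int maxThreads = 30;          // target number of concurrent users
long stepSeconds = 30;        // add/remove one user every 30 s (variable)
long holdMinutes = 45;        // hold at maxThreads for 45 min (variable)

Semaphore permits = new Semaphore(0);
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

// Ramp up: one extra permit (one more concurrent user) per step.
for (int i = 0; i < maxThreads; i++) {
    scheduler.schedule(permits::release, (i + 1) * stepSeconds, TimeUnit.SECONDS);
}
// Ramp down: after the hold period, withdraw one permit per step.
// (acquireUninterruptibly may briefly block the scheduler thread until a running task releases.)
long rampUpSeconds = maxThreads * stepSeconds;
long holdSeconds = holdMinutes * 60;
for (int i = 0; i < maxThreads; i++) {
    scheduler.schedule(() -> permits.acquireUninterruptibly(),
            rampUpSeconds + holdSeconds + (i + 1) * stepSeconds, TimeUnit.SECONDS);
}

// Each request must hold a permit while it runs, so the number of requests
// executing in parallel follows the permit count over time.
ExecutorService executor = Executors.newFixedThreadPool(maxThreads);
for (int i = 0; i < numberOfRequests; i++) {
    executor.submit(() -> {
        permits.acquireUninterruptibly();
        try {
            // run one Post/Get/Delete action here (e.g. a PostCallableData call)
        } finally {
            permits.release();
        }
    });
}
// remember to shut down executor and scheduler when the test is over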

Rest parallel calls to service - Multithreading in java

I have a REST API call where the maximum number of results returned by the API is 1000, with start page=1:
{
    "status": "OK",
    "payload": {
        "EMPList": [],
        "count": 5665
    }
}
So to get the other results I have to change to start page=2 and hit the service again, and again I will get only 1000 results.
After the first call I want to make the remaining calls in parallel, collect the results, combine them, and send them back to the calling service in Java. Please suggest an approach; I am new to Java. I tried using Callable but it's not working.
It seems to me that ideally you should be able to configure the max count to something appropriate for your use case; I'm assuming you aren't able to do that. Here is a simple, lock-free multithreading scheme that acts as a basic reduction operation for your two network calls:
// online runnable: https://ideone.com/47KsoS
int resultSize = 5;
int[] result = new int[resultSize*2];
Thread pg1 = new Thread(){
public void run(){
System.out.println("Thread 1 Running...");
// write numbers 1-5 to indexes 0-4
for(int i = 0 ; i < resultSize; i ++) {
result[i] = i + 1;
}
System.out.println("Thread 1 Exiting...");
}
};
Thread pg2 = new Thread(){
public void run(){
System.out.println("Thread 2 Running");
// write numbers 6-10 to indexes 5-9
for(int i = 0 ; i < resultSize; i ++) {
result[i + resultSize] = i + 1 + resultSize;
}
System.out.println("Thread 2 Exiting...");
}
};
pg1.start();
pg2.start();
// ensure that pg1 and pg2 finish before reading the result
// (join() throws InterruptedException, which the enclosing method must handle or declare)
pg1.join();
pg2.join();
// print result of reduction operation
System.out.println(Arrays.toString(result));
There is one very important caveat with this implementation, however: the two threads do not overlap in their memory writes. That matters, because if you simply changed the int[] result to an ArrayList<Integer> and had both threads add to it, you could hit a race condition that corrupts the reduction (the standard ArrayList implementation in Java is not thread safe). Since we can guarantee in advance how large the result will be, I would stick with a plain array for this multithreaded implementation; an ArrayList hides a lot of implementation details that make concurrent writes harder to reason about.
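If the two "pages" are real REST calls rather than the stand-in loops above, the same disjoint-write idea carries over to an ExecutorService. This is only a sketch: fetchPage, Employee and firstPageResults are hypothetical placeholders for your actual client call, record type and first response.
// needs java.util.* and java.util.concurrent.*
int totalCount = 5665;                         // the "count" field from the first response
int pageSize = 1000;
int pages = (totalCount + pageSize - 1) / pageSize;

ExecutorService pool = Executors.newFixedThreadPool(pages - 1);
List<Future<List<Employee>>> futures = new ArrayList<>();
for (int page = 2; page <= pages; page++) {    // page 1 was already fetched
    final int p = page;
    futures.add(pool.submit(() -> fetchPage(p)));   // hypothetical REST call for one page
}

List<Employee> combined = new ArrayList<>(firstPageResults);   // records from page 1
for (Future<List<Employee>> f : futures) {
    try {
        combined.addAll(f.get());              // wait for that page and append its records
    } catch (InterruptedException | ExecutionException e) {
        throw new RuntimeException(e);         // simplistic handling for a sketch
    }
}
pool.shutdown();
Because each page's records are appended on a single thread only after get() returns, there is no shared-write race of the kind described above.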

Why is CompletableFuture join/get faster in separate streams than using one stream

For the following program I am trying to figure out why using 2 different streams parallelizes the tasks, while using the same stream and calling join/get on the CompletableFuture makes them take as long as if they were processed sequentially.
public class HelloConcurrency {
private static Integer sleepTask(int number) {
System.out.println(String.format("Task with sleep time %d", number));
try {
TimeUnit.SECONDS.sleep(number);
} catch (InterruptedException e) {
e.printStackTrace();
return -1;
}
return number;
}
public static void main(String[] args) {
List<Integer> sleepTimes = Arrays.asList(1,2,3,4,5,6);
System.out.println("WITH SEPARATE STREAMS FOR FUTURE AND JOIN");
ExecutorService executorService = Executors.newFixedThreadPool(6);
long start = System.currentTimeMillis();
List<CompletableFuture<Integer>> futures = sleepTimes.stream()
.map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService)
.exceptionally(ex -> { ex.printStackTrace(); return -1; }))
.collect(Collectors.toList());
executorService.shutdown();
List<Integer> result = futures.stream()
.map(CompletableFuture::join)
.collect(Collectors.toList());
long finish = System.currentTimeMillis();
long timeElapsed = (finish - start)/1000;
System.out.println(String.format("done in %d seconds.", timeElapsed));
System.out.println(result);
System.out.println("WITH SAME STREAM FOR FUTURE AND JOIN");
ExecutorService executorService2 = Executors.newFixedThreadPool(6);
start = System.currentTimeMillis();
List<Integer> results = sleepTimes.stream()
.map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService2)
.exceptionally(ex -> { ex.printStackTrace(); return -1; }))
.map(CompletableFuture::join)
.collect(Collectors.toList());
executorService2.shutdown();
finish = System.currentTimeMillis();
timeElapsed = (finish - start)/1000;
System.out.println(String.format("done in %d seconds.", timeElapsed));
System.out.println(results);
}
}
Output
WITH SEPARATE STREAMS FOR FUTURE AND JOIN
Task with sleep time 6
Task with sleep time 5
Task with sleep time 1
Task with sleep time 3
Task with sleep time 2
Task with sleep time 4
done in 6 seconds.
[1, 2, 3, 4, 5, 6]
WITH SAME STREAM FOR FUTURE AND JOIN
Task with sleep time 1
Task with sleep time 2
Task with sleep time 3
Task with sleep time 4
Task with sleep time 5
Task with sleep time 6
done in 21 seconds.
[1, 2, 3, 4, 5, 6]
The two approaches are quite different; let me try to explain it clearly.
1st approach: In the first approach you spin up the async requests for all 6 tasks and only then call join on each of them to get the results.
2nd approach: In the second approach you call join immediately after spinning up the async request for each task. For example, after spinning up the async thread for task 1, calling join waits for that task to complete, and only then is the second task started on an async thread.
Note: If you look at the output closely, in the 1st approach the task messages appear in random order, since all six tasks were executed asynchronously, whereas in the 2nd approach all tasks were executed sequentially, one after another.
I believe you have an idea of how the stream map operation is performed, or you can get more information from here or here:
To perform a computation, stream operations are composed into a stream pipeline. A stream pipeline consists of a source (which might be an array, a collection, a generator function, an I/O channel, etc), zero or more intermediate operations (which transform a stream into another stream, such as filter(Predicate)), and a terminal operation (which produces a result or side-effect, such as count() or forEach(Consumer)). Streams are lazy; computation on the source data is only performed when the terminal operation is initiated, and source elements are consumed only as needed.
The stream framework does not define the order in which map operations are executed on stream elements, because it is not intended for use cases in which that might be a relevant issue. As a result, the particular way your second version is executing is equivalent, essentially, to
List<Integer> results = new ArrayList<>();
for (Integer sleepTime : sleepTimes) {
    results.add(CompletableFuture
            .supplyAsync(() -> sleepTask(sleepTime), executorService2)
            .exceptionally(ex -> { ex.printStackTrace(); return -1; })
            .join());
}
...which is itself essentially equivalent to
List<Integer> results = new ArrayList<>();
for (Integer sleepTime : sleepTimes) {
    results.add(sleepTask(sleepTime));
}
@Deadpool answered it pretty well; just adding my answer, which may help someone understand it better.
I was able to get an answer by adding more printing to both methods.
TLDR
2 stream approach: We start all 6 tasks asynchronously and then call join on each of them in a separate stream to get the results.
1 stream approach: We call join immediately after starting each task. For example, after spinning up a thread for task 1, calling join makes the stream wait for task 1 to complete, and only then is the second task started on an async thread.
Note: Also, looking at the output, in the 1 stream approach the output appears in sequential order since all six tasks were executed in order, while in the 2 stream approach all tasks ran in parallel, hence the random order.
Note 2: If we replace stream() with parallelStream() in the 1 stream approach, it will work identically to the 2 stream approach.
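For reference, a sketch of that Note 2 variant (the degree of parallelism then depends on the common ForkJoinPool that backs parallel streams):
List<Integer> results = sleepTimes.parallelStream()   // was sleepTimes.stream()
        .map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService2)
                .exceptionally(ex -> { ex.printStackTrace(); return -1; }))
        .map(CompletableFuture::join)                 // the joins now happen on several worker threads
        .collect(Collectors.toList());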
More proof
I added more printing to the streams, which gave the following outputs and confirmed the note above:
1 stream:
List<Integer> results = sleepTimes.stream()
.map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService2)
.exceptionally(ex -> { ex.printStackTrace(); return -1; }))
.map(f -> {
int num = f.join();
System.out.println(String.format("doing join on task %d", num));
return num;
})
.collect(Collectors.toList());
WITH SAME STREAM FOR FUTURE AND JOIN
Task with sleep time 1
doing join on task 1
Task with sleep time 2
doing join on task 2
Task with sleep time 3
doing join on task 3
Task with sleep time 4
doing join on task 4
Task with sleep time 5
doing join on task 5
Task with sleep time 6
doing join on task 6
done in 21 seconds.
[1, 2, 3, 4, 5, 6]
2 streams:
List<CompletableFuture<Integer>> futures = sleepTimes.stream()
.map(sleepTime -> CompletableFuture.supplyAsync(() -> sleepTask(sleepTime), executorService)
.exceptionally(ex -> { ex.printStackTrace(); return -1; }))
.collect(Collectors.toList());
List<Integer> result = futures.stream()
.map(f -> {
int num = f.join();
System.out.println(String.format("doing join on task %d", num));
return num;
})
.collect(Collectors.toList());
WITH SEPARATE STREAMS FOR FUTURE AND JOIN
Task with sleep time 2
Task with sleep time 5
Task with sleep time 3
Task with sleep time 1
Task with sleep time 4
Task with sleep time 6
doing join on task 1
doing join on task 2
doing join on task 3
doing join on task 4
doing join on task 5
doing join on task 6
done in 6 seconds.
[1, 2, 3, 4, 5, 6]

exec multiThread for one process - java

I have a problem, like this :
I have an array with 50 elements and I'd like to run a calculation on each element. To make it faster, how can I divide the work across 5 threads, with each thread handling and calculating 10 elements, without duplicating work done by the other threads?
The number of threads should be a variable; it could be 5, 10, or any number.
I tried to use something like:
ExecutorService executor = Executors.newCachedThreadPool();
for(int i = 1; i <= 5; i++){ //mycalculate }
but all 5 threads just process the first 10 elements.
Can anyone help me, please?
(I hope you understand my question; my English is not good.)
Thanks
ExecutorService executor = Executors.newCachedThreadPool();
for (int i = 1; i <= 5; i++) {
    final int taskNo = i;
    executor.submit(new Runnable() {
        public void run() {
            // perform 'mycalculate' for task 'taskNo'
        }
    });
}
(That can be written more neatly using lambdas, but let's stick with the "classic Java" way for now.)
This does not deal with the issue of how to wait for the tasks to finish. For that, you could capture the Future objects that submit returns, and call get on each one.
It also doesn't deal with any synchronization that would be necessary if the tasks changed any shared objects.
I'd like to run a calculation on each element, to make it faster ...
If the 'mycalculate' task is lengthy, and the tasks don't interfere with each other, and you have multiple cores, this approach should give some speedup.
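For illustration, a rough sketch combining the points above: submit one chunk of elements per thread, then wait on the returned Futures (the per-element work is a placeholder):
// needs java.util.* and java.util.concurrent.*
int[] data = new int[50];                  // stand-in for your 50 elements
int numThreads = 5;                        // could be 5, 10, or any number
int chunkSize = (data.length + numThreads - 1) / numThreads;

ExecutorService executor = Executors.newFixedThreadPool(numThreads);
List<Future<?>> futures = new ArrayList<>();
for (int t = 0; t < numThreads; t++) {
    final int from = t * chunkSize;
    final int to = Math.min(from + chunkSize, data.length);
    futures.add(executor.submit(() -> {
        for (int i = from; i < to; i++) {
            // perform 'mycalculate' on data[i]; the index ranges are disjoint
        }
    }));
}
for (Future<?> f : futures) {
    try {
        f.get();                           // wait for that chunk to finish
    } catch (InterruptedException | ExecutionException e) {
        throw new RuntimeException(e);     // simplistic handling for a sketch
    }
}
executor.shutdown();
Because each thread gets its own contiguous index range, no element is processed twice and no synchronization on the data itself is needed.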
Try a parallel stream, like this.
SomeClass[] array = new SomeClass[50];
// fill array
Stream.of(array)
.parallel()
.forEach(e -> /* calculate */);

Why is my multi threaded sorting algorithm not faster than my single threaded mergesort

There are certain algorithms whose running time can decrease significantly when one divides up a task and does each part in parallel. One of these algorithms is merge sort, where a list is divided into successively smaller parts and then recombined in sorted order. I decided to run an experiment to test whether or not I could increase the speed of this sort by using multiple threads. I am running the following functions in Java on a quad-core Dell with Windows Vista.
One function (the control case) is simply recursive:
// x is an array of N elements in random order
public int[] mergeSort(int[] x) {
if (x.length == 1)
return x;
// Dividing the array in half
int[] a = new int[x.length/2];
int[] b = new int[x.length/2+((x.length%2 == 1)?1:0)];
for(int i = 0; i < x.length/2; i++)
a[i] = x[i];
for(int i = 0; i < x.length/2+((x.length%2 == 1)?1:0); i++)
b[i] = x[i+x.length/2];
// Sending them off to continue being divided
mergeSort(a);
mergeSort(b);
// Recombining the two arrays
int ia = 0, ib = 0, i = 0;
while(ia != a.length || ib != b.length) {
if (ia == a.length) {
x[i] = b[ib];
ib++;
}
else if (ib == b.length) {
x[i] = a[ia];
ia++;
}
else if (a[ia] < b[ib]) {
x[i] = a[ia];
ia++;
}
else {
x[i] = b[ib];
ib++;
}
i++;
}
return x;
}
The other is in the run() method of a class that extends Thread, and recursively creates two new threads each time it is called:
public class Merger extends Thread
{
int[] x;
boolean finished;
public Merger(int[] x)
{
this.x = x;
}
public void run()
{
if (x.length == 1) {
finished = true;
return;
}
// Divide the array in half
int[] a = new int[x.length/2];
int[] b = new int[x.length/2+((x.length%2 == 1)?1:0)];
for(int i = 0; i < x.length/2; i++)
a[i] = x[i];
for(int i = 0; i < x.length/2+((x.length%2 == 1)?1:0); i++)
b[i] = x[i+x.length/2];
// Begin two threads to continue to divide the array
Merger ma = new Merger(a);
ma.run();
Merger mb = new Merger(b);
mb.run();
// Wait for the two other threads to finish
while(!ma.finished || !mb.finished) ;
// Recombine the two arrays
int ia = 0, ib = 0, i = 0;
while(ia != a.length || ib != b.length) {
if (ia == a.length) {
x[i] = b[ib];
ib++;
}
else if (ib == b.length) {
x[i] = a[ia];
ia++;
}
else if (a[ia] < b[ib]) {
x[i] = a[ia];
ia++;
}
else {
x[i] = b[ib];
ib++;
}
i++;
}
finished = true;
}
}
It turns out that the function that does not use multithreading actually runs faster. Why? Do the operating system and the Java virtual machine not "communicate" effectively enough to place the different threads on different cores? Or am I missing something obvious?
The problem is not multithreading: I've written a correctly multithreaded quicksort in Java and it beats the default Java sort handily. I wrote it after watching a gigantic dataset being processed with only one core of a 16-core machine doing any work.
One of your issues (a huge one) is that you're busy-waiting:
// Wait for the two other threads to finish
while(!ma.finished || !mb.finished) ;
This is a huge no-no: it is called busy waiting and it destroys performance.
(Another issue is that your code is not actually spawning any new threads, as has already been pointed out to you: you call run() instead of start().)
You need to use another way to synchronize: an example would be to use a CountDownLatch.
Another thing: there's no need to spawn two new threads when you divide the workload: spawn only one new thread, and do the other half in the current thread.
Also, you probably don't want to create more threads than there are cores available.
See my question here (asking for a good Open Source multithreaded mergesort/quicksort/whatever). The one I'm using is proprietary, I can't paste it.
Multithreaded quicksort or mergesort
I haven't implemented merge sort, only quicksort, but I can tell you that there's no array copying going on.
What I do is this:
pick a pivot
exchange values as needed
have we reached the thread limit? (depending on the number of cores)
yes: sort first part in this thread
no: spawn a new thread
sort second part in current thread
wait for first part to finish if it's not done yet (using a CountDownLatch).
The code spawning a new thread and creating the CountDownLatch may look like this:
final CountDownLatch cdl = new CountDownLatch( 1 );
final Thread t = new Thread( new Runnable() {
public void run() {
quicksort(a, i+1, r );
cdl.countDown();
}
} };
The advantage of using synchronization facilities like the CountDownLatch is that they are very efficient and that you're not wasting time dealing with low-level Java synchronization idiosyncrasies.
In your case, the "split" may look like this (untested, it is just to give an idea):
if ( threads.getAndIncrement() < 4 ) {
final CountDownLatch innerLatch = new CountDownLatch( 1 );
final Thread t = new Merger( innerLatch, b );
t.start();
mergeSort( a );
while ( innerLatch.getCount() > 0 ) {
try {
innerLatch.await( 1000, TimeUnit.SECONDS );
} catch ( InterruptedException e ) {
// Up to you to decide what to do here
}
}
} else {
mergeSort( a );
mergeSort( b );
}
(don't forget to "countdown" the latch when each merge is done)
Where you'd replace the number of threads (up to 4 here) by the number of available cores. You may use the following (once, say to initialize a static variable at the beginning of your program: the number of cores is unlikely to change [unless you're on a machine allowing CPU hot-swapping, as some Sun systems do]):
Runtime.getRuntime().availableProcessors()
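For instance, a one-line sketch of that suggestion (the variable name is just an example):
// computed once at startup; use it as the ceiling instead of the literal 4 above
static final int MAX_SORT_THREADS = Runtime.getRuntime().availableProcessors();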
As others said, this code isn't going to work because it starts no new threads. You need to call the start() method instead of the run() method to create new threads. It also has concurrency errors: the checks on the finished variable are not thread safe.
Concurrent programming can be pretty difficult if you do not understand the basics. You might read the book Java Concurrency in Practice by Brian Goetz; it explains the basics and covers constructs (such as latches) that make building concurrent programs easier.
The overhead cost of synchronization may be comparatively large and prevent many optimizations.
Furthermore you are creating way too many threads.
The other is in the run() method of a class that extends Thread, and recursively creates two new threads each time it is called.
You would be better off with a fixed number of threads, say 4 on a quad core. This could be realized with a thread pool (tutorial), and the pattern would be a "bag of tasks". But perhaps it would be better still to initially divide the task into four equally large tasks and do single-threaded sorting on those; that would utilize the caches much better.
Instead of having a "busy-loop" waiting for the threads to finish (stealing cpu-cycles) you should have a look at Thread.join().
How many elements are in the array you have to sort? If there are too few elements, the cost of synchronization and CPU context switching will exceed the time you save by dividing the job for parallel execution.
