Multi thread array sort Java - java

Hello!
I'm trying to sort an int array using several threads. This is what I have so far:
import java.util.Arrays;
public class MultiTread extends Thread{
int sizeOfArray;
public static int treadsN = 4;
public static int sortFrom;
public static int sortTo;
public MultiTread(int sizeOfArray, int treadsN) {
this.sizeOfArray = sizeOfArray;
this.treadsN = treadsN;
}
public MultiTread(int[] arrayToSort, int sizeOfArray) {
this.sizeOfArray = sizeOfArray;
}
public static int [] creatingArray(int sizeOfArray) {
int [] arrayToSort = new int[sizeOfArray];
int arrayLength = arrayToSort.length;
for (int counter = 0; counter<arrayLength; counter++){
arrayToSort[counter] = (int)(8000000*Math.random());
}
return arrayToSort;
}
public static int [] sortInSeveralTreads(final int [] arrayToSort){
int [] newArr = new int[arrayToSort.length];
int numberOfThreads = treadsN;
if (numberOfThreads == 0){
System.out.println("Incorrect value");
return arrayToSort;
}
if (numberOfThreads == 1){
Arrays.sort(arrayToSort);
System.out.println("Array sorted in 1 thread");
} else {
final int lengthOfSmallArray = arrayToSort.length/numberOfThreads;
sortFrom = 0;
sortTo = lengthOfSmallArray;
for (int progress = 0; progress < numberOfThreads; progress++){
final int [] tempArr = Arrays.copyOfRange(arrayToSort,sortFrom,sortTo);
new Thread(){public void run() {
Arrays.sort(tempArr);
}}.start();
sortFrom = sortTo;
sortTo += lengthOfSmallArray;
newArr = mergeSort(newArr, tempArr);
}
new Thread(){public void run() {
Arrays.sort(Arrays.copyOfRange(arrayToSort, arrayToSort.length-lengthOfSmallArray, arrayToSort.length));
}}.start();
newArr = mergeSort(newArr, arrayToSort);
}
return newArr;
}
public static int [] mergeSort(int [] arrayFirst, int [] arraySecond){
int [] outputArray = new int[arrayFirst.length+arraySecond.length];
while (arrayFirst.length != 0 && arraySecond.length != 0){
int counter = 0;
if (arrayFirst[0] < arraySecond[0]){
outputArray[counter] = arrayFirst[0];
counter++;
arrayFirst = Arrays.copyOfRange(arrayFirst, 1, arrayFirst.length);
}else {
outputArray[counter] = arraySecond[0];
counter++;
arraySecond = Arrays.copyOfRange(arraySecond, 1, arraySecond.length);
}
}
return outputArray;
}
public static void main(String[] args){
long startTime;
long endTime;
int [] a = creatingArray(8000000);
startTime = System.currentTimeMillis();
a = sortInSeveralTreads(a);
endTime = System.currentTimeMillis();
System.out.println(Thread.activeCount());
System.out.printf("Sorted by: %d treads in %.7f seconds %n", treadsN, (float)(endTime-startTime)*1e-6);
try {
Thread.currentThread().sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println(Thread.activeCount());
}
}
I know this is not a very good implementation. Everything works except the merge step, which crashes with the following error:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:3137)
at MultiTread.mergeSort(MultiTread.java:78)
at MultiTread.sortInSeveralTreads(MultiTread.java:61)
at MultiTread.main(MultiTread.java:94)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
What am I doing wrong?

Well, you're running out of the memory the JVM is allowed to use.
I've not checked to see that your algorithm is correct, but you should try the code with a much smaller array to see if it works alright, say, 1000.
You make several copies of the array (or at least partial copies) throughout your program, in threads. Each thread then is allocating a large amount of data. For this reason, you may wish to reconsider your design, or you will continue to run into this problem. If you find no way to reduce this allocation and my next paragraph does not help you, then you may need to resort to the use of files to sort these large arrays, instead of attempting to hold everything in memory at once.
You can increase the heap size by following instructions on this page (first link I found, but it has the right information):
http://viralpatel.net/blogs/2009/01/jvm-java-increase-heap-size-setting-heap-size-jvm-heap.html
This will allow your Java program to allocate more memory.
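To see what limit a particular run actually gets, the program can ask the JVM directly. The following is only a minimal sketch; the class name HeapInfo is made up for illustration. The maximum is normally raised with the standard -Xmx launcher option, for example java -Xmx1g MultiTread.

```java
// Hypothetical helper, not part of the original program.
class HeapInfo {
    // Largest heap the JVM will attempt to use, in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMb() + " MB");
    }
}
```

Running it once with and once without -Xmx shows whether the setting took effect.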

Okay there are several issues:
First, your array size (8000000) is way too large, which is why you are running out of memory. Try a small number first, say 1000. Even then your code will probably fail elsewhere as it stands.
Second, your code makes very little sense as it stands: you are mixing static and non-static calls. For example, Thread.currentThread().sleep(1000) is bizarre; sleep is a static method that always pauses the calling thread, so write Thread.sleep(1000) instead.
Third, look at the thread creation where you call Arrays.sort: the threads are started but never joined, so the merge can run before any slice has been sorted.
I suggest that you first create a simple multi-threaded program before dealing with the sorting job. I also recommend that you implement the Runnable interface rather than extending the Thread class to create your worker class.
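As a rough illustration of that advice, here is a small worker that implements Runnable and sorts one slice of a shared array in place. It is a sketch under assumptions: the names SliceSorter and sortHalves are invented here, and it only sorts the two halves; merging them is a separate step.

```java
import java.util.Arrays;

// Hypothetical worker: sorts the half-open slice [from, to) of a shared array in place.
class SliceSorter implements Runnable {
    private final int[] data;
    private final int from;
    private final int to;

    SliceSorter(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    public void run() {
        Arrays.sort(data, from, to);
    }

    // Sorts the two halves of data on separate threads and waits for both.
    static void sortHalves(int[] data) throws InterruptedException {
        int mid = data.length / 2;
        Thread left = new Thread(new SliceSorter(data, 0, mid));
        Thread right = new Thread(new SliceSorter(data, mid, data.length));
        left.start();
        right.start();
        left.join();   // join() also makes the workers' writes visible to this thread
        right.join();
    }

    public static void main(String[] args) throws InterruptedException {
        int[] data = {5, 3, 9, 1, 8, 2};
        sortHalves(data);
        System.out.println(Arrays.toString(data)); // prints [3, 5, 9, 1, 2, 8]
    }
}
```

Because the workers sort disjoint ranges of the same array, no copies are made and no locking is needed; the join() calls are what guarantee the sorting has finished before the result is used.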

while (arrayFirst.length != 0 && arraySecond.length != 0){
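This loop is where the heap runs out. Every pass calls Arrays.copyOfRange to drop a single element, which allocates a fresh copy of the whole remaining array, so merging millions of elements creates millions of large temporary arrays and the loop effectively never finishes.
It also resets counter to 0 on every pass and never copies the elements left over once one array is empty, so track positions with index variables instead of re-copying the arrays.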
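For comparison, a merge of two sorted arrays can walk both inputs with index variables, so the loop is guaranteed to end and only one output array is allocated. This is a sketch of the standard two-pointer merge, not the asker's code; the class name MergeSketch is made up.

```java
import java.util.Arrays;

class MergeSketch {
    // Merges two already-sorted arrays into one new sorted array.
    static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        // Take the smaller head element until one input is used up.
        while (i < a.length && j < b.length) {
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        // Copy whatever is left over from either input.
        while (i < a.length) out[k++] = a[i++];
        while (j < b.length) out[k++] = b[j++];
        return out;
    }

    public static void main(String[] args) {
        int[] merged = merge(new int[]{1, 4, 9}, new int[]{2, 3, 10});
        System.out.println(Arrays.toString(merged)); // prints [1, 2, 3, 4, 9, 10]
    }
}
```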

Related

Correcting and Condensing Java Program

I think I've almost figured out my java program. It is designed to read a text file and find the largest integer by using 10 different threads. I'm getting this error though:
Error:(1, 8) java: class Worker is public, should be declared in a file named Worker.java
I feel my code may be more complex than it needs to be, so I'm trying to figure out how to shrink it down in size while also fixing the error above. Any assistance in this matter would be greatly appreciated, and please let me know if I can clarify anything. Also, does the "worker" class have to be a separate file? I added it to the same file but I'm getting the error above.
import java.io.BufferedReader;
import java.io.FileReader;
public class datafile {
public static void main(String[] args) {
int[] array = new int[100000];
int count;
int index = 0;
String datafile = "dataset529.txt"; //string which contains datafile
String line; //current line of text file
try (BufferedReader br = new BufferedReader(new FileReader(datafile))) { //reads in the datafile
while ((line = br.readLine()) != null) { //reads through each line
array[index++] = Integer.parseInt(line); //pulls out the number of each line and puts it in numbers[]
}
}
Thread[] threads = new Thread[10];
worker[] workers = new worker[10];
int range = array.length / 10;
for (count = 0; count < 10; count++) {
int startAt = count * range;
int endAt = startAt + range;
workers[count] = new worker(startAt, endAt, array);
}
for (count = 0; count < 10; count++) {
threads[count] = new Thread(workers[count]);
threads[count].start();
}
boolean isProcessing = false;
do {
isProcessing = false;
for (Thread t : threads) {
if (t.isAlive()) {
isProcessing = true;
break;
}
}
} while (isProcessing);
for (worker worker : workers) {
System.out.println("Max = " + worker.getMax());
}
}
}
public class worker implements Runnable {
private int startAt;
private int endAt;
private int randomNumbers[];
int max = Integer.MIN_VALUE;
public worker(int startAt, int endAt, int[] randomNumbers) {
this.startAt = startAt;
this.endAt = endAt;
this.randomNumbers = randomNumbers;
}
@Override
public void run() {
for (int index = startAt; index < endAt; index++) {
if (randomNumbers != null && randomNumbers[index] > max)
max = randomNumbers[index];
}
}
public int getMax() {
return max;
}
}
I've written a few comments but I'm going to gather them all in an answer so anyone in future can see the aggregate info:
At the end of your source for the readtextfile class (which should be ReadTextFile per Java naming conventions) you have too many closing braces:
} while (isProcessing);
for (Worker worker : workers) {
System.out.println("Max = " + worker.getMax());
}
}
}
}
}
The above should end on the first brace that hits the leftmost column. This is a good rule of thumb when making any Java class, if you have more than one far-left brace or your last brace isn't far-left you've probably made a mistake somewhere and should go through checking your braces.
As for your file issues: all your classes should be named following Java conventions, and each public class should be stored in a file called ClassName.java (case sensitive). E.g.:
public class ReadTextFile should be stored in ReadTextFile.java
You can also have Worker be an inner class. To do this you could pretty much just copy the source code into the ReadTextFile class (make sure it's outside of the main method). See this tutorial on inner classes for a quick overview.
As for the rest of your question, Code Review SE is the proper place to ask that, and the smart folks over there will probably provide better answers than I could. However, I'd also suggest that using 10 threads is probably not the most efficient way to find the largest int in a text file (in both development and execution time).
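To make the nested-class suggestion concrete, here is a condensed sketch. It rests on a few assumptions: the names MaxFinder and findMax are invented, the worker is declared as a static nested class so it can be created from static code, and join() replaces the isAlive() polling loop from the question.

```java
class MaxFinder {
    // Nested class: it lives in the same file, so no separate Worker.java is needed.
    static class Worker implements Runnable {
        private final int[] numbers;
        private final int from;
        private final int to;
        private int max = Integer.MIN_VALUE;

        Worker(int[] numbers, int from, int to) {
            this.numbers = numbers;
            this.from = from;
            this.to = to;
        }

        @Override
        public void run() {
            for (int i = from; i < to; i++) {
                if (numbers[i] > max) max = numbers[i];
            }
        }

        int getMax() {
            return max;
        }
    }

    static int findMax(int[] numbers, int threadCount) throws InterruptedException {
        Worker[] workers = new Worker[threadCount];
        Thread[] threads = new Thread[threadCount];
        int range = numbers.length / threadCount;
        for (int t = 0; t < threadCount; t++) {
            int from = t * range;
            // The last worker also takes the remainder of the array.
            int to = (t == threadCount - 1) ? numbers.length : from + range;
            workers[t] = new Worker(numbers, from, to);
            threads[t] = new Thread(workers[t]);
            threads[t].start();
        }
        int max = Integer.MIN_VALUE;
        for (int t = 0; t < threadCount; t++) {
            threads[t].join(); // wait for the worker instead of polling isAlive()
            max = Math.max(max, workers[t].getMax());
        }
        return max;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(findMax(new int[]{7, 42, -3, 19, 8}, 2)); // prints 42
    }
}
```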

JVM seems to stop context switching very quickly

I'm implementing the naive version of the Producer-Consumer concurrency problem. The threads are switched between very quickly at first, but then stop around i = 50. Adding additional print statements for some reason allows the JVM to context switch the threads and complete the program.
Why doesn't the JVM context switch the threads so that the program will complete?
// Producer - Consumer problem
// Producer constantly puts items into array, while consumer takes them out
class IntBuffer {
private int[] buffer;
private int index;
public IntBuffer(int size) {
buffer = new int[size];
index = 0;
}
public void add(int item) {
while (true) {
if (index < buffer.length) {
buffer[index] = item;
index++;
return;
}
}
}
public int remove() {
while (true) {
if (index > 0) {
index--;
int tmp = buffer[index];
buffer[index] = 0;
return tmp;
}
}
}
public void printState() {
System.out.println("Index " + index);
System.out.println("State " + this);
}
public String toString() {
String res = "";
for (int i = 0; i < buffer.length; i++) {
res += buffer[i] + " ";
}
return res;
}
}
class Producer extends Thread {
private IntBuffer buffer;
public Producer(IntBuffer buffer) {
this.buffer = buffer;
}
public void run() {
for (int i = 0; i < 1000; i++) {
System.out.println("added " + i);
buffer.add(i);
}
}
}
class Consumer extends Thread {
private IntBuffer buffer;
public Consumer(IntBuffer buffer) {
this.buffer = buffer;
}
public void run() {
for (int i = 0; i < 1000; i++) {
System.out.println("removed " + i);
buffer.remove();
}
}
}
public class Main {
public static void main(String[] args) {
IntBuffer buf = new IntBuffer(10);
Thread t1 = new Thread(new Producer(buf));
Thread t2 = new Thread(new Consumer(buf));
t1.start();
t2.start();
System.out.println(buf);
}
}
Your question does not provide enough detail to answer with confidence (at least, it is not clear where those additional print statements go), so I'll make some (reasonable) guesses here.
1. Your code is not correct. IntBuffer is not thread-safe, but it is accessed from multiple threads.
2. No operation on the IntBuffer establishes a happens-before relationship, so changes made by one thread may not be visible to another thread. That's why the Producer thread can "believe" that the buffer is full while the Consumer thread "believes" that it is empty. In that case the program never terminates.
These two statements are not guesses; they are facts based on the Java memory model. And here is my guess as to why the additional print statements sort of fix it:
In many JVM implementations, the println methods use synchronization internally. That's why a call to one acts as a memory fence and makes changes made in one thread visible to the other, eliminating the issue described in 2).
However, if you really want to solve this problem, you should make the IntBuffer thread-safe.
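One common way to do that is with intrinsic locking and wait/notifyAll. The following is a sketch under assumptions, not a drop-in fix: the class name SafeIntBuffer and the transfer demo are invented here, and the buffer keeps the last-in, first-out behaviour of the original.

```java
class SafeIntBuffer {
    private final int[] buffer;
    private int index = 0;

    SafeIntBuffer(int size) {
        buffer = new int[size];
    }

    // Blocks while the buffer is full instead of spinning.
    synchronized void add(int item) throws InterruptedException {
        while (index == buffer.length) wait();
        buffer[index++] = item;
        notifyAll(); // wake any consumer waiting for an item
    }

    // Blocks while the buffer is empty instead of spinning.
    synchronized int remove() throws InterruptedException {
        while (index == 0) wait();
        int item = buffer[--index];
        notifyAll(); // wake any producer waiting for free space
        return item;
    }

    // Demo: one producer adds 0..count-1, one consumer sums what it removes.
    static long transfer(int count) throws InterruptedException {
        SafeIntBuffer buf = new SafeIntBuffer(10);
        long[] sum = new long[1];
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) buf.add(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) sum[0] += buf.remove();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        return sum[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(transfer(1000)); // prints 499500
    }
}
```

The synchronized methods supply both the mutual exclusion and the happens-before edges the original lacks, and wait() releases the lock while a thread is blocked, so the other thread can always make progress.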
At the minimum you need the volatile keyword on both the buffer and index. Second, you need to access index only once under the true arm of the ifs you have there. Even after that, you will face out-of-bounds access at 10; you will need more fixes to work around that. Your buffer is de facto a stack. So, even after all of this, your remove() can be working with a stale index, and you will be removing from the middle of the stack. You could use 0 as a special value marking a slot as already handled and empty.
With all of this, I do not think your code is easily salvageable. It pretty much needs a complete rewrite using proper facilities. I agree with @kraskevich:
@StuartHa Naive usually means a simple (and most likely inefficient) solution, not an incorrect one.

Is my code in a state of deadlock?

On running my code below it seems to end up in a deadlock, and I don't know how to fix it. I am attempting to write a pipeline as a sequence of threads linked together by a buffer, where each thread can read from the preceding node in the pipeline and consequently write to the next one. The overall goal is to split a randomly generated ArrayList of data over 10 threads and sort it.
class Buffer{
// x is the current node
private int x;
private boolean item;
private Lock lock = new ReentrantLock();
private Condition full = lock.newCondition();
private Condition empty = lock.newCondition();
public Buffer(){item = false;}
public int read(){
lock.lock();
try{
while(!item)
try{full.await();}
catch(InterruptedException e){}
item = false;
empty.signal();
return x;
}finally{lock.unlock();}
}
public void write(int k){
lock.lock();
try{
while(item)
try{empty.await();}
catch(InterruptedException e){}
x = k; item = true;
full.signal();
}finally{lock.unlock();}
}
}
class Pipeline extends Thread {
private Buffer b;
//private Sorted s;
private ArrayList<Integer> pipe; // array pipeline
private int ub; // upper bounds
private int lb; // lower bounds
public Pipeline(Buffer bf, ArrayList<Integer> p, int u, int l) {
pipe = p;ub = u;lb = l;b = bf;//s = ss;
}
public void run() {
while(lb < ub) {
if(b.read() > pipe.get(lb+1)) {
b.write(pipe.get(lb+1));
}
lb++;
}
if(lb == ub) {
// store sorted array segment
Collections.sort(pipe);
new Sorted(pipe, this.lb, this.ub);
}
}
}
class Sorted {
private volatile ArrayList<Integer> shared;
private int ub;
private int lb;
public Sorted(ArrayList<Integer> s, int u, int l) {
ub = u;lb = l;shared = s;
// merge data to array from given bounds
}
}
class Test1 {
public static void main(String[] args) {
int N = 1000000;
ArrayList<Integer> list = new ArrayList<Integer>();
for(int i=0;i<N;i++) {
int k = (int)(Math.random()*N);
list.add(k);
}
// write to buffer
Buffer b = new Buffer();
b.write(list.get(0));
//Sorted s = new Sorted();
int maxBuffer = 10;
int index[] = new int[maxBuffer+1];
Thread workers[] = new Pipeline[maxBuffer];
// Distribute data evenly over threads
for(int i=0;i<maxBuffer;i++)
index[i] = (i*N) / maxBuffer;
for(int i=0;i<maxBuffer;i++) {
// create instacen of pipeline
workers[i] = new Pipeline(b,list,index[i],index[i+1]);
workers[i].start();
}
// join threads
try {
for(int i=0;i<maxBuffer;i++) {
workers[i].join();
}
} catch(InterruptedException e) {}
boolean sorted = true;
System.out.println();
for(int i=0;i<list.size()-1;i++) {
if(list.get(i) > list.get(i+1)) {
sorted = false;
}
}
System.out.println(sorted);
}
}
When the run methods start, all threads block until the first thread hits full.await(). Then, one after the other, all threads end up at full.await(), waiting for that signal.
However, the only place where full.signal() is called is in write(), and a write only happens after one of the reads has returned.
As that code is never reached (because the signal is never fired), you end up with all threads waiting.
In short, the writes are only triggered after a read finishes.
If you reverse the logic, so that you start empty, write to the buffer (with signal, etc.) and only then let the threads try to read, I expect it will work.
Generally speaking, you want to write to a pipeline before reading from it (or there's nothing to read).
I hope I'm not misreading your code, but that's what I see on a first scan.
Your Buffer class is flipping between read and write mode. Each read must be followed by a write, that by a read, and so on.
You write to the buffer initially in your main method.
Now one of your threads reaches if(b.read() > pipe.get(lb+1)) { in Pipeline#run. If that condition evaluates to false, then nothing gets written. And since every other thread must still be at the very same if(b.read(), you end up with all threads stuck reading, unable to progress. You will either have to write in the else branch or allow multiple reads.

Multi threaded object creation slower than in a single thread

I have what probably is a basic question. When I create 100 million Hashtables it takes approximately 6 seconds (runtime = 6 seconds per core) on my machine if I do it on a single core. If I do this multi-threaded on 12 cores (my machine has 6 cores that allow hyperthreading) it takes around 10 seconds (runtime = 112 seconds per core).
This is the code I use:
Main
public class Tests
{
public static void main(String args[])
{
double start = System.currentTimeMillis();
int nThreads = 12;
double[] runTime = new double[nThreads];
TestsThread[] threads = new TestsThread[nThreads];
int totalJob = 100000000;
int jobsize = totalJob/nThreads;
for(int i = 0; i < threads.length; i++)
{
threads[i] = new TestsThread(jobsize,runTime, i);
threads[i].start();
}
waitThreads(threads);
for(int i = 0; i < runTime.length; i++)
{
System.out.println("Runtime thread:" + i + " = " + (runTime[i]/1000000) + "ms");
}
double end = System.currentTimeMillis();
System.out.println("Total runtime = " + (end-start) + " ms");
}
private static void waitThreads(TestsThread[] threads)
{
for(int i = 0; i < threads.length; i++)
{
while(threads[i].finished == false)//keep waiting untill the thread is done
{
//System.out.println("waiting on thread:" + i);
try {
Thread.sleep(1);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
}
}
Thread
import java.util.HashMap;
import java.util.Map;
public class TestsThread extends Thread
{
int jobSize = 0;
double[] runTime;
boolean finished;
int threadNumber;
TestsThread(int job, double[] runTime, int threadNumber)
{
this.finished = false;
this.jobSize = job;
this.runTime = runTime;
this.threadNumber = threadNumber;
}
public void run()
{
double start = System.nanoTime();
for(int l = 0; l < jobSize ; l++)
{
double[] test = new double[65];
}
double end = System.nanoTime();
double difference = end-start;
runTime[threadNumber] += difference;
this.finished = true;
}
}
I do not understand why creating the objects simultaneously in multiple threads takes longer per thread than doing it serially in only 1 thread. If I remove the line where I create the Hashtable this problem disappears. If anyone could help me with this I would be very grateful.
Update: This problem has an associated bug report and has been fixed with Java 1.7u40. And it was never an issue for Java 1.8 as Java 8 has an entirely different hash table algorithm.
Since you are not using the created objects, that operation will get optimized away. So you’re only measuring the overhead of creating threads, and that overhead naturally grows with the number of threads you start.
I have to correct my answer regarding a detail I didn’t know about before: there is something special about the classes Hashtable and HashMap. They both invoke sun.misc.Hashing.randomHashSeed(this) in the constructor. In other words, their instances escape during construction, which has an impact on memory visibility. This implies that their construction, unlike that of, say, an ArrayList, cannot be optimized away, and multi-threaded construction slows down due to what happens inside that method (i.e. synchronization).
As said, that’s specific to these classes and of course to this implementation (my setup: 1.7.0_13). For ordinary classes the construction time goes straight to zero for such code.
Here is a more sophisticated benchmark. Watch the difference between DO_HASH_MAP = true and DO_HASH_MAP = false (when false it creates an ArrayList instead, which has no such special behavior).
import java.util.*;
import java.util.concurrent.*;
public class AllocBench {
static final int NUM_THREADS = 1;
static final int NUM_OBJECTS = 100000000 / NUM_THREADS;
static final boolean DO_HASH_MAP = true;
public static void main(String[] args) throws InterruptedException, ExecutionException {
ExecutorService threadPool = Executors.newFixedThreadPool(NUM_THREADS);
Callable<Long> task=new Callable<Long>() {
public Long call() {
return doAllocation(NUM_OBJECTS);
}
};
long startTime=System.nanoTime(), cpuTime=0;
for(Future<Long> f: threadPool.invokeAll(Collections.nCopies(NUM_THREADS, task))) {
cpuTime+=f.get();
}
long time=System.nanoTime()-startTime;
System.out.println("Number of threads: "+NUM_THREADS);
System.out.printf("entire allocation required %.03f s%n", time*1e-9);
System.out.printf("time x numThreads %.03f s%n", time*1e-9*NUM_THREADS);
System.out.printf("real accumulated cpu time %.03f s%n", cpuTime*1e-9);
threadPool.shutdown();
}
static long doAllocation(int numObjects) {
long t0=System.nanoTime();
for(int i=0; i<numObjects; i++)
if(DO_HASH_MAP) new HashMap<Object, Object>(); else new ArrayList<Object>();
return System.nanoTime()-t0;
}
}
What about if you do it on 6 cores? Hyperthreading isn't the exact same as having double the cores, so you might want to try the amount of real cores too.
Also the OS won't necessarily schedule each of your threads to their own cores.
Since all you are doing is measuring the time and churning memory, your bottleneck is likely to be in your L3 cache or bus to main memory. In this cases, coordinating the work between threads could be producing so much overhead it is worse instead of better.
This is too long for a comment but your inner loop can be just
double start = System.nanoTime();
for(int l = 0; l < jobSize ; l++){
Map<String,Integer> test = new HashMap<String,Integer>();
}
// runtime is an AtomicLong for thread safety
runtime.addAndGet(System.nanoTime() - start); // time in nano-seconds.
Taking the time can be as slow as creating a HashMap, so you might not be measuring what you think you are if you call the timer too often.
BTW Hashtable is synchronized and you might find using HashMap is faster, and possibly more scalable.

Java lock/concurrency issue when searching array with multiple threads

I am new to Java and trying to write a method that finds the maximum value in a 2D array of longs.
The method searches through each row in a separate thread, and the threads maintain a shared current maximal value. Whenever a thread finds a value larger than its own local maximum, it compares this value with the shared maximum and updates its current local maximum and possibly the shared maximum as appropriate. I need to make sure that appropriate synchronization is implemented so that the result is correct regardless of how the computations interleave.
My code is verbose and messy, but for starters, I have this function:
static long sharedMaxOf2DArray(long[][] arr, int r){
MyRunnableShared[] myRunnables = new MyRunnableShared[r];
for(int row = 0; row < r; row++){
MyRunnableShared rr = new MyRunnableShared(arr, row, r);
Thread t = new Thread(rr);
t.start();
myRunnables[row] = rr;
}
return myRunnables[0].sharedMax; //should be the same as any other one (?)
}
For the adapted runnable, I have this:
public static class MyRunnableShared implements Runnable{
long[][] theArray;
private int row;
private long rowMax;
public long localMax;
public long sharedMax;
private static Lock sharedMaxLock = new ReentrantLock();
MyRunnableShared(long[][] a, int r, int rm){
theArray = a;
row = r;
rowMax = rm;
}
public void run(){
localMax = 0;
for(int i = 0; i < rowMax; i++){
if(theArray[row][i] > localMax){
localMax = theArray[row][i];
sharedMaxLock.lock();
try{
if(localMax > sharedMax)
sharedMax = localMax;
}
finally{
sharedMaxLock.unlock();
}
}
}
}
}
I thought this use of a lock would be a safe way to prevent multiple threads from messing with the sharedMax at a time, but upon testing/comparing with a non-concurrent maximum-finding function on the same input, I found the results to be incorrect. I'm thinking the problem might come from the fact that I just say
...
t.start();
myRunnables[row] = rr;
...
in the sharedMaxOf2DArray function. Perhaps a given thread needs to finish before I put it in the array of myRunnables; otherwise, I will have "captured" the wrong sharedMax? Or is it something else? I'm not sure on the timing of things..
I'm not sure if this is a typo or not, but your Runnable implementation declares sharedMax as an instance variable:
public long sharedMax;
rather than a shared one:
public static long sharedMax;
In the former case, each Runnable gets its own copy and will not "see" the values of others. Changing it to the latter should help. Or, change it to:
public long[] sharedMax; // array of size 1 shared across all threads
and you can now create an array of size one outside the loop and pass it in to each Runnable to use as shared storage.
As an aside: please note that there will be tremendous lock contention since every thread checks the common sharedMax value by holding a lock for every iteration of its loop. This will likely lead to poor performance. You'd have to measure, but I'd surmise that letting each thread find the row maximum and then running a final pass to find the "max of maxes" might actually be comparable or quicker.
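A sketch of that last idea follows. Note the substitution: instead of the one-element array and the lock, it uses java.util.concurrent.atomic.AtomicLong as the shared storage, which is my swap and not what the answer proposes. Each thread finds its row maximum locally and publishes it once. The names SharedMaxSketch and maxOf are invented, and accumulateAndGet requires Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicLong;

class SharedMaxSketch {
    static long maxOf(long[][] arr) throws InterruptedException {
        AtomicLong sharedMax = new AtomicLong(Long.MIN_VALUE);
        Thread[] threads = new Thread[arr.length];
        for (int r = 0; r < arr.length; r++) {
            final long[] row = arr[r];
            threads[r] = new Thread(() -> {
                // Scan the row without touching shared state.
                long rowMax = Long.MIN_VALUE;
                for (long v : row) rowMax = Math.max(rowMax, v);
                // Publish once per row: atomically keep the larger value.
                sharedMax.accumulateAndGet(rowMax, Math::max);
            });
            threads[r].start();
        }
        // Wait for every row before reading the shared result.
        for (Thread t : threads) t.join();
        return sharedMax.get();
    }

    public static void main(String[] args) throws InterruptedException {
        long[][] arr = {{3, -7}, {12, 5}, {-1, 0}};
        System.out.println(maxOf(arr)); // prints 12
    }
}
```

Starting from Long.MIN_VALUE rather than 0 keeps the result correct for arrays that contain only negative values.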
From JavaDocs:
public interface Callable
A task that returns a result and may
throw an exception. Implementors define a single method with no
arguments called call.
The Callable interface is similar to Runnable, in that both are
designed for classes whose instances are potentially executed by
another thread. A Runnable, however, does not return a result and
cannot throw a checked exception.
Well, you can use Callable to calculate the result for each 1D array and use an ExecutorService to wait for them to finish. You can then compare the results of the Callables to find the maximum. The code may look like this:
Random random = new Random(System.nanoTime());
long[][] myArray = new long[5][5];
for (int i = 0; i < 5; i++) {
myArray[i] = new long[5];
for (int j = 0; j < 5; j++) {
myArray[i][j] = random.nextLong();
}
}
ExecutorService executor = Executors.newFixedThreadPool(myArray.length);
List<Future<Long>> myResults = new ArrayList<>();
// create a callable for each 1d array in the 2d array
for (int i = 0; i < myArray.length; i++) {
Callable<Long> callable = new SearchCallable(myArray[i]);
Future<Long> callResult = executor.submit(callable);
myResults.add(callResult);
}
// This makes the executor accept no new tasks
// and finish all tasks already in the queue
executor.shutdown();
// Wait until all tasks are finished
while (!executor.isTerminated()) {
}
// now compare the results and fetch the biggest one
long max = 0;
for (Future<Long> future : myResults) {
try {
max = Math.max(max, future.get());
} catch (InterruptedException | ExecutionException e) {
// something bad happened...!
e.printStackTrace();
}
}
System.out.println("The result is " + max);
And your Callable:
public class SearchCallable implements Callable<Long> {
private final long[] mArray;
public SearchCallable(final long[] pArray) {
mArray = pArray;
}
@Override
public Long call() throws Exception {
long max = 0;
for (int i = 0; i < mArray.length; i++) {
max = Math.max(max, mArray[i]);
}
System.out.println("I've got the maximum " + max + ", and you guys?");
return max;
}
}
Your code has serious lock contention and thread safety issues. Even worse, it doesn't actually wait for any of the threads to finish before the return myRunnables[0].sharedMax which is a really bad race condition. Also, using explicit locking via ReentrantLock or even synchronized blocks is usually the wrong way of doing things unless you're implementing something low level (eg your own/new concurrent data structure)
Here's a version that uses the Future concurrent primitive and an ExecutorService to handle the thread creation. The general idea is:
Submit a number of concurrent jobs to your ExecutorService
Add the Future returned from submit(...) to a List
Loop through the list calling get() on each Future and aggregating the result
This version has the added benefit that there is no lock contention (or locking in general) between the worker threads as each just returns back the max for its slice of the array.
import java.util.concurrent.*;
import java.util.*;
public class PMax {
public static long pmax(final long[][] arr, int numThreads) {
ExecutorService pool = Executors.newFixedThreadPool(numThreads);
try {
List<Future<Long>> list = new ArrayList<Future<Long>>();
for(int i=0;i<arr.length;i++) {
// put sub-array in a final so the inner class can see it:
final long[] subArr = arr[i];
list.add(pool.submit(new Callable<Long>() {
public Long call() {
long max = Long.MIN_VALUE;
for(int j=0;j<subArr.length;j++) {
if( subArr[j] > max ) {
max = subArr[j];
}
}
return max;
}
}));
}
// find the max of each slice's max:
long max = Long.MIN_VALUE;
for(Future<Long> future : list) {
long threadMax = future.get();
System.out.println("threadMax: " + threadMax);
if( threadMax > max ) {
max = threadMax;
}
}
return max;
} catch( RuntimeException e ) {
throw e;
} catch( Exception e ) {
throw new RuntimeException(e);
} finally {
pool.shutdown();
}
}
public static void main(String args[]) {
int x = 1000;
int y = 1000;
long max = Long.MIN_VALUE;
long[][] foo = new long[x][y];
for(int i=0;i<x;i++) {
for(int j=0;j<y;j++) {
long r = (long)(Math.random() * 100000000);
if( r > max ) {
// save this to compare against pmax:
max = r;
}
foo[i][j] = r;
}
}
int numThreads = 32;
long pmax = pmax(foo, numThreads);
System.out.println("max: " + max);
System.out.println("pmax: " + pmax);
}
}
Bonus: If you're calling this method repeatedly then it would probably make sense to pull the ExecutorService creation out of the method and have it be reused across calls.
Well, that definitely is an issue, but without more code it is hard to tell whether it is the only one.
There is basically a race condition between the read of sharedMax through myRunnables[0] and the modification of sharedMax in the other threads.
Think what happens if the scheduler decides not to let any thread run for now: when you are done creating the threads, you return the answer without it having been modified even once! (Of course there are other possible scenarios...)
You can overcome it by join()ing all threads before returning an answer.
