I would like to create a simple parallel Sieve of Eratosthenes Java program that would be at least a bit more efficient than the serial version I've posted below.
public void runEratosthenesSieve(int upperBound) {
int upperBoundSquareRoot = (int) Math.sqrt(upperBound);
boolean[] isComposite = new boolean[upperBound + 1];
for (int m = 2; m <= upperBoundSquareRoot; m++) {
if (!isComposite[m]) {
System.out.print(m + " ");
int threads=4;
for (int n=1; n<=threads; n++) {
int job;
if (n==1) {job = m * m;} else {job = (n-1)*upperBound/threads;}
int upToJob = n*upperBound/threads;
for (int k = job; k <= upToJob; k += m)
{
isComposite[k] = true;
}
}
}
}
// Start one past the square root so a prime square root is not printed twice.
for (int m = upperBoundSquareRoot + 1; m <= upperBound; m++)
if (!isComposite[m])
System.out.print(m + " ");
}
I have created a loop that divides the work among 4 threads, but I don't know how to turn it into actual thread code. How do I pass the variables and start 4 threads, each with its part of the job?
I can propose the following solution: there are 4 worker threads and 1 master thread. Worker threads take jobs from a queue. A job is basically 3 numbers: from, to, step. Meanwhile, the master must wait until all workers are done. When they are done, it searches for the next prime number and creates 4 new jobs. Synchronization between master and workers can be achieved with a Semaphore: the master tries to acquire 4 permits, while every worker releases 1 permit when it finishes a job.
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

public class Sieve {
    // Number of workers. Static for simplicity.
    private static final int THREADS = 4;
    // BlockingQueue does not accept null, so a sentinel job tells workers to shut down.
    private static final Job SHUTDOWN = new Job(0, -1, 1);
    // The array is shared between the master and worker threads, so make it a field.
    private boolean[] isComposite;
    // Blocking queue with capacity equal to the number of workers.
    private final BlockingQueue<Job> jobs = new ArrayBlockingQueue<Job>(THREADS);
    private final Semaphore semaphore = new Semaphore(0);
    // An executor service manages the worker threads for us.
    // new Thread(new Worker()).start() would also work, but a pool is more convenient.
    private final ExecutorService executor = Executors.newFixedThreadPool(THREADS);

    public void runEratosthenesSieve(int upperBound) throws InterruptedException {
        int upperBoundSquareRoot = (int) Math.sqrt(upperBound);
        isComposite = new boolean[upperBound + 1];
        // Start workers.
        for (int i = 0; i < THREADS; i++) {
            executor.submit(new Worker());
        }
        for (int m = 2; m <= upperBoundSquareRoot; m++) {
            if (!isComposite[m]) {
                System.out.print(m + " ");
                for (int n = 1; n <= THREADS; n++) {
                    int from;
                    if (n == 1) {
                        from = m * m;
                    } else {
                        // Round the chunk start up to a multiple of m; otherwise
                        // the worker would mark numbers that are not multiples of m.
                        int start = (int) ((long) (n - 1) * upperBound / THREADS);
                        from = Math.max(m * m, ((start + m - 1) / m) * m);
                    }
                    int to = (int) ((long) n * upperBound / THREADS);
                    // Submit the job to the queue. We don't care which worker gets it,
                    // only that exactly one does. BlockingQueue handles that synchronization.
                    jobs.put(new Job(from, to, m));
                }
                // Wait until all jobs for this prime are done.
                semaphore.acquire(THREADS);
            }
        }
        for (int i = 0; i < THREADS; i++) {
            // One sentinel per worker shuts the workers down.
            jobs.put(SHUTDOWN);
        }
        // Let the pool threads exit once the workers return.
        executor.shutdown();
        // Start one past the square root so it is not printed twice.
        for (int m = upperBoundSquareRoot + 1; m <= upperBound; m++) {
            if (!isComposite[m]) {
                System.out.print(m + " ");
            }
        }
    }

    private static class Job {
        public final int from, to, step;

        public Job(int from, int to, int step) {
            this.from = from;
            this.to = to;
            this.step = step;
        }
    }

    private class Worker implements Runnable {
        @Override
        public void run() {
            try {
                while (true) {
                    Job job = jobs.take();
                    // The sentinel means the worker must shut down.
                    if (job == SHUTDOWN) {
                        return;
                    }
                    for (int i = job.from; i <= job.to; i += job.step) {
                        isComposite[i] = true;
                    }
                    // Notify the master thread that a job was done.
                    semaphore.release();
                }
            } catch (InterruptedException e) {
                // Restore the interrupt flag and let the worker exit.
                Thread.currentThread().interrupt();
            }
        }
    }
}
Related
The following code uses threads to calculate the max value in each subarray, and then calculates the max of the values the threads returned. I have a bug: the main thread doesn't wait for the threads to finish before collecting the results.
Thread class:
public class MaxTask extends Thread {
private int[] arr;
private int max;
private int first, last;
public MaxTask(int[] arr, int first, int last) {
this.arr = arr;
this.first = first;
this.last = last;
}
public int getMax() {
return max;
}
public void run() {
max = arr[first];
for (int i = first + 1; i <= last; i++) {
if (arr[i] > max) max = arr[i];
}
}
}
Main:
public class MainMax {
public static void main(String[] args) throws Exception {
int size = 100;
int workers = 10;
int[] arr = new int[size];
int max = 0;
for (int i = 0; i < size; i++) {
arr[i] = (int)(Math.random() * 100);
if (max < arr[i]) max = arr[i];
}
System.out.println("max=" + max);
int gsize = (arr.length - 1) / workers;
MaxTask[] tasks = new MaxTask[workers];
int first = 0;
int last;
for (int i = 0; i < workers; i++) {
last = first + gsize;
tasks[i] = new MaxTask(arr, first, last);
tasks[i].start();
first = last + 1;
}
int maxmax = tasks[0].getMax();
int temp;
for (int i = 1; i < workers; i++) {
temp = tasks[i].getMax();
if (temp > maxmax) maxmax = temp;
}
System.out.println("maxmax=" + maxmax);
}
}
I am trying to solve the problem using synchronized. I managed to get it working when using synchronized on both run and getMax. But I really don't understand why this solves the problem.
First, you must understand that main is also running on a thread. That thread is separate from the threads you created and runs in parallel with them. So int maxmax = tasks[0].getMax(); may execute before the worker threads have finished.
One possible solution would be to lock that part of the code and force the execution to wait before executing that line. Only release the lock after everyone in the loop is done. Synchronizing access to the run method only defeats the purpose of running multiple threads since you're forcing the whole thing to be sequential.
It is also not recommended to create a thread for every single element, since there's a tradeoff between number of threads and how much you're speeding up execution.
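A minimal sketch of "wait before reading the result", assuming the simplest mechanism is Thread.join() rather than an explicit lock: start every worker first, then join each one before calling getMax(). The names JoinMaxSketch, RangeMax and parallelMax are hypothetical and not taken from the question.

```java
// Hypothetical sketch: class and method names are illustrative only.
class JoinMaxSketch {
    // Finds the maximum of arr[first..last] (inclusive) on its own thread.
    static class RangeMax extends Thread {
        private final int[] arr;
        private final int first, last;
        private int max;

        RangeMax(int[] arr, int first, int last) {
            this.arr = arr;
            this.first = first;
            this.last = last;
        }

        @Override
        public void run() {
            max = arr[first];
            for (int i = first + 1; i <= last; i++) {
                if (arr[i] > max) max = arr[i];
            }
        }

        int getMax() {
            return max;
        }
    }

    // Assumes 1 <= workers <= arr.length so that every range is non-empty.
    static int parallelMax(int[] arr, int workers) {
        if (workers < 1 || workers > arr.length) {
            throw new IllegalArgumentException("need 1 <= workers <= arr.length");
        }
        RangeMax[] tasks = new RangeMax[workers];
        for (int i = 0; i < workers; i++) {
            int first = (int) ((long) i * arr.length / workers);
            int last = (int) ((long) (i + 1) * arr.length / workers) - 1;
            tasks[i] = new RangeMax(arr, first, last);
            tasks[i].start(); // start all workers before waiting on any of them
        }
        int result = Integer.MIN_VALUE;
        for (RangeMax task : tasks) {
            try {
                task.join(); // blocks until this worker's run() has returned
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            result = Math.max(result, task.getMax());
        }
        return result;
    }

    public static void main(String[] args) {
        int[] data = {3, 9, 1, 7, 42, 5, 8, 2, 6, 4};
        System.out.println("max=" + parallelMax(data, 3));
    }
}
```

Because join() returns only after run() has completed, the value read by getMax() is the finished result, and no synchronized keyword is needed on either method.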
Can someone help me with this, please? I'm trying to do a matrix multiplication, using threads. This is what I have so far:
//updated
public class Multiplication {
public static final int NUM_OF_THREADS = 8;
public static final int MATRIX_SIZE = 1000;
public static void main(String args[]) throws InterruptedException {
long startTime = System.currentTimeMillis();
int MatrixA[][] = matrixGenerator();
int MatrixB[][] = matrixGenerator();
int m1rows = MatrixA.length;
int m1cols = MatrixA[0].length;
int m2cols = MatrixB[0].length;
int MatrixC[][] = new int[m1rows][m2cols];
ExecutorService pool = Executors.newFixedThreadPool(NUM_OF_THREADS);
for (int row1 = 0; row1 < m1rows; row1++) {
for (int col1 = 0; col1 < m1cols; col1++) {
pool.submit(new MultiplicationThreading(row1, col1, MatrixA, MatrixB, MatrixC));
}
}
pool.shutdown();
pool.awaitTermination(1, TimeUnit.DAYS);
long endTime = System.currentTimeMillis();
System.out.println("Calculated in "
+ (endTime - startTime) + " milliseconds");
}
public static int[][] matrixGenerator() {
int matrix[][] = new int[MATRIX_SIZE][MATRIX_SIZE];
Random r = new Random();
for (int i = 0; i < matrix.length; i++) {
for (int j = 0; j < matrix[i].length; j++) {
matrix[i][j] = r.nextInt(10000);
}
}
return matrix;
}
}
//I have updated the code
I get better timings now. When using 2 threads I get 1.5k milliseconds and when I use 8 threads 1.3k milliseconds
You initialize the thrd array with NUM_THREADS == 9 elements. If m1rows*m1cols exceeds that value, you will get this problem, since you attempt to create more than 9 threads and assign them to elements of the array. (You are attempting to create 50 threads).
Two solutions:
Initialize thrd = new Thread[m1rows*m1cols]
Use a List<Thread>.
Note that you won't execute the threads in parallel, because you are calling Thread.join() immediately after calling Thread.start(). This just blocks the current thread until thrd[threadcount] finishes.
Move the Thread.join() calls into a separate loop, so the threads are all started before you call join on any of them.
for (row = 0; row < m1rows; row++) {
for (col = 0; col < m1cols; col++) {
// creating thread for multiplications
thrd[threadcount] = new Thread(new MultiplicationThreading(row, col, MatrixA, MatrixB, MatrixC));
thrd[threadcount].start(); //thread start
threadcount++;
}
}
for (Thread thread : thrd) {
thread.join();
}
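The second suggestion, a List<Thread>, combined with the "start everything, then join everything" ordering, could look roughly like this. It is a sketch under assumptions: the helper name runAll and the class StartThenJoin are hypothetical, and the work items are plain Runnables rather than the MultiplicationThreading class from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names are illustrative, not from the question.
class StartThenJoin {
    // Starts one thread per Runnable, then waits for all of them.
    static void runAll(List<Runnable> work) {
        List<Thread> threads = new ArrayList<Thread>();
        for (Runnable r : work) {
            Thread t = new Thread(r);
            threads.add(t); // the list grows as needed, so no fixed-size array to overflow
            t.start();
        }
        // Join in a separate loop so all threads are already running.
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        final int[] squares = new int[5];
        List<Runnable> work = new ArrayList<Runnable>();
        for (int i = 0; i < squares.length; i++) {
            final int idx = i;
            work.add(() -> squares[idx] = idx * idx); // each task writes its own slot
        }
        runAll(work);
        System.out.println(Arrays.toString(squares)); // prints [0, 1, 4, 9, 16]
    }
}
```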
I've been writing the following code for my OS course and I got some weird results. The code creates x threads and runs them concurrently in order to multiply two square matrices. Every thread multiplies Number_of_rows/Number_of_threads rows of the input matrices.
When running it on 1024x1024 matrices with 1 to 8 threads, the fastest multiplication happens when using only one thread. I would expect a MacBook Pro with an i5 processor (2 cores) to utilize both cores and yield faster results with two threads.
Running time is about 9.2 seconds with one thread and rises from about 9.6 seconds to 27 seconds as the thread count increases to 8.
Any idea why this is happening?
BTW, A few things about the code:
a. Assume that both matrices have identical dimensions and are square.
b. Assume that number of threads <= number of rows/columns.
public class MatrixMultThread implements Runnable {
final static int MATRIX_SIZE = 1024;
final static int MAX_THREADS = MATRIX_SIZE;
private float[][] a;
private float[][] b;
private float[][] res;
private int startIndex;
private int endIndex;
public MatrixMultThread(float[][] a, float[][]b, float[][] res, int startIndex, int endIndex) {
this.a = a;
this.b = b;
this.res = res;
this.startIndex = startIndex;
this.endIndex = endIndex;
}
public void run() {
float value = 0;
for (int k = startIndex; k < endIndex; k++) {
for (int i = 0; i < a.length; i++) {
for (int j = 0; j < a.length; j++) {
value += a[k][j]*b[j][i];
}
res[k][i] = value;
value = 0;
}
}
}
public static float[][] mult(float[][] a, float[][] b, int threadCount){
// Get number of rows per each thread.
int rowsPerThread = (int) Math.ceil(MATRIX_SIZE / threadCount);
float[][] res = new float[MATRIX_SIZE][MATRIX_SIZE];
// Create thread array
Thread[] threadsArray = new Thread[threadCount];
int rowCounter = 0;
for (int i = 0; i < threadCount; i++) {
threadsArray[i] = new Thread(new MatrixMultThread(a,b,res,rowCounter, Math.max(rowCounter + rowsPerThread, MATRIX_SIZE)));
threadsArray[i].start();
rowCounter += rowsPerThread;
}
// Wait for all threads to end before finishing execution.
for (int i = 0; i < threadCount; i++) {
try {
threadsArray[i].join();
} catch (InterruptedException e) {
System.out.println("join failed");
}
}
return res;
}
public static void main(String args[]) {
// Create matrices and random generator
Random randomGenerator = new Random();
float[][] a = new float[MATRIX_SIZE][MATRIX_SIZE];
float[][] b = new float[MATRIX_SIZE][MATRIX_SIZE];
// Initialize two matrices with initial values from 1 to 10.
for (int i = 0; i < a.length; i++) {
for (int j = 0; j < a.length; j++) {
a[i][j] = randomGenerator.nextFloat() * randomGenerator.nextInt(100);
b[i][j] = randomGenerator.nextFloat() * randomGenerator.nextInt(100);
}
}
long startTime;
for (int i = 1; i <= 8; i++) {
startTime = System.currentTimeMillis();
mult(a,b,i);
System.out.println("Total running time is: " + (System.currentTimeMillis() - startTime) + " ms");
}
}
}
First, a bit of logging helps. I added logging and found a bug in your logic.
Here is the log
Starting execution for thread count: 1
Start index: 0
End index: 1024
Starting execution: MatrixMultiplier: 0
Ending executionMatrixMultiplier: 0
Total running time is: 6593 ms
Starting execution for thread count: 2
Start index: 0
End index: 1024 <------ This is the problem area
Start index: 512
End index: 1024
Starting execution: MatrixMultiplier: 1
Starting execution: MatrixMultiplier: 0
In every iteration, your first thread performs the whole multiplication. That's why you are not seeing any speedup. Figure out the bug.
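One likely reading of that log, offered as a sketch rather than as the answerer's own fix: the end index in the question is computed with Math.max(rowCounter + rowsPerThread, MATRIX_SIZE), which is never smaller than MATRIX_SIZE, so every thread runs to the last row. Clamping with Math.min gives disjoint ranges. Note also that Math.ceil(MATRIX_SIZE / threadCount) performs integer division before the ceiling is applied; the sketch uses an integer ceiling division instead. The helper name rowRanges is hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the question's code.
class RowRanges {
    // Returns {start, end} pairs (end exclusive) splitting `rows` rows
    // across `threadCount` threads without overlap.
    static int[][] rowRanges(int rows, int threadCount) {
        int rowsPerThread = (rows + threadCount - 1) / threadCount; // integer ceiling
        int[][] ranges = new int[threadCount][2];
        for (int i = 0; i < threadCount; i++) {
            int start = Math.min(i * rowsPerThread, rows);
            int end = Math.min(start + rowsPerThread, rows); // min, not max
            ranges[i][0] = start;
            ranges[i][1] = end;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // With 1024 rows and 2 threads the ranges are [0, 512) and [512, 1024).
        System.out.println(Arrays.deepToString(rowRanges(1024, 2)));
    }
}
```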
I tried many ways to get the scenario below to work, but the result is an endless stream of "Fork" output. I tried to debug it, but it always waits in task.join() for a long time with no result. I understand the concept of fork/join well and can use it when a task can be divided into sub-parts, such as Fibonacci or the maximum of an array. The scenario here is different in that I have to iterate inside compute(), which is not recursive. Can anyone help?
CompositePoolTest
import java.util.Random;
import java.util.concurrent.ForkJoinPool;
public class CompositePoolTest {
Random random = new Random(123);
int done = 0;
int rest= 0;
int tt = 4;
ForkJoinPool pool = new ForkJoinPool(tt);
int M= 1000;
int N = 1000;
public static void main(String[] args) {
new CompositePoolTest().compute();
}
private void compute() {
double[][] original_matrix = new double[M][N];
original_matrix = radom_intialization();
double[][] temp_matrix = new double[M][N];
done= 0;
rest= (M * N - done) / (tt- 0);
DynamicCompositeFinder dynamicFinder = new DynamicCompositeFinder(done,rest,original_matrix,temp_matrix);
new ForkJoinPool().invoke(dynamicFinder);
}
private double[][] radom_intialization() {
double [][] grid_matrix = new double[M][N];
for (int i = 0; i < M; i++)
for (int j = 0; j < N; j++) {
grid_matrix[i][j] = random.nextDouble()+0.10;
}
return grid_matrix;
}
}
DynamicCompositeFinder
package test;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
public class DynamicCompositeFinder extends RecursiveAction {
int done = 0;
int rest = 0;
int pp = 4;
ForkJoinPool pool = new ForkJoinPool(pp);
// Matrix dimensions
int M = 1000;
int N = 1000;
int x = 0;
int y = 0;
int niteration = 150;
double[][] original_matrix = new double[M][N];
double[][] temp_matrix = new double[M][N];
public DynamicCompositeFinder(int done, int rest, double[][] original_matrix, double[][] temp_matrix) {
this.done = done;
this.rest = rest;
this.original_matrix = original_matrix;
this.temp_matrix = temp_matrix;
int limit = done + rest;
for (int i = done; i < limit; i++) {
x = i / M;
y = i % M;
temp_matrix[x][y] = fun_calculation(x, y, original_matrix);
}
}
private double fun_calculation(int x2, int y2, double[][] original_matrix2) {
double temp = 2 * (original_matrix2[x][y] );
return temp;
}
@Override
protected void compute() {
for (int i = 0; i < niteration; i++) {
done = 0;
List<RecursiveAction> forks = new LinkedList<RecursiveAction>();
for (int p = 0; p < pp; p++) // n is predefined n = 9
{
rest = (M * N - done) / (pp - p);
DynamicCompositeFinder finder = new DynamicCompositeFinder(done, rest, original_matrix, temp_matrix);
p++;
forks.add((RecursiveAction) finder.fork());
System.out.println("Fork-" + Thread.currentThread().getName()
+ " State: " + Thread.currentThread().getState());
}
for (RecursiveAction task : forks) {
task.join();
System.out.println("Join-" + Thread.currentThread().getName()
+ " State: " + Thread.currentThread().getState());
}
original_matrix = copy_matrix(temp_matrix);
}
}
public double[][] copy_matrix(double [][] matrix)
{
double [][] out= new double [matrix.length][matrix[0].length];
for(int i=0;i<matrix.length;i++)
{
out[i]= matrix[i].clone();
}
return out;
}}
Output
Fork-ForkJoinPool-1-worker-1 State: RUNNABLE
Fork-ForkJoinPool-1-worker-1 State: RUNNABLE
Fork-ForkJoinPool-1-worker-2 State: RUNNABLE
Fork-ForkJoinPool-1-worker-2 State: RUNNABLE
Fork-ForkJoinPool-1-worker-3 State: RUNNABLE
Fork-ForkJoinPool-1-worker-4 State: RUNNABLE
Fork-ForkJoinPool-1-worker-4 State: RUNNABLE
Fork-ForkJoinPool-1-worker-3 State: RUNNABLE
Fork-ForkJoinPool-1-worker-5 State: RUNNABLE
Fork-ForkJoinPool-1-worker-5 State: RUNNABLE
Fork-ForkJoinPool-1-worker-6 State: RUNNABLE
Fork-ForkJoinPool-1-worker-6 State: RUNNABLE
Fork-ForkJoinPool-1-worker-7 State: RUNNABLE
Fork-ForkJoinPool-1-worker-7 State: RUNNABLE
Fork-ForkJoinPool-1-worker-8 State: RUNNABLE
Fork-ForkJoinPool-1-worker-8 State: RUNNABLE
Fork-ForkJoinPool-1-worker-9 State: RUNNABLE
Fork-ForkJoinPool-1-worker-9 State: RUNNABLE
Fork-ForkJoinPool-1-worker-10 State: RUNNABLE
Fork-ForkJoinPool-1-worker-10 State: RUNNABLE
Fork-ForkJoinPool-1-worker-11 State: RUNNABLE
Fork-ForkJoinPool-1-worker-11 State: RUNNABLE
Fork-ForkJoinPool-1-worker-12 State: RUNNABLE
Fork-ForkJoinPool-1-worker-12 State: RUNNABLE
Fork-ForkJoinPool-1-worker-13 State: RUNNABLE
.....
......
The major problem is that you fork() forever. There is no stopper code, such as:
if (computed < limiter) return;
Therefore, you add tasks to the deque, the thread picks up each task and forks more tasks, forever. I added a stopper to your code and ran it in Java7. The join() gets called but the outside iteration keeps going forever. So you have some logic problem there.
The second problem is that you misunderstand the F/J framework. This framework is not a general purpose parallel engine. It is academic code specifically designed to recursively walk down the leaves of a balanced tree (D.A.G.) Since you do not have a balanced tree you cannot process according to the examples given in the JavaDoc:
split left, right;
left.fork();
right.compute();
left.join();
And you are not doing recursive decomposition. Your code would be more appropriate for Java 8's CountedCompleter.
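To illustrate the stopper and the split/fork/compute/join pattern described above, here is a minimal sketch, assuming the per-element work is the doubling done by fun_calculation in the question. DoubleRows, THRESHOLD and run are hypothetical names, and the threshold value is arbitrary; this is not the asker's code.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical sketch of recursive decomposition over matrix rows.
class DoubleRows extends RecursiveAction {
    private static final int THRESHOLD = 64; // arbitrary cutoff for splitting
    private final double[][] src;
    private final double[][] dst;
    private final int from, to; // row range [from, to)

    DoubleRows(double[][] src, double[][] dst, int from, int to) {
        this.src = src;
        this.dst = dst;
        this.from = from;
        this.to = to;
    }

    @Override
    protected void compute() {
        if (to - from <= THRESHOLD) {
            // Stopper: the range is small enough, so do the work directly.
            for (int r = from; r < to; r++) {
                for (int c = 0; c < src[r].length; c++) {
                    dst[r][c] = 2 * src[r][c];
                }
            }
            return;
        }
        int mid = (from + to) >>> 1;
        DoubleRows left = new DoubleRows(src, dst, from, mid);
        DoubleRows right = new DoubleRows(src, dst, mid, to);
        left.fork();     // schedule the left half asynchronously
        right.compute(); // work on the right half in this thread
        left.join();     // wait for the left half
    }

    static double[][] run(double[][] src) {
        double[][] dst = new double[src.length][];
        for (int r = 0; r < src.length; r++) {
            dst[r] = new double[src[r].length];
        }
        ForkJoinPool pool = new ForkJoinPool();
        try {
            pool.invoke(new DoubleRows(src, dst, 0, src.length));
        } finally {
            pool.shutdown();
        }
        return dst;
    }
}
```

For the repeated passes in the question (niteration), one option under the same assumptions is to call run once per pass from ordinary sequential code, so the iteration stays outside compute().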
I want to perform simple math operations on a vector (array) using two cores of my CPU. The program doesn't work correctly. Please explain how to solve my problem.
public class MyRunnable implements Runnable {
private int startIndex;
private int endIndex;
private float[] tab;
public MyRunnable(int startIndex, int endIndex, float[] tab)
{
this.startIndex = startIndex;
this.endIndex = endIndex;
this.tab = tab;
}
@Override
public void run()
{
System.out.println(Thread.currentThread());
for(int i = startIndex; i < endIndex; i++)
{
tab[i] = i * 2;
}
System.out.println("Finished");
}
}
public class Test {
public static void main(String[] args) {
int size = 10;
int n_threads = 2;
float tab[] = new float[size];
for(int i = 0; i < size; i++)
{
tab[i] = i;
}
System.out.println(Thread.currentThread());
for(int i = 0; i < size; i++)
{
System.out.println(tab[i]);
}
Runnable r1 = new MyRunnable(0, size / n_threads, tab );
Runnable r2 = new MyRunnable(size / n_threads, size, tab );
Thread t1 = new Thread(r1);
Thread t2 = new Thread(r2);
t1.start();
t2.start();
for(int i = 0; i < size; i++)
{
System.out.println(tab[i]);
}
}
}
It seems like you don't wait for the threads to finish. Use the join method and add
t1.join();
t2.join();
just before the final output loop. Note that join() throws InterruptedException, so main must either declare it or catch it.
As others have pointed out, you are not waiting for your threads to finish execution. Follow the advice of @Howard and @JK and that will fix your basic issue. If you decide to do more with threads and parallel processing, though, I would strongly advise looking into the java.util.concurrent package; it has many useful classes that will make your life much easier.
I took the liberty of recoding your example using Callable and ExecutorService. Please see the sample code below:
public static void main(String[] args) {
int size = 10;
int n_threads = 2;
float tab[] = new float[size];
for (int i = 0; i < size; i++) {
tab[i] = i;
}
System.out.println(Thread.currentThread());
for (int i = 0; i < size; i++) {
System.out.println(tab[i]);
}
// Determine batch size, based off of number of available
// threads.
int batchSize = (int) Math.ceil((double) size / n_threads);
System.out.println("Size: " + size + " Num threads: " + n_threads
+ " Batch Size: " + batchSize);
// Create list of tasks to run
List<Callable<Object>> tasks = new ArrayList<Callable<Object>>(
n_threads);
for (int i = 0; i < n_threads; i++) {
// Clamp the inclusive end index so the last batch never runs past the array.
tasks.add(Executors.callable(new MyRunnable(i * batchSize,
Math.min((i + 1) * batchSize, size) - 1, tab)));
}
// Create an executor service to handle processing tasks
ExecutorService execService = Executors.newFixedThreadPool(n_threads);
try {
execService.invokeAll(tasks);
} catch (InterruptedException ie) {
ie.printStackTrace();
} finally {
execService.shutdown();
}
for (int i = 0; i < size; i++) {
System.out.println(tab[i]);
}
}
And made one slight change in your MyRunnable class, which was skipping processing on the last index:
@Override
public void run() {
System.out.println(Thread.currentThread());
for (int i = startIndex; i <= endIndex; i++) {
tab[i] = i * 2;
}
System.out.println("Finished");
}
This works; you can test it yourself. There are many more classes in java.util.concurrent that offer similar functionality, so feel free to explore.
Good luck!
You can wait for the threads to finish execution by inserting calls to Thread.join():
t1.join();
t2.join();
after your start() calls, so the main thread pauses until the worker threads have completed. Otherwise you cannot know whether they have finished executing.
You should also consider synchronizing your tab[] accesses within the separate threads with a mutex/semaphore or similar mechanism, and not necessarily perform calculations directly on the passed in array reference, since this can limit the amount of concurrency (if present).