I need to write a program that calculates an approximation of the constant PI using multiple threads in Java.
I intend to use the Gregory-Leibniz series to calculate PI / 4, and then multiply the result by 4 to get the approximation of PI.
But I have some concerns about the program:
How can I separate the calculation so that it can be processed by multiple threads? The formula is a single total sum, and I don't know how to split it into parts and then collect the partial results at the end.
The series is infinite, so the user will need some way of configuring the execution to determine when it should stop and return a result. Is that possible, and how can I do it?
This is the most I have managed so far.
public class PICalculate {
public static void main(String[] args) {
System.out.println(calculatePI(5000000) * 4);
}
static double calculatePI(int n) {
double result = 0.0;
if (n < 0) {
return 0.0;
}
for (int i = 0; i <= n; i++) {
result += Math.pow(-1, i) / ((2 * i) + 1);
}
return result;
}
}
The most straightforward, but not the most optimal, approach is to distribute the sequence elements among the threads you have. That is, if you have 4 threads, thread 1 works on the elements with n % 4 == 0, thread 2 on those with n % 4 == 1, and so on:
public static void main(String ... args) throws InterruptedException {
int threadCount = 4;
int N = 100_000;
PiThread[] threads = new PiThread[threadCount];
for (int i = 0; i < threadCount; i++) {
threads[i] = new PiThread(threadCount, i, N);
threads[i].start();
}
for (int i = 0; i < threadCount; i++) {
threads[i].join();
}
double pi = 0;
for (int i = 0; i < threadCount; i++) {
pi += threads[i].getSum();
}
System.out.print("PI/4 = " + pi);
}
static class PiThread extends Thread {
private final int threadCount;
private final int threadRemainder;
private final int N;
private double sum = 0;
public PiThread(int threadCount, int threadRemainder, int n) {
this.threadCount = threadCount;
this.threadRemainder = threadRemainder;
N = n;
}
@Override
public void run() {
for (int i = 0; i <= N; i++) {
if (i % threadCount == threadRemainder) {
sum += Math.pow(-1, i) / (2 * i + 1);
}
}
}
public double getSum() {
return sum;
}
}
The run() method of PiThread is more efficient, though arguably harder to read, if the loop steps directly over the thread's own indices:
public void run() {
for (int i = threadRemainder; i <= N; i += threadCount) {
sum += Math.pow(-1, i) / (2 * i + 1);
}
}
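The same splitting idea can also be written with an ExecutorService, so that the partial sums come back as Future values instead of being read from thread fields. This is only a minimal sketch under assumptions: the class name LeibnizBlocks and its method names are hypothetical, each task gets a contiguous block of indices rather than every n-th element, and the sign is taken from the parity of i instead of Math.pow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical class name; not part of the original answer.
class LeibnizBlocks {

    // Sum of (-1)^i / (2i + 1) for i in [from, to).
    static double partialSum(long from, long to) {
        double sum = 0.0;
        for (long i = from; i < to; i++) {
            double term = 1.0 / (2.0 * i + 1.0);
            sum += (i % 2 == 0) ? term : -term;
        }
        return sum;
    }

    // Splits the first `terms` elements into one contiguous block per thread.
    static double approximatePi(long terms, int threadCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<Double>> parts = new ArrayList<>();
            long blockSize = (terms + threadCount - 1) / threadCount; // ceiling division
            for (long start = 0; start < terms; start += blockSize) {
                final long from = start;
                final long to = Math.min(terms, start + blockSize);
                parts.add(pool.submit(() -> partialSum(from, to)));
            }
            double quarterPi = 0.0;
            for (Future<Double> part : parts) {
                quarterPi += part.get(); // blocks until that block is finished
            }
            return 4.0 * quarterPi;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("PI ~ " + approximatePi(10_000_000L, 4));
    }
}
```

Contiguous blocks keep each task's loop simple, and collecting the results through Future.get() removes the need for a separate join loop.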
If you don't want to fix the number of elements in advance and would rather limit the run by time, you can follow the approach below. Note that it is still limited by Long.MAX_VALUE; to go further you would have to use BigInteger, BigDecimal or some other reasonable approach.
public static volatile boolean running = true;
public static void main(String ... args) throws InterruptedException {
int threadCount = 4;
long timeoutMs = 5_000;
final AtomicLong counter = new AtomicLong(0);
PiThread[] threads = new PiThread[threadCount];
for (int i = 0; i < threadCount; i++) {
threads[i] = new PiThread(counter);
threads[i].start();
}
Thread.sleep(timeoutMs);
running = false;
for (int i = 0; i < threadCount; i++) {
threads[i].join();
}
double sum = 0;
for (int i = 0; i < threadCount; i++) {
sum += threads[i].getSum();
}
System.out.println("counter = " + counter.get());
System.out.println("PI = " + 4 * sum);
}
static class PiThread extends Thread {
private AtomicLong counter;
private double sum = 0;
public PiThread(AtomicLong counter) {
this.counter = counter;
}
@Override
public void run() {
long i;
while (running && isValidCounter(i = counter.getAndAdd(1))) {
sum += Math.pow(-1, i) / (2 * i + 1);
}
}
private boolean isValidCounter(long value) {
return value >= 0 && value < Long.MAX_VALUE;
}
public double getSum() {
return sum;
}
}
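Besides a time limit, the stopping point can also be derived from an accuracy the user asks for. Because the Gregory-Leibniz series is alternating with terms that shrink in magnitude, the error of a partial sum is no larger than the first omitted term, so the number of terms can be computed up front. The sketch below assumes that approach; the class name LeibnizTolerance and its methods are hypothetical, and it is single-threaded to keep the idea visible.

```java
// Hypothetical helper class, not part of the answer above.
class LeibnizTolerance {

    // Number of terms n after which the first omitted term of the series
    // for PI, 4 / (2n + 1), is no larger than the requested tolerance.
    static long termsFor(double tolerance) {
        if (!(tolerance > 0.0)) {
            throw new IllegalArgumentException("tolerance must be positive");
        }
        return Math.max(1L, (long) Math.ceil((4.0 / tolerance - 1.0) / 2.0));
    }

    // Sums exactly termsFor(tolerance) terms of the series and scales by 4.
    static double approximatePi(double tolerance) {
        long terms = termsFor(tolerance);
        double quarterPi = 0.0;
        for (long i = 0; i < terms; i++) {
            double term = 1.0 / (2.0 * i + 1.0);
            quarterPi += (i % 2 == 0) ? term : -term;
        }
        return 4.0 * quarterPi;
    }

    public static void main(String[] args) {
        double tolerance = 1e-6; // assumed value; would normally come from the user
        System.out.println("terms = " + termsFor(tolerance));
        System.out.println("PI ~ " + approximatePi(tolerance));
    }
}
```

The term count returned by termsFor could then be passed as N to the fixed-size multi-threaded version shown earlier.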
I'm trying to create a Java program with threads for matrix multiplication. This is the source code:
import java.util.Random;
public class MatrixTest {
//Creating the matrix
static int[][] mat = new int[3][3];
static int[][] mat2 = new int[3][3];
static int[][] result = new int[3][3];
public static void main(String[] args) {
//Creating the object of random class
Random rand = new Random();
//Filling first matrix with random values
for (int i = 0; i < mat.length; i++) {
for (int j = 0; j < mat[i].length; j++) {
mat[i][j] = rand.nextInt(10);
}
}
//Filling second matrix with random values
for (int i = 0; i < mat2.length; i++) {
for (int j = 0; j < mat2[i].length; j++) {
mat2[i][j] = rand.nextInt(10);
}
}
try {
//Object of multiply Class
Multiply multiply = new Multiply(3, 3);
//Threads
MatrixMultiplier thread1 = new MatrixMultiplier(multiply);
MatrixMultiplier thread2 = new MatrixMultiplier(multiply);
MatrixMultiplier thread3 = new MatrixMultiplier(multiply);
//Implementing threads
Thread th1 = new Thread(thread1);
Thread th2 = new Thread(thread2);
Thread th3 = new Thread(thread3);
//Starting threads
th1.start();
th2.start();
th3.start();
th1.join();
th2.join();
th3.join();
} catch (Exception e) {
e.printStackTrace();
}
//Printing the result
System.out.println("\n\nResult:");
for (int i = 0; i < result.length; i++) {
for (int j = 0; j < result[i].length; j++) {
System.out.print(result[i][j] + " ");
}
System.out.println();
}
}//End main
}//End Class
//Multiply Class
class Multiply extends MatrixTest {
private int i;
private int j;
private int chance;
public Multiply(int i, int j) {
this.i = i;
this.j = j;
chance = 0;
}
//Matrix Multiplication Function
public synchronized void multiplyMatrix() {
int sum = 0;
int a = 0;
for (a = 0; a < i; a++) {
sum = 0;
for (int b = 0; b < j; b++) {
sum = sum + mat[chance][b] * mat2[b][a];
}
result[chance][a] = sum;
}
if (chance >= i)
return;
chance++;
}
}//End multiply class
//Thread Class
class MatrixMultiplier implements Runnable {
private final Multiply mul;
public MatrixMultiplier(Multiply mul) {
this.mul = mul;
}
@Override
public void run() {
mul.multiplyMatrix();
}
}
I just tried it in Eclipse and it works, but now I want to create another version of the program in which I use one thread for each cell of the result matrix. For example, I have two 3x3 matrices, so the result matrix will be 3x3, and I want to use 9 threads, one for each of its 9 cells.
Can anyone help me?
You can create n Threads as follows (Note: numberOfThreads is the number of threads that you want to create. This will be the number of cells):
List<Thread> threads = new ArrayList<>(numberOfThreads);
for (int x = 0; x < numberOfThreads; x++) {
Thread t = new Thread(new MatrixMultiplier(multiply));
t.start();
threads.add(t);
}
for (Thread t : threads) {
t.join();
}
Please use the new Executor framework to create Threads, instead of manually doing the plumbing.
ExecutorService executor = Executors.newFixedThreadPool(numberOfThreadsInPool);
for (int i = 0; i < numberOfThreads; i++) {
Runnable worker = new MatrixMultiplier(multiply);
executor.execute(worker);
}
executor.shutdown();
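// Block until all submitted tasks have finished instead of spinning on isTerminated()
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);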
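For the one-task-per-cell version asked about in the question, the executor approach might look like the sketch below. It is written under assumptions: the class CellTasks and its multiply method are hypothetical names, the matrices are plain int[][] arrays, and invokeAll is used so that the call returns only after every cell has been computed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical class name; one task is created per cell of the result.
class CellTasks {

    static int[][] multiply(int[][] a, int[][] b, int poolSize) {
        int rows = a.length, inner = b.length, cols = b[0].length;
        int[][] result = new int[rows][cols];

        List<Callable<Void>> tasks = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                final int row = r, col = c;
                tasks.add(() -> {
                    int sum = 0;
                    for (int k = 0; k < inner; k++) {
                        sum += a[row][k] * b[k][col];
                    }
                    result[row][col] = sum; // each task writes its own cell only
                    return null;
                });
            }
        }

        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            pool.invokeAll(tasks); // returns once every task has completed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};
        System.out.println(java.util.Arrays.deepToString(multiply(a, b, 4)));
    }
}
```

With a fixed pool the number of tasks (cells) no longer has to equal the number of threads, which matters once the matrices grow beyond a few cells.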
With this code I think I have solved my problem. I don't use synchronized in the methods, but I don't think it is necessary in this case, because each thread writes to a different cell of C.
import java.util.Scanner;
class MatrixProduct extends Thread {
private int[][] A;
private int[][] B;
private int[][] C;
private int rig, col;
private int dim;
public MatrixProduct(int[][] A, int[][] B, int[][] C, int rig, int col, int dim_com) {
this.A = A;
this.B = B;
this.C = C;
this.rig = rig;
this.col = col;
this.dim = dim_com;
}
public void run() {
for (int i = 0; i < dim; i++) {
C[rig][col] += A[rig][i] * B[i][col];
}
System.out.println("Thread " + rig + "," + col + " complete.");
}
}
public class MatrixMultiplication {
public static void main(String[] args) {
Scanner In = new Scanner(System.in);
System.out.print("Row of Matrix A: ");
int rA = In.nextInt();
System.out.print("Column of Matrix A: ");
int cA = In.nextInt();
System.out.print("Row of Matrix B: ");
int rB = In.nextInt();
System.out.print("Column of Matrix B: ");
int cB = In.nextInt();
System.out.println();
if (cA != rB) {
System.out.println("We can't do the matrix product!");
System.exit(-1);
}
System.out.println("The matrix result from product will be " + rA + " x " + cB);
System.out.println();
int[][] A = new int[rA][cA];
int[][] B = new int[rB][cB];
int[][] C = new int[rA][cB];
MatrixProduct[][] thrd = new MatrixProduct[rA][cB];
System.out.println("Insert A:");
System.out.println();
for (int i = 0; i < rA; i++) {
for (int j = 0; j < cA; j++) {
System.out.print(i + "," + j + " = ");
A[i][j] = In.nextInt();
}
}
System.out.println();
System.out.println("Insert B:");
System.out.println();
for (int i = 0; i < rB; i++) {
for (int j = 0; j < cB; j++) {
System.out.print(i + "," + j + " = ");
B[i][j] = In.nextInt();
}
}
System.out.println();
for (int i = 0; i < rA; i++) {
for (int j = 0; j < cB; j++) {
thrd[i][j] = new MatrixProduct(A, B, C, i, j, cA);
thrd[i][j].start();
}
}
for (int i = 0; i < rA; i++) {
for (int j = 0; j < cB; j++) {
try {
thrd[i][j].join();
} catch (InterruptedException e) {
}
}
}
System.out.println();
System.out.println("Result");
System.out.println();
for (int i = 0; i < rA; i++) {
for (int j = 0; j < cB; j++) {
System.out.print(C[i][j] + " ");
}
System.out.println();
}
}
}
Consider Matrix.java and Main.java as follows.
public class Matrix extends Thread {
private static int[][] a;
private static int[][] b;
private static int[][] c;
/* You might need other variables as well */
private int i;
private int j;
private int z1;
private int s;
private int k;
public Matrix(int[][] A, final int[][] B, final int[][] C, int i, int j, int z1) { // need to change this, might
// need some information
a = A;
b = B;
c = C;
this.i = i;
this.j = j;
this.z1 = z1; // a[0].length
}
public void run() {
synchronized (c) {
// 3. How to allocate work for each thread (recall it is the run function which
// all the threads execute)
// This code implements the work allocated to a particular thread:
// each element of the resulting matrix is generated by its own thread
for (s = 0, k = 0; k < z1; k++)
s += a[i][k] * b[k][j];
c[i][j] = s;
}
}
public static int[][] returnC() {
return c;
}
public static int[][] multiply(final int[][] a, final int[][] b) {
/*
* Check if the multiplication can be done; if not, return null. Otherwise
* allocate the required memory and return a * b.
*/
final int x = a.length;
final int y = b[0].length;
final int z1 = a[0].length;
final int z2 = b.length;
if (z1 != z2) {
System.out.println("Cannot multiply");
return null;
}
final int[][] c = new int[x][y];
int i, j;
// 1. How to use threads to parallelize the operation?
// Every element in the resulting matrix will be determined by a different
// thread
// 2. How may threads to use?
// x * y threads are used to generate the result.
for (i = 0; i < x; i++)
for (j = 0; j < y; j++) {
try {
Matrix temp_thread = new Matrix(a, b, c, i, j, z1);
temp_thread.start();
// 4. How to synchronize?
// synchronized() is used with join() to guarantee that the particular thread
// will be accessed first
temp_thread.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
return Matrix.returnC();
}
}
You can use Main.java to give 2 matrices that need to be multiplied.
class Main {
public static int[][] a = {
{1, 1, 1},
{1, 1, 1},
{1, 1, 1}};
public static int[][] b = {
{1},
{1},
{1}};
public static void print_matrix(int[][] a) {
for (int i = 0; i < a.length; i++) {
for (int j = 0; j < a[i].length; j++)
System.out.print(a[i][j] + " ");
System.out.println();
}
}
public static void main(String[] args) {
int[][] x = Matrix.multiply(a, b);
print_matrix(x); // see if the multiplication is correct
}
}
In simple terms, what you need to do is:
1) Create n threads, where n is the number of cells in the resultant matrix, and assign each its role. (Example: for the product M x N of matrices M and N, 'thread1' is responsible for multiplying the elements of row 1 of M with the elements of column 1 of N and storing the result, which is the value of cell 1 of the resultant matrix.)
2) Start each thread with its start() method.
3) Wait until all the threads have finished and stored the value of their cell, because this must be done before the resultant matrix is displayed. (You can do this with join(), among other possibilities.)
4) Now you can display the resultant matrix.
Note:
1) Since the shared resources in this example (M and N) are only read, you don't need 'synchronized' methods to access them.
2) In this program a group of threads is running, and all of them must reach a specific state before the program as a whole continues with its next step. This multi-threaded programming model is known as a barrier.
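The four steps above can be sketched as follows. This is one possible reading rather than the only way to do it: the class name LatchMultiply is hypothetical, and a java.util.concurrent.CountDownLatch is assumed as the waiting mechanism in step 3 in place of calling join() on every thread.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical class name; follows steps 1 to 4 with one thread per cell.
class LatchMultiply {

    static int[][] multiply(int[][] m, int[][] n) {
        int rows = m.length, inner = n.length, cols = n[0].length;
        int[][] product = new int[rows][cols];
        CountDownLatch done = new CountDownLatch(rows * cols);

        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                final int row = r, col = c;
                // Steps 1 and 2: create the thread for this cell and start it.
                new Thread(() -> {
                    try {
                        int sum = 0;
                        for (int k = 0; k < inner; k++) {
                            sum += m[row][k] * n[k][col];
                        }
                        product[row][col] = sum;
                    } finally {
                        done.countDown(); // signal that this cell is finished
                    }
                }).start();
            }
        }

        // Step 3: wait here until every cell has been counted down.
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return product; // Step 4: the caller can now display the result.
    }

    public static void main(String[] args) {
        int[][] m = {{1, 1, 1}, {1, 1, 1}, {1, 1, 1}};
        int[][] n = {{1}, {1}, {1}};
        System.out.println(java.util.Arrays.deepToString(multiply(m, n)));
    }
}
```

M and N are only read, so no synchronized access is needed, and every thread writes to a cell that no other thread touches.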
Tried below code in eclipse as per thread for each cell. It works fine, you can check it.
class ResMatrix {
static int[][] arrres = new int[2][2];
}
class Matrix {
int[][] arr = new int[2][2];
void setV(int v) {
//int tmp = v;
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++) {
arr[i][j] = v;
v = v + 1;
}
}
}
int[][] getV() {
return arr;
}
}
class Mul extends Thread {
public int row;
public int col;
Matrix m;
Matrix m1;
Mul(int row, int col, Matrix m, Matrix m1) {
this.row = row;
this.col = col;
this.m = m;
this.m1 = m1;
}
public void run() {
//System.out.println("Started Thread: " + Thread.currentThread().getName());
int tmp = 0;
for (int i = 0; i < 2; i++) {
tmp = tmp + this.m.getV()[row][i] * this.m1.getV()[i][col];
}
ResMatrix.arrres[row][col] = tmp;
System.out.println("Started Thread END: " + Thread.currentThread().getName());
}
}
public class Test {
//static int[][] arrres =new int[2][2];
public static void main(String[] args) throws InterruptedException {
Matrix mm = new Matrix();
mm.setV(1);
Matrix mm1 = new Matrix();
mm1.setV(2);
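Mul[] workers = new Mul[4];
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++) {
workers[i * 2 + j] = new Mul(i, j, mm, mm1);
workers[i * 2 + j].start();
}
}
// Wait for every cell's thread to finish before reading the results
for (Mul worker : workers) {
worker.join();
}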
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++) {
System.out.println("VALUE: " + ResMatrix.arrres[i][j]);
}
}
}
}
In my solution I assign to each worker a number of rows, numRowForThread, equal to (number of rows of matA) / (number of threads); the last worker also takes the remainder.
public class MatMulConcur {
private final static int NUM_OF_THREAD = 1;
private static Mat matC;
public static Mat matmul(Mat matA, Mat matB) {
matC = new Mat(matA.getNRows(), matB.getNColumns());
return mul(matA, matB);
}
private static Mat mul(Mat matA, Mat matB) {
int numRowForThread;
int numRowA = matA.getNRows();
int startRow = 0;
Worker[] myWorker = new Worker[NUM_OF_THREAD];
for (int j = 0; j < NUM_OF_THREAD; j++) {
if (j < NUM_OF_THREAD - 1) {
numRowForThread = (numRowA / NUM_OF_THREAD);
} else {
numRowForThread = (numRowA / NUM_OF_THREAD) + (numRowA % NUM_OF_THREAD);
}
myWorker[j] = new Worker(startRow, startRow + numRowForThread, matA, matB);
myWorker[j].start();
startRow += numRowForThread;
}
for (Worker worker : myWorker) {
try {
worker.join();
} catch (InterruptedException e) {
}
}
return matC;
}
private static class Worker extends Thread {
private int startRow, stopRow;
private Mat matA, matB;
public Worker(int startRow, int stopRow, Mat matA, Mat matB) {
super();
this.startRow = startRow;
this.stopRow = stopRow;
this.matA = matA;
this.matB = matB;
}
@Override
public void run() {
for (int i = startRow; i < stopRow; i++) {
for (int j = 0; j < matB.getNColumns(); j++) {
double sum = 0;
for (int k = 0; k < matA.getNColumns(); k++) {
sum += matA.get(i, k) * matB.get(k, j);
}
matC.set(i, j, sum);
}
}
}
}
}
where for the class Mat, I used this implementation:
public class Mat {
private double[][] mat;
public Mat(int n, int m) {
mat = new double[n][m];
}
public void set(int i, int j, double v) {
mat[i][j] = v;
}
public double get(int i, int j) {
return mat[i][j];
}
public int getNRows() {
return mat.length;
}
public int getNColumns() {
return mat[0].length;
}
}
Suppose my input is 1. As a 32-bit binary number, 1 is 00000000000000000000000000000001. If I invert the bits, it becomes 11111111111111111111111111111110. If I convert this inverted number from binary to decimal, I should get 4294967294. I wrote the following program to do this, but my final sum is wrong even though I am able to invert the bits correctly. I'm getting -3.
Here is my code:
public class FlippingBits {
public static void main(String[] args) {
FlippingBits fpb = new FlippingBits();
int i = 1;
int index = 0;
int[] bitArray = new int[32];
fpb.convertToBin(i, bitArray, index);
}
private void convertToBin(int decimalInput, int[] unsigned32, int index) {
if (decimalInput <= 1) {
unsigned32[index++] = flipBit(decimalInput);
for (int i = index; i < unsigned32.length; i++) {
unsigned32[i] = 1;
}
printArray(unsigned32);
System.out.println();
sumBit(unsigned32);
return;
}
int remainder = decimalInput % 2;
unsigned32[index] = flipBit(remainder);
index++;
convertToBin(decimalInput >> 1, unsigned32, index);
}
private void sumBit(int[] unsigned32) {
int sum = 0;
for (int i = 0; i < unsigned32.length; i++) {
sum += unsigned32[i] * (int) Math.pow(2, i);
}
System.out.println(sum);
}
private int flipBit(int remainder) {
if (remainder == 1) {
return 0;
} else {
return 1;
}
}
private void printArray(int[] unsigned32) {
for (int i = 0; i < unsigned32.length; i++) {
System.out.print(unsigned32[i]);
}
}
}
I'm not sure what's happening with my sumBit(int[]) method. I'm pretty sure I haven't forgotten how to convert from binary to decimal.
You aren't actually using unsigned ints. Java's int is signed, so you are overflowing your sum variable; accumulate into a long instead.
This should help:
public class FlippingBits {
public static void main(String[] args) {
FlippingBits fpb = new FlippingBits();
int i = 1;
int index = 0;
int[] bitArray = new int[32];
fpb.convertToBin(i, bitArray, index);
}
private void convertToBin(int decimalInput, int[] unsigned32, int index) {
if (decimalInput <= 1) {
unsigned32[index++] = flipBit(decimalInput);
for (int i = index; i < unsigned32.length; i++) {
unsigned32[i] = 1;
}
printArray(unsigned32);
System.out.println();
sumBit(unsigned32);
return;
}
int remainder = decimalInput % 2;
unsigned32[index] = flipBit(remainder);
index++;
convertToBin(decimalInput >> 1, unsigned32, index);
}
private void sumBit(int[] unsigned32) {
long sum = 0;
for (int i = unsigned32.length - 1; i >= 0; i--) {
// 1L << i keeps bit 31 exact; (int) Math.pow(2, 31) would be clamped to Integer.MAX_VALUE
sum += unsigned32[i] * (1L << i);
}
System.out.println(sum);
}
private int flipBit(int remainder) {
if (remainder == 1) {
return 0;
} else {
return 1;
}
}
private void printArray(int[] unsigned32) {
for (int i = unsigned32.length - 1; i >= 0; i--) {
System.out.print(unsigned32[i]);
}
}
}
Java's int data type is a 32-bit signed two's complement integer; see http://www.cs.uwm.edu/~cs151/Bacon/Lecture/HTML/ch03s09.html for more info.
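Since int is two's complement, the same result can also be obtained without building a bit array at all. The sketch below assumes Java 8 or later for Integer.toUnsignedLong; the class name FlipBits and the method flip32 are hypothetical.

```java
// Hypothetical class name; flips all 32 bits and reads the result as unsigned.
class FlipBits {

    static long flip32(int value) {
        // ~value inverts every bit; toUnsignedLong widens without sign extension.
        return Integer.toUnsignedLong(~value);
    }

    public static void main(String[] args) {
        System.out.println(flip32(1));                   // prints 4294967294
        System.out.println(Integer.toBinaryString(~1));  // prints 11111111111111111111111111111110
    }
}
```

The expression (~value) & 0xFFFFFFFFL gives the same value on older Java versions.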
For some reason, my method "bishops" runs much faster when called from the main method than from the static initialization block. Is this normal, or a bug?
public class Magic
{
public static void main(String[] args)
{
bishops();
}
public static void bishops()
{
//PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter("bishops.txt")));
BISHOP_SHIFTS = new int[64];
BISHOP_COMBOS = new long[64][];
for (int square = 0; square < 64; square++) {System.out.println("bbb " + square);
int NUMBER = bitCount(BISHOP_ATTACKS[square]);
BISHOP_SHIFTS[square] = 64 - NUMBER;
long x = BISHOP_ATTACKS[square];
long[] MAPS = new long[NUMBER];
for (int n = 0; n < NUMBER; n++) {
int i = bitScan(x);
MAPS[n] = (1L << i);
x -= MAPS[n];
}
int C = 1 << NUMBER;
BISHOP_COMBOS[square] = new long[C];
for (int i = 0; i < C; i++) {
BISHOP_COMBOS[square][i] = 0;
int j = i;
for (int n = 0; n < NUMBER; n++) {
if ((j & 1) == 1)
BISHOP_COMBOS[square][i] |= MAPS[n];
j >>>= 1;
}
//out.println("SQUARE " + square);
//out.println(toBitboardString(BISHOP_COMBOS[square][i]));
//out.println();
}
}
//out.close();
bishopMagics();
}
public static void bishopMagics()
{
BISHOP_MAGICS = new long[64];
Random r = new Random();
for (int square = 0; square < 64; square++) {System.out.println("asdffff " + square);
int i;
int LENGTH = BISHOP_COMBOS[square].length;
long magic;
do {
magic = r.nextLong() & r.nextLong() & r.nextLong();
//final int COUNT = bitCount(BISHOP_MASKS[square]);
boolean[] used = new boolean[LENGTH];
for (int j = 0; j < used.length; j++)
used[j] = false;
for (i = 0; i < LENGTH; i++) {
int index = (int) ((BISHOP_COMBOS[square][i] * magic) >>> BISHOP_SHIFTS[square]);
if (used[index])
break;
else
used[index] = true;
}
} while (i < LENGTH);
BISHOP_MAGICS[square] = magic;
System.out.println(magic);
}
//bishopTable();
}
/*
* Lots of stuff omitted
*/
static
{
//bishops();
}
}
It will run much faster the second time than the first, as the JVM warms up (loads classes and compiles code). The static block is always called first.
Try running it twice from main() or from the static block and see how long each run takes.
BTW: I would take out any logging to the console, as this can slow down the code dramatically.
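A simple way to try this is to time the same method several times in one JVM and compare the runs. The sketch below is only an illustration: the class WarmupTiming and its work() method are hypothetical stand-ins for bishops(), and the actual timings depend on the machine and JVM.

```java
// Hypothetical class; work() is a stand-in for the method being measured.
class WarmupTiming {

    static long work() {
        long acc = 0;
        for (int i = 0; i < 5_000_000; i++) {
            acc += Long.bitCount(i * 0x9E3779B97F4A7C15L);
        }
        return acc;
    }

    // Runs work() `runs` times and returns the elapsed nanoseconds of each run.
    static long[] timeRuns(int runs) {
        long[] nanos = new long[runs];
        long sink = 0;
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            sink += work();
            nanos[r] = System.nanoTime() - start;
        }
        if (sink == 42) {
            System.out.println("unlikely"); // keeps the result from being discarded
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRuns(5);
        for (int r = 0; r < nanos.length; r++) {
            System.out.println("run " + r + ": " + nanos[r] / 1_000_000.0 + " ms");
        }
    }
}
```

For serious measurements a benchmarking harness is a better choice than hand-written timing loops, but a loop like this is usually enough to see the warm-up effect.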
I have implemented serial and parallel algorithms for solving linear systems using the Jacobi method. Both implementations converge and give correct solutions.
I am having trouble understanding the following:
How can the parallel implementation converge after so few iterations compared to the serial one (the same method is used in both)? Am I facing some concurrency issue that I am not aware of?
How can the number of iterations vary from run to run in the parallel implementation (6, 7)?
Thanks!
Program output:
Mathematica solution: {{-1.12756}, {4.70371}, {-1.89272}, {1.56218}}
Serial: iterations=7194 , error=false, solution=[-1.1270591, 4.7042074, -1.8922218, 1.5626835]
Parallel: iterations=6 , error=false, solution=[-1.1274619, 4.7035804, -1.8927546, 1.5621948]
Code:
Main
import java.util.Arrays;
public class Main {
public static void main(String[] args) {
Serial s = new Serial();
Parallel p = new Parallel(2);
s.solve();
p.solve();
System.out.println("Mathematica solution: {{-1.12756}, {4.70371}, {-1.89272}, {1.56218}}");
System.out.println(String.format("Serial: iterations=%d , error=%s, solution=%s", s.iter, s.errorFlag, Arrays.toString(s.data.solution)));
System.out.println(String.format("Parallel: iterations=%d , error=%s, solution=%s", p.iter, p.errorFlag, Arrays.toString(p.data.solution)));
}
}
Data
public class Data {
public float A[][] = {{2.886139567217389f, 0.9778259187352214f, 0.9432146432722157f, 0.9622157488990459f}
,{0.3023479007910952f,0.7503803506938734f,0.06163831478699766f,0.3856445043958068f}
,{0.4298384105199724f, 0.7787439716945019f, 1.838686110345417f, 0.6282668788698587f}
,{0.27798718418255075f, 0.09021764079496353f, 0.8765867330141233f, 1.246036349549629f}};
public float b[] = {1.0630309381779384f,3.674438173599066f,0.6796639099285651f,0.39831385324794155f};
public int size = A.length;
public float x[] = new float[size];
public float solution[] = new float[size];
}
Parallel
import java.util.Arrays;
public class Parallel {
private final int workers;
private float[] globalNorm;
public int iter;
public int maxIter = 1000000;
public double epsilon = 1.0e-3;
public boolean errorFlag = false;
public Data data = new Data();
public Parallel(int workers) {
this.workers = workers;
this.globalNorm = new float[workers];
Arrays.fill(globalNorm, 0);
}
public void solve() {
JacobiWorker[] threads = new JacobiWorker[workers];
int batchSize = data.size / workers;
float norm;
do {
for(int i=0;i<workers;i++) {
threads[i] = new JacobiWorker(i,batchSize);
threads[i].start();
}
for(int i=0;i<workers;i++)
try {
threads[i].join();
} catch (InterruptedException e) {
e.printStackTrace();
}
// At this point all worker calculations are done!
norm = 0;
for (float d : globalNorm) if (d > norm) norm = d;
if (norm < epsilon)
errorFlag = false; // Converged
else
errorFlag = true; // No desired convergence
} while (norm >= epsilon && ++iter <= maxIter);
}
class JacobiWorker extends Thread {
private final int idx;
private final int batchSize;
JacobiWorker(int idx, int batchSize) {
this.idx = idx;
this.batchSize = batchSize;
}
@Override
public void run() {
int upper = idx == workers - 1 ? data.size : (idx + 1) * batchSize;
float localNorm = 0, diff = 0;
for (int j = idx * batchSize; j < upper; j++) { // For every
// equation in batch
float s = 0;
for (int i = 0; i < data.size; i++) { // For every variable in
// equation
if (i != j)
s += data.A[j][i] * data.x[i];
data.solution[j] = (data.b[j] - s) / data.A[j][j];
}
diff = Math.abs(data.solution[j] - data.x[j]);
if (diff > localNorm) localNorm = diff;
data.x[j] = data.solution[j];
}
globalNorm[idx] = localNorm;
}
}
}
Serial
public class Serial {
public int iter;
public int maxIter = 1000000;
public double epsilon = 1.0e-3;
public boolean errorFlag = false;
public Data data = new Data();
public void solve() {
float norm,diff=0;
do {
for(int i=0;i<data.size;i++) {
float s=0;
for (int j = 0; j < data.size; j++) {
if (i != j)
s += data.A[i][j] * data.x[j];
data.solution[i] = (data.b[i] - s) / data.A[i][i];
}
}
norm = 0;
for (int i=0;i<data.size;i++) {
diff = Math.abs(data.solution[i]-data.x[i]); // Calculate convergence
if (diff > norm) norm = diff;
data.x[i] = data.solution[i];
}
if (norm < epsilon)
errorFlag = false; // Converged
else
errorFlag = true; // No desired convergence
} while (norm >= epsilon && ++iter <= maxIter);
}
}
I think it's a matter of implementation and not parallelization. Look at what happens with Parallel p = new Parallel(1);
Mathematica solution: {{-1.12756}, {4.70371}, {-1.89272}, {1.56218}}
Serial: iterations=7194 , error=false, solution=[-1.1270591, 4.7042074, -1.8922218, 1.5626835]
Parallel: iterations=6 , error=false, solution=[-1.1274619, 4.7035804, -1.8927546, 1.5621948]
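As it turns out, your second implementation is not doing exactly the same thing as your first one: it writes each new value into data.x inside the sweep, so later equations already see updated values, whereas the serial version updates data.x only after the whole sweep.
I added this to your parallel version and it ran in the same number of iterations.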
for (int i = idx * batchSize; i < upper; i++) {
diff = Math.abs(data.solution[i] - data.x[i]); // Calculate
// convergence
if (diff > localNorm)
localNorm = diff;
data.x[i] = data.solution[i];
}
}
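The effect of where x is updated can be reproduced without any threads. The sketch below is an illustration under assumptions: the class name SweepOrder, the small 2x2 system and the tolerance are made up for the example and are not the question's data. With inPlace set to false it performs Jacobi sweeps as in the serial version; with inPlace set to true it overwrites x during the sweep, which is a Gauss-Seidel-style update like the one in the original parallel worker.

```java
// Hypothetical class; compares updating x after the sweep with updating it in place.
class SweepOrder {

    // Iterates until the largest change in a sweep drops below tol.
    // Returns the number of sweeps; x holds the solution afterwards.
    static int solve(double[][] a, double[] b, double[] x, double tol, boolean inPlace) {
        int n = b.length;
        int sweeps = 0;
        double norm;
        do {
            double[] next = new double[n];
            norm = 0.0;
            for (int i = 0; i < n; i++) {
                double s = 0.0;
                for (int j = 0; j < n; j++) {
                    if (j != i) {
                        s += a[i][j] * x[j];
                    }
                }
                next[i] = (b[i] - s) / a[i][i];
                norm = Math.max(norm, Math.abs(next[i] - x[i]));
                if (inPlace) {
                    x[i] = next[i]; // later rows of this sweep see the new value
                }
            }
            if (!inPlace) {
                System.arraycopy(next, 0, x, 0, n); // update only after the full sweep
            }
            sweeps++;
        } while (norm >= tol && sweeps < 1_000_000);
        return sweeps;
    }

    public static void main(String[] args) {
        double[][] a = {{4, 1}, {1, 3}}; // assumed example system
        double[] b = {1, 2};
        System.out.println("after sweep: " + solve(a, b, new double[2], 1e-9, false) + " sweeps");
        System.out.println("in place:    " + solve(a, b, new double[2], 1e-9, true) + " sweeps");
    }
}
```

In the multi-threaded version the in-place values are additionally read by other threads at unpredictable times, which is a plausible reason for the iteration count changing between runs.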
I'm trying to compute Pi, but what I really want to achieve is efficiency when using more than one thread. The algorithm is simple: I randomly generate points in the unit square and then count how many of them fall inside the circle inscribed within the square (more here: http://math.fullerton.edu/mathews/n2003/montecarlopimod.html).
My idea is to split the square horizontally and to run a different thread for each part of it.
But instead of a speed-up, all I get is a slowdown. Any ideas why? Here is the code:
public class TaskManager {
public static void main(String[] args) {
int threadsCount = 3;
int size = 10000000;
boolean isQuiet = false;
PiCalculator pi = new PiCalculator(size);
Thread tr[] = new Thread[threadsCount];
long time = -System.currentTimeMillis();
int i;
double s = 1.0/threadsCount;
int p = size/threadsCount;
for(i = 0; i < threadsCount; i++) {
PiRunnable r = new PiRunnable(pi, s*i, s*(1.0+i), p, isQuiet);
tr[i] = new Thread(r);
}
for(i = 0; i < threadsCount; i++) {
tr[i].start();
}
for(i = 0; i < threadsCount; i++) {
try {
tr[i].join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
double myPi = 4.0*pi.getPointsInCircle()/pi.getPointsInSquare();
System.out.println(myPi + " time = " + (System.currentTimeMillis()+time));
}
}
public class PiRunnable implements Runnable {
PiCalculator pi;
private double minX;
private double maxX;
private int pointsToSpread;
public PiRunnable(PiCalculator pi, double minX, double maxX, int pointsToSpread, boolean isQuiet) {
super();
this.pi = pi;
this.minX = minX;
this.maxX = maxX;
this.pointsToSpread = pointsToSpread;
}
@Override
public void run() {
int n = countPointsInAreaInCircle(minX, maxX, pointsToSpread);
pi.addToPointsInCircle(n);
}
public int countPointsInAreaInCircle (double minX, double maxX, int pointsCount) {
double x;
double y;
int inCircle = 0;
for (int i = 0; i < pointsCount; i++) {
x = Math.random() * (maxX - minX) + minX;
y = Math.random();
if (x*x + y*y <= 1) {
inCircle++;
}
}
return inCircle;
}
}
public class PiCalculator {
private int pointsInSquare;
private int pointsInCircle;
public PiCalculator(int pointsInSquare) {
super();
this.pointsInSquare = pointsInSquare;
}
public synchronized void addToPointsInCircle (int pointsCount) {
this.pointsInCircle += pointsCount;
}
public synchronized int getPointsInCircle () {
return this.pointsInCircle;
}
public synchronized void setPointsInSquare (int pointsInSquare) {
this.pointsInSquare = pointsInSquare;
}
public synchronized int getPointsInSquare () {
return this.pointsInSquare;
}
}
Some results:
-for 3 threads: "3.1424696 time = 2803"
-for 1 thread: "3.1416192 time = 2337"
Your threads could be contending for Math.random(), which is synchronized; you should create an instance of java.util.Random for each thread. Also, a speedup with multiple threads will only happen if you have more than one core/CPU.
From the javadoc of Math.random():
This method is properly synchronized to allow correct use by more than one thread. However, if many threads need to generate pseudorandom numbers at a great rate, it may reduce contention for each thread to have its own pseudorandom-number generator.
Here is an alternative main method that uses the java.util.concurrent package instead of manually managing the threads and waiting for them to finish.
public static void main(final String[] args) throws InterruptedException
{
final int threadsCount = Runtime.getRuntime().availableProcessors();
final int size = 10000000;
boolean isQuiet = false;
final PiCalculator pi = new PiCalculator(size);
final ExecutorService es = Executors.newFixedThreadPool(threadsCount);
long time = -System.currentTimeMillis();
int i;
double s = 1.0 / threadsCount;
int p = size / threadsCount;
for (i = 0; i < threadsCount; i++)
{
es.submit(new PiRunnable(pi, s * i, s * (1.0 + i), p, isQuiet));
}
es.shutdown();
while (!es.isTerminated()) { /* do nothing waiting for threads to complete */ }
double myPi = 4.0 * pi.getPointsInCircle() / pi.getPointsInSquare();
System.out.println(myPi + " time = " + (System.currentTimeMillis() + time));
}
I also changed the Math.random() to use local instances of Random for each Runnable.
final private Random rnd;
...
x = this.rnd.nextDouble() * (maxX - minX) + minX;
y = this.rnd.nextDouble();
This is the new output I get:
3.1419284 time = 235
I think you could probably reduce the time further by using Futures and not synchronizing so much on the PiCalculator.
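A version along those lines might look like the sketch below. It is written under assumptions: the class name MonteCarloFutures and its methods are hypothetical, every task samples the whole unit square instead of a horizontal strip, each task owns a java.util.Random with a fixed seed so that runs are repeatable, and the hit counts are summed from the returned Futures so no shared synchronized object is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical class name; each task returns its own hit count.
class MonteCarloFutures {

    // Counts random points of the unit square that satisfy x^2 + y^2 <= 1.
    static long countHits(long points, long seed) {
        Random rnd = new Random(seed); // one generator per task, no contention
        long hits = 0;
        for (long i = 0; i < points; i++) {
            double x = rnd.nextDouble();
            double y = rnd.nextDouble();
            if (x * x + y * y <= 1.0) {
                hits++;
            }
        }
        return hits;
    }

    static double estimatePi(long totalPoints, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            long perTask = totalPoints / tasks;
            List<Future<Long>> futures = new ArrayList<>();
            for (int t = 0; t < tasks; t++) {
                final long seed = 1234L + t; // assumed seeds, chosen for repeatability
                futures.add(pool.submit(() -> countHits(perTask, seed)));
            }
            long hits = 0;
            for (Future<Long> future : futures) {
                hits += future.get(); // waits for that task's result
            }
            return 4.0 * hits / (perTask * tasks);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int tasks = Runtime.getRuntime().availableProcessors();
        long time = -System.currentTimeMillis();
        double pi = estimatePi(10_000_000L, tasks);
        System.out.println(pi + " time = " + (System.currentTimeMillis() + time));
    }
}
```

Each task keeps its count in a local variable and hands it back once, so the only coordination between threads is the final summation in the calling thread.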