Rest parellel calls to service -Multithreading in java

Rest parellel calls to service -Multithreading in java - java

I have a rest call api where max count of result return by the api is 1000.start page=1
{
"status": "OK",
"payload": {
"EMPList":[],
count:5665
}
So to get other result I have to change the start page=2 and again hit the service.again will get 1000 results only.
but after first call i want to make it as a parallel call and I want to collect the result and combine it and send it back to calling service in java. Please suggest i am new to java.i tried using callable but it's not working

It seems to me that ideally you should be able to configure your max count to one appropriate for your use case. I'm assuming you aren't able to do that. Here is a simple, lock-less, multi threading scheme that acts as a simple reduction operation for your two network calls:
// online runnable: https://ideone.com/47KsoS
int resultSize = 5;
int[] result = new int[resultSize*2];
Thread pg1 = new Thread(){
public void run(){
System.out.println("Thread 1 Running...");
// write numbers 1-5 to indexes 0-4
for(int i = 0 ; i < resultSize; i ++) {
result[i] = i + 1;
}
System.out.println("Thread 1 Exiting...");
}
};
Thread pg2 = new Thread(){
public void run(){
System.out.println("Thread 2 Running");
// write numbers 5-10 to indexes 5-9
for(int i = 0 ; i < resultSize; i ++) {
result[i + resultSize] = i + 1 + resultSize;
}
System.out.println("Thread 2 Exiting...");
}
};
pg1.start();
pg2.start();
// ensure that pg1 execution finishes
pg1.join();
// ensure that pg2 execution finishes
pg2.join();
// print result of reduction operation
System.out.println(Arrays.toString(result));
There is a very important caveat with this implementation however. You will notice that both of the threads DO NOT overlap in their memory writes. This is very important as if you were to simply change our int[] result to ArrayList<Integer> this could lead to catastrophic failure in our reduction operation between the two threads called a Race Condition (I believe the standard ArrayList implementation in Java is not thread safe). Since we can guarantee how large our result will be I would highly suggest sticking to my usage of an array for this multi-threaded implementation as ArrayLists hide a lot of implementation logic from you that you likely won't understand until you take a basic data-structures course.

Related

Completable future waste more time?

I have a question, I am learning about CompletableFuture of Java 8, I did a dummy with One method running with runAsync of Completable future, it is a simple for 0 to 10 and in paralen a for o to 5 In the second method I run the same for to 0 to 20, but the method of runAsyn takes longer than the other method, It is normal?
Shouldn't the asynchronous method last the same or less than the other method?
Here is the code.
public class Sample{
public static void main(String x[]) throws InterruptedException {
runAsync();
System.out.println("========== SECOND TESTS ==========");
runSync();
}
static void runAsync() throws InterruptedException {
long startTimeOne = System.currentTimeMillis();
CompletableFuture<Void> cf = CompletableFuture.runAsync(() -> {
for (int i = 0; i < 10L; i++) {
System.out.println(" Async One");
}
});
for (int i = 0; i < 5; i++) {
System.out.println("two");
}
System.out.println("It is ready One? (1) " + cf.isDone());
System.out.println("It is ready One? (2)" + cf.isDone());
System.out.println("It is ready One? (3)" + cf.isDone());
System.out.println("It is ready One? (4)" + cf.isDone());
System.out.println("It is ready One? (5)" + cf.isDone());
System.out.println("It is ready One? (6)" + cf.isDone());
long estimatedTimeOne = System.currentTimeMillis() - startTimeOne;
System.out.println("Total time async: " + estimatedTimeOne);
}
static void runSync() {
long startTimeTwo = System.currentTimeMillis();
for (int i = 0; i < 20; i++) {
System.out.println("No async");
}
long estimatedTimeTwo = System.currentTimeMillis() - startTimeTwo;
System.out.println("Total time no async: " + estimatedTimeTwo);
}
}
The normal for waste 1 milisecond and the runAsync waste 54 miliseconds
Here is the result screenshot

First, you are violating basic rules mentioned in How do I write a correct micro-benchmark in Java?
Most notably, you’re running both approaches within the same runtime and allow them to affect each other.
Besides that, you are getting an output that is a sequence of messages, which is showing the fundamental problem of your operation: you can not print concurrently. The output system itself has to ensure that the printing will end up showing a sequential behavior.
When you are performing actions that can’t run in parallel through a concurrent framework, you can’t gain performance, you can only add thread communication overhead.
Besides that, the operations are not even the same:
the action you are passing to runAsync uses 10L as end boundary, in other words, is performing a long comparison where all other loops use int
"It is ready One? (6)" + cf.isDone() is performing two operations that do not appear in the sequential variant. First, polling the status of the CompletableFuture, which must be done with inter-thread semantics. Second, it bears string concatenation. Both are potentially expensive operations
The async variant is printing 21 messages whereas the sequential is printing 20. Even the total amount of characters to print is roughly 50% more in the async operation
These points may serve as examples of how easily you can do things wrong in a manual benchmark. But they do not affect the outcome significantly, due to the fundamental aspect mentioned before them. You can’t gain a performance advantage of doing the printing asynchronously at all.
Note that the output is quite consistent in your specific case. Since the common Fork/Join thread pool has not been used before your asynchronous operation, it needs to start a new thread when you submit your job, which takes so long that the subsequent local loop printing "two" completes before the asynchronous operation even starts. The next operation, polling cf.isDone() and performing string concatenation, on the other hand, is so slow, that the asynchronous operation completes entirely before these six print statements complete.
When you change the code to
CompletableFuture<Void> cf = CompletableFuture.runAsync(() -> {
for (int i = 0; i < 10; i++) {
System.out.println("Async One");
}
});
for(int i = 0; i < 10; i++) {
System.out.println("two");
}
cf.join();
you still can’t get a performance advantage, but the performance difference will be much smaller. When you add a statement like
ForkJoinPool.commonPool().execute(() -> System.out.println());
at the beginning of the main method, to ensure that the thread pool does not need to get initialized within the measured method, the perceived overhead may even reduce further.
Further, you may swap the order of the runAsync(); and runSync(); method invocations in the main method, to see how first-time execution effects influence the result when you run the two methods within the same JVM.
This all is not enough to make it a reliable benchmark but should help to understand the things that will go wrong when not understanding the pitfalls of doing a micro-benchmark.

Writing a Java program that uses threads to calculate an expression that ignores order of operations?

I am trying to author a Java program that uses threads to calculate an expression such as:
3 + 4 / 7 + 4 * 2
and outputs
Enter problem: 3 + 4 / 7 + 4 * 2
Thread-0 calculated 3+4 as 7
Thread-1 calculated 7/7 as 1
Thread-2 calculated 1+4 as 5
Thread-3 calculated 5*2 as 10
Final result: 10
In this exercise, we are ignoring order of operations. The expression is entered via user input. The goal is to get a separate thread to perform each calculation. I absolutely want each thread to perform each of the individual calculations, as I have listed above.

My honest, professional advice is don't try to use multithreading for this problem.
Learn to write clear, robust single-threaded code first. Learn how to debug it. Learn how to write the same thing in lots of different ways. It is only then that you can start to introduce the enormous complexity that is multithreading, and stand any chance of it being correct.
And learn, by reading about how to write multithreaded code correctly, what problems benefit from multithreading. This problem does not, because you need the result of the previous arithmetic operation as an input to the next.
I am only answering because of the terrible advice in comments to use global variables. Don't. This is not a good way to write multithreaded code, even in such a simple example. Even in single-threaded code, mutable global state is something which should be avoided if at all possible.
Keep your mutable state as tightly controlled as you can. Create a Runnable subclass which holds the operation you are going to perform:
class Op implements Runnable {
final int operand1, operand2;
final char oprator;
int result;
Op(int operand1, char oprator, int operand2) {
// Initialize fields.
}
#Override public void run() {
result = /* code to calculate `operand1 (oprator) operand2` */;
}
}
Now, you can calculate, say, 1 + 2 using:
Op op = new Op(1, '+', 2);
Thread t = new Thread(op);
t.start();
t.join();
int result = op.result;
(Or, you could have just used int result = 1 + 2;...)
So you can now use this in a loop:
String[] tokens = eqn.split(" ");
int result = Integer.parseInt(tokens[0]);
for (int t = 1; t < tokens.length; t += 2) {
Op op = new Op(
result,
result, tokens[t].charAt(0),
Integer.parseInt(tokens[t+1]));
Thread t = new Thread(op);
t.start();
t.join();
result = op.result;
}
All of the mutable state is confined to the scope of the op variable. If you, say, want to run a second calculation, you don't have to worry about what previous state is still hanging around: you don't have to reset anything before another run; you can invoke this code in parallel, if you want, without interference between runs.
But all of this loop could be written more cleanly - and faster - using a simple method call:
for (int t = 1; t < tokens.length; t += 2) {
result = method(
result,
result, tokens[t].charAt(0),
Integer.parseInt(tokens[t+1]));
}
Where method is a method containing /* code to calculate operand1 (oprator) operand2 */.

I'm a very beginner at programming and need some help and advice

I need to write two programs. 1 sequential (Done) and 1 parallel and I've done something but i don't know if I've made it parallel or not and i also need to know:
If (Thread.State state = Thread.currentThread().getState();) is the code to display the status of a thread
how to assign different threads to different processors (4 cores)
How to display the status of the processor
calculation for each processor
how to generate error messeges(Memory Consistency Errors, etc.)
Following is my code:
class Threads implements Runnable {
#Override
public void run() {
Thread t = Thread.currentThread();
Thread.State state = Thread.currentThread().getState();
String Assignment = "calculations in array";
String name = "os.name";
String version = "os.version";
String architecture = "os.arch";
String[] array = new String[1312500];
int size = array.length;
Random r = new Random();
int[] values = new int[1312500];
int sumarray = 0;
int CountD = 0;
for (int i = 0; i < values.length; i++) {
int randomint = r.nextInt(100);
values[i] = randomint;
if ((values[i] % 2 != 0) && (values[i] >= 25 && values[i] <= 75)) {
sumarray += values[i];
CountD++;
}
}
System.out.println(t.getName());
System.out.println("Thread Id " + t.getId());
System.out.println("Thread Priority " + t.getPriority());
System.out.println("status " + state);
System.out.println("OS Name: " + System.getProperty(name));
System.out.println("OS Version: " + System.getProperty(version));
System.out.println("OS Architechture: " + System.getProperty(architecture));
System.out.println(Assignment);
System.out.println("Size of the Array is: " + array.length);
System.out.println("Total number of system cores(processors): " + Runtime.getRuntime().availableProcessors());
System.out.println("Difference of Array and Processors 1312500/4 = "
+ array.length / Runtime.getRuntime().availableProcessors());
System.out.println("The size of array is divisible by the number of processors");
System.out.println("Summary is: " + sumarray);
System.out.println("The Average is: " + (sumarray / CountD));
}
}
Main class:
class Concurrent {
public static void main(String[] args) {
Thread t1 = new Thread(new Threads());
t1.start();
Thread t2 = new Thread(new Threads());
t2.start();
Thread t3 = new Thread(new Threads());
t3.start();
Thread t4 = new Thread(new Threads());
t4.start();
}
}

[I need to know] If Thread.State state = Thread.currentThread().getState(); is the code to display the status of a thread.
That declares a variable named state and initializes it with the state of the thread in which it is executed. (https://docs.oracle.com/javase/8/docs/api/java/lang/Thread.State.html
That code snippet doesn't display anything, but the subsequent System.out.println("status " + state); will write a representation of the state out to the console.
I have no way of knowing know whether Thread.State corresponds to what you call "status", but calling Thread.currentThread().getState(); yields no information because the state of any thread will always be RUNNABLE while it is calling getState().
[I need to know] how to assign different threads to different processors
I bet you don't.
What you are asking about is called "processor affinity." Most programs rely on the operating system to automatically schedule runnable threads on available CPUs. It's only very sophisticated programs (and note, "sophisticated" does not always equal "good") that need to tinker with it.
There is no standard way to set the CPU affinity of Java threads, but there might be some means that you can use on your particular operating system. You can google "Java processor affinity" for more info.
[I need to know] How to display the status of the processor
You're going to have to explain what "status" means and also, which processor, and when?
Depending on what "status" means, If you write code to ask for the "status" of the same processor that the code is running on, then the answer probably will never change. Just like how the result of Thread.currentThread().getState() never changes.
Processor "status" is the sort of thing that sysadmins of big data centers like to see plotted on charts, but they seldom are useful within a program unless, once again, you are doing something very sophisticated.
[I need to know] how to generate error messeges.
The simplest way to do that is to use System.out.println(...) or System.err.println(...)
Maybe you really meant to ask something else, like how to know when to report an error, or how to handle exceptions.
Most Java library routines throw an exception when something goes wrong. All you have to do in that case is completely ignore it. If you don't write any code to handle the exception, then your program will automatically print an error message and stop when the exception is thrown.

how to assign different threads to different processors (4 cores)
Let's leave this decision to java runtime. From programmer perspective, try to use the features available to you.
For effective utilizaiton of CPU cores, have a look at this question:
Java: How to scale threads according to cpu cores?
Regarding processor affinity, you can check posts:
Java thread affinity
http://tools.assembla.com/Behemoth/browser/Tests/JAVA/test/src/main/java/test/threads/ThreadAffinity.java ( written by BegemoT)
But you should avoid these type of things and focus on business logic.
how to generate error messeges(Memory Consistency Errors, etc.)
You can easily generate memory consistency errors.
Do not protect your code with thread safe constructs like synchronized or Lock and allow multiple threads to update data from your class.
Have a look at this documentation page
Refer to documentation links for better understanding of multi-threading concepts.

Multithreading a massive file read

I'm still in the process of wrapping my brain around how concurrency works in Java. I understand that (if you're subscribing to the OO Java 5 concurrency model) you implement a Task or Callable with a run() or call() method (respectively), and it behooves you to parallelize as much of that implemented method as possible.
But I'm still not understanding something inherent about concurrent programming in Java:
How is a Task's run() method assigned the right amount of concurrent work to be performed?
As a concrete example, what if I have an I/O-bound readMobyDick() method that reads the entire contents of Herman Melville's Moby Dick into memory from a file on the local system. And let's just say I want this readMobyDick() method to be concurrent and handled by 3 threads, where:
Thread #1 reads the first 1/3rd of the book into memory
Thread #2 reads the second 1/3rd of the book into memory
Thread #3 reads the last 1/3rd of the book into memory
Do I need to chunk Moby Dick up into three files and pass them each to their own task, or do I I just call readMobyDick() from inside the implemented run() method and (somehow) the Executor knows how to break the work up amongst the threads.
I am a very visual learner, so any code examples of the right way to approach this are greatly appreciated! Thanks!

You have probably chosen by accident the absolute worst example of parallel activities!
Reading in parallel from a single mechanical disk is actually slower than reading with a single thread, because you are in fact bouncing the mechanical head to different sections of the disk as each thread gets its turn to run. This is best left as a single threaded activity.
Let's take another example, which is similar to yours but can actually offer some benefit: assume I want to search for the occurrences of a certain word in a huge list of words (this list could even have come from a disk file, but like I said, read by a single thread). Assume I can use 3 threads like in your example, each searching on 1/3rd of the huge word list and keeping a local counter of how many times the searched word appeared.
In this case you'd want to partition the list in 3 parts, pass each part to a different object whose type implements Runnable and have the search implemented in the run method.
The runtime itself has no idea how to do the partitioning or anything like that, you have to specify it yourself. There are many other partitioning strategies, each with its own strengths and weaknesses, but we can stick to the static partitioning for now.
Let's see some code:
class SearchTask implements Runnable {
private int localCounter = 0;
private int start; // start index of search
private int end;
private List<String> words;
private String token;
public SearchTask(int start, int end, List<String> words, String token) {
this.start = start;
this.end = end;
this.words = words;
this.token = token;
}
public void run() {
for(int i = start; i < end; i++) {
if(words.get(i).equals(token)) localCounter++;
}
}
public int getCounter() { return localCounter; }
}
// meanwhile in main :)
List<String> words = new ArrayList<String>();
// populate words
// let's assume you have 30000 words
// create tasks
SearchTask task1 = new SearchTask(0, 10000, words, "John");
SearchTask task2 = new SearchTask(10000, 20000, words, "John");
SearchTask task3 = new SearchTask(20000, 30000, words, "John");
// create threads for each task
Thread t1 = new Thread(task1);
Thread t2 = new Thread(task2);
Thread t3 = new Thread(task3);
// start threads
t1.start();
t2.start();
t3.start();
// wait for threads to finish
t1.join();
t2.join();
t3.join();
// collect results
int counter = 0;
counter += task1.getCounter();
counter += task2.getCounter();
counter += task3.getCounter();
This should work nicely. Note that in practical cases you would build a more generic partitioning scheme. You could alternatively use an ExecutorService and implement Callable instead of Runnable if you wish to return a result.
So an alternative example using more advanced constructs:
class SearchTask implements Callable<Integer> {
private int localCounter = 0;
private int start; // start index of search
private int end;
private List<String> words;
private String token;
public SearchTask(int start, int end, List<String> words, String token) {
this.start = start;
this.end = end;
this.words = words;
this.token = token;
}
public Integer call() {
for(int i = start; i < end; i++) {
if(words.get(i).equals(token)) localCounter++;
}
return localCounter;
}
}
// meanwhile in main :)
List<String> words = new ArrayList<String>();
// populate words
// let's assume you have 30000 words
// create tasks
List<Callable> tasks = new ArrayList<Callable>();
tasks.add(new SearchTask(0, 10000, words, "John"));
tasks.add(new SearchTask(10000, 20000, words, "John"));
tasks.add(new SearchTask(20000, 30000, words, "John"));
// create thread pool and start tasks
ExecutorService exec = Executors.newFixedThreadPool(3);
List<Future> results = exec.invokeAll(tasks);
// wait for tasks to finish and collect results
int counter = 0;
for(Future f: results) {
counter += f.get();
}

You picked a bad example, as Tudor was so kind to point out. Spinning disk hardware is subject to physical constraints of moving platters and heads, and the most efficient read implementation is to read each block in order, which reduces the need to move the head or wait for the disk to align.
That said, some operating systems don't always store things continuously on disks, and for those who remember, defragmentation could provide a disk performance boost if you OS / filesystem didn't do the job for you.
As you mentioned wanting a program that would benefit, let me suggest a simple one, matrix addition.
Assuming you made one thread per core, you can trivially divide any two matrices to be added into N (one for each thread) rows. Matrix addition (if you recall) works as such:
A + B = C
or
[ a11, a12, a13 ] [ b11, b12, b13] = [ (a11+b11), (a12+b12), (a13+c13) ]
[ a21, a22, a23 ] + [ b21, b22, b23] = [ (a21+b21), (a22+b22), (a23+c23) ]
[ a31, a32, a33 ] [ b31, b32, b33] = [ (a31+b31), (a32+b32), (a33+c33) ]
So to distribute this across N threads, we simply need to take the row count and modulus divide by the number of threads to get the "thread id" it will be added with.
matrix with 20 rows across 3 threads
row % 3 == 0 (for rows 0, 3, 6, 9, 12, 15, and 18)
row % 3 == 1 (for rows 1, 4, 7, 10, 13, 16, and 19)
row % 3 == 2 (for rows 2, 5, 8, 11, 14, and 17)
// row 20 doesn't exist, because we number rows from 0
Now each thread "knows" which rows it should handle, and the results "per row" can be computed trivially because the results do not cross into other thread's domain of computation.
All that is needed now is a "result" data structure which tracks when the values have been computed, and when last value is set, then the computation is complete. In this "fake" example of a matrix addition result with two threads, computing the answer with two threads takes approximately half the time.
// the following assumes that threads don't get rescheduled to different cores for
// illustrative purposes only. Real Threads are scheduled across cores due to
// availability and attempts to prevent unnecessary core migration of a running thread.
[ done, done, done ] // filled in at about the same time as row 2 (runs on core 3)
[ done, done, done ] // filled in at about the same time as row 1 (runs on core 1)
[ done, done, .... ] // filled in at about the same time as row 4 (runs on core 3)
[ done, ...., .... ] // filled in at about the same time as row 3 (runs on core 1)
More complex problems can be solved by multithreading, and different problems are solved with different techniques. I purposefully picked one of the simplest examples.

you implement a Task or Callable with a run() or call() method
(respectively), and it behooves you to parallelize as much of that
implemented method as possible.
A Task represents a discrete unit of work
Loading a file into memory is a discrete unit of work and can therefore this activity can be delegated to a background thread. I.e. a background thread runs this task of loading the file.
It is a discrete unit of work since it has no other dependencies needed in order to do its job (load the file) and has discrete boundaries.
What you are asking is to further divide this into task. I.e. a thread loads 1/3 of the file while another thread the 2/3 etc.
If you were able to divide the task into further subtasks then it would not be a task in the first place by definition. So loading a file is a single task by itself.
To give you an example:
Let's say that you have a GUI and you need to present to the user data from 5 different files. To present them you need also to prepare some data structures to process the actual data.
All these are separate tasks.
E.g. the loading of files is 5 different tasks so could be done by 5 different threads.
The preparation of the data structures could be done a different thread.
The GUI runs of course in another thread.
All these can happen concurrently

If you system supported high-throughput I/O , here is how you can do it:
How to read a file using multiple threads in Java when a high throughput(3GB/s) file system is available
Here is the solution to read a single file with multiple threads.
Divide the file into N chunks, read each chunk in a thread, then merge them in order. Beware of lines that cross chunk boundaries. It is the basic idea as suggested by user
slaks
Bench-marking below implementation of multiple-threads for a single 20 GB file:
1 Thread : 50 seconds : 400 MB/s
2 Threads: 30 seconds : 666 MB/s
4 Threads: 20 seconds : 1GB/s
8 Threads: 60 seconds : 333 MB/s
Equivalent Java7 readAllLines() : 400 seconds : 50 MB/s
Note: This may only work on systems that are designed to support high-throughput I/O , and not on usual personal computers
Here is the essential nits of the code, for complete details , follow the link
public class FileRead implements Runnable
{
private FileChannel _channel;
private long _startLocation;
private int _size;
int _sequence_number;
public FileRead(long loc, int size, FileChannel chnl, int sequence)
{
_startLocation = loc;
_size = size;
_channel = chnl;
_sequence_number = sequence;
}
#Override
public void run()
{
System.out.println("Reading the channel: " + _startLocation + ":" + _size);
//allocate memory
ByteBuffer buff = ByteBuffer.allocate(_size);
//Read file chunk to RAM
_channel.read(buff, _startLocation);
//chunk to String
String string_chunk = new String(buff.array(), Charset.forName("UTF-8"));
System.out.println("Done Reading the channel: " + _startLocation + ":" + _size);
}
//args[0] is path to read file
//args[1] is the size of thread pool; Need to try different values to fing sweet spot
public static void main(String[] args) throws Exception
{
FileInputStream fileInputStream = new FileInputStream(args[0]);
FileChannel channel = fileInputStream.getChannel();
long remaining_size = channel.size(); //get the total number of bytes in the file
long chunk_size = remaining_size / Integer.parseInt(args[1]); //file_size/threads
//thread pool
ExecutorService executor = Executors.newFixedThreadPool(Integer.parseInt(args[1]));
long start_loc = 0;//file pointer
int i = 0; //loop counter
while (remaining_size >= chunk_size)
{
//launches a new thread
executor.execute(new FileRead(start_loc, toIntExact(chunk_size), channel, i));
remaining_size = remaining_size - chunk_size;
start_loc = start_loc + chunk_size;
i++;
}
//load the last remaining piece
executor.execute(new FileRead(start_loc, toIntExact(remaining_size), channel, i));
//Tear Down
}
}

Why is my multi threaded sorting algorithm not faster than my single threaded mergesort

There are certain algorithms whose running time can decrease significantly when one divides up a task and gets each part done in parallel. One of these algorithms is merge sort, where a list is divided into infinitesimally smaller parts and then recombined in a sorted order. I decided to do an experiment to test whether or not I could I increase the speed of this sort by using multiple threads. I am running the following functions in Java on a Quad-Core Dell with Windows Vista.
One function (the control case) is simply recursive:
// x is an array of N elements in random order
public int[] mergeSort(int[] x) {
if (x.length == 1)
return x;
// Dividing the array in half
int[] a = new int[x.length/2];
int[] b = new int[x.length/2+((x.length%2 == 1)?1:0)];
for(int i = 0; i < x.length/2; i++)
a[i] = x[i];
for(int i = 0; i < x.length/2+((x.length%2 == 1)?1:0); i++)
b[i] = x[i+x.length/2];
// Sending them off to continue being divided
mergeSort(a);
mergeSort(b);
// Recombining the two arrays
int ia = 0, ib = 0, i = 0;
while(ia != a.length || ib != b.length) {
if (ia == a.length) {
x[i] = b[ib];
ib++;
}
else if (ib == b.length) {
x[i] = a[ia];
ia++;
}
else if (a[ia] < b[ib]) {
x[i] = a[ia];
ia++;
}
else {
x[i] = b[ib];
ib++;
}
i++;
}
return x;
}
The other is in the 'run' function of a class that extends thread, and recursively creates two new threads each time it is called:
public class Merger extends Thread
{
int[] x;
boolean finished;
public Merger(int[] x)
{
this.x = x;
}
public void run()
{
if (x.length == 1) {
finished = true;
return;
}
// Divide the array in half
int[] a = new int[x.length/2];
int[] b = new int[x.length/2+((x.length%2 == 1)?1:0)];
for(int i = 0; i < x.length/2; i++)
a[i] = x[i];
for(int i = 0; i < x.length/2+((x.length%2 == 1)?1:0); i++)
b[i] = x[i+x.length/2];
// Begin two threads to continue to divide the array
Merger ma = new Merger(a);
ma.run();
Merger mb = new Merger(b);
mb.run();
// Wait for the two other threads to finish
while(!ma.finished || !mb.finished) ;
// Recombine the two arrays
int ia = 0, ib = 0, i = 0;
while(ia != a.length || ib != b.length) {
if (ia == a.length) {
x[i] = b[ib];
ib++;
}
else if (ib == b.length) {
x[i] = a[ia];
ia++;
}
else if (a[ia] < b[ib]) {
x[i] = a[ia];
ia++;
}
else {
x[i] = b[ib];
ib++;
}
i++;
}
finished = true;
}
}
It turns out that function that does not use multithreading actually runs faster. Why? Does the operating system and the java virtual machine not "communicate" effectively enough to place the different threads on different cores? Or am I missing something obvious?

The problem is not multi-threading: I've written a correctly multi-threaded QuickSort in Java and it owns the default Java sort. I did this after witnessing a gigantic dataset being process and had only one core of a 16-cores machine working.
One of your issue (a huge one) is that you're busy looping:
// Wait for the two other threads to finish
while(!ma.finished || !mb.finished) ;
This is a HUGE no-no: it is called busy looping and you're destroying the perfs.
(Another issue is that your code is not spawning any new threads, as it has already been pointed out to you)
You need to use other way to synchronize: an example would be to use a CountDownLatch.
Another thing: there's no need to spawn two new threads when you divide the workload: spawn only one new thread, and do the other half in the current thread.
Also, you probably don't want to create more threads than there are cores availables.
See my question here (asking for a good Open Source multithreaded mergesort/quicksort/whatever). The one I'm using is proprietary, I can't paste it.
Multithreaded quicksort or mergesort
I haven't implemented Mergesort but QuickSort and I can tell you that there's no array copying going on.
What I do is this:
pick a pivot
exchange values as needed
have we reached the thread limit? (depending on the number of cores)
yes: sort first part in this thread
no: spawn a new thread
sort second part in current thread
wait for first part to finish if it's not done yet (using a CountDownLatch).
The code spawning a new thread and creating the CountDownLatch may look like this:
final CountDownLatch cdl = new CountDownLatch( 1 );
final Thread t = new Thread( new Runnable() {
public void run() {
quicksort(a, i+1, r );
cdl.countDown();
}
} };
The advantage of using synchronization facilities like the CountDownLatch is that it is very efficient and that your not wasting time dealing with low-level Java synchronization idiosynchrasies.
In your case, the "split" may look like this (untested, it is just to give an idea):
if ( threads.getAndIncrement() < 4 ) {
final CountDownLatch innerLatch = new CountDownLatch( 1 );
final Thread t = new Merger( innerLatch, b );
t.start();
mergeSort( a );
while ( innerLatch.getCount() > 0 ) {
try {
innerLatch.await( 1000, TimeUnit.SECONDS );
} catch ( InterruptedException e ) {
// Up to you to decide what to do here
}
}
} else {
mergeSort( a );
mergeSort( b );
}
(don't forget to "countdown" the latch when each merge is done)
Where you'd replace the number of threads (up to 4 here) by the number of available cores. You may use the following (once, say to initialize some static variable at the beginning of your program: the number of cores is unlikely to change [unless you're on a machine allowing CPU hotswapping like some Sun systems allows]):
Runtime.getRuntime().availableProcessors()

As others said; This code isn't going to work because it starts no new threads. You need to call the start() method instead of the run() method to create new threads. It also has concurrency errors: the checks on the finished variable are not thread safe.
Concurrent programming can be pretty difficult if you do not understand the basics. You might read the book Java Concurrency in Practice by Brian Goetz. It explains the basics and explains constructs (such as Latch, etc) to ease building concurrent programs.

The overhead cost of synchronization may be comparatively large and prevent many optimizations.
Furthermore you are creating way too many threads.
The other is in the 'run' function of a class that extends thread, and recursively creates two new threads each time it is called.
You would be better off with a fixed number of threads, suggestively 4 on a quad core. This could be realized with a thread pool (tutorial) and the pattern would be "bag of tasks". But perhaps it would be better yet, to initially divide the task into four equally large tasks and do "single-threaded" sorting on those tasks. This would then utilize the caches a lot better.
Instead of having a "busy-loop" waiting for the threads to finish (stealing cpu-cycles) you should have a look at Thread.join().

How many elements in the array you have to do sort? If there are too few elements, the time of sync and CPU switching will over the time you save for dividing the job for paralleling

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.