MultiThreaded Fibonacci

public class Fibonacci {
public static class PFibo extends Thread {
private int x;
public long answer;
public PFibo(int x) {
this.x = x;
}
public void run() {
if (x <= 2)
answer = 1;
else {
try {
PFibo t = new PFibo(x - 1);
t.start();
long y = RFibo(x - 2);
t.join();
answer = t.answer + y;
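} catch (InterruptedException ex) {
// Restore the interrupt flag instead of silently swallowing the exception.
Thread.currentThread().interrupt();
}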
}
}
}
public static long RFibo(int no) {
if (no == 1 || no == 2) {
return 1;
}
return RFibo(no - 1) + RFibo(no - 2);
}
public static void main(String[] args) throws Exception {
try {
long start = System.currentTimeMillis();
PFibo f = new PFibo(30);
f.start();
f.join();
long end = System.currentTimeMillis();
System.out.println("Parallel-Fibonacci:" + f.answer + "\tTime:" + (end - start));
start = System.currentTimeMillis();
long result = RFibo(30);
end = System.currentTimeMillis();
System.out.println("Normal-Fibonacci:" + result + "\tTime:" + (end - start));
} catch (Exception e) {
// Report failures instead of silently ignoring them.
e.printStackTrace();
}
}
}
I am currently reading 'Multithreaded Algorithms' from 'Introduction to Algorithms'. I tried implementing a basic multithreaded program for calculating the n-th Fibonacci number. For n=30 the program gave the following output:
Parallel-Fibonacci:832040 Time:10
Normal-Fibonacci:832040 Time:3
Why is the parallel version slower than the non-parallel version? Has thread switching or having too many threads slowed it down?
What approach has to be followed to speed up the parallel version?

Has thread-switching or 'too-many-number-of-threads' slowed it down ?
Yes, of course, in a number of ways:
As has already been pointed out in the comments,
you are creating a new thread per call, i.e.
PFibo t = new PFibo(x - 1);
t.start();
Effectively you have created around 28 threads for PFibo(30), which means at least one context switch for evaluating each term.
Secondly, since evaluating PFibo(x) depends on PFibo(x - 1), you have to call join() each time you create and start a new thread, so the computation ends up being effectively serial.
So the final cost = cost of the serial method RFibo(n) + around n context switches + sync time (time spent in join()).
What approach has to be followed to speed up the parallel version?
Well, I would say: don't do it. The Fibonacci recurrence is not well suited to optimization through parallelism. Just rely on the serial version (you can implement an iterative version for more efficiency).
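The iterative version mentioned above could look like the following minimal sketch. The class name IterativeFibo is hypothetical; the indexing follows RFibo from the question (fib(1) = fib(2) = 1), and the result only fits in a long up to n = 92.

```java
// Minimal sketch (hypothetical class name): iterative Fibonacci, O(n) time, O(1) space.
final class IterativeFibo {
    // Same indexing as RFibo in the question: fib(1) = fib(2) = 1.
    // A long overflows for n > 92, so larger inputs would need BigInteger.
    static long fib(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        long prev = 0; // fib(0)
        long curr = 1; // fib(1)
        for (int i = 2; i <= n; i++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return curr;
    }

    public static void main(String[] args) {
        System.out.println("Iterative-Fibonacci:" + fib(30));
    }
}
```

No threads are involved, and the loop does n - 1 additions instead of an exponential number of calls.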

Your input is too small to gain any benefit from parallelism. Nevertheless, it makes sense to parallelize this version of the Fibonacci algorithm. Your algorithm is exponential. By creating new threads, you split the exponential work among the threads. Notice, however, that there is a linear-time algorithm for computing Fibonacci numbers, which, as people here have already said, is better run sequentially. Using larger inputs with your implementation, I get, on an Intel 2.3GHz:
$ java Fib 30
Parallel-Fib:832040 Time:0.026805616
Sequential-Fib:832040 Time:0.002786453
$ java Fib 33
Parallel-Fib:3524578 Time:0.012451416
Sequential-Fib:3524578 Time:0.012420652
$ java Fib 36
Parallel-Fib:14930352 Time:0.035997556
Sequential-Fib:14930352 Time:0.056066557
$ java Fib 44
Parallel-Fib:701408733 Time:2.037292083
Sequential-Fib:701408733 Time:3.050315551
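If the goal is still to parallelize the exponential recursion, one option is to avoid a thread per call and hand the work to a fork/join pool, falling back to the sequential recursion below a cutoff. This is a sketch under assumptions, not the code used for the timings above: the class name ForkJoinFibo and the CUTOFF value of 20 are made up for illustration and would need tuning.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sketch only: ForkJoinFibo and CUTOFF are hypothetical, not from the original post.
final class ForkJoinFibo extends RecursiveTask<Long> {
    // Below this size the task is computed sequentially (assumed value, tune per machine).
    private static final int CUTOFF = 20;
    private final int n;

    ForkJoinFibo(int n) {
        this.n = n;
    }

    // Plain exponential recursion, same indexing as RFibo (fib(1) = fib(2) = 1).
    static long sequential(int n) {
        return n <= 2 ? 1 : sequential(n - 1) + sequential(n - 2);
    }

    @Override
    protected Long compute() {
        if (n <= CUTOFF) {
            return sequential(n);
        }
        ForkJoinFibo left = new ForkJoinFibo(n - 1);
        left.fork();                                    // schedule fib(n - 1) on the pool
        long right = new ForkJoinFibo(n - 2).compute(); // compute fib(n - 2) in this worker
        return left.join() + right;
    }

    // Convenience entry point that runs the task on the common pool.
    static long parallel(int n) {
        return ForkJoinPool.commonPool().invoke(new ForkJoinFibo(n));
    }

    public static void main(String[] args) {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 36;
        System.out.println("ForkJoin-Fib:" + parallel(n));
    }
}
```

The pool reuses a fixed set of worker threads, so the number of threads no longer grows with n; only the number of small tasks does.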

Related

Time how long a function runs (short duration)

I'm relatively new to Java programming, and I'm running into an issue calculating the amount of time it takes for a function to run.
First some background - I've got a lot of experience with Python, and I'm trying to recreate the functionality of the Jupyter Notebook/Lab %%timeit function, if you're familiar with that. Here's a pic of it in action (sorry, not enough karma to embed yet):
Snip of Jupyter %%timeit
What it does is run the contents of the cell (in this case a recursive function) either 1k, 10k, or 100k times, and give you the average run time of the function, and the standard deviation.
My first implementation (using the same recursive function) used System.nanoTime():
public static void main(String[] args) {
long t1, t2, diff;
long[] times = new long[1000];
int t;
for (int i=0; i< 1000; i++) {
t1 = System.nanoTime();
t = triangle(20);
t2 = System.nanoTime();
diff = t2-t1;
System.out.println(diff);
times[i] = diff;
}
long total = 0;
for (int j=0; j<times.length; j++) {
total += times[j];
}
System.out.println("Mean = " + total/1000.0);
}
But the mean is wildly thrown off -- for some reason, the first iteration of the function (on many runs) takes upwards of a million nanoseconds:
Pic of initial terminal output
Every iteration after the first dozen or so takes either 395 nanos or 0 -- so there could be a problem there too... not sure what's going on!
Also -- the code of the recursive function I'm timing:
static int triangle(int n) {
if (n == 1) {
return n;
} else {
return n + triangle(n -1);
}
}
Initially I had the line n = Math.abs(n) on the first line of the function, but then I removed it because... meh. I'm the only one using this.
I tried a number of different suggestions brought up in this SO post, but they each have their own problems... which I can go into if you need.
Anyway, thank you in advance for your help and expertise!
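One common way to get more stable numbers, sketched here under assumptions: run the function a number of times before measuring so that class loading and JIT compilation are mostly out of the way, then time batches of calls rather than single calls, since a single call to triangle(20) may be shorter than the resolution of System.nanoTime(). The names TimeIt and measure and the round counts are hypothetical, and a dedicated harness such as JMH is likely more reliable for real benchmarks.

```java
// Sketch only: TimeIt and measure are hypothetical names, and the round counts are arbitrary.
final class TimeIt {
    static int triangle(int n) {
        return n == 1 ? n : n + triangle(n - 1);
    }

    // Returns {mean, standard deviation} in nanoseconds per call.
    static double[] measure(int warmupRounds, int rounds, int callsPerRound) {
        long sink = 0; // accumulate results so the calls are not trivially removable

        // Warm-up: give the JVM a chance to load classes and compile the hot method.
        for (int r = 0; r < warmupRounds; r++) {
            for (int c = 0; c < callsPerRound; c++) {
                sink += triangle(20);
            }
        }

        // Measurement: time a whole batch, then divide by the batch size.
        double[] perCall = new double[rounds];
        for (int r = 0; r < rounds; r++) {
            long t1 = System.nanoTime();
            for (int c = 0; c < callsPerRound; c++) {
                sink += triangle(20);
            }
            long t2 = System.nanoTime();
            perCall[r] = (t2 - t1) / (double) callsPerRound;
        }

        double mean = 0;
        for (double v : perCall) {
            mean += v;
        }
        mean /= rounds;

        double variance = 0;
        for (double v : perCall) {
            variance += (v - mean) * (v - mean);
        }
        variance /= rounds;

        if (sink == 42) {
            System.out.println("unreachable in practice; keeps sink in use");
        }
        return new double[] {mean, Math.sqrt(variance)};
    }

    public static void main(String[] args) {
        double[] stats = measure(20, 1000, 1000);
        System.out.println("Mean = " + stats[0] + " ns, std dev = " + stats[1] + " ns");
    }
}
```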

Why is my java program becoming gradually slower?

I recently built a Fibonacci generator that uses recursion and hashmaps to reduce complexity. I am using System.nanoTime() to keep track of the time it takes for my program to print 10000 Fibonacci numbers. It started out well, taking less than a second, but gradually became slower and now takes more than 4 seconds. Can someone explain why this might be happening? The code is below:
import java.util.*;
import java.math.*;
public class FibonacciGeneratorUnlimited {
static int numFibCalls = 0;
static HashMap<Integer, BigInteger> d = new HashMap<Integer, BigInteger>();
static Scanner fibNumber = new Scanner(System.in);
static BigInteger ans = new BigInteger("0");
public static void main(String[] args){
d.put(0 , new BigInteger("0"));
d.put(1 , new BigInteger("1"));
System.out.print("Enter the term:\t");
int n = fibNumber.nextInt();
long startTime = System.nanoTime();
for (int i = 0; i <= n; i++) {
System.out.println(i + " : " + fib_efficient(i, d));
}
System.out.println((double)(System.nanoTime() - startTime) / 1000000000);
}
public static BigInteger fib_efficient(int n, HashMap<Integer, BigInteger> d) {
numFibCalls += 1;
if (d.containsKey(n)) {
return (d.get(n));
} else {
ans = (fib_efficient(n-1, d).add(fib_efficient(n-2, d)));
d.put(n, ans);
return ans;
}
}
}

If you are restarting the program every time you generate a new Fibonacci sequence, then your program most likely isn't the problem. It might just be that your processor got hot after running the program a few times, or that a background process on your computer suddenly started, causing your program to slow down.

More memory (java -Xmx...) or less caching:
public static BigInteger fib_efficient(int n, HashMap<Integer, BigInteger> d) {
numFibCalls++;
if ((n & 3) <= 1) { // Cache only n % 4 == 0 or 1, i.e. two consecutive values out of every four.
BigInteger cached = d.get(n);
if (cached != null) {
return cached;
} else {
BigInteger ans = fib_efficient(n-1, d).add(fib_efficient(n-2, d));
d.put(n, ans);
return ans;
}
} else {
return fib_efficient(n-1, d).add(fib_efficient(n-2, d));
}
}
Two consecutive numbers out of every four are cached, which is enough to stop the recursion on both branches of:
fib(n) = fib(n-1) + fib(n-2)
BigInteger isn't the nicest class where performance and memory are concerned.

It started out good with less than a second but gradually became slower and now it takes more than 4 seconds.
What do you mean by this? Do you mean that you ran this exact same program with the same input and its run-time changed from < 1 second to > 4 seconds?
If you have the same exact code running with the same exact inputs in a deterministic algorithm...
then the differences are probably external to your code - maybe other processes are taking up more CPU on one run.
Do you mean that you increased the inputs from some value X to 10,000 and now it takes > 4 seconds?
Then that's just a matter of the algorithm taking longer with larger inputs, which is perfectly normal.
recursion and hashmaps to reduce complexity
That's not quite how complexity works. You have improved the best-case and the average-case, but you have done nothing to change the worst-case.
Now for some actual performance improvement advice
Stop printing out the results... that's eating up over 99% of your processing time. Seriously, though, switch out "System.out.println(i + " : " + fib_efficient(i, d))" with "fib_efficient(i,d)" and it'll execute over 100x faster.
Concatenating strings and printing to console are very expensive processes.
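To see how much of the run time is output rather than computation, the two can be timed separately. The sketch below is an illustration under assumptions, not the asker's program: it keeps the memoized recursion, collects the lines in a StringBuilder, and prints once at the end. BufferedFibOutput, fib, and render are hypothetical names.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

// Sketch only: hypothetical names; separates computing the numbers from printing them.
final class BufferedFibOutput {
    private static final Map<Integer, BigInteger> CACHE = new HashMap<>();

    // Memoized recursion with fib(0) = 0 and fib(1) = 1, as in the question.
    // Intended to be called with increasing n so the recursion stays shallow.
    static BigInteger fib(int n) {
        if (n < 2) {
            return BigInteger.valueOf(n);
        }
        BigInteger cached = CACHE.get(n);
        if (cached != null) {
            return cached;
        }
        BigInteger ans = fib(n - 1).add(fib(n - 2));
        CACHE.put(n, ans);
        return ans;
    }

    // Builds all output lines in memory instead of calling println once per term.
    static String render(int n) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i <= n; i++) {
            out.append(i).append(" : ").append(fib(i)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 1000;

        long t0 = System.nanoTime();
        String text = render(n);
        long t1 = System.nanoTime();
        System.out.print(text); // a single write instead of n + 1 println calls
        long t2 = System.nanoTime();

        System.out.println("compute + format: " + (t1 - t0) / 1e9 + " s");
        System.out.println("print:            " + (t2 - t1) / 1e9 + " s");
    }
}
```

Comparing the two printed durations on your own machine shows where the time actually goes for a given n.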

It happens because the complexity of the naive recursive Fibonacci is exponential, roughly O(2^n). This means that the running time grows exponentially as the input grows, as you can see in the complexity graphs in this link. Check this answer for a complete explanation of its complexity.
Now, the complexity of your algorithm increases because you are using a HashMap to search and insert elements each time the function is invoked. Consider removing this HashMap.

Why is there run time overhead on the first time a method is called in Java?

I was measuring execution time of my code and found some weird behavior when making the first call to a method (from the main method). Here is my code, please have a look at this
public static void main(String[] args) {
try (Scanner input = new Scanner(System.in)) {
int iNum = input.nextInt();
long lStartTime = System.nanoTime();
// ***********(First call in main) Calling isPrime *************
lStartTime = System.nanoTime();
printResult(isPrime(iNum));
System.out.println("Time Consumed in first call-: "
+ (System.nanoTime() - lStartTime));
// ***********(Second call in main) Calling isPrime *************
lStartTime = System.nanoTime();
printResult(isPrime(iNum));
System.out.println("Time Consumed in second call-: "
+ (System.nanoTime() - lStartTime));
}
}
private static boolean isPrime(int iNum) {
boolean bResult = true;
if (iNum <= 1 || iNum != 2 && iNum % 2 == 0) {
bResult = false;
} else {
double iSqrt = Math.sqrt((double) iNum);
// Use <= so that perfect squares such as 9 or 25 are not reported as prime.
for (int i = 3; i <= iSqrt; i += 2) {
if (iNum % i == 0) {
bResult = false;
break;
}
}
}
return bResult;
}
private static void printResult(boolean bResult) {
if (bResult)
System.out.println("\nIt's prime number.");
else
System.out.println("\nIt's not prime number.");
}
Input
5
Output
It's prime number.
Time Consumed in first call-: 484073
It's prime number.
Time Consumed in second call-: 40710
Description
I have shown only one test case of input and output above, but there is always a difference in execution time between the first method invocation and the second one.
I have also tried more than two method calls in a similar way and found no such large difference between the other calls. I get an execution time of around 40710 ns (this may differ on your system) for all calls except the first, which takes 484073 ns. So there is an overhead of about 484073 - 40710 = 443363 ns in the first method call, but why is it happening? What is the root cause?

There are multiple implementations of the Java Runtime Environment, so not every implementation may behave like Oracle's (and previously Sun's).
That being said, the initial invocation of a method, in most current implementations, involves validation and a first-pass compilation of the Java bytecode. Thus, subsequent invocations of the method are faster. However, Java also uses a JIT. Wikipedia provides an entry on just-in-time compilation, which notes:
JIT causes a slight delay to a noticeable delay in initial execution of an application, due to the time taken to load and compile the bytecode.
And, goes on to say,
The application code is initially interpreted, but the JVM monitors which sequences of bytecode are frequently executed and translates them to machine code for direct execution on the hardware.
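The effect is easy to observe by timing several consecutive calls to the same method. In this sketch (FirstCallTiming and timeCalls are hypothetical names) only isPrime is timed; the original measurement also includes printResult, so its first call probably also pays for the first use of System.out.println. The loop bound uses i * i <= n so that perfect squares are handled.

```java
// Sketch only: hypothetical names; prints the duration of each consecutive call.
final class FirstCallTiming {
    // Written to a volatile field so the timed calls cannot simply be discarded.
    static volatile boolean sink;

    static boolean isPrime(int n) {
        if (n <= 1 || (n != 2 && n % 2 == 0)) {
            return false;
        }
        for (int i = 3; (long) i * i <= n; i += 2) {
            if (n % i == 0) {
                return false;
            }
        }
        return true;
    }

    // Times `calls` consecutive invocations of isPrime(n), in nanoseconds each.
    static long[] timeCalls(int n, int calls) {
        long[] nanos = new long[calls];
        for (int c = 0; c < calls; c++) {
            long start = System.nanoTime();
            sink = isPrime(n);
            nanos[c] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 5;
        long[] nanos = timeCalls(n, 5);
        for (int c = 0; c < nanos.length; c++) {
            System.out.println("call " + (c + 1) + ": " + nanos[c] + " ns");
        }
    }
}
```

The exact numbers vary between runs and JVMs, so the output is best read as a trend rather than as precise timings.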

Single Threaded Program vs Multithreaded Program (measuring time elapsed)

I want to know whether, when I need to measure elapsed time, a single-threaded program or a multithreaded program is the better approach.
Below is my single-threaded program that measures the response time of our service:
private static void serviceCall() {
histogram = new HashMap<Long, Long>();
keys = histogram.keySet();
long total = 5;
long runs = total;
while (runs > 0) {
long start_time = System.currentTimeMillis();
result = restTemplate.getForObject("SOME URL",String.class);
long difference = (System.currentTimeMillis() - start_time);
Long count = histogram.get(difference);
if (count != null) {
count++;
histogram.put(Long.valueOf(difference), count);
} else {
histogram.put(Long.valueOf(difference), Long.valueOf(1L));
}
runs--;
}
for (Long key : keys) {
Long value = histogram.get(key);
System.out.println("MEASUREMENT " + key + ":" + value);
}
}
The output I get from this single-threaded program (5 calls in total) is:
MEASUREMENT 163:1
MEASUREMENT 42:3
MEASUREMENT 47:1
which means 1 call came back in 163 ms, 3 calls came back in 42 ms, and so on.
I also tried a multithreaded program to measure the elapsed time, that is, hitting the service in parallel with a few threads and then measuring how long each thread takes.
Below is the code for that as well:
//create thread pool with given size
ExecutorService service = Executors.newFixedThreadPool(10);
// queue some tasks
for (int i = 0; i < 1 * 5; i++) {
service.submit(new ThreadTask(i, histogram));
}
public ThreadTask(int id, HashMap<Long, Long> histogram) {
this.id = id;
this.hg = histogram;
}
@Override
public void run() {
long start_time = System.currentTimeMillis();
result = restTemplate.getForObject("", String.class);
long difference = (System.currentTimeMillis() - start_time);
Long count = hg.get(difference);
if (count != null) {
count++;
hg.put(Long.valueOf(difference), count);
} else {
hg.put(Long.valueOf(difference), Long.valueOf(1L));
}
}
And below is the result I get from the above program-
{176=1, 213=1, 182=1, 136=1, 155=1}
One call came back in 176 ms, and so on
So my question is: why does the multithreaded program take so much more time than the single-threaded program above? If there is a flaw in my multithreaded program, can anyone help me improve it?

Your multi-threaded program likely makes all the requests at the same time, which puts more strain on the server and causes it to respond more slowly to all requests.
As an aside, the way you are doing the update isn't threadsafe, so your count will likely be off in the multithreaded scenario given enough trials.
For instance, Thread A and B both return in 100 ms at the same time. The count in histogram for 100 is 3. A gets 3. B gets 3. A updates 3 to 4. B updates 3 to 4. A puts the value 4 in the histogram. B puts the value 4 in the histogram. You've now had 2 threads believe they incremented the count but the count in the histogram only reflects being incremented once.
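One way to make the histogram update safe, sketched under the assumption that a ConcurrentHashMap is acceptable in place of the shared HashMap: ConcurrentHashMap.merge performs the read-increment-write as a single atomic step per key. SafeHistogram, record, and hammer are hypothetical names used only for this illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch only: hypothetical names; shows an atomic per-key increment.
final class SafeHistogram {
    private final Map<Long, Long> counts = new ConcurrentHashMap<>();

    // Atomically inserts 1 for a new key or adds 1 to the existing count.
    void record(long elapsedMillis) {
        counts.merge(elapsedMillis, 1L, Long::sum);
    }

    long count(long elapsedMillis) {
        return counts.getOrDefault(elapsedMillis, 0L);
    }

    // Many threads record the same key; returns the final count for that key.
    static long hammer(int threads, int recordsPerThread) {
        SafeHistogram histogram = new SafeHistogram();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < recordsPerThread; i++) {
                    histogram.record(100L);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return histogram.count(100L);
    }

    public static void main(String[] args) {
        // With atomic updates no increment is lost: 8 threads * 10000 records each.
        System.out.println(hammer(8, 10_000));
    }
}
```

With the get-then-put pattern on a plain HashMap, the same experiment can lose increments in exactly the way described above.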

Need a hand understanding this Java code please :-)

Just wondering if anyone would be able to take a look at this code for implementing the quicksort algorithm and answer me a few questions, please :-)
public class Run
{
/***************************************************************************
* Quicksort code from Sedgewick 7.1, 7.2.
**************************************************************************/
public static void quicksort(double[] a)
{
//shuffle(a); // to guard against worst-case
quicksort(a, 0, a.length - 1, 0);
}
static void quicksort(final double[] a, final int left, final int right, final int tdepth)
{
if (right <= left)
return;
final int i = partition(a, left, right);
if ((tdepth < 4) && ((i - left) > 1000))
{
final Thread t = new Thread()
{
public void run()
{
quicksort(a, left, i - 1, tdepth + 1);
}
};
t.start();
quicksort(a, i + 1, right, tdepth + 1);
try
{
t.join();
}
catch (InterruptedException e)
{
throw new RuntimeException("Cancelled", e);
}
} else
{
quicksort(a, left, i - 1, tdepth);
quicksort(a, i + 1, right, tdepth);
}
}
// partition a[left] to a[right], assumes left < right
private static int partition(double[] a, int left, int right)
{
int i = left - 1;
int j = right;
while (true)
{
while (less(a[++i], a[right]))
// find item on left to swap
; // a[right] acts as sentinel
while (less(a[right], a[--j]))
// find item on right to swap
if (j == left)
break; // don't go out-of-bounds
if (i >= j)
break; // check if pointers cross
exch(a, i, j); // swap two elements into place
}
exch(a, i, right); // swap with partition element
return i;
}
// is x < y ?
private static boolean less(double x, double y)
{
return (x < y);
}
// exchange a[i] and a[j]
private static void exch(double[] a, int i, int j)
{
double swap = a[i];
a[i] = a[j];
a[j] = swap;
}
// shuffle the array a[]
private static void shuffle(double[] a)
{
int N = a.length;
for (int i = 0; i < N; i++)
{
int r = i + (int) (Math.random() * (N - i)); // between i and N-1
exch(a, i, r);
}
}
// test client
public static void main(String[] args)
{
int N = 5000000; // Integer.parseInt(args[0]);
// generate N random real numbers between 0 and 1
long start = System.currentTimeMillis();
double[] a = new double[N];
for (int i = 0; i < N; i++)
a[i] = Math.random();
long stop = System.currentTimeMillis();
double elapsed = (stop - start) / 1000.0;
System.out.println("Generating input: " + elapsed + " seconds");
// sort them
start = System.currentTimeMillis();
quicksort(a);
stop = System.currentTimeMillis();
elapsed = (stop - start) / 1000.0;
System.out.println("Quicksort: " + elapsed + " seconds");
}
}
My questions are:
What is the purpose of the variable tdepth?
Is this considered a "proper" implementation of a parallel quicksort? I ask because it doesn't use implements Runnable or extends Thread...
If it doesn't already, is it possible to modify this code to use multiple threads? By passing in the number of threads you want to use as a parameter, for example...?
Many thanks,
Brian

1. It's used to keep track of recursion depth. This is checked to decide whether to run in parallel. Notice how when the function runs in parallel it passes tdepth + 1 (which becomes tdepth in the called quicksort's parameters). This is a basic way of avoiding too many parallel threads.
2. Yes, it's definitely using another thread. The code:
new Thread()
{
public void run()
{
quicksort(a, left, i - 1, tdepth + 1);
}
};
creates an anonymous inner class (which extends Thread), which is then started.

Apparently, tdepth is used to avoid creating too many threads
It uses an anonymous class, which implicitly extends Thread
It does that already (see point 1.)

tdepth is there so that there's an upper bound on the number of threads created. Note that every time the method calls itself recursively (which is done in a new thread), tdepth is incremented by one. This way, only the first four levels of recursion will create new threads, presumably to prevent overloading the OS with many threads for little benefit.
This code launches its own threads in the definition of the quicksort method, so it will use parallel processing. One might argue that it could do with some kind of thread management and that e.g. some kind of Executor might be better, but it is definitely parallel. See the call to new Thread() ... followed by start(). Incidentally, the call to t.join() will cause the current thread to wait for the thread t to finish, in case you weren't aware of that.
This code already uses multiple threads, but you can tweak how many it spawns through the comparison on tdepth; increasing or decreasing the value determines how many levels of recursion create threads. You could completely rewrite the code to use executors and thread pools, or perhaps to perform ternary recursion instead of binary, but I suspect that, in the sense you asked, no, there's no simple way to tweak the number of threads.

I actually wrote a (correctly) multi-threaded QuickSort in Java, so maybe I can help a bit...
The question is here, for anyone interested:
Multithreaded quicksort or mergesort
What is the purpose of the variable tdepth?
As others have commented, it serves to determine whether to create new threads or not.
Is this considered a "proper" implementation of a parallel quicksort? I ask because it doesn't use implements Runnable or extends Thread...
I don't think it's that proper, for several reasons. First, you should make it CPU dependent. There's no point in spawning 16 threads on a CPU that has just one core: a single-threaded QuickSort will outperform the multi-threaded one on a single-core machine. On a 16-core machine, sure, fire up to 16 threads.
Runtime.getRuntime().availableProcessors()
The second reason I really don't like it is that it is using last-century low-level Java idiosyncrasish threading details: I prefer to stay away from .join() and use higher-level things (see fork/join in the other question, or something like CountDownLatch, etc.). The problem with low-level things like Java's thread "join" is that it carries no useful meaning: it is 100% Java specific and can be replaced by higher-level threading facilities whose concepts are portable across languages.
Then, don't comment out the shuffle at the beginning. Ever. I've seen datasets where QuickSort degrades to quadratic time if you remove that shuffle. And it's just an O(n) shuffle, which won't slow down your sort :)
If it doesn't already, is it possible to modify this code to use multiple threads? By passing in the number of threads you want to use as a parameter, for example...?
I'd try to write and/or reuse an implementation using higher-level concurrency facilities. See the advice in the question I asked here some time ago.
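As a rough illustration of what "higher-level concurrency facilities" could mean here, the sketch below uses a ForkJoinPool whose parallelism is passed in as a parameter, defaulting to Runtime.getRuntime().availableProcessors(). It is not the implementation from the linked question: the class name ForkJoinQuicksort, the threshold of 1000 (borrowed from the check in the original code), and the random-pivot Lomuto partition (used in place of the up-front shuffle) are all assumptions.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.ThreadLocalRandom;

// Sketch only: hypothetical class; quicksort on a fork/join pool of configurable size.
final class ForkJoinQuicksort extends RecursiveAction {
    // Ranges smaller than this are sorted sequentially (assumed value, tune as needed).
    private static final int SEQUENTIAL_THRESHOLD = 1000;
    private final double[] a;
    private final int left;
    private final int right;

    ForkJoinQuicksort(double[] a, int left, int right) {
        this.a = a;
        this.left = left;
        this.right = right;
    }

    // Sorts the array using a pool with the requested number of worker threads.
    static void sort(double[] a, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            pool.invoke(new ForkJoinQuicksort(a, 0, a.length - 1));
        } finally {
            pool.shutdown();
        }
    }

    @Override
    protected void compute() {
        if (right - left < SEQUENTIAL_THRESHOLD) {
            sequentialSort(a, left, right);
            return;
        }
        int p = partition(a, left, right);
        // Both halves become tasks; invokeAll returns when both are done.
        invokeAll(new ForkJoinQuicksort(a, left, p - 1),
                  new ForkJoinQuicksort(a, p + 1, right));
    }

    static void sequentialSort(double[] a, int left, int right) {
        if (right <= left) {
            return;
        }
        int p = partition(a, left, right);
        sequentialSort(a, left, p - 1);
        sequentialSort(a, p + 1, right);
    }

    // Lomuto partition around a randomly chosen pivot; returns the pivot's final index.
    static int partition(double[] a, int left, int right) {
        swap(a, ThreadLocalRandom.current().nextInt(left, right + 1), right);
        double pivot = a[right];
        int store = left;
        for (int i = left; i < right; i++) {
            if (a[i] < pivot) {
                swap(a, i, store++);
            }
        }
        swap(a, store, right);
        return store;
    }

    static void swap(double[] a, int i, int j) {
        double tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        double[] a = new double[5_000_000];
        for (int i = 0; i < a.length; i++) {
            a[i] = Math.random();
        }
        long start = System.currentTimeMillis();
        sort(a, Runtime.getRuntime().availableProcessors());
        System.out.println("Quicksort: " + (System.currentTimeMillis() - start) / 1000.0 + " seconds");
    }
}
```

Passing 1 as the parallelism gives a single worker thread, which makes it easy to compare against larger values on the same machine.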
