Parallelizing Sieve of Eratosthenes in Java

I am trying to make a parallel implementation of the Sieve of Eratosthenes. I made a boolean list which gets filled with true values for the given size. Whenever a prime is found, all multiples of that prime are marked false in the boolean list.
The way I am trying to make this algorithm parallel is by firing up a new thread while still filtering the initial prime number. For example, the algorithm starts with prime = 2. In the for loop for the filter, when i reaches prime * prime, I start another for loop in which every number between the prime (2) and prime * prime (4) is checked. If that index in the boolean list is still true, I fire up another thread to filter that prime number.
The nested for loop creates more and more overhead as the primes to filter progress, so I limited this nested loop to primes < 100. I am assuming that by that time, the 100 million numbers will be somewhat filtered. The problem here is that this way, the primes to be filtered stay just under 9500, while the algorithm should continue up to prime = 10000 (the point where prime * prime reaches the size of 100m). I also think this is not at all the correct way to go about it. I have searched a lot online, but didn't manage to find any examples of parallel Java implementations of the sieve.
My code looks like this:
Main class:
public class Main {

    private static ListenableQueue<Integer> queue = new ListenableQueue<>(new LinkedList<>());
    private static ArrayList<Integer> primes = new ArrayList<>();
    private static boolean serialList[];
    private static ArrayList<Integer> serialPrimes = new ArrayList<>();
    private static ExecutorService exec = Executors.newFixedThreadPool(10);
    private static int size = 100000000;
    private static boolean list[] = new boolean[size];
    private static int lastPrime = 2;

    public static void main(String[] args) {
        Arrays.fill(list, true);
        parallel();
    }

    public static void parallel() {
        Long startTime = System.nanoTime();
        int firstPrime = 2;
        exec.submit(new Runner(size, list, firstPrime));
    }

    public static void parallelSieve(int size, boolean[] list, int prime) {
        int queuePrimes = 0;
        for (int i = prime; i * prime <= size; i++) {
            try {
                list[i * prime] = false;
                if (prime < 100) {
                    if (i == prime * prime && queuePrimes <= 1) {
                        for (int j = prime + 1; j < i; j++) {
                            if (list[j] && j % prime != 0 && j > lastPrime) {
                                lastPrime = j;
                                startNewThread(j);
                                queuePrimes++;
                            }
                        }
                    }
                }
            } catch (ArrayIndexOutOfBoundsException ignored) { }
        }
    }

    private static void startNewThread(int newPrime) {
        if ((newPrime * newPrime) < size) {
            exec.submit(new Runner(size, list, newPrime));
        } else {
            exec.shutdown();
            for (int i = 2; i < list.length; i++) {
                if (list[i]) {
                    primes.add(i);
                }
            }
        }
    }
}
Runner class:
public class Runner implements Runnable {

    private int arraySize;
    private boolean[] list;
    private int k;

    public Runner(int arraySize, boolean[] list, int k) {
        this.arraySize = arraySize;
        this.list = list;
        this.k = k;
    }

    @Override
    public void run() {
        Main.parallelSieve(arraySize, list, k);
    }
}
I feel like there is a much simpler way to solve this...
Do you guys have any suggestions as to how I can make this parallelization work, and maybe make it a bit simpler?

Creating a performant concurrent implementation of an algorithm like the Sieve of Eratosthenes is somewhat more difficult than creating a performant single-threaded implementation. The reason is that you need to find a way to partition the work in a way that minimises communication and interference between the parallel worker threads.
If you achieve complete isolation then you can hope for a speed increase approaching the number of logical processors available, or about one order of magnitude on a typical modern PC. By contrast, using a decent single-threaded implementation of the sieve will give you a speedup of at least two to three orders of magnitude. One simple cop-out would be to simply load the data from a file when needed, or to shell out to a decent prime-sieving program like Kim Walisch's PrimeSieve.
Even if we only want to look at the parallelisation problem, it is still necessary to have some insight into the algorithm itself and into the machine it runs on.
The most important aspect is that modern computers have deep cache hierarchies where only the L1 cache - typically 32 KB - is accessible at full speed and all other memory accesses incur significant penalties. Translated to the Sieve of Eratosthenes this means that you need to sieve your target range one 32 KB window at a time, instead of striding each prime over many megabytes. The small primes up to the square root of the target range end must be sieved before the parallel dance begins, but then each segment or window can be sieved independently.
Sieving a given window or segment necessitates determining the start offsets for the small primes that you want to sieve by, which means at least one modulo division per small prime per window, and division is an extremely slow operation. However, if you sieve consecutive segments instead of arbitrary windows placed anywhere in the range, then you can keep the end offsets for each prime in a vector and use them as start offsets for the next segment, thus eliminating the expensive computation of the start offset.
Thus, one promising parallelisation strategy for the Sieve of Eratosthenes would be to give each worker thread a contiguous group of 32 KB blocks to sieve, so that the start offset calculation needs to happen only once per worker. This way there cannot be memory access contention between workers, since each has its own independent subrange of the target range.
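A minimal sketch of that strategy (names like sieveSubrange and smallPrimes are illustrative, not from your code; smallPrimes is assumed to already hold all primes up to the square root of end, and begin >= 2): each worker owns one contiguous subrange and sieves it one 32 KB window at a time, carrying each prime's next multiple from window to window so the start-offset division happens only once per worker. It is sized for ranges like your 1e8, not for limits near Integer.MAX_VALUE.
// Hypothetical sketch: one worker's contiguous subrange, sieved window by window.
static int sieveSubrange(int[] smallPrimes, int begin, int end) {
    final int WINDOW = 1 << 15; // 32 KB, sized to stay within the L1 cache
    long[] next = new long[smallPrimes.length];
    for (int k = 0; k < smallPrimes.length; k++) {
        long p = smallPrimes[k];
        next[k] = Math.max(p * p, (begin + p - 1) / p * p); // one division, up front
    }
    int count = 0;
    boolean[] window = new boolean[WINDOW];
    for (int low = begin; low <= end; low += WINDOW) {
        int high = (int) Math.min((long) low + WINDOW - 1, end);
        java.util.Arrays.fill(window, false);
        for (int k = 0; k < smallPrimes.length; k++) {
            long m = next[k];
            for (int p = smallPrimes[k]; m <= high; m += p)
                window[(int) (m - low)] = true;
            next[k] = m; // carried into the next window
        }
        for (int i = low; i <= high; i++)
            if (!window[i - low]) count++;
    }
    return count;
}
Splitting [2, n] into one such subrange per worker and submitting the calls to an ExecutorService then gives contention-free parallel sieving; the per-worker counts are summed at the end.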
However, before you begin to parallelise - i.e., make your code more complex - you should first slim it down and reduce the work to be done to the absolute essentials. For example, take a look at this fragment from your code:
for (int i = prime; i * prime <= size; i++)
    list[i * prime] = false;
Instead of recomputing the loop bound in every iteration and indexing with a multiplication, check the loop variable against a precomputed, loop-invariant value and reduce the multiplication to iterated addition (note the bound o < size, since the array holds size elements):
for (int o = prime * prime; o < size; o += prime)
    list[o] = false;
There are two simple sieve-specific optimisations that can give significant speed boosts.
1) Leave the even numbers out of your sieve and pull the prime 2 out of thin air when needed. Bingo, you just doubled your performance.
2) Instead of sieving each segment by the small odd primes 3, 5, 7 and so on, blast a precomputed pattern over the segment (or even the whole range). This saves time because these small primes make many, many steps in each segment and account for the lion's share of sieving time.
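To illustrate point 1, here is a hedged sketch of an odds-only counting sieve, independent of your code: slot i stands for the odd number 2*i + 1, and the prime 2 is added to the count by fiat.
// Odds-only sieve sketch; assumes n >= 2.
static int countPrimesOddsOnly(int n) {
    int half = (n - 1) / 2; // odd candidates 3, 5, ..., up to n
    boolean[] composite = new boolean[half + 1];
    int count = 1; // the prime 2, pulled out of thin air
    for (int i = 1; i <= half; i++) {
        if (!composite[i]) {
            count++;
            long p = 2L * i + 1; // the odd number this slot represents
            for (long j = p * p; j <= n; j += 2 * p) // odd multiples only
                composite[(int) ((j - 1) / 2)] = true;
        }
    }
    return count;
}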
There are more possible optimisations including a couple more low-hanging fruit but either the returns are diminishing or the effort curve rises steeply. Try searching Code Review for 'sieve'. Also, don't forget that you're fighting a Java compiler in addition to the algorithmic problem and the machine architecture, i.e. things like array bounds checking which your compiler may or may not be able to hoist out of loops.
To give you a ballpark figure: a single-threaded segmented odds-only sieve with precomputed patterns can sieve the whole 32-bit range in 2 to 4 seconds in C#, depending on how much TLC you apply in addition to things mentioned above. Your much smaller problem of primes up to 100000000 (1e8) is solved in less than 100 ms on my aging notebook.
Here's some code that shows how windowed sieving works. For clarity I left off all optimisations like odds-only representation or wheel-3 stepping when reading out the primes and so on. It's C# but that should be similar enough to Java to be readable.
Note: I called the sieve array eliminated because a true value indicates a crossed-off number (saves filling the array with all true at the beginning and it is more logical anyway).
static List<uint> small_primes_between (uint m, uint n)
{
    m = Math.Max(m, 2);
    if (m > n)
        return new List<uint>();
    Trace.Assert(n - m < int.MaxValue);
    uint sieve_bits = n - m + 1;
    var eliminated = new bool[sieve_bits];
    foreach (uint prime in small_primes_up_to((uint)Math.Sqrt(n)))
    {
        uint start = prime * prime, stride = prime;
        if (start >= m)
            start -= m;
        else
            start = (stride - 1) - (m - start - 1) % stride;
        for (uint j = start; j < sieve_bits; j += stride)
            eliminated[j] = true;
    }
    return remaining_numbers(eliminated, m);
}
//---------------------------------------------------------------------------------------------
static List<uint> remaining_numbers (bool[] eliminated, uint sieve_base)
{
    var result = new List<uint>();
    for (uint i = 0, e = (uint)eliminated.Length; i < e; ++i)
        if (!eliminated[i])
            result.Add(sieve_base + i);
    return result;
}
//---------------------------------------------------------------------------------------------
static List<uint> small_primes_up_to (uint n)
{
    Trace.Assert(n < int.MaxValue); // size_t is int32_t in .Net (!)
    var eliminated = new bool[n + 1]; // +1 because indexed by numbers
    eliminated[0] = true;
    eliminated[1] = true;
    for (uint i = 2, sqrt_n = (uint)Math.Sqrt(n); i <= sqrt_n; ++i)
        if (!eliminated[i])
            for (uint j = i * i; j <= n; j += i)
                eliminated[j] = true;
    return remaining_numbers(eliminated, 0);
}

Related

Multithreaded Segmented Sieve of Eratosthenes in Java

I am trying to create a fast prime generator in Java. It is (more or less) accepted that the fastest way for this is the segmented sieve of Eratosthenes: https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes. Lots of optimizations can be further implemented to make it faster. As of now, my implementation generates 50847534 primes below 10^9 in about 1.6 seconds, but I am looking to make it faster and at least break the 1 second barrier. To increase the chance of getting good replies, I will include a walkthrough of the algorithm as well as the code.
Still, as a TL;DR, I am looking to include multi-threading into the code
For the purposes of this question, I want to distinguish between the 'segmented' and the 'traditional' sieves of Eratosthenes. The traditional sieve requires O(n) space and therefore is very limited in the range of the input (the limit of it). The segmented sieve, however, only requires O(n^0.5) space and can operate on much larger limits. (A main speed-up is using a cache-friendly segmentation, taking into account the L1 & L2 cache sizes of the specific computer.) Finally, the main difference that concerns my question is that the traditional sieve is sequential, meaning it can only continue once the previous steps are completed. The segmented sieve, however, is not: each segment is independent and is 'processed' individually against the sieving primes (the primes not larger than n^0.5). This means that theoretically, once I have the sieving primes, I can divide the work between multiple computers, each processing a different segment; the work of each is independent of the others. Assuming (wrongly) that each segment requires the same amount of time t to complete, and there are k segments, one computer would require a total time of T = k * t, whereas k computers, each working on a different segment, would require a total time of T = t to complete the entire process. (Practically, this is wrong, but for the sake of simplicity of the example.)
This brought me to reading about multithreading - dividing the work to a few threads each processing a smaller amount of work for better usage of CPU. To my understanding, the traditional sieve cannot be multithreaded exactly because it is sequential. Each thread would depend on the previous, rendering the entire idea unfeasible. But a segmented sieve may indeed (I think) be multithreaded.
Instead of jumping straight into my question, I think it is important to introduce my code first, so I am hereby including my current fastest implementation of the segmented sieve. I have worked quite hard on it. It took quite some time, slowly tweaking and adding optimizations to it. The code is not simple. It is rather complex, I would say. I therefore assume the reader is familiar with the concepts I am introducing, such as wheel factorization, prime numbers, segmentation and more. I have included notes to make it easier to follow.
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
public class primeGen {
public static long x = (long)Math.pow(10, 9); //limit
public static int sqrtx;
public static boolean [] sievingPrimes; //the sieving primes, <= sqrtx
public static int [] wheels = new int [] {2,3,5,7,11,13,17,19}; // base wheel primes
public static int [] gaps; //the gaps, according to the wheel. will enable skipping multiples of the wheel primes
public static int nextp; // the first prime > wheel primes
public static int l; // the amount of gaps in the wheel
public static void main(String[] args)
{
long startTime = System.currentTimeMillis();
preCalc(); // creating the sieving primes and calculating the list of gaps
int segSize = Math.max(sqrtx, 32768*8); //size of each segment
long u = nextp; // 'u' is the running index of the program. will continue from one segment to the next
int wh = 0; // the will be the gap index, indicating by how much we increment 'u' each time, skipping the multiples of the wheel primes
long pi = pisqrtx(); // the primes count. initialize with the number of primes <= sqrtx
for (long low = 0 ; low < x ; low += segSize) //the heart of the code. enumerating the primes through segmentation. enumeration will begin at p > sqrtx
{
long high = Math.min(x, low + segSize);
boolean [] segment = new boolean [(int) (high - low + 1)];
int g = -1;
for (int i = nextp ; i <= sqrtx ; i += gaps[g])
{
if (sievingPrimes[(i + 1) / 2])
{
long firstMultiple = (long) (low / i * i);
if (firstMultiple < low)
firstMultiple += i;
if (firstMultiple % 2 == 0) //start with the first odd multiple of the current prime in the segment
firstMultiple += i;
for (long j = firstMultiple ; j < high ; j += i * 2)
segment[(int) (j - low)] = true;
}
g++;
//if (g == l) //due to segment size, the full list of gaps is never used **within just one segment** , and therefore this check is redundant.
//should be used with bigger segment sizes or smaller lists of gaps
//g = 0;
}
while (u <= high)
{
if (!segment[(int) (u - low)])
pi++;
u += gaps[wh];
wh++;
if (wh == l)
wh = 0;
}
}
System.out.println(pi);
long endTime = System.currentTimeMillis();
System.out.println("Solution took "+(endTime - startTime) + " ms");
}
public static boolean [] simpleSieve (int l)
{
long sqrtl = (long)Math.sqrt(l);
boolean [] primes = new boolean [l/2+2];
Arrays.fill(primes, true);
int g = -1;
for (int i = nextp ; i <= sqrtl ; i += gaps[g])
{
if (primes[(i + 1) / 2])
for (int j = i * i ; j <= l ; j += i * 2)
primes[(j + 1) / 2]=false;
g++;
if (g == l)
g=0;
}
return primes;
}
public static long pisqrtx ()
{
int pi = wheels.length;
if (x < wheels[wheels.length-1])
{
if (x < 2)
return 0;
int k = 0;
while (wheels[k] <= x)
k++;
return k;
}
int g = -1;
for (int i = nextp ; i <= sqrtx ; i += gaps[g])
{
if(sievingPrimes[( i + 1 ) / 2])
pi++;
g++;
if (g == l)
g=0;
}
return pi;
}
public static void preCalc ()
{
sqrtx = (int) Math.sqrt(x);
int prod = 1;
for (long p : wheels)
prod *= p; // primorial
nextp = BigInteger.valueOf(wheels[wheels.length-1]).nextProbablePrime().intValue(); //the first prime that comes after the wheel
int lim = prod + nextp; // circumference of the wheel
boolean [] marks = new boolean [lim + 1];
Arrays.fill(marks, true);
for (int j = 2 * 2 ;j <= lim ; j += 2)
marks[j] = false;
for (int i = 1 ; i < wheels.length ; i++)
{
int p = wheels[i];
for (int j = p * p ; j <= lim ; j += 2 * p)
marks[j]=false; // removing all integers that are NOT comprime with the base wheel primes
}
ArrayList <Integer> gs = new ArrayList <Integer>(); //list of the gaps between the integers that are coprime with the base wheel primes
int d = nextp;
for (int p = d + 2 ; p < marks.length ; p += 2)
{
if (marks[p]) //d is prime. if p is also prime, then a gap is identified, and is noted.
{
gs.add(p - d);
d = p;
}
}
gaps = new int [gs.size()];
for (int i = 0 ; i < gs.size() ; i++)
gaps[i] = gs.get(i); // Arrays are faster than lists, so moving the list of gaps to an array
l = gaps.length;
sievingPrimes = simpleSieve(sqrtx); //initializing the sieving primes
}
}
Currently, it produces 50847534 primes below 10^9 in about 1.6 seconds. This is very impressive, at least by my standards, but I am looking to make it faster, possibly break the 1 second barrier. Even then, I believe it can be made much faster still.
The whole program is based on wheel factorization: https://en.wikipedia.org/wiki/Wheel_factorization. I have noticed I am getting the fastest results using a wheel of all primes up to 19.
public static int [] wheels = new int [] {2,3,5,7,11,13,17,19}; // base wheel primes
This means that the multiples of those primes are skipped, resulting in a much smaller search range. The gaps between the numbers which we need to take are then calculated in the preCalc method. If we make those jumps between the numbers in the search range, we skip the multiples of the base primes.
public static void preCalc ()
{
sqrtx = (int) Math.sqrt(x);
int prod = 1;
for (long p : wheels)
prod *= p; // primorial
nextp = BigInteger.valueOf(wheels[wheels.length-1]).nextProbablePrime().intValue(); //the first prime that comes after the wheel
int lim = prod + nextp; // circumference of the wheel
boolean [] marks = new boolean [lim + 1];
Arrays.fill(marks, true);
for (int j = 2 * 2 ;j <= lim ; j += 2)
marks[j] = false;
for (int i = 1 ; i < wheels.length ; i++)
{
int p = wheels[i];
for (int j = p * p ; j <= lim ; j += 2 * p)
marks[j]=false; // removing all integers that are NOT comprime with the base wheel primes
}
ArrayList <Integer> gs = new ArrayList <Integer>(); //list of the gaps between the integers that are coprime with the base wheel primes
int d = nextp;
for (int p = d + 2 ; p < marks.length ; p += 2)
{
if (marks[p]) //d is prime. if p is also prime, then a gap is identified, and is noted.
{
gs.add(p - d);
d = p;
}
}
gaps = new int [gs.size()];
for (int i = 0 ; i < gs.size() ; i++)
gaps[i] = gs.get(i); // Arrays are faster than lists, so moving the list of gaps to an array
l = gaps.length;
sievingPrimes = simpleSieve(sqrtx); //initializing the sieving primes
}
At the end of the preCalc method, the simpleSieve method is called, efficiently sieving all the sieving primes mentioned before, the primes <= sqrtx. This is a simple Eratosthenes sieve, rather than segmented, but it is still based on the wheel factorization previously computed.
public static boolean [] simpleSieve (int l)
{
long sqrtl = (long)Math.sqrt(l);
boolean [] primes = new boolean [l/2+2];
Arrays.fill(primes, true);
int g = -1;
for (int i = nextp ; i <= sqrtl ; i += gaps[g])
{
if (primes[(i + 1) / 2])
for (int j = i * i ; j <= l ; j += i * 2)
primes[(j + 1) / 2]=false;
g++;
if (g == l)
g=0;
}
return primes;
}
Finally, we reach the heart of the algorithm. We start by enumerating all primes <= sqrtx, with the following call:
long pi = pisqrtx();
which uses the following method:
public static long pisqrtx ()
{
int pi = wheels.length;
if (x < wheels[wheels.length-1])
{
if (x < 2)
return 0;
int k = 0;
while (wheels[k] <= x)
k++;
return k;
}
int g = -1;
for (int i = nextp ; i <= sqrtx ; i += gaps[g])
{
if(sievingPrimes[( i + 1 ) / 2])
pi++;
g++;
if (g == l)
g=0;
}
return pi;
}
Then, after initializing the pi variable which keeps track of the enumeration of primes, we perform the mentioned segmentation, starting the enumeration from the first prime > sqrtx:
int segSize = Math.max(sqrtx, 32768*8); //size of each segment
long u = nextp; // 'u' is the running index of the program. will continue from one segment to the next
int wh = 0; // the will be the gap index, indicating by how much we increment 'u' each time, skipping the multiples of the wheel primes
long pi = pisqrtx(); // the primes count. initialize with the number of primes <= sqrtx
for (long low = 0 ; low < x ; low += segSize) //the heart of the code. enumerating the primes through segmentation. enumeration will begin at p > sqrtx
{
long high = Math.min(x, low + segSize);
boolean [] segment = new boolean [(int) (high - low + 1)];
int g = -1;
for (int i = nextp ; i <= sqrtx ; i += gaps[g])
{
if (sievingPrimes[(i + 1) / 2])
{
long firstMultiple = (long) (low / i * i);
if (firstMultiple < low)
firstMultiple += i;
if (firstMultiple % 2 == 0) //start with the first odd multiple of the current prime in the segment
firstMultiple += i;
for (long j = firstMultiple ; j < high ; j += i * 2)
segment[(int) (j - low)] = true;
}
g++;
//if (g == l) //due to segment size, the full list of gaps is never used **within just one segment** , and therefore this check is redundant.
//should be used with bigger segment sizes or smaller lists of gaps
//g = 0;
}
while (u <= high)
{
if (!segment[(int) (u - low)])
pi++;
u += gaps[wh];
wh++;
if (wh == l)
wh = 0;
}
}
I have also included it as a note, but will explain as well. Because the segment size is relatively small, we will not go through the entire list of gaps within just one segment, so checking for it is redundant (assuming we use a 19-wheel). But in a broader scope overview of the program, we will make use of the entire array of gaps, so the variable u has to follow it and not accidentally surpass it:
while (u <= high)
{
if (!segment[(int) (u - low)])
pi++;
u += gaps[wh];
wh++;
if (wh == l)
wh = 0;
}
Using higher limits will eventually render a bigger segment, which might result in a necessity of checking that we don't surpass the gaps list even within a segment. This, or tweaking the wheel primes base, might have this effect on the program. Switching to bit-sieving can largely improve the segment limit though.
As an important side-note, I am aware that efficient segmentation is one that takes the L1 & L2 cache sizes into account. I get the fastest results using a segment size of 32,768 * 8 = 262,144 = 2^18. I am not sure what the cache size of my computer is, but I do not think it can be that big, as I see most cache sizes <= 32,768. Still, this produces the fastest run time on my computer, so this is why it's the chosen segment size.
As I mentioned, I am still looking to improve this by a lot. I believe, according to my introduction, that multithreading can result in a speed-up factor of 4, using 4 threads (corresponding to 4 cores). The idea is that each thread will still use the idea of the segmented sieve, but work on different portions: divide the n into 4 equal portions, each thread in turn performing the segmentation on the n/4 elements it is responsible for, using the above program. My question is how do I do that? Reading about multithreading and examples, unfortunately, did not bring me any insight on how to implement it efficiently in the case above. It seems to me, as opposed to the logic behind it, that the threads were running sequentially rather than simultaneously. This is why I excluded it from the code, to make it more readable. I will really appreciate a code sample on how to do it in this specific code, but a good explanation and reference will maybe do the trick too.
Additionally, I would like to hear about more ways of speeding up this program even more; any ideas you have, I would love to hear! I really want to make it very fast and efficient. Thank you!
An example like this should help you get started.
An outline of a solution:
Define a data structure ("Task") that encompasses a specific segment; you can put all the immutable shared data into it for extra neatness, too. If you're careful enough, you can pass a common mutable array to all tasks, along with the segment limits, and only update the part of the array within these limits. This is more error-prone, but can simplify the step of joining the results (AFAICT; YMMV).
Define a data structure ("Result") that stores the result of a Task computation. Even if you just update a shared resulting structure, you may need to signal which part of that structure has been updated so far.
Create a Runnable that accepts a Task, runs a computation, and puts the results into a given result queue.
Create a blocking input queue for Tasks, and a queue for Results.
Create a ThreadPoolExecutor with the number of threads close to the number of machine cores.
Submit all your Tasks to the thread pool executor. They will be scheduled to run on the threads from the pool, and will put their results into the output queue, not necessarily in order.
Wait for all the tasks in the thread pool to finish.
Drain the output queue and join the partial results into the final result.
Extra speedup may (or may not) be achieved by joining the results in a separate task that reads the output queue, or even by updating a mutable shared output structure under synchronized, depending on how much work the joining step involves.
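Here is a minimal sketch of the outline above, with illustrative names (Task, Result, sieveSegment); the segment arithmetic is my own assumption, not taken from the original post:
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class SegmentedSieveTasks {
    // Task: one independent segment plus the shared, immutable sieving primes.
    static final class Task {
        final long low, high;
        final int[] sievingPrimes;
        Task(long low, long high, int[] sievingPrimes) {
            this.low = low; this.high = high; this.sievingPrimes = sievingPrimes;
        }
    }
    // Result: how many primes one segment contained.
    static final class Result {
        final long low, count;
        Result(long low, long count) { this.low = low; this.count = count; }
    }

    static void sieveSegment(Task t, BlockingQueue<Result> results) {
        boolean[] composite = new boolean[(int) (t.high - t.low + 1)];
        for (int p : t.sievingPrimes) {
            long start = Math.max((long) p * p, (t.low + p - 1) / p * p);
            for (long m = start; m <= t.high; m += p)
                composite[(int) (m - t.low)] = true;
        }
        long count = 0;
        for (boolean c : composite) if (!c) count++;
        results.add(new Result(t.low, count));
    }

    public static void main(String[] args) throws InterruptedException {
        long limit = 1_000_000_000L;
        int[] sievingPrimes = smallPrimes((int) Math.sqrt(limit));
        int segSize = 1 << 18;
        BlockingQueue<Result> results = new LinkedBlockingQueue<>();
        ExecutorService pool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        int tasks = 0;
        for (long low = 2; low <= limit; low += segSize, tasks++) {
            Task t = new Task(low, Math.min(low + segSize - 1, limit), sievingPrimes);
            pool.submit(() -> sieveSegment(t, results));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        long pi = 0;
        for (int i = 0; i < tasks; i++) pi += results.take().count;
        System.out.println(pi); // 50847534 for 10^9
    }

    static int[] smallPrimes(int n) { // plain single-threaded sieve for p <= sqrt(limit)
        boolean[] composite = new boolean[n + 1];
        int[] ps = new int[n > 10 ? n : 10];
        int cnt = 0;
        for (int p = 2; p <= n; p++)
            if (!composite[p]) {
                ps[cnt++] = p;
                for (long m = (long) p * p; m <= n; m += p) composite[(int) m] = true;
            }
        return java.util.Arrays.copyOf(ps, cnt);
    }
}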
Hope this helps.
Are you familiar with the work of Tomas Oliveira e Silva? He has a very fast implementation of the Sieve of Eratosthenes.
How interested in speed are you? Would you consider using C++?
$ time ../c_code/segmented_bit_sieve 1000000000
50847534 primes found.
real 0m0.875s
user 0m0.813s
sys 0m0.016s
$ time ../c_code/segmented_bit_isprime 1000000000
50847534 primes found.
real 0m0.816s
user 0m0.797s
sys 0m0.000s
(on my newish laptop with an i5)
The first is from @Kim Walisch, using a bit array of odd prime candidates.
https://github.com/kimwalisch/primesieve/wiki/Segmented-sieve-of-Eratosthenes
The second is my tweak to Kim's, with IsPrime[] also implemented as a bit array, which is slightly less clear to read, although a little faster for big N due to the reduced memory footprint.
I will read your post carefully as I am interested in primes and performance no matter what language is used. I hope this isn't too far off topic or premature. But I noticed I was already beyond your performance goal.

How to optimize algorithm to be able to determine 10 digit long prime numbers in java

I am currently new to Java and programming in general. I'm working on an algorithm that determines the prime numbers in specific given ranges. Currently it works with six ranges, which are numbers under 1 billion; when I tried to determine a 10-digit-long number it failed. I am aware it needs to be changed to long, since the number is out of int range, but I am not sure how.
This is the part of the code where it determines if the number is prime:
public ArrayList<Integer> getPrimes(int StartPos, int n) {
ArrayList<Integer> primeList = new ArrayList<>();
boolean[] primes = new boolean[n + 1];
for (int i = StartPos; i < primes.length; i++) {
primes[i] = true;
}
int num = 2;
while (true) {
for (int i = 2;; i++) {
int m = num * i;
if (m > n) {
break;
} else {
primes[m] = false;
}
}
boolean nextNum = false;
for (int i = num + 1; i < n + 1; i++) {
if (primes[i]) {
num = i;
nextNum = true;
break;
}
}
if (!nextNum) {
break;
}
}
for (int i = 0; i < primes.length; i++) {
if (primes[i]) {
primeList.add(i);
}
}
return primeList;
}
I was checking online and found that perhaps I could do it with vectors, but I have no experience with them, and they are also relatively slower than an array.
You might try BigInteger#isProbablePrime: https://www.tutorialspoint.com/java/math/biginteger_isprobableprime.htm
import java.math.BigInteger;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.LongStream;
// Java 8
public class FindPrimes {
public static void main(String[] args) {
System.out.println("1 - 100 primes: " + findPrimes(1L, 99L));
System.out.println("some 10-digit primes: " + findPrimes(99_999_999_999L, 20000L));
}
private static List<Long> findPrimes(long start, long quant) {
return LongStream.rangeClosed(start, start + quant).filter(v ->
BigInteger.valueOf(v).isProbablePrime(1)).boxed().collect(Collectors.toList());
}
}
You say you are learning, so I will give you an outline in pseudocode, not the answer.
In general the method to find a prime in a range is:
repeat
pick a number in the range
until (the number is prime)
You want a ten digit number. One way to generate a candidate while avoiding obvious non-primes is:
start with digit in [1..9] // No leading zero.
repeat 8 times
append a digit in [0..9]
endrepeat
append a digit in [1, 3, 7, 9] // Final digits for large primes.
That will give you a ten digit possible prime.
Now you need to test to check it is prime.
There are tests like Miller-Rabin that you could try, but probably not if you are a real beginner. I would suggest setting up a Sieve of Eratosthenes covering numbers up to 100,000, which is the square root of your upper limit of 10,000,000,000. That will give you fast access to all the primes below the square root of your number. Set up the sieve once only, at the start of your program. Make it a separate class, and include an int nextPrime(int n) method which returns the next prime after the supplied parameter. Once that is in place, you can write a trial division method to test your ten-digit number:
boolean method isPrime(tenDigitNumber)
testPrime <- 2
limit <- square root of tenDigitNumber // Only calculate this once.
while (testPrime < limit)
if (tenDigitNumber MOD testPrime == 0)
return false // Number is not prime.
else
testPrime <- sieve.nextPrime(testPrime)
endif
endwhile
return true // If we get here then the number is prime.
end isPrime
Because you have set up the sieve in advance, this should run reasonably quickly. If it is too slow, then it is time to look at coding Miller-Rabin or one of the other heavy-duty prime test methods.
As well as the Sieve of Eratosthenes class, another useful utility method is an iSqrt() method that returns an integer square root. My own version uses the Newton-Raphson method, though no doubt there are other possibilities.
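For illustration, a hedged sketch of such an iSqrt() (the name and approach mirror the description above; it seeds from Math.sqrt and corrects with Newton steps, which is plenty for the 10-digit range discussed here):
static long iSqrt(long n) {
    if (n < 2) return n;
    long x = (long) Math.sqrt(n); // floating-point seed
    long next = (x + n / x) / 2;  // Newton step: x' = (x + n/x) / 2
    while (next < x) {
        x = next;
        next = (x + n / x) / 2;
    }
    while (x * x > n) x--;              // correct the seed's
    while ((x + 1) * (x + 1) <= n) x++; // possible off-by-one
    return x;
}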

Making Fibonacci faster [duplicate]

This question already has answers here:
nth fibonacci number in sublinear time
(16 answers)
Closed 7 years ago.
I was required to write a simple implementation of Fibonacci's algorithm and then to make it faster.
Here is my initial implementation
public class Fibonacci {
public static long getFibonacciOf(long n) {
if (n== 0) {
return 0;
} else if (n == 1) {
return 1;
} else {
return getFibonacciOf(n-2) + getFibonacciOf(n-1);
}
}
public static void main(String[] args) {
Scanner scanner = new Scanner (System.in);
while (true) {
System.out.println("Enter n :");
long n = scanner.nextLong();
if (n >= 0) {
long beginTime = System.currentTimeMillis();
long fibo = getFibonacciOf(n);
long endTime = System.currentTimeMillis();
long delta = endTime - beginTime;
System.out.println("F(" + n + ") = " + fibo + " ... computed in " + delta + " milliseconds");
} else {
break;
}
}
}
}
As you can see, I am using System.currentTimeMillis() to get a simple measure of the time elapsed while computing Fibonacci.
This implementation rapidly gets exponentially slow, as you can see in the following picture
So I had a simple optimisation idea: put previous values in a HashMap and, instead of re-computing them each time, simply take them back from the HashMap if they exist. If they don't exist, we put them in the HashMap.
Here is the new version of the code
public class FasterFibonacci {
private static Map<Long, Long> previousValuesHolder;
static {
previousValuesHolder = new HashMap<Long, Long>();
previousValuesHolder.put(Long.valueOf(0), Long.valueOf(0));
previousValuesHolder.put(Long.valueOf(1), Long.valueOf(1));
}
public static long getFibonacciOf(long n) {
if (n== 0) {
return 0;
} else if (n == 1) {
return 1;
} else {
if (previousValuesHolder.containsKey(Long.valueOf(n))) {
return previousValuesHolder.get(n);
} {
long newValue = getFibonacciOf(n-2) + getFibonacciOf(n-1);
previousValuesHolder.put(Long.valueOf(n), Long.valueOf(newValue));
return newValue;
}
}
}
public static void main(String[] args) {
Scanner scanner = new Scanner (System.in);
while (true) {
System.out.println("Enter n :");
long n = scanner.nextLong();
if (n >= 0) {
long beginTime = System.currentTimeMillis();
long fibo = getFibonacciOf(n);
long endTime = System.currentTimeMillis();
long delta = endTime - beginTime;
System.out.println("F(" + n + ") = " + fibo + " ... computed in " + delta + " milliseconds");
} else {
break;
}
}
}
This change makes the computation extremely fast. It computes all the values from 2 to 103 in no time at all, and I get a long overflow at F(104) (it gives me F(104) = -7076989329685730859, which is wrong). I find it so fast that I wonder if there are any mistakes in my code (thank you for checking and letting me know). Please take a look at the second picture:
Is my faster Fibonacci implementation correct (it seems to be, because it gets the same values as the first version, but since the first version was too slow I could not compute bigger values with it, such as F(75))? What other way can I use to make it faster? Or is there a better way to make it faster? Also, how can I compute Fibonacci for greater values (such as 150, 200) without getting a long overflow? Though it seems fast, I would like to push it to the limits. I remember Mr Abrash saying 'The best optimiser is between your two ears', so I believe it can still be improved. Thank you for helping.
[Edit note:] Though this question addresses one of the main points in my question, you can see from above that I have additional issues.
Dynamic programming
Idea: Instead of recomputing the same value multiple times, you just store the values calculated and use them as you go along.
f(n)=f(n-1)+f(n-2) with f(0)=0,f(1)=1.
So at the point when you have calculated f(n-1) you can easily calculate f(n) if you store the values of f(n) and f(n-1).
Let's take an array of Bignums first. A[1..200].
Initialize them to -1.
Pseudocode
fib(n)
{
    if (A[n] != -1) return A[n];
    A[0] = 0;
    A[1] = 1;
    for i = 2 to n
        A[i] = A[i-1] + A[i-2];
    return A[n]
}
This runs in O(n) time. Check it out yourself.
This technique is also called memoization.
The IDEA
Dynamic programming (usually referred to as DP) is a very powerful technique to solve a particular class of problems. It demands a very elegant formulation of the approach and simple thinking, and the coding part is very easy. The idea is very simple: if you have solved a problem with the given input, then save the result for future reference, so as to avoid solving the same problem again. In short: 'Remember your Past'.
If the given problem can be broken up into smaller sub-problems, and these smaller subproblems are in turn divided into still-smaller ones, and in this process you observe some overlapping subproblems, then it's a big hint for DP. Also, the optimal solutions to the subproblems contribute to the optimal solution of the given problem (referred to as the Optimal Substructure Property).
There are two ways of doing this.
1.) Top-Down : Start solving the given problem by breaking it down. If you see that the problem has been solved already, then just return the saved answer. If it has not been solved, solve it and save the answer. This is usually easy to think of and very intuitive. This is referred to as Memoization. (I have used this idea).
2.) Bottom-Up : Analyze the problem and see the order in which the sub-problems are solved and start solving from the trivial subproblem, up towards the given problem. In this process, it is guaranteed that the subproblems are solved before solving the problem. This is referred to as Dynamic Programming. (MinecraftShamrock used this idea)
There's more!
(Other ways to do this)
Look, our quest to get a better solution doesn't end here. You will see a different approach:
If you know how to solve recurrence relations, then you will find a solution to this relation:
f(n) = f(n-1) + f(n-2), given f(0) = 0, f(1) = 1
You will arrive at the following formula after solving it:
f(n)= (1/sqrt(5))((1+sqrt(5))/2)^n - (1/sqrt(5))((1-sqrt(5))/2)^n
which can be written in more compact form
f(n)=floor((((1+sqrt(5))/2)^n) /sqrt(5) + 1/2)
Complexity
You can compute the power of a number in O(log n) operations.
You have to learn exponentiation by squaring.
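For illustration (my sketch, not part of the original answer), exponentiation by squaring for plain longs looks like this, ignoring overflow:
static long power(long base, long exp) { // O(log exp) multiplications
    long result = 1;
    while (exp > 0) {
        if ((exp & 1) == 1) // low bit set: fold the current square in
            result *= base;
        base *= base;       // square for the next bit
        exp >>= 1;
    }
    return result;
}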
EDIT: It is good to point out that this doesn't necessarily mean that the nth Fibonacci number can be found in O(log n); the number of digits we need to calculate actually grows linearly. The note is needed because the statement above might otherwise seem to claim the wrong idea that the nth Fibonacci number can be calculated in O(log n) time.
[Bakurui, MinecraftShamrock commented on this]
If you need to compute the nth Fibonacci number very frequently, I suggest using amalsom's answer.
But if you want to compute a very big fibonacci number, you will run out of memory because you are storing all smaller fibonacci numbers. The following pseudocode only keeps the last two fibonacci numbers in memory, i.e. it requires much less memory:
fibonacci(n) {
if n = 0: return 0;
if n = 1: return 1;
a = 0;
b = 1;
for i from 2 to n: {
sum = a + b;
a = b;
b = sum;
}
return b;
}
Analysis
This can compute very high fibonacci numbers with quite low memory consumption: We have O(n) time as the loop repeats n-1 times. The space complexity is interesting as well: The nth fibonacci number has a length of O(n), which can easily be shown:
F(n) <= 2 * F(n-1)
Which means that the nth fibonacci number is at most twice as big as its predecessor. Doubling a number in binary is equivalent with a single left-shift, which increases the number of necessary bits by one. So representing the nth fibonacci number takes at most O(n) space. We have at most three successive fibonacci numbers in memory which makes O(n) + O(n-1) + O(n-2) = O(n) total space consumption. In contrast to this the memoization algorithm always keeps the first n fibonacci numbers in memory, which makes O(n) + O(n-1) + O(n-2) + ... + O(1) = O(n^2) space consumption.
So which way should one use?
The only reason to keep all lower fibonacci numbers in memory is if you need fibonacci numbers very frequently. It is a question of balancing time with memory consumption.
Get away from the Fibonacci recursion and use the identities
(F(2n), F(2n-1)) = (F(n)^2 + 2 F(n) F(n-1), F(n)^2+F(n-1)^2)
(F(2n+1), F(2n)) = (F(n+1)^2+F(n)^2, 2 F(n+1) F(n) - F(n)^2)
This allows you to compute (F(m+1), F(m)) in terms of (F(k+1), F(k)) for k half the size of m. Written iteratively with some bit shifting for division by 2, this should give you the theoretical O(log n) speed of exponentiation by squaring while staying entirely within integer arithmetic. (Well, O(log n) arithmetic operations. Since you will be working with numbers with roughly n bits, it won't be O(log n) time once you are forced to switch to a large integer library. After F(46) you will overflow a 32-bit int, which only goes up to 2^31 - 1, and after F(92) even a 64-bit long.)
(Apologies for not remembering Java well enough to implement this in Java; anyone who wants to is free to edit it in.)
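Here is a hedged BigInteger sketch based on the identities above, processing the bits of n from the most significant downward while maintaining the pair (F(k+1), F(k)); this is my translation, not the original author's code:
import java.math.BigInteger;

private static BigInteger fib(int n) {
    BigInteger a = BigInteger.ONE;  // F(k+1), starting with k = 0
    BigInteger b = BigInteger.ZERO; // F(k)
    for (int i = 31 - Integer.numberOfLeadingZeros(Math.max(n, 1)); i >= 0; i--) {
        // doubling step, straight from the identities above
        BigInteger f2k1 = a.multiply(a).add(b.multiply(b));                  // F(2k+1) = F(k+1)^2 + F(k)^2
        BigInteger f2k = a.multiply(b).shiftLeft(1).subtract(b.multiply(b)); // F(2k) = 2 F(k+1) F(k) - F(k)^2
        if (((n >>> i) & 1) == 0) { a = f2k1; b = f2k; } // k <- 2k
        else { a = f2k1.add(f2k); b = f2k1; }            // k <- 2k + 1
    }
    return b; // F(n)
}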
Fibonacci(0) = 0
Fibonacci(1) = 1
Fibonacci(n) = Fibonacci(n - 1) + Fibonacci(n - 2), when n >= 2
Usually there are 2 ways to calculate a Fibonacci number:
Recursion:
public long getFibonacci(long n) {
if(n <= 1) {
return n;
} else {
return getFibonacci(n - 1) + getFibonacci(n - 2);
}
}
This way is intuitive and easy to understand, but because it does not reuse calculated Fibonacci numbers, the time complexity is about O(2^n). On the other hand, it does not store calculated results, so it needs no extra storage beyond the recursion stack.
Dynamic Programming:
public long getFibonacci(long n) {
long[] f = new long[(int)(n + 1)];
f[0] = 0;
f[1] = 1;
for(int i=2;i<=n;i++) {
f[i] = f[i - 1] + f[i - 2];
}
return f[(int)n];
}
This dynamic-programming approach stores calculated Fibonacci numbers and reuses them when calculating the next one. The time complexity is pretty good, O(n), while the space complexity is O(n). Let's investigate whether the space complexity can be optimized... Since f(i) only requires f(i - 1) and f(i - 2), there is no need to store all calculated Fibonacci numbers.
The more efficient implementation is:
public long getFibonacci(long n) {
if(n <= 1) {
return n;
}
long x = 0, y = 1;
long ans = 0; // must be initialized, or the compiler rejects the return below
for(int i=2;i<=n;i++) {
ans = x + y;
x = y;
y = ans;
}
return ans;
}
With time complexity O(n), and space complexity O(1).
Added: Since Fibonacci numbers increase amazingly fast, long can only handle the first 93 Fibonacci numbers (F(92) is the largest that fits). In Java, we can use BigInteger to store more Fibonacci numbers.
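For illustration, the same O(1)-state iteration with BigInteger (a sketch; fibBig is an illustrative name):
import java.math.BigInteger;

static BigInteger fibBig(int n) { // only two live values, no overflow
    BigInteger x = BigInteger.ZERO, y = BigInteger.ONE; // F(0), F(1)
    for (int i = 0; i < n; i++) {
        BigInteger sum = x.add(y);
        x = y;
        y = sum;
    }
    return x; // F(n)
}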
Precompute a large number of fib(n) results, and store them as a lookup table inside your algorithm. Bam, free "speed"
Now if you need to compute fib(101) and you already have fibs 0 to 100 stored, this is just like trying to compute fib(1).
Chances are this isn't what this homework is looking for, but it's a completely legit strategy and basically the idea of caching extracted further away from running the algorithm. If you know you're likely to be computing the first 100 fibs often and you need to do it really really fast, there's nothing faster than O(1). So compute those values entirely out of band and store them so they can be looked up later.
Of course, cache values as you compute them too :) Duplicated computation is waste.
Here is a snippet of code with an iterative approach instead of recursion.
Output example:
Enter n: 5
F(5) = 5 ... computed in 1 milliseconds
Enter n: 50
F(50) = 12586269025 ... computed in 0 milliseconds
Enter n: 500
F(500) = ...4125 ... computed in 2 milliseconds
Enter n: 500
F(500) = ...4125 ... computed in 0 milliseconds
Enter n: 500000
F(500000) = 2955561408 ... computed in 4,476 ms
Enter n: 500000
F(500000) = 2955561408 ... computed in 0 ms
Enter n: 1000000
F(1000000) = 1953282128 ... computed in 15,853 ms
Enter n: 1000000
F(1000000) = 1953282128 ... computed in 0 ms
Some pieces of results are omitted with ... for a better view.
Code snippet:
public class CachedFibonacci {
private static Map<BigDecimal, BigDecimal> previousValuesHolder;
static {
previousValuesHolder = new HashMap<>();
previousValuesHolder.put(BigDecimal.ZERO, BigDecimal.ZERO);
previousValuesHolder.put(BigDecimal.ONE, BigDecimal.ONE);
}
public static BigDecimal getFibonacciOf(long number) {
if (0 == number) {
return BigDecimal.ZERO;
} else if (1 == number) {
return BigDecimal.ONE;
} else {
if (previousValuesHolder.containsKey(BigDecimal.valueOf(number))) {
return previousValuesHolder.get(BigDecimal.valueOf(number));
} else {
BigDecimal olderValue = BigDecimal.ONE,
oldValue = BigDecimal.ONE,
newValue = BigDecimal.ONE;
for (int i = 3; i <= number; i++) {
newValue = oldValue.add(olderValue);
olderValue = oldValue;
oldValue = newValue;
}
previousValuesHolder.put(BigDecimal.valueOf(number), newValue);
return newValue;
}
}
}
public static void main(String[] args) {
Scanner scanner = new Scanner(System.in);
while (true) {
System.out.print("Enter n: ");
long inputNumber = scanner.nextLong();
if (inputNumber >= 0) {
long beginTime = System.currentTimeMillis();
BigDecimal fibo = getFibonacciOf(inputNumber);
long endTime = System.currentTimeMillis();
long delta = endTime - beginTime;
System.out.printf("F(%d) = %.0f ... computed in %,d milliseconds\n", inputNumber, fibo, delta);
} else {
System.err.println("You must enter number > 0");
System.out.println("try, enter number again, please:");
break;
}
}
}
}
This approach runs much faster than the recursive version.
In such a situation, the iterative solution tends to be a bit faster, because each recursive method call takes a certain amount of processor time. In principle, it is possible for a smart compiler to avoid recursive method calls if they follow simple patterns, but most compilers don't do that. From that point of view, an iterative solution is preferable.
UPDATE:
After the Java 8 release, with the Stream API available, one more way of calculating Fibonacci became available.
Checked with JDK 17.0.2.
Code:
public static BigInteger streamFibonacci(long n) {
return Stream.iterate(new BigInteger[]{BigInteger.ONE, BigInteger.ONE},
p -> new BigInteger[]{p[1], p[0].add(p[1])})
.limit(n)
.reduce((a, b) -> b)
.get()[0];
}
Test output:
Enter n (q for quit): 5
F(5) = 5 ... computed in 2 ms
Enter n (q for quit): 50
F(50) = 1258626902 ... computed in 0 ms
Enter n (q for quit): 500
F(500) = 1394232245 ... computed in 3 ms
Enter n (q for quit): 500000
F(500000) = 2955561408 ... computed in 4,343 ms
Enter n (q for quit): 1000000
F(1000000) = 1953282128 ... computed in 19,280 ms
The results are pretty good.
Keep in mind that the ... in the output just cuts off the remaining digits of the full numbers.
Having followed a similar approach some time ago, I've just realized there's another optimization you can make.
If you know two large consecutive answers, you can use this as a starting point. For example, if you know F(100) and F(101), then calculating F(104) is approximately as difficult (*) as calculating F(4) based on F(0) and F(1).
Calculating iteratively up is as efficient calculation-wise as doing the same using cached-recursion, but uses less memory.
Having done some sums, I have also realized that, for any given z < n:
F(n) = F(z) * F(n-z+1) + F(z-1) * F(n-z)
If n is odd, and you choose z=(n+1)/2, then this is reduced to
F(n)=F(z)^2+F(z-1)^2
It seems to me that you should be able to use this by a method I have yet to find, that you should be able use the above info to find F(n) in the number of operations equal to:
the number of bits in n doublings (as per above) + the number of 1 bits in n addings; in the case of 104, this would be (7 bits, 3 '1' bits) = 14 multiplications (squarings), 10 additions.
(*) assuming adding two numbers takes the same time, irrelevant of the size of the two numbers.
Here's a way of provably doing it in O(log n) (as the loop runs log n times):
/*
* Fast doubling method
* F(2n) = F(n) * (2*F(n+1) - F(n)).
* F(2n+1) = F(n+1)^2 + F(n)^2.
* Adapted from:
* https://www.nayuki.io/page/fast-fibonacci-algorithms
*/
private static long getFibonacci(int n) {
long a = 0;
long b = 1;
for (int i = 31 - Integer.numberOfLeadingZeros(n); i >= 0; i--) {
long d = a * ((b<<1) - a);
long e = (a*a) + (b*b);
a = d;
b = e;
if (((n >>> i) & 1) != 0) {
long c = a+b;
a = b;
b = c;
}
}
return a;
}
I am assuming here (as is conventional) that one multiply / add / whatever operation is constant time irrespective of number of bits, i.e. that a fixed-length data type will be used.
This page explains several methods of which this is the fastest. I simply translated it away from using BigInteger for readability. Here's the BigInteger version:
/*
* Fast doubling method.
* F(2n) = F(n) * (2*F(n+1) - F(n)).
* F(2n+1) = F(n+1)^2 + F(n)^2.
* Adapted from:
* http://www.nayuki.io/page/fast-fibonacci-algorithms
*/
private static BigInteger getFibonacci(int n) {
BigInteger a = BigInteger.ZERO;
BigInteger b = BigInteger.ONE;
for (int i = 31 - Integer.numberOfLeadingZeros(n); i >= 0; i--) {
BigInteger d = a.multiply(b.shiftLeft(1).subtract(a));
BigInteger e = a.multiply(a).add(b.multiply(b));
a = d;
b = e;
if (((n >>> i) & 1) != 0) {
BigInteger c = a.add(b);
a = b;
b = c;
}
}
return a;
}

Smallest java structure with relatively decent contains() solution

Alright, here's the lowdown: I'm writing a class in Java that finds the Nth Hardy's Taxi number (a number that can be written as the sum of two cubes in two different ways). I have the discovery itself down, but I am in desperate need of some space saving. To that end, I need the smallest possible data structure where I can relatively easily use or create a method like contains(). I'm not particularly worried about speed, as my current solution can certainly compute it well within the time restrictions.
In short, the data structure needs:
To be able to relatively simply implement a contains() method
To use a low amount of memory
To be able to store very large number of entries
To be easily usable with the primitive long type
Any ideas? I started with a hash map (because I needed to test the values that led to the sum to ensure accuracy), then moved to a hash set once I guaranteed reliable answers.
Any other general ideas on how to save some space would be greatly appreciated!
I don't think you'd need the code to answer the question, but here it is in case you're curious:
public class Hardy {
// private static HashMap<Long, Long> hm;
/**
* Find the nth Hardy number (start counting with 1, not 0) and the numbers
* whose cubes demonstrate that it is a Hardy number.
* @param n
* @return the nth Hardy number
*/
public static long nthHardyNumber(int n) {
// long i, j, oldValue;
int i, j;
int counter = 0;
long xyLimit = 2147483647; // xyLimit is the max value of a 32bit signed number
long sum;
// hm = new HashMap<Long, Long>();
int hardyCalculations = (int) (n * 1.1);
HashSet<Long> hs = new HashSet<Long>(hardyCalculations * hardyCalculations, (float) 0.95);
long[] sums = new long[hardyCalculations];
// long binaryStorage, mask = 0x00000000FFFFFFFF;
for (i = 1; i < xyLimit; i++){
for (j = 1; j <= i; j++){
// binaryStorage = ((i << 32) + j);
// long y = ((binaryStorage << 32) >> 32) & mask;
// long x = (binaryStorage >> 32) & mask;
sum = cube(i) + cube(j);
if (hs.contains(sum) && !arrayContains(sums, sum)){
// oldValue = hm.get(sum);
// long oldY = ((oldValue << 32) >> 32) & mask;
// long oldX = (oldValue >> 32) & mask;
// if (oldX != x && oldX != y){
sums[counter] = sum;
counter++;
if (counter == hardyCalculations){
// Arrays.sort(sums);
bubbleSort(sums);
return sums[n - 1];
}
} else {
hs.add(sum);
}
}
}
return 0;
}
private static void bubbleSort(long[] array){
long current, next;
int i;
boolean ordered = false;
while (!ordered) {
ordered = true;
for (i = 0; i < array.length - 1; i++){
current = array[i];
next = array[i + 1];
if (current > next) {
ordered = false;
array[i] = next;
array[i+1] = current;
}
}
}
}
private static boolean arrayContains(long[] array, long n){
for (long l : array){
if (l == n){
return true;
}
}
return false;
}
private static long cube(long n){
return n*n*n;
}
}
Have you considered using a standard tree? In java that would be a TreeSet. By sacrificing speed, a tree generally gains back space over a hash.
For that matter, sums might be a TreeMap, transforming the linear arrayContains to a logarithmic operation. Being naturally ordered, there would also be no need to re-sort it afterwards.
EDIT
The complaint against using a java tree structure for sums is that java's tree types don't support the k-select algorithm. On the assumption that Hardy numbers are rare, perhaps you don't need to sweat the complexity of this container (in which case your array is fine.)
If you did need to improve time performance of this aspect, you could consider using a selection-enabled tree such as the one mentioned here. However that solution works by increasing the space requirement, not lowering it.
Alternately we can incrementally throw out Hardy numbers we know we don't need. Suppose during the running of the algorithm, sums already contains n Hardy numbers and we discover a new one. We insert it and do whatever we need to preserve collection order, and so now contains n+1 sorted elements.
Consider that last element. We already know about n smaller Hardy numbers, and so there is no possible way this last element is our answer. Why keep it? At this point we can shrink sums again down to size n and toss the largest element out. This is both a space savings, and time savings as we have fewer elements to maintain in sorted order.
The natural data structure for sums in that approach is a max heap. Java's PriorityQueue is a min-heap by default, but handing it a reversed comparator turns it into a max heap. You could also "make it work" with TreeMap::lastKey, which will be slower in the end, but still faster than the quadratic bubbleSort.
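A hedged sketch of that bounded max-heap idea (class and method names are illustrative): keep only the n smallest Hardy numbers seen so far, evicting the largest whenever a smaller candidate arrives.
import java.util.Comparator;
import java.util.PriorityQueue;

class BoundedMaxHeap {
    private final PriorityQueue<Long> heap =
        new PriorityQueue<>(Comparator.reverseOrder()); // reversed => max heap
    private final int capacity;

    BoundedMaxHeap(int capacity) { this.capacity = capacity; }

    void offer(long candidate) {
        if (heap.size() < capacity) {
            heap.add(candidate);
        } else if (candidate < heap.peek()) {
            heap.poll(); // the old largest can no longer be the answer
            heap.add(candidate);
        }
    }

    long largest() { return heap.peek(); } // nth-smallest Hardy number so far
}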
If you have an extremely large number of elements, and you effectively want an index to allow fast tests for containment in the underlying dataset, then take a look at Bloom Filters. These are space-efficient indexes whose sole purpose is to enable fast tests for containment in a dataset.
Bloom Filters are probabilistic, which means if they return true for containment, then you actually need to check your underlying dataset to confirm that the element is really present.
If they return false, the element is guaranteed not to be contained in the underlying dataset, and in that case the test for containment would be very cheap.
So it depends on the whether most of the time you expect a candidate to really be contained in the dataset or not.
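For illustration, a minimal Bloom filter for long keys might look like the sketch below; the bit size and the hash-mixing constants are assumptions on my part, not tuned values.
import java.util.BitSet;

class LongBloomFilter {
    private final BitSet bits;
    private final int size;   // number of bits in the filter
    private final int hashes; // number of hash functions

    LongBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    private int index(long key, int i) { // i-th hash via simple multiplicative mixing
        long h = key * 0x9E3779B97F4A7C15L + i * 0xC2B2AE3D27D4EB4FL;
        h ^= h >>> 31;
        return (int) Math.floorMod(h, (long) size);
    }

    void add(long key) {
        for (int i = 0; i < hashes; i++) bits.set(index(key, i));
    }

    boolean mightContain(long key) {
        for (int i = 0; i < hashes; i++)
            if (!bits.get(index(key, i))) return false; // definitely absent
        return true; // possibly present: confirm against the real dataset
    }
}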
This is the core function to find if a given number is an HR-number; it's in C, but one should get the idea:
#include <math.h>
#include <stdbool.h>

bool is_sum_of_cubes(int value)
{
    int m = pow(value, 1.0/3); /* integer cube root of value */
    int i = m;
    int j = 1;
    while (j <= i) /* j <= i (not j < m) so equal-cube sums such as 16 = 2^3 + 2^3 are found */
    {
        int element = i*i*i + j*j*j;
        if (value == element)
        {
            return true;
        }
        if (element < value)
        {
            ++j; /* sum too small: raise the smaller cube */
        }
        else
        {
            --i; /* sum too large: lower the bigger cube */
        }
    }
    return false;
}

Why am I getting an OutOfMemoryError in java?

I am working on a projecteuler problem, and I have run into an OutOfMemoryError.
And I don't understand why because my code was working beautifully (to my novice eyes at least :P).
Everything works just fine until the loop reaches 113383.
If someone could help me debug this it would be greatly appreciated because I don't understand at all why it is failing.
My Code
import java.util.Map;
import java.util.HashMap;
import java.util.Stack;
/*
* The following iterative sequence is defined for the set of positive integers:
n = n/2 (n is even)
n = 3n + 1 (n is odd)
Using the rule above and starting with 13, we generate the following sequence:
13 40 20 10 5 16 8 4 2 1
It can be seen that this sequence (starting at 13 and finishing at 1) contains
10 terms. Although it has not been proved yet (Collatz Problem), it is thought
that all starting numbers finish at 1. Which starting number, under one million,
produces the longest chain?
NOTE: Once the chain starts the terms are allowed to go above one million.
*/
public class Problem14 {
static Map<Integer, Integer> map = new HashMap<Integer, Integer>();
final static int NUMBER = 1000000;
public static void main(String[] args) {
long start = System.currentTimeMillis();
map.put(1, 1);
for (int loop = 2; loop < NUMBER; loop++) {
compute(loop);
//System.out.println(map);
System.out.println(loop);
}
long end = System.currentTimeMillis();
System.out.println("Time: " + ((end - start) / 1000.) + " seconds" );
}
private static void compute(int currentNumber) {
Stack<Integer> temp = new Stack<Integer>();
/**
* checks to see if current value is already
* part of the map. if it isn't, add to temporary
* stack. also if current value exceeds 1 million,
* don't check map or add to stack.
*/
while (!map.containsKey(currentNumber)){
temp.add(currentNumber);
currentNumber = realCompute(currentNumber);
while (currentNumber > NUMBER){
currentNumber = realCompute(currentNumber);
}
}
System.out.println(temp);
/**
* adds members of the temporary stack to the map
* based on when they were placed in the stack
*/
int initial = map.get(currentNumber) + 1;
int size = temp.size();
for (int loop = 1; loop <= size; loop++){
map.put(temp.pop(), initial);
initial++;
//key is the top of stack
//value is initially 1 greater than the current number that was
//found at the map, then incremented by 1 afterwards;
}
}
private static int realCompute(int currentNumber) {
if (currentNumber % 2 == 0) {
currentNumber /= 2;
} else {
currentNumber = (currentNumber * 3) + 1;
}
return currentNumber;
}
}
Seems to me it's just what the error says: you're running out of memory. Increase the heap size with -Xmx (eg java -Xmx512m).
If you want to reduce the memory footprint, take a look at GNU Trove for a more compact (and faster) implementation of primitive maps (instead of using HashMap<Integer,Integer>).
I had exactly the same problem as you. The problem actually lies with the values pushed onto "Stack temp": starting at 113383, the Collatz chain climbs past Integer.MAX_VALUE, so the int values overflow. Changing them to Long fixes it.
It is a simple overflow.
Which OutOfMemory error is it? Heap, PermGen..? Probably need to increase heap size or permgen size.
You should define the HashMap with an initial capacity, like map = new HashMap<Integer, Integer>(NUMBER).
Internally the map keeps its data in an "Entry[] table" that grows automatically; passing the expected capacity up front means: 1. no repeated array copies from resizing, and 2. it runs faster.
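For illustration (NUMBER matching the constant in the question; dividing by the default load factor of 0.75 avoids even the final resize):
Map<Integer, Integer> map = new HashMap<>((int) (NUMBER / 0.75f) + 1); // no rehashing while filling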
