Does anyone have an idea how to approach a parallel prime factorization algorithm?
I can't figure out at which stage of the algorithm I should split the work across threads.
How can I think about prime factorization in a parallel way?
Consider the following single-threaded code:
public static void primeFactorization(ArrayList<Integer> factors, int num){
//factors is an array to save the factorization elements
//num is the number to be factorized
int limit = num/2+1;
if(isPrime(num))
factors.add(num);
else{
while(num%2==0){
factors.add(2);
num=num/2;
}
for (int i=3; i<limit; i+=2){
while (isPrime(i) && num%i==0){
factors.add(i);
num = num/i;
}
}
}
}
private static boolean isPrime(int x) {
int top = (int)Math.sqrt(x);
for (int i = 2; i <= top; i++)
if ( x % i == 0 )
return false;
return true;
}
It seems like this could be a really good use for the Fork/Join Framework. It seems like you should be able to use this by recursively passing in the new factors that you find. Try taking a look at RecursiveAction as well. In pseudo code you should be able to do something like the following:
public void getFactors(List<Integer> factors, int num){
if(you can find a factor){
add the two factors to the pool to be factored further
}
else{
factors.add(num);
}
}
As a side note, it might perform better if you started in the middle (num/2) and went from there, as opposed to starting at one.
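The pseudocode above could be fleshed out roughly as follows. This is only a sketch: it uses `RecursiveTask` (the value-returning sibling of `RecursiveAction`) from `java.util.concurrent`, and the class name `FactorTask` and the helper `factorize` are made up for illustration. Because the first divisor found by trial division is always prime, one of the two subtasks is cheap, so this shows the structure of the approach rather than a guaranteed speedup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical task: returns the prime factors of num in ascending order.
class FactorTask extends RecursiveTask<List<Long>> {
    private final long num;

    FactorTask(long num) {
        this.num = num;
    }

    @Override
    protected List<Long> compute() {
        List<Long> factors = new ArrayList<>();
        if (num < 2) {
            return factors;                 // 0 and 1 have no prime factors
        }
        // Look for a divisor d with d * d <= num (written as a division to avoid overflow).
        for (long d = 2; d <= num / d; d++) {
            if (num % d == 0) {
                // Found a factor: factor d and num / d as independent subtasks.
                FactorTask left = new FactorTask(d);
                FactorTask right = new FactorTask(num / d);
                left.fork();                                // schedule on the pool
                List<Long> rightResult = right.compute();   // run in this thread
                factors.addAll(left.join());
                factors.addAll(rightResult);
                Collections.sort(factors);
                return factors;
            }
        }
        factors.add(num);                   // no divisor found: num is prime
        return factors;
    }

    // Convenience entry point that runs the task on the common pool.
    static List<Long> factorize(long num) {
        return ForkJoinPool.commonPool().invoke(new FactorTask(num));
    }
}
```

Calling `FactorTask.factorize(13195)` returns the list of prime factors; the results of the subtasks are merged and sorted on the way back up.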
I am trying to make a parallel implementation of the Sieve of Eratosthenes. I made a boolean list which gets filled up with true's for the given size. Whenever a prime is found, all multiples of that prime are marked false in the boolean list.
The way I am trying to make this algorithm parallel is by starting a new thread while the initial prime is still being filtered. For example, the algorithm starts with prime = 2. In the filtering loop, once the loop index reaches prime * prime, I run a nested loop that checks every number between the prime (2) and prime * prime (4). If that index in the boolean list is still true, I start another thread to filter that prime number.
The nested loop creates more and more overhead as the primes to filter get larger, so I only run it while the prime is below 100. I am assuming that by that time the 100 million numbers will be somewhat filtered. The problem is that this way the primes to be filtered stay just under 9500, while the algorithm stops at 10000 (prime * prime < size, which is 100 million). I also think this is not at all the correct way to go about it. I have searched a lot online, but didn't manage to find any examples of parallel Java implementations of the sieve.
My code looks like this:
Main class:
public class Main {
private static ListenableQueue<Integer> queue = new ListenableQueue<>(new LinkedList<>());
private static ArrayList<Integer> primes = new ArrayList<>();
private static boolean serialList[];
private static ArrayList<Integer> serialPrimes = new ArrayList<>();
private static ExecutorService exec = Executors.newFixedThreadPool(10);
private static int size = 100000000;
private static boolean list[] = new boolean[size];
private static int lastPrime = 2;
public static void main(String[] args) {
Arrays.fill(list, true);
parallel();
}
public static void parallel() {
Long startTime = System.nanoTime();
int firstPrime = 2;
exec.submit(new Runner(size, list, firstPrime));
}
public static void parallelSieve(int size, boolean[] list, int prime) {
int queuePrimes = 0;
for (int i = prime; i * prime <= size; i++) {
try {
list[i * prime] = false;
if (prime < 100) {
if (i == prime * prime && queuePrimes <= 1) {
for (int j = prime + 1; j < i; j++) {
if (list[j] && j % prime != 0 && j > lastPrime) {
lastPrime = j;
startNewThread(j);
queuePrimes++;
}
}
}
}
} catch (ArrayIndexOutOfBoundsException ignored) { }
}
}
private static void startNewThread(int newPrime) {
if ((newPrime * newPrime) < size) {
exec.submit(new Runner(size, list, newPrime));
}
else {
exec.shutdown();
for (int i = 2; i < list.length; i++) {
if (list[i]) {
primes.add(i);
}
}
}
}
}
Runner class:
public class Runner implements Runnable {
private int arraySize;
private boolean[] list;
private int k;
public Runner(int arraySize, boolean[] list, int k) {
this.arraySize = arraySize;
this.list = list;
this.k = k;
}
@Override
public void run() {
Main.parallelSieve(arraySize, list, k);
}
}
I feel like there is a much simpler way to solve this...
Do you guys have any suggestions as to how I can make this parallelization working and maybe a bit simpler?
Creating a performant concurrent implementation of an algorithm like the Sieve of Eratosthenes is somewhat more difficult than creating a performant single-threaded implementation. The reason is that you need to find a way to partition the work in a way that minimises communication and interference between the parallel worker threads.
If you achieve complete isolation then you can hope for a speed increase approaching the number of logical processors available, or about one order of magnitude on a typical modern PC. By contrast, using a decent single-threaded implementation of the sieve will give you a speedup of at least two to three orders of magnitude. One simple cop-out would be to simply load the data from a file when needed, or to shell out to a decent prime-sieving program like Kim Walisch's PrimeSieve.
Even if we only want to look at the parallelisation problem, it is still necessary to have some insight into the algorithm itself and into the machine it runs on.
The most important aspect is that modern computers have deep cache hierarchies where only the L1 cache - typically 32 KB - is accessible at full speed and all other memory accesses incur significant penalties. Translated to the Sieve of Eratosthenes this means that you need to sieve your target range one 32 KB window at a time, instead of striding each prime over many megabytes. The small primes up to the square root of the target range end must be sieved before the parallel dance begins, but then each segment or window can be sieved independently.
Sieving a given window or segment necessitates determining the start offsets for the small primes that you want to sieve by, which means at least one modulo division per small prime per window, and division is an extremely slow operation. However, if you sieve consecutive segments instead of arbitrary windows placed anywhere in the range then you can keep the end offsets for each prime in a vector and use them as start offsets for the next segment, thus eliminating the expensive computation of the start offset.
Thus, one promising parallelisation strategy for the Sieve of Eratosthenes would be to give each worker thread a contiguous group of 32 KB blocks to sieve, so that the start offset calculation needs to happen only once per worker. This way there cannot be memory access contention between workers, since each has its own independent subrange of the target range.
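A rough Java sketch of that strategy is shown below. It is an illustration under assumptions, not a tuned implementation: the class and method names are hypothetical, the window holds 32 * 1024 booleans (about 32 KB only on JVMs that store one byte per boolean), and a parallel `IntStream` stands in for explicit worker threads. Each worker computes its start offsets once and then carries the end offset of every small prime over into the next window.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class ParallelSegmentedSieve {
    static final int SEGMENT = 32 * 1024;   // numbers per window (assumed window size)

    // Plain sieve for the small primes up to n (inclusive).
    static int[] smallPrimes(int n) {
        boolean[] composite = new boolean[n + 1];
        int count = 0;
        for (int i = 2; i <= n; i++) {
            if (composite[i]) continue;
            count++;
            for (long j = (long) i * i; j <= n; j += i) composite[(int) j] = true;
        }
        int[] primes = new int[count];
        for (int i = 2, k = 0; i <= n; i++) {
            if (!composite[i]) primes[k++] = i;
        }
        return primes;
    }

    // Counts the primes in [lo, hi) by sieving consecutive windows.
    static long countInChunk(long lo, long hi, int[] primes) {
        // Start offsets are computed once per worker, not once per window.
        long[] next = new long[primes.length];
        for (int k = 0; k < primes.length; k++) {
            long p = primes[k];
            next[k] = Math.max(p * p, ((lo + p - 1) / p) * p);
        }
        boolean[] composite = new boolean[SEGMENT];
        long count = 0;
        for (long base = lo; base < hi; base += SEGMENT) {
            long end = Math.min(base + SEGMENT, hi);
            Arrays.fill(composite, false);
            for (int k = 0; k < primes.length; k++) {
                long j = next[k];
                for (; j < end; j += primes[k]) composite[(int) (j - base)] = true;
                next[k] = j;                // end offset becomes the next start offset
            }
            for (long v = base; v < end; v++) {
                if (!composite[(int) (v - base)]) count++;
            }
        }
        return count;
    }

    // Counts the primes in [2, limit], splitting the range into one chunk per worker.
    static long countPrimes(int limit, int workers) {
        if (limit < 2) return 0;
        int[] primes = smallPrimes((int) Math.sqrt(limit));
        long total = (long) limit - 1;      // how many numbers lie in 2..limit
        long chunk = (total + workers - 1) / workers;
        return IntStream.range(0, workers).parallel()
                .mapToLong(w -> {
                    long lo = 2 + w * chunk;
                    long hi = Math.min(lo + chunk, (long) limit + 1);
                    return lo < hi ? countInChunk(lo, hi, primes) : 0;
                })
                .sum();
    }
}
```

The workers share only the read-only array of small primes; every window buffer and offset vector is private to one worker, which is what keeps them from interfering with each other. How many chunks actually run at the same time depends on the common pool and the available processors.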
However, before you begin to parallelise - i.e., make your code more complex - you should first slim it down and reduce the work to be done to the absolute essentials. For example, take a look at this fragment from your code:
for (int i = prime; i * prime <= size; i++)
list[i * prime] = false;
Instead of recomputing loop bounds in every iteration and indexing with a multiplication, check the loop variable against a precomputed, loop-invariant value and reduce the multiplication to iterated addition:
for (int o = prime * prime; o <= size; o += prime)
list[o] = false;
There are two simple sieve-specific optimisations that can give significant speed boosts.
1) Leave the even numbers out of your sieve and pull the prime 2 out of thin air when needed. Bingo, you just doubled your performance.
2) Instead of sieving each segment by the small odd primes 3, 5, 7 and so on, blast a precomputed pattern over the segment (or even the whole range). This saves time because these small primes make many, many steps in each segment and account for the lion's share of sieving time.
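The first of these two optimisations could look like the following Java sketch (the precomputed-pattern trick is not shown). The class and method names are hypothetical; index i of the array stands for the odd number 2*i + 1, so the even numbers never occupy any memory.

```java
import java.util.ArrayList;
import java.util.List;

class OddsOnlySieve {
    // Returns all primes <= limit. Index i of the array stands for the odd number 2*i + 1.
    static List<Integer> primesUpTo(int limit) {
        List<Integer> primes = new ArrayList<>();
        if (limit < 2) return primes;
        primes.add(2);                      // the only even prime, added by hand
        int size = (limit + 1) / 2;         // slots for the odd numbers 1, 3, 5, ... <= limit
        boolean[] composite = new boolean[size];
        for (int i = 1; i < size; i++) {
            if (composite[i]) continue;
            int p = 2 * i + 1;
            primes.add(p);
            // p * p is odd and sits at index p*p / 2; a value step of 2*p is an index step of p.
            for (long j = ((long) p * p) / 2; j < size; j += p) {
                composite[(int) j] = true;
            }
        }
        return primes;
    }
}
```

Compared with a sieve over all numbers, the array is half as long and the inner loop skips every even multiple.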
There are more possible optimisations including a couple more low-hanging fruit but either the returns are diminishing or the effort curve rises steeply. Try searching Code Review for 'sieve'. Also, don't forget that you're fighting a Java compiler in addition to the algorithmic problem and the machine architecture, i.e. things like array bounds checking which your compiler may or may not be able to hoist out of loops.
To give you a ballpark figure: a single-threaded segmented odds-only sieve with precomputed patterns can sieve the whole 32-bit range in 2 to 4 seconds in C#, depending on how much TLC you apply in addition to things mentioned above. Your much smaller problem of primes up to 100000000 (1e8) is solved in less than 100 ms on my aging notebook.
Here's some code that shows how windowed sieving works. For clarity I left off all optimisations like odds-only representation or wheel-3 stepping when reading out the primes and so on. It's C# but that should be similar enough to Java to be readable.
Note: I called the sieve array eliminated because a true value indicates a crossed-off number (saves filling the array with all true at the beginning and it is more logical anyway).
static List<uint> small_primes_between (uint m, uint n)
{
m = Math.Max(m, 2);
if (m > n)
return new List<uint>();
Trace.Assert(n - m < int.MaxValue);
uint sieve_bits = n - m + 1;
var eliminated = new bool[sieve_bits];
foreach (uint prime in small_primes_up_to((uint)Math.Sqrt(n)))
{
uint start = prime * prime, stride = prime;
if (start >= m)
start -= m;
else
start = (stride - 1) - (m - start - 1) % stride;
for (uint j = start; j < sieve_bits; j += stride)
eliminated[j] = true;
}
return remaining_numbers(eliminated, m);
}
//---------------------------------------------------------------------------------------------
static List<uint> remaining_numbers (bool[] eliminated, uint sieve_base)
{
var result = new List<uint>();
for (uint i = 0, e = (uint)eliminated.Length; i < e; ++i)
if (!eliminated[i])
result.Add(sieve_base + i);
return result;
}
//---------------------------------------------------------------------------------------------
static List<uint> small_primes_up_to (uint n)
{
Trace.Assert(n < int.MaxValue); // size_t is int32_t in .Net (!)
var eliminated = new bool[n + 1]; // +1 because indexed by numbers
eliminated[0] = true;
eliminated[1] = true;
for (uint i = 2, sqrt_n = (uint)Math.Sqrt(n); i <= sqrt_n; ++i)
if (!eliminated[i])
for (uint j = i * i; j <= n; j += i)
eliminated[j] = true;
return remaining_numbers(eliminated, 0);
}
I have to write Java code for the 'Sieve of Eratosthenes' algorithm that prints the primes up to a given max value on the console, but I'm not allowed to use arrays. Our professor told us it is possible to do it with loops only.
So I thought a lot and googled a lot about this topic and couldn't find an answer. I don't think it's possible at all, because you have to store somewhere which numbers are already crossed out.
my code until now:
public static void main(String[] args) {
int n = 100;
int mark = 2;
System.out.print("Primes from 1 to "+n+": 2, ");
for (int i = 2; i <= n; i++) {
if(i % mark != 0){
System.out.print(i+", ");
mark = i;
}
}
}
-> So I'm not supposed to apply the "i % mark != 0" test to numbers that are multiples of the numbers I already printed, but how am I supposed to express that without an array in which I can delete numbers by index?
But if there is a solution I would be glad if someone could share it with me! :)
The solution can be in another programming language; I can translate it to Java myself if that's possible.
Thank you in advance and best regards
Update: Thank you very much, all of you, I really appreciate your help, but I don't think it can be done with the basic structures. All the algorithms I have seen so far that print primes using only basic structures are not the Sieve of Eratosthenes. :(
The Sieve is about remembering the primes you found already. As far as I know there is no way to do this without arrays or lists and only with loops.
I checked some of the examples at RosettaCode at random and did not find one without an array and only loops.
If you add Classes and Methods as options you can come up with a recursive design:
public class Sieve
{
private int current;
private int max;
private Sieve parent;
public Sieve(int current, int max, Sieve parent )
{
this.current = current;
this.max = max;
this.parent = parent;
}
public static void main(String[] args)
{
int n = 100;
System.out.print("Primes from 1 to " + n + ":\n");
printPrimes(n);
}
private static void printPrimes(int i)
{
new Sieve(2,i,null).start();
}
private void start()
{
if(current <2 || max <2)
{
return;
}
if(this.current > max)
{
parent.print();
return;
}
for(int i = this.current+1;current<=max+1;i++)
{
if(this.testPrime(i))
{
new Sieve(i,this.max,this).start();
return;
}
}
}
private boolean testPrime(int i)
{
if(i%this.current != 0)
{
if(this.parent == null)
{
return true;
}
else
{
return this.parent.testPrime(i);
}
}
return false;
}
private void print()
{
if(this.parent != null)
{
this.parent.print();
}
System.out.print(" "+this.current);
}
}
This removes the array but uses Objects to store the Prime (each Sieve holds one prime)
I'm taking back what I said earlier. Here it is, the "sieve" without arrays, in Haskell:
sieve limit = [n | n <- [2..limit], null [i | i <- [2..n-1], j <- [0,i..n], j==n]]
It is a forgetful sieve, and it is very very inefficient. Uses only additions, and integer comparisons. The list comprehensions in it can be re-coded as loops, in an imperative language. Or to put it differently, it moves counts like a sieve would, but without marking anything, and thus uses no arrays.
Of course whether you'd consider it a "true" sieve or not depends on what is your definition of a sieve. This one constantly recreates and abandons them. Or you could say it reimplements the rem function. Which is the same thing to say, actually, and goes to the essence of why the sieve suddenly becomes so efficient when reuse - via arrays usually - becomes possible.
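Re-coded as plain loops in Java, the comprehension above might look like this sketch. The class and method names are made up for illustration; like the Haskell version it only counts up in steps and compares, and it keeps no array or list. It is just as inefficient, so it is meant for small limits only.

```java
class ForgetfulSieve {
    // True if n would be crossed off: for some i in 2..n-1,
    // counting 0, i, 2i, ... lands exactly on n.
    static boolean isCrossedOff(int n) {
        for (int i = 2; i < n; i++) {
            for (int j = 0; j <= n; j += i) {
                if (j == n) {
                    return true;
                }
            }
        }
        return false;
    }

    // Prints every number from 2 to limit that was never crossed off.
    static void printPrimes(int limit) {
        for (int n = 2; n <= limit; n++) {
            if (!isCrossedOff(n)) {
                System.out.print(n + " ");
            }
        }
        System.out.println();
    }
}
```

Every call to `isCrossedOff` rebuilds and throws away the counts that an array-based sieve would have remembered, which is exactly the "forgetful" part.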
I have some code that needs to run with some rather large numbers, and it involves incrementing into a recursive method and is therefore very slow, to the point where I can't even get to my desired answer. Could someone help me optimize it? I am a beginner though, so I can't do anything very complex/difficult.
public class Euler012{
public static void main(String[]args){
int divisors=0;
for(long x=1;divisors<=501;x++){
divisors=1;
long i=triangle(x);
for(int n=1;n<=i/2;n++){
if(i%n==0){
divisors++;
}
}
//System.out.println(divisors+"\n"+ i);
System.out.println(i+": " + divisors);
}
}
public static long triangle(long x){
long n=0;
while(x>=0){
n+=x;
x--;
triangle(x);
}
return n;
}
}
First: I don't think this is really an optimization problem, because it's a small task, but as mentioned in the comments you do many unnecessary things.
OK, now let's see where you can optimize things:
Recursion
Recursion usually performs badly, especially if you don't save values that were already computed, which would be possible in your example.
E.g. a recursive triangle-number function that saves values:
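private static ArrayList<Integer> trianglenumbers = new ArrayList<>();
// Returns the n-th triangle number (n >= 1); index n-1 of the list caches it.
public static int triangleNumber(int n){
    // compute and store only if the n-th value is not cached yet
    if(trianglenumbers.size() < n){
        if(n == 1)
            trianglenumbers.add(1);
        else
            trianglenumbers.add(triangleNumber(n-1) + n);
    }
    return trianglenumbers.get(n-1);
}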
But as mentioned by @RichardKennethNiescior you can simply use the formula:
(n² + n)/2
But here we can optimize too!
You shouldn't do /2 but rather *0.5 (which yields a floating-point result in Java) or even >>1 (shift right),
but most compilers will do that for you, so there is no need to make your code unreadable.
your main method
public static void main(String[]args){
int divisors = 0; //skip the = 0
for(long x=1;divisors<=501;++x){ // ++x instead of x++
divisors=0;
long i=(x*x + x) >> 1; // see above, use the one you like more
/*how many divisors*/
if(i == 1) divisors = 1;
else{ /*1 is the only number with just one natural divisor*/
divisors = 2; // the 1 and itself
for(int n = 2; n*n <= i; ++n){
if(n*n == i) ++divisors;
else if(i%n == 0) divisors += 2;
}
}
System.out.println(i+": " + divisors);
}
}
The ++x instead of x++ thing is explained here.
The "how many divisors" part:
Every number except 1 has at least 2 divisors (for primes exactly 2: the number itself and one).
To count the divisors of a number, we only need to go up to the square root of the number
(e.g. 36 -> its square root is 6).
36 has 9 divisors (4 pairs plus the root) {1 and 36, 2 and 18, 3 and 12, 4 and 9, 6 (and 6)}.
1 and 36 are skipped (the loop starts at int n = 2) but are counted by divisors = 2,
and the pairs starting with 2, 3 and 4 each increase the number of divisors by 2,
and if it is a square number (n*n == i) we add 1.
You don't have to generate a new triangle number from scratch each time. If you save the value in a variable and add x to it on the next iteration, you don't need the triangle method at all.
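A minimal sketch of that running-sum idea, combined with counting divisors only up to the square root, might look like this. The class and method names are hypothetical, and the threshold is a parameter rather than the hard-coded value from the question.

```java
class TriangleDivisors {
    // Counts the divisors of n by pairing each divisor d <= sqrt(n) with n / d.
    static int countDivisors(long n) {
        int count = 0;
        for (long d = 1; d * d <= n; d++) {
            if (n % d == 0) {
                count += (d * d == n) ? 1 : 2;  // a square root pairs with itself
            }
        }
        return count;
    }

    // Returns the first triangle number with more than minDivisors divisors.
    static long firstTriangleWithMoreThan(int minDivisors) {
        long triangle = 0;
        for (long x = 1; ; x++) {
            triangle += x;                      // running sum replaces triangle(x)
            if (countDivisors(triangle) > minDivisors) {
                return triangle;
            }
        }
    }
}
```

For example, `firstTriangleWithMoreThan(5)` returns 28, the seventh triangle number, whose divisors are 1, 2, 4, 7, 14 and 28.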
So, I'm beginning to learn programming and trying Project Euler with Java. Problem 10 seems pretty straightforward and I thought I could reuse a method I wrote to get primes in another problem. The thing is, the method works except when I call it in the for loop, and I can't see the difference between this and the other case.
So this is my code:
package euler10;
public class Primesum {
public static void main(String[] args) {
int suma=0;
for (int i=0; i<2000000; i=i+2){
if (isPrime(i) == true){
System.out.println(i);
suma=suma+i;
}
}
System.out.println(suma);
}
public static boolean isPrime(int num) {
boolean prime = false;
long i;
for (i=2; i < Math.sqrt(num) ; i++){
long n = num%i;
if (n == 0){
prime = false;
} else {
prime = true;
}
}
return prime;
}
}
The isPrime method works fine out of the loop but in it it's always true. Even with even numbers it will return true, and I think those aren't very primey :)
I don't really think that there is anything to do with the loop...
However, there is a logic flaw in the code...
public static boolean isPrime(int num) {
    if (num < 2) return false; // 0 and 1 are not prime
    long i;
    for (i=2; i <= Math.sqrt(num) ; i++){
        long n = num%i;
        if (n == 0){
            return false;//found a divisor : not prime
        }
    }
    //went through all the way to sqrt(num), and found no divisor: prime!
    return true;
}
We can stop as soon as the first divisor is found; there is no need to find all of them (that is another exercise).
Also, logically, if one wanted to use the boolean variable this way, it should be initialised to true, set to false when a divisor is found, and kept at that.
Your isPrime function is incorrect; you should delay the return true statement until after the end of the loop, when you know that all values of i do not divide num.
Also, trial division is not a good algorithm for this problem; it will be much faster to use a sieve. Here is an algorithm for summing the primes less than n using the Sieve of Eratosthenes:
function sumPrimes(n)
sum := 0
sieve := makeArray(2..n, True)
for p from 2 to n
if sieve[p]
sum := sum + p
for i from p*p to n step p
sieve[i] := False
return sum
That should calculate the sum of the primes less than two million in less than a second, which is much faster than your program. If you are interested in programming with prime numbers, or if you intend to solve some of the more advanced Project Euler problems and you need some faster algorithms, I modestly recommend this essay at my blog.
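In Java, the pseudocode above might be written as the following sketch. The class and method names are hypothetical. The sum is kept in a long because the total for two million does not fit in an int, and the method sums the primes strictly below n.

```java
class PrimeSum {
    // Returns the sum of all primes strictly below n, using the Sieve of Eratosthenes.
    static long sumPrimesBelow(int n) {
        if (n < 3) {
            return 0;                           // there is no prime below 2
        }
        boolean[] composite = new boolean[n];   // false means "not crossed off yet"
        long sum = 0;
        for (int p = 2; p < n; p++) {
            if (composite[p]) {
                continue;
            }
            sum += p;
            // Start at p * p: smaller multiples were crossed off by smaller primes.
            for (long i = (long) p * p; i < n; i += p) {
                composite[(int) i] = true;
            }
        }
        return sum;
    }
}
```

For instance, `sumPrimesBelow(10)` gives 2 + 3 + 5 + 7 = 17.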
First of all, this isn't homework... working on this outside of class to get some practice with java.
public class Problem3 {
public static void main(String[] args) {
int n = 13195;
// For every value 2 -> n
for (int i=2; i < n; i++) {
// If i is a divisor of n
if (n % i == 0) {
// For every value 2 -> i
for (int j=2; j < i; j++) {
if (n % j != 0) {
System.out.println(i);
break;
}
}
}
}
}
}
I keep modifying the code to try to make it do what I want.
As the problem says, you should be getting 5, 7, 13 and 29.
I get these values, plus 35, 65, 91, 145, 203, 377, 455, 1015, 1885, and 2639. I think I'm on the right track as I have all the right numbers... just have a few extras.
And in checking a few of the numbers in both being divisible by n and being prime numbers, the issue here is that the extra numbers aren't prime. Not sure what's going on though.
If anyone has any insight, please share.
This part
for (int j=2; j < i; j++) {
if (n % j != 0) {
System.out.println(i);
break;
}
doesn't check whether i is prime. Unless i is small, that will always print i at some point, because there are numbers smaller than i that don't divide n. So basically, that will print out all divisors of n (It wouldn't print the divisor 4 for n == 12, for example, but that's an exception).
Note also that the algorithm - using long instead of int to avoid overflow - even if fixed to check whether the divisor i is prime for deciding whether to print it, will take a long time to run for the actual target. You should investigate to find a better algorithm (hint: you might want to find the complete prime factorisation).
I solved this problem in Java, and looking at my solution the obvious advice is to start using BigInteger; look at the documentation for java.math.BigInteger.
Also, a lot of these problems are "math" problems as much as they are "computer science" problems, so research the math more and make sure you understand it reasonably well before coming up with your algorithm. Brute force can work sometimes, but often there are tricks to these problems.
Brute force can also work for checking whether a factor is prime or not for this problem,
e.g.
for(i=1;i<=n;i++)// n is a factor.
{
    counter=0;// reset the divisor count for each i
    for(j=i;j>=1;j--)
    {
        if(i%j==0)
        {
            counter++;
        }
    }
    if(counter==2) // a prime has exactly two divisors: 1 and itself
    {
        System.out.println(i);
    }
}
Don't know about Java but here is my C code if it is of any help.
# include <stdio.h>
# include <math.h>
// A function to print all prime factors of a given number n
void primeFactors(long long int n)
{
// Print the number of 2s that divide n
while (n%2 == 0)
{
printf("%d ", 2);
n = n/2;
}
int i;
// n must be odd at this point. So we can skip one element (Note i = i +2)
for ( i = 3; i <= sqrt(n); i = i+2)
{
// While i divides n, print i and divide n
while (n%i == 0)
{
printf("%d ", i);
n = n/i;
}
}
// This condition is to handle the case when n is a prime number
// greater than 2
if (n > 2)
printf ("%lld ", n); // n is a long long, so use %lld
}
/* Driver program to test above function */
int main()
{
long long int n = 600851475143;
primeFactors(n);
return 0;
}
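A Java version of the same divide-out-the-factors approach as the C code above might look like this sketch. The class and method names are hypothetical; it returns the largest prime factor instead of printing every factor, and it uses long so that the actual target value fits.

```java
class PrimeFactors {
    // Returns the largest prime factor of n (n >= 2) by dividing out each factor as it is found.
    static long largestPrimeFactor(long n) {
        if (n < 2) {
            throw new IllegalArgumentException("n must be at least 2");
        }
        long largest = -1;
        while (n % 2 == 0) {                // strip all factors of 2 first
            largest = 2;
            n /= 2;
        }
        // n is odd now; i <= n / i is i * i <= n without the risk of overflow.
        for (long i = 3; i <= n / i; i += 2) {
            while (n % i == 0) {
                largest = i;
                n /= i;
            }
        }
        if (n > 2) {
            largest = n;                    // what remains is a prime larger than any factor found
        }
        return largest;
    }
}
```

`largestPrimeFactor(13195)` returns 29, and because every factor found shrinks n, the loop for 600851475143 ends after a few thousand iterations.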
It's very good that you are working on such problems outside of class.
I saw your code. You are writing procedural code inside the main function.
Instead, write functions and think through the algorithm step by step first.
A simple algorithm to solve this problem can look like this:
1) Generate numbers consecutively, starting from 2 (the smallest prime) up to 13195/2. (No proper factor of a number is larger than half of its value.)
2) Check if the generated number is prime.
3) If the number is prime, check whether it is a factor of 13195.
4) Return the last prime factor found, as it is the largest prime factor of 13195.
One more piece of advice: try writing separate functions to avoid code complexity.
The code looks like this:
public class LargestPrimeFactor {
public static long getLargestPrimeFactor(long num){
long largestprimefactor = 0;
for(long i = 2; i<=num/2;i++){
if(isPrime(i)){
if(num%i==0){
largestprimefactor = i;
System.out.println(largestprimefactor);
}
}
}
return largestprimefactor;
}
public static boolean isPrime(long num){
boolean prime=false;
int count=0;
for(long i=1;i<=num/2;i++){
if(num%i==0){
count++;
}
if(count==1){
prime = true;
}
else{
prime = false;
}
}
return prime;
}
public static void main(String[] args) {
System.out.println("Largest prime factor of 13195 is "+getLargestPrimeFactor(13195));
}
}