Hi, I'm trying to implement a Diffie–Hellman key exchange:
public static Integer secret = 100000;
public static BigInteger g = new BigInteger("5");
public static BigInteger p = new BigInteger("315791951375393537137595555337555955191395351995195751755791151795317131135377351919777977373317997317733397199751739199735799971153399111973979977771537137371797357935195531355957399953977139577337393111951779135151171355371173379337573915193973715113971779315731713793579595533511197399993313719939759551175175337795317333957313779755351991151933337157555517575773115995775199513553337335137111");
public static BigInteger publicKey = g.pow(secret).mod(p); // "public" is a reserved word, renamed
but the calculation for 100000 already takes several seconds, and I can't tell how much longer it would take for a 256-bit exponent.
Is it so slow because of the implementation of BigInteger, or am I off track?
The problem is that g.pow(secret) is a really, really, REALLY big number. It is much bigger than p, and has on the order of secret digits. If you increase secret into the range of a normal Diffie-Hellman secret exponent (about the same number of digits as p), your computer won't have enough memory to hold it. All the computers on earth combined wouldn't have enough memory to hold it. It's a really big number.
But g.pow(secret).mod(p) -- the final answer you want -- only has about as many digits as p, so it's a tractable number for a computer to keep track of. It's just the intermediate value that is too big to deal with.
So you need to take advantage of the distributive rule for mod with integers: (a * b).mod(p) == (a.mod(p) * b.mod(p)).mod(p). With that rule, you can break the g.pow(secret) computation down into lots and lots of multiplies (only O(log2 secret) of them are needed), applying .mod(p) at each step to keep the numbers involved from getting too big.
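A minimal sketch of that square-and-multiply idea (the method name and structure are mine, just to illustrate the technique, not a hardened implementation):

import java.math.BigInteger;

// Computes base^exp mod m by repeated squaring, reducing mod m at every
// step so no intermediate value ever grows much beyond the size of m.
static BigInteger powMod(BigInteger base, BigInteger exp, BigInteger m) {
    BigInteger result = BigInteger.ONE;
    BigInteger b = base.mod(m);
    while (exp.signum() > 0) {
        if (exp.testBit(0)) {                 // current low bit of the exponent is 1
            result = result.multiply(b).mod(m);
        }
        b = b.multiply(b).mod(m);             // square, then reduce
        exp = exp.shiftRight(1);              // move on to the next bit
    }
    return result;
}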
Use the method described here (and in many other locations):
https://math.stackexchange.com/questions/36318/modulo-arithmetic-with-big-numbers
Good luck.
In Java, use modPow of BigInteger; it does the modular exponentiation efficiently.
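Applied to the snippet from the question (publicKey is my renaming, since public is a reserved word in Java):

BigInteger publicKey = g.modPow(BigInteger.valueOf(secret), p);  // same value as g.pow(secret).mod(p), but fast

This runs in milliseconds even for a full-size p, because modPow reduces mod p at every step internally.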
I'm noodling through an anagram hash function, already solved several different ways, but I'm looking for extreme performance as an exercise. I already submitted a solution that passed all the given tests (beating out 100% of all competitors by at least 1ms), but I believe that although it "won", it has a weakness that just wasn't triggered. It is subject to integer overflow in a way that could affect the results.
The gist of the solution was to combine multiple commutative operations, each taking some number of bits, and concatenate them into one long variable. I chose xor, sum, and product. The xor operation cleanly fits within a fixed number of bits. The sum operation might overflow, but since overflow simply wraps around (arithmetic modulo 2^16 for a short), the sum still arrives at the same result no matter how the letters and their corresponding values are rearranged. I wouldn't worry, for example, about whether this function would overflow:
private short sumHash(String s) {
    short hash = 0;
    for (char c : s.toCharArray()) {
        hash += c;
    }
    return hash;
}
Where I run into trouble is in the inclusion of products. If I make a function that returns the product of a list of values (such as character values in a String), then, at the very least, the result could be rendered inaccurate if the product overflowed to exactly zero.
private short productHash(String s) {
    short hash = 1;
    for (char c : s.toCharArray()) {
        hash *= c;
    }
    return hash;
}
Is there any safe and performant way to avoid this weakness so that the function gains the benefit of the commutative property of multiplication to produce the same value for anagrams, but can't ever encounter a product that overflows to zero?
Sure, if you're willing to go to some lengths to do it. The simplest solution that occurs to me is to write
hash *= primes[c];
where primes is an array that maps each possible character to a distinct odd prime. Overflowing to zero can only happen if the "true" product in infinite-precision arithmetic is a multiple of 2^16 (or 2^32, 2^64 -- whichever power of two matches your integer width), and if you're multiplying by odd primes, that's impossible.
(You do run into the problem that the hash itself will always be odd, but you could shift it right one bit to obtain a more fully mixed hash.)
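A sketch of that lookup-table idea, assuming lower-case ASCII input only (the table and method name are mine; extend the table for your real alphabet):

// One distinct odd prime per letter 'a'..'z'.
private static final int[] PRIMES = {
    3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43,
    47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103
};

private short primeProductHash(String s) {
    short hash = 1;
    for (char c : s.toCharArray()) {
        hash *= PRIMES[c - 'a'];  // every factor is odd, so the product is never 0 mod 2^16
    }
    return hash;
}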
You will only hit zero if
a * b = 0 mod 2^64
which is equivalent to there being an integer k such that
a * b = k * 2^64
That is, we get in trouble if the factors contribute enough twos to reach 2^64, i.e. if factors are even. Therefore, the easiest solution is ensuring that all factors are odd, for instance like this:
for (char ch : chars) {
    hash *= (ch << 1) | 1;
}
This allows you to keep 63 bits of information (for a 64-bit long hash; the same trick preserves 15 bits of a short).
Note however that this technique will only avoid collisions caused by overflow, not collisions caused by multipliers that share a common factor. If you wish to avoid those, too, you'll need coprime multipliers, which is most easily achieved if they are prime.
The naive way to avoid overflow is to use a larger type such as int or long. However, for your purposes, modular arithmetic might make more sense: you can do (a * b) % p for a prime p to maintain commutativity. (There is some deep mathematics here called group theory, if you are interested in learning more.) You will need to limit p to be small enough that each a * b does not overflow. The easiest way to do this is to pick p so that (p - 1)^2 can still be represented in a short or whatever data type you are using.
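A sketch of that modular variant (the prime 251 and the character mapping are arbitrary examples of mine, chosen so that (P - 1)^2 = 62,500 fits easily in an int):

private static final int P = 251;  // example prime; (P - 1)^2 fits comfortably in an int

private int modProductHash(String s) {
    int hash = 1;
    for (char c : s.toCharArray()) {
        int factor = (c % (P - 1)) + 1;   // map each char into 1..P-1, never 0 mod P
        hash = (hash * factor) % P;       // commutative, and never overflows
    }
    return hash;
}

Since P is prime and no factor is ever divisible by P, the running product can never collapse to zero.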
How can I fix the error "Negative exponent" at java.base/java.math.BigInteger.pow thrown from my function:
static BigInteger eul(BigInteger n, BigInteger b) {
    BigInteger pattern = (n.add(BigInteger.ONE)).divide(new BigInteger("4"));
    BigInteger result = new BigInteger(String.valueOf(b.pow(pattern.intValue())));
    return result;
}
The error is thrown at this line:
BigInteger result = new BigInteger(String.valueOf(b.pow(pattern.intValue())));
Input:
n=3214315236286878413828554688017932599492010774390476107169160882202257140006023
b=2424542564265464646464646412424987764446756457474245176752585282729789707797971262662662627967
The output should be:
2012412197646109946818069059950164564377761312631574365459647649336933671988569
pattern.intValue()
pattern is some enormous number, and .intValue() converts it to an int (hence the name), which is why this goes wrong: the value doesn't fit, and it happens to turn into a negative number because of how intValue() behaves when you call it on a BigInteger whose value exceeds what an int can represent. The reason pow only takes an int is that if you put a huge number there, your RAM couldn't hold the result and your CPU would take a few million years to calculate it. There'd be no point.
How do I fix it?
By going back to whoever gave you this assignment. Your code has a bug, but reading what it is intending to do, it computes b^((n+1)/4).
Which is a number that has... a lot of digits. Many, many more than what you expect for the output.
Clearly that isn't what you really wanted, or if it is, no computer can calculate this, and the result would be nothing like what you wanted.
Possibly that output really is b^((n+1)/4), but mod something. Which the API does support: b.modPow(pattern, FigureOutWhatTheModuloIsAndPutThatHere).
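A hedged guess at the fix, assuming the intended computation is b^((n+1)/4) mod n (the question never states the modulus, so n here is an assumption):

static BigInteger eul(BigInteger n, BigInteger b) {
    BigInteger pattern = n.add(BigInteger.ONE).divide(BigInteger.valueOf(4));
    return b.modPow(pattern, n);  // ASSUMPTION: the modulus is n; adjust if the assignment says otherwise
}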
I'm playing with hash tables and using a corpus of ~350,000 English words which I'd like to distribute evenly. Thus, I try to fit them into an array of length 810,049 (the closest prime larger than twice the input size), and I was baffled to see that a straightforward FNV1 implementation like this:
public int getHash(String s, int mod) {
    final BigInteger MOD = new BigInteger(Integer.toString(mod));
    final BigInteger FNV_offset_basis = new BigInteger("14695981039346656037");
    final BigInteger FNV_prime = new BigInteger("1099511628211");
    BigInteger hash = new BigInteger(FNV_offset_basis.toString());
    for (int i = 0; i < s.length(); i++) {
        int charValue = s.charAt(i);
        hash = hash.multiply(FNV_prime).mod(MOD);
        hash = hash.xor(BigInteger.valueOf(charValue & 0xffff)).mod(MOD);
    }
    return hash.mod(MOD).intValue();
}
results in 64,000 collisions, which is a lot: basically 20% of the input. What's wrong with my implementation? Is the approach somehow flawed?
EDIT: to add to that, I've also tried and implemented other hashing algorithms like sdbm and djb2, and they all perform just the same, equally poorly. All have these ~65k collisions on this corpus. When I changed the corpus to just 350,000 integers represented as strings, a bit of variance starts to occur (like one algorithm having 20,000 collisions and another having 40,000), but the number of collisions is still astoundingly high. Why?
EDIT2: I've just tested it, and Java's built-in .hashCode() results in just as many collisions. Even something ridiculously naive, like a hash that is the product of the charcodes of all the characters modulo 810,049, performs only about 50% worse than all those notorious algorithms (90k collisions for the naive approach vs. their 60k).
Since mod is a parameter to your hash function I presume it is the range into which you want the hash normalized, i.e. for your specific use case you are expecting it to be 810,049. I assume this because:
The algorithm calls for the calculations to be done modulo 2^n, where n is the number of bits in the desired hash.
Given that the offset basis and FNV prime are constants within the method, and are equal to the parameters for a 64-bit hash, the value of mod should also be fixed at 2^64.
Since it is not, I assume it is the desired final output range.
In other words, given a fixed offset basis and FNV Prime, there is no reason to pass in the mod parameter -- it is dictated by the other two FNV parameters.
If all the above is correct then the implementation is wrong. You should be doing the calculations mod 2^64 and applying a final remainder operation with 810,049.
Also (but this may not be important), the algorithm calls for xoring the lower 8 bits with an ASCII character, whereas you are xoring with 16 bits. I am not sure this will make a difference since for ASCII the high-order byte will be zero anyway and it will behave exactly as if you were xoring only 8 bits.
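Here is a sketch of the corrected version. Plain long arithmetic already wraps modulo 2^64 for free, so BigInteger isn't needed at all (the constants are the standard 64-bit FNV parameters, written in hex because the decimal value doesn't fit in a signed long literal):

// 64-bit FNV-1: multiply first, then xor; all arithmetic is implicitly mod 2^64.
public int getHash(String s, int mod) {
    final long FNV_OFFSET_BASIS = 0xcbf29ce484222325L;  // 14695981039346656037 as unsigned
    final long FNV_PRIME = 0x100000001b3L;              // 1099511628211
    long hash = FNV_OFFSET_BASIS;
    for (int i = 0; i < s.length(); i++) {
        hash *= FNV_PRIME;   // long overflow == reduction mod 2^64
        hash ^= s.charAt(i); // the spec xors one byte at a time; a char is fine for ASCII
    }
    // Reduce into the table range only at the very end; floorMod keeps it non-negative.
    return (int) Math.floorMod(hash, (long) mod);
}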
Data type to hold a very large number, say 1000 or more digits?
I need to find the factorial of a large number, say 100000000.
My factorial program works nicely for smaller numbers.
long factorial(int x)
{
    long fact = 1;
    if (x < 0)
    {
        System.out.println("Incorrect input, enter a positive number");
        fact = 0;
    }
    if (x == 0)
        fact = 1;
    if (x > 0)
    {
        fact = x;
        fact = fact * factorial(x - 1);
    }
    return fact;
}
You need a BigInteger. It can hold an arbitrarily large number.
But in your case, 100000000! is such a huge number that nothing can help.
You should use the log of the gamma function, since gamma(n) = (n-1)!. It's far more efficient than your naive, student way of doing things. The result will be a double in that case, and the natural log grows far more slowly than the value itself.
Recursion? Please. Your problem won't be the size of the value you pass back -- you'll hit too many stack frames and a stack overflow long before that.
While BigInteger will theoretically handle such a value, you won't be able to compute it in practice.
First, your algorithm uses recursion, so you'd need 100,000,000 recursive calls of factorial. You'd get a stack overflow far before computing the result. You'd have to modify your algorithm to avoid recursion (use a loop, for example, as sketched below).
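A minimal loop-based version with BigInteger (fine for inputs in the thousands, still hopeless for 100,000,000):

import java.math.BigInteger;

static BigInteger factorial(int x) {
    if (x < 0) throw new IllegalArgumentException("enter a positive number");
    BigInteger fact = BigInteger.ONE;
    for (int i = 2; i <= x; i++) {
        fact = fact.multiply(BigInteger.valueOf(i));  // no recursion, no stack growth
    }
    return fact;
}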
Second, the result will be huge. Using formulas for approximating factorials, such as
log n! ≈ n log(n/e)
we can conclude that your number will have more than 750,000,000 digits. While with some luck you might be able to fit it into your memory, you will certainly not be able to compute the number in any reasonable time. Just imagine: you'll have to perform 100,000,000 multiplications with numbers that have hundreds of millions of digits.
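For a quick sanity check of that digit count, here is the approximation as code (a helper of my own, using the formula above):

// Rough number of decimal digits of n!, from log10(n!) ≈ n * log10(n / e).
static long approxDigitsOfFactorial(long n) {
    return (long) (n * Math.log10(n / Math.E)) + 1;
}

approxDigitsOfFactorial(100_000_000L) comes out a bit over 756 million digits, matching the estimate above.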
I'm doing a Secret Sharing algorithm which encrypts a message. To do that I need a prime bigger than the message, and some random numbers of approximately the same size as the message.
I can do the first with BigInteger.probablePrime(MsgSize + 8), but I do not know how to do the latter.
I was using Random and later SecureRandom, but they don't generate numbers of a given length. My solution was to do randomInt ^ randomInt as a BigInteger, but it is obviously a bad solution.
Some ideas?
Is it Shamir's Secret Sharing that you're implementing? If so, note that you don't actually need a prime bigger than the entire message — it's perfectly fine to break the message into chunks of some manageable size and to share each chunk separately using a fixed prime.
Also, Shamir's Secret Sharing doesn't need a prime-sized field; it's possible to use any finite field GF(p^n), including in particular the binary fields GF(2^n). Such fields are particularly convenient for computer implementation, since both the secret and the share chunks will then simply be n-bit bitstrings.
The only complications are that, in non-prime fields, you'll have to implement finite field arithmetic (or find an existing implementation), and that you'll need to choose a particular reducing polynomial and agree upon it. However, the former isn't really as complicated as it might seem (see the sketch below), and the latter isn't really any harder than choosing and agreeing on a prime. (In particular, a reducing polynomial for GF(2^n) can be naturally represented as an n-bit bitstring, dropping the high bit, which is always 1.)
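To illustrate that the field arithmetic really is manageable, here is a sketch of multiplication in GF(2^8) (the reducing polynomial 0x11B, i.e. x^8 + x^4 + x^3 + x + 1, is the one AES uses, chosen here purely as a concrete example):

// Multiply two elements of GF(2^8), represented as ints in 0..255.
static int gfMul(int a, int b) {
    int product = 0;
    while (b != 0) {
        if ((b & 1) != 0) product ^= a;    // addition in GF(2^n) is xor
        b >>= 1;
        a <<= 1;
        if ((a & 0x100) != 0) a ^= 0x11B;  // carried past bit 7: reduce by the polynomial
    }
    return product;
}

Addition in these fields is just xor, so this multiply routine is essentially all the arithmetic a share-evaluation polynomial needs.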
Have you tried using the same probablePrime method with a smaller size, then using a large random integer as an offset from that number? That might do the trick; just an idea.
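For what it's worth, the standard library already covers the original question: BigInteger has a constructor that takes a bit count and a Random source and returns a uniform value in [0, 2^numBits). A sketch with rejection sampling to stay below a prime p (the helper name is mine):

import java.math.BigInteger;
import java.security.SecureRandom;

// Uniform random value in [0, p).
static BigInteger randomBelow(BigInteger p, SecureRandom rnd) {
    BigInteger r;
    do {
        r = new BigInteger(p.bitLength(), rnd);  // uniform in [0, 2^bitLength)
    } while (r.compareTo(p) >= 0);               // reject the few out-of-range draws
    return r;
}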
I had the same problem (that's why I found this post).
It is a little late, but maybe someone else finds this method useful:
public static BigDecimal getBigRandom(int d)
{
    BigDecimal rnd = new BigDecimal(Math.random());
    BigDecimal rndtmp;
    for (int i = 0; i <= d; i++)
    {
        rndtmp = new BigDecimal(Math.random());
        rndtmp = rndtmp.movePointLeft(rnd.precision());
        rnd = rnd.add(rndtmp);
    }
    return rnd;
}
Usage:
BigDecimal x = getBigRandom(y);
Every y will give you approximately 50 digits.
If you need more than (2^31-1)*50 digits, simply change int to long ;-)
Don't know if it is good, but it works for me.