generate ill-conditioned data for testing floating point summation - java

I have implemented a Kahan floating point summation algorithm in Java. I want to compare it against the built-in floating point addition in Java and infinite precision addition in Mathematica. However the data set I have is not good for testing, because the numbers are close to each other. (Condition number ~= 1)
Running Kahan on my data set gives almost the same result as the built-in +.
Could anyone suggest how to generate a large amount of data that can potentially cause serious rounding off error?

However the data set I have is not good for testing, because the numbers are close to each other.
It sounds like you already know what the problem is. Get to it =)
There are a few things that you will want:
Numbers of wildly different magnitudes, so that most of the precision of the smaller number is lost with naive summation.
Numbers with different signs and nearly equal (or equal) magnitudes, such that catastrophic cancellation occurs.
Numbers that have some low-order bits set, to increase the effects of rounding.
To get you started, you could try some simple three-term sums, which should show the effect clearly:
1.0 + 1.0e-20 - 1.0
Evaluated with simple summation, this will give 0.0; clearly incorrect. You might also look at sums of the form:
a0 + a1 + a2 + ... + an - b
Where b is the sum a0 + ... + an evaluated naively.
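As a rough illustration of the first point (terms of wildly different magnitudes), here is a minimal Java sketch. The class name `SummationDemo`, the helper `oneBigManySmall`, and the choice of one 1.0 followed by a million copies of 1.0e-16 are assumptions made for this example, not part of the question. Each tiny term is below half an ulp of 1.0, so naive summation drops every one of them, while the compensated sum keeps track of what was lost.

```java
import java.util.Arrays;

class SummationDemo {
    /** Plain left-to-right summation. */
    static double naiveSum(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum;
    }

    /** Kahan compensated summation. */
    static double kahanSum(double[] xs) {
        double sum = 0.0;
        double c = 0.0; // running compensation for lost low-order bits
        for (double x : xs) {
            double y = x - c;       // apply the correction carried over
            double t = sum + y;     // low-order bits of y may be lost here
            c = (t - sum) - y;      // recover what was lost (negated)
            sum = t;
        }
        return sum;
    }

    /** Hypothetical test data: one large term followed by n tiny ones. */
    static double[] oneBigManySmall(int n) {
        double[] xs = new double[n + 1];
        xs[0] = 1.0;
        Arrays.fill(xs, 1, n + 1, 1.0e-16);
        return xs;
    }

    public static void main(String[] args) {
        double[] xs = oneBigManySmall(1_000_000);
        System.out.println("naive: " + naiveSum(xs)); // every tiny term is lost
        System.out.println("kahan: " + kahanSum(xs)); // close to 1.0000000001
    }
}
```

One caveat worth checking on your own data: tracing the classic Kahan loop by hand on 1.0 + 1.0e-20 - 1.0 also ends at 0.0, because the compensation is itself absorbed when the large -1.0 arrives. That three-term sum exposes naive summation, but it may not separate Kahan from the built-in +.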

You want a heap of high precision numbers? Try this:
double[] nums = new double[SIZE];
for (int i = 0; i < SIZE; i++)
nums[i] = Math.random();

Are we talking about number pairs or sequences?
If pairs, start with 1 for both numbers, then in every iteration divide one by 3, multiply the other by 3. It's easy to calculate the theoretical sums of those pairs and you'll get a whole host of rounding errors. (Some from the division and some from the addition. If you don't want division errors, then use 2 instead of 3.)

By experiment, I found the following pattern:
public static void main(String[] args) {
System.out.println(1.0 / 3 - 0.01 / 3);
System.out.println(1.0 / 7 - 0.01 / 7);
System.out.println(1.0 / 9 - 0.001 / 9);
}
I've subtracted nearby fractions with denominators such as 3, 7 and 9 (which should not have an exact representation in binary form). However, there are cases where such an expression evaluates correctly, for example:
System.out.println(1.0 / 9 - 0.01 / 9);
You can automate this approach by iterating over the power of the subtrahend and stopping when multiplying by the appropriate value doesn't yield an integer, for example:
System.out.println((1.0 / 9 - 0.001 / 9) * 9000);
if (1000 - (1.0 / 9 - 0.001 / 9) * 9000 > 1.0)
System.out.println("Found it!");

Scalacheck might be something for you. Here is a short sample:
cat DoubleSpecification.scala
import org.scalacheck._
object DoubleSpecification extends Properties ("Doubles") {
/*
(a/1000 + b/1000) = (a+b) / 1000
(a/x + b/x ) = (a+b) / x
*/
property ("distributive") = Prop.forAll { (a: Int, b: Int, c: Int) =>
(c == 0 || a*1.0/c + b*1.0/c == (a+b) * 1.0 / c) }
}
object Runner {
def main (args: Array[String]) {
DoubleSpecification.check
println ("...done")
}
}
To run it, you need Scala and the ScalaCheck jar. I used version 2.8 (needless to say, your classpath will vary):
scalac -cp /opt/scala/lib/scalacheck.jar:. DoubleSpecification.scala
scala -cp /opt/scala/lib/scalacheck.jar:. DoubleSpecification
! Doubles.distributive: Falsified after 6 passed tests.
> ARG_0: 28 (orig arg: 1030341)
> ARG_1: 9 (orig arg: 2147483647)
> ARG_2: 5
Scalacheck takes some random values (orig args) and tries to simplify these, if the test fails, in order to find simple examples.

Related

Generating random doubles in Java between 0 and 1 inclusively or [0..1] [duplicate]

We can easily get random floating point numbers within a desired range [X,Y) (note that X is inclusive and Y is exclusive) with the function listed below since Math.random() (and most pseudorandom number generators, AFAIK) produce numbers in [0,1):
function randomInRange(min, max) {
return Math.random() * (max-min) + min;
}
// Notice that we can get "min" exactly but never "max".
How can we get a random number in a desired range inclusive to both bounds, i.e. [X,Y]?
I suppose we could "increment" our value from Math.random() (or equivalent) by "rolling" the bits of an IEEE 754 double-precision floating point number to put the maximum possible value at 1.0 exactly, but that seems like a pain to get right, especially in languages poorly suited for bit manipulation. Is there an easier way?
(As an aside, why do random number generators produce numbers in [0,1) instead of [0,1]?)
[Edit] Please note that I have no need for this and I am fully aware that the distinction is pedantic. Just being curious and hoping for some interesting answers. Feel free to vote to close if this question is inappropriate.
I believe there is a much better solution, but this one should work :)
function randomInRange(min, max) {
return Math.random() < 0.5 ? ((1-Math.random()) * (max-min) + min) : (Math.random() * (max-min) + min);
}
First off, there's a problem in your code: Try randomInRange(0,5e-324) or just enter Math.random()*5e-324 in your browser's JavaScript console.
Even without overflow/underflow/denorms, it's difficult to reason reliably about floating point ops. After a bit of digging, I can find a counterexample:
>>> a=1.0
>>> b=2**-54
>>> rand=a-2*b
>>> a
1.0
>>> b
5.551115123125783e-17
>>> rand
0.9999999999999999
>>> (a-b)*rand+b
1.0
It's easier to explain why this happens with a=2^53 and b=0.5: 2^53-1 is the next representable number down. The default rounding mode ("round to nearest even") rounds 2^53-0.5 up (because 2^53 is "even" [LSB = 0] and 2^53-1 is "odd" [LSB = 1]), so you subtract b and get 2^53, multiply to get 2^53-1, and add b to get 2^53 again.
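The counterexample from the interactive session can be reproduced in Java, which uses the same IEEE 754 double arithmetic. This is only a sketch: the class name `InclusiveScalingCounterexample` and the helper `scale` are made up for the illustration, and `scale` is simply the half-open formula from the question.

```java
class InclusiveScalingCounterexample {
    /** The half-open formula from the question: rand * (max - min) + min. */
    static double scale(double rand, double min, double max) {
        return rand * (max - min) + min;
    }

    public static void main(String[] args) {
        double max = 1.0;            // a in the session above
        double min = 0x1.0p-54;      // b = 2^-54, written as a hex float literal
        double rand = max - 2 * min; // 1 - 2^-53, the largest double below 1.0

        System.out.println(rand < 1.0);                   // true
        System.out.println(scale(rand, min, max) == max); // true: the "exclusive" bound is hit
    }
}
```

So even with rand strictly below 1.0, rounding in the subtraction and the final addition can push the result onto max.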
To answer your second question: Because the underlying PRNG almost always generates a random number in the interval [0,2^n-1], i.e. it generates random bits. It's very easy to pick a suitable n (the bits of precision in your floating point representation) and divide by 2^n and get a predictable distribution. Note that there are some numbers in [0,1) that you will never generate using this method (anything in (0,2^-53) with IEEE doubles).
It also means that you can do a[Math.floor(Math.random()*a.length)] and not worry about overflow (homework: In IEEE binary floating point, prove that b < 1 implies a*b < a for positive integer a).
The other nice thing is that you can think of each random output x as representing an interval [x,x+2^-53) (the not-so-nice thing is that the average value returned is slightly less than 0.5). If you return in [0,1], do you return the endpoints with the same probability as everything else, or should they only have half the probability because they only represent half the interval as everything else?
To answer the simpler question of returning a number in [0,1], the method below effectively generates an integer in [0,2^n] (by generating an integer in [0,2^(n+1)-1] and throwing it away if it's too big) and divides by 2^n:
function randominclusive() {
// Generate a random "top bit". Is it set?
while (Math.random() >= 0.5) {
// Generate the rest of the random bits. Are they zero?
// If so, then we've generated 2^n, and dividing by 2^n gives us 1.
if (Math.random() == 0) { return 1.0; }
// If not, generate a new random number.
}
// If the top bits are not set, just divide by 2^n.
return Math.random();
}
The comments imply base 2, but I think the assumptions are thus:
0 and 1 should be returned equiprobably (i.e. the Math.random() doesn't make use of the closer spacing of floating point numbers near 0).
Math.random() >= 0.5 with probability 1/2 (should be true for even bases)
The underlying PRNG is good enough that we can do this.
Note that random numbers are always generated in pairs: the one in the while (a) is always followed by either the one in the if or the one at the end (b). It's fairly easy to verify that it's sensible by considering a PRNG that returns either 0 or 0.5:
a=0   b=0  : return 0
a=0   b=0.5: return 0.5
a=0.5 b=0  : return 1
a=0.5 b=0.5: loop
Problems:
The assumptions might not be true. In particular, a common PRNG is to take the top 32 bits of a 48-bit LCG (Firefox and Java do this). To generate a double, you take 53 bits from two consecutive outputs and divide by 2^53, but some outputs are impossible (you can't generate 2^53 outputs with 48 bits of state!). I suspect some of them never return 0 (assuming single-threaded access), but I don't feel like checking Java's implementation right now.
Math.random() is called twice for every potential output as a consequence of needing to get the extra bit, but this places more constraints on the PRNG (requiring us to reason about four consecutive outputs of the above LCG).
Math.random() is called on average about four times per output. A bit slow.
It throws away results deterministically (assuming single-threaded access), so is pretty much guaranteed to reduce the output space.
My solution to this problem has always been to use the following in place of your upper bound.
Math.nextAfter(upperBound,upperBound+1)
or
upperBound + Double.MIN_VALUE
So your code would look like this:
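double myRandomNum = Math.random() * (Math.nextAfter(upperBound, upperBound + 1) - lowerBound) + lowerBound;
or
double myRandomNum = Math.random() * ((upperBound + Double.MIN_VALUE) - lowerBound) + lowerBound;
This nudges your upper bound up by the smallest possible step so that the upper bound itself is included as a possibility in the random calculation. Note that the second form only has an effect when upperBound is extremely close to zero; for ordinary magnitudes, adding Double.MIN_VALUE rounds straight back to upperBound, so Math.nextAfter is the safer choice.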
This is a good way to go about it because it does not skew the probabilities in favor of any one number.
The only case this wouldn't work is where your upper bound is equal to Double.MAX_VALUE
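A variation on the same idea, sketched in Java with the standard library: `ThreadLocalRandom.nextDouble(origin, bound)` is documented to return a value in [origin, bound), so passing `Math.nextUp(max)` as the bound makes max itself a permitted result. The class and method names below are hypothetical, and the sketch assumes finite bounds with min <= max < Double.MAX_VALUE. It makes no claim about how likely max is or about perfect uniformity over representable doubles.

```java
import java.util.concurrent.ThreadLocalRandom;

class InclusiveRandom {
    /**
     * Returns a pseudorandom value in the closed interval [min, max].
     * Assumes min <= max, both finite, and max < Double.MAX_VALUE.
     */
    static double randomInclusive(double min, double max) {
        // nextUp(max) is the smallest double above max, so the half-open
        // range [min, nextUp(max)) contains exactly the doubles in [min, max].
        return ThreadLocalRandom.current().nextDouble(min, Math.nextUp(max));
    }

    public static void main(String[] args) {
        System.out.println(randomInclusive(3.0, 8.0));
    }
}
```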
Just pick your half-open interval slightly bigger, so that your chosen closed interval is a subset. Then, keep generating the random variable until it lands in said closed interval.
Example: If you want something uniform in [3,8], then repeatedly regenerate a uniform random variable in [3,9) until it happens to land in [3,8].
function randomInRangeInclusive(min,max) {
var ret;
for (;;) {
ret = min + ( Math.random() * (max-min) * 1.1 );
if ( ret <= max ) { break; }
}
return ret;
}
Note: The number of times you generate the half-open R.V. is random and potentially infinite, but you can make the expected number of calls as close to 1 as you like, and I don't think there exists a solution that doesn't potentially call infinitely many times.
Given the "extremely large" number of values between 0 and 1, does it really matter? The chances of actually hitting 1 are tiny, so it's very unlikely to make a significant difference to anything you're doing.
What would be a situation where you would NEED a floating point value to be inclusive of the upper bound? For integers I understand, but for a float, the difference between inclusive and exclusive is what, like 1.0e-32?
Think of it this way. If you imagine that floating-point numbers have arbitrary precision, the chances of getting exactly min are zero. So are the chances of getting max. I'll let you draw your own conclusion on that.
This 'problem' is equivalent to getting a random point on the real line between 0 and 1. There is no 'inclusive' and 'exclusive'.
The question is akin to asking, what is the floating point number right before 1.0? There is such a floating point number, but it is one in 2^24 (for an IEEE float) or one in 2^53 (for a double).
The difference is negligible in practice.
private static double random(double min, double max) {
final double r = Math.random();
return (r >= 0.5d ? 1.5d - r : r) * (max - min) + min;
}
Math.round() will help to include the bound value. If you have 0 <= value < 1 (1 is exclusive), then Math.round(value * 100) / 100 returns 0 <= value <= 1 (1 is inclusive). Note that the value now has only 2 decimal places. If you want 3, try Math.round(value * 1000) / 1000, and so on. The following function takes one more parameter, the number of decimal places, which I call precision:
function randomInRange(min, max, precision) {
return Math.round(Math.random() * Math.pow(10, precision)) /
Math.pow(10, precision) * (max - min) + min;
}
How about this?
function randomInRange(min, max){
var n = Math.random() * (max - min + 0.1) + min;
return n > max ? randomInRange(min, max) : n;
}
If you get stack overflow on this I'll buy you a present.
--
EDIT: never mind about the present. I got wild with:
randomInRange(0, 0.0000000000000000001)
and got stack overflow.
I am fairly inexperienced, so I am also looking for solutions.
This is my rough thought:
Random number generators produce numbers in [0,1) instead of [0,1] because [0,1) is a unit-length interval that can be followed by [1,2) and so on without overlapping.
For random[x, y], you can do this:
float randomInclusive(x, y){
float MIN = smallest_value_above_zero;
float result;
do{
result = random(x, (y + MIN));
} while(result > y);
return result;
}
Here all values in [x, y] have the same probability of being picked, and you can now reach y.
Generating a "uniform" floating-point number in a range is non-trivial. For example, the common practices of multiplying or dividing a random integer by a constant, or of scaling a "uniform" floating-point number to the desired range, have the disadvantage that they cannot produce every number the floating-point format can represent in the range, and they may have subtle bias problems. These problems are discussed in detail in "Generating Random Floating-Point Numbers by Dividing Integers: a Case Study" by F. Goualard.
Just to show how non-trivial the problem is, the following pseudocode generates a random "uniform-behaving" floating-point number in the closed interval [lo, hi], where the number is of the form FPSign * FPSignificand * FPRADIX^FPExponent. The pseudocode below was reproduced from my section on floating-point number generation. Note that it works for any precision and any base (including binary and decimal) of floating-point numbers.
METHOD RNDRANGE(lo, hi)
losgn = FPSign(lo)
hisgn = FPSign(hi)
loexp = FPExponent(lo)
hiexp = FPExponent(hi)
losig = FPSignificand(lo)
hisig = FPSignificand(hi)
if lo > hi: return error
if losgn == 1 and hisgn == -1: return error
if losgn == -1 and hisgn == 1
// Straddles negative and positive ranges
// NOTE: Changes negative zero to positive
mabs = max(abs(lo),abs(hi))
while true
ret=RNDRANGE(0, mabs)
neg=RNDINT(1)
if neg==0: ret=-ret
if ret>=lo and ret<=hi: return ret
end
end
if lo == hi: return lo
if losgn == -1
// Negative range
return -RNDRANGE(abs(lo), abs(hi))
end
// Positive range
expdiff=hiexp-loexp
if loexp==hiexp
// Exponents are the same
// NOTE: Automatically handles
// subnormals
s=RNDINTRANGE(losig, hisig)
return s*1.0*pow(FPRADIX, loexp)
end
while true
ex=hiexp
while ex>MINEXP
v=RNDINTEXC(FPRADIX)
if v==0: ex=ex-1
else: break
end
s=0
if ex==MINEXP
// Has FPPRECISION or fewer digits
// and so can be normal or subnormal
s=RNDINTEXC(pow(FPRADIX,FPPRECISION))
else if FPRADIX != 2
// Has FPPRECISION digits
s=RNDINTEXCRANGE(
pow(FPRADIX,FPPRECISION-1),
pow(FPRADIX,FPPRECISION))
else
// Has FPPRECISION digits (bits), the highest
// of which is always 1 because it's the
// only nonzero bit
sm=pow(FPRADIX,FPPRECISION-1)
s=RNDINTEXC(sm)+sm
end
ret=s*1.0*pow(FPRADIX, ex)
if ret>=lo and ret<=hi: return ret
end
END METHOD

I need any number + a specific number in java

So my problem lies here:
if (foo2/4 == Any Number +0.25){
jahres_code = zwischenergebnis2%7 ;
}
I constructed that method so that only the fractional part (the digits after the decimal point) matters, and I don't know how to write the condition so that it accepts any whole number.
So you want the if condition to be true when foo / 4 has the form x.25. Do this:
if (((foo / 4d) - (foo / 4)) == 0.25)
What does this do? foo is an int, so dividing it by the int 4 gives an int: 17 / 4 yields 4. Dividing it by the double 4 (written 4d) gives 4.25. Subtracting the whole-number quotient from the double quotient leaves the fractional part, and if that is 0.25, the condition is true.
Note: you can do this with double, but not with float, since the behavior is different due to its representation in memory.
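For an int foo, the same condition can also be written without any floating point: the fractional part of foo / 4 is .25 exactly when the remainder of the integer division is 1. A small sketch, assuming foo is an int as in the answer above (the class and method names are invented for the example):

```java
class QuarterCheck {
    /** True when foo / 4 has a fractional part of exactly .25, e.g. 17 / 4 = 4.25. */
    static boolean hasQuarterFraction(int foo) {
        return foo % 4 == 1;
    }

    /** The floating-point form of the check, kept for comparison. */
    static boolean hasQuarterFractionViaDouble(int foo) {
        return ((foo / 4d) - (foo / 4)) == 0.25;
    }

    public static void main(String[] args) {
        System.out.println(hasQuarterFraction(17)); // true  (4.25)
        System.out.println(hasQuarterFraction(18)); // false (4.5)
    }
}
```

Both forms reject negative values such as -5, because Java's % and integer division truncate toward zero, leaving -0.25 rather than 0.25.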

Using bitwise operator to divide by 0 (Simulation of division by 0) [closed]

We know that we can use bitwise operators to divide any two numbers. For example:
int result = 10 >> 1; // result would be 5, as it's like dividing 10 by 2^1
Is there any chance we can divide a number by 0 using bits manipulation?
Edit 1: If I rephrase my question, I want to actually divide a number by zero and break my machine. How do I do that?
Edit 2: Let's forget about Java for a moment. Is it feasible for a machine to divide a number by 0 regardless of the programming language used?
Edit 3: As it's practically impossible to do this, is there a way we can simulate this using a really small number that approaches 0?
Another edit: Some people mentioned that CPU hardware prevents division by 0. I agree, there won't be a direct way to do it. Let's see this code for example:
i = 1;
while(0 * i != 10){
i++;
}
Let's assume that there is no cap on the maximum value of i. In this case there would be no compiler error, nor would the CPU resist. Now, I want my machine to find the number that, when multiplied by 0, gives me a result (which is obviously never going to happen), or die trying.
So there is a way to do this. How can I achieve it by directly manipulating bits?
Final Edit: How to perform binary division in Java without using bitwise operators? (I'm sorry, it purely contradicts the title).
Note: I've tried simulating division by 0 and posted my answer. However, I'm looking for a faster way of doing it.
If what you want is a division method faster than division by repeated subtraction (which you posted), and that will run indefinitely when you try to divide by zero, you can implement your own version of the Goldschmidt division, and not throw an error when the divisor is zero.
The algorithm works like this:
1. Create an estimate for the factor f
2. Multiply both the dividend and the divisor by f
3. If the divisor is close enough to 1
Return the dividend
4. Else
Go back to step 1
Normally, we would need to scale down the dividend and the divisor before starting, so that 0 < divisor < 1 is satisfied. In this case, since we are going to divide by zero, there's no need for this step. We also need to choose an arbitrary precision beyond which we consider the result good enough.
The code, with no check for divisor == 0, would be like this:
static double goldschmidt(double dividend, double divisor) {
double epsilon = 0.0000001;
while (Math.abs(1.0 - divisor) > epsilon) {
double f = 2.0 - divisor;
dividend *= f;
divisor *= f;
}
return dividend;
}
This is much faster than the division by repeated subtraction method, since it converges to the result quadratically instead of linearly. When dividing by zero, it would not really matter, since both methods won't converge. But if you try to divide by a small number, such as 10^(-9), you can clearly see the difference.
If you don't want the code to run indefinitely, but to return Infinity when dividing by zero, you can modify it to stop when dividend reaches Infinity:
static double goldschmidt(double dividend, double divisor) {
double epsilon = 0.0000001;
while (Math.abs(1.0 - divisor) > epsilon && !Double.isInfinite(dividend)) {
double f = 2.0 - divisor;
dividend *= f;
divisor *= f;
}
return dividend;
}
If the starting values for dividend and divisor are such that dividend >= 1.0 and divisor == 0.0, you will get Infinity as a result after, at most, 2^10 iterations. That's because the worst case is when dividend == 1 and you need to multiply it by two (f = 2.0 - 0.0) 1024 times to get to 2^1024, which is greater than Double.MAX_VALUE.
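The iteration count can be checked with a short sketch. With divisor == 0.0 the factor f is always 2.0, so the loop reduces to repeated doubling. The class and method names here are hypothetical, and the helper assumes a finite, positive dividend (a dividend of 0.0 or NaN would never become infinite and the loop would not terminate).

```java
class GoldschmidtZeroDemo {
    /**
     * Counts how many doublings it takes for the dividend to overflow to
     * Infinity, mirroring the Goldschmidt loop when divisor == 0.0.
     * Assumes dividend is finite and greater than zero.
     */
    static int doublingsUntilInfinity(double dividend) {
        int iterations = 0;
        while (!Double.isInfinite(dividend)) {
            dividend *= 2.0; // f = 2.0 - 0.0
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(doublingsUntilInfinity(1.0)); // 1024, the worst case for dividend >= 1.0
        System.out.println(doublingsUntilInfinity(2.0)); // 1023
    }
}
```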
The Goldschmidt division was implemented in AMD Athlon CPUs. If you want to read about some lower level details, you can check this article:
Floating Point Division and Square Root Algorithms and Implementation in the AMD-K7™ Microprocessor.
Edit:
Addressing your comments:
Note that the code for the Restoring Division method you posted iterates 2048 (2^11) times. I lowered the value of n in your code to 1024, so we could compare both methods doing the same number of iterations.
I ran both implementations 100000 times with dividend == 1, which is the worst case for Goldschmidt, and measured the running time like this:
long begin = System.nanoTime();
for (int i = 0; i < 100000; i++) {
goldschmidt(1.0, 0.0); // changed this for restoringDivision(1) to test your code
}
long end = System.nanoTime();
System.out.println(TimeUnit.NANOSECONDS.toMillis(end - begin) + "ms");
The running time was ~290ms for Goldschmidt division and ~23000ms (23 seconds) for your code. So this implementation was about 80x faster in this test. This is expected, since in one case we are doing double multiplications and in the other we are working with BigInteger.
The advantage of your implementation is that, since you are using BigInteger, you can make your result as large as BigInteger supports, while the result here is limited by Double.MAX_VALUE.
In practice, when dividing by zero, the Goldschmidt division is doubling the dividend, which is equivalent to a shift left, at each iteration, until it reaches the maximum possible value. So the equivalent using BigInteger would be:
static BigInteger divideByZero(int dividend) {
return BigInteger.valueOf(dividend)
.shiftLeft(Integer.MAX_VALUE - 1 - ceilLog2(dividend));
}
static int ceilLog2(int n) {
return (int) Math.ceil(Math.log(n) / Math.log(2));
}
The function ceilLog2() is necessary, so that the shiftLeft() will not cause an overflow. Depending on how much memory you have allocated, this will probably result in a java.lang.OutOfMemoryError: Java heap space exception. So there is a compromise to be made here:
You can get the division simulation to run really fast, but with a result upper bound of Double.MAX_VALUE,
or
You can get the result to be as big as 2^(Integer.MAX_VALUE - 1), but it would probably take too much memory and time to get to that limit.
Edit 2:
Addressing your new comments:
Please note that no division is happening in your updated code. It's just trying to find the biggest possible BigInteger.
First, let us show that the Goldschmidt division degenerates into a shift left when divisor == 0:
static double goldschmidt(double dividend, double divisor) {
double epsilon = 0.0000001;
while (Math.abs(1.0 - 0.0) > 0.0000001 && !Double.isInfinite(dividend)) {
double f = 2.0 - 0.0;
dividend *= f;
divisor = 0.0 * f;
}
return dividend;
}
The factor f will always be equal to 2.0 and the first while condition will always be true. So if we eliminate the redundancies:
static double goldschmidt(double dividend, 0) {
while (!Double.isInfinite(dividend)) {
dividend *= 2.0;
}
return dividend;
}
Assuming dividend is an Integer, we can do the same multiplication using a shift left:
static int goldschmidt(int dividend) {
while (...) {
dividend = dividend << 1;
}
return dividend;
}
If the maximum value we can reach is 2^n, we need to loop n times. When dividend == 1, this is equivalent to:
static int goldschmidt(int dividend) {
return 1 << n;
}
When the dividend > 1, we need to subtract ceil(log2(dividend)) to prevent an overflow:
static int goldschmidt(int dividend) {
return dividend << (n - ceil(log2(dividend)));
}
Thus showing that the Goldschmidt division is equivalent to a shift left if divisor == 0.
However, shifting the bits to the left would pad bits on the right with 0. Try running this with a small dividend and left shift it (once or twice to check the results). This thing will never get to 2^(Integer.MAX_VALUE - 1).
Now that we've seen that a shift left by n is equivalent to a multiplication by 2^n, let's see how the BigInteger version works. Consider the following examples that show we will get to 2^(Integer.MAX_VALUE - 1) if there is enough memory available and the dividend is a power of 2:
For dividend = 1
BigInteger.valueOf(dividend).shiftLeft(Integer.MAX_VALUE - 1 - ceilLog2(dividend))
= BigInteger.valueOf(1).shiftLeft(Integer.MAX_VALUE - 1 - 0)
= 1 * 2^(Integer.MAX_VALUE - 1)
= 2^(Integer.MAX_VALUE - 1)
For dividend = 1024
BigInteger.valueOf(dividend).shiftLeft(Integer.MAX_VALUE - 1 - ceilLog2(dividend))
= BigInteger.valueOf(1024).shiftLeft(Integer.MAX_VALUE - 1 - 10)
= 1024 * 2^(Integer.MAX_VALUE - 1 - 10)
= 2^10 * 2^(Integer.MAX_VALUE - 1 - 10)
= 2^(Integer.MAX_VALUE - 1)
If dividend is not a power of 2, we will get as close as we can to 2^(Integer.MAX_VALUE - 1) by repeatedly doubling the dividend.
Your requirement is impossible.
Division by 0 is mathematically undefined. The concept just doesn't exist, so there is no way to simulate it.
If you were actually trying to take a limit (divide by 0+ or 0-), there is still no way to do it using bitwise operators, as they only allow you to divide by a power of two.
Here is an example using a bitwise operation to divide by a power of 2:
10 >> 1 = 5
Looking at the comments you posted, if what you want is simply to exit your program when a user tries to divide by 0, you can simply validate it:
if (divisor == 0)
System.exit(/*Enter here the exit code*/);
That way you will avoid the ArithmeticException.
After exchanging a couple of comments with you, it seems like what you are trying to do is crash the operating system by dividing by 0.
Unfortunately for you, as far as I know, any language that can be written on a computer is validated enough to handle division by 0.
Just think of a simple $1 calculator: try to divide by 0 and it won't even crash, it will simply show an error message. This is probably validated at the processor level anyway.
Edit
After multiple edits/comments to your question, it seems like you are trying to obtain Infinity by dividing by a 0+ or 0- that is very close to 0.
You can achieve this with double/float division.
System.out.println(1.0f / 0.0f); // prints Infinity
System.out.println(1.0f / -0.0f); // prints -Infinity
System.out.println(1.0d / 0.0d); // prints Infinity
System.out.println(1.0d / -0.0d); // prints -Infinity
Note that 0.0 and -0.0 here are exact (signed) zeros in IEEE 754, not merely values close to zero; floating-point division by them is defined to yield an infinity instead of throwing an exception.
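To make the contrast between integer and floating-point division concrete, here is a small Java sketch (class and method names are made up for the illustration). Integer division by zero throws ArithmeticException, while double division by zero follows IEEE 754 and produces an infinity or NaN.

```java
class DivisionByZeroDemo {
    /** Returns true if dividing the int a by zero throws ArithmeticException. */
    static boolean intDivisionThrows(int a) {
        try {
            int zero = 0;
            int unused = a / zero; // integer division by zero always throws
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    /** Plain double division; never throws. */
    static double divide(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        System.out.println(intDivisionThrows(1)); // true
        System.out.println(divide(1.0, 0.0));     // Infinity
        System.out.println(divide(1.0, -0.0));    // -Infinity
        System.out.println(divide(0.0, 0.0));     // NaN
    }
}
```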
No, there isn't, since you can only divide by a power of 2 using right shift.
One way to simulate division of unsigned integers (irrespective of divisor used) is by division by repeated subtraction:
BigInteger result = new BigInteger("0");
int divisor = 0;
int dividend = 2;
while(dividend >= divisor){
dividend = dividend - divisor;
result = result.add(BigInteger.ONE);
}
A second way to do this is the restoring division algorithm (thanks #harold), which is much faster than the first one:
int num = 10;
BigInteger den = new BigInteger("0");
BigInteger p = new BigInteger(new Integer(num).toString());
int n = 2048; //Can be changed to find the biggest possible number (i.e. upto 2^2147483647 - 1). Currently it shows 2^2048 - 1 as output
den = den.shiftLeft(n);
BigInteger q = new BigInteger("0");
for(int i = n; i > 0; i -= 1){
q = q.shiftLeft(1);
p = p.multiply(new BigInteger("2"));
p = p.subtract(den);
if(p.compareTo(new BigInteger("0")) == 1
|| p.compareTo(new BigInteger("0")) == 0){
q = q.add(new BigInteger("1"));
} else {
p = p.add(den);
}
}
System.out.println(q);
As others have indicated, you cannot mathematically divide by 0.
However if you want methods to divide by 0, there are some constants in Double you could use. For example you could have a method
public static double divide(double a, double b){
return b == 0 ? Double.NaN : a/b;
}
or
public static double posLimitDivide(double a, double b){
if(a == 0 && b == 0)
return Double.NaN;
return b == 0 ? (a > 0 ? Double.POSITIVE_INFINITY : Double.NEGATIVE_INFINITY) : a/b;
}
Which would return the limit of a/x where x approaches +b.
These should be ok, as long as you account for it in whatever methods use them. And by OK I mean bad, and could cause indeterminate behavior later if you're not careful. But it is a clear way to indicate the result with an actual value rather than an exception.

Why does changing the sum order returns a different result?

Why does changing the sum order returns a different result?
23.53 + 5.88 + 17.64 = 47.05
23.53 + 17.64 + 5.88 = 47.050000000000004
Both Java and JavaScript return the same results.
I understand that, due to the way floating point numbers are represented in binary, some rational numbers (like 1/3 - 0.333333...) cannot be represented precisely.
Why does simply changing the order of the elements affect the result?
Maybe this question is stupid, but why does simply changing the order of the elements affects the result?
It will change the points at which the values are rounded, based on their magnitude. As an example of the kind of thing that we're seeing, let's pretend that instead of binary floating point, we were using a decimal floating point type with 4 significant digits, where each addition is performed at "infinite" precision and then rounded to the nearest representable number. Here are two sums:
1/3 + 2/3 + 2/3 = (0.3333 + 0.6667) + 0.6667
= 1.000 + 0.6667 (no rounding needed!)
= 1.667 (where 1.6667 is rounded to 1.667)
2/3 + 2/3 + 1/3 = (0.6667 + 0.6667) + 0.3333
= 1.333 + 0.3333 (where 1.3334 is rounded to 1.333)
= 1.666 (where 1.6663 is rounded to 1.666)
We don't even need non-integers for this to be a problem:
10000 + 1 - 10000 = (10000 + 1) - 10000
= 10000 - 10000 (where 10001 is rounded to 10000)
= 0
10000 - 10000 + 1 = (10000 - 10000) + 1
= 0 + 1
= 1
This demonstrates possibly more clearly that the important part is that we have a limited number of significant digits - not a limited number of decimal places. If we could always keep the same number of decimal places, then with addition and subtraction at least, we'd be fine (so long as the values didn't overflow). The problem is that when you get to bigger numbers, smaller information is lost - the 10001 being rounded to 10000 in this case. (This is an example of the problem that Eric Lippert noted in his answer.)
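The 4-significant-digit decimal arithmetic used in these examples can be imitated in Java with BigDecimal and a MathContext of precision 4. This is a sketch for experimenting with the sums above; the class name, the `sum` helper, and the choice of HALF_EVEN rounding are assumptions for the illustration.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

class FourDigitDecimal {
    /** Decimal arithmetic limited to 4 significant digits. */
    static final MathContext MC = new MathContext(4, RoundingMode.HALF_EVEN);

    /** Adds the terms left to right, rounding every intermediate result. */
    static BigDecimal sum(String... terms) {
        BigDecimal acc = BigDecimal.ZERO;
        for (String t : terms) {
            acc = acc.add(new BigDecimal(t), MC);
        }
        return acc;
    }

    public static void main(String[] args) {
        // Same terms, different order, different results.
        System.out.println(sum("0.3333", "0.6667", "0.6667"));
        System.out.println(sum("0.6667", "0.6667", "0.3333"));
        System.out.println(sum("10000", "1", "-10000"));
        System.out.println(sum("10000", "-10000", "1"));
    }
}
```

Comparing the results with compareTo avoids any dependence on how BigDecimal chooses to print its scale.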
It's important to note that the values on the first line of the right hand side are the same in all cases - so although it's important to understand that your decimal numbers (23.53, 5.88, 17.64) won't be represented exactly as double values, that's only a problem because of the problems shown above.
Here's what's going on in binary. As we know, some floating-point values cannot be represented exactly in binary, even if they can be represented exactly in decimal. These 3 numbers are just examples of that fact.
With this program I output the hexadecimal representations of each number and the results of each addition.
public class Main{
public static void main(String args[]) {
double x = 23.53; // Inexact representation
double y = 5.88; // Inexact representation
double z = 17.64; // Inexact representation
double s = 47.05; // What math tells us the sum should be; still inexact
printValueAndInHex(x);
printValueAndInHex(y);
printValueAndInHex(z);
printValueAndInHex(s);
System.out.println("--------");
double t1 = x + y;
printValueAndInHex(t1);
t1 = t1 + z;
printValueAndInHex(t1);
System.out.println("--------");
double t2 = x + z;
printValueAndInHex(t2);
t2 = t2 + y;
printValueAndInHex(t2);
}
private static void printValueAndInHex(double d)
{
System.out.println(Long.toHexString(Double.doubleToLongBits(d)) + ": " + d);
}
}
The printValueAndInHex method is just a hex-printer helper.
The output is as follows:
403787ae147ae148: 23.53
4017851eb851eb85: 5.88
4031a3d70a3d70a4: 17.64
4047866666666666: 47.05
--------
403d68f5c28f5c29: 29.41
4047866666666666: 47.05
--------
404495c28f5c28f6: 41.17
4047866666666667: 47.050000000000004
The first 4 lines are the hexadecimal representations of x, y, z, and s. In the IEEE 754 double format, bits 2-12 (counting from the most significant bit) hold the binary exponent, that is, the scale of the number. (The first bit is the sign bit, and the remaining 52 bits hold the mantissa.) The stored exponent is biased: the actual exponent is the stored binary number minus 1023.
The exponents for the first 4 numbers are extracted:
sign|exponent
403 => 0|100 0000 0011| => 1027 - 1023 = 4
401 => 0|100 0000 0001| => 1025 - 1023 = 2
403 => 0|100 0000 0011| => 1027 - 1023 = 4
404 => 0|100 0000 0100| => 1028 - 1023 = 5
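The extraction above can also be done in code. This is a minimal sketch, assuming normal (not subnormal, infinite, or NaN) doubles; the class and method names are made up for illustration.

```java
final class ExponentDemo {
    // Unbiased binary exponent, read directly from the IEEE 754 bit pattern.
    // Assumes a normal double (not subnormal, infinite, or NaN).
    static int exponentOf(double d) {
        long bits = Double.doubleToLongBits(d);
        int stored = (int) ((bits >>> 52) & 0x7FF); // the 11 exponent bits
        return stored - 1023;                        // remove the bias
    }

    public static void main(String[] args) {
        for (double d : new double[] {23.53, 5.88, 17.64, 47.05}) {
            // Math.getExponent is the library equivalent for normal values.
            System.out.println(d + " -> " + exponentOf(d)
                    + " (Math.getExponent: " + Math.getExponent(d) + ")");
        }
    }
}
```

For the four values in the table this yields the exponents 4, 2, 4, and 5.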
First set of additions
The second number (y) is of smaller magnitude. When adding these two numbers to get x + y, the last 2 bits of the second number (01) are shifted out of range and do not figure into the calculation.
The second addition adds x + y and z; these two numbers have the same scale.
Second set of additions
Here, x + z occurs first. They are of the same scale, but they yield a number that is higher up in scale:
404 => 0|100 0000 0100| => 1028 - 1023 = 5
The second addition adds x + z and y, and now 3 bits (101) are dropped from y to align the numbers. Here the result must be rounded up, because the sum is the next floating-point number up: 4047866666666666 for the first set of additions vs. 4047866666666667 for the second set of additions. That error is significant enough to show in the printout of the total.
In conclusion, be careful when performing mathematical operations on IEEE numbers. Some representations are inexact, and they become even more inexact when the scales are different. Add and subtract numbers of similar scale if you can.
Jon's answer is of course correct. In your case the error is no larger than the error you would accumulate doing any simple floating point operation. You've got a scenario where in one case you get zero error and in another you get a tiny error; that's not actually that interesting a scenario. A good question is: are there scenarios where changing the order of calculations goes from a tiny error to a (relatively) enormous error? The answer is unambiguously yes.
Consider for example:
x1 = (a - b) + (c - d) + (e - f) + (g - h);
vs
x2 = (a + c + e + g) - (b + d + f + h);
vs
x3 = a - b + c - d + e - f + g - h;
Obviously in exact arithmetic they would be the same. It is entertaining to try to find values for a, b, c, d, e, f, g, h such that the values of x1 and x2 and x3 differ by a large quantity. See if you can do so!
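As a starting point for that exercise, here is a small sketch with one hand-picked set of values (my own choice, not taken from the answer). It only separates x2 from the other two; finding values where all three differ is left open. The class and method names are hypothetical.

```java
final class OrderDemo {
    // Evaluates the three algebraically equivalent expressions from the text.
    static double[] evaluate(double a, double b, double c, double d,
                             double e, double f, double g, double h) {
        double x1 = (a - b) + (c - d) + (e - f) + (g - h);
        double x2 = (a + c + e + g) - (b + d + f + h);
        double x3 = a - b + c - d + e - f + g - h;
        return new double[] {x1, x2, x3};
    }

    public static void main(String[] args) {
        // The exact result is 3. Adding 1 to 1e20 is lost entirely, because
        // 1 is far smaller than the spacing between doubles near 1e20.
        double[] r = evaluate(1e20, 1e20, 1, 0, 1, 0, 1, 0);
        System.out.println("x1 = " + r[0]); // 3.0
        System.out.println("x2 = " + r[1]); // 0.0
        System.out.println("x3 = " + r[2]); // 3.0
    }
}
```

Here x2 is off by the full value of the correct answer, while the error in the original 23.53 example was a single unit in the last place.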
This actually covers much more than just Java and JavaScript, and would likely affect any programming language using floats or doubles.
In memory, floating-point numbers use a special format along the lines of IEEE 754 (the converter provides a much better explanation than I can).
Anyway, here's the float converter.
http://www.h-schmidt.net/FloatConverter/
The thing about the order of operations is the "fineness" of the operation.
Your first line yields 29.41 from the first two values, which gives us 2^4 as the exponent.
Your second line yields 41.17 which gives us 2^5 as the exponent.
We're losing a significant figure by increasing the exponent, which is likely to change the outcome.
Try ticking the last bit on the far right on and off for 41.17 and you can see that something as "insignificant" as 1/2^23 of the exponent would be enough to cause this floating point difference.
Edit: For those of you who remember significant figures, this would fall under that category. 10^4 + 4999 with a significant figure of 1 is going to be 10^4. In this case, the significant figure is much smaller, but we can see the results with the .00000000004 attached to it.
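The loss of "fineness" when the exponent goes up can be checked directly. The sketch below is an illustration of mine, not part of the answer: it measures the gap between a double and the next larger one. Note that it works with 64-bit doubles (52 mantissa bits), whereas the linked converter shows 32-bit floats (23 mantissa bits).

```java
final class UlpDemo {
    // Distance from d to the next larger representable double.
    static double spacingAt(double d) {
        return Math.nextUp(d) - d;
    }

    public static void main(String[] args) {
        double first = 23.53 + 5.88;   // about 29.41, exponent 2^4
        double second = 23.53 + 17.64; // about 41.17, exponent 2^5
        System.out.println("spacing near 29.41: " + spacingAt(first));
        System.out.println("spacing near 41.17: " + spacingAt(second));
        // Math.ulp reports the same spacing for positive finite values.
        System.out.println(spacingAt(second) == Math.ulp(second));
    }
}
```

The spacing near 41.17 is exactly twice the spacing near 29.41, which is the one bit of precision lost by moving up one binary exponent.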
Floating point numbers are represented using the IEEE 754 format, which provides a specific size of bits for the mantissa (significand). Unfortunately this gives you a specific number of 'fractional building blocks' to play with, and certain fractional values cannot be represented precisely.
What is happening in your case is that in the second case, the addition is probably running into some precision issue because of the order the additions are evaluated. I haven't calculated the values, but it could be for example that 23.53 + 17.64 cannot be precisely represented, while 23.53 + 5.88 can.
Unfortunately it is a known problem that you just have to deal with.
I believe it has to do with the order of evaluation. While the sum is naturally the same in a math world, in the binary world instead of A + B + C = D, it's
A + B = E
E + C = D(1)
So there's that secondary step where floating point numbers can get off.
When you change the order,
A + C = F
F + B = D(2)
To add a different angle to the other answers here, this SO answer shows that there are ways of doing floating-point math where all summation orders return exactly the same value at the bit level.
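The linked answer is not reproduced here, so the following is not necessarily the technique it describes. As one illustration of order-independent summation, this sketch (names are hypothetical) converts each double to a BigDecimal, which is an exact conversion, adds exactly, and rounds back to double only once at the end. It assumes finite inputs, since new BigDecimal(double) throws NumberFormatException for NaN and infinities.

```java
import java.math.BigDecimal;

final class ExactSumDemo {
    // Sums the values exactly and rounds to double a single time.
    static double exactSum(double... values) {
        BigDecimal total = BigDecimal.ZERO;
        for (double v : values) {
            // new BigDecimal(double) captures the double's exact binary value;
            // BigDecimal.add without a MathContext does not round.
            total = total.add(new BigDecimal(v));
        }
        return total.doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(exactSum(23.53, 5.88, 17.64));
        System.out.println(exactSum(23.53, 17.64, 5.88));
    }
}
```

Because the intermediate sums are exact, both orders produce the same double, at the cost of being much slower than plain + on doubles.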

How can I round manually?

I'd like to round manually, without the round() method.
That is, I want to tell my program: this is my number, and at this position I want you to round.
Let me give you some examples:
Input number: 144
Input rounding: 2
Output rounded number: 140
Input number: 123456
Input rounding: 3
Output rounded number: 123500
And as a little add-on, maybe also rounding after the decimal point:
Input number: 123.456
Input rounding: -1
Output rounded number: 123.460
I don't know how to start programming that...
Has anyone a clue how I can get started with that problem?
Thanks for helping me :)
I'd like to become a better programmer, so I don't want to use round(); I want to write my own so that I understand it better :)
A simple way to do it is:
Multiply the number by a power of ten (10^p, where p is the number of decimal places to keep)
Round it by any desired method
Divide the result by the same power of ten used in step 1
Let me show you an example:
You want to round the number 1234.567 to two decimal positions (the desired result is 1234.57).
x = 1234.567;
p = 2;
x = x * pow(10, p); // x = 123456.7
x = floor(x + 0.5); // x = floor(123456.7 + 0.5) = floor(123457.2) = 123457
x = x / pow(10,p); // x = 1234.57
return x;
Of course you can compact all these steps in one. I made it step-by-step to show you how it works. In a compact java form it would be something like:
public double roundItTheHardWay(double x, int p) {
    // Math.floor already returns a double, so no cast is needed
    return Math.floor(x * Math.pow(10, p) + 0.5) / Math.pow(10, p);
}
As for the integer positions, you can easily check that this also works (with p < 0).
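Here is a variant sketch of the same idea that can be run against both kinds of input. The name roundToPlaces is hypothetical. One design choice differs from the one-liner above and is my own assumption about what is preferable: for p < 0 it divides by 10^|p| rather than multiplying by 10^p, because a factor such as 0.01 has no exact double representation while 100.0 does. Like floor(x + 0.5) in general, it rounds exact halves toward positive infinity.

```java
final class RoundSketch {
    // p > 0: keep p decimal places.
    // p < 0: round at the |p|-th position to the left of the decimal point.
    static double roundToPlaces(double x, int p) {
        // Exactly representable for small non-negative integer exponents.
        double scale = Math.pow(10, Math.abs(p));
        if (p >= 0) {
            return Math.floor(x * scale + 0.5) / scale;
        }
        return Math.floor(x / scale + 0.5) * scale;
    }

    public static void main(String[] args) {
        System.out.println(roundToPlaces(1234.567, 2)); // 1234.57
        System.out.println(roundToPlaces(123456, -2));  // 123500.0
        System.out.println(roundToPlaces(144, -1));     // 140.0
    }
}
```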
Hope this helps
If you need some advice on how to start:
Write down, step by step, the calculations you need to do to get from 144, 2 --> 140.
Then replace your math with Java commands. That should be easy, but if you have a problem, just look here and here
public static int round (int input, int places) {
    int factor = (int)java.lang.Math.pow(10, places);
    return (input / factor) * factor;
}
Basically, what this does is divide the input by your factor, then multiply again. When dividing integers in languages like Java, the remainder of the division is dropped from the result.
edit: the code was faulty, fixed it. Also, the java.lang.Math.pow is so that you get 10 to the n-th power, where n is the value of places. In the OP's example, the number of places to consider is upped by one.
Re-edit: as pointed out in the comments, the above will give you the floor, that is, the result of rounding down. If you don't want to always round down, you must also keep the modulus in another variable. Like this:
int mod = input % factor;
If you want to always get the ceiling, that is, rounding up, check whether mod is zero. If it is, leave it at that. Otherwise, add factor to the result.
int ceil = (input / factor) * factor + (mod == 0 ? 0 : factor);
If you want to round to nearest, then get the floor if mod is smaller than factor / 2, or the ceiling otherwise.
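Putting the three variants side by side, a minimal sketch could look like this. It assumes a non-negative input and a positive factor (negative inputs behave differently with Java's / and %), and the method names are made up for illustration. The comparison uses mod * 2 < factor instead of mod < factor / 2 to avoid integer division of the factor.

```java
final class IntRounding {
    // Round down to a multiple of factor (integer division drops the remainder).
    static int roundDown(int input, int factor) {
        return (input / factor) * factor;
    }

    // Round up to a multiple of factor; already-exact multiples stay unchanged.
    static int roundUp(int input, int factor) {
        int mod = input % factor;
        return roundDown(input, factor) + (mod == 0 ? 0 : factor);
    }

    // Round to the nearest multiple of factor; ties go up.
    static int roundNearest(int input, int factor) {
        int mod = input % factor;
        return (mod * 2 < factor) ? roundDown(input, factor) : roundUp(input, factor);
    }

    public static void main(String[] args) {
        System.out.println(roundDown(144, 10));    // 140
        System.out.println(roundUp(144, 10));      // 150
        System.out.println(roundNearest(144, 10)); // 140
        System.out.println(roundNearest(145, 10)); // 150
    }
}
```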
Divide (for a positive "input rounding") or multiply (for a negative one) by 10 to the power of the "input rounding" minus 1; for the first example that is 144 / 10^(2 - 1) = 144 / 10. Take the remainder, which is the last digit (4), and determine whether it is greater than or equal to 5 (here it is less). Depending on that answer, either set the last digit to 0 or set it to 0 and add 10. Then multiply (or divide) back by the same power of ten. This should give you your value.
If this is for homework, the purpose is to teach you to think for yourself. I may have given you the answer, but you still need to write the code yourself.
Next time, you should write your own code and ask what is wrong with it.
For integers, one way would be to use a combination of the mod operator, which is the percent symbol %, and the divide operator. In your first example, you would compute 144 % 10, resulting in 4. And compute 144 / 10, which gives 14 (as an integer). You can compare the result of the mod operation to half of the denominator, to find out if you should round the 14 up to 15 or not (in this case not), and then multiply back by the denominator to get your answer.
In pseudocode, assuming n is the number to round and p is the power of 10 representing the position of the significant digits:
denom = power(10, p)
remainder = n % denom
quotient = n / denom
if (remainder < denom/2)
    return quotient * denom
else
    return (quotient + 1) * denom
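To tie the pseudocode back to the examples in the question, here is a Java sketch for non-negative integers. It assumes that the question's "input rounding" counts digit positions from the right (1 = ones, 2 = tens, and so on), so that the denominator is 10^(position - 1); that reading is inferred from the two integer examples and is not stated in the question. The names are hypothetical. The test remainder * 2 < denom is used so that the case denom = 1 leaves the number unchanged even with integer arithmetic.

```java
final class DigitRounding {
    // Rounds n to the nearest multiple of 10^(position - 1); ties go up.
    // Assumes n >= 0 and position >= 1.
    static long roundAtPosition(long n, int position) {
        long denom = 1;
        for (int i = 1; i < position; i++) {
            denom *= 10; // integer power of ten, no floating point involved
        }
        long remainder = n % denom;
        long quotient = n / denom;
        if (remainder * 2 < denom) {
            return quotient * denom;
        }
        return (quotient + 1) * denom;
    }

    public static void main(String[] args) {
        System.out.println(roundAtPosition(144, 2));    // 140
        System.out.println(roundAtPosition(123456, 3)); // 123500
    }
}
```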
