I have read the article "NUM00-J. Detect or prevent integer overflow" and the question "How does Java handle integer underflows and overflows and how would you check for it?".
As you can see, there are many solutions to prevent integer overflow when multiplying an integer by an integer.
But I wonder: is there any solution to prevent integer overflow when multiplying an integer by a float?
My current (silly) solution:
public static final int mulInt(int a, float b) {
double c = a * b;
return c > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int)c;
}
But it has a lot of problems:
It only produces the expected result when both parameters are small numbers.
When either parameter is a large number, the result is bound to be incorrect (I know this is partly because of the floating-point data type).
And if the result of the calculation is even greater than the maximum value of double, nothing stops it and a negative number is returned.
So, what is the real solution to this problem?
Your answer will be very helpful; I will appreciate it!
UPDATE: There is another question here, "How can I check if multiplying two numbers in Java will cause an overflow?", that is quite similar, BUT it is about multiplying an integer by an integer rather than by a float.
Below is a C approach that may shed light on a Java solution.
Perform the multiplication using double, not float, math before the assignment to gain the extra precision/range of double. Overflow is then not expected.
A compare like c > Integer.MAX_VALUE suffers from Integer.MAX_VALUE first being converted into a double, which may lose precision.*1 Consider what happens if the converted value is Integer.MAX_VALUE + 1.0: then if c is Integer.MAX_VALUE + 1.0, the code will attempt to return (int) (Integer.MAX_VALUE + 1.0) - not good. Better to use well-formed limits (negative ones too). In C, and maybe Java, floating-point conversion to int truncates the fraction, so special care is needed near the edges.
#include <limits.h>

#define INT_MAX_PLUS1_AS_DOUBLE ((INT_MAX/2 + 1)*2.0)

int mulInt(int a, float b) {
  // double c = a * b;
  double c = (double) a * b;
  // return c > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int)c;
  if (c < INT_MAX_PLUS1_AS_DOUBLE && c - INT_MIN > -1.0) {
    return (int) c;
  }
  if (c > 0) return INT_MAX;
  if (c < 0) return INT_MIN;
  return 0; // `b` was a NaN
}
c - INT_MIN > -1 is like c > INT_MIN - 1, but as INT_MIN is a negative power of 2, INT_MIN - 1 might not convert precisely to double. c - INT_MIN is expected to be exact near the edge cases.
*1 When int is 32-bit (or less) and double is 64-bit (with a 53-bit significand), this is not an issue. But it matters with wider integer types.
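For what it's worth, here is a direct Java translation of the C above (an untested sketch, not OP's code; the constant mirrors the C macro and the same edge-case caveats apply):

// (Integer.MAX_VALUE / 2 + 1) * 2.0 is 2147483648.0, i.e. Integer.MAX_VALUE + 1,
// computed without int overflow and exactly representable as a double.
private static final double INT_MAX_PLUS1_AS_DOUBLE = (Integer.MAX_VALUE / 2 + 1) * 2.0;

public static int mulInt(int a, float b) {
    double c = (double) a * b; // multiply in double, not float
    if (c < INT_MAX_PLUS1_AS_DOUBLE && c - Integer.MIN_VALUE > -1.0) {
        return (int) c; // in range; the cast truncates toward zero
    }
    if (c > 0) return Integer.MAX_VALUE;
    if (c < 0) return Integer.MIN_VALUE;
    return 0; // b was NaN
}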
We can easily get random floating point numbers within a desired range [X,Y) (note that X is inclusive and Y is exclusive) with the function listed below since Math.random() (and most pseudorandom number generators, AFAIK) produce numbers in [0,1):
function randomInRange(min, max) {
return Math.random() * (max-min) + min;
}
// Notice that we can get "min" exactly but never "max".
How can we get a random number in a desired range inclusive to both bounds, i.e. [X,Y]?
I suppose we could "increment" our value from Math.random() (or equivalent) by "rolling" the bits of an IEEE-754 double-precision floating-point number to put the maximum possible value at 1.0 exactly, but that seems like a pain to get right, especially in languages poorly suited for bit manipulation. Is there an easier way?
(As an aside, why do random number generators produce numbers in [0,1) instead of [0,1]?)
[Edit] Please note that I have no need for this and I am fully aware that the distinction is pedantic. Just being curious and hoping for some interesting answers. Feel free to vote to close if this question is inappropriate.
I believe there is a much better solution, but this one should work :)
function randomInRange(min, max) {
return Math.random() < 0.5 ? ((1-Math.random()) * (max-min) + min) : (Math.random() * (max-min) + min);
}
First off, there's a problem in your code: Try randomInRange(0,5e-324) or just enter Math.random()*5e-324 in your browser's JavaScript console.
Even without overflow/underflow/denorms, it's difficult to reason reliably about floating point ops. After a bit of digging, I can find a counterexample:
>>> a=1.0
>>> b=2**-54
>>> rand=a-2*b
>>> a
1.0
>>> b
5.551115123125783e-17
>>> rand
0.9999999999999999
>>> (a-b)*rand+b
1.0
It's easier to explain why this happens with a=2^53 and b=0.5: 2^53-1 is the next representable number down. The default rounding mode ("round to nearest even") rounds 2^53-0.5 up (because 2^53 is "even" [LSB = 0] and 2^53-1 is "odd" [LSB = 1]), so you subtract b and get 2^53, multiply to get 2^53-1, and add b to get 2^53 again.
To answer your second question: because the underlying PRNG almost always generates a random number in the interval [0, 2^n - 1], i.e. it generates random bits. It's very easy to pick a suitable n (the bits of precision in your floating-point representation), divide by 2^n, and get a predictable distribution. Note that there are some numbers in [0,1) that you will never generate using this method (anything in (0, 2^-53) with IEEE doubles).
It also means that you can do a[Math.floor(Math.random()*a.length)] and not worry about overflow (homework: In IEEE binary floating point, prove that b < 1 implies a*b < a for positive integer a).
The other nice thing is that you can think of each random output x as representing an interval [x, x+2^-53) (the not-so-nice thing is that the average value returned is slightly less than 0.5). If you return in [0,1], do you return the endpoints with the same probability as everything else, or should they only have half the probability because they only represent half the interval of everything else?
To answer the simpler question of returning a number in [0,1], the method below effectively generates an integer in [0, 2^n] (by generating an integer in [0, 2^(n+1) - 1] and throwing it away if it's too big) and divides by 2^n:
function randominclusive() {
// Generate a random "top bit". Is it set?
while (Math.random() >= 0.5) {
// Generate the rest of the random bits. Are they zero?
// If so, then we've generated 2^n, and dividing by 2^n gives us 1.
if (Math.random() == 0) { return 1.0; }
// If not, generate a new random number.
}
// If the top bits are not set, just divide by 2^n.
return Math.random();
}
The comments imply base 2, but I think the assumptions are thus:
0 and 1 should be returned equiprobably (i.e. the Math.random() doesn't make use of the closer spacing of floating point numbers near 0).
Math.random() >= 0.5 with probability 1/2 (should be true for even bases)
The underlying PRNG is good enough that we can do this.
Note that random numbers are always generated in pairs: the one in the while (a) is always followed by either the one in the if or the one at the end (b). It's fairly easy to verify that it's sensible by considering a PRNG that returns either 0 or 0.5:
a=0 b=0 : return 0
a=0 b=0.5: return 0.5
a=0.5 b=0 : return 1
a=0.5 b=0.5: loop
Problems:
The assumptions might not be true. In particular, a common PRNG is to take the top 32 bits of a 48-bit LCG (Firefox and Java do this). To generate a double, you take 53 bits from two consecutive outputs and divide by 2^53, but some outputs are impossible (you can't generate 2^53 outputs with 48 bits of state!). I suspect some of them never return 0 (assuming single-threaded access), but I don't feel like checking Java's implementation right now.
Math.random() is called twice for every potential output as a consequence of needing to get the extra bit, but this places more constraints on the PRNG (requiring us to reason about four consecutive outputs of the above LCG).
Math.random() is called on average about four times per output. A bit slow.
It throws away results deterministically (assuming single-threaded access), so is pretty much guaranteed to reduce the output space.
My solution to this problem has always been to use the following in place of your upper bound.
Math.nextAfter(upperBound,upperBound+1)
or
upperBound + Double.MIN_VALUE
So your code would look like this:
double myRandomNum = Math.random() * Math.nextAfter(upperBound,upperBound+1) + lowerBound;
or
double myRandomNum = Math.random() * (upperBound + Double.MIN_VALUE) + lowerBound;
This simply increments your upper bound by the smallest double (Double.MIN_VALUE) so that your upper bound will be included as a possibility in the random calculation.
This is a good way to go about it because it does not skew the probabilities in favor of any one number.
The only case where this wouldn't work is when your upper bound is equal to Double.MAX_VALUE.
Just pick your half-open interval slightly bigger, so that your chosen closed interval is a subset. Then, keep generating the random variable until it lands in said closed interval.
Example: If you want something uniform in [3,8], then repeatedly regenerate a uniform random variable in [3,9) until it happens to land in [3,8].
function randomInRangeInclusive(min,max) {
var ret;
for (;;) {
ret = min + ( Math.random() * (max-min) * 1.1 );
if ( ret <= max ) { break; }
}
return ret;
}
Note: The number of times you generate the half-open R.V. is random and potentially infinite, but you can make the expected number of calls as close to 1 as you like, and I don't think there exists a solution that doesn't potentially call infinitely many times.
Given the "extremely large" number of values between 0 and 1, does it really matter? The chances of actually hitting 1 are tiny, so it's very unlikely to make a significant difference to anything you're doing.
What would be a situation where you would NEED a floating-point value to be inclusive of the upper bound? For integers I understand, but for a float, the difference between inclusive and exclusive is, what, like 1.0e-32?
Think of it this way. If you imagine that floating-point numbers have arbitrary precision, the chances of getting exactly min are zero. So are the chances of getting max. I'll let you draw your own conclusion on that.
This 'problem' is equivalent to getting a random point on the real line between 0 and 1. There is no 'inclusive' and 'exclusive'.
The question is akin to asking, what is the floating point number right before 1.0? There is such a floating point number, but it is one in 2^24 (for an IEEE float) or one in 2^53 (for a double).
The difference is negligible in practice.
private static double random(double min, double max) {
    final double r = Math.random();
    // Fold the upper half of [0,1) back down: r in [0.5,1) maps to 1.5-r in (0.5,1.0],
    // so max can be returned exactly (at the cost of never returning the exact midpoint).
    return (r >= 0.5d ? 1.5d - r : r) * (max - min) + min;
}
Math.round() will help to include the bound value. If you have 0 <= value < 1 (1 is exclusive), then Math.round(value * 100) / 100 returns 0 <= value <= 1 (1 is inclusive). A note here is that the value now has only 2 digits in its decimal place. If you want 3 digits, try Math.round(value * 1000) / 1000 and so on. The following function has one more parameter, the number of digits in the decimal place, which I call precision:
function randomInRange(min, max, precision) {
return Math.round(Math.random() * Math.pow(10, precision)) /
Math.pow(10, precision) * (max - min) + min;
}
How about this?
function randomInRange(min, max){
var n = Math.random() * (max - min + 0.1) + min;
return n > max ? randomInRange(min, max) : n;
}
If you get stack overflow on this I'll buy you a present.
--
EDIT: never mind about the present. I got wild with:
randomInRange(0, 0.0000000000000000001)
and got stack overflow.
I am fairly inexperienced, so I am also looking for solutions myself.
This is my rough thought:
Random number generators produce numbers in [0,1) instead of [0,1] because [0,1) is a unit-length interval that can be followed by [1,2) and so on without overlapping.
For random[x, y], you can do this:
float randomInclusive(x, y){
    float MIN = smallest_value_above_zero;
    float result;
    do {
        result = random(x, (y + MIN));
    } while (result > y);
    return result;
}
Here all values in [x, y] have the same probability of being picked, and you can now reach y.
Generating a "uniform" floating-point number in a range is non-trivial. For example, the common practice of multiplying or dividing a random integer by a constant, or by scaling a "uniform" floating-point number to the desired range, have the disadvantage that not all numbers a floating-point format can represent in the range can be covered this way, and may have subtle bias problems. These problems are discussed in detail in "Generating Random Floating-Point Numbers by Dividing Integers: a Case Study" by F. Goualard.
Just to show how non-trivial the problem is, the following pseudocode generates a random "uniform-behaving" floating-point number in the closed interval [lo, hi], where the number is of the form FPSign * FPSignificand * FPRADIX^FPExponent. The pseudocode below was reproduced from my section on floating-point number generation. Note that it works for any precision and any base (including binary and decimal) of floating-point numbers.
METHOD RNDRANGE(lo, hi)
losgn = FPSign(lo)
hisgn = FPSign(hi)
loexp = FPExponent(lo)
hiexp = FPExponent(hi)
losig = FPSignificand(lo)
hisig = FPSignificand(hi)
if lo > hi: return error
if losgn == 1 and hisgn == -1: return error
if losgn == -1 and hisgn == 1
// Straddles negative and positive ranges
// NOTE: Changes negative zero to positive
mabs = max(abs(lo),abs(hi))
while true
ret=RNDRANGE(0, mabs)
neg=RNDINT(1)
if neg==0: ret=-ret
if ret>=lo and ret<=hi: return ret
end
end
if lo == hi: return lo
if losgn == -1
// Negative range
return -RNDRANGE(abs(lo), abs(hi))
end
// Positive range
expdiff=hiexp-loexp
if loexp==hiexp
// Exponents are the same
// NOTE: Automatically handles
// subnormals
s=RNDINTRANGE(losig, hisig)
return s*1.0*pow(FPRADIX, loexp)
end
while true
ex=hiexp
while ex>MINEXP
v=RNDINTEXC(FPRADIX)
if v==0: ex=ex-1
else: break
end
s=0
if ex==MINEXP
// Has FPPRECISION or fewer digits
// and so can be normal or subnormal
s=RNDINTEXC(pow(FPRADIX,FPPRECISION))
else if FPRADIX != 2
// Has FPPRECISION digits
s=RNDINTEXCRANGE(
pow(FPRADIX,FPPRECISION-1),
pow(FPRADIX,FPPRECISION))
else
// Has FPPRECISION digits (bits), the highest
// of which is always 1 because it's the
// only nonzero bit
sm=pow(FPRADIX,FPPRECISION-1)
s=RNDINTEXC(sm)+sm
end
ret=s*1.0*pow(FPRADIX, ex)
if ret>=lo and ret<=hi: return ret
end
END METHOD
Java 8 gave us Math.addExact() for integers but not decimals.
Is it possible for double and BigDecimal to overflow? Judging by Double.MAX_VALUE and How to get biggest BigDecimal value I'd say the answer is yes.
As such, why don't we have Math.addExact() for those types as well? What's the most maintainable way to check this ourselves?
double overflows to Infinity and -Infinity; it doesn't wrap around. BigDecimal doesn't overflow, period; it is only limited by the amount of memory in your computer. See: How to get biggest BigDecimal value
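A quick demo of the saturating behavior (a minimal sketch):

System.out.println(Double.MAX_VALUE + Double.MAX_VALUE);   // Infinity
System.out.println(-Double.MAX_VALUE - Double.MAX_VALUE);  // -Infinity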
The only difference between + and .addExact is that the latter attempts to detect whether overflow has occurred and throws an exception instead of wrapping. Here's the source code:
public static int addExact(int x, int y) {
int r = x + y;
// HD 2-12 Overflow iff both arguments have the opposite sign of the result
if (((x ^ r) & (y ^ r)) < 0) {
throw new ArithmeticException("integer overflow");
}
return r;
}
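To see the two behaviors side by side (a small demo; the wrap-around value follows from 32-bit two's complement):

int max = Integer.MAX_VALUE;
System.out.println(max + 1);               // -2147483648: plain + silently wraps
System.out.println(Math.addExact(max, 1)); // throws ArithmeticException: integer overflow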
If you want to check that an overflow has occurred, in one sense it's simpler to do it with double anyway because you can simply check for Double.POSITIVE_INFINITY or Double.NEGATIVE_INFINITY; in the case of int and long, it's a slightly more complicated matter because it isn't always one fixed value, but in another, these could be inputs (e.g. Infinity + 10 = Infinity and you probably don't want to throw an exception in this case).
For all these reasons (and we haven't even mentioned NaN yet), this is probably why such an addExact method doesn't exist in the JDK. Of course, you can always add your own implementation to a utility class in your own application.
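For instance, such a utility method might look like this (my sketch, not a JDK API; the name addExactDouble is made up, and it deliberately lets non-finite inputs pass through, per the point above):

// Hypothetical helper: throws only when finite inputs overflow to infinity.
static double addExactDouble(double x, double y) {
    double r = x + y;
    if (!Double.isFinite(r) && Double.isFinite(x) && Double.isFinite(y)) {
        throw new ArithmeticException("double overflow");
    }
    return r;
}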
The reason you do not need an addExact function for floating-point numbers is that, instead of wrapping around, they overflow to infinity.
Consequently you can very easily check at the end of the operation whether it overflowed or not. Since Double.POSITIVE_INFINITY + Double.NEGATIVE_INFINITY is NaN, you also have to check for NaN in the case of more complicated expressions.
This is not only faster but also easier to read. Instead of having Math.addExact(Math.addExact(x, y), z) to add 3 doubles together, you can instead write:
double result = x + y + z;
if (Double.isInfinite(result) || Double.isNaN(result)) throw new ArithmeticException("overflow");
BigDecimal on the other hand will indeed overflow and throw a corresponding exception in that case as well - this is very unlikely to ever happen in practice though.
For double, please check the other answers.
BigDecimal has the addExact() protection already built in. Many arithmetic operation methods (e.g. multiply) of BigDecimal contain a check on the scale of the result:
private int checkScale(long val) {
int asInt = (int)val;
if (asInt != val) {
asInt = val>Integer.MAX_VALUE ? Integer.MAX_VALUE : Integer.MIN_VALUE;
BigInteger b;
if (intCompact != 0 &&
((b = intVal) == null || b.signum() != 0))
throw new ArithmeticException(asInt>0 ? "Underflow":"Overflow");
}
return asInt;
}
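If I read that check correctly, you can trigger it like this (a hedged sketch; the exact exception message may vary between JDK versions):

import java.math.BigDecimal;
import java.math.BigInteger;

// 1 x 10^-Integer.MAX_VALUE: the most extreme scale a BigDecimal can carry
BigDecimal tiny = new BigDecimal(BigInteger.ONE, Integer.MAX_VALUE);
// multiply() adds the scales; 2 x Integer.MAX_VALUE no longer fits in an int,
// so checkScale() should throw an ArithmeticException ("Underflow")
tiny.multiply(tiny);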
Now I'm trying to convert some JS code into Java, and there is a problem:
In JS
46022*65535 = 3016051770
and
(46022*65535)|7867 = -1278910789
In Java
46022*65535 = -1278915526 (this overflows)
46022L*65535L = 3016051770L (this matches the JS result)
(46022*65535)|7867 = -1278910789 (this line and the one below are the problem)
(46022L*65535L)|7867L = 3016056507L
So, why does the | operator turn two positive numbers into a negative number?
What is the difference between Java and JS when using int and long in this operation?
And then, how do I write Java code that is compatible with JS in this situation?
Attention: I know the ranges of int and long; my problem is with |.
More problems:
According to https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Bitwise_Operators
& is also a 32-bit operation, so:
In JS
2996101485 & 65535 = 57709
In Java
2996101485 overflows an int, so I use double to store it and cast it to int when I need to do the AND.
double a = 2996101485L;
double b = 65535;
int c = (int) a & (int) b; // now c = 65535
But if I use long to cast:
long c = (long) a & (long) b; // now c = 57709
So simply casting a double to int can cause problems, and I want to know why.
I found the problem: 2996101485 can be represented in 32 bits in JS, while in Java it has to be a long. So I wrote functions to convert those operations; for example, | should use this Java function to give the same result as in JS:
private double doOR(double x, double y) {
    if (x > Integer.MAX_VALUE && x <= 1L << 32) {
        if (y > Integer.MAX_VALUE && y <= 1L << 32) {
            return (long) x | (long) y;
        } else {
            return (long) x | (int) y;
        }
    } else {
        return (int) x | (int) y;
    }
}
The problem is that while numbers in JavaScript have roughly 53-bit precision (they appear to be based on floating point doubles), the bitwise OR operates on only 32 bits.
Bitwise operators treat their operands as a sequence of 32 bits (zeroes and ones), rather than as decimal, hexadecimal, or octal numbers.
This means that when working with arithmetic, long will get you the JavaScript-like arithmetic (with numbers such as yours), since Java ints will overflow; but when working with bitwise operations, int will get you the JavaScript-like results, since then both platforms are operating on 32-bit numbers.
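For example, to reproduce the question's JS results in Java, do the arithmetic in long but truncate to int just before the bitwise op (a sketch of the rule above):

long product = 46022L * 65535L;   // 3016051770: matches JS arithmetic
int jsOr = (int) product | 7867;  // the (int) cast keeps the low 32 bits, like JS's ToInt32
System.out.println(jsOr);         // -1278910789, same as JS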
You should use long instead.
System.out.println(46022L*65535L); // = 3016051770
Java has ints and longs.
System.out.println(Integer.MAX_VALUE); // = 2147483647
System.out.println(Long.MAX_VALUE); // = 9,223,372,036,854,775,807
As for the language difference, I can only attribute it to the different precisions of the two languages. If you look at this question, you'll see the largest exact integer in JS is 9,007,199,254,740,992. That's a guess; it might be for another reason.
Specifically in Java, how can I determine if a double is an integer? To clarify, I want to know how I can determine that the double does not in fact contain any fractions or decimals.
I am concerned essentially with the nature of floating-point numbers. The methods I thought of (and the ones I found via Google) follow basically this format:
double d = 1.0;
if((int)d == d) {
//do stuff
}
else {
// ...
}
I'm certainly no expert on floating-point numbers and how they behave, but I am under the impression that, because the double stores only an approximation of the number, the if() conditional will be entered only some of the time (perhaps even a majority of the time). But I am looking for a method which is guaranteed to work 100% of the time, regardless of how the double value is stored in the system.
Is this possible? If so, how and why?
double can store an exact representation of certain values, such as small integers and (negative or positive) powers of two.
If it does indeed store an exact integer, then ((int)d == d) works fine. And indeed, for any 32-bit integer i, (int)((double)i) == i since a double can exactly represent it.
Note that for very large numbers (greater than about 2**52 in magnitude), a double will always appear to be an integer, as it will no longer be able to store any fractional part. This has implications if you are trying to cast to a Java long, for instance.
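For example (a small demo of that pitfall):

double d = 1e10;                   // an exact integer value, but larger than Integer.MAX_VALUE
System.out.println((int) d == d);  // false: the narrowing cast saturates at Integer.MAX_VALUE
System.out.println((long) d == d); // true: a long can hold it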
How about
if(d % 1 == 0)
This works because all integers are 0 modulo 1.
Edit: To all those who object to this on the grounds of it being slow, I profiled it, and found it to be about 3.5 times slower than casting. Unless this is in a tight loop, I'd say this is a preferable way of working it out, because it's extremely clear what you're testing, and doesn't require any thought about the semantics of integer casting.
I profiled it by running time on the execution of
class modulo {
public static void main(String[] args) {
long successes = 0;
for(double i = 0.0; i < Integer.MAX_VALUE; i+= 0.125) {
if(i % 1 == 0)
successes++;
}
System.out.println(successes);
}
}
VS
class cast {
public static void main(String[] args) {
long successes = 0;
for(double i = 0.0; i < Integer.MAX_VALUE; i+= 0.125) {
if((int)i == i)
successes++;
}
System.out.println(successes);
}
}
Both printed 2147483647 at the end.
Modulo took 189.99s on my machine - Cast took 54.75s.
if(new BigDecimal(d).scale() <= 0) {
//do stuff
}
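If I'm reading the BigDecimal(double) constructor right, this works because it captures the exact binary value of d, so a scale of 0 or less means there is no fractional part. For example:

System.out.println(new BigDecimal(3.0).scale()); // 0 -> integer
System.out.println(new BigDecimal(3.5).scale()); // 1 -> has a fractional part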
Your method of using if((int)d == d) should always work for any 32-bit integer. To make it work up to 64 bits, you can use if((long)d == d), which is effectively the same except that it accounts for larger-magnitude numbers. If d is greater than the maximum long value (or less than the minimum), then it is guaranteed to be an exact integer. A function that tests whether d is an integer can then be constructed as follows:
boolean isInteger(double d){
if(d > Long.MAX_VALUE || d < Long.MIN_VALUE){
return true;
} else if((long)d == d){
return true;
} else {
return false;
}
}
If a floating point number is an integer, then it is an exact representation of that integer.
Doubles are a binary fraction with a binary exponent. You cannot be certain that an integer can be exactly represented as a double, especially not if it has been calculated from other values.
Hence the normal way to approach this is to say that it needs to be "sufficiently close" to an integer value, where "sufficiently close" typically means "within X%" (where X is rather small).
I.e. if X is 1 then 1.98 and 2.02 would both be considered to be close enough to be 2. If X is 0.01 then it needs to be between 1.9998 and 2.0002 to be close enough.
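A minimal sketch of that idea in Java, assuming a caller-chosen relative tolerance (the helper name is mine):

// True if d is within relTol (relative) of the nearest whole number.
static boolean isNearInteger(double d, double relTol) {
    double nearest = Math.rint(d); // nearest integer-valued double, ties to even
    double tol = Math.max(1.0, Math.abs(nearest)) * relTol;
    return Math.abs(d - nearest) <= tol;
}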