Java integer overflow intrigue

The 1st for loop in the below code does not find the maximum correctly due to an overflow. However, the 2nd for loop does. I used godbolt.com to look at the byte codes for this program which showed that to determine which number is greater the 1st for loop uses an isub and the 2nd for loop uses an if_icmple. Makes sense. However, why is the if_icmple able to successfully do this comparison since it too at some point must do a subtraction (which I would expect to produce an overflow)?
public class Overflow {
public static void main(String[] args) {
int[] nums = {3,-2147483648, 5, 7, 27, 9};
int curMax = nums[0];
for (int num : nums) {
int diff = curMax - num;
if (diff < 0) {
curMax = num;
}
}
System.out.println("1) max is " + curMax);
curMax = nums[0];
for (int num : nums) {
if (num > curMax) {
curMax = num;
}
}
System.out.println("2) max is " + curMax);
}
}
The output is
1) max is -2147483648
2) max is 27

Let's say that comparison is implemented using subtraction. Contrary to various other opinions here, I'd say that is highly likely. E.g. cmp on x86 is just a subtraction that does not update its destination register, only the flags. Various other (but maybe not all) processors that have a flags register also work that way. In the rest of this answer I'll use x86 as a representative processor for examples.
However, there is an incorrect assumption made implicitly by your code: a comparison is not equivalent to a subtraction followed by checking the sign, it is equivalent to a subtraction followed by checking some combination of the Zero, Sign, and Overflow flags. For example, if you implement if (num > curMax) using some cmp followed by jle (to skip the body of the if when the condition is false), then jle does this:
Jump short if less or equal (ZF=1 or SF≠ OF).
Expressing the condition SF≠ OF directly in Java is not so easy. But the JVM itself has no such problem, it can use a comparison (or, equivalently, subtraction) followed by exactly the right kind of conditional jump.
There are some less-fortunate processors that do not have such a full set of conditional jumps as x86 has, but even in that case, the JVM has a lot more options than you do.
The option that the JVM does not have though, is implementing comparison incorrectly.
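As a rough illustration of the flag logic described above, the following Java sketch emulates what a cmp followed by a signed "less than" jump decides. The class and method names are made up for this example, and the overflow-flag expression is the usual sign-bit test for subtraction; it is not taken from any particular JVM.

```java
class FlagCompare {
    // Emulates "cmp a, b" followed by a signed "less than" jump:
    // a < b exactly when the sign flag differs from the overflow flag.
    static boolean lessThan(int a, int b) {
        int diff = a - b;                         // may wrap around
        boolean sf = diff < 0;                    // sign flag of the wrapped result
        // Overflow flag: the operands differ in sign and the
        // result's sign differs from the sign of a.
        boolean of = ((a ^ b) & (a ^ diff)) < 0;
        return sf != of;
    }

    public static void main(String[] args) {
        int[] nums = {3, -2147483648, 5, 7, 27, 9};
        int curMax = nums[0];
        for (int num : nums) {
            if (lessThan(curMax, num)) {
                curMax = num;
            }
        }
        System.out.println("max is " + curMax);
    }
}
```

Checking only diff < 0, as the first loop in the question does, amounts to looking at the sign flag while ignoring the overflow flag.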

However, why is the if_icmple able to successfully do this comparison since it too at some point must do a subtraction (which I would expect to produce an overflow)?
It doesn't need to do a subtraction. It just needs to do a comparison. Comparisons don't have to involve subtraction, certainly not in the CPU.
Of course, we can also take alternate steps to deal with overflow. Here is one simple approach:
int cmp(int a, int b) {
boolean aNeg = (a >>> 31) != 0;
boolean bNeg = (b >>> 31) != 0;
if (aNeg != bNeg) {
return aNeg ? -1 : 1;
}
int diff = a - b; // subtracting two numbers with the same sign can't overflow
return (diff == 0) ? 0 : (diff < 0) ? -1 : 1;
}
All Java has to do is compile if_icmple to something like this, or whatever other instruction is appropriate on the target CPU. Using bytecode means Java can leave that up to the runtime to get right for the target CPU, whatever's fastest in such an environment -- using the overflow bit, doing something like this, whatever.
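In application code there is usually no need to hand-roll such a comparison. A minimal sketch, assuming the goal is simply an overflow-safe maximum, can rely on Integer.compare from the standard library (the class name SafeMax is hypothetical):

```java
class SafeMax {
    // Finds the maximum without ever computing curMax - num.
    static int max(int[] nums) {
        int curMax = nums[0];
        for (int num : nums) {
            // Integer.compare returns a negative, zero, or positive value
            // and is not affected by wrap-around.
            if (Integer.compare(num, curMax) > 0) {
                curMax = num;
            }
        }
        return curMax;
    }

    public static void main(String[] args) {
        System.out.println(max(new int[] {3, -2147483648, 5, 7, 27, 9}));
    }
}
```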


Same logic code with same data type code passes in Java but not in C++?

I was solving a LeetCode question where we have to find the number of possible combinations that add up to a target.
Given an integer array with all positive numbers and no duplicates, find the number of possible combinations that add up to a positive integer target.
I wrote the code in Java
JAVA
class Solution {
public int combinationSum4(int[] nums, int target) {
int[] dp = new int[target+1];
dp[0] = 1;
for(int i = 1; i <= target; i++){
for(int num : nums){
if(i >= num){
dp[i] += dp[i-num];
}
}
}
return dp[target];
}
}
It passed all the test cases, but when I wrote the same code in C++, it failed a few of the test cases.
C++
class Solution {
public:
int combinationSum4(vector<int>& nums, int target) {
int dp[target+1] = {0};
dp[0] = 1;
for(int i = 1; i <= target; i++){
for(int num : nums){
if(i >= num){
dp[i] += dp[i-num];
}
}
}
return dp[target];
}
};
The test case being :
nums : [3,33,333]
target : 10000
Error that I am getting :
Line 9: Char 27: runtime error: signed integer overflow: 1941940377 + 357856184 cannot be represented in type 'int' (solution.cpp)
Note: as you can see, I have only changed the declaration of the dp array. Why am I getting this error? What's going wrong?
The int at leetcode seems to be 32 bits which can usually represent numbers in the range [-2^31, 2^31).
Overflowing signed integers has undefined behaviour in C++. A signed 32 bit int can have different representations on different platforms. You'll most often find the two's complement version, but there are others.
2^31-1 = 2147483647
1941940377 + 357856184 = 2299796561 // overflow error in C++
Add (if needed)
#include <cstdint>
and replace
int dp[target+1] = {0};
with
std::vector<std::uintmax_t> dp(target+1, 0);
std::uintmax_t gives you the largest unsigned integer type available and overflowing unsigned integers has a well defined behaviour in C++, so even if you do end up with a calculation larger than the limit, say 18446744073709551615 for a 64 bit integer, it will just wrap around. 18446744073709551615 + 1 == 0 in that case.
In Java, overflowing an int has well defined behaviour.
2147483647 + 1 = -2147483648
which is why you don't get in trouble when using that code in Java.
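The wrap-around on the Java side can be observed directly. The sketch below is only an illustration with made-up names; it uses the plain + operator for the silent wrap and Math.addExact, which throws ArithmeticException when an int addition overflows, to detect the overflow.

```java
class WrapDemo {
    // Plain int addition: wraps around silently on overflow.
    static int wrappingAdd(int a, int b) {
        return a + b;
    }

    // Reports whether a + b would overflow an int.
    static boolean overflows(int a, int b) {
        try {
            Math.addExact(a, b);   // throws ArithmeticException on overflow
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(wrappingAdd(2147483647, 1));
        System.out.println(overflows(1941940377, 357856184));
    }
}
```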
While @TedLyngmo's answer seems to have correctly identified the problem, I disagree about the solution.
You should not just ignore overflow, even if its behavior is well defined. The coding problem you were given is in itself flawed, as, indeed, the number of possible additive decompositions of an integer increases exponentially with the value of that integer, meaning the size of the output is at least some linear function of the value of the input number - and that means any fixed-size type will be inappropriate.
If you actually wanted to solve this problem and produce correct output without overflow, you would need a "big int" class - something which can hold arbitrarily-large values.
Here's an example C++ BigInt implementation:
https://github.com/kasparsklavins/bigint
There are others, of course (but there isn't one in the standard library).
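For comparison, Java does ship an arbitrary-precision integer type, java.math.BigInteger. A minimal sketch of the question's DP using it might look like this; the class name and the BigInteger return type are assumptions made for the example, since the original LeetCode signature returns int.

```java
import java.math.BigInteger;
import java.util.Arrays;

class CombinationSumBig {
    // Same DP as in the question, but every count is a BigInteger,
    // so no intermediate value can overflow.
    static BigInteger combinationSum4(int[] nums, int target) {
        BigInteger[] dp = new BigInteger[target + 1];
        Arrays.fill(dp, BigInteger.ZERO);
        dp[0] = BigInteger.ONE;   // one way to reach 0: pick nothing
        for (int i = 1; i <= target; i++) {
            for (int num : nums) {
                if (i >= num) {
                    dp[i] = dp[i].add(dp[i - num]);
                }
            }
        }
        return dp[target];
    }

    public static void main(String[] args) {
        System.out.println(combinationSum4(new int[] {3, 33, 333}, 10000));
    }
}
```

For the failing test case, every element of nums is a multiple of 3 while 10000 is not, so the exact answer is 0. The overflow reported by the C++ run happens only in intermediate entries that never feed into dp[10000].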

How to disable Conversions and Promotions in Java?

Ok. I think this is impossible. If you think the same, you do not need to post an answer. I have read a few lines of Chapter 5. Conversions and Promotions and it seems chapter 5 has no mention of disabling Conversions and Promotions in Java.
Here is my motivation:
long uADD(long a, long b) {
try {
long c;
c = 0;
boolean carry; //carry flag; true: need to carry; false: no need to carry
carry = false;
for (int i = 0; i < 64; ++i) { //i loops from 0 to 63,
if (((((a >>> i) & 1) ^ ((b >>> i)) & 1) != 0) ^ carry) { //calculate the ith digit of the sum
c += (1 << i);
}
if (((((a >>> i) & 1) & ((b >>> i) & 1)) != 0) || (carry && ((((a >>> i) & 1) ^ ((b >>> i) & 1)) != 0))) {
carry = true; //calculate the carry flag which will be used for calculation of the (i+1)th digit
} else {
carry = false;
}
}
if (carry) { //check if there is a last carry flag
throw new ArithmeticException(); //throw arithmetic exception if true
}
return c;
} catch (ArithmeticException arithmExcep) {
throw new ArithmeticException("Unsigned Long integer Overflow during Addition");
}
}
So basically, I am writing a method that will do unsigned addition for long integer. It will throw arithmetic exception if overflow. The code above is not readable enough, so I should try to explain it.
First, there is a for loop where i loops from 0 to 63.
Then, the first if statement acts as the sum output of the full adder, it uses the ith digit of a and that of b and carry flag to calculate the i + 1th digit (true or false). (Note that i = 0 corresponds to the units digit.) If true, it adds 1 << i to c, where c is initially 0.
After that, the second if statement acts as the carry flag output of the full adder, it uses again the ith digit of a and that of b and carry flag to calculate the carry flag of the i + 1th digit. If true, set the new carry flag to true, if false, set the new carry flag false.
Finally, after exiting the for loop, check if the carry flag is true. If true, throw an arithmetic exception.
However, the above code does not work. After debugging, it turns out the problem occurs at
c += (1 << i);
The correct code should be:
c += (1L << i);
because Java evaluates 1 << i as an int (so the shift distance wraps around at 32) and only then promotes the result to long before adding it to c, without showing me any warning.
I have several questions regarding this.
Is it possible to disable automatic promotion of one data type to another?
How often does automatic promotion cause problems for you?
Is it possible to tweak the IDE so that it shows a warning to me when automatic promotion occurs? (I am using NetBeans IDE 7.3.1 at the moment.)
Sorry for lots of questions and the hard to read code. I will be studying CS in September so I try to write some code in Java to familiarize myself with Java.
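The int versus long shift issue described in the question can be reproduced in isolation. The following is a small sketch with hypothetical method names; it relies only on the language rule that an int shift uses the low 5 bits of the shift distance and a long shift uses the low 6 bits.

```java
class ShiftDemo {
    // The shift is performed in int arithmetic (distance i & 31),
    // and only the result is widened to long.
    static long intShift(int i) {
        return 1 << i;
    }

    // The shift is performed in long arithmetic (distance i & 63).
    static long longShift(int i) {
        return 1L << i;
    }

    public static void main(String[] args) {
        System.out.println(intShift(40));    // same as 1 << 8
        System.out.println(longShift(40));   // 2^40
    }
}
```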
Is it possible to disable automatic promotion of one data type to another
No: as you already discovered, the Java Language Specification mandates that numeric promotion occurs, so any compiler that disabled it would (by definition) not be a valid Java compiler.
How often does automatic promotion cause problems for you?
Perhaps once a year (and I code in Java for a living)?
Is it possible to tweak the IDE so that it shows a warning to me when automatic promotion occurs? (I am using NetBeans IDE 7.3.1 at the moment.)
It is worth noting that such a warning would not detect all cases where an explicit promotion is needed. For instance, consider:
boolean isNearOrigin(int x, int y, int r) {
return x * x + y * y < r * r;
}
Even though there is no automatic promotion, the multiplications may overflow, which can make the method return incorrect results, and one should probably write
return (long) x * x + (long) y * y < (long) r * r;
instead.
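To make the overflow concrete, here is a short sketch that puts the two variants side by side, assuming the intended formula is x*x + y*y < r*r. The class and method names are invented for the example.

```java
class NearOrigin {
    // All arithmetic in int: x * x wraps around for |x| > 46340.
    static boolean intVersion(int x, int y, int r) {
        return x * x + y * y < r * r;
    }

    // Each product is computed in long, so it cannot overflow for int inputs.
    static boolean longVersion(int x, int y, int r) {
        return (long) x * x + (long) y * y < (long) r * r;
    }

    public static void main(String[] args) {
        // 50000 * 50000 does not fit in an int, so the two versions disagree.
        System.out.println(intVersion(50000, 0, 1));
        System.out.println(longVersion(50000, 0, 1));
    }
}
```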
It's also worth noting that your proposed warning would also appear for correct code. For instance:
int x = ...;
foo(x);
would warn about automatic promotion if foo is declared with parameter type long, even though that promotion cannot have any adverse effects. Since such innocent situations are quite frequent, your warning would probably be so annoying that everybody would turn it off. I'd therefore be quite surprised to find any Java compiler emitting such a warning.
In general, the compiler can not detect that an operation will overflow, and even finding likely candidates for overflow is complex. Given the rarity of overflow-related problems, such an imperfect detection seems a dubious benefit, which is probably why Java compilers and IDEs do not implement it. It therefore remains the responsibility of the programmer to verify, for each arithmetic operation, that the value set afforded by the operand types is suitable. This includes specifying suitable type suffixes for any numeric literals used as operands.
PS: Though I am impressed that you got your ripple-carry adder working, I think your uAdd method could be more easily implemented as follows:
long uAdd(long a, long b) {
long sum = a + b;
if (uLess(sum, a)) {
throw new ArithmeticException("Overflow");
} else {
return sum;
}
}
/** @return whether a < b, treating both a and b as unsigned longs */
boolean uLess(long a, long b) {
long signBit = 1L << -1;
return (signBit ^ a) < (signBit ^ b);
}
To see why this is correct, let < denote the less than relation for the signed interpretation (which is equivalent to the Java operator), and ≪ denote the less than relation for the unsigned values. Let a and b be any bit pattern, from which a' and b' are obtained by flipping the sign bit. By the definition of signed integers, we then have:
If sign(a) = sign(b), we have (a ≪ b) = (a' ≪ b') = (a' < b')
If sign(a) ≠ sign(b), we have (a ≪ b) = (b' ≪ a') = (a' < b')
Therefore, (a ≪ b) = (a' < b').
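On Java 8 or later the unsigned comparison does not have to be written by hand, because Long.compareUnsigned is part of the standard library. A sketch of the same idea under that assumption (the class name is hypothetical):

```java
class UnsignedAdd {
    // Adds a and b as unsigned 64-bit values; throws if the sum needs a 65th bit.
    static long uAdd(long a, long b) {
        long sum = a + b;   // wraps modulo 2^64
        // If the addition wrapped, the unsigned sum is smaller than either operand.
        if (Long.compareUnsigned(sum, a) < 0) {
            throw new ArithmeticException("Unsigned long overflow during addition");
        }
        return sum;
    }

    public static void main(String[] args) {
        // Long.MAX_VALUE + 1 is fine when read as unsigned: it is 2^63.
        System.out.println(Long.toUnsignedString(uAdd(Long.MAX_VALUE, 1L)));
    }
}
```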

How to violate Comparable interface first provision with integers overflow?

java.lang.Comparable#compareTo method states as first provision
The implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compareTo(x))
for all x and y. (This implies that x.compareTo(y) must throw
an exception if and only if y.compareTo(x) throws an exception.)
and according to Joshua Bloch in Effective Java, Item 12:
This trick works fine here but should be used with extreme caution.
Don’t use it unless you’re certain the fields in question are
non-negative or, more generally, that the difference between the
lowest and highest possible field values is less than or equal to
Integer.MAX_VALUE (2^31 - 1). The reason this trick doesn’t always work
is that a signed 32-bit integer isn’t big enough to hold the
difference between two arbitrary signed 32-bit integers. If i is a
large positive int and j is a large negative int, (i - j) will
overflow and return a negative value. The resulting compareTo method
will return incorrect results for some arguments and violate the first
and second provisions of the compareTo contract. This is not a purely
theoretical problem: it has caused failures in real systems. These
failures can be difficult to debug, as the broken compareTo method
works properly for most input values.
So with integer overflow it should be possible to violate the first provision, but I can't find how. This example is meant to show the first provision being violated:
public class ProblemsWithLargeIntegers implements Comparable<ProblemsWithLargeIntegers> {
private int zas;
@Override
public int compareTo(ProblemsWithLargeIntegers o) {
return zas - o.zas;
}
public ProblemsWithLargeIntegers(int zas) {
this.zas = zas;
}
public static void main(String[] args) {
int value1 = ...;
int value2 = ...;
ProblemsWithLargeIntegers d = new ProblemsWithLargeIntegers(value1);
ProblemsWithLargeIntegers e = new ProblemsWithLargeIntegers(value2);
if (!(Math.signum(d.compareTo(e)) == -Math.signum(e.compareTo(d)))){
System.out.println("hey!");
}
}
}
So I am looking for a value1 and a value2 that trigger this. Any ideas? Or was Joshua wrong?
Well, this violates the general contract to start with. For example, take value1 = Integer.MIN_VALUE and value2 = 1. That will report that Integer.MIN_VALUE > 1, effectively.
EDIT: Actually, I was wrong - it's easy to violate the first provision:
int value1 = Integer.MIN_VALUE;
int value2 = 0;
You'll get a negative result for both comparisons, because Integer.MIN_VALUE - 0 == 0 - Integer.MIN_VALUE.
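The violation, and the usual way around it, can be checked with a few lines. This is only a sketch with invented method names: bySubtraction mirrors the compareTo from the question, and safe uses Integer.compare from the standard library.

```java
class CompareDemo {
    // The subtraction trick from the question: wraps around on overflow.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Overflow-free alternative.
    static int safe(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        int x = Integer.MIN_VALUE;
        int y = 0;
        // Both directions come out negative, so sgn(x.compareTo(y)) != -sgn(y.compareTo(x)).
        System.out.println(Integer.signum(bySubtraction(x, y)));
        System.out.println(Integer.signum(bySubtraction(y, x)));
        // With Integer.compare the signs are opposite, as the contract requires.
        System.out.println(Integer.signum(safe(x, y)));
        System.out.println(Integer.signum(safe(y, x)));
    }
}
```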

Random Number generation Issues

This question was asked in my interview.
random(0,1) is a function that generates integers 0 and 1 randomly.
Using this function how would you design a function that takes two integers a,b as input and generates random integers including a and b.
I have No idea how to solve this.
We can do this easily with bit logic (e.g. a = 4, b = 10):
1. Calculate the difference b - a (6 in the example).
2. Calculate ceil(log2(b - a + 1)), i.e. the number of bits required to represent every offset between a and b (3 in the example).
3. Call random(0,1) once for each bit (in the example the result ranges over 000 - 111).
4. Repeat step 3 until the number (say num) is between 000 and 110 inclusive. Since b - a + 1 is 7, we need only 7 possible states, corresponding to a, a+1, a+2, ..., a+6, which is b.
5. Return num + a.
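The steps above can be sketched in Java as follows. This is one possible reading of the answer, not a definitive implementation: randomBit() is a hypothetical stand-in for the interviewer's random(0,1), backed here by java.util.Random, and the code assumes a <= b and that b - a + 1 fits in an int.

```java
import java.util.Random;

class BitRandom {
    private static final Random RNG = new Random();

    // Hypothetical stand-in for random(0,1): returns 0 or 1.
    static int randomBit() {
        return RNG.nextInt(2);
    }

    // Uniform integer in [a, b] built only from randomBit().
    // Assumes a <= b and that b - a + 1 does not overflow an int.
    static int randomBetween(int a, int b) {
        int range = b - a + 1;
        // Number of bits needed for the values 0 .. range - 1, i.e. ceil(log2(range)).
        int bits = 32 - Integer.numberOfLeadingZeros(range - 1);
        int num;
        do {
            num = 0;
            for (int i = 0; i < bits; i++) {
                num = (num << 1) | randomBit();
            }
        } while (num >= range);   // reject values outside 0 .. range - 1
        return a + num;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            System.out.print(randomBetween(4, 10) + " ");
        }
        System.out.println();
    }
}
```

Rejecting and redrawing, rather than taking the result modulo the range, is what keeps the distribution uniform.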
I hate this kind of interview question because there are some
answers that fulfil it, but the interviewer will be pretty mad if you use them. For example,
Call random,
if you obtain 0, output a
if you obtain 1, output b
A more sophisticated answer, and probably what the interviewer wants, is
init(a,b){
c = Max(a,b)
d = log2(c) //so we know how much bits we need to cover both a and b
}
Random(){
int r = 0;
for(int i = 0; i< d; i++)
r = (r<<1)| Random01();
return r;
}
You can generate random strings of 0 and 1 by successively calling the sub function.
So we have randomBit() returning 0 or 1 independently, uniformly at random and we want a function random(a, b) that returns a value in the range [a,b] uniformly at random. Let's actually make that the range [a, b) because half-open ranges are easier to work with and equivalent. In fact, it is easy to see that we can just consider the case where a == 0 (and b > 0), i.e. we just want to generate a random integer in the range [0, b).
Let's start with the simple answer suggested elsewhere. (Forgive me for using C++ syntax; the concept is the same in Java.)
int random2n(int n) {
return n ? randomBit() + (random2n(n - 1) << 1) : 0;
}
int random(int b) {
int n = ceil(log2(b)), v;
while ((v = random2n(n)) >= b);
return v;
}
That is, it is easy to generate a value in the range [0, 2^n) given randomBit(). So to get a value in [0, b), we repeatedly generate something in the range [0, 2^ceil(log2(b))) until we get something in the correct range. It is rather trivial to show that this selects from the range [0, b) uniformly at random.
As stated before, the worst case expected number of calls to randomBit() for this is (1 + 1/2 + 1/4 + ...) ceil(log2(b)) = 2 ceil(log2(b)). Most of those calls are a waste; we really only need log2(b) bits of entropy, so we should try to get as close to that as possible. Even a clever implementation of this that calculates the high bits early and bails out as soon as it exits the wanted range has the same expected number of calls to randomBit() in the worst case.
We can devise a more efficient (in terms of calls to randomBit()) method quite easily. Let's say we want to generate a number in the range [0, b). With a single call to randomBit(), we should be able to approximately cut our target range in half. In fact, if b is even, we can do that. If b is odd, we will have a (very) small chance that we have to "re-roll". Consider the function:
int random(int b) {
if (b < 2) return 0;
int mid = (b + 1) / 2, ret = b;
while (ret == b) {
ret = (randomBit() ? mid : 0) + random(mid);
}
return ret;
}
This function essentially uses each random bit to select between two halves of the wanted range and then recursively generates a value in that half. While the function is fairly simple, the analysis of it is a bit more complex. By induction one can prove that this generates a value in the range [0, b) uniformly at random. Also, it can be shown that, in the worst case, this is expected to require ceil(log2(b)) + 2 calls to randomBit(). When randomBit() is slow, as may be the case for a true random generator, this is expected to waste only a constant number of calls rather than a linear amount as in the first solution.
function randomBetween(int a, int b){
int x = b-a;//assuming a is smaller than b
float rand = random();
return a+Math.ceil(rand*x);
}

f(int x) { return x == 0 ? 0 : 1; } in Java without conditionals

I want to implement f(int x) { return x == 0 ? 0 : 1; } in Java.
In C, I'd just "return !!x;", but ! doesn't work like that in Java. Is there some way to do it without conditionals? (Without something cheesy like an unrolled version of
int ret = 0;
for (int i = 0; i < 32; i++) {
ret |= ((x & (1 << i)) >>> i);
}
or
try {
return x/x;
} catch (ArithmeticException e) {
return 0;
}
)
EDIT:
So, I did a microbenchmark of three different solutions:
my return x/x catch solution,
the obvious x==0?0:1 solution, and
Ed Staub's solution: (x|-x) >>> 31.
The timings for random int inputs (the whole int range) were:
1. 0.268716
2. 0.324449
3. 0.347852
Yes, my stupid x/x solution was faster by a pretty hefty margin. Not very surprising when you consider that there are very few 0's in it, and in the vast majority of cases the fast path is taken.
The timings for the more interesting case where 50% of inputs are 0:
1. 1.256533
2. 0.321485
3. 0.348999
The naive x==0?0:1 solution was faster by about 5% than the clever one (on my machine). I'll try to do some disassembly tomorrow to find out why.
EDIT2:
Ok, so the disassembly for the conditional version is (excluding book-keeping):
testl rsi,rsi
setnz rax
movzbl rax,rax
The disassembly for (x|-x)>>>31 is:
movl rax,rsi
negl rax
orl rax,rsi
sarl rax,#31
I don't think anything else needs to be said.
Ok, shortest solution without conditional is probably:
return (i|-i) >>> 31;
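The expression works because, for any nonzero x, at least one of x and -x has its sign bit set (Integer.MIN_VALUE negates to itself and is already negative), so the OR has the sign bit set and the unsigned shift moves it down to bit 0. A small sketch that checks this against the conditional version, with an invented class name:

```java
class NonZero {
    // Branch-free version: 0 for x == 0, 1 otherwise.
    static int f(int x) {
        return (x | -x) >>> 31;
    }

    // Reference version using the conditional operator.
    static int reference(int x) {
        return x == 0 ? 0 : 1;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 5, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int x : samples) {
            System.out.println(x + " -> " + f(x) + " (reference " + reference(x) + ")");
        }
    }
}
```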
Here is a solution:
public static int compute(int i)
{
return ((i | (~i + 1)) >> 31) & 1; // return ((i | -i) >> 31) & 1
}
EDIT:
or you can make it more simple:
public static int compute(int i)
{
return -(-i >> 31); // return -i >>> 31
}
EDIT2:
The last solution fails with negative numbers. Take a look at @Ed Staub's solution.
EDIT3:
@Orion Adrian OK, here is a general solution:
public static int compute(int i)
{
return (i|-i) >>> java.math.BigInteger.valueOf(Integer.MAX_VALUE).bitLength();
}
int f(int x) {
return Math.abs(Integer.signum(x));
}
The signum() function returns the sign of the number as -1, 0 or 1. So all that's left is to turn -1 into 1, which is what abs does.
The signum function implements it this way
return (i >> 31) | (-i >>> 31);
so, just add another bitwise operation to return 0 or 1
return ((i >> 31) | (-i >>> 31)) & 1;
All of these solutions seem to suffer from the vice of taking varying degrees of effort to understand. That means the programmer who must later read and maintain this code will have to expend unnecessary effort. That costs money.
The expression
(x == 0)? 0:1
is straightforward and simple to understand. It's really the right way to do this. The use of an exception in the ordinary run of code is downright ghastly. Exceptions are for handling circumstances beyond programmer control, not for ordinary routine operations.
I wonder what the compiler would turn this into...
class kata {
public static int f(int x){
return -(Boolean.valueOf(x==0).compareTo(true));
}
public static void main(String[] args) {
System.out.println(f(0));
System.out.println(f(5));
System.out.println(f(-1));
}
}
http://ideone.com/ssAVo
This question reduces down to: "Is there a way to map boolean true,false to int 1,0 respectively without writing the conditional."
In Java, there is no standardized treatment of true as 1. The closest is use of -1. So as @Ed says, the ternary operator is as succinct as you get.
If you wanted a boolean, I think:
return x == x >>> 1
Would do it, because the only number whose set bits don't move when shifted is one with no set bits.
Under the hood, the bytecode actually uses 1 and 0 for true and false, but I don't know of any way to turn a Java language boolean value into its corresponding int value without some sort of conditional.
