bitwise operator >>> in hashCode - java

I have two related questions:
the bitwise operator >>> means that we are shifting the binary number by those many places while filling 0 in the Most Significant Bit. But, then why does the following operation yields the same number: 5>>>32 yields 5 and -5>>>32 yields -5. Because if the above description is correct then both these operations would have yielded 0 as the final result.
In continuation to above, As per Effective Java book, we should use (int) (f ^ (f >>> 32)) (in case the field is long) while calculating the hash code (if the field is long). Why do we do that and what's the explanation

5 can be represented as 0101 if you shift it by 1 bit i.e 5>>>1 this will result as 0010=2
If the promoted type of the left-hand operand is int, only the five
lowest-order bits of the right-hand operand are used as the shift
distance. It is as if the right-hand operand were subjected to a
bitwise logical AND operator & (§15.22.1) with the mask value 0x1f.
The shift distance actually used is therefore always in the range 0 to
31, inclusive.
When you shift an integer with the << or >> operator and the shift
distance is greater than or equal to 32, you take the shift distance
mod 32 (in other words, you mask off all but the low order 5 bits of
the shift distance). This can be very counterintuitive. For example (i> >> 32) == i, for every integer i. You might expect it to shift the entire number off to the right, returning 0 for positive inputs and -1
for negative inputs, but it doesn't; it simply returns i, because (i
<< (32 & 0x1f)) == (i << 0) == i.

Answer to your first question is here why is 1>>32 == 1?
The second question answer, in short, is that in such way the whole long value is used(not a part of it) and note that it is probably the fastest way to do this.

I know this question has been answered long back, but I tried an example to get more clarification and I guess it'll others too.
long x = 3231147483648l;
System.out.println(Long.toBinaryString(x));
System.out.println(Long.toBinaryString(x >>> 32));
System.out.println(Long.toBinaryString(x ^ (x >>> 32)));
System.out.println(Long.toBinaryString((int) x ^ (x >>> 32)));
This prints -
101111000001001111011001011110001000000000
1011110000
101111000001001111011001011110000011110000
1001111011001011110000011110000
As #avrilfanomar mentions, this XORs first 32 bits of long with the other 32 bits and unsigned right shift operator helps us in doing this. Since we want to use this long field while calculating the hashcode, directly casting the long to int would mean that long fields differing only in the upper 32 bits will contribute the same value to the hashcode. This potentially means that two objects differing only in this field will have same hashcode and it'll be stored in the same bucket (with say a list to resolve collision) and this impacts the performance of hash-based collections. Hence, this operation.

Related

Bit manipulation in Java - 2s complement and flipping bits

I was recently looking into some problems with but manipulation in Java and I came up with two questions.
1) Firstly, I came up to the problem of flipping all the bits in a number.
I found this solution:
public class Solution {
public int flipAllBits(int num) {
int mask = (1 << (int)Math.floor(Math.log(num)/Math.log(2))+1) - 1;
return num ^ mask;
}
}
But what happens when k = 32 bits? Can the 1 be shifted 33 times?
What I understand from the code (although it doesn't really make sense), the mask is 0111111.(31 1's)....1 and not 32 1's, as someone would expect. And therefore when num is a really large number this would fail.
2) Another question I had was determining when something is a bit sequence in 2s complement or just a normal bit sequence. For example I read that 1010 when flipped is 0110 which is -10 but also 6. Which one is it and how do we know?
Thanks.
1) The Math object calls are not necessary. Flipping all the bits in any ordinal type in Java (or C) is not an arithmatic operation. It is a bitwise operation. Using the '^' operator, simply using 1- as an operand will work regardless of the sizeof int in C/C++ or a Java template with with the ordinal type as a parameter T. The tilde '~' operator is the other option.
T i = 0xf0f0f0f0;
System.out.println(T.toHexString(i));
i ^= -1;
System.out.println(T.toHexString(i));
i = ~ i;
System.out.println(T.toHexString(i));
2) Since the entire range of integers maps to the entire range of integers in a 2's compliment transform, it is not possible to detect whether a number is or is not 2's complement unless one knows the range of numbers from which the 2's complement might be calculated and the two sets (before and after) are mutually exclusive.
That mask computation is fairly inscrutable, I'm going to guess that it (attempts to, since you mention it's wrong) make a mask up to and including the highest set bit. Whether that's useful for "flipping all bits" is an other possible point of discussion, since to me at least, "all bits" means all 32 of them, not some number that depends on the value. But if that's what you want then that's what you want. Especially combined with that second question, that looks like a mistake to me, so you'd be implementing the wrong thing from the start - see near the bottom.
Anyway, the mask can be generated with some reasonably nice bitmath, which does not create any doubt about possible edge cases (eg Math.log(0) is probably bad, and k=32 corresponds with negative numbers which are also probably bad to put into a log):
int m = num | (num >> 16);
m |= m >> 8;
m |= m >> 4;
m |= m >> 2;
m |= m >> 1;
return num ^ m;
Note that this function has odd properties, it almost always returns an unsigned-lower number than went in, except at 0. It flips bits so the name is not completely wrong, but flipAllBits(flipAllBits(x)) != x (usually), while the name suggests it should be an involution.
As for the second question, there is nothing to determine. Two's complement is scheme by which you can interpret a bitvector - any bitvector. So it's really a choice you make; to interpret a given bitvector that way or some other way. In Java the "default" interpretation is two's complement (eg toString will print an int by interpreting it according to its two's complement meaning), but you don't have to go along with it, you can (with care) treat int as unsigned, or as an array of booleans, or several bitfields packed together, etc.
If you wanted to invert all the bits but made the common mistake to assume that the number of bits in an int is variable (and that you therefore needed to compute a mask that covers "all bits"), I have some great news for you, because inverting all bits is a lot easier:
return ~num;
If you were reading "invert all bits" in the context of two's complement, it would have the above meaning, so all bits, including those left of the highest set bit.

Bit manipulation in BigInteger

I recently learned a method in BigInteger Java class called
BigInteger.testBit(n)
Its main function is (this & (1<<n)) != 0), but I do not quite understand the source code
return (getInt(n >>> 5) & (1 << (n & 31))) != 0;
Can someone explain it?
I'm not familiar with how it's implemented, but judging by that code, I would assume that a BigInteger is implemented by storing it's value in a list of integers, and getInt(n) returns the n'th of these, where 0 = least significant.
The first will store the least significant 32 bits, getInt(1) will store bits 32-63 etc. By shifting n right by 5, you get the index of the integer in the internal list that has the bit you care about, it's the equivalent to n div 32.
With that integer, you then pick out the bit you care about from it with the (1 << (n & 31)). n & 31 is the equivalent of n modulo 32, and the 1 << is the equivalent of 2^. This gets you a bit mask that selects precisely the bit you care about.

And bitwise operation gets negative value

I have this code:
int code = 0x92011202;
int a = (code & 0xF0000000) >> 28;
int b = (code & 0x0F000000) >> 24;
// ..
int n = (code & 0x0000000F);
But if most significant bit of code is equal to 1 (from 9 to F) a comes negative value. All other variables
works fine.
Why this happen?
This is explained in The Java Tutorials.
Specifically :
The unsigned right shift operator ">>>" shifts a zero into the
leftmost position, while the leftmost position after ">>" depends on
sign extension.
Java uses 2s complement variables. The only aspect about 2s complements that you care about is that, if the leftmost bit is a 1, the number is negative. The signed bitshift maintains sign, so if code is negative to begin with, it stays negative after the shift.
To fix your program use >>> instead which is a logical bitshift, ignoring sign
The most significant bit of code represents the sign -- 0 means the number is positive and 1 means the number is negative.
If you just print out code you'll find that it's negative.
Because the shift operator takes into account the sign (it's a signed shift), a will get a negative value if code is negative.
The max value of "int" is 2^31-1. 0xF0000000 is a negative number. And any number with most significant bit equals to 1 is negative .

Logical right shift operator in java

I am beginner to java... I have tried very much but could not find the way the following line
System.out.println (-1>>>1);
gives 2147483647 ?
Can anyone help me ?
This is because the binary representation of -1 is 11111111111111111111111111111111. When you perform an unsigned right bit-shift operation (>>>) on it it moves all of the bits right by the argument (1 in this case) and fills in empty spaces on the left with zeros so you get 01111111111111111111111111111111 which is the binary representation of Integer.MAX_VALUE = 2147483647 (not sure where you got 2147483648 from).
>>> is the bitwise right-shift operator, with 0 sign extension - in other words, all bits "incoming" from the left are filled with 0s.
-1 is represented by 32 bits which are all 1. When you shift that right by 1 bit with 0 sign extension, you end up with a value which has the 31 bottom bits still 1, but 0 for the top bit (the sign bit), so you end up with Integer.MAX_VALUE - which is 2147483647, not 2147483648 as your post states.
Or in JLS terms, from section 15.19:
The value of n >>> s is n right-shifted s bit positions with zero-extension, where:
If n is positive, then the result is the same as that of n >> s.
If n is negative and the type of the left-hand operand is int, then the result is equal to that of the expression (n >> s) + (2 << ~s).
If n is negative and the type of the left-hand operand is long, then the result is equal to that of the expression (n >> s) + (2L << ~s).
This definition ends up being a bit of a pain to work with - it's easier to just work with the "0 sign extension right-shift" explanation, IMO.

Difference between >>> and >> operators [duplicate]

This question already has answers here:
Difference between >>> and >>
(9 answers)
Closed 5 years ago.
If the shifted number is positive >>> and >> work the same.
If the shifted number is negative >>> fills the most significant bits with 1s whereas >> operation shifts filling the MSBs with 0.
Is my understanding correct?
If the negative numbers are stored with the MSB set to 1 and not the 2s complement way that Java uses the the operators would behave entirely differently, correct?
The way negative numbers are represented is called 2's complement. To demonstrate how this works, take -12 as an example. 12, in binary, is 00001100 (assume integers are 8 bits though in reality they are much bigger). Take the 2's complement by simply inverting every bit, and you get 11110011. Then, simply add 1 to get 11110100. Notice that if you apply the same steps again, you get positive 12 back.
The >>> shifts in zero no matter what, so 12 >>> 1 should give you 00000110, which is 6, and (-12) >>> 1 should give you 01111010, which is 122. If you actually try this in Java, you'll get a much bigger number since Java ints are actually much bigger than 8 bits.
The >> shifts in a bit identical to the highest bit, so that positive numbers stay positive and negative numbers stay negative. 12 >> 1 is 00000110 (still 6) and (-12) >> 1 would be 11111010 which is negative 6.
Definition of the >>> operator in the Java Language Specification:
The value of n>>>s is n right-shifted s bit positions with zero-extension. If n is positive, then the result is the same as that of n>>s; if n is negative, the result is equal to that of the expression (n>>s)+(2<<~s) if the type of the left-hand operand is int, and to the result of the expression (n>>s)+(2L<<~s) if the type of the left-hand operand is long.
Just the opposite, the >>> fills with zeros while >> fills with ones if the h.o bit is 1.

Categories

Resources