Bitwise operator unexpected behavior - java

Can someone explain this Java bitwise operator behavior?
System.out.println(010 | 4); // --> 12
System.out.println(10 | 4); // --> 14
Thank you!

The first number is interpreted as octal. So 010 == 8.
Starting from that, it is easy to see that
8d | 4d == 1000b | 0100b == 1100b == 12d
The second number is interpreted to be decimal, which yields
10d | 4d == 1010b | 0100b == 1110b == 14d
(Where d indicates a decimal number and b indicates a binary one.)
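To see the literal prefixes side by side, here is a small sketch. The class and method names are made up for illustration; the point is only that a leading 0 on an integer literal means octal in Java source, while Integer.parseInt on a string defaults to radix 10.

```java
// Illustrative sketch: how Java reads integer literals with different prefixes.
class LiteralRadixDemo {
    // Returns a | b so the results can be checked programmatically.
    static int or(int a, int b) {
        return a | b;
    }

    public static void main(String[] args) {
        System.out.println(010);                        // octal literal, prints 8
        System.out.println(or(010, 4));                 // 8 | 4, prints 12
        System.out.println(or(10, 4));                  // 10 | 4, prints 14
        System.out.println(Integer.parseInt("010"));    // strings are parsed as decimal, prints 10
        System.out.println(Integer.parseInt("010", 8)); // explicit radix 8, prints 8
        System.out.println(0b1000 | 0b0100);            // binary literals (Java 7+), prints 12
    }
}
```

If the leading zero was only meant as padding, dropping it (or parsing the text with an explicit radix) avoids the surprise.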

Related

Do the bitwise operators XOR and OR work differently on negative numbers, unlike the AND operator?

I was trying various combinations to better understand the XOR operator, but I cannot figure out how bitwise operators actually work under the hood on negative numbers. The results are as expected for positive numbers, but bitwise AND (&), bitwise XOR (^) and bitwise OR (|) seem to produce different results when I apply the same logic to negative numbers.
For positive numbers:
x = 26;
y = 3;
System.out.println(x ^ y); // 25
System.out.println(x & y); // 2
System.out.println(x | y); // 27
11010 -> 26
00011 -> 3
Applying XOR    Applying AND    Applying OR
11010           11010           11010
00011           00011           00011
-----           -----           -----
11001 -> 25     00010 -> 2      11011 -> 27
These outputs match what I get when analysing manually.
But with a negative number:
x = 26;
y = -3;
System.out.println(x ^ y); // -25
System.out.println(x & y); // 24
System.out.println(x | y); // -1
00000000000000000000000000011010 --> 26
11111111111111111111111111111101 --> -3(2s complement of -3)
Applying XOR                        Applying OR
00000000000000000000000000011010    00000000000000000000000000011010
11111111111111111111111111111101    11111111111111111111111111111101
--------------------------------    --------------------------------
11111111111111111111111111100111    11111111111111111111111111111111
(These are not the outputs I expected: if I convert the XOR result (11111111111111111111111111100111) and the OR result (11111111111111111111111111111111) to decimal by hand, I get a huge number that is nowhere near the printed output.)
Can anyone suggest what is being done internally that I am not familiar with, or explain how bitwise XOR (^), AND (&) and OR (|) work under the hood on negative numbers?
It comes down to how numbers are stored internally when they are negative. For example:
System.out.println(Integer.toBinaryString(26));
System.out.println(Integer.toBinaryString(-3));
will output:
0000000.............0000...11010 // the zeros in front are not shown
11111111111111111111111111111101
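So now it should make sense what happens when you XOR or AND. Every int is represented with 32 bits in two's complement, and that is where your misunderstanding is: a pattern whose highest bit is 1 is read as a negative number, so 11111111111111111111111111100111 is -25 and 32 ones is -1, not a huge positive value.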
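A short sketch of the two readings of the same 32 bits may help. The class and helper names here are hypothetical; Integer.parseUnsignedInt and Integer.toUnsignedLong are standard since Java 8. The "huge number" from the manual conversion is the unsigned reading, while println shows the signed (two's complement) reading.

```java
// Illustrative sketch: one 32-bit pattern, read as signed and as unsigned.
class TwosComplementDemo {
    // Interprets a string of up to 32 bits as a Java int (two's complement).
    static int asSigned(String bits) {
        return Integer.parseUnsignedInt(bits, 2);
    }

    // Interprets the bits of an int as an unsigned 32-bit quantity.
    static long asUnsigned(int value) {
        return Integer.toUnsignedLong(value);
    }

    public static void main(String[] args) {
        int xor = 26 ^ -3;
        int or = 26 | -3;
        System.out.println(Integer.toBinaryString(xor)); // 11111111111111111111111111100111
        System.out.println(xor);                         // -25 (signed reading)
        System.out.println(asUnsigned(xor));             // 4294967271 (unsigned reading)
        System.out.println(or);                          // -1
        System.out.println(asUnsigned(or));              // 4294967295
        // Negating in two's complement: flip every bit, then add 1.
        System.out.println(~xor + 1);                    // 25
    }
}
```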

Power of 10 problems

I am running into a stupid problem that I have never seen before.
I am trying to do a very simple task for a mathematical calculation that requires growing powers of 10.
To start with, I wrote a very simple loop that works fine, but what I do not understand is that the higher the values get, the less precise the results become.
I would like to know why this happens and how to fix it.
My test code:
public class PowerOfTen {
private int lineCounter = 1;
public static void main(String[] args) {
PowerOfTen powerOfTen = new PowerOfTen();
powerOfTen.growingOfTenMethodOne();
powerOfTen.growingOfTenMethodTwo();
}
public void growingOfTenMethodOne() {
double MAX_VALUE = 1e50;
lineCounter = 1;
for (double i = 1; i < MAX_VALUE; i = i * 10) {
System.out.printf("%03d%1s%f\n", lineCounter, " | ", i);
lineCounter++;
}
}
public void growingOfTenMethodTwo() {
double MAX_VALUE = 50;
lineCounter = 1;
for (double i = 0; i < MAX_VALUE; i++) {
System.out.printf("%03d%1s%f\n", lineCounter, " | ", Math.pow(10, i));
lineCounter++;
}
}
}
Both methods work and are supposed to return correct results, but both of them give some inaccurate results, as you can see in the examples below.
Method 1:
Lines 24, 26, 27, 28, 31, 32, 33, etc. do not show correct results:
022 | 1000000000000000000000.000000
023 | 10000000000000000000000.000000
024 | 99999999999999990000000.000000
025 | 1000000000000000000000000.000000
026 | 9999999999999999000000000.000000
027 | 99999999999999990000000000.000000
028 | 999999999999999900000000000.000000
029 | 10000000000000000000000000000.000000
030 | 100000000000000000000000000000.000000
031 | 999999999999999900000000000000.000000
032 | 9999999999999999000000000000000.000000
033 | 99999999999999990000000000000000.000000
Method 2:
Lines 24 and 30 do not show correct results:
022 | 1000000000000000000000.000000
023 | 10000000000000000000000.000000
024 | 99999999999999990000000.000000
025 | 1000000000000000000000000.000000
026 | 10000000000000000000000000.000000
027 | 100000000000000000000000000.000000
028 | 1000000000000000000000000000.000000
029 | 10000000000000000000000000000.000000
030 | 100000000000000010000000000000.000000
031 | 1000000000000000000000000000000.000000
032 | 10000000000000000000000000000000.000000
033 | 100000000000000000000000000000000.000000
It is not a "stupid problem"; it is one of the core problems of computer science ... dealing with the fact that correct representation of numbers is a non-trivial task.
You might want to start reading here for example.
Anybody who is "programming" should understand what this actually means; and how it affects your application/solution.
This is caused by the conversion from double-precision floating-point format to decimal format. Just as there are periodic numbers in decimal, there are numbers that are periodic in binary but aren't in decimal.
Those cases are periodic in binary.
A Java floating point double is always an IEEE754 64 bit floating point type.
Method 1.
A floating point double can represent integers accurately up to the 53rd power of 2. So inaccuracies will creep in once you exceed 9,007,199,254,740,992. (It's a common misconception that floating points always introduce inaccuracy: in your particular case the calculation will be exact for the first few iterations of your loop).
Method 2.
Math.pow(x, y) is implemented as exp(y log x) if the two arguments are floating point. That will introduce numerical precision issues, due to a floating point double being only accurate to about 15 significant figures.
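As a rough sketch of both the cause and one possible fix: new BigDecimal(double) exposes the exact value a double holds, and BigInteger.TEN.pow(n) computes powers of ten without any rounding. The class and method names are invented for this example, and Double.parseDouble is used here (instead of the loop or Math.pow from the question) only to obtain the double nearest to each power of ten.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Illustrative sketch: what a double really stores versus an exact power of ten.
class PowerOfTenExact {
    // Exact decimal expansion of the double closest to 10^exponent.
    static BigDecimal storedValue(int exponent) {
        return new BigDecimal(Double.parseDouble("1e" + exponent));
    }

    // Exact power of ten, with no floating point involved.
    static BigInteger exactPower(int exponent) {
        return BigInteger.TEN.pow(exponent);
    }

    public static void main(String[] args) {
        for (int n = 21; n <= 24; n++) {
            BigDecimal stored = storedValue(n);
            BigInteger exact = exactPower(n);
            boolean same = stored.compareTo(new BigDecimal(exact)) == 0;
            // 10^22 still fits in the 53-bit significand; 10^23 does not.
            System.out.println(n + " | stored " + stored.toPlainString()
                    + " | exact " + exact + " | same: " + same);
        }
    }
}
```

If the calculation needs every digit, keeping the values in BigInteger (or BigDecimal) throughout is one option; if about 15 to 17 significant digits are enough, double is fine and only the printed format needs care.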
It all comes down to Java specification.
In Java, an int uses 32 bits to represent its value. A float has a 24-bit mantissa, so integers greater than 2^24 can have their least significant bits rounded away. For example, 33554435 (or 0x2000003) will be rounded to a nearby multiple of 4 (33554436 with round-to-nearest). A double has a 53-bit mantissa, so it can represent any 32-bit integer without loss of data.
Check this article that explains everything in a very simple way.
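The float and double limits can be checked with a simple round trip. This is only a sketch with made-up names; it relies on Java's int-to-float conversion rounding to the nearest representable value.

```java
// Illustrative sketch: which ints survive a trip through float and double.
class FloatPrecisionDemo {
    static int throughFloat(int value) {
        return (int) (float) value;
    }

    static int throughDouble(int value) {
        return (int) (double) value;
    }

    public static void main(String[] args) {
        int[] samples = {16777216, 16777217, 33554435};
        for (int v : samples) {
            System.out.println(v + " -> float " + throughFloat(v)
                    + ", double " + throughDouble(v));
        }
        // 16777216 (2^24) survives float; 16777217 comes back as 16777216;
        // 33554435 comes back as 33554436. double preserves all three.
    }
}
```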

How does this print "hello world"?

I discovered this oddity:
for (long l = 4946144450195624l; l > 0; l >>= 5)
System.out.print((char) (((l & 31 | 64) % 95) + 32));
Output:
hello world
How does this work?
The number 4946144450195624 fits 64 bits, and its binary representation is:
10001100100100111110111111110111101100011000010101000
The program decodes a character for every 5-bits group, from right to left
00100|01100|10010|01111|10111|11111|01111|01100|01100|00101|01000
  d  |  l  |  r  |  o  |  w  |     |  o  |  l  |  l  |  e  |  h
5-bit codification
For 5 bits, it is possible to represent 2⁵ = 32 characters. The English alphabet contains 26 letters, and this leaves room for 32 - 26 = 6 symbols
apart from letters. With this codification scheme, you can have all 26 (one case) English letters and 6 symbols (space being among them).
Algorithm description
The >>= 5 in the for loop jumps from group to group, and then each 5-bit group gets isolated by ANDing the number with the mask 31₁₀ = 11111₂ in the expression l & 31.
Now the code maps the 5-bit value to its corresponding 7-bit ASCII character. This is the tricky part. Check the binary representations for the lowercase
alphabet letters in the following table:
ASCII | ASCII | ASCII | Algorithm
character | decimal value | binary value | 5-bit codification
--------------------------------------------------------------
space | 32 | 0100000 | 11111
a | 97 | 1100001 | 00001
b | 98 | 1100010 | 00010
c | 99 | 1100011 | 00011
d | 100 | 1100100 | 00100
e | 101 | 1100101 | 00101
f | 102 | 1100110 | 00110
g | 103 | 1100111 | 00111
h | 104 | 1101000 | 01000
i | 105 | 1101001 | 01001
j | 106 | 1101010 | 01010
k | 107 | 1101011 | 01011
l | 108 | 1101100 | 01100
m | 109 | 1101101 | 01101
n | 110 | 1101110 | 01110
o | 111 | 1101111 | 01111
p | 112 | 1110000 | 10000
q | 113 | 1110001 | 10001
r | 114 | 1110010 | 10010
s | 115 | 1110011 | 10011
t | 116 | 1110100 | 10100
u | 117 | 1110101 | 10101
v | 118 | 1110110 | 10110
w | 119 | 1110111 | 10111
x | 120 | 1111000 | 11000
y | 121 | 1111001 | 11001
z | 122 | 1111010 | 11010
Here you can see that the ASCII characters we want to map begin with the 7th and 6th bits set (11xxxxx₂), except for space, which only has the 6th bit on. You could OR the 5-bit
codification with 96 (96₁₀ = 1100000₂) and that should be enough to do the mapping, but that wouldn't work for space (darn space!).
Now we know that special care has to be taken to process space at the same time as the other characters. To achieve this, the code turns the 7th bit on (but not the 6th) on the extracted 5-bit group with an OR 64 (64₁₀ = 1000000₂), which is the l & 31 | 64 part.
So far the 5-bit group is of the form 10xxxxx₂ (space would be 1011111₂ = 95₁₀).
If we can map space to 0 without affecting the other values, then we can turn the 6th bit on and that should be all.
This is where the mod 95 part comes into play. Space is 1011111₂ = 95₁₀, so with the modulus
operation, (l & 31 | 64) % 95, only space goes back to 0. After this, the code turns the 6th bit on by adding 32₁₀ = 100000₂
to the previous result, ((l & 31 | 64) % 95) + 32, transforming the 5-bit value into a valid ASCII character.
isolates 5 bits --+          +---- takes 'space' (and only 'space') back to 0
                  |          |
                  v          v
              ((l & 31 | 64) % 95) + 32
                       ^           ^
       turns the       |           |
      7th bit on ------+           +--- turns the 6th bit on
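The same steps can be spelled out one per line. This is just a sketch of the decoding described above, with hypothetical class, method and variable names, collecting the characters into a String instead of printing them:

```java
// Illustrative sketch: the one-liner from the question, split into named steps.
class FiveBitDecoder {
    static String decode(long packed) {
        StringBuilder out = new StringBuilder();
        for (long l = packed; l > 0; l >>= 5) {   // move to the next 5-bit group
            long group = l & 31;                  // isolate the lowest 5 bits
            long withBit7 = group | 64;           // 10xxxxx: turn the 7th bit on
            long spaceToZero = withBit7 % 95;     // only 95 (the space code) becomes 0
            out.append((char) (spaceToZero + 32)); // turn the 6th bit on
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode(4946144450195624L)); // hello world
    }
}
```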
The following code does the inverse process: given a lowercase string (maximum 12 characters), it returns the 64-bit long value that could be used with the OP's code:
public class D {
public static void main(String... args) {
String v = "hello test";
int len = Math.min(12, v.length());
long res = 0L;
for (int i = 0; i < len; i++) {
long c = (long) v.charAt(i) & 31;
res |= ((((31 - c) / 31) * 31) | c) << 5 * i;
}
System.out.println(res);
}
}
The following Groovy script prints intermediate values.
String getBits(long l) {
return Long.toBinaryString(l).padLeft(8, '0');
}
for (long l = 4946144450195624l; l > 0; l >>= 5) {
println ''
print String.valueOf(l).toString().padLeft(16, '0')
print '|' + getBits((l & 31))
print '|' + getBits(((l & 31 | 64)))
print '|' + getBits(((l & 31 | 64) % 95))
print '|' + getBits(((l & 31 | 64) % 95 + 32))
print '|';
System.out.print((char) (((l & 31 | 64) % 95) + 32));
}
Here it is:
4946144450195624|00001000|01001000|01001000|01101000|h
0154567014068613|00000101|01000101|01000101|01100101|e
0004830219189644|00001100|01001100|01001100|01101100|l
0000150944349676|00001100|01001100|01001100|01101100|l
0000004717010927|00001111|01001111|01001111|01101111|o
0000000147406591|00011111|01011111|00000000|00100000|
0000000004606455|00010111|01010111|01010111|01110111|w
0000000000143951|00001111|01001111|01001111|01101111|o
0000000000004498|00010010|01010010|01010010|01110010|r
0000000000000140|00001100|01001100|01001100|01101100|l
0000000000000004|00000100|01000100|01000100|01100100|d
Interesting!
Standard ASCII characters which are visible are in the range 32 to 126.
That's why you see 32 and 95 (127 - 32, the number of visible characters) there.
In fact, each character is mapped to 5 bits here (you can work out the 5-bit combination for each character), and then all the bits are concatenated to form a large number.
Positive longs are 63-bit numbers, large enough to hold the encoded form of 12 characters. So a long is large enough to hold hello world, but for larger texts you should use larger numbers, or even a BigInteger.
In an application we wanted to transfer visible English characters, Persian characters and symbols via SMS. As you see, there are 32 (number of Persian characters) + 95 (number of English characters and standard visible symbols) = 127 possible values, which can be represented with 7 bits.
We converted each 16-bit (UTF-16) character to 7 bits and gained a compression ratio of more than 56%. So we could send texts of twice the length in the same number of SMS messages. (Somehow, the same thing happened here.)
The result you are getting happens to be the char representation of the values below:
104 -> h
101 -> e
108 -> l
108 -> l
111 -> o
32 -> (space)
119 -> w
111 -> o
114 -> r
108 -> l
100 -> d
You've encoded characters as 5-bit values and packed 11 of them into a 64 bit long.
(packedValues >> 5*i) & 31 is the i-th encoded value with a range 0-31.
The hard part, as you say, is encoding the space. The lowercase English letters occupy the contiguous range 97-122 in Unicode (and ASCII, and most other encodings), but the space is 32.
To overcome this, you used some arithmetic. ((x+64)%95)+32 is almost the same as x + 96 (note how bitwise OR is equivalent to addition, in this case), but when x=31, we get 32.
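That claim is easy to check exhaustively for the codes that matter. A minimal sketch, with an invented class and method name:

```java
// Illustrative sketch: compare ((x | 64) % 95) + 32 against x + 96.
class SpaceMappingCheck {
    // The mapping used by the original one-liner, applied to one 5-bit code.
    static int map(int x) {
        return ((x | 64) % 95) + 32;
    }

    public static void main(String[] args) {
        for (int x = 1; x <= 26; x++) {
            // For the letter codes 1..26 the mapping equals x + 96 ('a'..'z').
            System.out.println(x + " -> " + map(x) + " '" + (char) map(x) + "'");
        }
        // Code 31 is the only one that wraps around, landing on 32 (space).
        System.out.println(31 + " -> " + map(31));
    }
}
```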
It prints "hello world" for a similar reason this does:
for (int k=1587463874; k>0; k>>=3)
System.out.print((char) (100 + Math.pow(2,2*(((k&7^1)-1)>>3 + 1) + (k&7&3)) + 10*((k&7)>>2) + (((k&7)-7)>>3) + 1 - ((-(k&7^5)>>3) + 1)*80));
But for a somewhat different reason than this:
for (int k=2011378; k>0; k>>=2)
System.out.print((char) (110 + Math.pow(2,2*(((k^1)-1)>>21 + 1) + (k&3)) - ((k&8192)/8192 + 7.9*(-(k^1964)>>21) - .1*(-((k&35)^35)>>21) + .3*(-((k&120)^120)>>21) + (-((k|7)^7)>>21) + 9.1)*10));
I mostly work with Oracle databases, so I would use some Oracle knowledge to interpret and explain :-)
Let's convert the number 4946144450195624 into binary. For that I use a small function called dec2bin, i.e., decimal-to-binary.
SQL> CREATE OR REPLACE FUNCTION dec2bin (N in number) RETURN varchar2 IS
2 binval varchar2(64);
3 N2 number := N;
4 BEGIN
5 while ( N2 > 0 ) loop
6 binval := mod(N2, 2) || binval;
7 N2 := trunc( N2 / 2 );
8 end loop;
9 return binval;
10 END dec2bin;
11 /
Function created.
SQL> show errors
No errors.
SQL>
Let's use the function to get the binary value -
SQL> SELECT dec2bin(4946144450195624) FROM dual;
DEC2BIN(4946144450195624)
--------------------------------------------------------------------------------
10001100100100111110111111110111101100011000010101000
SQL>
Now the catch is the 5-bit conversion. Start grouping from right to left with 5 digits in each group. We get:
100|01100|10010|01111|10111|11111|01111|01100|01100|00101|01000
We are finally left with just 3 digits in the leftmost group, because the binary representation has 53 digits in total.
SQL> SELECT LENGTH(dec2bin(4946144450195624)) FROM dual;
LENGTH(DEC2BIN(4946144450195624))
---------------------------------
53
SQL>
hello world has a total of 11 characters (including the space), so we need to pad the leftmost group, where we were left with just three bits after grouping, with two leading zero bits.
So, now we have:
00100|01100|10010|01111|10111|11111|01111|01100|01100|00101|01000
Now we need to convert each group to a 7-bit ASCII value. For the letters it is easy; we just need to set the 6th and 7th bits, that is, prepend 11 to each 5-bit group above.
That gives:
1100100|1101100|1110010|1101111|1110111|1111111|1101111|1101100|1101100|1100101|1101000
Let's interpret the binary values. I will use the binary to decimal conversion function.
SQL> CREATE OR REPLACE FUNCTION bin2dec (binval in char) RETURN number IS
2 i number;
3 digits number;
4 result number := 0;
5 current_digit char(1);
6 current_digit_dec number;
7 BEGIN
8 digits := length(binval);
9 for i in 1..digits loop
10 current_digit := SUBSTR(binval, i, 1);
11 current_digit_dec := to_number(current_digit);
12 result := (result * 2) + current_digit_dec;
13 end loop;
14 return result;
15 END bin2dec;
16 /
Function created.
SQL> show errors;
No errors.
SQL>
Let's look at each binary value -
SQL> set linesize 1000
SQL>
SQL> SELECT bin2dec('1100100') val,
2 bin2dec('1101100') val,
3 bin2dec('1110010') val,
4 bin2dec('1101111') val,
5 bin2dec('1110111') val,
6 bin2dec('1111111') val,
7 bin2dec('1101111') val,
8 bin2dec('1101100') val,
9 bin2dec('1101100') val,
10 bin2dec('1100101') val,
11 bin2dec('1101000') val
12 FROM dual;
VAL VAL VAL VAL VAL VAL VAL VAL VAL VAL VAL
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
100 108 114 111 119 127 111 108 108 101 104
SQL>
Let's look at what characters they are:
SQL> SELECT chr(bin2dec('1100100')) character,
2 chr(bin2dec('1101100')) character,
3 chr(bin2dec('1110010')) character,
4 chr(bin2dec('1101111')) character,
5 chr(bin2dec('1110111')) character,
6 chr(bin2dec('1111111')) character,
7 chr(bin2dec('1101111')) character,
8 chr(bin2dec('1101100')) character,
9 chr(bin2dec('1101100')) character,
10 chr(bin2dec('1100101')) character,
11 chr(bin2dec('1101000')) character
12 FROM dual;
CHARACTER CHARACTER CHARACTER CHARACTER CHARACTER CHARACTER CHARACTER CHARACTER CHARACTER CHARACTER CHARACTER
--------- --------- --------- --------- --------- --------- --------- --------- --------- --------- ---------
d l r o w ⌂ o l l e h
SQL>
So, what do we get in the output?
d l r o w ⌂ o l l e h
That is hello⌂world in reverse. The only issue is the space, and the reason is well explained by @higuaro in his answer. I honestly couldn't work out the space issue myself at first, until I saw the explanation given in his answer.
I found the code slightly easier to understand when translated into PHP, as follows:
<?php
$result=0;
$bignum = 4946144450195624;
for (; $bignum > 0; $bignum >>= 5){
$result = (( $bignum & 31 | 64) % 95) + 32;
echo chr($result);
}
Use
out.println((char) (((l & 31 | 64) % 95) + 32 / 1002439 * 1002439));
to make it capitalised.

How does AND between two operands work?

I have this code, and it prints 2. I googled and searched books but found no good answer that helped me.
int i = 2;
int j = 3;
int x = i & j;
System.out.println (x);
Can anyone explain?
Here, '&' is the bitwise and operator. Bits are set in the result only if they are set in both of the operands:
Here's the operation:
  2: 00000010
& 3: 00000011
-------------
  2: 00000010
Of course here, ints are 32 bits, and I only showed the last 8, but the first 24 are all zeroes for these numbers anyway.
It's a bitwise operator.
2 in binary is '10'. 3 in binary is '11'. The bitwise & operator compares them like so:
 10
&11
 --
 10
For each column where both numbers are 1, it will return 1. In this case, the result is '10', which as an int is equal to 2.
Take these examples:
System.out.println(2 & 3);
System.out.println(3 & 4);
System.out.println(8 & 4);
System.out.println(9 & 5);
System.out.println(11 & 7);
// binary representation of above operations
System.out.println(0b10 & 0b11);
System.out.println(0b11 & 0b100);
System.out.println(0b1000 & 0b100);
System.out.println(0b1001 & 0b101);
System.out.println(0b1011 & 0b111);
They result in the output of:
2
0
0
1
3
2
0
0
1
3
Then go back and look at the binary representations and notice how the resulting answers are the Bitwise AND of the numbers (and not the Logical AND &&)
& is a bitwise operator that works as follows:
0 & 0 = 0
1 & 0 = 0
0 & 1 = 0
1 & 1 = 1
and hence 2 & 3 = 2:
2 ==> 00000000000000000000000000000010
      &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
3 ==> 00000000000000000000000000000011
      --------------------------------
2 ==> 00000000000000000000000000000010
& is the bitwise AND
&& is the logical AND.
2 & 3 delivers correctly 2.
Expressed in binary representation:
10
11
--
10
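To make the & versus && distinction concrete, here is a small sketch (all names are hypothetical). On ints, & combines bits, which is handy for testing a single bit with a mask. On booleans, & also works but always evaluates both sides, whereas && skips the right side when the left is false.

```java
// Illustrative sketch: bitwise AND on ints, and & versus && on booleans.
class AndOperatorsDemo {
    private static int evaluations;

    private static boolean rightOperand() {
        evaluations++;          // record that the right-hand side actually ran
        return true;
    }

    // Counts how often the right operand runs when the left operand is false.
    static int rightEvaluations(boolean shortCircuit) {
        evaluations = 0;
        boolean left = false;
        boolean ignored = shortCircuit ? (left && rightOperand())
                                       : (left & rightOperand());
        return evaluations;
    }

    // Tests one bit of value using a mask built with a shift.
    static boolean isBitSet(int value, int bit) {
        return (value & (1 << bit)) != 0;
    }

    public static void main(String[] args) {
        System.out.println(2 & 3);                   // 2
        System.out.println(isBitSet(2, 1));          // true: 10 has bit 1 set
        System.out.println(isBitSet(2, 0));          // false
        System.out.println(rightEvaluations(true));  // 0: && short-circuits
        System.out.println(rightEvaluations(false)); // 1: & evaluates both sides
    }
}
```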

How do I add binary numbers and ignore the carry?

Suppose the inputs are two integer values. I want to convert the two integer values to binary, perform binary addition, and give the result with the carry ignored (as the integer equivalent). How would I go about doing this?
An idea that comes to mind is to convert them to binary strings in some way and use an algorithm for binary addition, and then ignore the carry (delete the carry character from the string, if the carry exists).
Sample Input
One number : 1
Second number : 3
Sample Output
2
Explanation:
The lowest bit in the sum is 1 + 1 = 0
The next bit is 0 + 1 = 1 (the carry from the previous bit is discarded)
The answer is 10 in binary, which is 2.
You are probably looking for the bitwise XOR (exclusive OR) which will provide the following outputs for the given inputs:
^ | 0 | 1
--+---+--
0 | 0 | 1
--+---+--
1 | 1 | 0
It behaves like binary addition ( 1+1 = 10) but ignores the overflow if both operands are 1.
int a = 5; // 101
int b = 6; // 110
a ^ b; // 3 or 011
This is just an XOR of the two integers in binary. In Java you can do
result = v1 ^ v2;
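For completeness, a sketch that applies this to the sample input and also shows where the discarded carries went. The names are made up for this example; the identity a + b == (a ^ b) + ((a & b) << 1) is the usual way to split an addition into its carry-free part and its carries.

```java
// Illustrative sketch: XOR as binary addition with every carry dropped.
class CarrylessAdd {
    // Per-bit sum, ignoring carries.
    static int addIgnoringCarry(int a, int b) {
        return a ^ b;
    }

    // The carries that XOR throws away, shifted to the position they would affect.
    static int carries(int a, int b) {
        return (a & b) << 1;
    }

    public static void main(String[] args) {
        System.out.println(addIgnoringCarry(1, 3)); // 2, the sample output
        System.out.println(addIgnoringCarry(5, 6)); // 3
        // Adding the carries back recovers the ordinary sum.
        System.out.println(addIgnoringCarry(5, 6) + carries(5, 6)); // 11
    }
}
```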
