Java "int i = byte1 | 0x0200" vs "int i = byte1"?

In the page Wikipedia - Shifts in Java:
In bit and shift operations, the type byte is implicitly converted to
int. If the byte value is negative, the highest bit is one, then ones
are used to fill up the extra bytes in the int. So
byte b1=-5; int i = b1 | 0x0200;
will give i == -5 as result.
I understand that 0x0200 is equal to 0b0000 0010 0000 0000. But what is the significance of 0x0200 in the passage shown above?
I mean, b1 | 0x0200 will always be equal to b1 (see "My Test" below), so why doesn't the passage above simply write byte b1=-5; int i = b1?
My Test:
public static void main(final String args[]) {
final byte min_byte = Byte.MIN_VALUE; // -128
final byte limit = 0; // according to the bolded words in the passage
for (byte b = min_byte; b < limit; ++b) {
final int i1 = b;
final int i2 = b | 0x0200;
if (i1 != i2) { // this never happens!
System.out.println(b);
}
}
}

But what is the significance of 0x0200 in the passage shown above?
This is done for illustration purposes only: the value 0x200 ORs in a one in a position that is equal to 1 already. The idea is to show that the result is not 0x000002FB, but actually -5, i.e. 0xFFFFFFFB.
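To make the point concrete, here is a minimal sketch that compares OR-ing the sign-extended byte with OR-ing only its low 8 bits. The class and method names are made up for this illustration and do not come from the Wikipedia passage.

```java
class SignExtensionDemo {
    // The byte is promoted to int with sign extension before the OR is applied.
    static int orWithSignExtension(byte b) {
        return b | 0x0200;
    }

    // Masking with 0xFF first keeps only the low 8 bits, so bit 9 is not already set.
    static int orWithoutSignExtension(byte b) {
        return (b & 0xFF) | 0x0200;
    }

    public static void main(String[] args) {
        byte b1 = -5;
        System.out.println(Integer.toHexString(orWithSignExtension(b1)));    // fffffffb
        System.out.println(Integer.toHexString(orWithoutSignExtension(b1))); // 2fb
    }
}
```

The first result is -5 because bit 9 was already 1 after sign extension; the second is 0x2FB, the value one might wrongly expect.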

I understand that 0x0200 is equal to 0b1111 1110 0000 0000
No, it isn't. The correct value is given by,
int i = 0x0200; // <-- decimal 512
System.out.println(Integer.toBinaryString(i));
Which outputs
1000000000
If we examine your second value,
byte b1 = -5;
System.out.println(Integer.toBinaryString(b1));
We get
11111111111111111111111111111011
Lining up both numbers
11111111111111111111111111111011
00000000000000000000001000000000
It seems clear that the result will be the bit value of -5 (since the only 0 in -5 is also 0 in 0x0200). To determine the significance we can examine
int i = 0x0200; // <-- Decimal 512
System.out.println("Dec: " + Integer.toBinaryString(i).length());
Output
Dec: 10
So the given bitwise OR forces the tenth bit to 1. That bit was already 1 in your sign-extended input, but if you used decimal 1535 (0b101 1111 1111) instead, then you would get,
System.out.println(1535 | 0x0200);
Output is
2047
Because if you perform a bitwise-or on the two numbers
01000000000
10111111111
you get
11111111111

Related

How does Xor determine which int is different?

here's the problem that's being solved
"You're given three integers, a, b and c. It is guaranteed that two of these integers are equal to each other. What is the value of the third integer? "
here's the code
int extraNumber(int a, int b, int c) {
return a^b^c;
}
I understand that the caret "^" means XOR, which in simple terms means "this OR that, but not both", but what I fail to understand is how this code determines which int is different from the others. Or perhaps I don't understand XOR properly?

Look at it at bit-level:
A value x xor-ed with itself will always result in 0 (check: 1^1=0 and 0^0=0).
A value y xor-ed with 0 will always result in the same value y (check: 1^0=1 and 0^0=0).
Since it holds true for individual bits and int are just 32 bits, the same rules hold true for full int values.
Therefore the method doesn't need to figure out "which values are different": XOR is commutative and associative, so the two equal values can be paired up in any order; a value xor-ed with itself will cancel out to 0, and xor-ing the "remaining" value with 0 will just return the same value.
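As a quick check of this reasoning, the sketch below calls the method from the question with the odd value in each of the three positions. The wrapper class name is an assumption made for the example.

```java
class XorOddOneOut {
    // Same body as the method in the question.
    static int extraNumber(int a, int b, int c) {
        return a ^ b ^ c;
    }

    public static void main(String[] args) {
        // The position of the odd value does not matter.
        System.out.println(extraNumber(7, 7, 42));  // 42
        System.out.println(extraNumber(7, 42, 7));  // 42
        System.out.println(extraNumber(42, 7, 7));  // 42
        // Negative values work the same way, since XOR acts bit by bit.
        System.out.println(extraNumber(-3, 5, -3)); // 5
    }
}
```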

Joachim's answer is correct. To add a bit more detail with an example, let's say you pass the three arguments (2, 3, 2) to your method. The bit positions, read from the right, have the values 1, 2, 4, 8, 16, and so on, and each column is combined according to the truth table for XOR:
https://en.wikipedia.org/wiki/XOR_gate
private int extraNumber(int a, int b, int c) {
// 8421 -> this is bit level,
// let's compare a^b
// 0010 = 2 (a value)
// 0011 = 3 (b value, sum of 2nd + 1st bits)
// 0001 = 1 (after XOR with a^b)
// 0010 = 2 (c value)
// 0011 = 3 (the final output)
return a^b^c;
}

You can also check the intermediate results by breaking up the code a^b^c into two statements.
For example:
int a = 0b1010_1010;
int b = 0b1010_1010; // same value as a
int c = 0b1111_0000; // the remaining item
// CASE 1 (xor the different values first)
int d1 = a^c; // results in 0b0101_1010
int e1 = d1^b; // results in 0b1111_0000 (the answer)
// CASE 2 (xor the same values first)
int d2 = a^b; // results in 0b0000_0000
int e2 = d2^c; // results in 0b1111_0000 (the answer)

What is the purpose of low and high nibble when converting a string to a HexString

Recently I have been going through some examples of MD5 to start getting an understanding of security. MD5 has been fairly simple to understand for the most part and a good starting point, even though it is no longer secure. Despite this, I have a question regarding high and low nibbles when it comes to converting a string to a hex string.
I know that the high and low nibbles are each half a byte, so each can be represented by a single hexadecimal digit. What I am not understanding, though, is exactly how they work and the purpose that they serve. I have been searching on Google and I can't find much of an answer that will help explain what they do in the context that they are in. Here is the context of the conversion:
private static String toHexString( byte[] byteArray )
{
final String HEX_CHARS = "0123456789ABCDEF";
byte[] result = new byte[byteArray.length << 1];
int len = byteArray.length;
for( int i = 0 ; i < len ; i++ )
{
byte b = byteArray[i];
int lo4 = b & 0x0F;
int hi4 = ( b & 0xF0 ) >> 4;
result[i * 2] = (byte)HEX_CHARS.charAt( hi4 );
result[i * 2 + 1] = (byte)HEX_CHARS.charAt( lo4 );
}
return new String( result );
}
I don't exactly understand what is going on in the for statement. I would appreciate any help understanding this and if there is some link to some places that I can learn more about this please also leave it.
I understand the base definition of nibble but not the operations and what the assignment to the number 4 is doing either.
If I need to post the full example code I will just ask as I am unsure if it is needed.

This code simply converts a byte array to hexadecimal representation. Within the for-loop, each byte is converted into two characters. I think it's easier to understand it on an example.
Assume one of the bytes in your array is, say, 218 (unsigned). That's 1101 1010 in binary.
lo4 gets the lowest 4 bits by AND-ing the byte with the bitmask 00001111:
int lo4 = b & 0x0F;
This results in 1010, 10 in decimal.
hi4 gets the highest 4 bits by AND-ing with the bitmask 1111 0000 and shifting 4 bits to the right:
int hi4 = ( b & 0xF0 ) >> 4;
This results in 1101, 13 in decimal.
Now to get the hexadecimal representation of this byte you only need to convert 10 and 13 to their hexadecimal representations and concatenate. For this you simply look up the character in the prepared HEX_CHARS string at the specific index. 10 -> A, 13 -> D, resulting in 218 -> DA.
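The worked example above can be reproduced with a small sketch that converts a single byte. The class name and the helper byteToHex are invented for illustration; the nibble arithmetic is the same as in the question's loop.

```java
class NibbleDemo {
    private static final String HEX_CHARS = "0123456789ABCDEF";

    // Splits one byte into its high and low nibble and looks up both hex digits.
    static String byteToHex(byte b) {
        int hi4 = (b & 0xF0) >> 4; // upper 4 bits, moved down to the range 0..15
        int lo4 = b & 0x0F;        // lower 4 bits, already in the range 0..15
        return "" + HEX_CHARS.charAt(hi4) + HEX_CHARS.charAt(lo4);
    }

    public static void main(String[] args) {
        System.out.println(byteToHex((byte) 218)); // DA
        System.out.println(byteToHex((byte) 24));  // 18
    }
}
```

Note that (byte) 218 is stored as -38, but the mask with 0xF0 discards the sign-extended upper bits, so the lookup index stays within 0..15.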

It's just bit operations. The & operator takes the bits of each operand and does a logical AND on each pair of bits.
int lo4 = b & 0x0F;
For instance, if b = 24 then it will evaluate to this:
00011000
&00001111
=00001000
The second such line does the same for the upper four bits.
00011000
&11110000
=00010000
The '>>' operator shifts all of the bits to the right by the given number of positions, so
00010000 >> 4 = 00000001.
This is done so that you can derive the hex value from the number. Since each hex character represents 4 bits, splitting the number into pieces of 4 bits lets us convert each piece separately.
In the case of b = 24 we now have lo4 = 1000 or 8 and hi4 = 0001 or 1. The last part of the loop looks up the character for each.
HEX_CHARS.charAt(hi4) = '1' and HEX_CHARS.charAt(lo4) = '8', which gives you "18" for that part of the string, and 0x18 is 24 in hex.

How does cast in java work? [duplicate]

int i =132;
byte b =(byte)i; System.out.println(b);
Mindboggling. Why is the output -124?

In Java, an int is 32 bits. A byte is 8 bits.
Most primitive types in Java are signed, and byte, short, int, and long are encoded in two's complement. (The char type is unsigned, and the concept of a sign is not applicable to boolean.)
In this number scheme the most significant bit specifies the sign of the number. If more bits are needed, the most significant bit ("MSB") is simply copied to the new MSB.
So if you have the byte with bit pattern 11111111 (255 if read as unsigned)
and you want to represent it as an int (32 bits), you simply copy the 1 to the left 24 times.
Now, one way to read a negative two's complement number is to start with the least significant bit, move left until you find the first 1, then invert every bit afterwards. The resulting number is the positive version of that number.
For example: 11111111 goes to 00000001 = -1. This is what Java will display as the value.
What you probably want to do is know the unsigned value of the byte.
You can accomplish this with a bitmask that deletes everything but the least significant 8 bits. (0xff)
So:
byte signedByte = -1;
int unsignedByte = signedByte & (0xff);
System.out.println("Signed: " + signedByte + " Unsigned: " + unsignedByte);
Would print out: "Signed: -1 Unsigned: 255"
What's actually happening here?
We are using bitwise AND to mask all of the extraneous sign bits (the 1's to the left of the least significant 8 bits.)
When an int is converted into a byte, Java chops-off the left-most 24 bits
11111111111111111111111111010101
&
00000000000000000000000011111111
=
00000000000000000000000011010101
Since the 32nd bit is now the sign bit instead of the 8th bit (and we set the sign bit to 0 which is positive), the original 8 bits from the byte are read by Java as a positive value.
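A minimal sketch of the masking step, compared against the standard library method Byte.toUnsignedInt (available since Java 8). The class and method names are assumptions for the example.

```java
class UnsignedByteDemo {
    // Sign extension happens first; the mask then clears the upper 24 bits.
    static int maskToUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte signedByte = -1;
        System.out.println(maskToUnsigned(signedByte));     // 255
        System.out.println(Byte.toUnsignedInt(signedByte)); // 255
    }
}
```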

132 in digits (base 10) is 1000_0100 in bits (base 2) and Java stores int in 32 bits:
0000_0000_0000_0000_0000_0000_1000_0100
Algorithm for int-to-byte is left-truncate; Algorithm for System.out.println is two's-complement (Two's-complement is if leftmost bit is 1, interpret as negative one's-complement (invert bits) minus-one.); Thus System.out.println(int-to-byte( )) is:
interpret-as( if-leftmost-bit-is-1[ negative(invert-bits(minus-one(] left-truncate(0000_0000_0000_0000_0000_0000_1000_0100) [)))] )
=interpret-as( if-leftmost-bit-is-1[ negative(invert-bits(minus-one(] 1000_0100 [)))] )
=interpret-as(negative(invert-bits(minus-one(1000_0100))))
=interpret-as(negative(invert-bits(1000_0011)))
=interpret-as(negative(0111_1100))
=interpret-as(negative(124))
=interpret-as(-124)
=-124 Tada!!!

byte in Java is signed, so it has a range -2^7 to 2^7-1 - ie, -128 to 127.
Since 132 is above 127, you end up wrapping around to 132-256=-124. That is, essentially 256 (2^8) is added or subtracted until it falls into range.
For more information, you may want to read up on two's complement.
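The "add or subtract 256 until it fits" description can be written out literally and compared with the cast. This is only a sketch of that mental model (the names are hypothetical), not how the JVM actually performs the conversion.

```java
class WrapIntoByteRange {
    // Adds or subtracts 256 until the value lies in -128..127.
    static int wrap(int n) {
        int v = n;
        while (v > Byte.MAX_VALUE) {
            v -= 256;
        }
        while (v < Byte.MIN_VALUE) {
            v += 256;
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println(wrap(132) + " " + (byte) 132);   // -124 -124
        System.out.println(wrap(300) + " " + (byte) 300);   // 44 44
        System.out.println(wrap(-132) + " " + (byte) -132); // 124 124
    }
}
```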

132 is outside the range of a byte which is -128 to 127 (Byte.MIN_VALUE to Byte.MAX_VALUE)
Instead, the top bit of the 8-bit value is treated as the sign bit, which indicates that the value is negative in this case. So the number is 132 - 256 = -124.

Here is a very mechanical method without the distracting theories:
1. Convert the number into its binary representation (use a calculator, OK?).
2. Copy only the rightmost 8 bits (the least significant byte) and discard the rest.
3. From the result of step 2, if the leftmost bit is 0, use a calculator to convert the number to decimal. This is your answer.
4. Otherwise (the leftmost bit is 1) your answer is negative. Leave all rightmost zeros and the first non-zero bit unchanged, and invert the rest, that is, replace 1's by 0's and 0's by 1's. Then use a calculator to convert to decimal and prepend a negative sign to indicate that the value is negative.
This more practical method agrees with the more theoretical answers above. Note that the 4 steps outlined above are bit manipulation, not a literal use of the % operator: 132 % 256 is still 132, not -124. So Java books that describe the cast as a plain modulo operation are imprecise; the result only matches modulo-256 arithmetic after it is mapped into the range -128 to 127.

Two's complement equation:
$$x = -a_{N-1} \cdot 2^{N-1} + \sum_{i=0}^{N-2} a_i \cdot 2^i$$
In Java, byte (N=8) and int (N=32) are represented by the two's-complement equation shown above.
From the equation, the a7 term is negative for byte but positive for int.
coef: a7 a6 a5 a4 a3 a2 a1 a0
Binary: 1 0 0 0 0 1 0 0
----------------------------------------------
int: 128 + 0 + 0 + 0 + 0 + 4 + 0 + 0 = 132
byte: -128 + 0 + 0 + 0 + 0 + 4 + 0 + 0 = -124
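The table can be reproduced with a short sketch that weights each bit, giving the top bit of an N-bit number a negative weight. The class and method names are illustrative assumptions.

```java
class TwosComplementWeights {
    // Interprets the low n bits of 'bits' as an n-bit two's-complement number.
    static long value(long bits, int n) {
        long result = 0;
        // Bits 0 .. n-2 carry the positive weight 2^k.
        for (int k = 0; k < n - 1; k++) {
            if (((bits >> k) & 1) == 1) {
                result += 1L << k;
            }
        }
        // Bit n-1 carries the negative weight -2^(n-1).
        if (((bits >> (n - 1)) & 1) == 1) {
            result -= 1L << (n - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(value(0b1000_0100, 32)); // 132
        System.out.println(value(0b1000_0100, 8));  // -124
    }
}
```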

Often in books you will find the cast from int to byte explained as modulus division. This is not strictly correct, as shown below.
What actually happens is that the 24 most significant bits of the int's binary value are discarded. If the leftmost of the remaining 8 bits is set, the result is negative, which is where the confusion comes from.
public class castingsample{
public static void main(String args[]){
int i;
byte y;
i = 1024;
for(i = 1024; i > 0; i-- ){
y = (byte)i;
System.out.print(i + " mod 128 = " + i%128 + " also ");
System.out.println(i + " cast to byte " + " = " + y);
}
}
}

A quick algorithm that simulates the way it works is the following:
public int toByte(int number) {
int tmp = number & 0xff;
return (tmp & 0x80) == 0 ? tmp : tmp - 256;
}
How does this work? Look at daixtr's answer. An implementation of the exact algorithm described in that answer is the following:
public static int toByte(int number) {
int tmp = number & 0xff;
if ((tmp & 0x80) == 0x80) {
int bit = 1;
int mask = 0;
for(;;) {
mask |= bit;
if ((tmp & bit) == 0) {
bit <<=1;
continue;
}
int left = tmp & (~mask);
int right = tmp & mask;
left = ~left;
left &= (~mask);
tmp = left | right;
tmp = -(tmp & 0xff);
break;
}
}
return tmp;
}

If you want to understand this mathematically:
Numbers between -128 and 127 keep their value. For numbers from 128 to 383 the result is (your number - 256).
E.g. for 132, the answer will be
132 - 256 = -124
i.e. adding 256 to the answer gives back your number:
256 + (-124) is 132
Another Example
double a = 295.04;
int b = 300;
byte c = (byte) a;
byte d = (byte) b; System.out.println(c + " " + d);
the Output will be 39 44
(295 - 256) (300 - 256)
NOTE: it won't consider numbers after the decimal.

Conceptually, repeated subtractions of 256 are made to your number, until it is in the range -128 to +127. So in your case, you start with 132, then end up with -124 in one step.
Computationally, this corresponds to extracting the 8 least significant bits from your original number. (And note that the most significant bit of these 8 becomes the sign bit.)
Note that in some other languages this behaviour is not guaranteed: in C, and in C++ before C++20, converting an out-of-range value to a signed type is implementation-defined.

In Java, an int takes 4 bytes = 4 x 8 = 32 bits.
A byte is 8 bits, with range -128 to 127.
Converting an int into a byte is like fitting a big object into a small box: only the lowest 8 bits fit.
If the sign is negative, take the 2's complement to find the magnitude.
Example 1: let the number be 130
step 1: 130 in terms of bits = 1000 0010
step 2: consider the first 7 bits; the 8th bit is the sign (1 = -ve and 0 = +ve)
step 3: convert the first 7 bits to 2's complement
000 0010
-------------
111 1101
add 1
-------------
111 1110 =126
step 4:8th bit is "1" hence the sign is -ve
step 5:byte of 130=-126
Example 2: let the number be 500
step 1: 500 in terms of bits = 0001 1111 0100
step 2: consider the first 7 bits = 111 0100
step 3: the 8th bit is '1', which gives a -ve sign (the bits above the 8th are discarded)
step 4: take the 2's complement
111 0100
-------------
000 1011
add 1
-------------
000 1100 =12
step 5:byte of 500=-12
Example 3: let the number be 300
300 = 1 0010 1100
first 7 bits = 010 1100
the 8th bit is '0', so the sign is +ve and there is no need to take the 2's complement
hence 010 1100 = 44
byte(300) = 44

N is the input number
case 1: 0<=N<=127 answer=N;
case 2: 128<=N<=255 answer=N-256
case 3: N>=256
temp1=N/256;
temp2=N-temp1*256;
if temp2<=127 then answer=temp2;
else if temp2>=128 then answer=temp2-256;
case 4: negative number input
do the same procedure on -N, then change the sign of the solution (exception: if that gives +128, the answer is -128)
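The case analysis above can be collapsed into one expression using Math.floorMod, which always returns a non-negative remainder for a positive divisor and so handles negative inputs without a separate case. This is a sketch of equivalent arithmetic under that assumption, not a description of what the JVM does internally; the names are made up.

```java
class NarrowingFormula {
    // Reduces n to 0..255, then maps 128..255 down to -128..-1.
    static int narrow(int n) {
        int r = Math.floorMod(n, 256);
        return r <= 127 ? r : r - 256;
    }

    public static void main(String[] args) {
        // Compare the formula with the real cast over a range of inputs.
        for (int n = -1000; n <= 1000; n++) {
            if (narrow(n) != (byte) n) {
                throw new AssertionError("mismatch at " + n);
            }
        }
        System.out.println(narrow(132));  // -124
        System.out.println(narrow(-128)); // -128
    }
}
```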

Why does i = i + i give me 0?

I have a simple program:
public class Mathz {
static int i = 1;
public static void main(String[] args) {
while (true){
i = i + i;
System.out.println(i);
}
}
}
When I run this program, all I see is 0 for i in my output. I would have expected the first time round we would have i = 1 + 1, followed by i = 2 + 2, followed by i = 4 + 4 etc.
Is this due to the fact that as soon as we try to re-declare i on the left hand-side, its value gets reset to 0?
If anyone can point me into the finer details of this that would be great.
Change the int to long and it seems to be printing numbers as expected. I'm surprised at how fast it hits the max 32-bit value!

Introduction
The problem is integer overflow. If a value overflows, it goes back to the minimum value and continues from there. If it underflows, it goes back to the maximum value and continues from there. An odometer is a good way to picture this: it is a mechanical overflow, but still a good example.
In an Odometer, the max digit = 9, so going beyond the maximum means 9 + 1, which carries over and gives a 0 ; However there is no higher digit to change to a 1, so the counter resets to zero. You get the idea - "integer overflows" come to mind now.
The largest decimal literal of type int is 2147483647 (2^31-1). All
decimal literals from 0 to 2147483647 may appear anywhere an int
literal may appear, but the literal 2147483648 may appear only as the
operand of the unary negation operator -.
If an integer addition overflows, then the result is the low-order
bits of the mathematical sum as represented in some sufficiently large
two's-complement format. If overflow occurs, then the sign of the
result is not the same as the sign of the mathematical sum of the two
operand values.
Thus, 2147483647 + 1 overflows and wraps around to -2147483648. Hence int i = 2147483647 + 1 overflows and does not equal 2147483648. Also, you say "it always prints 0". It does not; see http://ideone.com/WHrQIW. The 8 numbers below show the point at which it pivots and overflows, after which it starts to print 0s. Also, don't be surprised at how fast it calculates; today's machines are fast.
268435456
536870912
1073741824
-2147483648
0
0
0
0
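If silent wrap-around is not what you want, one option is Math.addExact (available since Java 8), which throws an ArithmeticException instead of wrapping. The sketch below uses it to detect the point where doubling overflows; the class and method names are assumptions for the example.

```java
class OverflowDetection {
    // Returns true if i + i does not fit in an int.
    static boolean doublingOverflows(int i) {
        try {
            Math.addExact(i, i);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int i = 1073741824; // 2^30
        System.out.println(i + i);                // -2147483648 (wraps silently)
        System.out.println(doublingOverflows(i)); // true
        System.out.println(doublingOverflows(536870912)); // false
    }
}
```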

The issue is due to integer overflow.
In 32-bit twos-complement arithmetic:
i does indeed start out having power-of-two values, but then overflow behaviors start once you get to 2^30:
2^30 + 2^30 = -2^31
-2^31 + -2^31 = 0
...in int arithmetic, since it's essentially arithmetic mod 2^32.
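One way to see the "mod 2^32" behaviour is to compute the exact sum in a long and then keep only the low 32 bits with a cast. This is a sketch with invented names; the cast back to int is what discards the high bits.

```java
class ModuloTwoToThe32 {
    // Computes the exact sum in 64 bits, then keeps only the low 32 bits.
    static int addAsInt(int a, int b) {
        long exact = (long) a + (long) b;
        return (int) exact;
    }

    public static void main(String[] args) {
        int twoTo30 = 1 << 30;
        System.out.println(addAsInt(twoTo30, twoTo30));                     // -2147483648
        System.out.println(addAsInt(Integer.MIN_VALUE, Integer.MIN_VALUE)); // 0
    }
}
```

The results are identical to plain int addition, which is the point: int addition behaves as if the exact sum were truncated to 32 bits.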

No, it does not print only zeros.
Change it to this and you will see what happens.
int k = 50;
while (true){
i = i + i;
System.out.println(i);
k--;
if (k<0) break;
}
What happens is called overflow.

static int i = 1;
public static void main(String[] args) throws InterruptedException {
while (true){
i = i + i;
System.out.println(i);
Thread.sleep(100);
}
}
Output:
2
4
8
16
32
64
...
1073741824
-2147483648
0
0
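When the sum exceeds Integer.MAX_VALUE it wraps around: 1073741824 + 1073741824 becomes -2147483648, and -2147483648 + -2147483648 then wraps to 0.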

Since I don't have enough reputation I cannot post a picture of the output of the same program in C with controlled output. You can try it yourself and see that it prints 32 times: as explained, due to overflow i = 1073741824 + 1073741824 changes to -2147483648, and one further addition is again out of the range of int and turns to zero. (Strictly speaking, signed overflow is undefined behaviour in C; this is what typical compilers produce.)
#include<stdio.h>
#include<conio.h>
int main()
{
static int i = 1;
while (1){
i = i + i;
printf("\n%d",i);
_getch();
}
return 0;
}

The value of i is stored in memory using a fixed quantity of binary digits. When a number needs more digits than are available, only the lowest digits are stored (the highest digits get lost).
Adding i to itself is the same as multiplying i by two. Just like multiplying a number by ten in decimal notation can be performed by sliding each digit to the left and putting a zero on the right, multiplying a number by two in binary notation can be performed the same way. This adds one digit on the right, so a digit gets lost on the left.
Here the starting value is 1, so if we use 8 digits to store i (for example),
after 0 iterations, the value is 00000001
after 1 iteration , the value is 00000010
after 2 iterations, the value is 00000100
and so on, until the final non-zero step
after 7 iterations, the value is 10000000
after 8 iterations, the value is 00000000
No matter how many binary digits are allocated to store the number, and no matter what the starting value is, eventually all of the digits will be lost as they are pushed off to the left. After that point, continuing to double the number will not change the number - it will still be represented by all zeroes.
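The same argument applies to Java's 32-bit int. As a small sketch (names are hypothetical), the method below counts how many doublings it takes until every set bit has been pushed off the left end.

```java
class DoublingUntilZero {
    // Counts how many times start must be doubled before it becomes 0.
    static int iterationsUntilZero(int start) {
        int i = start;
        int count = 0;
        while (i != 0) {
            i = i + i; // same bits as i << 1
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(iterationsUntilZero(1)); // 32
        System.out.println(iterationsUntilZero(2)); // 31
        System.out.println(iterationsUntilZero(3)); // 32
    }
}
```

The count depends only on the position of the lowest set bit, since that bit is the last one to be shifted out.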

It is correct, but on the 31st iteration 1073741824 + 1073741824 overflows to -2147483648, and from the 32nd iteration on it prints only 0.
You can refactor to use BigInteger, so your infinite loop will work correctly.
import java.math.BigInteger;
public class Mathz {
static BigInteger i = new BigInteger("1");
public static void main(String[] args) {
while (true){
i = i.add(i);
System.out.println(i);
}
}
}

For debugging such cases it is good to reduce the number of iterations in the loop. Use this instead of your while(true):
for(int r = 0; r<100; r++)
You can then see that it starts with 2 and is doubling the value until it is causing an overflow.

I'll use an 8-bit number for illustration because it can be completely detailed in a short space. Hex numbers begin with 0x, while binary numbers begin with 0b.
The max value for an 8-bit unsigned integer is 255 (0xFF or 0b11111111).
If you add 1, you would typically expect to get: 256 (0x100 or 0b100000000).
But since that's too many bits (9), that's over the max, so the first part just gets dropped, leaving you with 0 effectively (0x(1)00 or 0b(1)00000000, but with the 1 dropped).
So when your program runs, you get:
1 = 0x01 = 0b1
2 = 0x02 = 0b10
4 = 0x04 = 0b100
8 = 0x08 = 0b1000
16 = 0x10 = 0b10000
32 = 0x20 = 0b100000
64 = 0x40 = 0b1000000
128 = 0x80 = 0b10000000
256 = 0x00 = 0b00000000 (wraps to 0)
0 + 0 = 0 = 0x00 = 0b00000000
0 + 0 = 0 = 0x00 = 0b00000000
0 + 0 = 0 = 0x00 = 0b00000000
...
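Java has no unsigned 8-bit type, but the table above can be simulated by keeping a value in an int and masking it to 8 bits after each addition. This is an illustrative sketch with made-up names, not something the original program does.

```java
class EightBitUnsignedDoubling {
    // Doubles v and keeps only the low 8 bits, like an 8-bit unsigned register.
    static int doubleAs8Bit(int v) {
        return (v + v) & 0xFF;
    }

    public static void main(String[] args) {
        int v = 1;
        for (int step = 0; step < 10; step++) {
            v = doubleAs8Bit(v);
            System.out.println(v); // 2, 4, 8, 16, 32, 64, 128, 0, 0, 0
        }
    }
}
```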

The largest decimal literal of type int is 2147483648 (=2^31). All decimal literals from 0 to 2147483647 may appear anywhere an int literal may appear, but the literal 2147483648 may appear only as the operand of the unary negation operator -.
If an integer addition overflows, then the result is the low-order bits of the mathematical sum as represented in some sufficiently large two's-complement format. If overflow occurs, then the sign of the result is not the same as the sign of the mathematical sum of the two operand values.

Anding with 0xff, clarification needed

In the following snippet consider replacing line 8 with commented equivalent
1. private static String ipToText(byte[] ip) {
2. StringBuffer result = new StringBuffer();
3.
4. for (int i = 0; i < ip.length; i++) {
5. if (i > 0)
6. result.append(".");
7.
8. result.append(ip[i]); // compare with result.append(0xff & ip[i]);
9. }
10.
11. return result.toString();
12. }
.equals() test confirms that ANDing with 0xff does not change anything. Is there a reason for this mask to be applied?

byte in Java is a number between −128 and 127 (signed, like every integer in Java (except for char if you want to count it)). By anding with 0xff you're forcing it to be a positive int between 0 and 255.
It works because Java will perform a widening conversion to int, using sign extension, so instead of a negative byte you will have a negative int. Masking with 0xff will leave only the lower 8 bits, thus making the number positive again (and what you initially intended).
You probably didn't notice the difference because you tested with a byte[] with only values smaller than 128.
Small example:
public class A {
public static void main(String[] args) {
int[] ip = new int[] {192, 168, 101, 23};
byte[] ipb = new byte[4];
for (int i =0; i < 4; i++) {
ipb[i] = (byte)ip[i];
}
for (int i =0; i < 4; i++) {
System.out.println("Byte: " + ipb[i] + ", And: " + (0xff & ipb[i]));
}
}
}
This prints
Byte: -64, And: 192
Byte: -88, And: 168
Byte: 101, And: 101
Byte: 23, And: 23
showing the difference between what's in the byte, what went into the byte when it still was an int and what the result of the & operation is.
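Applied to the method from the question, the difference looks like this. The sketch below keeps both variants side by side; the class name and the method name ipToTextUnmasked are assumptions, and StringBuilder is swapped in for StringBuffer purely for brevity.

```java
class IpToTextDemo {
    // Masked version: each byte is read as an unsigned value 0..255.
    static String ipToText(byte[] ip) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < ip.length; i++) {
            if (i > 0) {
                result.append('.');
            }
            result.append(ip[i] & 0xFF);
        }
        return result.toString();
    }

    // Unmasked version: bytes of 128 and above show up as negative numbers.
    static String ipToTextUnmasked(byte[] ip) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < ip.length; i++) {
            if (i > 0) {
                result.append('.');
            }
            result.append(ip[i]);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        byte[] ip = {(byte) 192, (byte) 168, 101, 23};
        System.out.println(ipToText(ip));         // 192.168.101.23
        System.out.println(ipToTextUnmasked(ip)); // -64.-88.101.23
    }
}
```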

As you're already working with an array of bytes here, and you're doing a bitwise operation, you can ignore how Java treats all bytes as signed. After all, you're working on the bit level now, and there is no such thing as "signed" or "unsigned" values on the level of bits.
Masking an 8-bit value (a byte) with all 1's is just a waste of cycles, as nothing will ever be masked off. A bitwise AND will return a bit true if both bits being compared are true, thus if the mask contains all 1's, then you're guaranteed that all bits of the masked value will remain unchanged after the AND operation.
Consider the following examples:
Mask off the upper nibble:
0110 1010
AND 0000 1111 (0x0F)
= 0000 1010
Mask off the lower nibble:
0110 1010
AND 1111 0000 (0xF0)
= 0110 0000
Mask off... Eh, nothing:
0110 1010
AND 1111 1111 (0xFF)
= 0110 1010
Of course, if you were working with a full-blown int here, you'd get the result that the others have described: you'd "force" the int to be the equivalent of an unsigned byte.

In this example, I don't see how it would make any difference. You are anding the 0xff with a byte. A byte by definition has 8 bits, and the AND masks off the last 8 bits. So you're taking the last 8 of 8, that's not going to do anything.
Anding with 0xff would make sense if the thing you were anding with it was bigger than a byte, a short or an int or whatever.

This should only make a difference if there are negative bytes. & 0xff is typically used to interpret a byte as unsigned.
