Java bit unsigned shifting (>>>) give strange result [duplicate]

Java bit unsigned shifting (>>>) give strange result [duplicate] - java

This question already has answers here:
java bit manipulation
(5 answers)
Closed 8 years ago.
I have this code:
int i = 255;
byte b = (byte) i;
int c;
System.out.println(Integer.toBinaryString( i));
System.out.println("b = " + b); // b = -1
c=b>>>1;
System.out.println(Integer.toBinaryString( c));
System.out.println(c);
But I can't understand how it works. I think that unsigned shifting to 255(11111111) should give me 127(0111111) but it doesn't. Is my assumption wrong?

Shift operators including >>> operate on ints. The value of b, which is -1 because byte is signed, is promoted to int before the shift. That is why you see the results that you see.
The reason why 255 is re-interpreted as -1 is that 255 has all its eight bits set to one. When you assign that to a signed 8-bit type of byte, it is interpreted as -1 following two's complement rules.

this is how you can get the expected result
c = (0xFF & b) >>> 1;
see dasblinkenlight's answer for details

Try this, and you will understand:
System.out.println(Integer.toBinaryString(i)); // 11111111
System.out.println(Integer.toBinaryString(b)); // 11111111111111111111111111111111
System.out.println(Integer.toBinaryString(c)); // 1111111111111111111111111111111
Variable int i equals 255, so the first print makes sense.
Variable byte b equals -1, because you store 255 in a single byte.
But when you call Integer.toBinaryString(b), the compiler converts b from byte to int, and (int)-1 == FFFFFFFFh == 11111111111111111111111111111111b, hence the second print.
Variable int c equals b>>>1, so the third print makes sense.

Related

Java not treating 0xFC as a signed integer?

Just when I though I had a fair grasp on how Java treats all Integers/Bytes etc.. as signed numbers, it hit me with another curve ball and got me thinking if I really understand this treatment after all.
This is a piece of assembly code that is supposed to jump to an address if a condition is met (which indeed is met). PC just before the jump is C838h and then after condition check it is supposed to be: C838h + FCh (h = hex) which I thought would be treated as signed so the PC would jump backwards: FCh = -4 in two's compliment negative number. But To my surprise, java ADDED FCh to the PC making it incorrectly jump to C934h instead of back to C834h.
C832: LD B,04 06 0004 (PC=C834)
C834: LD (HL), A 77 9800 (PC=C835)
C835: INC L:00 2C 0001 (PC=C836)
C836: JR NZ, n 20 00FC (PC=C934)
I tested this in a java code and indeed the result was the same:
int a = 0xC838;
int b = 0xFC;
int result = a + b;
System.out.printf("%04X\n", result); //prints C934 (incorrect)
To fix this I had to cast FCh into a byte after checking if the first bit is 1, which it is: 11111100
int a = 0xC838;
int b = (byte) 0xFC;
int result = a + b;
System.out.printf("%04X\n", result); //prints C834 (correct)
In short I guess my question is that I thought java would know that FCh is a negative number but that is not the case unless I cast it to a byte. Why? Sorry I know this question is asked many times and I seem to be asking it myself alot.

0xfc is a positive number. If you want a negative number, then write a negative number. -0x4 would do just fine.
But if you want to apply this to non-constant data, you'll need to tell Java that you want it sign-extended in some way.
The core of the problem is that you have a 32-bit signed integer, but you want it treated like an 8-bit signed integer. The easiest way to achieve that would be to just use byte as you did above.
If you really don't want to write byte, you can write (0xfc << 24) >> 24:
class Main
{
public static void main(String[] args)
{
int a = 0xC838;
int b = (0xfc << 24) >> 24;
int result = a + b;
System.out.printf("%04X\n", result);
}
}
(The 24 derives from the difference of the sizes of int (32 bits) and byte (8 bits)).

Splitting a byte into bits

I have a byte array i.e from byte[0] to byte[2]. I want to first split byte[0] into 8 bits. Then bytes[1] and finally bytes[2]. By splitting I mean that if byte[0]=12345678 then i want to split as variable A=8,Variable B=7, variable C=6,Variable D=5,........ Variable H=0. How to split the byte and store the bits into variables? I want to do this in JAVA

well the bitwise operations seem to be almost what you're talking about.
a byte is composed of 8 bits, and so it goes in the rage of 0 to 255.
written in binary that's 00000000 to 11111111.
Bitwise operations are those that basically use masks to get out as much info as possible out of a byte.
Say for instance you have your 1101 0011 byte (space added for visibility only) = 211 (decimal). You can split that into 2 "variables" b1 and b2 of half a byte each. they'll thus cover the range of 0 to 15.
How you do this is by defining some masks. A mask to get your first half-a-byte-value would be 0000 1111.
You take your value of 11010011 apply the bitwise and (& ) operator to it.
So saying byte b = 211; byte mask1=15; OR byte b=0x11010011; byte mask1=0x00001111;
you then have your variable byte b1=b & mask1;
Thus applying that operation would result in b1=00000011 = 3;
With a mask byte mask2=0x11110000; applying the same operation on b, you'd get byte b2=mask2 & b = 0x11010000;
Now of course that leaves the number b2 possibly too large for you. All you have to do if you want to grab the value 0x1101 is to right shift it. Thus b2>>=4;
You can have your masks in any form, but it's usual to have them in decimal as the powers of 2 (so that you can take any bit out of your byte) or decide what range you want on your variables and make the mask "larger" like 0x00000011, or ox00001100. Those 2 masks would respectively get you 2 values from a byte, each value ranging from 0 to 3, values that you can fit 4 inside a byte.
for more info, chek out the relevant wiki.
Sorry, the values were a little off (since byte seems to go from -128 to 127, but the idea is the same.
Second edit (never used bitwise operations myself lol)... the "0x" notation is for hexadecimals. So you'll actually have to calculate for yourself what 01001111 actually means... pretty sucky :| but it's gonna do the trick.

boolean a = (theByte & 0x1) != 0;
boolean b = (theByte & 0x2) != 0;
boolean c = (theByte & 0x4) != 0;
boolean d = (theByte & 0x8) != 0;
boolean e = (theByte & 0x10) != 0;
boolean f = (theByte & 0x20) != 0;
boolean g = (theByte & 0x40) != 0;
boolean h = (theByte & 0x80) != 0;

How to Convert Int to Unsigned Byte and Back

I need to convert a number into an unsigned byte. The number is always less than or equal to 255, and so it will fit in one byte.
I also need to convert that byte back into that number. How would I do that in Java? I've tried several ways and none work. Here's what I'm trying to do now:
int size = 5;
// Convert size int to binary
String sizeStr = Integer.toString(size);
byte binaryByte = Byte.valueOf(sizeStr);
and now to convert that byte back into the number:
Byte test = new Byte(binaryByte);
int msgSize = test.intValue();
Clearly, this does not work. For some reason, it always converts the number into 65. Any suggestions?

A byte is always signed in Java. You may get its unsigned value by binary-anding it with 0xFF, though:
int i = 234;
byte b = (byte) i;
System.out.println(b); // -22
int i2 = b & 0xFF;
System.out.println(i2); // 234

Java 8 provides Byte.toUnsignedInt to convert byte to int by unsigned conversion. In Oracle's JDK this is simply implemented as return ((int) x) & 0xff; because HotSpot already understands how to optimize this pattern, but it could be intrinsified on other VMs. More importantly, no prior knowledge is needed to understand what a call to toUnsignedInt(foo) does.
In total, Java 8 provides methods to convert byte and short to unsigned int and long, and int to unsigned long. A method to convert byte to unsigned short was deliberately omitted because the JVM only provides arithmetic on int and long anyway.
To convert an int back to a byte, just use a cast: (byte)someInt. The resulting narrowing primitive conversion will discard all but the last 8 bits.

If you just need to convert an expected 8-bit value from a signed int to an unsigned value, you can use simple bit shifting:
int signed = -119; // 11111111 11111111 11111111 10001001
/**
* Use unsigned right shift operator to drop unset bits in positions 8-31
*/
int psuedoUnsigned = (signed << 24) >>> 24; // 00000000 00000000 00000000 10001001 -> 137 base 10
/**
* Convert back to signed by using the sign-extension properties of the right shift operator
*/
int backToSigned = (psuedoUnsigned << 24) >> 24; // back to original bit pattern
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/op3.html
If using something other than int as the base type, you'll obviously need to adjust the shift amount: http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
Also, bear in mind that you can't use byte type, doing so will result in a signed value as mentioned by other answerers. The smallest primitive type you could use to represent an 8-bit unsigned value would be a short.

Except char, every other numerical data type in Java are signed.
As said in a previous answer, you can get the unsigned value by performing an and operation with 0xFF. In this answer, I'm going to explain how it happens.
int i = 234;
byte b = (byte) i;
System.out.println(b); // -22
int i2 = b & 0xFF;
// This is like casting b to int and perform and operation with 0xFF
System.out.println(i2); // 234
If your machine is 32-bit, then the int data type needs 32-bits to store values. byte needs only 8-bits.
The int variable i is represented in the memory as follows (as a 32-bit integer).
0{24}11101010
Then the byte variable b is represented as:
11101010
As bytes are signed, this value represent -22. (Search for 2's complement to learn more about how to represent negative integers in memory)
Then if you cast is to int it will still be -22 because casting preserves the sign of a number.
1{24}11101010
The the casted 32-bit value of b perform and operation with 0xFF.
1{24}11101010 & 0{24}11111111
=0{24}11101010
Then you get 234 as the answer.

The solution works fine (thanks!), but if you want to avoid casting and leave the low level work to the JDK, you can use a DataOutputStream to write your int's and a DataInputStream to read them back in. They are automatically treated as unsigned bytes then:
For converting int's to binary bytes;
ByteArrayOutputStream bos = new ByteArrayOutputStream();
DataOutputStream dos = new DataOutputStream(bos);
int val = 250;
dos.write(byteVal);
...
dos.flush();
Reading them back in:
// important to use a (non-Unicode!) encoding like US_ASCII or ISO-8859-1,
// i.e., one that uses one byte per character
ByteArrayInputStream bis = new ByteArrayInputStream(
bos.toString("ISO-8859-1").getBytes("ISO-8859-1"));
DataInputStream dis = new DataInputStream(bis);
int byteVal = dis.readUnsignedByte();
Esp. useful for handling binary data formats (e.g. flat message formats, etc.)

The Integer.toString(size) call converts into the char representation of your integer, i.e. the char '5'. The ASCII representation of that character is the value 65.
You need to parse the string back to an integer value first, e.g. by using Integer.parseInt, to get back the original int value.
As a bottom line, for a signed/unsigned conversion, it is best to leave String out of the picture and use bit manipulation as #JB suggests.

Even though it's too late, I'd like to give my input on this as it might clarify why the solution given by JB Nizet works. I stumbled upon this little problem working on a byte parser and to string conversion myself.
When you copy from a bigger size integral type to a smaller size integral type as this java doc says this happens:
https://docs.oracle.com/javase/specs/jls/se7/html/jls-5.html#jls-5.1.3
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
You can be sure that a byte is an integral type as this java doc says
https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
byte: The byte data type is an 8-bit signed two's complement integer.
So in the case of casting an integer(32 bits) to a byte(8 bits), you just copy the last (least significant 8 bits) of that integer to the given byte variable.
int a = 128;
byte b = (byte)a; // Last 8 bits gets copied
System.out.println(b); // -128
Second part of the story involves how Java unary and binary operators promote operands.
https://docs.oracle.com/javase/specs/jls/se7/html/jls-5.html#jls-5.6.2
Widening primitive conversion (§5.1.2) is applied to convert either or both operands as specified by the following rules:
If either operand is of type double, the other is converted to double.
Otherwise, if either operand is of type float, the other is converted to float.
Otherwise, if either operand is of type long, the other is converted to long.
Otherwise, both operands are converted to type int.
Rest assured, if you are working with integral type int and/or lower it'll be promoted to an int.
// byte b(0x80) gets promoted to int (0xFF80) by the & operator and then
// 0xFF80 & 0xFF (0xFF translates to 0x00FF) bitwise operation yields
// 0x0080
a = b & 0xFF;
System.out.println(a); // 128
I scratched my head around this too :). There is a good answer for this here by rgettman.
Bitwise operators in java only for integer and long?

If you want to use the primitive wrapper classes, this will work, but all java types are signed by default.
public static void main(String[] args) {
Integer i=5;
Byte b = Byte.valueOf(i+""); //converts i to String and calls Byte.valueOf()
System.out.println(b);
System.out.println(Integer.valueOf(b));
}

In terms of readability, I favor Guava's:
UnsignedBytes.checkedCast(long) to convert a signed number to an unsigned byte.
UnsignedBytes.toInt(byte) to convert an unsigned byte to a signed int.

Handling bytes and unsigned integers with BigInteger:
byte[] b = ... // your integer in big-endian
BigInteger ui = new BigInteger(b) // let BigInteger do the work
int i = ui.intValue() // unsigned value assigned to i

in java 7
public class Main {
public static void main(String[] args) {
byte b = -2;
int i = 0 ;
i = ( b & 0b1111_1111 ) ;
System.err.println(i);
}
}
result : 254

I have tested it and understood it.
In Java, the byte is signed, so 234 in one signed byte is -22, in binary, it is "11101010", signed bit has a "1", so with negative's presentation 2's complement, it becomes -22.
And operate with 0xFF, cast 234 to 2 byte signed(32 bit), keep all bit unchanged.
I use String to solve this:
int a = 14206;
byte[] b = String.valueOf(a).getBytes();
String c = new String(b);
System.out.println(Integer.valueOf(c));
and output is 14206.

Java - strange errors with unsigned values

There is 2-bytes array:
private byte[] mData;
and method:
public void setWord(final short pData) {
mData[0] = (byte) (pData >>> 8);
mData[1] = (byte) (pData);
}
I wrote the simple test:
public void testWord() {
Word word = new Word();
word.setWord((short) 0x3FFF);
Assert.assertEquals(0x3F, word.getByte(0));
Assert.assertEquals(0xFF, word.getByte(1));
}
The second assert fails with message "Expected 255, but was -1".
I know, that 0xFF signed short is, in fact, -1, but why JUnit thinks, that they are not equal? And, what is the correct way to implement such classes?

Java does not support unsigned types, so in order for a value to be 255, it must not be a signed byte, which is incapable of holding the value of 255. The 0xFF constant value will be taken as a signed int, and for the comparison, the byte value 0xFF will be converted to an int at -1 as well.
You need to type cast the literal 0xFF to be a byte. Change the assert to be Assert.assertEquals((byte)0xFF, word.getByte(1)); Then the left hand side will evaluate to -1 as well as the right.

The comment from biziclop is correct.
Any Integer number you specify in your code is considered an Integer unless marked otherwise.
Change your assertion to:
Assert.assertEquals((byte)0xFF, word.getByte(1))
And it should pass fine - as the first two bytes of the integer will be considered as a
byte.
Bitwize speeking - basically when you write 0xFF the compiler interprets it as 0x000000FF which is 255.
You want 0xFFFFFFFF which is -1.
Casting to byte is the correct solution here

There are no unsigned types in java.
0xFF is the int 255 and casted to byte overflows to -1.
I usually work with bytes as integers if I want them unsigned. I usually do that this way:
int b1 = getByte() & 0xFF;
For example:
byte byte1 = 0xFF; // 255 = -1
byte byte2 = 0xFE; // 254 = -2
int int1 = (byte1 & 0xFF) + (byte1 & 0xFF); // 255 + 254 = 509

How are integers cast to bytes in Java?

I know Java doesn't allow unsigned types, so I was wondering how it casts an integer to a byte. Say I have an integer a with a value of 255 and I cast the integer to a byte. Is the value represented in the byte 11111111? In other words, is the value treated more as a signed 8 bit integer, or does it just directly copy the last 8 bits of the integer?

This is called a narrowing primitive conversion. According to the spec:
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
So it's the second option you listed (directly copying the last 8 bits).
I am unsure from your question whether or not you are aware of how signed integral values are represented, so just to be safe I'll point out that the byte value 1111 1111 is equal to -1 in the two's complement system (which Java uses).

int i = 255;
byte b = (byte)i;
So the value of be in hex is 0xFF but the decimal value will be -1.
int i = 0xff00;
byte b = (byte)i;
The value of b now is 0x00. This shows that java takes the last byte of the integer. ie. the last 8 bits but this is signed.

or does it just directly copy the last
8 bits of the integer
yes, this is the way this casting works

The following fragment casts an int to a byte. If the integer’s value is larger than the range of a byte, it will be reduced modulo (the remainder of an integer division by the) byte’s range.
int a;
byte b;
// …
b = (byte) a;

Just a thought on what is said: Always mask your integer when converting to bytes with 0xFF (for ints). (Assuming myInt was assigned values from 0 to 255).
e.g.
char myByte = (char)(myInt & 0xFF);
why? if myInt is bigger than 255, just typecasting to byte returns a negative value (2's complement) which you don't want.

Byte is 8 bit. 8 bit can represent 256 numbers.(2 raise to 8=256)
Now first bit is used for sign. [if positive then first bit=0, if negative first bit= 1]
let's say you want to convert integer 1099 to byte. just devide 1099 by 256. remainder is your byte representation of int
examples
1099/256 => remainder= 75
-1099/256 =>remainder=-75
2049/256 => remainder= 1
reason why? look at this image http://i.stack.imgur.com/FYwqr.png

According to my understanding, you meant
Integer i=new Integer(2);
byte b=i; //will not work
final int i=2;
byte b=i; //fine
At last
Byte b=new Byte(2);
int a=b; //fine

for (int i=0; i <= 255; i++) {
byte b = (byte) i; // cast int values 0 to 255 to corresponding byte values
int neg = b; // neg will take on values 0..127, -128, -127, ..., -1
int pos = (int) (b & 0xFF); // pos will take on values 0..255
}
The conversion of a byte that contains a value bigger than 127 (i.e,. values 0x80 through 0xFF) to an int results in sign extension of the high-order bit of the byte value (i.e., bit 0x80). To remove the 'extra' one bits, use x & 0xFF; this forces bits higher than 0x80 (i.e., bits 0x100, 0x200, 0x400, ...) to zero but leaves the lower 8 bits as is.
You can also write these; they are all equivalent:
int pos = ((int) b) & 0xFF; // convert b to int first, then strip high bits
int pos = b & 0xFF; // done as int arithmetic -- the cast is not needed
Java automatically 'promotes' integer types whose size (in # of bits) is smaller than int to an int value when doing arithmetic. This is done to provide a more deterministic result (than say C, which is less constrained in its specification).
You may want to have a look at this question on casting a 'short'.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.