public void int2byte(){
int x = 128;
byte y = (byte) x;
System.out.println(Integer.toHexString(y));
}
I got result ffffff80, why not 80?
I got result 7f when x = 127.
Bytes are signed in Java. When you cast 128 to a byte, it becomes -128.
The int 128: 00000000 00000000 00000000 10000000
The byte -128: 10000000
Then, when you widen it back to an int with Integer.toHexString, because it's now negative, it gets sign-extended. This means that a bunch of 1 bits show up, explaining your extra f characters in the hex output.
You can convert it in an unsigned fashion using Byte.toUnsignedInt to prevent the sign extension.
Converts the argument to an int by an unsigned conversion. In an unsigned conversion to an int, the high-order 24 bits of the int are zero and the low-order 8 bits are equal to the bits of the byte argument.
System.out.println(Integer.toHexString(Byte.toUnsignedInt(y)));
An alternative is to bit-mask out the sign-extended bits by manually keeping only the 8 bits of the byte:
System.out.println(Integer.toHexString(y & 0xFF));
Either way, the output is:
80
This is because the positive value 128 cannot be represented in a signed byte.
The range is -128 to 127, so the cast produces -128.
When that byte is passed to Integer.toHexString, it is widened back to an int and sign-extended, so instead of 0x00000080 you get 0xffffff80.
Edit: As others have explained:
Integer 128 shares the last 8 bits with the byte (-128)
The difference is that Integer 128 has leading zeros.
0x00000080 (0000 0000 0000 0000 0000 0000 1000 0000)
while the byte -128 is just
0x80 (1000 0000)
When you convert byte -128 using Integer.toHexString() it takes the leading 1 and sign-extends it so you get
0xffffff80 (1111 1111 1111 1111 1111 1111 1000 0000)
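To see the effect side by side, here is a minimal sketch using only the standard library. The class and method names (SignExtensionDemo, signedHex, unsignedHex) are made up for illustration and are not from the question.

```java
final class SignExtensionDemo {
    // Narrow to byte, then let Integer.toHexString widen it back to int.
    // The widening sign-extends, so negative bytes gain leading f digits.
    static String signedHex(int x) {
        byte y = (byte) x;
        return Integer.toHexString(y);
    }

    // Same narrowing, but keep only the low 8 bits before formatting.
    static String unsignedHex(int x) {
        byte y = (byte) x;
        return Integer.toHexString(y & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(signedHex(127));   // 7f
        System.out.println(signedHex(128));   // ffffff80
        System.out.println(unsignedHex(128)); // 80
    }
}
```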
Pretty basic stuff, I am sure, but bits are not my forte.
For some internal calculation I am trying to convert a given input (the constraint is that it will always be an integer string) into its hex equivalent. What stumped me is how to get
Hex signed 2's complement:
My noob code:
private String toHex(String arg, boolean isAllInt) {
String hexVal = null;
log.info("arg {}, isAllInt {}", arg, isAllInt);
if (isAllInt) {
int intVal = Integer.parseInt(arg);
hexVal = Integer.toHexString(intVal);
// some magic to convert this hexVal to its 2's complement
} else {
hexVal = String.format("%040x", new BigInteger(1, arg.getBytes(StandardCharsets.UTF_8)));
}
log.info("str {} hex {}", arg, hexVal);
return hexVal;
}
Input: 00001
Output: 1
Expected Output: 0001
Input: 00216
Output: D8
Expected Output: 00D8
Input: 1192633166
Output: 4716234E
Expected Output: 4716234E
Any predefined library or other useful pointers would be very welcome!
So to pad the hex digits up to either 4 digits or 8 digits, do:
int intVal = Integer.parseInt(arg);
if (intVal >= 0 && intVal <= 0xffff) {
hexVal = String.format("%04x", intVal);
} else {
hexVal = String.format("%08x", intVal);
}
See Java documentation on how the format strings work.
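For reference, the padding logic above can be wrapped into a small helper and checked against the inputs from the question. This is only a sketch: HexPadDemo and toPaddedHex are hypothetical names, and it uses %X rather than %x on the assumption that the upper-case digits shown in the expected output are wanted.

```java
final class HexPadDemo {
    // Pads to 4 hex digits when the value fits in 16 bits, otherwise to 8.
    // Negative values are formatted as their 32-bit two's complement pattern.
    static String toPaddedHex(String arg) {
        int intVal = Integer.parseInt(arg); // leading zeros such as "00216" are accepted
        if (intVal >= 0 && intVal <= 0xffff) {
            return String.format("%04X", intVal);
        }
        return String.format("%08X", intVal);
    }

    public static void main(String[] args) {
        System.out.println(toPaddedHex("00001"));      // 0001
        System.out.println(toPaddedHex("00216"));      // 00D8
        System.out.println(toPaddedHex("1192633166")); // 4716234E
    }
}
```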
Answering the two's complement aspect.
Two's Complement Representation
Two's complement is a convention for representing signed integral numbers in a fixed number of bits, e.g. 16 (in the past, various processors used other representations, such as one's complement or sign-magnitude).
Positive numbers and zero are represented as expected:
0 is 0000 0000 0000 0000 or hex 0000
1 is 0000 0000 0000 0001 or hex 0001
2 is 0000 0000 0000 0010 or hex 0002
3 is 0000 0000 0000 0011 or hex 0003
4 is 0000 0000 0000 0100 or hex 0004
Negative numbers are represented by adding 1 0000 0000 0000 0000 to them, giving:
-1 is 1111 1111 1111 1111 or hex ffff
-2 is 1111 1111 1111 1110 or hex fffe
-3 is 1111 1111 1111 1101 or hex fffd
This is equivalent to: take the positive representation, flip all bits, and add 1.
For negative numbers, the highest bit is always 1. And that's how the machine distinguishes positive and negative numbers.
All processors in use today do their integer arithmetic based on two's complement representation, so there's typically no need to do special tricks. All the Java datatypes like byte, short, int, and long are defined to be signed numbers in two's complement representation.
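The two descriptions of the negative representation ("add 1 0000 0000 0000 0000" and "flip all bits and add 1") can be checked against each other with a short sketch. The 16-bit width is taken from the examples above; the class and method names are invented for illustration.

```java
final class TwosComplement16 {
    // "Take the positive representation, flip all bits, and add 1",
    // keeping only the low 16 bits of the result.
    static int flipAndAddOne(int positive) {
        return (~positive + 1) & 0xFFFF;
    }

    // "Add 1 0000 0000 0000 0000" (that is, 2^16) to a negative number.
    static int addTwoToThe16(int negative) {
        return negative + 0x10000;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 3; n++) {
            // Both columns show the same 16-bit pattern: ffff, fffe, fffd
            System.out.println(String.format("-%d is %04x / %04x",
                    n, flipAndAddOne(n), addTwoToThe16(-n)));
        }
        // Java's short uses the same representation, so the cast recovers -3.
        System.out.println((short) 0xFFFD); // -3
    }
}
```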
In a comment you wrote
2's compliment is hex of negative of original value
That mixes up the concepts a bit. Two's complement is basically defined on bit patterns, and groups of 4 bits from these bit patterns can nicely be written as hex digits. Two's complement is about representing negative values as bit patterns, but from your question and comments I read that you don't expect negative values, so two's complement shouldn't concern you.
Hex Strings
To represent signed values as hex strings, Java (and most other languages / environments) simply looks at the bit patterns, ignoring their positive / negative interpretation, meaning that e.g. -30 (1111 1111 1110 0010) does not get shown as "-1e" with a minus sign, but as "ffe2".
Because of this, negative values will always get translated to a string with maximum length according to the value's size (16 bits, 32 bits, 64 bits giving 4, 8, or 16 hex digits), because the highest bit will be 1, resulting in a leading hex digit surely not being zero. So for negative values, there's no need to do any padding.
Small positive values will have leading zeros in their hex representation, and Java's toHexString() method suppresses them, so 1 (0000 0000 0000 0001) becomes "1" and not "0001". That's why e.g. format("%04x", ...), as in @nos's answer, is useful.
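A rough illustration of the widths mentioned above, using -30 as in the example. The helper names are hypothetical; Short.toUnsignedInt (Java 8+) is used here only to keep the 16-bit case from being sign-extended to 32 bits.

```java
final class HexWidthDemo {
    // 16-bit pattern: widen without sign extension, then format.
    static String hex16(short v) {
        return Integer.toHexString(Short.toUnsignedInt(v));
    }

    // 32-bit pattern.
    static String hex32(int v) {
        return Integer.toHexString(v);
    }

    // 64-bit pattern.
    static String hex64(long v) {
        return Long.toHexString(v);
    }

    public static void main(String[] args) {
        System.out.println(hex16((short) -30));       // ffe2
        System.out.println(hex32(-30));               // ffffffe2
        System.out.println(hex64(-30L).length());     // 16
        System.out.println(hex32(1));                 // 1 (leading zeros suppressed)
        System.out.println(String.format("%04x", 1)); // 0001
    }
}
```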
Decimal Binary
x1 = 105 0110 1001
x2 = -38 1101 1010
1. (byte) (x>>2)
2. (byte) (x>>>26)
I understand the first shift moves the bits two places to the right and fills the vacated bits with 1s, so the shift results in:
1111 0110
but I have no idea why the second shifts results in:
0011 1111 or 63.
My understanding is that the x >> adds 1 if x is negative and adds a 0 if x is positive. The >>> adds a 0 regardless of the sign. So if that is the case wouldn't the result of x2 >>> 26 be 0000 0000?
The reason for the "strange" bit shift result is because the values are widened to 32 bit (int) before the shift.
I. e. -38 isn't 1101 1010 here, but 1111 1111 1111 1111 1111 1111 1101 1010.
Which should make it clear why -38 >>> 26 is 0000 0000 0000 0000 0000 0000 0011 1111 (or 63).
The widening is described in the Java Language Specification:
Otherwise, if the operand is of compile-time type byte, short, or char, it is promoted to a value of type int by a widening primitive conversion (§5.1.2).
If you want to perform bit shift operations on an 8 bit (byte) value, you could mask the value to use only the lower 8 bits, after widening but before shifting, like Federico suggests:
byte x = -38;
int shifted = (x & 0xFF) >>> 26;
This would give the expected value of 0 (though I'm not sure if it makes sense, as any 8 bit value will be 0 if you right shift by 8 or more).
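The shifts from the question can be reproduced with a small sketch like the following. The class and method names are made up; the values 105 and -38 are the ones from the question.

```java
final class ShiftPromotionDemo {
    // x is promoted to int, shifted with sign fill, then narrowed back to byte.
    static byte arithmeticShift(byte x, int n) {
        return (byte) (x >> n);
    }

    // x is promoted to a 32-bit int first, so a negative byte brings 24 extra 1 bits.
    static int logicalShiftPromoted(byte x, int n) {
        return x >>> n;
    }

    // Masking after promotion leaves only the 8 bits of the byte before shifting.
    static int logicalShiftMasked(byte x, int n) {
        return (x & 0xFF) >>> n;
    }

    public static void main(String[] args) {
        byte x2 = -38;
        System.out.println(arithmeticShift(x2, 2));       // -10 (1111 0110)
        System.out.println(logicalShiftPromoted(x2, 26)); // 63  (0011 1111)
        System.out.println(logicalShiftMasked(x2, 26));   // 0
        System.out.println(logicalShiftMasked(x2, 2));    // 54  (0011 0110)
    }
}
```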
I'm parsing unsigned bits from a DatagramSocket. I have a total of 24bits (or 3 bytes) coming in - they are: 1 unsigned 8bit integer followed by a 16bit signed integer. But java never stores anything more than a signed byte into a byte/byte array? When java takes in these values, do you lose that last 8th bit?
int port = 666;
DatagramSocket serverSocket = new DatagramSocket(port);
byte[] receiveData = new byte[3]; // <-- Now at this moment I lost my 8th bit
System.out.println("Binary Server Listening on Port: " + port);
while (true)
{
DatagramPacket receivePacket = new DatagramPacket(receiveData, receiveData.length);
serverSocket.receive(receivePacket);
byte[] bArray = receivePacket.getData();
byte b = bArray[0];
}
Did I now lose this 8th bit since I turned it into a byte? Was it wrong I initialized a byte array of 3 bytes?
When java takes in these values, do you lose that last 8th bit?
No. You just end up with a negative value when it's set.
So to get a value between 0 and 255, it's simplest to use something like this:
int b = bArray[0] & 0xff;
First the byte is promoted to an int, which will sign extend it, leading to 25 leading 1 bits if the high bit is 1 in the original value. The & 0xff then gets rid of the first 24 bits again :)
No, you do not lose the 8th bit. But unfortunately, Java has two "features" which make it harder than reasonable to deal with such values:
all of its primitive integer types (other than char) are signed;
when widening a primitive type to another primitive type with a greater size (for instance, reading a byte into an int, as is the case here), the sign bit of the smaller type is extended.
Which means that, for instance, if you read byte 0x80, which translates in binary as:
1000 0000
when you read it as an integer, you get:
1111 1111 1111 1111 1111 1111 1000 0000
^
This freaking bit gets expanded!
whereas you really wanted:
0000 0000 0000 0000 0000 0000 1000 0000
ie, integer value 128. You therefore MUST mask it:
int b = array[0] & 0xff;
1111 1111 1111 1111 1111 1111 1000 0000 <-- byte read as an int, your original value of b
0000 0000 0000 0000 0000 0000 1111 1111 <-- mask (0xff)
--------------------------------------- <-- anded, give
0000 0000 0000 0000 0000 0000 1000 0000 <-- expected result
Sad, but true.
More generally: if you wish to manipulate a lot of byte-oriented data, I suggest you have a look at ByteBuffer, it can help a lot. But unfortunately, this won't save you from bitmask manipulations; it just makes it easier to read a given number of bytes at a time (as primitive types).
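As a sketch of how the 3-byte payload from the question might be read with ByteBuffer: the layout (one unsigned 8-bit value followed by one signed 16-bit value) is from the question, but the big-endian byte order is an assumption, since the question does not say how the sender encodes the 16-bit field. PacketParseDemo and parse are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PacketParseDemo {
    // Returns { unsigned 8-bit value, signed 16-bit value }.
    // ASSUMPTION: the 16-bit field is big-endian (network byte order).
    static int[] parse(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN);
        int first = buf.get() & 0xFF;   // mask to undo the sign extension
        short second = buf.getShort();  // already signed, no mask needed
        return new int[] { first, second };
    }

    public static void main(String[] args) {
        byte[] packet = { (byte) 0x80, (byte) 0xFF, (byte) 0xFE };
        int[] fields = parse(packet);
        System.out.println(fields[0]); // 128
        System.out.println(fields[1]); // -2
    }
}
```

No bit is lost in the byte array itself; the mask only matters at the point where the first byte is turned into a number.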
In Java, byte (as well as short, int and long) is a signed numeric data type only. However, this does not imply any loss of data when treating values as unsigned binary data. As your illustration shows, 10000000 is -128 as a signed decimal number. If you are dealing with binary data, just treat it in its binary form and you will be fine.
Bytes in Java are signed by default. I see on other posts that a workaround to have unsigned bytes is something similar to that: int num = (int) bite & 0xFF
Could someone please explain to me why this works and converts a signed byte to an unsigned byte and then its respective integer? ANDing a byte with 11111111 results in the same byte - right?
A typecast has a higher precedence than the & operator. Therefore you're first casting to an int, then ANDing in order to mask out all the high-order bits that are set, including the "sign bit" of the two's complement notation which java uses, leaving you with just the positive value of the original byte. E.g.:
let byte x = 11111111 = -1
then (int) x = 11111111 11111111 11111111 11111111
and x & 0xFF = 00000000 00000000 00000000 11111111 = 255
and you've effectively removed the sign from the original byte.
ANDing a byte with 11111111 results in the same byte - right?
Except you're ANDing with 00000000000000000000000011111111, because 0xFF is an int literal - there are no byte literals in Java. So what happens is that the byte is promoted to int (the typecast is unnecessary) and has its sign extended (i.e. it keeps the possibly negative value of the byte), but then the sign extension is reverted by ANDing it with all those zeroes. The result is an int that has as its least significant bits exactly the former byte and thus the value the byte would have had were it unsigned.
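A small sketch of the points above: the explicit cast changes nothing, and narrowing the masked result back to a byte gives the original signed value again. The names are invented for illustration.

```java
final class MaskPromotionDemo {
    // Parsed as ((int) b) & 0xFF because the cast binds tighter than &.
    static int withCast(byte b) {
        return (int) b & 0xFF;
    }

    // b is promoted to int implicitly, so the result is identical.
    static int withoutCast(byte b) {
        return b & 0xFF;
    }

    // Narrowing back to byte keeps the same 8 bits, read as signed again.
    static byte narrowedBack(byte b) {
        return (byte) (b & 0xFF);
    }

    public static void main(String[] args) {
        byte x = -1;
        System.out.println(withCast(x));     // 255
        System.out.println(withoutCast(x));  // 255
        System.out.println(narrowedBack(x)); // -1
    }
}
```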
In Java 8, such a method was added to the Byte class:
/**
* Converts the argument to an {@code int} by an unsigned
* conversion. In an unsigned conversion to an {@code int}, the
* high-order 24 bits of the {@code int} are zero and the
* low-order 8 bits are equal to the bits of the {@code byte} argument.
*
* Consequently, zero and positive {@code byte} values are mapped
* to a numerically equal {@code int} value and negative {@code
* byte} values are mapped to an {@code int} value equal to the
* input plus 2<sup>8</sup>.
*
* @param x the value to convert to an unsigned {@code int}
* @return the argument converted to {@code int} by an unsigned
* conversion
* @since 1.8
*/
public static int toUnsignedInt(byte x) {
return ((int) x) & 0xff;
}
As you can see, the result is an int, not a byte.
How it works: say we have byte b = -128;, which is represented as 1000 0000. What happens when you execute your line? Let's use a temporary int for this:
int i1 = (int) b;
i1 is now -128, which is actually represented in binary like this:
1111 1111 1111 1111 1111 1111 1000 0000
So what does i1 & 0xFF look like in binary?
1111 1111 1111 1111 1111 1111 1000 0000
&
0000 0000 0000 0000 0000 0000 1111 1111
which results in
0000 0000 0000 0000 0000 0000 1000 0000
and this is exactly 128, meaning your signed value converted to unsigned.
Edit
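Converting a byte in -128 .. 127 into 0 .. 255 by shifting the range (note that this is an offset, so -128 becomes 0; it is not the same as the & 0xFF reinterpretation, where -128 becomes 128):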
int unsignedByte = 128 + yourByte;
You cannot represent the values 128 to 255 by using a byte; you must use something else, like an int or a short.
Yes, but this way you can be sure you will never get a number >255 or <0.
If the first bit is 1, the number is negative. When you convert a byte to an int, it is prepended with 1 bits if it is negative and with 0 bits if it is positive. ANDing with 0xFF clears all the bits to the left of the low 8. This in effect adds 256 to negative bytes.
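The claim that masking "adds 256 to negative bytes" is easy to verify exhaustively, since there are only 256 byte values. A sketch, with invented names, that also compares against Byte.toUnsignedInt:

```java
final class UnsignedByteCheck {
    static int viaMask(byte b) {
        return b & 0xFF;
    }

    // Adds 2^8 to negative values, leaves the others alone.
    static int viaAddition(byte b) {
        return b < 0 ? b + 256 : b;
    }

    // Checks every byte value against both helpers and Byte.toUnsignedInt.
    static boolean allAgree() {
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            byte b = (byte) i;
            if (viaMask(b) != viaAddition(b) || viaMask(b) != Byte.toUnsignedInt(b)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allAgree());           // true
        System.out.println(viaMask((byte) -128)); // 128
    }
}
```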
Are hexadecimal numbers ever negative? If yes then how?
For binary you would have signed and unsigned.
How would one represent them in Hex? I need this for a hex routine I am about to embark upon.
Yes. For example you'd have the following representations in signed 32-bit binary and hex:
Decimal: 1
Binary: 00000000 00000000 00000000 00000001
Hex: 00 00 00 01
Decimal: -1
Binary: 11111111 11111111 11111111 11111111
Hex: FF FF FF FF
Decimal: -2
Binary: 11111111 11111111 11111111 11111110
Hex: FF FF FF FE
As you can see, the Hex representation of negative numbers is directly related to the binary representation.
The high bit of a number determines if it is negative. For instance, an int is 32 bits long, so if bit 31 is 1 the value is negative. How you display that value, be it hexadecimal or decimal, doesn't matter. So hex values like
0x80000000
0x91345232
0xA3432032
0xBFF32042
0xC33252DD
0xE772341F
0xFFFFFFFF
are all negative, because the top bit is set to 1
|
v
0x8 -> 1000
0x9 -> 1001
0xA -> 1010
0xB -> 1011
0xC -> 1100
0xD -> 1101
0xE -> 1110
0xF -> 1111
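In Java this can be checked directly, under the assumption that the hex string is meant as a raw 32-bit pattern. Integer.parseUnsignedInt (Java 8+) accepts the full eight digits, and the helper name isNegativeAsInt is made up for this sketch.

```java
final class HighBitDemo {
    // Parses up to 8 hex digits as a raw 32-bit pattern and reports
    // whether bit 31 is set, i.e. whether the value is negative as an int.
    static boolean isNegativeAsInt(String hex) {
        int bits = Integer.parseUnsignedInt(hex, 16);
        return (bits & 0x80000000) != 0; // equivalent to bits < 0
    }

    public static void main(String[] args) {
        System.out.println(isNegativeAsInt("80000000")); // true
        System.out.println(isNegativeAsInt("C33252DD")); // true
        System.out.println(isNegativeAsInt("7FFFFFFF")); // false
    }
}
```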
Yes they can be. It's the same as binary as to how you interpret it (signed vs unsigned).
You would take the binary signed and unsigned forms then represent them in hex as you would any binary number.
On one hand, why not - it's just a positional numeric system, like decimal.
On the other hand, they normally use hex notation to deduce the underlying bit pattern - and that is much more straightforward if the number is interpreted as unsigned.
So the answer is - it's possible, mathematically correct and internally consistent, but defeats the most common purpose of hex notation.
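Java happens to support both readings, which makes for a quick comparison. In this sketch (class and method names are hypothetical), Integer.toHexString shows the bit pattern, while Integer.toString with radix 16 writes hex as a positional system with a minus sign.

```java
final class SignedHexDemo {
    // The underlying 32-bit pattern, read as unsigned.
    static String bitPattern(int v) {
        return Integer.toHexString(v);
    }

    // Hex as a positional numeric system, with a minus sign for negatives.
    static String signedNotation(int v) {
        return Integer.toString(v, 16);
    }

    public static void main(String[] args) {
        System.out.println(bitPattern(-2));      // fffffffe
        System.out.println(signedNotation(-2));  // -2
        System.out.println(signedNotation(-30)); // -1e
        // Both strings parse back to the same int, each with its matching parser.
        System.out.println(Integer.parseInt("-2", 16));               // -2
        System.out.println(Integer.parseUnsignedInt("fffffffe", 16)); // -2
    }
}
```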
In Java, these are the bounds of the Integer data type:
Integer.MIN_VALUE = 0x80000000;
Integer.MAX_VALUE = 0x7fffffff;