Java not treating 0xFC as a signed integer?

Java not treating 0xFC as a signed integer? - java

Just when I though I had a fair grasp on how Java treats all Integers/Bytes etc.. as signed numbers, it hit me with another curve ball and got me thinking if I really understand this treatment after all.
This is a piece of assembly code that is supposed to jump to an address if a condition is met (which indeed is met). PC just before the jump is C838h and then after condition check it is supposed to be: C838h + FCh (h = hex) which I thought would be treated as signed so the PC would jump backwards: FCh = -4 in two's compliment negative number. But To my surprise, java ADDED FCh to the PC making it incorrectly jump to C934h instead of back to C834h.
C832: LD B,04 06 0004 (PC=C834)
C834: LD (HL), A 77 9800 (PC=C835)
C835: INC L:00 2C 0001 (PC=C836)
C836: JR NZ, n 20 00FC (PC=C934)
I tested this in a java code and indeed the result was the same:
int a = 0xC838;
int b = 0xFC;
int result = a + b;
System.out.printf("%04X\n", result); //prints C934 (incorrect)
To fix this I had to cast FCh into a byte after checking if the first bit is 1, which it is: 11111100
int a = 0xC838;
int b = (byte) 0xFC;
int result = a + b;
System.out.printf("%04X\n", result); //prints C834 (correct)
In short I guess my question is that I thought java would know that FCh is a negative number but that is not the case unless I cast it to a byte. Why? Sorry I know this question is asked many times and I seem to be asking it myself alot.

0xfc is a positive number. If you want a negative number, then write a negative number. -0x4 would do just fine.
But if you want to apply this to non-constant data, you'll need to tell Java that you want it sign-extended in some way.
The core of the problem is that you have a 32-bit signed integer, but you want it treated like an 8-bit signed integer. The easiest way to achieve that would be to just use byte as you did above.
If you really don't want to write byte, you can write (0xfc << 24) >> 24:
class Main
{
public static void main(String[] args)
{
int a = 0xC838;
int b = (0xfc << 24) >> 24;
int result = a + b;
System.out.printf("%04X\n", result);
}
}
(The 24 derives from the difference of the sizes of int (32 bits) and byte (8 bits)).

Related

Why am I not able to mask 32 bits on a long data type in Java

I cannot figure out why this works. I am attempting to mask the least significant 32 bits of java on a long but it does not properly AND the 33rd and 34th bit and further. Here is my example
class Main {
public static void main(String[] args) {
long someVal = 17592096894893l; //hex 0xFFFFAAFAFAD
long mask = 0xFF; //binary
long result = mask & someVal;
System.out.println("Example 1 this works on one byte");
System.out.printf("\n%x %s", someVal, Long.toBinaryString(someVal) );
System.out.printf("\n%x %s", result, Long.toBinaryString(result) );
long someVal2 = 17592096894893l; //hex 0xFFFFAAFAFAD
mask = 0xFFFFFFFF; //binary
result = mask & someVal2;
System.out.println("\nExample 2 - this does not work");
System.out.printf("\n%x %s", someVal2, Long.toBinaryString(someVal2) );
System.out.printf("\n%x %s", result, Long.toBinaryString(result) );
}
}
I was expecting the results to drop the most significant byte to be a zero since the AND operation did it on 32 bits. Here is the output I get.
Example 1 - this works
ffffaafafad 11111111111111111010101011111010111110101101
ad 10101101
Example 2 - this does not work
ffffaafafad 11111111111111111010101011111010111110101101
ffffaafafad 11111111111111111010101011111010111110101101
I would like to be able to mask the first least significant 4 bytes of the long value.

I believe what you’re seeing here is the fact that Java converts integers to longs using sign extension.
For starters, what should this code do?
int myInt = -1;
long myLong = myInt;
System.out.println(myLong);
This should intuitively print out -1, and that’s indeed what happens. I mean, it would be kinda weird if in converting an int to a long, we didn’t get the same number we started with.
Now, let’s take this code:
int myInt = 0xFFFFFFFF;
long myLong = myInt;
System.out.println(myLong);
What does this print? Well, 0xFFFFFFFF is the hexadecimal version of the signed 32-bit number -1. That means that this code is completely equivalent to the above code, so it should (and does) print the same value, -1.
But the value -1, encoded as a long, doesn’t have representation 0x00000000FFFFFFFF. That would be 232 - 1, not -1. Rather, since it’s 64 bits long, -1 is represented as 0xFFFFFFFFFFFFFFFFF. Oops - all the upper bits just got activated! That makes it not very effective as a bitmask.
The rule in Java is that if you convert an int to a long, if the very first bit of the int is 1, then all 32 upper bits of the long will get set to 1 as well. That’s in place so that converting an integer to a long preserves the numeric value.
If you want to make a bitmask that’s actually 64 bits long, initialize it with a long literal rather than an int literal:
mask = 0xFFFFFFFFL; // note the L
Why does this make a difference? Without the L, Java treats the code as
Create the integer value 0xFFFFFFFF = -1, giving 32 one bits.
Convert that integer value into a long. To do so, use sign extension to convert it to the long value -1, giving 64 one bits in a row.
However, if you include the L, Java interprets things like this:
Create the long value 0xFFFFFFFF = 232 - 1, which is 32 zero bits followed by 32 one bits.
Assign that value to mask.
Hope this helps!

What is the purpose of low and high nibble when converting a string to a HexString

Recently I have been going through some examples of MD5 to start getting an understanding of security and MD5 has been fairly simple to understand for the most part and a good starting point even though it is no longer secure. Despite this I have a question regarding high and lo nibbles when it comes to converting a string to a hex string.
So I know high and low nibbles are equal to half a byte or also can be hex digits that represent a single hexadecimal digit. What I am not understanding though is exactly how they work and the purpose that they serve. I have been searching on google and I can't find much of an answer that will help explain what they do in the context that they are in. Here is the context of the conversion:
private static String toHexString( byte[] byteArray )
{
final String HEX_CHARS = "0123456789ABCDEF";
byte[] result = new byte[byteArray.length << 1];
int len = byteArray.length;
for( int i = 0 ; i < len ; i++ )
{
byte b = byteArray[i]
int lo4 = b & 0x0F;
int hi4 = ( b & 0xF0 ) >> 4;
result[i * 2] = (byte)HEX_CHARS.charAt( hi4 );
result[i * 2 + 1] = (byte)HEX_CHARS.charAt( lo4 );
}
return new String( result );
}
I don't exactly understand what is going on in the for statement. I would appreciate any help understanding this and if there is some link to some places that I can learn more about this please also leave it.
I understand the base definition of nibble but not the operations and what the assignment to the number 4 is doing either.
If I need to post the full example code I will just ask as I am unsure if it is needed.

This code simply converts a byte array to hexadecimal representation. Within the for-loop, each byte is converted into two characters. I think it's easier to understand it on an example.
Assume one of the bytes in your array is, say, 218 (unsigned). That's 1101 1010 in binary.
lo4 gets the lowest 4 bits by AND-ing the byte with the bitmask 00001111:
int lo4 = b & 0x0F;
This results in 1010, 10 in decimal.
hi4 gets the highest 4 bits by AND-ing with the bitmask 1111 0000 and shifting 4 bits to the right:
int hi4 = ( b & 0xF0 ) >> 4;
This results in 1101, 13 in decimal.
Now to get the hexadecimal representation of this byte you only need to convert 10 and 13 to their hexadecimal representations and concatenate. For this you simply look up the character in the prepared HEX_CHARS string at the specific index. 10 -> A, 13 -> D, resulting in 218 -> DA.

It's just bit operations. The & character takes the literal bit value of each and does a logical and on them.
int lo4 = b & 0x0F;
for instance if b = 24 then it will evaluate to this
00011000
+00001111
=00001000
The second such line does the same on the first four bits.
00011000
+11110000
=00010000
the '>>' shifts all of the bits a certain number in that direction so
00010000 >> 4 = 00000001.
This is done so that you can derive the hex value from the number. Since each character in hex can represent 4 bits by splitting the number into pieces of 4 bits we can convert it.
in the case of b = 24 we no have lo4 = 1000 or 8 and hi4 = 0001 or 1. The last part of the loop assigns the character value for each.
Hex_chars[hi4] = '1' and Hex_chars[lo4] = '8' which gives you "18" for that part of the string which is 24 in hex.

Java bit unsigned shifting (>>>) give strange result [duplicate]

This question already has answers here:
java bit manipulation
(5 answers)
Closed 8 years ago.
I have this code:
int i = 255;
byte b = (byte) i;
int c;
System.out.println(Integer.toBinaryString( i));
System.out.println("b = " + b); // b = -1
c=b>>>1;
System.out.println(Integer.toBinaryString( c));
System.out.println(c);
But I can't understand how it works. I think that unsigned shifting to 255(11111111) should give me 127(0111111) but it doesn't. Is my assumption wrong?

Shift operators including >>> operate on ints. The value of b, which is -1 because byte is signed, is promoted to int before the shift. That is why you see the results that you see.
The reason why 255 is re-interpreted as -1 is that 255 has all its eight bits set to one. When you assign that to a signed 8-bit type of byte, it is interpreted as -1 following two's complement rules.

this is how you can get the expected result
c = (0xFF & b) >>> 1;
see dasblinkenlight's answer for details

Try this, and you will understand:
System.out.println(Integer.toBinaryString(i)); // 11111111
System.out.println(Integer.toBinaryString(b)); // 11111111111111111111111111111111
System.out.println(Integer.toBinaryString(c)); // 1111111111111111111111111111111
Variable int i equals 255, so the first print makes sense.
Variable byte b equals -1, because you store 255 in a single byte.
But when you call Integer.toBinaryString(b), the compiler converts b from byte to int, and (int)-1 == FFFFFFFFh == 11111111111111111111111111111111b, hence the second print.
Variable int c equals b>>>1, so the third print makes sense.

How to store unsigned short in java?

So generally I'm using Netty and it's BigEndianHeapChannelBuffer to receive unsinged short (from c++ software) in Java.
When I do it like this:
buf.readUnsignedByte(); buf.readUnsignedByte();
it returns:
149 and 00. Till now everything is fine. Because server sent 149 as unsigned short [2 bytes].
Instead of this I would like to receive unsigned short (ofc after restarting my application):
buf.readUnsignedShort();
and magic happens. It returns: 38144.
Next step is to retrieve unsigned byte:
short type = buf.readUnsignedByte();
System.out.println(type);
and it returns: 1 which is correct output.
Could anyone help me with this?
I looked deeper and this is what netty does with it:
public short readShort() {
checkReadableBytes(2);
short v = getShort(readerIndex);
readerIndex += 2;
return v;
}
public int readUnsignedShort() {
return readShort() & 0xFFFF;
}
But still I can't figure what is wrong. I would like just be able to read that 149.

You could also borrow a page from the Java DataInputStream.readUnsignedShort() implementation:
public final int readUnsignedShort() throws IOException {
int ch1 = in.read();
int ch2 = in.read();
if ((ch1 | ch2) < 0)
throw new EOFException();
return (ch1 << 8) + (ch2 << 0);
}

The answer is to change endianness. Thanks to Roger who in comments wrote:
Not much magic, 149*256+0=38144. You have specified BigEndian so this seems correct, the most significant byte is sent first
and:
#mickula The short is two bytes, where one of the bytes is "worth" 256 times as much as the other, Since you use BigEndian the first byte is the one "worth" more. Similar to the decimal number 123 of the digits 1, 2, and 3. The first position is with 10 times the next position which is worth 10 times the next, and so on. So 1 * 100 + 2 * 10 + 3 = 123 when the digits are transferred one at a time. If you see 1, 2, and 3 as little endian you would use 1 + 2 * 10 + 3 * 100 = 321. The 256 is because the size of a byte is 256. –
Thanks to his comments I just switched endianess in server bootstrap by adding:
bootstrap.setOption("child.bufferFactory", new
HeapChannelBufferFactory(ByteOrder.LITTLE_ENDIAN));

Could someone explain to me what the following Java code is doing?

byte s[] = getByteArray()
for(.....)
Integer.toHexString((0x000000ff & s[i]) | 0xffffff00).substring(6);
I understand that you are trying to convert the byte into hex string. What I don't understand is how that is done. For instance if s[i] was 00000001 (decimal 1) than could you please explain:
Why 0x000000ff & 00000001 ? Why not directly use 00000001?
Why result from #1 | 0xffffff00?
Finally why substring(6) is applied?
Thanks.

It's basically because bytes are signed in Java. If you promote a byte to an int, it will sign extend, meaning that the byte 0xf2 will become 0xfffffff2. Sign extension is a method to keep the value the same when widening it, by copying the most significant (sign) bit into all the higher-order bits. Both those values above are -14 in two's complement notation. If instead you had widened 0xf2 to 0x000000f2, it would be 242, probably not what you want.
So the & operation is to strip off any of those extended bits, leaving only the least significant 8 bits. However, since you're going to be forcing those bits to 1 in the next step anyway, this step seems a bit of a waste.
The | operation following that will force all those upper bits to be 1 so that you're guaranteed to get an 8-character string from ffffff00 through ffffffff inclusive (since toHexString doesn't give you leading zeroes, it would translate 7 into "7" rather than the "07" that you want).
The substring(6) is then applied so that you only get the last two of those eight hex digits.
It seems a very convoluted way of ensuring you get a two-character hex string to me when you can just use String.format ("%02x", s[i]). However, it's possible that this particular snippet of code may predate Java 5 when String.format was introduced.
If you run the following program:
public class testprog {
public static void compare (String s1, String s2) {
if (!s1.equals(s2))
System.out.println ("Different: " + s1 + " " + s2);
}
public static void main(String args[]) {
byte b = -128;
while (b < 127) {
compare (
Integer.toHexString((0x000000ff & b) | 0xffffff00).substring(6),
String.format("%02x", b, args));
b++;
}
compare (
Integer.toHexString((0x000000ff & b) | 0xffffff00).substring(6),
String.format("%02x", b, args));
System.out.println ("Done");
}
}
you'll see that the two expressions are identical - it just spits out Done since the two expressions produce the same result in all cases.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.