Ok, so I have searched and searched and nothing worked...
I have an array of int in which each value occupies only the low-order byte. For instance, I have
data[0] = 0x52
data[1] = 0xe4
data[2] = 0x18
data[3] = 0xcb
I want standard output to contain exactly those bytes (in other words, if I write them to a file and examine the file with a hex editor, I should see):
52e418cb
How can I do that?
Thank you for your help
The correct way to do this is to shift each byte to its desired position and then combine the results with the OR operator. You should also mask the lower 8 bits of each byte before shifting it. This is needed because a byte is first converted to an int before the shift is done. That is usually harmless, but when the highest bit is 1 (i.e. the byte is negative), the int becomes negative as well, which sets all the leading bits to 1.
So:
(byte) 10000000 = (int) 11111111 11111111 11111111 10000000
Using this negative int value with the OR operator will cause a wrong result. So, the working line is this one:
((data[0] & 0xFF) << 24) | ((data[1] & 0xFF) << 16) | ((data[2] & 0xFF) << 8) | (data[3] & 0xFF)
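To make the effect of the mask concrete, here is a minimal sketch that packs four bytes with and without `& 0xFF`. The class and method names (`PackDemo`, `packMasked`, `packUnmasked`) are made up for illustration, and it assumes the input is a `byte[]` rather than the `int[]` from the question.

```java
final class PackDemo {
    // Packs four bytes big-endian, masking each byte to its low 8 bits first.
    static int packMasked(byte[] b) {
        return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16)
             | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
    }

    // Same packing without the mask: negative bytes are sign-extended to int,
    // so their leading 1 bits overwrite the higher bytes in the OR.
    static int packUnmasked(byte[] b) {
        return (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3];
    }

    public static void main(String[] args) {
        byte[] data = { 0x52, (byte) 0xe4, 0x18, (byte) 0xcb };
        System.out.println(Integer.toHexString(packMasked(data)));   // 52e418cb
        System.out.println(Integer.toHexString(packUnmasked(data))); // ffffffcb
    }
}
```

In the unmasked version the last byte, 0xcb, is negative, so its sign extension alone is enough to wipe out the three higher bytes.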
The following seems to work fine. I'm using the OutputStream.write(int) method.
int[] ints = new int[] { 0x52, 0xe4, 0x18, 0xcb };
FileOutputStream os = new FileOutputStream(new File("/tmp/x"));
for (int i : ints) {
os.write(i);
}
os.close();
Results:
> hexdump /tmp/x
0000000 52 e4 18 cb
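The same behaviour can be checked without touching the file system. The sketch below swaps the FileOutputStream for a ByteArrayOutputStream (an assumption made only so the example is self-contained); `WriteLowByteDemo` and `lowBytes` are illustrative names. It relies on the documented contract of `OutputStream.write(int)`, which writes only the 8 low-order bits of its argument.

```java
import java.io.ByteArrayOutputStream;

final class WriteLowByteDemo {
    // Writes the low-order byte of each int and returns the bytes produced.
    static byte[] lowBytes(int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : values) {
            out.write(v); // the 24 high-order bits of v are ignored
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = lowBytes(new int[] { 0x52, 0xe4, 0x18, 0xcb });
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b & 0xFF)); // mask to print as unsigned
        }
        System.out.println(hex); // 52e418cb
    }
}
```

For standard output, calling `System.out.write(i)` in the loop and then `System.out.flush()` should produce the same bytes, since PrintStream also has a `write(int)` method.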
Just shift and OR them together before writing them to the file/output:
(data[0] << 24) | (data[1] << 16) | (data[2] << 8) | data[3]
Related
I had a look at the Twofish source code in Java and discovered this line, which I don't quite understand.
public static byte[] blockEncrypt (byte[] in, int inOffset, Object sessionKey) {
int x0 = (in[inOffset++] & 0xFF) | (in[inOffset++] & 0xFF) << 8 | (in[inOffset++] & 0xFF) << 16 | (in[inOffset++] & 0xFF) << 24;
}
Here, each byte is combined with 0xFF (11111111) using a bitwise AND. Why is this done? If I take an 8-bit value and AND it with 0xFF, I get back the value I started with, because only the bits that were 1 in the original value are kept. (11010011 & 11111111 should result in 11010011)
I have been reading these byte by byte from streams. For example, I read this value like this:
int payloadLength = r.readUnsignedShort();
The problem is that the 2-byte value is 0x3100, so it comes out as 12544, but I am supposed to read only 0x31, which is 49. How can I ignore the extra 00?
Right-shift the value by 8 bits and then AND it with 0xFF. Right-shifting moves the bits 8 positions to the right. Any higher bits would also be moved to the right, so you need to mask those off by ANDing (&) with 0xFF.
int payloadLength = r.readUnsignedShort();
payloadLength = (payloadLength >>> 8)& 0xFF;
System.out.println(payloadLength);
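Here is a self-contained sketch of that shift-and-mask step. It assumes the reader `r` from the question is a `java.io.DataInputStream`, which the question does not actually state; `HighByteDemo` and `highByte` are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class HighByteDemo {
    // Reads one unsigned 16-bit big-endian value and returns its high-order byte.
    static int highByte(byte[] raw) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            int value = in.readUnsignedShort(); // 0x3100 = 12544 for { 0x31, 0x00 }
            return (value >>> 8) & 0xFF;        // keep only the high byte: 0x31 = 49
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(highByte(new byte[] { 0x31, 0x00 })); // 49
    }
}
```

If only the first byte matters, reading it with `readUnsignedByte()` and skipping the second byte would be another option.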
You may also want to swap the two bytes.
int v = 0xa0b;
v = swapBytes(v);
System.out.println(Integer.toHexString(v)); // prints "b0a" (0x0b0a)
public static int swapBytes(int v) {
return ((v << 8)&0xFF00) | ((v >> 8) & 0xFF);
}
Normally, when reading just 16 bits, you would not have to AND the result with 0xFF, since the high-order bits are zeros. But I think it is good practice and will prevent possible problems in the future.
I've come across some code which has the bit masks 0xff and 0xff00 or in 16 bit binary form 00000000 11111111 and 11111111 00000000.
/**
* Function to check if the given string is in GZIP Format.
*
* @param inString String to check.
* @return True if GZIP Compressed otherwise false.
*/
public static boolean isStringCompressed(String inString)
{
try
{
byte[] bytes = inString.getBytes("ISO-8859-1");
int gzipHeader = ((int) bytes[0] & 0xff)
| ((bytes[1] << 8) & 0xff00);
return GZIPInputStream.GZIP_MAGIC == gzipHeader;
} catch (Exception e)
{
return false;
}
}
I'm trying to work out the purpose of using these bit masks in this context (against a byte array). I can't see what difference they would make.
This method seems to be written for GZIP-compressed strings, where the GZIP magic number is 35615: 8B1F in hex and 10001011 00011111 in binary.
Am I correct in thinking this swaps the bytes? So for example say my input string were \u001f\u008b
bytes[0] & 0xff
bytes[0] = 1f = 00011111
& ff = 11111111
--------
= 00011111
bytes[1] << 8
bytes[1] = 8b = 10001011
<< 8 = 10001011 00000000
((bytes[1] << 8) & 0xff00)
= 10001011 00000000 & 0xff00
= 10001011 00000000
11111111 00000000 &
-------------------
10001011 00000000
So
00000000 00011111
10001011 00000000 |
-----------------
10001011 00011111 = 8B1F
To me it doesn't seem like the & is doing anything to the original byte in either case: bytes[0] & 0xff and ((bytes[1] << 8) & 0xff00). What am I missing?
int gzipHeader = ((int) bytes[0] & 0xff) | ((bytes[1] << 8) & 0xff00);
The type byte in Java is signed. If you cast a byte to an int, its sign will be extended. The & 0xff is to mask out the 1 bits that you get from sign extension, effectively treating the byte as if it is unsigned.
Likewise for 0xff00, except that the byte is first shifted 8 bits to the left.
So, what this does is:
take the first byte, bytes[0], cast it to int and mask out the sign-extended bits (treating the byte as if it is unsigned)
take the second byte, cast it to int, shift it left by 8 bits, and mask out the sign-extended bits
combine the values with |
Note that the shift left effectively swaps the bytes.
Apparently the purpose is to read the first word of bytes and store them in gzipHeader by suitable masking and shifting. More precisely, the first part masks out exactly the first byte while the second part masks out the second byte, already shifted by 8 bits. The | combines both bit masks to an int.
The resulting value is compared against the defined value GZIPInputStream.GZIP_MAGIC to determine if the first two bytes are the defined beginning of data compressed with gzip.
This is a trick to overcome big-endian/little-endian issues. It is forcing the interpretation of the first two bytes as little-endian, i.e. [0] contains the low byte and [1] contains the high byte.
byte is a signed type. If you convert 0xff as a byte to int you get -1. If you actually want to get 255, mask after the conversion.
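The points made in these answers can be checked with a short sketch. It works on a `byte[]` directly instead of a String (a simplification, not the original method), and `GzipMagicDemo`, `maskedHeader`, `unmaskedHeader` and `looksLikeGzip` are illustrative names. `GZIPInputStream.GZIP_MAGIC` is the constant used in the question.

```java
import java.util.zip.GZIPInputStream;

final class GzipMagicDemo {
    // Reads the first two bytes as a little-endian unsigned 16-bit value.
    static int maskedHeader(byte[] bytes) {
        return (bytes[0] & 0xFF) | ((bytes[1] & 0xFF) << 8);
    }

    // Same combination without masks: a negative byte is sign-extended,
    // which leaves 1 bits above bit 15 in the result.
    static int unmaskedHeader(byte[] bytes) {
        return bytes[0] | (bytes[1] << 8);
    }

    static boolean looksLikeGzip(byte[] bytes) {
        return bytes != null && bytes.length >= 2
                && maskedHeader(bytes) == GZIPInputStream.GZIP_MAGIC;
    }

    public static void main(String[] args) {
        byte[] header = { 0x1f, (byte) 0x8b };
        System.out.println(Integer.toHexString(maskedHeader(header)));   // 8b1f
        System.out.println(Integer.toHexString(unmaskedHeader(header))); // ffff8b1f
        System.out.println(looksLikeGzip(header));                       // true
    }
}
```

Without the masks, the 0x8b byte is sign-extended and the comparison against the magic number fails even for valid gzip data.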
I'm a bit confused regarding a conversion from bytes to integers. Consider the following code:
byte[] data = new byte[] { 0, (byte) 0xF0 };
int masked = data[0] << 8 & 0xFF | data[1] & 0xFF; //240
int notMasked = data[0] << 8 | data[1]; //-16
Because bytes in Java are signed, data[1] is not 240 decimal but rather its two's complement interpretation, -16. However, it should still be 11110000 in binary (0xF0), so why do I need to do data[1] & 0xFF?
Is Java converting everything to Integer before passing it to the | operator? Why does &0xFF make a difference then?
Java bytes are signed (unfortunately) - so when you promote the value to an int in order to perform the bitwise |, it ends up being sign-extended as 0xFFFFFFF0. That then messes up the | with data[0]. The masking with & 0xff converts it to an integer value of 240 (just 0x000000F0) instead.
However, you've still got a problem. This code:
int masked = data[0] << 8 & 0xFF | data[1] & 0xFF;
should be:
int masked = ((data[0] & 0xff) << 8) | (data[1] & 0xFF);
... otherwise you're masking after the shift, which won't work. I've added brackets because I'm never sure of the precedence of &, << and |...
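As an alternative spelling of the same mask, Java 8 and later provide `Byte.toUnsignedInt`, which performs the `& 0xFF` conversion. The sketch below uses it to combine two bytes; `UnsignedByteDemo` and `combine` are illustrative names, not part of the code under discussion.

```java
final class UnsignedByteDemo {
    // Combines two bytes big-endian, treating each one as unsigned.
    static int combine(byte hi, byte lo) {
        return (Byte.toUnsignedInt(hi) << 8) | Byte.toUnsignedInt(lo);
    }

    public static void main(String[] args) {
        byte[] data = { 0, (byte) 0xF0 };
        System.out.println(combine(data[0], data[1])); // 240
        System.out.println(data[0] << 8 | data[1]);    // -16 (sign extension)
    }
}
```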
It is similar to a known "puzzle"
byte x = -1;
x >>>= 1;
System.out.println(x);
produces
-1
No shift? This is because, when compiling arithmetic, shift and comparison expressions, javac promotes byte (as well as short and char) to int, or to long if there is any long in the expression, so it works as follows:
x -> int = 0xFFFFFFFF; 0xFFFFFFFF >>> 1 = 0x7FFFFFFF; (byte) 0x7FFFFFFF -> 0xFF (i.e. -1)
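The puzzle can be reproduced with the sketch below, which also shows one possible way around it: masking to 8 bits before shifting. `ShiftPuzzleDemo`, `shiftAsByte` and `shiftMasked` are illustrative names.

```java
final class ShiftPuzzleDemo {
    // The compound assignment promotes x to int, shifts, then narrows back to byte.
    static byte shiftAsByte(byte x) {
        x >>>= 1;
        return x;
    }

    // Masking first discards the sign-extended bits, so the shift acts on 8 bits only.
    static byte shiftMasked(byte x) {
        return (byte) ((x & 0xFF) >>> 1);
    }

    public static void main(String[] args) {
        System.out.println(shiftAsByte((byte) -1)); // -1
        System.out.println(shiftMasked((byte) -1)); // 127
    }
}
```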
I'm trying to convert a short into 2 bytes...and then from those 2 bytes try to get the same short value. For that, I've written this code:
short oldshort = 700;
byte byte1= (byte) (oldshort);
byte byte2= (byte) ((oldshort >> 8) & 0xff);
short newshort = (short) ((byte2 << 8) + byte1);
System.out.println(oldshort);
System.out.println(newshort);
For the value 700 (oldshort), newshort is 444. After some testing, it looks like this code only works for some values. For example, if oldshort = 50 it works fine, but if it is -200, or a value bigger than 127 (I think), it doesn't work. I guess there is a problem with signed bytes, two's complement, etc., but I can't figure out how to solve it.
Any ideas? Is there a native way to do this in Java? Thanks in advance!
When recombining, you need to mask the byte1 to stop it being sign extended.
E.g.
short oldshort = 700;
byte byte1= (byte) (oldshort);
byte byte2= (byte) ((oldshort >> 8) & 0xff);
short newshort = (short) ((byte2 << 8) + (byte1 & 0xFF));
System.out.println(oldshort);
System.out.println(newshort);
EDIT:
All operations on bytes and shorts in Java are actually done as integers. So when you write +byte1, what really happens is that the byte is first cast to an integer (sign-extended). It still has the same value, but now has more bits. We can then mask off the bottom 8 bits to get the original 8 bits from the short, without the sign.
E.g. short = 510 = 0x01FE
// lots of 0x000's because the operations are done on 32-bit ints
byte1 = (0x000001FE & 0x000000FF) = (0x01FE & 0xFF) = 0xFE = (byte)-2
byte2 = 0x1
newShort = (byte2 << 8) + (byte1 & 0xFF)
= (0x1 << 8) + (0xFE & 0xFF)
// since the ops are performed as ints
= (0x00000001 << 8) + (0xFFFFFFFE & 0x000000FF)
// 0xFFFFFFFE = -2
= (0x00000100) + (0x000000FE)
= 0x000001FE
= 510
You could also use com.google.common.primitives.Shorts, which has methods:
public static byte[] toByteArray(short value)
public static short fromByteArray(byte[] bytes)
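If adding Guava is not an option, the standard library's `java.nio.ByteBuffer` can do the same round trip. This is a minimal sketch, not the Guava implementation; `ShortBytesDemo`, `toBytes` and `fromBytes` are illustrative names, and it relies on ByteBuffer's default big-endian byte order.

```java
import java.nio.ByteBuffer;

final class ShortBytesDemo {
    // Splits a short into two bytes, high byte first (ByteBuffer defaults to big-endian).
    static byte[] toBytes(short value) {
        return ByteBuffer.allocate(2).putShort(value).array();
    }

    // Rebuilds the short from two big-endian bytes; no manual masking needed.
    static short fromBytes(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getShort();
    }

    public static void main(String[] args) {
        short oldshort = 700;
        byte[] bytes = toBytes(oldshort);         // { 0x02, (byte) 0xBC }
        System.out.println(fromBytes(bytes));     // 700
        System.out.println(fromBytes(toBytes((short) -200))); // -200
    }
}
```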