Java, combining two integers to long results negative number


I am trying to combine two integers into a long in Java. Here is the code I am using:
Long combinedValue = (long) a << 32 | b;
When a = 0x03 and b = 0x1B56 ED23, I am able to get the expected value (combinedValue = 13343583523 in long).
However, when a = 0x00 and b = 0xA2BF E1C7, I get a negative value, -1567628857, instead of 2730484167. Can anyone explain why shifting an integer 0 by 32 bits causes the first 32 bits to become 0xFFFF FFFF?
Thanks

b is negative, too. That's what that constant means. What you probably want is ((long) a << 32) | (b & 0xFFFFFFFFL).

When you OR (long) a << 32 with b, if b is an int then it will be promoted to a long because the operation must be done between two values of the same type. This is called a widening conversion.
When this conversion from int to long happens, b will be sign extended, meaning that if the top bit is set then it will be copied into the top 32 bits of the 64 bit long value. This is what causes the top 32 bits to be 0xffffffff.
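The masking fix suggested above can be sketched as follows. This is only an illustration: the class name PackInts and the method name combine are made up here and do not come from the original post.

```java
// Hypothetical helper illustrating the masking fix; the names are assumptions.
class PackInts {
    // Puts `high` in the upper 32 bits and `low` in the lower 32 bits of a long.
    static long combine(int high, int low) {
        // ANDing with a long literal zero-extends `low` instead of sign-extending it.
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        int a = 0x00;
        int b = 0xA2BFE1C7;                      // top bit set, so negative as an int
        System.out.println((long) a << 32 | b);  // prints a negative number (b is sign-extended)
        System.out.println(combine(a, b));       // 2730484167
        System.out.println(combine(0x03, 0x1B56ED23)); // 13343583523
    }
}
```

The parentheses around the shift are not strictly required, because << binds tighter than |, but they make the intent easier to read.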

Related

Why am I not able to mask 32 bits on a long data type in Java

I cannot figure out why this does not work. I am attempting to mask the least significant 32 bits of a long in Java, but the AND does not clear the 33rd, 34th, and higher bits. Here is my example:
class Main {
public static void main(String[] args) {
long someVal = 17592096894893l; //hex 0xFFFFAAFAFAD
long mask = 0xFF; //binary
long result = mask & someVal;
System.out.println("Example 1 this works on one byte");
System.out.printf("\n%x %s", someVal, Long.toBinaryString(someVal) );
System.out.printf("\n%x %s", result, Long.toBinaryString(result) );
long someVal2 = 17592096894893l; //hex 0xFFFFAAFAFAD
mask = 0xFFFFFFFF; //binary
result = mask & someVal2;
System.out.println("\nExample 2 - this does not work");
System.out.printf("\n%x %s", someVal2, Long.toBinaryString(someVal2) );
System.out.printf("\n%x %s", result, Long.toBinaryString(result) );
}
}
I was expecting the result to have its most significant bits set to zero, since the AND was done against a 32-bit mask. Here is the output I get.
Example 1 - this works
ffffaafafad 11111111111111111010101011111010111110101101
ad 10101101
Example 2 - this does not work
ffffaafafad 11111111111111111010101011111010111110101101
ffffaafafad 11111111111111111010101011111010111110101101
I would like to be able to mask the first least significant 4 bytes of the long value.

I believe what you’re seeing here is the fact that Java converts integers to longs using sign extension.
For starters, what should this code do?
int myInt = -1;
long myLong = myInt;
System.out.println(myLong);
This should intuitively print out -1, and that’s indeed what happens. I mean, it would be kinda weird if in converting an int to a long, we didn’t get the same number we started with.
Now, let’s take this code:
int myInt = 0xFFFFFFFF;
long myLong = myInt;
System.out.println(myLong);
What does this print? Well, 0xFFFFFFFF is the hexadecimal version of the signed 32-bit number -1. That means that this code is completely equivalent to the above code, so it should (and does) print the same value, -1.
But the value -1, encoded as a long, doesn’t have representation 0x00000000FFFFFFFF. That would be 2^32 - 1, not -1. Rather, since it’s 64 bits long, -1 is represented as 0xFFFFFFFFFFFFFFFF. Oops - all the upper bits just got activated! That makes it not very effective as a bitmask.
The rule in Java is that if you convert an int to a long, if the very first bit of the int is 1, then all 32 upper bits of the long will get set to 1 as well. That’s in place so that converting an integer to a long preserves the numeric value.
If you want to make a bitmask that’s actually 64 bits long, initialize it with a long literal rather than an int literal:
mask = 0xFFFFFFFFL; // note the L
Why does this make a difference? Without the L, Java treats the code as
Create the integer value 0xFFFFFFFF = -1, giving 32 one bits.
Convert that integer value into a long. To do so, use sign extension to convert it to the long value -1, giving 64 one bits in a row.
However, if you include the L, Java interprets things like this:
Create the long value 0xFFFFFFFF = 2^32 - 1, which is 32 zero bits followed by 32 one bits.
Assign that value to mask.
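The two interpretations can be compared side by side with a small sketch. The class and method names below are invented for illustration; the value is the one from the question, written in hex.

```java
// Illustrative sketch; MaskLiterals and its method names are assumptions.
class MaskLiterals {
    static long maskWithIntLiteral(long value) {
        long mask = 0xFFFFFFFF;    // int literal (-1), sign-extended to 64 one bits
        return value & mask;
    }

    static long maskWithLongLiteral(long value) {
        long mask = 0xFFFFFFFFL;   // long literal, only the low 32 bits are set
        return value & mask;
    }

    public static void main(String[] args) {
        long someVal = 0xFFFFAAFAFADL;  // 17592096894893
        System.out.println(Long.toHexString(maskWithIntLiteral(someVal)));   // ffffaafafad
        System.out.println(Long.toHexString(maskWithLongLiteral(someVal)));  // faafafad
    }
}
```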
Hope this helps!

Packing bytes into a long with |= is giving unexpected results

I am trying to concatenate my byte[] data into a long variable. But for some reason, the code is not working as I expected.
My byte array has a maximum size of 8 bytes, which is 64 bits, the same size as a long, so I am trying to concatenate the array into a long variable.
public static void main(String[] args) {
// TODO Auto-generated method stub
byte[] data = new byte[]{
(byte)0xD4,(byte)0x11,(byte)0x92,(byte)0x55,(byte)0xBC,(byte)0xF9
};
Long l = 0l;
for (int i =0; i<6; i++){
l |= data[i];
l <<=8;
String lon = String.format("%064d", new BigInteger(Long.toBinaryString((long)l)));
System.out.println(lon);
}
}
The results are:
1111111111111111111111111111111111111111111111111101010000000000
1111111111111111111111111111111111111111110101000001000100000000
1111111111111111111111111111111111111111111111111001001000000000
1111111111111111111111111111111111111111100100100101010100000000
1111111111111111111111111111111111111111111111111011110000000000
1111111111111111111111111111111111111111111111111111100100000000
When the final result should be something like
111111111111111110101000001000110010010010101011011110011111001
which is 0xD4,0x11,0x92,0x55,0xBC,0xF9

byte in Java is signed, and when you do long |= byte, the byte's value is promoted and the sign bit is extended, which essentially sets all those higher bits to 1 if the byte was a negative value.
You can do this instead:
l |= (data[i] & 255)
This forces the value into an int and clears the sign-extended bits before it is promoted to a long.
Details
Prerequisite: If the term "sign bit" does not make sense to you, then you must read What is “2's Complement”? first. I will not explain it here.
Consider:
byte b = (byte)0xB5;
long n = 0l;
n |= b; // analogous to your l |= data[i]
Note that n |= b is exactly equivalent to n = n | b (JLS 15.26.2) so we'll look at that.
So first n | b must be evaluated. But, n and b are different types.
According to JLS 15.22.1:
When both operands of an operator &, ^, or | are of a type that is convertible (§5.1.8) to a primitive integral type, binary numeric promotion is first performed on the operands (§5.6.2).
Both operands are convertible to primitive integral types, so we consult 5.6.2 to see what happens next. The relevant rules here are:
Widening primitive conversion (§5.1.2) is applied to convert either or both operands as specified by the following rules:
...
Otherwise, if either operand is of type long, the other is converted to long.
...
Ok, well, n is long, so according to this b must now be converted to long using the rules specified in 5.1.2. The relevant rule there is:
A widening conversion of a signed integer value to an integral type T simply sign-extends the two's-complement representation of the integer value to fill the wider format.
Well, byte is a signed integer value and it's being converted to a long, so according to this the sign bit (highest bit) is simply extended to the left to fill the space. So this is what happens in our example (imagine 64 bits here; I'm just saving space):
b = (byte)0xB5                       10110101
b widened to long    111 ... 1111111110110101
n                    000 ... 0000000000000000
n | b                111 ... 1111111110110101
And so n | b evaluates to 0xFFFFFFFFFFFFFFB5, not 0x00000000000000B5. That is, when that sign bit is extended and the OR operation is applied, you've got all those 1's there essentially overwriting all of the bits from the previous bytes you've OR'd in, and your final results, then, are incorrect.
It's all the result of byte being signed and Java requiring long | byte to be converted to long | long prior to performing the calculation.
If you're unclear on the implicit conversions happening here, here is the explicit version:
n = n | (long)b;
Details of workaround
So now consider the "workaround":
byte b = (byte)0xB5;
long n = 0l;
n |= (b & 255);
So here, we evaluate b & 255 first.
So from JLS 3.10.1 we see that the literal 255 is of type int.
This leaves us with byte & int. The rules are about the same as above although we invoke a slightly different case from 5.6.2:
Otherwise, both operands are converted to type int.
So as per those rules byte must be converted to an int first. So in this case we have:
(byte)0xB5                               10110101
promote to int   11111111111111111111111110110101  (sign extended)
255              00000000000000000000000011111111
&                00000000000000000000000010110101
And the result is an int, which is signed, but as you can see, now it's a positive number and its sign bit is 0.
Then the next step is to evaluate n | the byte we just converted. So again as per the above rules the new int is widened to a long, sign bit extended, but this time:
b & 255                    00000000000000000000000010110101
convert to long  000 ... 0000000000000000000000000010110101
n                000 ... 0000000000000000000000000000000000
n | (b & 255)    000 ... 0000000000000000000000000010110101
And now we get the intended value.
The workaround works by converting b to an int as an intermediate step and setting the high 24 bits to 0, thus letting us convert that to a long without the original sign bit getting in the way.
If you're unclear on the implicit conversions happening here, here is the explicit version:
n = n | (long)((int)b & 255);
Other stuff
And also like maraca mentions in comments, swap the first two lines in your loop, otherwise you end up shifting the whole thing 8 bits too far to the left at the end (that's why your low 8 bits are zero).
Also I notice that your expected final result is padded with leading 1s. If that's what you want at the end you can start with -1L instead of 0L (in addition to the other fixes).
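Putting both fixes together (masking each byte, and shifting before OR-ing rather than after), a corrected loop might look like the sketch below. The class name PackBytes and the method name pack are hypothetical, and the sketch assumes the array holds at most 8 bytes, as in the question.

```java
// Illustrative sketch; PackBytes and pack are made-up names.
class PackBytes {
    // Packs up to 8 bytes into a long, with the first byte as the most significant.
    static long pack(byte[] data) {
        long result = 0L;
        for (byte b : data) {
            result <<= 8;          // make room first, so the last byte ends up in the low 8 bits
            result |= (b & 0xFF);  // mask to 0..255 so sign extension cannot set the upper bits
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] data = new byte[] {
            (byte) 0xD4, (byte) 0x11, (byte) 0x92, (byte) 0x55, (byte) 0xBC, (byte) 0xF9
        };
        System.out.println(Long.toHexString(pack(data)));  // d4119255bcf9
    }
}
```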

Bitwise left shift behaviour

Today I was learning about the left shift bit operator (<<). As I understand it the left shift bit operator moves bits to the left as specified. And also I know multiply by 2 for shifting. But I am confused, like what exactly is the meaning of "shifting bits" and why does the output differ when value is assigned with a different type?
When I call the function below, the line System.out.println("b="+b); prints 0.
And my question is: how does b become 0 and why is b typecasted?
public void leftshiftDemo()
{
byte a=64,b;
int i;
i=a << 2;
b=(byte)(a<<2);
System.out.println("i="+i); //Output: 256 i.e 64*2^2
System.out.println("b="+b); //Output: 0 how & why b is typecasted
}
Update (new doubt):
what does it mean "If you shift a 1 bit into high-order position (Bit 31 or 63), the value will become negative". eg.
public void leftshifHighOrder()
{
int i;
int num=0xFFFFFFE;
for(i=0;i<4;i++)
{
num=num<<1;
System.out.println(num);
/*
* Output:
* 536870908
* 1073741816
* 2147483632
* -32 //how this is -ve?
*/
}
}

When integers are cast to bytes in Java, only the lowest order bits are kept:
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
In this case the byte 64 has the following binary representation:
01000000
The shift operator promotes the value to int:
00000000000000000000000001000000
then left shifts it by 2 bits:
00000000000000000000000100000000
We then cast it back into a byte, so we discard all but the last 8 bits:
00000000
Thus the final byte value is 0. However, your integer keeps all the bits, so its final value is indeed 256.
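The promote, shift, and narrow steps can be reproduced with a short sketch. The class and method names are made up for this example.

```java
// Illustrative sketch; ByteShift and its method names are assumptions.
class ByteShift {
    // The shift operator promotes the byte to int, so no bits are lost here.
    static int shiftAsInt(byte a) {
        return a << 2;
    }

    // Casting back to byte keeps only the lowest 8 bits of the int result.
    static byte shiftAsByte(byte a) {
        return (byte) (a << 2);
    }

    public static void main(String[] args) {
        byte a = 64;
        System.out.println(shiftAsInt(a));           // 256
        System.out.println(shiftAsByte(a));          // 0
        System.out.println(shiftAsByte((byte) 48));  // 48 << 2 is 192, which is -64 as a byte
    }
}
```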

In Java, ints are signed, and two's complement is used to represent them. In this representation, any number that has its high-order bit set to 1 is negative (by definition).
Therefore, when you left-shift a 1 out of the 31st bit (the one just below the high-order bit of an int) into the high-order bit, the value becomes negative.
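As a rough check of this, the last step of the loop in the question can be isolated. The names below are invented for the example; 0x7FFFFFF0 is the value 2147483632 that the question prints just before the result turns negative.

```java
// Illustrative sketch; SignBitShift and shiftLeftOnce are made-up names.
class SignBitShift {
    static int shiftLeftOnce(int value) {
        return value << 1;
    }

    public static void main(String[] args) {
        int before = 0x7FFFFFF0;            // 2147483632: high-order bit clear, the bit below it set
        int after = shiftLeftOnce(before);  // that bit moves into the high-order (sign) position
        System.out.println(after);                        // -32
        System.out.println(Integer.toHexString(after));   // ffffffe0
    }
}
```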

i = a << 2;
In memory:
load a (8 bits) into register R1 (32 bits)
shift register R1 left by two positions
assign register R1 (32 bits) to variable i (32 bits)
b = (byte)(a << 2);
In memory:
load a (8 bits) into register R1 (32 bits)
shift register R1 left by two positions
assign register R1 (32 bits) to variable b (8 bits) <- this is why the (byte) cast is necessary and why only the last 8 bits of the shift result are kept

The exact meaning of shifting bits is exactly what it sounds like. :-) You shift them to the left.
0011 = 3
0011 << 1 = 0110
0110 = 6

You should read about different data types and their ranges in Java.
Let me explain in easy terms.
byte a=64,b;
int i;
i=a << 2;
b=(byte)(a<<2);
'byte' in Java is a signed 2's complement integer. It can store values from -128 to 127, both inclusive. When you do this,
i = a << 2;
you are left shifting 'a' by 2 bits and the value is supposed to be 64*2*2 = 256. 'i' is of type 'int' and 'int' in Java can represent that value.
When you again left shift and typecast,
b=(byte)(a<<2);
you keep your lower 8 bits and hence the value is 0.
You can read this for different primitive types in Java.
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html

Is it possible to create a bitmask for ~100 constants?

Would that mean that the 100th constant would have to be 1 << 100?

You can use a BitSet, which has any number of bits you want to set or clear, e.g.
BitSet bitSet = new BitSet(101);
bitSet.set(100);
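A slightly fuller sketch of the BitSet approach follows. FlagBits and flagsOf are hypothetical names; only the java.util.BitSet calls themselves come from the standard library.

```java
import java.util.BitSet;

// Illustrative sketch; FlagBits and flagsOf are made-up names.
class FlagBits {
    // Builds a BitSet with the given bit indices set.
    static BitSet flagsOf(int... indices) {
        BitSet bits = new BitSet(101);  // initial size hint; a BitSet grows as needed
        for (int index : indices) {
            bits.set(index);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet flags = flagsOf(3, 100);
        System.out.println(flags.get(100));  // true
        System.out.println(flags.get(99));   // false
        System.out.println(flags);           // {3, 100}
    }
}
```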

You can't do it directly because maximum size for a primitive number which can be used as a bitmask is actually 64 bit for a long value. What you can do is to split the bitmask into 2 or more ints or longs and then manage it by hand.
int[] mask = new int[4];
final int MAX_SHIFT = 32;
void set(int b) {
mask[b / MAX_SHIFT] |= 1 << (b % MAX_SHIFT);
}
boolean isSet(int b) {
return (mask[b / MAX_SHIFT] & (1 << (b % MAX_SHIFT))) != 0;
}

You can only create a simple bitmask with the number of bits in the primitive type.
If you have a 32 bit (as in normal Java) int then 1 << 31 is the most you can shift the low bit.
To have larger constants, you use an array of int elements: you figure out which array element to use by dividing by 32 (with a 32-bit int), and you shift by % 32 (modulo) within the selected array element.

Effective Java Item #32 suggests using an EnumSet instead of bit fields. Internally, it uses a bit vector so it is efficient, however, it becomes more readable as each bit has a descriptive name (the enum constant).
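A minimal sketch of the EnumSet approach is shown below. The Style enum and its constants are invented for illustration; a real application would define one enum constant per flag.

```java
import java.util.EnumSet;

// Illustrative sketch; StyleFlags, Style, and headingStyles are made-up names.
class StyleFlags {
    enum Style { BOLD, ITALIC, UNDERLINE, STRIKETHROUGH }

    // Plays the role of BOLD | UNDERLINE in a bit-field design.
    static EnumSet<Style> headingStyles() {
        return EnumSet.of(Style.BOLD, Style.UNDERLINE);
    }

    public static void main(String[] args) {
        EnumSet<Style> heading = headingStyles();
        System.out.println(heading);                          // [BOLD, UNDERLINE]
        System.out.println(heading.contains(Style.ITALIC));   // false
        heading.add(Style.ITALIC);                            // like flags |= ITALIC
        System.out.println(heading.contains(Style.ITALIC));   // true
    }
}
```

The same code works unchanged for an enum with around 100 constants, since an EnumSet is not limited to the width of a single int or long.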

Yes, if you intend to be able to bitwise OR any or all of those constants together, then you're going to need a bit representing each constant. Of course if you use an int you will only have 32 bits and a long will only give you 64 bits.

How are integers cast to bytes in Java?

I know Java doesn't allow unsigned types, so I was wondering how it casts an integer to a byte. Say I have an integer a with a value of 255 and I cast the integer to a byte. Is the value represented in the byte 11111111? In other words, is the value treated more as a signed 8 bit integer, or does it just directly copy the last 8 bits of the integer?

This is called a narrowing primitive conversion. According to the spec:
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
So it's the second option you listed (directly copying the last 8 bits).
I am unsure from your question whether or not you are aware of how signed integral values are represented, so just to be safe I'll point out that the byte value 1111 1111 is equal to -1 in the two's complement system (which Java uses).
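A short sketch can confirm this behaviour. The class and method names are made up; Byte.toUnsignedInt is a standard library method available since Java 8.

```java
// Illustrative sketch; NarrowingCast and toByte are made-up names.
class NarrowingCast {
    // Narrowing keeps only the lowest 8 bits of the int.
    static byte toByte(int i) {
        return (byte) i;
    }

    public static void main(String[] args) {
        System.out.println(toByte(255));      // -1 (bit pattern 1111 1111)
        System.out.println(toByte(0xFF00));   // 0  (the lowest 8 bits are all zero)
        System.out.println(toByte(128));      // -128
        // Reading the same 8 bits back as an unsigned quantity:
        System.out.println(Byte.toUnsignedInt(toByte(255)));  // 255
    }
}
```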

int i = 255;
byte b = (byte)i;
So the value of b in hex is 0xFF but the decimal value will be -1.
int i = 0xff00;
byte b = (byte)i;
The value of b is now 0x00. This shows that Java keeps only the last byte of the integer, i.e. the lowest 8 bits, and interprets it as a signed value.

or does it just directly copy the last 8 bits of the integer
Yes, this is the way this casting works.

The following fragment casts an int to a byte. If the integer’s value is outside the range of a byte, only its lowest 8 bits are kept, which is equivalent to reducing the value modulo 256 (the size of the byte’s range) and interpreting the result as a signed byte.
int a;
byte b;
// …
b = (byte) a;

Just a thought on what has been said: always mask your integer with 0xFF (for ints) when converting to bytes. (This assumes myInt was assigned a value from 0 to 255.)
e.g.
char myByte = (char)(myInt & 0xFF);
Why? If myInt is bigger than 127, just typecasting to byte returns a negative value (2's complement), which you don't want.

A byte is 8 bits, and 8 bits can represent 256 values (2 raised to the 8th power = 256).
The first bit is used for the sign: 0 if the value is positive, 1 if it is negative.
Let's say you want to convert the integer 1099 to a byte. Just divide 1099 by 256; the remainder is the byte representation of the int (if the remainder falls outside -128 to 127, add or subtract 256 to bring it into that range).
examples
1099/256 => remainder= 75
-1099/256 =>remainder=-75
2049/256 => remainder= 1
reason why? look at this image http://i.stack.imgur.com/FYwqr.png

According to my understanding, you meant
Integer i=new Integer(2);
byte b=i; //will not work
final int i=2;
byte b=i; //fine
At last
Byte b=new Byte((byte)2); // the constructor needs a byte, so the int literal must be cast
int a=b; //fine

for (int i=0; i <= 255; i++) {
byte b = (byte) i; // cast int values 0 to 255 to corresponding byte values
int neg = b; // neg will take on values 0..127, -128, -127, ..., -1
int pos = (int) (b & 0xFF); // pos will take on values 0..255
}
The conversion of a byte whose unsigned value is bigger than 127 (i.e., bit patterns 0x80 through 0xFF) to an int results in sign extension of the high-order bit of the byte value (i.e., bit 0x80). To remove the 'extra' one bits, use x & 0xFF; this forces bits higher than 0x80 (i.e., bits 0x100, 0x200, 0x400, ...) to zero but leaves the lower 8 bits as is.
You can also write these; they are all equivalent:
int pos = ((int) b) & 0xFF; // convert b to int first, then strip high bits
int pos = b & 0xFF; // done as int arithmetic -- the cast is not needed
Java automatically 'promotes' integer types whose size (in # of bits) is smaller than int to an int value when doing arithmetic. This is done to provide a more deterministic result (than say C, which is less constrained in its specification).
You may want to have a look at this question on casting a 'short'.
