Java byte to short convert randomly attached 0xff problem - java

I am trying to cast from byte to short, but the value is randomly attached to 0xff. Is there a way to handle it normally?
short[] shortArray = new short[size];
for (int index = 0; index < size; index++)
shortArray[index] = (short) byteArray[index];

The negative numbers always must have 1 in leftmost bit.
If byte array has negative numbers, their representation in short format requires left byte to be 0xff to keep the same negative value.
For example:
the decimal byte value -2 is binary 0b1111_1110 or hexadecimal 0xfe
the decimal short value -2 is binary 0b1111_1111_1111_1110 or hexadecimal 0xfffe

Looks like you want to do an unsigned conversion (e.g. got results in range [0..255]).
byte is a signed type in Java, so converting negative byte values to short will produce negative numbers (and in two's complement system you'll see 0xff prefix).
Bit representation, however, is same for signed and unsigned bytes, for example (byte) 0xFF means 255 as unsigned, but is -1 if treated as signed one.
Unsigned conversion can be done by implicit promotion to int, picking 8 low bits using AND and downcasting result to short:
shortArray[index] = (short) (byteArray[index] & 0xFF);

Related

How is short 128 = byte -128 while explicit casting in java ?

import java.util.Scanner;
public class ShortToByte{
public static void main (String args[]){
int i=0;
while (i<6){
Scanner sinput = new Scanner (System.in);
short a = sinput.nextShort();
byte b = (byte) a;
System.out.println("Short value : " + a + ",Byte value : " + b);
i++;
}
}
}
I am trying to understand conversion between different data types, but I am confused as how is short value of 128 = -128 in byte and also how is 1000 in short = -24 in byte ?
I have been using the following logic to convert short to byte :
1000 in decimal -> binary = 0000 0011 1110 1000
while converting to byte : xxxx xxxx 1110 1000 which is equivalent to : 232
I do notice that the correct answer is the two's complement of the binary value, but then when do we use two's complement to convert and when not as while converting 3450 from short to byte I did not use two's complement yet achieved the desired result.
Thank you!
Your cast from short to byte is a narrowing primitive conversion. The JLS, Section 5.1.3, states:
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
(bold emphasis mine)
Numeric types are signed in Java. The short 128, represented in bits as
00000000 10000000
is narrowed to 8 bits as follows:
10000000
... which is -128. Now the 1 bit is no longer interpreted as +128; now it's -128 because of how two's complement works. If the first bit is set, then the value is negative.
Something similar is going on with 1000. The short 1000, represented in bits as
00000011 11101000
is narrowed to 8 bits as follows:
11101000
... which is -24.
From java datatypes:
byte: The byte data type is an 8-bit signed two's complement integer. It has a minimum value of -128 and a maximum value of 127 (inclusive)
Therefore 128 overflows to -128.

Reading bytes in Java

I am trying to understand how the following line of code works:
for (int i = 0; i < numSamples; i++) {
short ampValue = 0;
for (int byteNo = 0; byteNo < 2; byteNo++) {
ampValue |= (short) ((data[pointer++] & 0xFF) << (byteNo * 8));
}
amplitudes[i] = ampValue;
}
As far as I understand, this is reading 2 bytes (as 2 bytes per sample) in a inclusive manner, i.e. the ampValue is composed of two byte reads. The data is the actual data sample (file) and the pointer is increasing to read it upto the last sample. But I don't understand this part:
"data[pointer++] & 0xFF) << (byteNo * 8)); "
Also, I am wondering whether it makes any difference if I want to read this as a double instead of short?
Looks like data[] is the array of bytes.
data[pointer++] gives you a byte value in the range [-128..127].
0xFF is an int contstant, so...
data[pointer++] & 0xFF promotes the byte value to an int value in the range [-128..127]. Then the & operator zeroes out all of the bits that are not set in 0xFF (i.e., it zeroes out the 24 upper bits, leaving only the low 8 bits.
The value of that expression now will be in the range [0..255].
The << operator shifts the result to the left by (byteNo * 8) bits. That's the same as saying, it multiplies the value by 2 raised to the power of (byteNo * 8). When byteNo==0, it will multiply by 2 to the power 0 (i.e., it will multiply by 1). When byteNo==1, it will multiply by 2 to the power 8 (i.e., it will multiply by 256).
This loop is creating an int in the range [0..65535] (16 bits) from each pair of bytes in the array, taking the first member of each pair as the low-order byte and the second member as the high-order byte.
It won't work to declare ampValue as double, because the |= operator will not work on a double, but you can declare the amplitudes[] array to be an array of double, and the assignment amplitudes[i] = ampValue will implicitly promote the value to a double value in the range [0.0..65535.0].
Additional info: Don't overlook #KevinKrumwiede's comment about a bug in the example.
In Java, all bytes are signed. The expression (data[pointer++] & 0xFF) converts the signed byte value to an int with the value of the byte if it were unsigned. Then the expression << (byteNo * 8) left-shifts the resulting value by zero or eight bits depending on the value of byteNo. The value of the whole expression is assigned with bitwise or to ampValue.
There appears to be a bug in this code. The value of ampValue is not reset to zero between iterations. And amplitude is not used. Are those identifiers supposed to be the same?
Let's break down the statement:
|= is the bitwise or and assignment operator. a |= b is equivalent to a = a | b.
(short) casts the int element from the data array to a short.
pointer++ is a post-increment operation. The value of pointer will be returned and used and then immediately incremented every single time it's accessed in this fashion - this is beneficial in this case because the outer-loop is cycling through 2-byte samples (via the inner loop) from the contiguous data buffer, so this keeps incrementing.
& is the bitwise AND operator and 0xFF is the hexadecimal value for the byte 0b11111111 (255 in decimal); the expression data[pointer++] & 0xFF is basically saying, for each bit in the byte retrieved from the data array, AND it with 1. In this context, it forces Java, which by default stores signed byte objects (i.e. values from -128 to 127 in decimal), to return the value as an unsigned byte (i.e. values from 0 to 255 decimal).
Since your samples are 2 bytes long, you need to shift the second lot of 8 bits left, as the most significant bits, using the left bit-shift operator <<. The byteNo * 8 ensures that you're only shifting bits when it's the second of the two bytes.
After the two bytes have been read, ampValue will now contain the value of the sample as a short.

How to manually calculate the value of a typecasted byte value that exceeds the primary data type? [duplicate]

This question already has answers here:
java byte data type
(3 answers)
Closed 8 years ago.
since english isn't my main language and i've didn't find any pointers as to how to manually calculate the new number.
Example:
byte b = (byte) 720;
System.out.println(b); // -48
I know that most primary data types use the two complement.
Byte value ranges from -128 to 127, so it's expected to round the number down to something that fits in byte.
The Question is how do i manually calculate the -48 of the typecasted 720? I've read that i have to convert it to two complement, so taking the binary number, searching from right to left for the first one and then inverting all others and since byte only has 8 bits, take the first 8 bits. But that didn't quite work for me, it would be helpful if you would tell me how to calculate a typecasted number that doesn't fit into byte. Thanks for reading! :)
The binary representation of 702 is
10 1101 0000
Get rid of everything except the last 8 bits (because that's what fits into a byte).
1101 0000
Because of the leading 1, get the complement
0010 1111
and add 1
0011 0000
and negate the value. Gives -48.
In Java, integral types are always 2's complement. Section 4.2 of the JLS states:
The integral types are byte, short, int, and long, whose values are 8-bit, 16-bit, 32-bit and 64-bit signed two's-complement integers
You can mask out the least significant 8 bits.
int value = 720;
value &= 0xFF;
But that leaves a number in the range 0-255. Numbers higher than 127 represent negative numbers for bytes.
if (value > Byte.MAX_VALUE)
value -= 1 << 8;
This manually yields the -48 that the (byte) cast yields.
First what happens is the value (in binary) is truncated.
720 binary is 0b1011010000
We can only fit 8 bits 0b11010000
first digit 1, so the value is negative.
2's compliment gives you -48.
This is close to redundant with the answer posted by rgettman, but since you inquired about the two's complement, here is "manually" taking the 2's complement, rather than simply subtracting 256 as in that answer.
To mask the integer down to the bits that will be considered for a byte, combine it bitwise with 11111111.
int i = 720;
int x = i & 0xFF;
Then to take the two's complement:
if (x >> 7 == 1) {
x = -1 * ((x ^ 0xFF) + 1);
}
System.out.println(x);

Is byte not signed two's complement?

Given that byte,short and int are signed, why do byte and short in Java not get the usual signed two's complement treatment ? For instance 0xff is illegal for byte.
This has been discussed before here but I couldn't find a reason for why this is the case.
If you look at the actual memory used to store -1 in signed byte, then you will see that it is 0xff. However, in the language itself, rather than the binary representation, 0xff is simply out of range for a byte. The binary representation of -1 will indeed use two's complement but you are shielded from that implementation detail.
The language designers simply took the stance that trying to store 255 in a data type that can only hold -128 to 127 should be considered an error.
You ask in comments why Java allows:
int i = 0xffffffff;
The literal 0xffffffff is an int literal and is interpreted using two's complement. The reason that you cannot do anything similar for a byte is that the language does not provide syntax for specifying that a literal is of type byte, or indeed short.
I don't know why the decision not to offer more literal types was made. I expect it was made for reasons of simplicity. One of the goals of the language was to avoid unnecessary complexity.
You can write
int i = 0xFFFFFFFF;
but you can't write
byte b = 0xFF;
as the 0xFF is an int value not a byte so its equal to 255. There is no way to define a byte or short literal, so you have to cast it.
BTW You can do
byte b = 0;
b += 0xFF;
b ^= 0xFF;
even
byte b = 30;
b *= 1.75; // b = 52.
it is legal, but you need to cast it to byte explicitly, i.e. (byte)0xff because it is out of range.
You can literally set a byte, but surprisingly, you have to use more digits:
byte bad = 0xff; // doesn't work
byte b = 0xffffffff; // fine
The logic is, that 0xff is implicitly 0x000000ff, which exceeds the range of a byte. (255)
It's not the first idea you get, but it has some logic. The longer number is a smaller number (and smaller absolute value).
byte b = 0xffffffff; // -1
byte c = 0xffffff81; // -127
byte c = 0xffffff80; // -128
The possible values for a byte range from -128 to 127.
It would be possible to let values outside the range be assigned to the variable, and silently throw away the overflow, but that would rather be confusing than conventient.
Then we would have:
byte b = 128;
if (b < 0) {
// yes, the value magically changed from 128 to -128...
}
In most situations it's better to have the compiler tell you that the value is outside the range than to "fix" it like that.

Java implicit conversion of int to byte

I am about to start working on something the requires reading bytes and creating strings. The bytes being read represent UTF-16 strings. So just to test things out I wanted to convert a simple byte array in UTF-16 encoding to a string. The first 2 bytes in the array must represent the endianness and so must be either 0xff 0xfe or 0xfe 0xff. So I tried creating my byte array as follows:
byte[] bytes = new byte[] {0xff, 0xfe, 0x52, 0x00, 0x6F, 0x00};
But I got an error because 0xFF and 0xFE are too big to fit into a byte (because bytes are signed in Java). More precisely the error was that the int couldn't be converted to a byte. I know that I could just explicitly convert from int to byte with a cast and achieve the desired result, but that is not what my question is about.
Just to try something out I created a String and called getBytes("UTF-16") then printed each of the bytes in the array. The output was slightly confusing because the first two bytes were 0xFFFFFFFE 0xFFFFFFFF, followed by 0x00 0x52 0x00 0x6F. (Obvisouly the endianness here is different from what I was trying to create above but that is not important).
Using this output I decided to try and create my byte array the same way:
byte[] bytes = new byte[] {0xffffffff, 0xfffffffe, 0x52, 0x00, 0x6F, 0x00};
And strangely enough it worked fine. So my question is, why does Java allow an integer value of 0xFFFFFF80 or greater to be automatically converted to a byte without an explicit cast, but anything equal to or greater than 0x80 requires an explicit cast?
The key thing to remember here is that int in Java is a signed value. When you assign 0xffffffff (which is 2^32 -1), this is translated into a signed int of value -1 - an int cannot actually represent something as large as 0xffffffff as a positive number.
So for values less than 0x80 and greater than 0xFFFFFF80, the resulting int value is between -128 and 127, which can unambiguously be represented as a byte. Anything outside that range cannot be, and needs forcing with an explicit cast, losing data in the process.
If you use a number without a hint (e.g. 1234L for a long) the compiler assumes an integer. The value 0xffffffff is an integer with value -1 which can be cast to byte without a warning.
Because 0xffffffff is the number -1 and -1 can be interpreted as a byte.
0xff is the same as writing 0x000000ff, not 0xffffffff. So that's your issue; the integer is a positive number (255), but the byte (if converted bit-for-bit) would be a negative number (-1). But 0xffffffff is -1 both as an int and as a byte.
Because int are signed and 0xffffffff represent -1, and 0xff represent an integer of value 255, which not lie into -128 (0x80) +127 (0x7f) range of a byte.

Categories

Resources