JAVA and byte arrays


I'm trying to use an API where a socket is used to communicate. A request is made up of different parts, and one of them is the header, which is specified as follows:
Fixed header: 2 bytes, fixed at 0xffff
Generally I'm not good with bytes and streams, since I've never used them. So how should I create said byte array? I've tried the following:
byte[] header = new byte[]{(byte)0xff, (byte)0xff};
But the bytes each become -1, which I believe is because 0xFF translates to 255, which is outside the signed byte range (-128 to +127). How do I create a header like that?

You just did it.
In the end, computers just know about bits. The rest is what the code, and the humans looking at it, make of it. A bit is a 0 or a 1. If you bought a computer with 4GB RAM, then your computer can remember 34359738368 of those.
That's a bit unwieldy, so AMD, or Intel, or TSMC, or whoever baked your chip, baked into the chip's design that the chip groups them in sets of 8 (and for certain jobs, in sets of 64 or even higher). But that's where it ends. It's just bits, really. Negative number? What's that? 2? What is this 2 you speak of. I know only 0 and 1.
So that's unwieldy too, so we humans don't wanna say: This byte holds value 00000101. We'll just say 'that holds 5'.
bits = decimal
00000000 = 0
00000001 = 1
00000010 = 2
00000011 = 3
00000100 = 4
00000101 = 5
... and so on
That's great, but what about -1? We just have 0 and 1. There's no - so how do we do this?
That's where it gets interesting. It's a convention, not something in the computer. There's this thing called two's complement: We all agree to check the first bit. If it is a 1, then we shall call this -X, where X is found by applying the following algorithm: Flip every bit (all zeroes become one, all ones become zeroes), and add 1 to it.
11111011 = -5.
Why? Well, flip every bit: 00000100
then add 1 to it : 00000101
which is 5.
But that immediately eats half of what we can represent. After all, the biggest number we can now store in a byte is 127: 01111111. If we add 1 to this number, we get 10000000, but hey, that starts with a 1 bit, so assuming we are all in agreement that this means it is negative, 10000000 is -128 (bit of an exotic case).
And sometimes that's annoying or not worth it. So sometimes we all agree that the number cannot be negative at all, and 10000000 is just 128, and 11111111 is just 255.
The computer has no idea. 255 is 11111111 and so is -1. So what's 11111111? The computer doesn't know. It doesn't even know what 2 is. It just knows zeroes and ones, and as far as the computer is concerned, 11111111 is what it is. (The math works out so that + and - 'just work' regardless of whether we decree these numbers are to be seen as two's complement signed or not, cool, huh? Try it! If 11111011 is both -5 and 251 depending on the opinion of the one reading off the number, what happens? -5 + 2 is -3. 251 + 2 is 253. -3 and 253 boil down to the same sequence of bits. Just an example. This is, incidentally, why we do the weirdo 'flip all bits and add 1' stuff: so that + and - just work and you don't need to pass along whether you consider the bits 'signed' or 'unsigned'.)
In java, all numeric types except char (which is a numeric type. You'd think it represents a character, but it really doesn't) are signed. byte is 'signed 8-bit number' (so, can represent from -128 to +127, inclusive). char is the only exception, that is an 'unsigned 16-bit number', so can hold from 0 to 65535, inclusive. It's just if you e.g. call System.out.println((char) 65);, the println method will interpret that number as: "Look this up in the unicode table and print whatever you find there", so that prints 'A'. That's part of the source code of that particular println method, it's nothing inherent about the char type in java, which is just 'a number between 0 and 65535'.
So, when you print your byte array containing 0xFF, 0xFF in java, because java agreed that we consider it signed, it prints -1, -1. But that's just java-ese for 0xFF, 0xFF. Your byte array contains 0xFF, 0xFF because at the bit level -1 and 255 are the exact same number. For bytes anyway. Not so for all the other ones (char, short, int, long).
To recap:
byte x = (byte) 200;
byte x = (byte) 0xC8;
byte x = -56;
In all these cases, x ends up holding the bits 11001000. There is no way to tell the difference. You can't ask the system: So, uh, is this x equal to 200, or 0xC8, or -56? What was used to set it? Because the computer does not know - the compiler translates all of the above code to the exact same end result, which is 11001000.
255 is -1.
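
To see this for yourself, here is a minimal sketch. The class name HeaderBytes and the helper unsigned are made up for illustration; the only real API used is Integer.toHexString. It builds the header from the question and reads the same bits back as both signed and unsigned.

```java
public class HeaderBytes {
    // Reads the 8 bits of a byte as an unsigned value (0..255).
    static int unsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xFF, (byte) 0xFF};
        System.out.println(header[0]);                                 // -1  (signed view)
        System.out.println(unsigned(header[0]));                       // 255 (unsigned view)
        System.out.println(Integer.toHexString(unsigned(header[1])));  // ff
    }
}
```

The array holds exactly the two bytes the API asks for; only the way they are printed changes.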

Well, to start you must know that in Java all integer types are signed (char, an unsigned 16-bit type, is the one exception). This means that the most significant bit is reserved to represent the sign. That is why in Java the constant Byte.MAX_VALUE is 127.
Now, this means you can store 8 bits in a byte, but if you happen to turn on the sign bit, whatever you store would be represented by Java as negative number.
Since 0xff turns on all the byte bits (i.e. 11111111) instead of getting 255 as you were expecting, what you're getting is -1, because that number represents -1 in Java.
Perhaps to understand it I can show you how the bits work in Java. Imagine a type called nimble of only 4 bits, where the most significant bit is reserved for sign.
This is how it would look in Java if it existed:
Imaginary Signed Type: Nimble (4 bits)
Dec.  Bin.  Hex.
--------------------
+0    0000  0x0
+1    0001  0x1
+2    0010  0x2
+3    0011  0x3
+4    0100  0x4
+5    0101  0x5
+6    0110  0x6
+7    0111  0x7
-8    1000  0x8
-7    1001  0x9
-6    1010  0xA
-5    1011  0xB
-4    1100  0xC
-3    1101  0xD
-2    1110  0xE
-1    1111  0xF
Notice how those numbers where the most significant bit is on become negative numbers. If this nimble were an unsigned type, then it wouldn't have negative numbers and it could reach 15.
That's why Java bytes go from -128 to 127, instead of up to 255 as you were expecting.
Now, when it comes to creating byte arrays to send to a stream, perhaps instead of creating the byte array yourself, you could wrap your socket output stream in a type-aware stream like a DataOutputStream, which allows you to send data of a specific type.
For example:
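try (DataOutputStream out = new DataOutputStream(socket.getOutputStream())) {
    // writeByte takes an int and writes only its low 8 bits
    out.writeByte(0xff);
    out.writeByte(0xff);
} // note: closing out also closes the socket's output stream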
That way you may avoid all the difficulties of having to create a header array.
But bottom line, your array is fine.
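
If you want to check what such a stream really emits without opening a socket, one option is to point it at an in-memory buffer. This is only a sketch: the class HeaderStreamDemo and the method buildHeader are hypothetical names, and it assumes the header can be written with a single writeShort call, which writes the low 16 bits of its argument, high byte first.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class HeaderStreamDemo {
    // Writes the fixed 2-byte header 0xFFFF into an in-memory buffer.
    static byte[] buildHeader() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeShort(0xFFFF); // low 16 bits, high byte first
        } catch (IOException e) {
            // Not expected for an in-memory buffer; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] header = buildHeader();
        System.out.println(header.length);                 // 2
        System.out.println(header[0] + ", " + header[1]);  // -1, -1
    }
}
```

The result is byte for byte the same as the hand-written array from the question.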

Related

how does a number change when it is too long for the selected datatype in java [duplicate]

For example, if this is my input:
byte x=(byte) 200;
This will be the output:
-56
if this is my input:
short x=(short) 250000;
This will be the output:
-12144
I realize that the output is off because the number does not fit into the datatype, but how can I predict what the output will be in this case? In my computer science exam this may be one of the questions, and I do not understand why exactly 200 changes to -56 and so on.

The relevant aspects are what overflow looks like, and how the bits that represent the underlying data are treated.
Computers are all bits, grouped together in groups of 8; a group of 8 bits is called a byte.
byte b = 5; for example, is stored in memory as 0000 0101.
Bits can be 0. Or 1. That's it. That's where it ends. And everything is, in the end, bits. This means: That - is not a thing. Computers do not know what - is and cannot store it. We need to write code and agree on some sort of meaning to represent them.
2's complement
So what's -5 in bits? It's 1111 1011. Which seems bizarre. But it's how it works. If you write: byte b = -5;, then b will contain 1111 1011. It is because javac made that happen. Similarly, if you then call System.out.println(b), then the println method gets the bit sequence 1111 1011. Why does the println method decide to print a - symbol and then a 5 symbol? Because it's programmed that way: We all are in agreement that 1111 1011 is -5. So why is that?
Because of a really cool property - signed/unsigned irrelevancy.
The rule is 2's complement: To switch the sign (i.e. turn 5, which is 0000 0101 into -5 which is 1111 1011), you flip every bit, and then add 1 to the end result. Try it with 0000 0101 - and you'll see it's 1111 1011. This algorithm is reversible - apply the same algorithm (flip every bit, then add 1) and you can turn -5 into 5.
This 2's complement thing has 2 great advantages:
There is only one 0 value. If we just flipped all bits, we'd have both 1111 1111 and 0000 0000 both representing some form of 0. In basic math, there's no such thing as 'negative 0' - it's the same as positive 0. Similarly if we just decided the first bit is the sign and the remaining 7 bits are the number, then we'd have both 1000 0000 and 0000 0000 both being 0, which is annoying and inefficient, why waste 2 different bit sequences on the same number?
plus and minus are sign-mode independent. The computer doesn't have to KNOW whether we are doing the 2's complement thing or not. Take the bit sequence 1111 1011. If we treat that as unsigned bits, then that is 251 (it's 128 + 64 + 32 + 16 + 8 + 2 + 1). If we treat that as a signed number, then the first bit is 1, so the thing is negative: We apply 2's complement and figure out that it is -5. So, is it -5 or 251? It's both, at once! Depends on the human/code that interprets this bit sequence which one it is. So how could the computer possibly do a + b given this? The weird answer is: It doesn't matter - because the math works out the same way. 251 - 10 is 241. -5 - 10 is -15. -15 and 241 are the exact same bit sequence.
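
The sign-mode independence described above can be checked with a few lines. This is a sketch with made-up names (SignAgnosticMath, subtract); Byte.toUnsignedInt is a standard method since Java 8.

```java
public class SignAgnosticMath {
    // Subtracts in int arithmetic, then keeps only the low 8 bits.
    static byte subtract(byte a, int amount) {
        return (byte) (a - amount);
    }

    public static void main(String[] args) {
        byte bits = (byte) 0b1111_1011;   // -5 if read as signed, 251 if read as unsigned
        byte result = subtract(bits, 10);
        System.out.println(result);                       // -15 (signed reading)
        System.out.println(Byte.toUnsignedInt(result));   // 241 (unsigned reading)
    }
}
```

One subtraction, one resulting bit pattern, two valid readings of it.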
Overflow
A byte is 8 bits, and there are 256 different sequences of 8 bits; list them all and you have every possible value. (2^8 = 256. Likewise, a 16-bit number can be used to convey 65536 different things, because 2^16 is 65536, and so on). So, given that bytes are 8 bits and we decreed they are signed, and 2's complement signed, the smallest number you can store is -128, which in bits is 1000 0000 (use 2's complement to check my work), and the largest is +127, which in bits is 0111 1111. So what happens if you add 1 to 127? That'd seemingly be +128 except that's not storable in 8 bits if we decree that we interpret these bits as 2's complement signed (which java does). What happens? The bits 'roll over'. We just add 1 as normal, which turns 0111 1111 into 1000 0000 which is -128:
byte b = 127;
b = (byte)(b + 1);
System.out.println(b); // prints -128
Imagine the number line - stretching out into infinity on both ends, from -infinite to +infinite. That's the usual way math works. Computers (or rather, int, long, etc) do not work like that. Instead of a line, it is a circle. Take your infinite number line and take some scissors, and snip that number line at -128 (because a 2's comp signed byte cannot represent -129 or anything else below -128), and at +127 (because our byte cannot represent 128 or anything above it).
And now tape the 2 cut ends together.
That's the number line. What's 'to the right' of 125? 126 - that's what +1 means: Move one to the right on the number line.
What's 'to the right' of +127? Why, -128. Because we taped it together.
Similarly, -127 - 5 is +124. '-5' is 'move 5 places to the left on the number line (or rather, number circle)'. Going in decrements of 1:
-127 (we start here)
-128 (-127 -1)
+127 (-127 -2)
+126 (-127 -3)
+125 (-127 -4)
+124 (-127 -5)
Hence, 124.
Same math applies to short (-32768 to +32767), char (which is really a 16-bit unsigned number - so 0 to 65535), int (-2147483648 to +2147483647), and even long (-2^63 to +2^63-1 - those get a little large).
short x = 32765;
x += 5;
System.out.println(x); // prints -32766.
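
To predict the result of such a cast on paper, you can walk the number circle arithmetically: reduce the value modulo 2^bits, then subtract 2^bits if the top bit ended up set. A minimal sketch, where NarrowingPredictor and predict are illustrative names and not a library API:

```java
public class NarrowingPredictor {
    // Predicts the result of casting to a signed type of the given width (8 for byte, 16 for short).
    static long predict(long value, int bits) {
        long modulus = 1L << bits;                     // 256 or 65536
        long wrapped = Math.floorMod(value, modulus);  // keep only the low bits
        // If the top bit is set, the signed reading is wrapped - 2^bits.
        return wrapped >= modulus / 2 ? wrapped - modulus : wrapped;
    }

    public static void main(String[] args) {
        System.out.println(predict(200, 8));      // -56
        System.out.println((byte) 200);           // -56
        System.out.println(predict(250000, 16));  // -12144
        System.out.println((short) 250000);       // -12144
    }
}
```

For 250000: 250000 mod 65536 is 53392, which is at least 32768, so the answer is 53392 - 65536 = -12144.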

How do I convert a short to an int without turning it into a negative in java

I am working on a file reader and ran into a problem when trying to read a short. In short (pun intended), Java is converting the two bytes I'm using to make the short into ints to do bitwise operations, and is converting them in a way that keeps the same numeric value. I need to convert a byte into an int in a way that preserves its bits instead.
example of what's happening:
byte number = -1; //-1
int otherNumber = 1;
number | otherNumber; // -1
example of what I want:
byte number = -1; //-1
int otherNumber = 1;
number | otherNumber; // 129

This can be done pretty easily with some bit magic.
I'm sure you're aware that a short is 16 bits (2 bytes) and an int is 32 bits (4 bytes). So, between an integer and a short, there is a two-byte difference. Now, for positive numbers, copying the value of a short to an int is effectively copying the binary data, however, as you've pointed out, this is not the case for negative numbers.
Now let's look at how negative numbers are represented in binary. It's a bit confusing, so I'll try to keep it simple. Modern systems use what's called two's complement to store negative numbers. Basically all this means is that the very first bit in the set of bytes representing the number determines whether or not it's negative. To negate a number, every bit is inverted and then 1 is added to the result (which also avoids having a negative 0). For example, 2 as a short would be represented as 0000 0000 0000 0010, while -2 would be represented as 1111 1111 1111 1110. Now, since Java keeps the numeric value when it widens a short to an int (sign extension), -2 in int form is the same but with 2 more bytes (16 bits) at the beginning that are all set to 1.
So, in order to combat this, all we need to do is change the extra 1s to 0s. This can be done by simply using the bitwise AND operator (&). This operator goes through each bit position and looks at the bits of both operands. If they're both 1, the result bit is 1. Otherwise, the result bit is 0.
Now, with this knowledge, all we need to do is create another integer where the upper two bytes are all 0 and the lower two bytes are all 1. This is fairly simple to do using hexadecimal literals. Since they are ints by default, the upper bytes we don't write out are 0, so we only need to spell out two bytes of 1s. With a single byte, if you were to set every bit to 1, the max value you can get is 255. 255 in hex is 0xFF, so 2 bytes would be 0xFFFF. Pretty simple, now you just need to apply it.
Here is an example that does exactly that:
short a = -2;
int b = a & 0xFFFF;
You could also use Short.toUnsignedInt(), but where's the fun in that? 😉
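
For the file-reader case in the question, the same masking trick applies to each byte before the two are combined. A small sketch, assuming the file stores the high byte first; the names UnsignedShortReader and readUnsignedShort are hypothetical:

```java
public class UnsignedShortReader {
    // Combines two bytes (high byte first) into a value in 0..65535.
    static int readUnsignedShort(byte high, byte low) {
        // Masking with 0xFF undoes the sign extension that happens
        // when each byte is widened to int.
        return ((high & 0xFF) << 8) | (low & 0xFF);
    }

    public static void main(String[] args) {
        short a = -2;
        System.out.println(a & 0xFFFF);                                   // 65534
        System.out.println(Short.toUnsignedInt(a));                       // 65534
        System.out.println(readUnsignedShort((byte) 0xFF, (byte) 0xFE));  // 65534
    }
}
```

Without the masks, a low byte such as 0xFE would widen to 0xFFFFFFFE and overwrite the high byte in the OR.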

How to send an integer greater than 127 from android(server java) using Byte array in java to computer (client c)? [closed]

I want to send an integer value (6000) from Android. I have to transfer it using a byte[], for which I tried converting int[] to byte[]. But [0,0,23,112] is being stored. Could someone help?

[0,0,23,112] is 6000.
As you said, you must send the data in the shape of a byte array. A single byte is 8 bits; a single bit is an on/off switch. With 8 on/off switches, you can represent 256 different unique states. (2^8 is 256). A byte is just that, and it ends there. The bitsequence 00000101 is commonly understood to mean '5', but that's just conventions. The computer doesn't know what 5 is, it just knows bits and bytes, it keeps seeing 00000101. If you call System.out.println and pass that byte, and you see 5? That's println that decided to render it that way. It's not a universal truth about bytes.
In java specifically, all the various methods that interact with bytes, including println, have decreed that they interpret byte values as two's complement signed. That means that it counts up from 0 to 127, then 'rolls over' to -128, and as you keep incrementing your bits, goes back to -1, at which point we've covered all the 256 unique combinations (0, that's 1 combination. 127 positive integers, 128 negative ones: 1+127+128 is 256). But, again, just a choice.
This is why there is no such thing as a "signed byte" and an "unsigned byte", as far as the byte is concerned. The question of 'is it signed or unsigned' is for the code that prints to decide. When you put bytes on a wire or in a file, it's irrelevant. In that sense, the byte 255 and the byte -1 are the identical value. That value (the bit sequence 11111111) is printed as -1 if the code that does the printing decides to treat it as signed, and as 255 if the printing code decides to treat it as unsigned.
This also explains why you can't "just" put, say, 200 in a byte variable: java treats things as signed (even the compiler), so byte b = 200; is rejected. But this:
byte b = (byte) 200;
works fine and does exactly what you want.
However, even unsigned, bytes are still limited to the 0-255 range (or, at least, they can represent 256 unique things, and assuming we want to start at 0, that means only 0-255 is covered).
Thus, how do you represent higher numbers? Simply by adding more bytes.
The exact same thing happens when you count. We have 10 digits in the common western arabic numeral system: A single digit symbol can differentiate 10 different things. So what happens if you want to count up to 12?
Once you get to the 10th digit (the 9), and you want to add 1 more to it, what do we do?
We invent a second digit! We increment the second digit (from blank/0 to 1), and start our first digit (what used to be the 9) over from the beginning. Thus, after 9, we have 10.
You can do the exact same thing with bytes. A common digit covers 10 different things (0-9). A byte, however, covers 256 different things (0-255).
So what do you do when you want to 'roll over' and you need to add 1 to 255?
You add a second digit byte, and restart the first digit byte from 0 again.
So, we go from a single byte with bitsequence 11111111 (representing 255, let's say we treat it as unsigned for this exercise), and when you add one to that, we end up with 2 bytes. The first byte is 00000001 (representing 1), and the second byte is 00000000 (representing 0). Just like we went from 9 to 10.
Just like with human decimal, computers treat 10 in bytes the same way: that leftmost digit now counts the number of times we 'rolled over' our first digit. Except with bytes, of course, each 'rollover' is 256, whereas with human decimal digits, each rollover is 10. Mathematically: Decimal (western arabic numerals) counting does 1 * 10^1 + 0 * 10^0, byte-based counting does 1 * 256^1 + 0 * 256^0. So, the byte sequence [1, 0] is 256.
Let's do that math now on the byte sequence: 23, 112.
We 'rolled over' our 256-ranged byte 23 times, so that's 23 * 256 = 5888, and that final digitbyte adds 112 more. 5888 + 112 is... 6000!
Hence, whatever you did to turn 6000 into the byte array [0, 0, 23, 112]? That was correct. That is 6000 in big endian bytes.
NB: Little endian means that the least significant byte is sent first - that you write the 'digits' in reverse order. 6000 in little endian is [112, 23, 0, 0]. Most network protocols and many file formats use big endian (also known as 'network byte order'), and it is what Java's DataOutputStream and ByteBuffer use by default. CPUs are a different story: Intel and AMD (x86) chips work in little endian, and so do most ARM systems, as in, if your computer has such a chip and it stores 6000 in its own memory banks, it stores [112, 23, 0, 0]. Some protocols/file formats just dump memory, and they tend to be in little endian for that reason. However, the era of 'just dump memory straight to a file, voila, state saved' appears to be ending. Hence, for data you put on a wire or in a file, little endian is the exception, and it is getting less relevant as time progresses.
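
Both byte orders, and the 256-based arithmetic from above, can be reproduced with java.nio.ByteBuffer. This is a sketch for illustration; the class IntBytesDemo and its two helper methods are made-up names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class IntBytesDemo {
    // Encodes an int as 4 bytes in the given byte order.
    static byte[] toBytes(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Decodes 4 big-endian bytes by hand: each byte is one base-256 'digit'.
    static int fromBigEndian(byte[] b) {
        return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16)
             | ((b[2] & 0xFF) << 8)  |  (b[3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] big = toBytes(6000, ByteOrder.BIG_ENDIAN);
        byte[] little = toBytes(6000, ByteOrder.LITTLE_ENDIAN);
        System.out.println(Arrays.toString(big));     // [0, 0, 23, 112]
        System.out.println(Arrays.toString(little));  // [112, 23, 0, 0]
        System.out.println(fromBigEndian(big));       // 6000
    }
}
```

Whatever order you pick, the receiving C client has to reassemble the four bytes in that same order.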

What does it mean when we say the width of Byte in java is 8 bit?

I can store numbers ranging from -127 to 127, but anything outside that is impossible and the compiler complains. The binary value of 127 is 01111111, and 130 is 10000010, which is still the same size (8 bits), so I think I should be able to store 130 in a byte, but it is not possible. Why is that?

Java does not have unsigned types, each numeric type in Java is signed (except char but it is not meant for representing numbers but unicode characters).
Let's take a look at byte. It is one byte which is 8 bits. If it would be unsigned, yes, its range would be 0..255.
But if it is signed, it takes 1 bit of information to store the sign (2 possible values: + or -), which leaves us 7 bits to store the numeric (absolute) value. Range of 7 bit information is 0..127.
Note that the representation of signed integer numbers use the 2's complement number format in most languages, Java included.
Note: The range of Java's byte type is actually -128..127. The range -127..127 only contains 255 numbers (not 256 which is the number of all combinations of 8 bits).
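
A quick sketch of what happens with 130 specifically. Store130 and roundTrip are illustrative names; Byte.toUnsignedInt is the standard method for reading a byte's bits as 0..255.

```java
public class Store130 {
    // Casts to byte (keeping the low 8 bits), then reads those bits back as unsigned.
    static int roundTrip(int value) {
        byte stored = (byte) value;
        return Byte.toUnsignedInt(stored);
    }

    public static void main(String[] args) {
        // byte b = 130;      // does not compile: 130 is outside -128..127
        byte b = (byte) 130;  // the explicit cast keeps the bits 1000 0010
        System.out.println(b);               // -126
        System.out.println(roundTrip(130));  // 130
    }
}
```

So the 8 bits of 130 do fit in a byte; Java just reads them back as -126 unless you ask for the unsigned interpretation.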

In Java, a byte is a signed data type. You are thinking about unsigned bytes, in which case it is possible to store the value 130 in 8 bits. But with a signed data type, that also allows negative numbers, the first bit is necessary to indicate a negative number.
There are several ways to store negative numbers (sign-magnitude, one's complement and two's complement), but the most popular one by far is two's complement. The benefit of it is that for two's complement, most arithmetic operations do not need to take the sign of the number into account; they can work regardless of the sign.

The first bit indicates the sign of the number: When the first bit is 1, then the number is negative. When the first bit is 0, then the number is positive. So you basically only have 7 bits available to store the magnitude of the number. (Using a small trick, this magnitude is shifted by 1 for negative numbers - otherwise, there would be two different bit patterns for "zero", namely 00000000 and 10000000).
When you want to store a number like 130, whose binary representation is 10000010, then it will be interpreted as a negative number, due to the first bit being 1.
Also see http://en.wikipedia.org/wiki/Two%27s_complement , where the trick of how the magnitude is shifted is explained in more detail.

Unexpected endless byte for loop

I have the following loop:
for (byte i = 0 ; i < 128; i++) {
System.out.println(i + 1 + " " + name);
}
When I execute my program it prints all numbers from -128 to 127 in an infinite loop. Why does this happen?

byte is a 1-byte type so can vary between -128...127, so condition i < 128 is always true. When you add 1 to 127 it overflows and becomes -128 and so on in a (infinite) loop...

After 127, when it increments, it will become -128, so your condition i < 128 never becomes false.
byte: The byte data type is an 8-bit signed two's complement integer. It has a minimum value of -128 and a maximum value of 127 (inclusive). The byte data type can be useful for saving memory in large arrays, where the memory savings actually matters. They can also be used in place of int where their limits help to clarify your code; the fact that a variable's range is limited can serve as a form of documentation.
It will work like this:
0, 1, 2, ..., 126, 127, -128, -127, ...
as 8 bits can represent a signed number up to 127.
See here for the primitive data types.

Because bytes are signed in Java, they will always be less than 128.
Why Java chose signed bytes is a mystery from the depths of time. I've never been able to understand why they corrupted a perfectly good unsigned data type :-)
Try this instead:
for (byte i = 0 ; i >= 0; ++i ) {
or, better yet:
for (int i = 0 ; i < 128; ++i ) {
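
Both variants can be tried out in a complete program. This is only a sketch; ByteLoopDemo and stepsUntilWrap are names invented for the example.

```java
public class ByteLoopDemo {
    // Counts how often a byte can be incremented from 0 before it wraps to a negative value.
    static int stepsUntilWrap() {
        int steps = 0;
        for (byte i = 0; i >= 0; i++) {
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(stepsUntilWrap());  // 128

        // The int version: the counter never overflows, and each value
        // still fits in a byte if one is needed inside the loop.
        for (int i = 0; i < 128; i++) {
            byte asByte = (byte) i;
            if (i == 127) {
                System.out.println(asByte);    // 127
            }
        }
    }
}
```

The i >= 0 loop terminates only because the increment from 127 lands on -128, which is why the int counter is the easier one to read.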

Because when i == 127 and you execute i++, it overflows to -128.

The type byte has a range of -128..127. So i is always less than 128.

Alright, so the reason behind this has been answered already, but in case you were interested in some background:
A bit is the smallest unit of storage which the computer can recognize (n.b. not the smallest number). A bit is either a 0 or a 1.
A byte is an 8-bit data type, meaning it is composed of 8-bit strings such as 10101010 or 00001110. Using simple combinatorics, we know that there are 2^8 = 256 possible bit patterns for a byte.
If we wanted to only represent positive numbers, we could do a straight conversion from base 2 to base 10. The way that works is, for a bit string b7b6b5b4b3b2b1b0 the number in decimal is dec = sum from n=0 to 7 of (bn * 2^n).
By only representing positive numbers ( an unsigned byte) we can represent 256 possible numbers in the range 0 to 255 inclusive.
The problem comes in when we want to represent signed data. A naive approach (n.b. this is for background, not the way java does it) is to take the leftmost bit and make it the sign bit, where 1 is negative and 0 is positive. So for example 00010110 would be 22 and 10010110 would be -22.
There are two major problems with such a system. The first is that 00000000 is 0 and 10000000 is -0, but as everyone knows, there is no -0 that is somehow different from 0, yet such a system stores them as two distinct bit patterns. The second problem is that, because two patterns are spent on zero, the system can only represent numbers from -127 to 127, which is 255 distinct values (1 fewer than before).
A much better system (and the one which most systems use) is called Two's Complement. In Two's Complement, the positive numbers are represented with their normal bit string where the leftmost bit is 0. Negative numbers are represented with the leftmost bit as a 1, obtained by calculating the two's complement of the positive number (whence the system gets its name).
Although mathematically it is a slightly more complex process, because we are dealing with the number 2 there are some shortcuts. Essentially, you can take the positive version and (from right to left) take all zeroes until you hit a 1. Copy those zeroes and that one, then take the NOT of the rest of the bits. So for example, to get -22: positive 22 is 00010110, so we keep the trailing 10 and NOT the rest to get 11101010, the two's complement representation of -22.
Two's Complement is a more difficult system to grasp, but it avoids the previously stated problems, and an n-bit number can represent all integers from -2^(n-1) to 2^(n-1)-1, which for our byte means -128 to 127 (hence the problem in this question).
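
The negation rule can be checked mechanically. The sketch below uses the equivalent 'flip every bit, then add 1' formulation rather than the right-to-left shortcut; TwosComplementDemo, bits and negate are names chosen for the example.

```java
public class TwosComplementDemo {
    // Renders the 8 bits of a byte as a string of 0s and 1s.
    static String bits(byte b) {
        String s = Integer.toBinaryString(b & 0xFF);
        return String.format("%8s", s).replace(' ', '0'); // left-pad to 8 digits
    }

    // Negates by the textbook rule: flip every bit, then add 1.
    static byte negate(byte b) {
        return (byte) (~b + 1);
    }

    public static void main(String[] args) {
        byte x = 22;
        System.out.println(bits(x));          // 00010110
        System.out.println(bits(negate(x)));  // 11101010
        System.out.println(negate(x));        // -22
    }
}
```

Applying negate twice returns the original value, which matches the claim that the rule is its own inverse.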
A couple of notes:
- This is for integer representation only. Real number representation is another system entirely (if there is a request for it, I'm sure we could make a number representation CW post)
- Wikipedia has a couple more number representation systems if you're interested.

Best is if you do
for (byte i = 0 ; i < Byte.MAX_VALUE; i++ ) {
System.out.println( i + 1 + " " + name );
}

this should work
for (byte i = 0 ; i<128; ++i ) {
if(i==-128)
break;
System.out.println( i + 1 + " " + "name" );
}
