Does Java read integers in little endian or big endian? - java

I ask because I am sending a byte stream from a C process to Java. On the C side the 32 bit integer has the LSB is the first byte and MSB is the 4th byte.
So my question is: On the Java side when we read the byte as it was sent from the C process, what is endian on the Java side?
A follow-up question: If the endian on the Java side is not the same as the one sent, how can I convert between them?

Use the network byte order (big endian), which is the same as Java uses anyway. See man htons for the different translators in C.

I stumbled here via Google and got my answer that Java is big endian.
Reading through the responses I'd like to point out that bytes do indeed have an endian order, although mercifully, if you've only dealt with “mainstream” microprocessors you are unlikely to have ever encountered it as Intel, Motorola, and Zilog all agreed on the shift direction of their UART chips and that MSB of a byte would be 2**7 and LSB would be 2**0 in their CPUs (I used the FORTRAN power notation to emphasize how old this stuff is :) ).
I ran into this issue with some Space Shuttle bit serial downlink data 20+ years ago when we replaced a $10K interface hardware with a Mac computer. There is a NASA Tech brief published about it long ago. I simply used a 256 element look up table with the bits reversed (table[0x01]=0x80 etc.) after each byte was shifted in from the bit stream.

There are no unsigned integers in Java. All integers are signed and in big endian.
On the C side the each byte has tne LSB at the start is on the left and the MSB at the end.
It sounds like you are using LSB as Least significant bit, are you? LSB usually stands for least significant byte.
Endianness is not bit based but byte based.
To convert from unsigned byte to a Java integer:
int i = (int) b & 0xFF;
To convert from unsigned 32-bit little-endian in byte[] to Java long (from the top of my head, not tested):
long l = (long)b[0] & 0xFF;
l += ((long)b[1] & 0xFF) << 8;
l += ((long)b[2] & 0xFF) << 16;
l += ((long)b[3] & 0xFF) << 24;

There's no way this could influence anything in Java, since there's no (direct non-API) way to map some bytes directly into an int in Java.
Every API that does this or something similar defines the behaviour pretty precisely, so you should look up the documentation of that API.

I would read the bytes one by one, and combine them into a long value. That way you control the endianness, and the communication process is transparent.

Imho there is no endianness defined for java. The endianness is the one of the hardware but java is highlevel and hides the hardware so you don't have to wory about that.
The only endianess related feature is how the java lib maps int and long to byte[] (and inversely). It does it Big-Endian which is the most readable and natural:
int i=0xAABBCCDD
maps to
byte[] b={0xAA,0xBB,0xCC,0xDD}

If it fits the protocol you use, consider using a DataInputStream, where the behavior is very well defined.

Java is 'Big-endian' as noted above. That means that the MSB of an int is on the left if you examine memory (on an Intel CPU at least). The sign bit is also in the MSB for all Java integer types.
Reading a 4 byte unsigned integer from a binary file stored by a 'Little-endian' system takes a bit of adaptation in Java. DataInputStream's readInt() expects Big-endian format.
Here's an example that reads a four byte unsigned value (as displayed by HexEdit as 01 00 00 00) into an integer with a value of 1:
// Declare an array of 4 shorts to hold the four unsigned bytes
short[] tempShort = new short[4];
for (int b = 0; b < 4; b++) {
tempShort[b] = (short)dIStream.readUnsignedByte();
}
int curVal = convToInt(tempShort);
// Pass an array of four shorts which convert from LSB first
public int convToInt(short[] sb)
{
int answer = sb[0];
answer += sb[1] << 8;
answer += sb[2] << 16;
answer += sb[3] << 24;
return answer;
}

java force indeed big endian : https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-2.html#jvms-2.11

Related

Get least significant bytes from an integer

I need to sum all data bytes in ByteArrayOutputStream, adding +1 to the result and taking the 2 least significant bytes.
int checksum = 1;
for(byte b : byteOutputStream.toByteArray()) {
checksum += b;
}
Any input on taking the 2 least significant bytes would be helpful. Java 8 is used in the environment.
If you really mean least significant bytes then:
checksum & 0xFFFF
If you meant that you want to take least significant bits from checksum, then:
checksum & 0x3
Add
checksum &= 0x0000ffff;
That will zero out everything to the left of the 2 least significant bytes.
Your question is a bit underspecified. You didn’t say neither, what you want to do with these two bytes nor how you want to store them (which depends on what you want to do).
To get to individual bytes, you can use
byte lowest = (byte)checksum, semiLowest=(byte)(checksum>>8);
In case you want to store them in a single integer variable, you have to decide, how these bytes are to be interpreted numerically, i.e signed or unsigned.
If you want a signed interpretation, the operation is as simple as
short lowest2bytes = (short)checksum;
If you want an unsigned interpretation, there’s the obstacle that Java has no dedicated type for that. There is a 2 byte sized unsigned type (char), but using it for numerical values can cause confusion when other code tries to interpret it as character value (i.e. when printing). So in that case, the best solution is to use an int variable again and only initialize it with the unsigned char value:
int lowest2bytes = (char)checksum;
Note that this is semantically equivalent to
int lowest2bytes = checksum&0xffff;
seen in other solutions.

Byte buffer in Java?

Since I found out that it's impossible to have unsigned bytes in java, and that essentially they take up the same memory as an int, is there really any difference when you send them over a packet?
If I send a Java byte via tcp or udp via(Games.RealTimeMultiplayer.sendReliableMessage) would that be more beneficial to me than just sending an integer to represent an unsigned byte?
Since I found out that it's impossible to have unsigned bytes in java
This is incorrect. There are lots of ways. You can even use byte to represent an unsigned byte. You just need to perform a mapping in places that require it; e.g.
byte b = ...
int unsigned = (b >= 0) ? b : (b + 256);
... and that essentially they take up the same memory as an int.
That is also incorrect. It is true for a byte variable or field. However, the bytes in a byte array occupy 1/4 of the space of integers in an int array.
... is there really any difference when you send them over a packet?
Well yes. A byte sent over the network (in the natural fashion) takes 1/4 of the number of bits as an int sent over the network. If you are sending an "8 bit quantity" as 32 bits, then you are wasting network bandwidth.

Java - byte array arithmetics on bit-level?

I'm using byte arrays (of size 2 or 4) to emulate the effect of short and int data types.
Main idea is to have a data type that support both char and int types, however it is really hard for me to emulate arithmetic operations in this way, since I must do them in bit level.
For those who do not follow:
The int representation of 123 is not equal to the byte[] of {0,1,2,3} since their bit representations differ (123 representation is 00000000000000000000000001111011 and the representation of {0,1,2,3} is 00000000000000010000001000000011 on my system.
So "int of 123" would actually be equivalent to "byte[] of {0,0,0,123}". The problems occur when values stretch over several bytes and I try to subtract or decrement from those byte arrays, since then you have to interact with several different bytes and my math isn't that sharp.
Any pseudo-code or java library suggestions would be welcome.
Unless you really want to know what bits are being carried from one byte to the next, I'd suggest don't do it this way! If it's just plain math, then convert your arrays to real short and int types, do the math, then convert them back again.
If you must do it this way, consider the following:
Imaging you're adding two short variables that are in byte arrays.
The first problem you have is that all Java integer types are signed.
The second is that the "carry" from the least-significant-byte into the most-significant-byte is best done using a type that's longer than a byte because otherwise you can't detect the overflow.
i.e. if you add two 8-bit values, the carry will be in bit 8. But a byte only has bits 0..7, so to calculate bit 8 you have to promote your bytes to the next appropriate larger type, do the add operation, then figure out if it resulted in a carry, and then handle that when you add up the MSB. It's just not worth it.
BTW, I did actually have to do this sort of bit manipulation many years ago when I wrote an MC6809 CPU emulator. It was necessary to perform multiple operations on the same operands just to be able to figure out the effect on the CPU's various status bits, when those same bits are generated "for free" by a hardware ALU.
For example, my (C++) code to add two 8-bit registers looked like this:
void mc6809::help_adc(Byte& x)
{
Byte m = fetch_operand();
{
Byte t = (x & 0x0f) + (m & 0x0f) + cc.bit.c;
cc.bit.h = btst(t, 4); // Half carry
}
{
Byte t = (x & 0x7f) + (m & 0x7f) + cc.bit.c;
cc.bit.v = btst(t, 7); // Bit 7 carry in
}
{
Word t = x + m + cc.bit.c;
cc.bit.c = btst(t, 8); // Bit 7 carry out
x = t & 0xff;
}
cc.bit.v ^= cc.bit.c;
cc.bit.n = btst(x, 7);
cc.bit.z = !x;
}
which requires that three different additions get done on different variations of the operands just to extract the h, v and c flags.

bit operations in java, compared to c

i am trying to convert a c++ software in java, however the bit operations don't produce the same results.
overview of what i am doing:
there's an ascii file with data entries, 2 bytes long, unsigned (0-65535). Now i want to convert the two-byte unsigned int in two one-byte unsigned short ints.
C++ code:
signed char * pINT8;
signed char ACCBuf[3];
UInt16 tempBuf[128];
tempBuf[0] = Convert::ToUInt16(line);
pINT8 = (signed char *)&tempBuf[0];
ACCBuf[0] = *pINT8;
pINT8++;
ACCBuf[1] = *pINT8;
Java code:
int[] ACCBuf = new int[6];
int[] tempBuf = new int[128];
tempBuf[0] = Integer.parseInt(line);
ACCBuf[0] = tempBuf[0] >> 8;
ACCBuf[1] = 0x00FF & tempBuf[0];
this two codes produce different results.
any idea why?
This might depend on the endianess of the system. The C++ code has the lower byte in ACCBUF[0], if it is a little endian system. The Java code has the upper byte in ACCBUF[0], no matter what hardware.
If you want to get the same result in Java, you must swap the high and low byte
ACCBuf[0] = 0x00FF & tempBuf[0];
ACCBuf[1] = tempBuf[0] >> 8;
now you will have the same bits in either Java or C++.
Another difference between the two code snippets are the types used. You have 32 bit ints in the Java code and 16 bit unsigned ints respectively 8 bit chars in C++. This isn't relevant here, but must be kept in mind, when comparing different code snippets.

Unsigned short in Java

How can I declare an unsigned short value in Java?
You can't, really. Java doesn't have any unsigned data types, except char.
Admittedly you could use char - it's a 16-bit unsigned type - but that would be horrible in my view, as char is clearly meant to be for text: when code uses char, I expect it to be using it for UTF-16 code units representing text that's interesting to the program, not arbitrary unsigned 16-bit integers with no relationship to text.
If you really need a value with exactly 16 bits:
Solution 1: Use the available signed short and stop worrying about the sign, unless you need to do comparison (<, <=, >, >=) or division (/, %, >>) operations. See this answer for how to handle signed numbers as if they were unsigned.
Solution 2 (where solution 1 doesn't apply): Use the lower 16 bits of int and remove the higher bits with & 0xffff where necessary.
This is a really stale thread, but for the benefit of anyone coming after. The char is a numeric type. It supports all of the mathematical operators, bit operations, etc. It is an unsigned 16.
We process signals recorded by custom embedded hardware so we handle a lot of unsigned 16 from the A-D's. We have been using chars all over the place for years and have never had any problems.
You can use a char, as it is an unsigned 16 bit value (though technically it is a unicode character so could potnetially change to be a 24 bit value in the future)... the other alternative is to use an int and make sure it is within range.
Don't use a char - use an int :-)
And here is a link discussing Java and the lack of unsigned.
From DataInputStream.java
public final int readUnsignedShort() throws IOException {
int ch1 = in.read();
int ch2 = in.read();
if ((ch1 | ch2) < 0)
throw new EOFException();
return (ch1 << 8) + (ch2 << 0);
}
It is not possible to declare a type unsigned short, but in my case, I needed to get the unsigned number to use it in a for loop. There is the method toUnsignedInt in the class Short that returns "the argument converted to int by an unsigned conversion":
short signedValue = -4767;
System.out.println(signedValue ); // prints -4767
int unsignedValue = Short.toUnsignedInt(signedValue);
System.out.println(unsingedValue); // prints 60769
Similar methods exist for Integer and Long:
Integer.toUnsignedLong
Long.toUnsignedString : In this case it ends up in a String because there isn't a bigger numeric type.
No such type in java
Yep no such thing if you want to use the value in code vs. bit operations.
"In Java SE 8 and later, you can use the int data type to represent an unsigned 32-bit integer, which has a minimum value of 0 and a maximum value of 232-1." However this only applies to int and long but not short :(
If using a third party library is an option, there is jOOU (a spin off library from jOOQ), which offers wrapper types for unsigned integer numbers in Java. That's not exactly the same thing as having primitive type (and thus byte code) support for unsigned types, but perhaps it's still good enough for your use-case.
import static org.joou.Unsigned.*;
// and then...
UShort s = ushort(1);
(Disclaimer: I work for the company behind these libraries)
No, really there is no such method, java is a high-level language. That's why Java doesn't have any unsigned data types.
He said he wanted to create a multi-dimensional short array. Yet no one suggested bitwise operators? From what I read you want to use 16 bit integers over 32 bit integers to save memory?
So firstly to begin 10,000 x 10,000 short values is 1,600,000,000 bits, 200,000,000 bytes, 200,000 kilobytes, 200 megabytes.
If you need something with 200MB of memory consumption you may want to redesign this idea. I also do not believe that will even compile let alone run. You should never initialize large arrays like that if anything utilize 2 features called On Demand Loading and Data Caching. Essentially on demand loading refers to the idea to only load data as it is needed. Then data caching does the same thing, but utilizes a custom frame work for delete old memory and adding new information as needed. This one is tricky to have GOOD speed performance. There are other things you can do, but those two are my favorite when done right.
Alright back to what I was saying about bitwise operators.
So a 32bit integer or in Java "int". You can store what are called "bits" to this so let's say you had 32 Boolean values which in Java all values take up 32 bits (except long) or for arrays they take up 8 for byte, 16 for short, and 32 for int. So unless you have arrays you don't get any memory benefits from using a byte or short. This does not mean you shouldn't use it as its a way to ensure you and others know the data range this value should have.
Now as I was saying you could effectively store 32 Booleans into a single integer by doing the following:
int many_booleans = -1; //All are true;
int many_booleans = 0; //All are false;
int many_booleans = 1 | 2 | 8; //Bits 1, 2, and 4 are true the rest are false;
So now a short consists of 16 bits so 16 + 16 = 32 which fits PERFECTLY within a 32bit integer. So every int value can consist of 2 short values.
int two_shorts = value | (value2 << 16);
So what the above is doing is value is something between -32768 and 32767 or as an unsigned value 0 - 65535. So let's say value equaled -1 so as an unsigned value it was 65535. This would mean bits 1 through 16 are turned on, but when actually performing the math consider the range 0 - 15.
So we need to then activate bits 17 - 32. So we must begin at something larger than 15 bits. So we begin at 16 bits. So by taking value2 and multiplying it by 65536 which is what "<< 16" does. We now would have let's say value2 equaled 3 it would be OR'd 3x65536 = 196608. So our integer value would equal 262143.
int assumed_value = 262143;
so let's say we want to retrieve the two 16bit integer values.
short value1 = (short)(assumed_value & 0xFFFF); //-1
short value2 = (short)(assumed_value >> 16); //=3
Also basically think of bitwise operators as powers of 2. That is all they really are. Never look at it terms of 0's and 1's. I mostly posted this to assist anyone who may come across this searching for unsigned short or even possibly multi-dimensional arrays. If there are any typo's I apologize quickly wrote this up.
Java does not have unsigned types. What do you need it for?
Java does have the 'byte' data type, however.
You can code yourself up a ShortUnsigned class and define methods for those operators you want. You won't be able to overload + and - and the others on them, nor have implicit type conversion with other primitive or numeric object types, alas.
Like some of the other answerers, I wonder why you have this pressing need for unsigned short that no other data type will fill.
Simple program to show why unsigned numbers are needed:
package shifttest;
public class ShiftTest{
public static void main(String[] args){
short test = -15000;
System.out.format ("0x%04X 0x%04X 0x%04X 0x%04X 0x%04X\n",
test, test>>1, test>>2, test>>3, test>>4);
}
}
results:
0xC568 0xFFFFE2B4 0xFFFFF15A 0xFFFFF8AD 0xFFFFFC56
Now for those that are not system types:
JAVA does an arithmetic shift because the operand is signed, however, there are cases where a logical shift would be appropriate but JAVA (Sun in particular), deemed it unnecessary, too bad for us on their short sightedness. Shift, And, Or, and Exclusive Or are limited tools when all you have are signed longer numbers. This is a particular problem when interfacing to hardware devices that talk "REAL" computer bits that are 16 bits or more. "char" is not guaranteed to work (it is two bytes wide now) but in several eastern gif based languages such as Chinese, Korean, and Japanese, require at least 3 bytes. I am not acquainted with the number need for sandscript style languages. The number of bytes does not depend on the programmer rather the standards committee for JAVA. So basing char as 16 bits has a downstream risk. To safely implement unsigned shorts JAVA, as special class is the best solution based on the aforementioned ambiguities. The downside of the class is the inability of overloading the mathematical operations for this special class. Many of the contributors for this thread of accurately pointed out these issues but my contribution is a working code example and my experience with 3 byte gifs languages in C++ under Linux.
//вот метод для получения аналога unsigned short
public static int getShortU(byte [] arr, int i ) throws Exception
{
try
{
byte [] b = new byte[2];
b[1] = arr[i];
b[0] = arr[i+1];
int k = ByteBuffer.wrap(b).getShort();
//if this:
//int k = ((int)b[0] << 8) + ((int)b[1] << 0);
//65536 = 2**16
if ( k <0) k = 65536+ k;
return k;
}
catch(Throwable t)
{
throw new Exception ("from getShort: i=" + i);
}
}

Categories

Resources