Endian in networking and multi platform - clarification? - java

I have this code in Java:
Socket socket = new Socket("127.0.0.1", 10);
OutputStream os = socket.getOutputStream();
int data = 50000;
// ByteBuffer is big-endian by default, so this writes the bytes 00 00 C3 50
os.write(ByteBuffer.allocate(4).putInt(data).array());
And this in C#:
byte[] ba = readint(networkStream);
networkStream.Flush();
if (BitConverter.IsLittleEndian) Array.Reverse(ba);
int i = BitConverter.ToInt32(ba, 0); //50000
It all works fine, but I saw an image (not reproduced here), and the part that interested me was the "Network Order" label under Big Endian.
I read on Wikipedia that:
Many IETF RFCs use the term network order, meaning the order of
transmission for bits and bytes over the wire in network protocols.
Among others, the historic RFC 1700 (also known as Internet standard
STD 2) has defined its network order to be big endian, though not all
protocols do.
Question
If Java uses big endian and TCP also uses "network order" (big endian), why did I have to check the endianness in my C# code?
I mean, I could just do:
Array.Reverse(ba);
without the check: if (BitConverter.IsLittleEndian) Array.Reverse(ba);
Am I right?
If so, what about the case that some unknown source sends me data and I don't know if he sent it big or little endian? He would have to send me a first byte to indicate the order, right? But the first byte is also subject to endianness... Where is my misunderstanding?

You are already assuming that the data is a 32-bit integer in big endian. You could also assume that BitConverter defaults to little endian (or you could change it to big endian and not reverse the bytes in the first place). BTW, ByteBuffer supports little endian too.
You could also send little endian:
os.write(ByteBuffer.allocate(4)
.order(ByteOrder.LITTLE_ENDIAN)
.putInt(data)
.array());
What about the case that some unknown source sends me data and I don't know if he sent it big or small endian ?
Then you don't know how to decode it, for sure. You can guess if you have enough data.
He would have to send me a first byte to indicate right ?
She could, but you would have to know that the first byte tells you the order, and how to interpret that first byte. It is simpler to agree on a given byte order.
but the first byte is also subject to endianness
So imagine a byte is written in little or big endian. How would it be any different? ;)
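To make the answer concrete, here is a minimal Java sketch, assuming both sides simply agree that the wire order is big endian. The class and method names are hypothetical. The decoder names the wire order explicitly and never consults the host's native order, so it needs no runtime endianness check. C#'s BitConverter, by contrast, follows the host's order, which is what the IsLittleEndian test in the question accounts for.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class, for illustration only.
final class WireInt {
    // Encode an int in big-endian ("network") order: most significant byte first.
    static byte[] encode(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();
    }

    // Decode with an explicitly stated wire order; the host's native order plays no role.
    static int decode(byte[] wire, ByteOrder order) {
        return ByteBuffer.wrap(wire).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] wire = encode(50000);
        System.out.println(decode(wire, ByteOrder.BIG_ENDIAN));    // the value that was sent
        System.out.println(decode(wire, ByteOrder.LITTLE_ENDIAN)); // a different value: wrong order assumed
    }
}
```

A single byte has no byte order, so a one-byte order marker would itself be unambiguous. The receiver would still have to know in advance that the marker exists and what its values mean.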

Related

What is meant by reading data in a machine-independent way?

The DataInputStream class's documentation says that it is able to read data in a machine-independent way. What exactly does that mean?
Does it mean it will receive the exact same data regardless of what programming language the other end of the Socket communication is written in?
Is DataInputStream useful for reading primitives sent from an application written in another language (e.g. if a C application is sending and a Java application is receiving)? If not, which class would be?
The DataInputStream works on a sequence of bytes. When it reads larger values from such a sequence, it uses a fixed interpretation. For example, when reading an int, which requires 4 bytes, it reads them in big-endian format. That is, if the byte stream contains 0x01 0x23 0x45 0x67, the DataInputStream will read this as the integer 0x01234567.
In short, it uses a fixed Endianness instead of relying on the endianness of the platform.
In addition, it defines the exact size and representation of several data types whose sizes depend on the execution environment in other programming languages. For example, in C the type int is at least 16 bits wide, while Java defines it as exactly 32 bits wide, and DataInputStream reads it accordingly.
The DataInputStream is great when you need to exchange data between Java programs. If you need to exchange data between different programming languages, you should use another library that is implemented in all involved programming languages. Maybe Google's protobuf. Or if your data is text data, use JSON or XML.
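The fixed interpretation described above can be checked with a short sketch. The class name and the helper method are hypothetical, and an in-memory stream stands in for the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class, for illustration only.
final class FixedOrderRead {
    // Reads one int the way DataInputStream always does: four bytes, high byte first.
    static int readInt(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] stream = {0x01, 0x23, 0x45, 0x67};
        // Same result on any platform, whatever the CPU's native byte order.
        System.out.println(Integer.toHexString(readInt(stream))); // prints 1234567
    }
}
```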

Sending double with tcp from java to C#

I have a Java SocketServer that sends doubles to a C# client. The server sends the doubles with DataOutputStream.writeDouble() and the client reads the double with BinaryReader.ReadDouble().
When I send dos.writeDouble(0.123456789); and flush it from the server the client reads and outputs 3.1463026401691E+151 which is different from what I sent.
Are the C# and Java doubles each encoded differently?
In Java, DataOutputStream.writeDouble() converts the double to a long before sending, writing it High byte first (Big endian).
However, in C#, BinaryReader.ReadDouble() reads in little-endian format.
In other words: the byte order is different, and changing one of them should fix your problem.
The easiest way to change the byte order in Java from big to little endian is to use a ByteBuffer, where you can specify the byte order, e.g.:
ByteBuffer buffer = ByteBuffer.allocate(yourvaluehere);
buffer.order(ByteOrder.LITTLE_ENDIAN);
// add stuff to the buffer
byte[] bytes = buffer.array();
Then, use DataOutputStream.write()
The issue is in fact with the encoding, specifically endianness. Java uses big-endian format, which is the standard network byte order, while your C# client is using little-endian format.
So here is what happened: 0.123456789 is stored in the IEEE754 Double precision format as 0x3FBF9ADD3739635F. When this is read in C#, the byte order is switched, so it is stored as 0x5F633937DD9ABF3F. This corresponds to the decimal number 3.14630264016909969143315814746e151.
Take a look at this question to see about reversing the byte order on the C# client side.
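Both answers can be illustrated with one hedged Java sketch. The class and method names are hypothetical. The misread method reproduces what a reader sees when it assumes the opposite byte order, and encode produces the little-endian bytes that BinaryReader.ReadDouble() expects.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class, for illustration only.
final class DoubleWire {
    // Encode a double as 8 bytes in the given byte order.
    static byte[] encode(double value, ByteOrder order) {
        return ByteBuffer.allocate(Double.BYTES).order(order).putDouble(value).array();
    }

    // What a reader sees if it assumes the opposite byte order from the writer.
    static double misread(double value) {
        long bits = Double.doubleToLongBits(value);
        return Double.longBitsToDouble(Long.reverseBytes(bits));
    }

    public static void main(String[] args) {
        double sent = 0.123456789;
        System.out.println(misread(sent)); // not the value that was sent

        // Little-endian bytes, suitable for passing to DataOutputStream.write(byte[]).
        byte[] little = encode(sent, ByteOrder.LITTLE_ENDIAN);
        System.out.println(ByteBuffer.wrap(little).order(ByteOrder.LITTLE_ENDIAN).getDouble());
    }
}
```

Fixing the order on the sending side keeps the C# client unchanged. Reversing the eight bytes on the client before converting them would work equally well, as long as only one side does it.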

What integer format when reading binary data from Java DataOutputStream in PHP?

I'm aware that this is probably not the best idea but I've been playing around trying to read a file in PHP that was encoded using Java's DataOutputStream.
Specifically, in Java I use:
dataOutputStream.writeInt(number);
Then in PHP I read the file using:
$data = fread($handle, 4);
$number = unpack('N', $data);
The strange thing is that the only format character in PHP that gives the correct value is 'N', which is supposed to represent "unsigned long (always 32 bit, big endian byte order)". I thought that int in Java was always signed?
Is it possible to reliably read data encoded in Java in this way or not? In this case the integer will only ever need to be positive. It may also need to be quite large so writeShort() is not possible. Otherwise of course I could use XML or JSON or something.
This is fine, as long as you don't need that extra bit. l (instead of N) would work on a big endian machine.
Note, however, that the maximum number that you can store is 2,147,483,647 unless you want to do some math on the Java side to get the proper negative integer to represent the desired unsigned integer.
Note that a signed Java integer uses the two's complement method to represent a negative number, so it's not as easy as flipping a bit.
DataOutputStream.writeInt:
Writes an int to the underlying output stream as four bytes, high byte
first.
The formats available for the unpack function for signed integers all use machine dependent byte order. My guess is that your machine uses a different byte order than Java. If that is true, the DataOutputStream + unpack combination will not work for any signed primitive.
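The "math on the Java side" mentioned above can be sketched as follows, assuming the value is held in a long and must fit in 32 unsigned bits. The class and method names are hypothetical. The narrowing cast keeps the low 32 bits, so values above 2,147,483,647 become negative ints in Java but produce exactly the four bytes that an unsigned big-endian reader such as unpack('N', ...) expects.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class, for illustration only.
final class UnsignedInt32 {
    // Write a value in [0, 2^32 - 1] as four bytes, high byte first.
    static byte[] write(long unsignedValue) {
        if (unsignedValue < 0 || unsignedValue > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("out of unsigned 32-bit range");
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt((int) unsignedValue); // the cast keeps the low 32 bits unchanged
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reassemble four bytes as an unsigned big-endian value, as an 'N'-style reader would.
    static long readUnsigned(byte[] b) {
        return ((b[0] & 0xFFL) << 24) | ((b[1] & 0xFFL) << 16) | ((b[2] & 0xFFL) << 8) | (b[3] & 0xFFL);
    }

    public static void main(String[] args) {
        long big = 3_000_000_000L;        // larger than Integer.MAX_VALUE
        System.out.println((int) big);    // negative when viewed as a signed Java int
        System.out.println(readUnsigned(write(big))); // the original unsigned value
    }
}
```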

Network communication between C# server and JAVA client: how to handle that transition?

I have a C# server that cannot be altered. In C#, a byte ranges from 0 - 255, while in JAVA it ranges from -128 to 127.
I have read about the problem with unsigned byte/ints/etc and the only real option as I have found out is to use "more memory" to represent the unsigned thing:
http://darksleep.com/player/JavaAndUnsignedTypes.html
Is that really true?
So when having network communication between the JAVA client and the C# server, the JAVA client receives byte arrays from the server. The server sends them "as unsigned" but when received they will be interpreted as signed bytes, right?
Do I then have to typecast each byte into an Int and then add 127 to each of them?
I'm not sure here... but how do I interpret it back to the same values (int, strings etc) as I had on the C# server?
I find this whole situation extremely messy (not to mention the endianness problems, but that's for another post).
A byte is 0-255 in C#, not 0-254.
However, you really don't need to worry in most cases - basically in both C# and Java, a byte is 8 bits. If you send a byte array from C#, you'll receive the same bits in Java and vice versa. If you then convert parts of that byte array to 32-bit integers, strings etc, it'll all work fine. The signed-ness of bytes in Java is almost always irrelevant - it's only if you treat them numerically that it's a problem.
What's more of a problem is the potential for different endianness when (say) converting a 32-bit integer into 4 bytes in Java and then reading the data in C# or vice versa. You'd have to give more information about what you're trying to do in order for us to help you there. In particular do you already have a protocol that you need to adhere to? If so, that pretty much decides what you need to do - and how hard it will be depends on what the protocol is.
If you get to choose the protocol, you may wish to use a platform-independent serialization format such as Protocol Buffers which is available for both .NET and Java.
Unfortunately, yes ... the answer is to "use more memory", at least on some level.
You can store the data as a byte array in java, but when you need to use that data numerically you'll need to move up to an int and add 256 to negative values. A bitwise & will do this for you quickly and efficiently.
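// "bytes" is the received byte[] ("byte" itself is a reserved word in Java).
// The mask maps -128..-1 to 128..255 and leaves 0..127 unchanged, so no branch is needed.
int foo = bytes[3] & 0xFF;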
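As a slightly fuller sketch of the same idea, the hypothetical helper below converts received bytes to their unsigned values. Byte.toUnsignedInt (available since Java 8) does the same masking.

```java
// Hypothetical helper class, for illustration only.
final class UnsignedBytes {
    // Interpret a signed Java byte as the 0..255 value the C# side sent.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    // Convert a whole received buffer; this is where the "more memory" is spent.
    static int[] toUnsigned(byte[] received) {
        int[] out = new int[received.length];
        for (int i = 0; i < received.length; i++) {
            out[i] = toUnsigned(received[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte fromWire = (byte) 0xFF;                          // bits 11111111
        System.out.println(fromWire);                         // prints -1 (signed view)
        System.out.println(toUnsigned(fromWire));             // prints 255 (unsigned view)
        System.out.println(Byte.toUnsignedInt(fromWire));     // prints 255 (library equivalent)
    }
}
```

The conversion is only needed when a single byte is used as a number. Assembling ints or decoding strings from the array works on the bits directly and is unaffected by the sign.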

Java - Sending unsigned bytes through TCP connection

Java bytes are signed values, and I'm trying to establish a TCP socket connection with a C# program that expects the bytes to be unsigned.
I am not able to change the code on the C# portion.
How can I go about sending these bytes in the correct format using Java?
Thanks
No, Java bytes are signed values. In general C# bytes are unsigned. (You'd need the sbyte type to refer to signed bytes; I can't remember the last time I used sbyte.)
However, it shouldn't matter at all in terms of transferring data across the wire - normally you just send across whatever binary data you've got (e.g. what you've read from a file) and both sides will do the right thing. A byte with value -1 on the Java side will come through as a byte with value 255 on the C# side.
If you can tell us more about exactly what you're trying to do (what the data is) we may be able to help more, but I strongly suspect you can just ignore the difference in this case.
It doesn't matter. The numbers are just strings of bits and the fact that they're signed or unsigned doesn't matter as far as the bits are concerned. And it's the bits that are transferred so all that signed/unsigned information is irrelevant.
1100101010101111 is still 1100101010101111 regardless of signed/unsigned.
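For the sending direction, a minimal sketch (the class and method names are hypothetical): values in the range 0..255 are narrowed to Java bytes with a cast, which keeps the low 8 bits, and the resulting array can be handed to OutputStream.write(byte[]) unchanged.

```java
// Hypothetical helper class, for illustration only.
final class SendUnsigned {
    // Pack values in 0..255 into a byte[] ready for OutputStream.write(byte[]).
    static byte[] pack(int... values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            if (values[i] < 0 || values[i] > 255) {
                throw new IllegalArgumentException("not an unsigned byte: " + values[i]);
            }
            out[i] = (byte) values[i]; // the cast keeps the low 8 bits; 255 is stored as -1
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] payload = pack(0, 127, 128, 255);
        System.out.println(payload[3]);        // prints -1 in Java
        System.out.println(payload[3] & 0xFF); // prints 255, the value the C# side reads
    }
}
```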
