Representing signed byte in an unsigned byte variable - java

I apologize if the title of this question is not clear, but I cannot figure out the best way to describe my predicament in so few words.
I am writing a communication framework between Java and C# using sockets and byte-by-byte transfer of information.
I have run into an issue which has been confusing me for a good few hours now. As you probably know, Java's byte primitive type is signed, meaning it can store -128 to +127 when interpreted as an integer. C#, however, uses unsigned bytes, meaning that it stores 0-255.
This is where I am encountering the issue. If I need to send some bytes of information from my C# client to my Java server, I use the following code:
C#:
MemoryStream stream;
public void write(byte[] b, int off, int len) {
stream.Write(b, off, len);
}
Java:
DataInputStream in;
public int read(byte[] b, int off, int len) throws IOException{
return in.read(b, off, len);
}
As you can see, these are very similar pieces of code that, when used within their own languages, produce predictable results. However, due to the difference in signedness, they produce unusable data.
I.e. if I send 255 from my C# client to the Java server, I receive a value of -1 on the Java server. This is because both of those values are represented by the same 8 bits: 11111111
Ideally, to solve this problem I would use the following code, using sbyte, C#'s signed byte.
C#:
MemoryStream stream;
public void write(sbyte[] b, int off, int len) {
//Code to change sbyte into a byte but keeping it in the form in which java will understand
stream.Write(b, off, len);
}
I basically need to store Java's representation of a signed byte inside an unsigned C# byte in order to send that byte across to the server. I will also need to do this in reverse to get an sbyte out of a byte received from my Java server.
I have tried numerous ways to do this with no success. If anyone has any idea how I can go about this, I would be greatly appreciative.

You basically don't need to do anything except stop thinking about bytes as numbers. Think of them as 8 bits, and Java and C# are identical. It's rare that you really want to consider a byte as a magnitude - it's usually just binary data like an image, or perhaps encoded text.
If you want to send the byte 10100011 across from Java to C# or vice versa, just do it in the most natural way. The bits will be correct, even if the byte values will be different when you treat them as numbers.
It's not entirely clear what data you're actually trying to propagate, but in 99.9% of cases you can just treat the byte[] as opaque binary data, and transmit it without worrying.
If you do need to treat the bytes as magnitudes, you need to work out which range you want. It's easier to handle the Java range, as C# can support it with sbyte[]... but if you want the range 0-255, you just need to convert the byte to an int on the Java side and mask it to keep only the bottom 8 bits:
byte b = ...;
int unsigned = b & 0xff;
If you really need to treat byte[] as sbyte[] or vice versa on C#, you can use a little secret: even though C# doesn't allow you to convert between the two, the CLR does. All you need to do is go via a conversion of the reference to object to fool the C# compiler into thinking it might be valid - otherwise it thinks it knows best. So this executes with no exceptions:
byte[] x = new byte[] { 255 };
sbyte[] y = (sbyte[]) (object) x;
Console.WriteLine(y[0]); // -1
You can convert in the other direction in exactly the same way.
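As a minimal sketch of the Java side of this, the following shows that the same 8 bits can be viewed either as Java's signed value or as the 0-255 value a C# byte would show. The class and method names (ByteViews, toUnsigned, toSigned) are made up for illustration and are not part of any library.

```java
public class ByteViews {
    // Reinterprets a Java (signed) byte as the 0-255 value a C# byte would show.
    public static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    // Narrows a 0-255 value back to a Java byte; the low 8 bits are unchanged.
    public static byte toSigned(int value) {
        return (byte) value;
    }

    public static void main(String[] args) {
        byte received = (byte) 0b11111111;         // the bits C# sends for 255
        System.out.println(received);              // -1
        System.out.println(toUnsigned(received));  // 255
        System.out.println(toSigned(255));         // -1
    }
}
```

Nothing changes on the wire in either direction; only the numeric interpretation differs. On Java 8 and later, Byte.toUnsignedInt(b) does the same thing as the mask.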

Related

Migrate a int conversion from byte from C++ to Java

I have the following piece of code in C++
int magic;
stream.read(&magic, sizeof(magic));
Which stores the value of magic from an array of bytes.
I want to migrate it to Java, so far I have this:
int magic = stream[0];
But it is not working. I think that it is due to the length of the ints in Java and C++. Shall I use two bytes in the Java part to retrieve the proper magic number?
byte[] stream = ...
ByteBuffer buf = ByteBuffer.wrap(stream);
buf.order(ByteOrder.LITTLE_ENDIAN);
int magic = buf.getInt();
See ByteBuffer. A Java int is always 4 bytes and signed, and ByteBuffer uses BIG_ENDIAN byte order by default, so you may need to set the reversed order as above.
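A self-contained version of that idea might look like the sketch below. It assumes the C++ program ran on a little-endian machine with a 4-byte int (typical on x86, but not stated in the question); MagicReader and readMagic are illustrative names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class MagicReader {
    // Reads the first four bytes of data as a little-endian 32-bit int.
    public static int readMagic(byte[] data) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        // Sample bytes as a little-endian machine would store 0x12345678.
        byte[] stream = {0x78, 0x56, 0x34, 0x12};
        System.out.println(Integer.toHexString(readMagic(stream))); // 12345678
    }
}
```

If the data turns out to be big-endian, dropping the order(...) call is enough, since that is the ByteBuffer default.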

What integer format when reading binary data from Java DataOutputStream in PHP?

I'm aware that this is probably not the best idea but I've been playing around trying to read a file in PHP that was encoded using Java's DataOutputStream.
Specifically, in Java I use:
dataOutputStream.writeInt(number);
Then in PHP I read the file using:
$data = fread($handle, 4);
$number = unpack('N', $data);
The strange thing is that the only format character in PHP that gives the correct value is 'N', which is supposed to represent "unsigned long (always 32 bit, big endian byte order)". I thought that int in java was always signed?
Is it possible to reliably read data encoded in Java in this way or not? In this case the integer will only ever need to be positive. It may also need to be quite large so writeShort() is not possible. Otherwise of course I could use XML or JSON or something.
This is fine, as long as you don't need that extra bit. l (instead of N) would work on a big endian machine.
Note, however, that the maximum number that you can store is 2,147,483,647 unless you want to do some math on the Java side to get the proper negative integer to represent the desired unsigned integer.
Note that a signed Java integer uses the two's complement method to represent a negative number, so it's not as easy as flipping a bit.
DataOutputStream.writeInt:
Writes an int to the underlying output stream as four bytes, high byte
first.
The formats available for the unpack function for signed integers all use machine dependent byte order. My guess is that your machine uses a different byte order than Java. If that is true, the DataOutputStream + unpack combination will not work for any signed primitive.
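To make the Java side concrete, here is a small sketch that captures the bytes writeInt produces and shows the "math on the Java side" for values above 2,147,483,647. The class and method names are hypothetical; only DataOutputStream.writeInt and Integer.toUnsignedLong are real library calls.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WriteIntDemo {
    // Returns the four bytes DataOutputStream.writeInt emits for number.
    public static byte[] encode(int number) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(number);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return buffer.toByteArray();
    }

    // Narrows an unsigned 32-bit value (0 to 4294967295) to the int with the same bits.
    public static int fromUnsigned(long value) {
        return (int) value;
    }

    public static void main(String[] args) {
        int asInt = fromUnsigned(3000000000L);
        System.out.println(asInt);                          // -1294967296
        System.out.println(Integer.toUnsignedLong(asInt));  // 3000000000
    }
}
```

Because the bytes are written high byte first regardless of sign, a reader that interprets them as an unsigned big-endian 32-bit value (such as PHP's 'N') should recover 3000000000 from the bytes that encode(fromUnsigned(3000000000L)) returns.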

Network communication between C# server and JAVA client: how to handle that transition?

I have a C# server that cannot be altered. In C#, a byte ranges from 0 - 255, while in JAVA it ranges from -128 to 127.
I have read about the problem with unsigned byte/ints/etc and the only real option as I have found out is to use "more memory" to represent the unsigned thing:
http://darksleep.com/player/JavaAndUnsignedTypes.html
Is that really true?
So when having network communication between the JAVA client and the C# server, the JAVA client receives byte arrays from the server. The server sends them "as unsigned" but when received they will be interpreted as signed bytes, right?
Do I then have to typecast each byte into an Int and then add 127 to each of them?
I'm not sure here... but how do I interpret it back to the same values (int, strings etc) as I had on the C# server?
I find this whole situation extremely messy (not to mention the endianness problems, but that's for another post).
A byte is 0-255 in C#, not 0-254.
However, you really don't need to worry in most cases - basically in both C# and Java, a byte is 8 bits. If you send a byte array from C#, you'll receive the same bits in Java and vice versa. If you then convert parts of that byte array to 32-bit integers, strings etc, it'll all work fine. The signed-ness of bytes in Java is almost always irrelevant - it's only if you treat them numerically that it's a problem.
What's more of a problem is the potential for different endianness when (say) converting a 32-bit integer into 4 bytes in Java and then reading the data in C# or vice versa. You'd have to give more information about what you're trying to do in order for us to help you there. In particular do you already have a protocol that you need to adhere to? If so, that pretty much decides what you need to do - and how hard it will be depends on what the protocol is.
If you get to choose the protocol, you may wish to use a platform-independent serialization format such as Protocol Buffers which is available for both .NET and Java.
Unfortunately, yes ... the answer is to "use more memory", at least on some level.
You can store the data as a byte array in java, but when you need to use that data numerically you'll need to move up to an int and add 256 to negative values. A bitwise & will do this for you quickly and efficiently.
byte[] bytes = ...;
// The mask maps negative values to 128-255 and leaves 0-127 unchanged, so no if/else is needed.
int foo = bytes[3] & 0xFF;
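The point that integers and strings decode correctly despite the signed bytes can be sketched as follows. The message layout here (a 4-byte little-endian length followed by UTF-8 text) is purely hypothetical and stands in for whatever protocol the C# server actually uses; MessageDecoder is an illustrative name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class MessageDecoder {
    // Decodes a hypothetical message: 4-byte little-endian length, then UTF-8 text.
    public static String decode(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message).order(ByteOrder.LITTLE_ENDIAN);
        int length = buf.getInt();
        byte[] text = new byte[length];
        buf.get(text);
        return new String(text, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the UTF-8 encoding of U+00E9; both are negative as Java bytes.
        byte[] message = {2, 0, 0, 0, (byte) 0xC3, (byte) 0xA9};
        System.out.println(decode(message).length());  // 1
        System.out.println(message[4]);                // -61
        System.out.println(message[4] & 0xFF);         // 195
    }
}
```

The decoding never looks at the sign of an individual byte; the mask is only needed when a single byte is to be read as a number in the 0-255 range.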

Why no readUnsignedInt in RandomAccessFile class?

I just found there is no readUnsignedInt() method in the RandomAccessFile class. Why? Is there any workaround to read an unsigned int out from the file?
Edit:
I want to read an unsigned int from file and put it into a long space.
Edit2:
Cannot use readLong(): it reads 8 bytes, not 4. The data in the file consists of 4-byte unsigned ints.
Edit3:
Found answer here: http://www.petefreitag.com/item/183.cfm
Edit4:
What if the data file is little-endian? Do we need to swap the bytes first?
I'd do it like this:
long l = file.readInt() & 0xFFFFFFFFL;
The mask is necessary because widening a negative int to a long sign-extends it.
Concerning the endianness. To the best of my knowledge all I/O in Java is done in big endian fashion. Of course, often it doesn't matter (byte arrays, UTF-8 encoding, etc. are not affected by endianness) but many methods of DataInput are. If your number is stored in little endian, you have to convert it yourself. The only facility in standard Java I know of that allows configuration of endianness is ByteBuffer via the order() method but then you open the gate to NIO and I don't have a lot of experience with that.
Edited to remove readLong():
You could use readFully(byte[] b, int off, int len) and then convert to Long with the methods here: How to convert a byte array to its numeric value (Java)?
Because there is no unsigned int type in java?
Why not readLong() ?
You can readLong and then take first 32 bits.
Edit
You can try
long value = Long.parseLong(Integer.toHexString(file.readInt()), 16);
Depending on what you are doing with the int, you may not need to turn it into a long. You just need to be aware of the operations you are performing. After all, it's just 32 bits, and you can treat it as signed or unsigned as you wish.
If you want to play with the ByteOrder, the simplest thing to do may be to use ByteBuffer which allows you to set a byte order. If your file is less than 2 GB, you can map the entire file into memory and access the ByteBuffer randomly.
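For the little-endian case raised in the question, one option that avoids NIO is to swap the bytes of the int before widening it. The sketch below uses illustrative names (UnsignedInts and its methods); Integer.reverseBytes is a standard library method.

```java
public class UnsignedInts {
    // Widens the 32 bits of raw to a long without sign extension.
    public static long toUnsigned(int raw) {
        return raw & 0xFFFFFFFFL;
    }

    // For little-endian data read with readInt(): swap the byte order, then widen.
    public static long toUnsignedLittleEndian(int raw) {
        return Integer.reverseBytes(raw) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(toUnsigned(-1));                      // 4294967295
        System.out.println(toUnsignedLittleEndian(0x01000000));  // 1
    }
}
```

With a RandomAccessFile named file, this would be used as UnsignedInts.toUnsigned(file.readInt()) for big-endian data and UnsignedInts.toUnsignedLittleEndian(file.readInt()) for little-endian data. On Java 8 and later, Integer.toUnsignedLong(raw) is equivalent to the mask.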

Java Unsigned Char Array

I'm working on a project fuzzing a media player. I wrote the file generator in Java and converted the CRC generator from the original compression code written in C. I can write data fine with DataOutputStream, but I can't figure out how to send the data as an unsigned character array in java. In C this is a very straightforward process. I have searched for a solution pretty thoroughly, and the best solution I've found is to just send the data to C and let C return a CRC. I may just not be searching correctly as I'm pretty unfamiliar with this stuff. Thank you very much for any help.
You definitely want a byte[]. A 'byte' is equivalent to a signed char in C. Java's "char" is a 16-bit unicode value and not really equivalent at all.
If it's for fuzzing, unless there's something special about the CRC function you're using, I imagine you could simply use:
import java.util.Random;
Random randgen = new Random();
byte[] fuzzbytes = new byte[numbytes];
randgen.nextBytes(fuzzbytes);
outstream.write(fuzzbytes, 0, numbytes);
I doubt that you want to do anything with characters. I can't see anything in your description which suggests text manipulation, which is what you'd do with characters.
You want to use a byte array. It's a bit of a pain that bytes are signed in Java, but a byte array is what you've got - just work with the bit patterns rather than thinking of them as actual numbers, and check each operation carefully.
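Most CRC algorithms use mostly bitwise shifts and XORs. These work fine in Java even though it does not support unsigned integer primitives, as long as you use the unsigned right shift (>>>) wherever the C code right-shifts an unsigned value. If you need other arithmetic to work properly, you could try widening to a larger type such as a short or int and masking with & 0xFF.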
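As an illustration of a CRC working on signed bytes, here is a bitwise CRC-32 sketch. The question does not say which CRC the media player uses, so the standard CRC-32 used by zip and PNG (reflected polynomial 0xEDB88320) is assumed here as a stand-in; Crc32Sketch is an illustrative name.

```java
public class Crc32Sketch {
    // Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
    public static int crc32(byte[] data) {
        int crc = 0xFFFFFFFF;
        for (byte b : data) {
            crc ^= (b & 0xFF);                  // mask so a negative byte does not sign-extend
            for (int bit = 0; bit < 8; bit++) {
                int mask = -(crc & 1);          // all ones if the low bit is set, else zero
                crc = (crc >>> 1) ^ (0xEDB88320 & mask);  // >>> shifts in zeros, like C unsigned
            }
        }
        return ~crc;
    }

    public static void main(String[] args) {
        byte[] check = "123456789".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        System.out.println(Integer.toHexString(crc32(check)));  // cbf43926
    }
}
```

The two places where Java's signedness matters are the & 0xFF mask on each input byte and the use of >>> instead of >>. For this particular CRC, java.util.zip.CRC32 in the standard library gives the same result and is the simpler choice.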
