I have a byte array in Java. The values are both positive and negative, since values greater than 127 in the original unsigned data wrap to negative in Java's signed bytes. Now I want to send this array with Quickserver (http://www.quickserver.org/) to the TCP client in an iOS application I am also writing. I pass the byte array to the sendClientBinary() method, which accepts a byte array as its input. However, when I receive the array in the iOS client app, all the negative values seem to have been converted into some kind of complement form, mostly as two-byte values: -71 (0xB9) in NetBeans appears in the Xcode memory view as 0xC2 0xB9, and -67 (0xBD) in NetBeans appears as 0xC2 0xBD in Xcode.
Can anyone please explain what is happening here?
I am also able to convert my byte array to a char array by masking out the upper bits, so the char array holds the correct values in the full 0-255 range. However, there is no way to pass a char array to the sendClientBinary() method, which only accepts a byte array as input.
Should I try to be casting or converting char array to byte array somehow again?
//Some code in Java:
//reading my byte array from a method and converting it to a char array (sorry if it's not the most efficient way, I just need something simple right now)
byte byteArray[] = (byte[]) functionReturningByteArray();
char charArray[] = new char[byteArray.length];
for (int ij = 0; ij < byteArray.length; ij++)
{
    charArray[ij] = (char) byteArray[ij];
    if (charArray[ij] > 255)
        charArray[ij] &= 0xFF; // mask off the sign-extended upper bits
}
//and the code sending the data over TCP socket (via Quickserver):
clientH.setDataMode(DataMode.BINARY, DataType.OUT);
clientH.sendClientBinary(byteArray);
//--this is received in iOS as 16-bit values with some prefix such as 0xC2 or 0xC3 for negative values, if not for the prefix the value would be correct
//or an attempt to send the charArray:
clientH.setDataMode(DataMode.BYTE, DataType.OUT);
clientH.sendClientBytes(charArray.toString());
//--this doesn't resemble my bytes once received in iOS at all
//iOS reception code:
case NSStreamEventHasBytesAvailable:
{
    if (stream == inputStream)
    {
        int len = 0;
        len = [inputStream read:receptionBuf maxLength:2048*2048*2];
        packetBytesReceived += len;
        [packetData appendBytes:receptionBuf length:len];
        NSString *fullData = [[NSString alloc] initWithData:packetData encoding:NSASCIIStringEncoding];
...
...
I think the problem might be NSASCIIStringEncoding: the main part of my data packet is character data, but some of the content is raw byte values, and that could be the cause...? Will start working on it.
0xC2 is a UTF-8 lead byte: code points in the range U+0080 to U+00BF are encoded as 0xC2 followed by a second byte. So 0xC2 0xB9 decodes to U+00B9, the superscript-one character (¹). Since I assume this is not what you are actually trying to send, my guess is that your encoding is set incorrectly some place.
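The effect is easy to reproduce in Java (a small sketch; the byte value is taken from the question):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8LeadByteDemo {
    public static void main(String[] args) {
        // 0xB9 decoded as ISO-8859-1 is the single character U+00B9...
        byte[] raw = { (byte) 0xB9 };
        String s = new String(raw, StandardCharsets.ISO_8859_1);
        // ...and re-encoding that character as UTF-8 yields TWO bytes: C2 B9
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(utf8)); // [-62, -71], i.e. 0xC2 0xB9
    }
}
```

Any byte in the 0x80-0xFF range that passes through a string/UTF-8 round trip gets this two-byte treatment, which matches the 0xC2/0xC3 prefixes seen in Xcode.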
Problem solved: I now read the binary portion of the data payload directly from the packetData variable in the iOS application (instead of from fullData, which is an NSString), so the bytes are never converted to a string and decoded through UTF-8 again.
Related
I have an Android app that uses ByteBuffer to send an array of ints (converted to a string) to my server, which is written in Ruby. The server takes the string and unpacks it. My code to build the ByteBuffer looks like this:
ByteBuffer bytes = ByteBuffer.allocate(16);
bytes.order(ByteOrder.LITTLE_ENDIAN);
bytes.putInt(int1);
bytes.putInt(int2);
bytes.putInt(int3);
bytes.putInt(int4);
String byteString = new String(bytes.array());
This works great when the ints are all positive values. When the array has a negative int, things go awry. For example, in iOS when I submit an array of ints like [1,1,-1,0], the byte string on the server is:
"\x01\x00\x00\x00\x01\x00\x00\x00\xFF\xFF\xFF\xFF\x00\x00\x00\x00"
That gets correctly unpacked to [1,1,-1,0].
In Android, however, when I try to submit the same array of [1,1,-1,0], my string is:
"\x01\x00\x00\x00\x01\x00\x00\x00\xEF\xBF\xBD\xEF\xBF\xBD\xEF\xBF\xBD\xEF\xBF\xBD\x00\x00\x00\x00"
Which gets unpacked to [1, 1, -272777233, -1074807361].
If I convert the negative int to an unsigned int:
byte intByte = (byte) -1;
int unsignedInt = intByte & 0xff;
I get the following:
"\x01\x00\x00\x00\x01\x00\x00\x00\xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00"
Which gets unpacked to [1, 1, 12435439, 0]. I'm hoping someone can help me figure out how to handle this so I can send negative values properly.
Your problem is here:
String byteString = new String(bytes.array());
Why do you do that? You want to send a stream of bytes, so why convert it to a stream of chars?
If you want to send bytes, send bytes. Use an OutputStream, not a Writer; use an InputStream, not a Reader. The fact that integers are "negative" or "positive" does not matter.
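A minimal sketch of the byte-oriented version (the class and method names are made up here, and a ByteArrayOutputStream stands in for the real socket stream):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SendInts {
    // Pack four ints little-endian; the raw array goes straight to the
    // stream with no String conversion, so negative values survive intact.
    static byte[] pack(int a, int b, int c, int d) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(a).putInt(b).putInt(c).putInt(d);
        return buf.array();
    }

    public static void main(String[] args) throws IOException {
        // In the real app this would be socket.getOutputStream()
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(pack(1, 1, -1, 0));
        // -1 goes out as FF FF FF FF, exactly what the Ruby side unpacks
        System.out.println(out.size()); // 16
    }
}
```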
I am trying to send data from a PHP TCP server to a Java TCP client.
I am comparing my results by comparing hex values of the data.
PHP script reads STDIN, sends it through socket one byte at a time and java reads it using DataInputStream.read(), converts to hex and displays.
If I manually type data into script - it works ok.
If I use file with data - it works OK
But when I feed it /dev/urandom (even a few bytes), the data on the Java side comes out corrupted. There are always hex values of efbfbd in random places instead of the correct data.
Please help me with this issue.
PHP code:
$f = fopen('php://stdin', 'rb');
while ($line = fread($f, 1)) {
    $length = 1;
    echo bin2hex($line)."\n";
    $sent = socket_write($client, $line, $length);
    if ($sent === false) {
        break;
    }
    // Check if the entire message has been sent
    if ($sent < $length) {
        // If not, keep the part of the message that has not yet been sent
        $line = substr($line, $sent);
        // Get the length of the unsent part
        $length -= $sent;
    }
}
Java code:
in = new DataInputStream(clientSocket.getInputStream());
byte[] data = new byte[1];
int count = 0;
while (in.available() > 0) {
    //System.out.println(in.available());
    in.read(data);
    String message = new String(data);
    System.out.println(message);
    //System.out.flush();
    System.out.println(toHex(message));
    //in.flush();
    message = "";
}
You're stumbling over encoding. By calling new String(data) the byte array is converted to a string using your default encoding, whatever that encoding may be (you can set it to UTF-8, for example, with java -Dfile.encoding=UTF-8).
The Java code you want would most likely look the following:
in = new DataInputStream(clientSocket.getInputStream());
byte[] data = new byte[1];
int count = 0;
while (in.available() > 0) {
    // System.out.println(in.available());
    in.read(data);
    String hexMessage = Integer.toHexString(data[0] & 0xFF);
    String stringMessage = new String(data, "UTF-8"); // US-ASCII, ISO-8859-1, ...
    System.out.println(hexMessage);
}
Update: I missed the 32-bit issue. The 8-bit byte, which is signed in Java, is sign-extended to a 32-bit int. To effectively undo this sign extension, one can mask the byte with 0xFF.
There are two main issues with your Java program.
First, the use of in.available(). It does not tell you how many bytes there are still in the message. It merely says how many bytes are ready in the stream and available for reading without blocking. For example, if the server sends two packets of data over the socket, one has arrived but one is still in transit over the Internet, and each packet has 200 bytes (this is just an example), then the first call will return 200. If you read 200 bytes, you're sure not to be blocked. But if the second packet has not arrived yet, your next check of in.available() will return 0. If you stop at this point, you only have half the data. Not what you wanted.
Typically you either have to read until you reach end-of-stream (InputStream.read() returns -1), and then you can't use the same stream anymore and you close the socket, or you have a specific protocol that tells you how many bytes to expect and you read that number of bytes.
But that's not the reason for the strange values you see in the output of your program. The reason is that Java and PHP represent strings completely differently. In PHP, a string can contain any bytes at all, and the interpretation of them as characters is up to the programmer.
This basically means that a PHP string is the equivalent of a byte[] in Java.
But Java strings are completely different. A String consists internally of an array of char, and a char is always two bytes, in UTF-16 encoding. When you convert bytes you read into a Java String, this is always done by decoding the bytes with some character encoding so that the appropriate characters are stored in the string.
For example, if your bytes are 44 4F 4C 4C, and the character encoding is ISO-8859-1, this will be interpreted as the characters \u0044, \u004F, \u004C, \u004C. It will be a string of four characters - "DOLL". But if your character encoding is UTF-16, the bytes will be interpreted as \u444F and \u4C4C. A string of only two characters, "䑏䱌".
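The difference is easy to demonstrate (a sketch using the byte values from the example above):

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    public static void main(String[] args) {
        byte[] bytes = { 0x44, 0x4F, 0x4C, 0x4C };
        // One byte per character under ISO-8859-1...
        String latin1 = new String(bytes, StandardCharsets.ISO_8859_1);
        // ...but two bytes per character under UTF-16 (big-endian)
        String utf16 = new String(bytes, StandardCharsets.UTF_16BE);
        System.out.println(latin1);         // DOLL
        System.out.println(utf16.length()); // 2 -- same bytes, two characters
    }
}
```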
When you were reading from the console or from a file, the data was probably in the encoding that Java expects by default. This is usually the case when the file is written in pure English, with just English letters, spaces and punctuation. These are all 7-bit characters which are the same in ISO-8859-1 and UTF-8, which are the common defaults. But in /dev/urandom you'd have some bytes in the range 80 through FF, which may be treated differently when interpreted into a UTF-16 Java string.
Furthermore, you didn't show your toHex() method in Java. It probably reads bytes back from the string again, but using which encoding? If you read the bytes into the String using ISO-8859-1, and got them out in UTF-8, you'd get completely different bytes.
If you want to see exactly what PHP sent you, don't put the bytes in a String. Write a toHex method that works on byte arrays, and use the byte[] you read directly.
Also, always remember to check the number of bytes returned by read() and only interpret that number of bytes! read() does not always fill the entire array. So in your new toHex() method, you need to also pass the number of bytes read as a parameter, so that it doesn't display the parts of the array after them. In your case you just have a one-byte array - which is not recommended - but even in this case, read() can return 0, and it's a perfectly legal value indicating that in this particular call to read() there were no bytes available although there may be some available in the next read().
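Such a toHex might be sketched like this (the method name matches the question; the byte-array signature is my suggestion):

```java
public class HexDump {
    // Hex-dump only the first len bytes of the array -- the count
    // returned by read() -- so stale buffer contents are never printed.
    static String toHex(byte[] data, int len) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < len; i++) {
            sb.append(String.format("%02x ", data[i] & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] buf = { (byte) 0xEF, (byte) 0xBF, (byte) 0xBD, 0x00 };
        System.out.println(toHex(buf, 3)); // ef bf bd
    }
}
```

Because it never goes through a String, the output shows exactly what arrived on the wire, whatever the platform default encoding is.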
As noted above, you might be having trouble with the string representation of the bytes, String message = new String(data). To be certain, you should take the data bytes and encode them in Base64, for example. You can use a library such as Apache Commons, or Java 8's built-in encoder, to do that. You should be able to do something similar in PHP to compare.
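With Java 8's java.util.Base64 the comparison could look like this (a sketch; the sample bytes are the EF BF BD sequence from the question):

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64Compare {
    public static void main(String[] args) {
        byte[] data = { (byte) 0xEF, (byte) 0xBF, (byte) 0xBD };
        // Base64 output is pure ASCII, so it survives any transport or log
        String encoded = Base64.getEncoder().encodeToString(data);
        System.out.println(encoded); // 77+9
        // The other side (PHP: base64_decode) restores the exact bytes
        byte[] back = Base64.getDecoder().decode(encoded);
        System.out.println(Arrays.equals(data, back)); // true
    }
}
```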
I need to parse a binary file created by C++ and overwrite a 4 char long char array in that file, for example change the original char array of ABCD to WXYZ.
I know the exact position of that char array in bytes. I tried RandomAccessFile, which lets me seek to the position easily, but I cannot make the rest work right now.
Is the RandomAccessFile a right way to go?
I know I have to do some conversion from 2 bytes char to one byte char.
Anybody has a good way to do this?
RandomAccessFile is the right tool; see its JavaDoc:
long position = ...;
byte[] bytes = new byte[] { (byte)'W', ... };
raf.seek(position);
raf.write(bytes);
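Filled out into a runnable sketch (the file name, offset, and contents below are made up for illustration):

```java
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class PatchFile {
    public static void main(String[] args) throws Exception {
        // Stand-in for the C++-produced binary file
        Path p = Files.createTempFile("patch-demo", ".bin");
        Files.write(p, "..ABCD..".getBytes(StandardCharsets.US_ASCII));

        long position = 2; // known byte offset of the 4-char field
        try (RandomAccessFile raf = new RandomAccessFile(p.toFile(), "rw")) {
            raf.seek(position);
            // US-ASCII gives one byte per char, matching the C++ layout
            raf.write("WXYZ".getBytes(StandardCharsets.US_ASCII));
        }
        System.out.println(new String(Files.readAllBytes(p),
                StandardCharsets.US_ASCII)); // ..WXYZ..
    }
}
```

Converting through getBytes() with a single-byte charset is what bridges Java's two-byte chars and the one-byte chars in the C++ file.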
RandomAccessFile is fine. As you have already figured out, in C++ char is a single byte, whereas Java uses UTF-16.
The easiest option might be to use byte[4] in your code to represent the 4-character ASCII string.
I'm receiving a UDP packet (in a format I don't know; I think UTF-16, little-endian). The only thing I know is the following doc, straight from the developers' page:
The master servers each respond by sending FF FF FF FF 73 0A followed
by a (relatively) unique 4-byte "challenge" number.
So this is how I'm receiving the packet:
byte[] buff = new byte[64];
DatagramPacket packet = new DatagramPacket(buff, buff.length);
socket.receive(packet);
The packet is received and everything is okay, but now I'm stuck. I need that 4-byte integer. Must I split the buffer, or... I don't know what to do.
This is the received data:
˙˙˙˙s
Ň?t
I tried to convert to hex but the output is very interesting:
-0000000000000000000000000008cf52df7c08c
Method to convert:
public String toHex(String arg) throws UnsupportedEncodingException {
    return String.format("%040x", new BigInteger(arg.getBytes("UTF-16LE")));
}
Then I tried to convert the hex back to a string (with the method below) and the result is even more interesting (sorry, I can't copy-paste it; something goes wrong). The method used to convert hex to string is:
public String hexToString(String hex) {
    StringBuilder output = new StringBuilder();
    for (int i = 0; i < hex.length(); i += 2) {
        String str = hex.substring(i, i + 2);
        output.append((char) Integer.parseInt(str, 16));
    }
    return output.toString();
}
So with all that said, I'm stuck. I don't know what I'm supposed to do. Do I need to split the UDP packet into pieces, or...?
I'm receiving a UDP packet (in a format I don't know; I think UTF-16, little-endian). The only thing I know is the following doc.
You really need to find out what the packet actually contains. The packet contents you have posted in your question don't make much sense to me, and don't seem to correspond to the supposed format.
Start out by dumping the bytes of the byte array like this:
byte[] bytes = ...;    // the received data
int len = ...;         // number of bytes read
for (int i = 0; i < len; i++) {
System.out.format("%02x ", bytes[i]);
}
Then compare that with the expected format from the documentation. If they match (more or less) then you can start on the problem of deciding how to extract the data that you need. Otherwise, you first need to figure out what the format REALLY is. Maybe we can help ... but we need a reliable rendering of the packet (e.g. produced as above.)
FWIW, the reason you are getting -0000000000000000000000000008cf52df7c08c is (I think) that BigInteger(byte[]) interprets the byte array as a signed number. Anyway, that's not a good way to do this: the UDP packet body is a sequence of bytes, not a number.
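If the dump confirms the documented layout, the challenge could be pulled out with ByteBuffer along these lines (the byte order of the challenge is an assumption, since the quoted docs don't specify it, and the sample packet here is made up):

```java
import java.nio.ByteBuffer;

public class Challenge {
    public static void main(String[] args) {
        // FF FF FF FF 73 0A header, then the 4-byte challenge (sample values)
        byte[] buff = { (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF,
                        0x73, 0x0A, 0x12, 0x34, 0x56, 0x78 };
        ByteBuffer bb = ByteBuffer.wrap(buff, 6, 4); // skip the 6-byte header
        int challenge = bb.getInt(); // big-endian by default; flip with order() if needed
        System.out.println(Integer.toHexString(challenge)); // 12345678
    }
}
```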
I also think it is unlikely that the UDP packet is UTF-16. FFFF is described thus in the official Unicode code charts:
Noncharacters:
These codes are intended for process-internal uses, but are not permitted for interchange. [...]
FFFF : • the value FFFF is guaranteed not to be a Unicode character at all
So if someone is claiming that this is UTF-16, the usage is violating the Unicode standard.
Through a socket I am sending information from a program written in C to a program written in Java.
Through the program in C, I am sending two bytes through a char array (using an internet socket), and the received information in Java is stored in a char array also.
My main problem is that the received information in the Java array does not correspond properly to the transmitted information from the C program.
I have read that the chars in Java are 16 bits long, and the chars in C are 8 bits long. That may be the problem, but I do not know how to handle/solve that.
The C code to send information is the following:
char buffer[256];
bzero(buffer,256);
n = read(fd, buffer, 255); // getting info from an uart port, which works properly
n = write(sockfd,buffer,3); // send the information through the socket
Part of the Java code (for an Android app) is the following:
char[] buffer = new char[256];
BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
int readX = in.read(buffer, 0, 3);
if (readX > 0) { // I am using a handler to manipulate the info
    Message msg = new Message();
    msg.obj = buffer;
    mHandler.sendMessage(msg);
}
....
// Part of the handler is the following:
mHandler = new Handler() {
    @Override
    public void handleMessage(Message msg) {
        char[] buffer;
        buffer = (char[]) msg.obj; // the information here is different from the one sent from the C program
        ....
    }
};
Any suggestion to solve this problem I would really appreciate it.
Thanks in advance, Gus.
In C and C++ the char data type is 8-bit characters, corresponding roughly to the Java byte type. In Java, the fundamental char type is a 16-bit Unicode character. When you convert from bytes to characters (or vice-versa) in Java, a mapping has to occur, depending on the character encoding of the byte stream (UTF-8, ISO-8859-1, etc), so you have to know how the C byte stream is encoded. In your case I'd guess it's ISO-8859-1. If it's really binary data, then you should use the Java byte type.
EDIT:
You have to know whether the data being sent from C is character or binary, and if character, how the C program is encoding the data (ISO-8859-1, UTF-8, etc).
If the data is binary then you must use BufferedInputStream to read bytes, not BufferedReader, which decodes bytes into characters, which you don't want since the data is binary.
If the data is character, then you can use the 2-argument constructor
InputStreamReader(InputStream in, String charSetName)
or one of the other 2-arg constructors that let you specify how to decode bytes into characters.
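For the binary case, a sketch (a ByteArrayInputStream stands in for socket.getInputStream() so the snippet is self-contained, and the sample bytes are made up):

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class RawRead {
    public static void main(String[] args) throws IOException {
        // In the Android app this would be socket.getInputStream()
        InputStream raw = new ByteArrayInputStream(new byte[] { (byte) 0xC8, 0x01, 0x7F });
        BufferedInputStream in = new BufferedInputStream(raw);
        byte[] buffer = new byte[256];
        int readX = in.read(buffer, 0, 3); // raw bytes, no charset decoding
        System.out.println(readX);                    // 3
        System.out.println(buffer[0] == (byte) 0xC8); // true: a value above 127 survives
    }
}
```

Reading into byte[] instead of char[] sidesteps the 8-bit-vs-16-bit mismatch entirely.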
Use the Java byte type, which is an 8-bit signed integer. Also ensure that char on your C platform is actually 8 bits (check CHAR_BIT in <limits.h>); in C++, the Boost StaticAssert facility can be used for a compile-time check.