Java and C++ - String

I have a Java client and a C++ server. All values are sent as byte arrays. The numeric values are received fine, but when the string values are stored in a char array in C++, they have trailing special characters such as a form feed or line feed at the end of the value. Can someone suggest a solution to this problem?

Yes - use Google protocol buffers for serialization/deserialization. It's an open-source, stable, easy-to-use, cross-platform package.

How are you serialising / deserialising? You should decide on an encoding (for example ASCII), then write the length of the string first as an int; that way the server can read the int and will know how many bytes of string to read.
Once it has read the bytes, it just needs to append a '\0' to the char* to terminate the string in the array.
Depending on what you are using to write the string in Java, you would do something like this (assuming a DataOutputStream out; note that you should write the byte length, not the char count):
byte[] bytes = string.getBytes("ASCII");
out.writeInt(bytes.length);
out.write(bytes);
and in your C++ server you would do the reverse.
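For illustration, here is a minimal Java sketch of both directions of that length-prefixed framing (helper names hypothetical, assuming DataOutputStream/DataInputStream); a C++ server would perform the equivalent reads on its end:
static void writeString(DataOutputStream out, String s) throws IOException {
    byte[] bytes = s.getBytes("ASCII");
    out.writeInt(bytes.length);   // 4-byte big-endian length prefix
    out.write(bytes);             // then the raw bytes, no terminator
}
static String readString(DataInputStream in) throws IOException {
    int length = in.readInt();    // read the length prefix first
    byte[] bytes = new byte[length];
    in.readFully(bytes);          // block until exactly 'length' bytes arrive
    return new String(bytes, "ASCII");
}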

1) Make sure the server code is complying with your protocol at the byte level.
2) Make sure the client code is complying with your protocol at the byte level.
3) If you have done 1 and 2, and you still have problems, your protocol is broken. Most likely, it fails to properly specify how the server indicates where the strings end and how the client establishes where the strings end. A byte-level hex dump on each side, as sketched below, is the quickest way to verify this.
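For steps 1 and 2, a quick way to verify byte-level compliance is to hex-dump exactly what each side writes and reads and compare the two. A minimal, hypothetical Java helper:
static String hexDump(byte[] data) {
    StringBuilder sb = new StringBuilder();
    for (byte b : data) {
        sb.append(String.format("%02X ", b)); // each byte as unsigned hex
    }
    return sb.toString();
}
// e.g. hexDump("Hi".getBytes("ASCII")) returns "48 69 "; run the equivalent
// dump on the C++ side and the output should match byte for byte.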

Related

C# - Writing strings to a stream using two bytes for length, not one

I am creating an easy to use server-client model with an extensible protocol, where the server is in Java and clients can be Java, C#, what-have-you.
I ran into this issue: Java data streams write strings with a short designating the length, followed by the data.
C# lets me specify the encoding I want, but it only reads one byte for the length. (actually, it says '7 bits at a time'...this is odd. This might be part of my problem?)
Here is my setup: The server sends a string to the client once it connects. It's a short string, so the first byte is 0 and the second byte is 9; the string is 9 bytes long.
//...
_socket.Connect(host, port);
var stream = new NetworkStream(_socket);
_in = new BinaryReader(stream, Encoding.UTF8);
Console.WriteLine(_in.ReadString()); //outputs nothing
Reading a single byte before reading the string of course outputs the expected string. But, how can I set up my stream reader to read a string using two bytes as the length, not one? Do I need to subclass BinaryReader and override ReadString()?
The C# BinaryWriter/BinaryReader behavior uses, if I recall correctly, the high bit of each length byte to signify whether another byte of the count follows. This allows counts up to 127 to fit in a single byte while still allowing actual count values much larger (i.e. up to 2^31-1); it's a bit like UTF-8 in that respect.
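If you instead wanted to consume that 7-bit encoded length from the Java side (say, a Java client reading what a C# BinaryWriter produced), a minimal sketch of the decoding loop (method name hypothetical, using java.io.DataInputStream) would be:
static int read7BitEncodedInt(DataInputStream in) throws IOException {
    int count = 0;
    int shift = 0;
    int b;
    do {
        b = in.readUnsignedByte();     // one byte of the length
        count |= (b & 0x7F) << shift;  // low 7 bits carry the value, least significant first
        shift += 7;
    } while ((b & 0x80) != 0);         // high bit set: more length bytes follow
    return count;
}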
For your own purposes, note that you are writing the whole protocol (presumably), so you have complete control over both ends. Both behaviors you describe, in C# and Java, are implemented by what are essentially helper classes in each language. There's nothing saying that you have to use them, and both languages offer a way to simply encode text directly into an array of bytes which you can send however you like.
If you do want to stick with the Java-based protocol, you can read those two bytes explicitly and assemble the count yourself. Be careful with BitConverter here: Java writes the length big-endian, while BitConverter.ToInt16 uses the platform's byte order, which is little-endian on most machines. For example:
_in = new BinaryReader(stream, Encoding.UTF8);
byte[] header = _in.ReadBytes(2);
// assemble the big-endian count manually to avoid the endianness trap
short count = (short)((header[0] << 8) | header[1]);
byte[] data = _in.ReadBytes(count);
string text = Encoding.UTF8.GetString(data);
Console.WriteLine(text); // outputs the expected string

Can I add a binary file to a String based server message queue?

I have a multi-threaded client-server application that uses a Vector<String> as a queue of messages to send.
I need, however, to send a file using this application. In C++ I would not really worry, but in Java I'm a little confused about converting anything to a string.
Java has 2-byte characters. When you look at a Java string in hex, it usually looks like:
00XX 00XX 00XX 00XX
unless characters outside the Latin-1 range are present.
Java also uses big endian.
These facts make me unsure, whether - and eventually how - to add the file into the queue. Preferred format of the file would be:
-- Headers --
2 bytes Size of the block (excluding the header, i.e. the first four bytes)
2 bytes Data type (text message/file)
-- End of headers --
2 bytes Internal file ID (to avoid referring by filenames)
2 bytes Length of filename
X bytes Filename
X bytes Data
You can see I'm already using 2 bytes for all numbers to avoid some horrible operations required when getting 2 numbers out of one char.
But I have really no idea how to add the file data correctly. For numbers, I assume this would do:
StringBuilder packetData = new StringBuilder();
packetData.append((char) packetSize);
packetData.append((char) PacketType.BINARY.ordinal()); //Just convert enum constant to number
But the file is really a problem. If I have described anything wrongly regarding the Java data types, please correct me - I'm a beginner.
Does it have to send only Strings? I think if it does then you really need to encode it using base64 or similar. The best approach overall would probably be to send it as raw bytes. Depending on how difficult it would be to refactor your code to support byte arrays instead of just Strings, that may be worth doing.
To answer your String question that I just saw pop up in the comments: there's a getBytes method on String.
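If the queue really must carry Strings, here is a minimal sketch of the Base64 route (using java.util.Base64 from Java 8+; the file path and queue variable are hypothetical):
import java.nio.file.*;
import java.util.Base64;

byte[] fileBytes = Files.readAllBytes(Paths.get("some/file.bin"));
String encoded = Base64.getEncoder().encodeToString(fileBytes); // binary-safe text
queue.add(encoded);   // fits in the existing Vector<String>

// receiving side: reverse the encoding to recover the exact bytes
byte[] decoded = Base64.getDecoder().decode(encoded);
Note the roughly 33% size overhead of Base64; sending raw bytes avoids it.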
For the socket question, see:
Java sending and receiving file (byte[]) over sockets

Implementing binary protocol Java

I am trying to implement a binary protocol in a Java Android application. The variables in this protocol are unsigned and are either uint32, uint16 or uint8.
I am having trouble with sending integer values. For example, when I try to send a short with a value of 1, the server (written in C++) receives a value of 256.
After searching a bit, I saw some posts talking about endianness, but they don't really give me an answer.
How can I get the bytes of my variables in Java laid out in the same order as the C++ side expects?
Thanks
C++ does not define the order in which bytes in multibyte integers are stored. You should pick one standard and make sure that everyone uses it.
The standard API in Java has many classes that use big-endian byte order, so you might as well use that as the standard. To receive these correctly in C++ you can use the ntohl and ntohs functions for the conversion, for example.
I solved it! It was an endianness problem, and I fixed it using the order() method of ByteBuffer.
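For illustration, a minimal sketch of that ByteBuffer approach (variable names and example values hypothetical), packing the unsigned fields in network byte order so the C++ side can read them with ntohl/ntohs:
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

long someUint32 = 4000000000L;  // uint32 needs a long in Java to stay unsigned
int someUint16 = 1;             // uint16 fits in an int
int someUint8 = 200;            // uint8 fits in an int

ByteBuffer buf = ByteBuffer.allocate(7);
buf.order(ByteOrder.BIG_ENDIAN);              // network byte order (ByteBuffer's default, made explicit)
buf.putInt((int) (someUint32 & 0xFFFFFFFFL)); // mask, then truncate to 4 bytes
buf.putShort((short) (someUint16 & 0xFFFF));  // truncate to 2 bytes
buf.put((byte) (someUint8 & 0xFF));           // truncate to 1 byte
byte[] packet = buf.array();                  // 7 bytes, ready for the socket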

Java: Faster alternative to String(byte[])

I am developing a Java-based downloader for binary data. This data is transferred via a text-based protocol (UU-encoded). For the networking task the netty library is used. The binary data is split by the server into many thousands of small packets and sent to the client (i.e. the Java application).
From netty I receive a ChannelBuffer object every time a new message (data) is received. Now I need to process that data; among other tasks I need to check the header of the packet coming from the server (like the HTTP status line). To do so I call ChannelBuffer.array() to receive a byte[] array. This array I can then convert into a string via new String(byte[]) and easily check (e.g. compare) its content (again, like comparing to the "200" status message in HTTP).
The software I am writing is using multiple threads/connections, so that I receive multiple packets from netty in parallel.
This usually works fine; however, while profiling the application I noticed that when the connection to the server is good and data comes in very fast, this conversion to a String object appears to be a bottleneck. The CPU usage is close to 100% in such cases, and according to the profiler a great deal of time is spent in the String(byte[]) constructor.
I searched for a better way to get from the ChannelBuffer to a String, and noticed the former also has a toString() method. However, that method is even slower than the String(byte[]) constructor.
So my question is: Does anyone of you know a better alternative to achieve what I am doing?
Perhaps you could skip the String conversion entirely? You could have constants holding byte arrays for your comparison values and check array-to-array instead of String-to-String.
Here's some quick code to illustrate (channelBuffer being the received ChannelBuffer instance). Currently you're doing something like this:
String http200 = "200";
// byte[] -> String conversion happens every time
String input = new String(channelBuffer.array());
return input.equals(http200);
Maybe this is faster:
// Ideally only convert String -> byte[] once. Store these
// arrays somewhere and look them up instead of recalculating.
final byte[] http200 = "200".getBytes("UTF-8"); // Select the correct charset!
// Input doesn't have to be converted!
byte[] input = channelBuffer.array();
return Arrays.equals(input, http200);
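One caveat with the sketch above: Arrays.equals only succeeds when the whole buffer is exactly "200", so for a header check you more likely want a prefix comparison. A hypothetical helper:
static boolean startsWith(byte[] data, byte[] prefix) {
    if (data.length < prefix.length) return false;
    for (int i = 0; i < prefix.length; i++) {
        if (data[i] != prefix[i]) return false; // bail out on first mismatch
    }
    return true;
}
// usage: startsWith(channelBuffer.array(), http200)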
1) Some of the checking you are doing might only look at part of the buffer. If you can use the alternate form of the String constructor:
new String(bytes, offset, length)
then far fewer bytes get converted to a string.
Your example of looking for "200" within the message is a case in point.
2) You might find that you can use the length of the byte array as a clue. If some messages are long and you are looking for a short one, ignore the long ones and don't convert them to characters.
3) Along with what #EricGrunzke said, look partially into the byte buffer to filter out some messages; you may find that you don't need to convert them from bytes to characters at all.
4) If your bytes are ASCII characters, the conversion to characters might be quicker if you specify the charset "US-ASCII" instead of whatever the default is for your server:
new String(bytes, "US-ASCII")
might be faster in that case.
In fact, you might be able to pick and choose the charset for byte-to-character conversion in some organized fashion that speeds things up.
Depending on what you are trying to do, there are a few options:
If you are just trying to get the response status, can't you just call getStatus()? That would probably be faster than getting the string out.
If you are trying to convert the buffer then, assuming you know it will be ASCII (which it sounds like you do), just leave the data as a byte[] and convert your UUDecode method to work on a byte[] instead of a String.
The biggest cost of the string conversion is most likely the copying of the data from the byte array to the internal char array of the String; this, combined with the conversion itself, is work that you probably don't need to do.
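For the second option, a rough sketch of what UU-decoding a single encoded line directly on bytes could look like (assuming standard uuencoding, where each character carries 6 bits offset by 0x20 and the first character gives the line's decoded length; method name hypothetical):
static byte[] uuDecodeLine(byte[] line) {
    int len = (line[0] - 0x20) & 0x3F;  // decoded byte count for this line
    byte[] out = new byte[len];
    int o = 0;
    for (int i = 1; o < len; i += 4) {  // each group of 4 chars decodes to 3 bytes
        int c0 = (line[i]     - 0x20) & 0x3F;
        int c1 = (line[i + 1] - 0x20) & 0x3F;
        int c2 = (line[i + 2] - 0x20) & 0x3F;
        int c3 = (line[i + 3] - 0x20) & 0x3F;
        out[o++] = (byte) ((c0 << 2) | (c1 >> 4));
        if (o < len) out[o++] = (byte) ((c1 << 4) | (c2 >> 2));
        if (o < len) out[o++] = (byte) ((c2 << 6) | c3);
    }
    return out;
}
This never allocates a String or char[], which is exactly the copying identified above as the avoidable cost.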

C++ char(-1) in Java char

I have a client-server app: the client is in C++, the server in Java.
I am sending a byte stream from client to server, and from server to client.
When I send char(-1) from C++, what value does it correspond to in Java?
And what value must I send from Java so that the C++ code receives char(-1)?
As you are writing through a byte stream, your char(-1) arrives as 255, since byte streams normally transmit unsigned bytes.
The -1 that is read at the end of a stream cannot be sent explicitly; it is only produced by closing the stream.
There's no single answer; it depends on how C++ encodes the data and how Java interprets it. The most common encoding of char(-1) is the number 255. Note that this isn't defined by C++; a one's-complement system might encode it as 254. But also note that there are innumerable ways to encode data across the wire: Elias coding, various ASN.1 encodings, decimal digits, hex, etc.
At the Java end, even assuming a simple char-to-byte encoding, it depends on how you deserialise the byte and into what type.
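To make the common case concrete, a minimal Java sketch (the in/out stream variables are hypothetical InputStream/OutputStream), assuming the usual two's-complement encoding on the C++ side:
int raw = in.read();      // InputStream.read() returns 0-255, or -1 only at end of stream
                          // C++ char(-1) arrives here as 255
byte b = (byte) raw;      // stored in a Java byte, the same bit pattern prints as -1
int unsigned = b & 0xFF;  // mask to recover 255 from the signed byte

out.write(0xFF);          // writing 0xFF sends the byte the C++ side reads as char(-1)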
