I'm trying to understand how platform-independent socket communication works, because I would like to share socket data between a Java server and some native Unix and Windows clients. Sockets are platform independent by design, but the data representation is machine dependent. It is therefore advantageous if the TCP data abstracts from the real data format, because a data format that is supported on one system is not necessarily supported on another.
For example, if I want to send an unsigned int value from a C++ client program to a Java server, I must tell the server how to interpret this number, because Java has no unsigned int type and large values would otherwise show up as negative integers. How does this kind of abstraction work? With my limited knowledge I would just send the number as text and then append some kind of unique character sequence that tells the receiver what kind of data it received, but I don't know if this is a viable approach.
To be a bit more concrete: I would like to send messages that contain the following content:
At the beginning of the message, some kind of short signal or command so that the receiver knows exactly what to do with the data that follows.
Then some textual content of arbitrary length.
Followed by a number, which can also be text, but should be interpreted separately.
At the end, maybe a mark that tells the server that the message ends here.
TCP processes the data in byte chunks. Does this mean that when I write a UTF-8 encoded char in one byte, this char is interpreted in the same way on different machines, provided the client machines take Java's big-endian byte order into account? Thanks for any input and help.
Sockets are platform independent, but the data transmitted over them is not (type sizes, byte order, string encoding, ...).
Look at Thrift, Protobuf or Avro if you want to send binary data with cross-language and cross-platform support.
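If a library is not an option, the message described in the question can also be laid out by hand. The following is only a sketch under an assumed layout that is not taken from any standard: a 1-byte command, a 4-byte big-endian length, the text as UTF-8 bytes, and a 4-byte big-endian unsigned number. The names WireMessage, encode and decode are made up for illustration. Because the text carries its length, no end-of-message mark is needed.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical wire layout (an assumption for this sketch):
//   1 byte   command
//   4 bytes  length of the text in bytes, big-endian
//   N bytes  text, UTF-8 encoded
//   4 bytes  unsigned number, big-endian
final class WireMessage {
    final int command;
    final String text;
    final long number; // holds an unsigned 32-bit value (0..4294967295)

    WireMessage(int command, String text, long number) {
        this.command = command;
        this.text = text;
        this.number = number;
    }

    static byte[] encode(WireMessage m) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        byte[] utf8 = m.text.getBytes(StandardCharsets.UTF_8);
        out.writeByte(m.command);
        out.writeInt(utf8.length);    // DataOutputStream always writes big-endian
        out.write(utf8);
        out.writeInt((int) m.number); // keeps the low 32 bits of the value
        out.flush();
        return buf.toByteArray();
    }

    static WireMessage decode(DataInputStream in) throws IOException {
        int command = in.readUnsignedByte();
        byte[] utf8 = new byte[in.readInt()];
        in.readFully(utf8);           // blocks until the whole text has arrived
        // readInt() yields a signed int; the mask reinterprets it as unsigned.
        long number = in.readInt() & 0xFFFFFFFFL;
        return new WireMessage(command, new String(utf8, StandardCharsets.UTF_8), number);
    }
}
```

A C or C++ client would have to produce the same bytes, for example by converting its integers with htonl before sending them.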
Related
I have a study project, and my teacher doesn't want to tell me how to solve the problem of receiving multiple files. I know I need to use the getInputStream() function, but I don't know how to split the files within this InputStream object. I need to split the stream because I need to save each file in a folder.
Thank you for your help and for explaining this problem to me.
The answer is that you probably need a transmission protocol like HTTP or FTP. But if you don't want something that high-level, you can tar and then gzip your files, which is what people did on Unix back in the day. Tar is an archive format rather than a transmission protocol, but it solves the same problem here because it records each file's name and length, and it is not as heavyweight as HTTP or FTP.
It sounds like your instructor wants you to create a protocol. The reason you need a protocol is that if you send multiple files across the same socket, you won't know where one file stops and the next begins. To simplify the problem I will use a simple chat application as an example, but the same applies to files.
Let's say you have a chat app with only two users (one server, one client). Each user can send a message of any length. Let's say User1 wants to send User2 the following messages (each line is one message):
Hello User
How are you doing today?
If you send each of those raw messages across the socket you would likely get
Hello UserHow are you doing today? Now how do you know where one message stopped and the next one started?
A simple solution is to send something before each message that states the number of characters in the upcoming message, so your data might be
10Hello User24How are you doing today?
The receiver then knows: first read an int, which gives <length>, then read <length> characters to get one full message.
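A minimal Java sketch of this idea follows. It deviates from the text example in one respect, which is my assumption: the length is written as a 4-byte binary int via DataOutputStream rather than as decimal digits, because decimal digits become ambiguous when a message itself starts with a digit. The length counts UTF-8 bytes, not characters. The class and method names are invented for illustration, and byte array streams stand in for the socket streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class LengthPrefixedChat {
    // Writes one message as: 4-byte big-endian length, then the UTF-8 bytes.
    static void writeMessage(DataOutputStream out, String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        out.writeInt(payload.length);
        out.write(payload);
    }

    // Reads exactly one message, however the bytes were split in transit.
    static String readMessage(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] payload = new byte[length];
        in.readFully(payload); // loops internally until all bytes are there
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // In a real program these streams would wrap socket.getOutputStream()
        // and socket.getInputStream().
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeMessage(out, "Hello User");
        writeMessage(out, "How are you doing today?");

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readMessage(in));
        System.out.println(readMessage(in));
    }
}
```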
Now that's a pretty basic example and not super great. Let's look at a simple packet format I have seen used in a video game:
Field Name | Field Type | Notes
Length | VarInt | Length of packet data + length of the packet ID
Packet ID | VarInt |
Data | Byte Array | Depends on the connection state and packet ID, see the sections below
This is the basic format that all information exchanged between the client and server uses: a length of data to be read, then a packet type, followed by the data for that packet type.
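The table does not say how a VarInt is encoded. One common scheme, which I am assuming here and which may differ from what that game actually uses, stores 7 bits of the value per byte, lowest group first, and sets the high bit of every byte except the last. A sketch with made-up names:

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Assumed encoding: 7 data bits per byte, least significant group first,
// high bit set on every byte except the last. A 32-bit int needs 1 to 5 bytes.
final class VarInt {
    static void write(OutputStream out, int value) throws IOException {
        while ((value & ~0x7F) != 0) {        // more than 7 bits left
            out.write((value & 0x7F) | 0x80); // low 7 bits plus continuation flag
            value >>>= 7;                     // unsigned shift, so negatives terminate
        }
        out.write(value);                     // final byte, continuation flag clear
    }

    static int read(InputStream in) throws IOException {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) { // at most 5 bytes
            int b = in.read();
            if (b < 0) {
                throw new EOFException("stream ended inside a VarInt");
            }
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IOException("VarInt is longer than 5 bytes");
    }
}
```

With this scheme, small lengths cost a single byte, which is the usual reason to prefer it over a fixed 4-byte length.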
For your use case you likely need something similar: some sort of metadata about the bytes you are sending, e.g. the length of the file and the file name.
I would start by looking at the DataInputStream class for easily reading primitive data types.
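As a rough sketch of how that could look for files, assuming a made-up record layout of file name (writeUTF), file length (writeLong) and then the raw bytes. FileTransfer, sendFile and receiveFile are illustrative names, not part of any library.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class FileTransfer {
    // Sender side: metadata first, then the file content.
    static void sendFile(DataOutputStream out, Path file) throws IOException {
        out.writeUTF(file.getFileName().toString());
        out.writeLong(Files.size(file));
        Files.copy(file, out);
        out.flush();
    }

    // Receiver side: reads one file record and stores it in targetDir.
    static Path receiveFile(DataInputStream in, Path targetDir) throws IOException {
        // Keep only the last path element so the peer cannot pick the directory.
        Path sentName = Paths.get(in.readUTF()).getFileName();
        if (sentName == null) {
            throw new IOException("invalid file name");
        }
        long remaining = in.readLong();
        Path target = targetDir.resolve(sentName.toString());
        byte[] buffer = new byte[8192];
        try (OutputStream fileOut = Files.newOutputStream(target)) {
            while (remaining > 0) {
                // Never read past the end of this file into the next record.
                int n = in.read(buffer, 0, (int) Math.min(buffer.length, remaining));
                if (n < 0) {
                    throw new EOFException("connection closed in the middle of a file");
                }
                fileOut.write(buffer, 0, n);
                remaining -= n;
            }
        }
        return target;
    }
}
```

The receiver would call receiveFile in a loop on new DataInputStream(socket.getInputStream()). How it learns that no more files follow (a file count sent up front, or the sender closing the connection) is a further design decision that this sketch leaves open.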
I would like to make a chat application. The client will be written in Java and the server will be written in C. On the server side, the message will be sent as a struct. On the client side, how do I read and take apart the message in Java?
Here is a sample packet structure:
struct s_packet {
    int text_color;
    char text[TEXTSIZE];
    char alias[ALIASIZE];
};
Here is a sample server(in C) send function:
send(iter->client.sockfd, (void *)se_packet, sizeof(s_packet), 0);
Here is a sample client recv function in C:
recv(m_sockfd, (void *)&se_packet, sizeof(s_packet),0);
printf("\x1b[3%dm <%s>: %s \x1b[0m\n", se_packet.text_color, se_packet.alias, se_packet.text);
I can read s_packet and take it apart in C, but how can I do it in Java?
How can I separate the fields like this in Java:
printf("\x1b[3%dm <%s>: %s \x1b[0m\n", se_packet.text_color, se_packet.alias, se_packet.text);
The definite answer is that it won't be so easy. The first thing you should understand is how TCP works. It's a stream-oriented protocol, and there's no such thing as a "message" in TCP. You just send and receive a stream of bytes.
In your snippet of code, recv can return after reading only a part of the message sent from the server. Instead, you should keep a local buffer in Java and append all the data you've received so far to it. In a while loop you can then detect whether a complete message is ready for processing. If your message is not very big (less than the MTU), then you may get lucky and always receive the whole 'message' at once, but TCP does not guarantee it. If you are not concerned with that, then you may just use the java.io.InputStream.read(byte[]) method.
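The buffering loop described above can be sketched as follows for a message of known size. ExactReader and readExactly are names I made up. The standard DataInputStream.readFully method does the same job and is usually the simpler choice.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class ExactReader {
    // Keeps calling read() until exactly `size` bytes have arrived, because a
    // single read() may return only part of what the peer sent.
    static byte[] readExactly(InputStream in, int size) throws IOException {
        byte[] buffer = new byte[size];
        int filled = 0;
        while (filled < size) {
            int n = in.read(buffer, filled, size - filled);
            if (n < 0) {
                throw new EOFException("connection closed after " + filled + " of " + size + " bytes");
            }
            filled += n;
        }
        return buffer;
    }
}
```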
The other thing to consider is how you interpret a message you have received. You have no other choice but to process it as a byte[] in Java. First you may want to read s_packet.text_color. It will probably be placed in the first 4 bytes of a message. You can construct an int from those bytes (see Convert 4 bytes to int for example). But this is not good practice, because you are sending binary data that depends on how your s_packet is represented in memory. In real life you usually don't know what the size of int or char will be; it's platform dependent. The order of the bytes inside the int itself can also differ. Instead, you should define your own serialization protocol that specifies how your message is converted to binary data and back.
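For illustration only, this is roughly what decoding the raw struct could look like if you accept those risks. Every layout detail here is an assumption that would have to be verified against the actual C server: a 4-byte little-endian int, TEXTSIZE of 128 and ALIASIZE of 32 (the question does not give the real values), no padding between fields, and NUL-terminated UTF-8 or ASCII text. The SPacket class is a made-up name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class SPacket {
    static final int TEXTSIZE = 128; // assumed value, not given in the question
    static final int ALIASIZE = 32;  // assumed value, not given in the question
    static final int SIZE = Integer.BYTES + TEXTSIZE + ALIASIZE;

    final int textColor;
    final String text;
    final String alias;

    private SPacket(int textColor, String text, String alias) {
        this.textColor = textColor;
        this.text = text;
        this.alias = alias;
    }

    // Decodes one raw struct of SIZE bytes that was read completely beforehand.
    static SPacket parse(byte[] raw) {
        if (raw.length < SIZE) {
            throw new IllegalArgumentException("need " + SIZE + " bytes, got " + raw.length);
        }
        // Assumption: the server runs on a little-endian machine such as x86.
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        int color = buf.getInt();
        byte[] text = new byte[TEXTSIZE];
        buf.get(text);
        byte[] alias = new byte[ALIASIZE];
        buf.get(alias);
        return new SPacket(color, cString(text), cString(alias));
    }

    // A C string ends at the first NUL byte; the rest of the array is unused.
    private static String cString(byte[] field) {
        int end = 0;
        while (end < field.length && field[end] != 0) {
            end++;
        }
        return new String(field, 0, end, StandardCharsets.UTF_8);
    }

    // Java counterpart of the printf call from the question (without newline).
    String format() {
        return String.format("\u001b[3%dm <%s>: %s \u001b[0m", textColor, alias, text);
    }
}
```

A protocol of your own would replace the assumptions above with explicit rules, for example a fixed byte order and length-prefixed strings.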
Hope it helps.
I'm writing a C++ server/client application (TCP) that is working fine but I will soon have to write a Java client which obviously has to be compatible with the C++ server it connects to.
As of now, when the server or client receives strings (text), it loops through the bytes until a '\0' is found, which marks the end of the string ...
Here's the question: is it still good practice to handle strings that way when communicating between Java and C++ rather than between C++ and C++?
There's one thing you should read about: Encodings. Basically, the same sequence of bytes can be interpreted in different ways. As long as you pass things around in C++ or Java, things will agree on their meaning, but when using the net (i.e. a byte stream) you must make up your mind. If in doubt, read about and use UTF-8.
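On the Java side, reading such a '\0'-terminated string could look like the sketch below, assuming both sides have agreed on UTF-8. CStringReader is an invented name. The terminator stays unambiguous with UTF-8 because a zero byte occurs there only as the encoding of the NUL character itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class CStringReader {
    // Collects bytes up to (not including) the terminating 0 byte and decodes
    // them as UTF-8. Wrap the socket stream in a BufferedInputStream first,
    // otherwise every read() call may hit the operating system.
    static String readCString(InputStream in) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != 0) {
            if (b < 0) {
                throw new EOFException("stream ended before the terminating NUL");
            }
            bytes.write(b);
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }
}
```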
Consider using Protocol Buffers or Thrift instead of rolling your own protocol.
I need to write a UDP server which will wait for packets from uncorrelated devices (at most 10000 of them) that periodically send small packets, do some processing on the payload, and write the results to an SQL database. I'm done with the SQL part through JDBC, but the payload bytes keep bugging me: how should I access them? Until now I've worked with the payload mapped to a string, and then converted the string to hex (two hex chars representing one byte). I'm aware that there's a better way to do this, but I don't know it...
Do you not just want to create a DatagramSocket and receive DatagramPackets on it?
You need to specify a maximum length of packet by virtue of the buffer you use to create it, but then you'll be able to find out how much data was actually sent in the packet using getLength().
See the Java Tutorial for more details and an example.
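A small sketch of that approach, which works on the payload bytes directly instead of going through a string. The class and method names are made up, and the maximum packet size is whatever the caller chooses.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.util.Arrays;

final class UdpPayload {
    // Copies exactly the bytes that were received: getLength() bytes starting
    // at getOffset(). The backing array is usually larger than the payload.
    static byte[] payloadOf(DatagramPacket packet) {
        int from = packet.getOffset();
        return Arrays.copyOfRange(packet.getData(), from, from + packet.getLength());
    }

    // Hex text for logging only; the processing can use the byte[] itself.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    // Blocks until one datagram arrives and returns its payload.
    static byte[] receiveOne(DatagramSocket socket, int maxSize) throws IOException {
        byte[] buffer = new byte[maxSize];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        socket.receive(packet);
        return payloadOf(packet);
    }
}
```

A server would create one DatagramSocket on its port and call receiveOne in a loop. Individual values can then be read from the array with, for example, ByteBuffer.wrap(payload).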
I am writing a .NET/C# client for a Java server on Solaris.
The Java server writes raw byte data in gzip format, which I need to extract, but I am having trouble reading the data in the right buffer sizes. I read the message non-deterministically incomplete or complete and cannot read the second message in any case.
I am reading the bytes using the NetworkStream class with the DataAvailable property.
My guess is that it could be related to a little/big endian problem.
Do I need to use a special conversion to change the data from big endian to little endian? Do I need to read the necessary bytes using the gzip header?
I used to use the same server with an uncompressed protocol before and had no problem using a StreamReader with the ReadLine function before, but that protocol was purely text based.
Edit: Unfortunately I have no choice, as the remote server and protocol are given. Is the endianness part of the GZIP format, or do I only need to convert the header accordingly? The uncompressed data are pure UTF-8 encoded strings with line breaks as delimiters.
The GZIP format is not complex. It is available in all its glory in a simple, accessible specification document, IETF RFC 1952.
The GZIP format specifies the bit-order for bytes. It is not tunable with a flag for endianness. The producer of a GZIP stream is responsible for conforming to the spec in that regard, and a consumer of a GZIP stream, likewise.
If I were debugging this, I would look at the bytes on either end of the wire and verify that the bytes going in are the same as the bytes coming out. That's enough to put aside the endian issues.
If you don't have success transmitting a GZIP bytestream, try transmitting test data - 16 bytes of 0xFF, followed by 16 bytes of 0xAA, etc etc. And then, verify that this is the data coming out the other end.
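A sketch of such a check, written in Java to match the server side (the client in the question is C#, so this only illustrates the idea). It builds the test pattern, gzips it with the standard java.util.zip classes, and decompresses it again. GzipWireCheck and its methods are made-up names. Per RFC 1952, every gzip stream starts with the bytes 0x1f 0x8b, which gives a quick way to see whether the bytes arriving on the client are a gzip stream at all.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipWireCheck {
    // 16 bytes of 0xFF followed by 16 bytes of 0xAA, easy to spot in a dump.
    static byte[] testPattern() {
        byte[] pattern = new byte[32];
        Arrays.fill(pattern, 0, 16, (byte) 0xFF);
        Arrays.fill(pattern, 16, 32, (byte) 0xAA);
        return pattern;
    }

    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(raw);
        } // closing the stream writes the gzip trailer
        return buf.toByteArray();
    }

    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] sent = gzip(testPattern());
        // Dump these bytes on both ends of the wire and compare them.
        StringBuilder hex = new StringBuilder();
        for (byte b : sent) {
            hex.append(String.format("%02x ", b & 0xFF));
        }
        System.out.println(hex.toString().trim());
        System.out.println("round trip ok: " + Arrays.equals(gunzip(sent), testPattern()));
    }
}
```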
I'm sorry, I don't know what you mean by "I read the message non-deterministically incomplete or complete and cannot read the second message in any case". Second message? What second message? The endianness shouldn't affect the amount of data you receive.
It feels to me that you don't have confidence that you are successfully transmitting data. I would suggest that you verify that before working on endian issues and GZIP format issues.