I have a general socket programming question for you.
I have a C struct called Data:
struct Attribs {
    int color;
};

struct data {
    double speed;
    double length;
    char carName[32];
    struct Attribs attribs;
};
I would like to be able to create a similar structure in Java, create a socket, create the data packet with the above struct, and send it to a C++ socket listener.
What can you tell me about serialized data (basically, the 1s and 0s that are transferred in the packet)? How does C++ "read" these packets and recreate the struct? How are structs like this stored in the packet?
Generally, anything you can tell me to give me ideas on how to solve such a matter.
Thanks!
Be wary of endianness if you use binary serialization. Java writes multi-byte values in big-endian (network) byte order by default, and if you are on an Intel x86 you are on a little-endian machine.
I would use Java's ByteBuffer for fast native serialization. ByteBuffers are part of the NIO library and are supposedly higher performance than the old DataInput/OutputStreams.
Be especially wary of serializing floats! As suggested above, it's safer to transfer all your data as character strings across the wire.
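If you do go binary, here is a minimal ByteBuffer sketch for the struct above (the host, port, field values and the little-endian choice are illustrative assumptions; note it sends the fields individually rather than relying on sizeof(struct), because of padding):
import java.io.OutputStream;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class SendData {
    public static void main(String[] args) throws Exception {
        // Field-by-field layout mirroring the C struct:
        // double speed, double length, char carName[32], int color -> 8 + 8 + 32 + 4 = 52 bytes
        ByteBuffer buf = ByteBuffer.allocate(52);
        buf.order(ByteOrder.LITTLE_ENDIAN);              // assume an x86 receiver
        buf.putDouble(88.5);                             // speed
        buf.putDouble(4.2);                              // length
        byte[] name = new byte[32];                      // fixed-size, zero-padded
        byte[] text = "Herbie".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(text, 0, name, 0, Math.min(text.length, 31));
        buf.put(name);                                   // carName[32]
        buf.putInt(3);                                   // Attribs.color

        try (Socket s = new Socket("localhost", 4000);
             OutputStream out = s.getOutputStream()) {
            out.write(buf.array());                      // ship the 52 raw bytes
            out.flush();
        }
    }
}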
On the C++ side, regardless of the networking, you will have a filled buffer of data at some point. Your deserialization code will then look something like:
size_t amount_read = 0;
data my_data;
memcpy(&my_data.speed, buffer + amount_read, sizeof(my_data.speed));
amount_read += sizeof(my_data.speed);
memcpy(&my_data.length, buffer + amount_read, sizeof(my_data.length));
amount_read += sizeof(my_data.length);
Note that the sizes of basic C++ types are implementation defined, so primitive types in Java and C++ don't directly translate.
You could use Google Protocol buffers. My preferred solution if dealing with a variety of data structures.
You could use JSON for serialization too.
The basic process is (a minimal Java sketch of the sending side follows the list):
Java app creates some portable version of the structs, for example XML
Java app sends the XML to the C++ app via a socket
C++ app receives the XML from the Java app
C++ app creates instances of the structs using the data in the XML message
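For example, something like this on the Java side (the element names, host and port are placeholders; any XML parser on the C++ side can rebuild the struct from it):
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class XmlSender {
    public static void main(String[] args) throws Exception {
        // Build a portable XML representation of the struct's fields.
        String xml = "<data>"
                   + "<speed>88.5</speed>"
                   + "<length>4.2</length>"
                   + "<carName>Herbie</carName>"
                   + "<color>3</color>"
                   + "</data>\n";

        // Send it to the C++ listener.
        try (Socket s = new Socket("localhost", 4000);
             Writer out = new OutputStreamWriter(s.getOutputStream(),
                                                 StandardCharsets.UTF_8)) {
            out.write(xml);
            out.flush();
        }
    }
}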
Explanation
I need to exchange binary structured data over a stream (TCP socket or
pipe) between C++, Java and Python programs.
Therefore my question:
How to exchange binary structured data over a stream for C++, Java and Python?
There is no way to create the complete object to be serialized beforehand - there must be the possibility to stream in and stream out the data.
Because of performance issues I need some binary protocol format.
I want to use (if possible) some existing library, because hand-crafting all the (de-)serialization is a pain.
What I want
My idea is something like (for C++ writer):
StreamWriter sw(7); // fd to output to.
while( (DataSet const ds(get_next_row_from_db())) ) {
sw << ds; // data set is some structured data
}
and for C++ reader
StreamReader sr(9); // fd for input
while(sr) {
DataSet const ds(sr);
// handle ds
}
with a similar syntax and semantics for Java and Python.
What I did
I thought about using an existing library like Google Protocol Buffers, but this does not support stream handling and there is the need to create the complete object hierarchy before serialization.
Also I thought about creating my own binary format, but this is too much work and pain.
I would recommend explicitly documenting how your data types are to be serialized, and writing serialization and deserialization code in each language as needed. I have found in the past that with good documentation of how the data is to be serialized, this is fairly painless.
Your other major option is to standardize on one platform's default serialization method, but that means you have to figure out that method and implement it in the other languages. This tends to be trickier, as the default serialization methods are often complex and not well documented.
The options are Apache Thrift, Google's Protocol Buffers and Apache Avro. A good comparison is available at http://www.slideshare.net/IgorAnishchenko/pb-vs-thrift-vs-avro
So I recommend you try Apache Avro.
I'm trying to understand how platform independent socket communication works, because I would like to share socket data between a Java server and some native Unix and Windows clients. Sockets are platform independent by design, but the data representation is machine-dependent, hence it is advantageous if the TCP data abstracts away the real data format, because a data format supported on one system is not necessarily supported on another.
For example if I want to send an unsigned int value from a C++ client program to a Java server I must tell the server that this number should be interpreted as a negative integer. How does this kind of abstraction work? With my limited knowledge I would just send a number as text and then append some kind of unique character sequence that tells the receiver what kind of data he received, but I don't know if this is a viable approach.
To be a bit more concrete: I would like to send messages that contain the following content:
At the beginning of the message some kind of short signal or command, so that the receiver knows exactly what to do with the data that will follow.
Then some textual content of arbitrary length.
Followed by a number, which can also be text, but should be interpreted separately.
At the end maybe a mark that tells the server that the message ends here.
TCP processes the data in byte chunks. Does this mean that when I write a UTF-8-encoded char in one byte, this char is interpreted in the same way on different machines, provided the client machines take Java's big-endian byte order into account? Thanks for any input and help.
Sockets themselves are platform independent, but the data transmitted over them is not (type lengths, byte order, string encoding, ...).
Look at Thrift, Protobuf or Avro if you want to send binary data with cross-language and cross-platform functionality.
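If you would rather frame such a message by hand, a rough Java sketch of the layout described in the question (command byte, length-prefixed text, a 32-bit number, end mark; host, port and values are made up) might be:
import java.io.DataOutputStream;
import java.net.Socket;

public class MessageSender {
    public static void main(String[] args) throws Exception {
        try (Socket s = new Socket("localhost", 5000);
             DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
            out.writeByte(0x01);          // command: tells the receiver what follows
            out.writeUTF("some text");    // length-prefixed text
            out.writeInt(42);             // the number, big-endian (network order)
            out.writeByte(0x00);          // end-of-message mark
            out.flush();
        }
    }
}
DataOutputStream always writes big-endian, and writeUTF prefixes the text with a two-byte length, so the native client reads one command byte, a two-byte length plus that many bytes of text, a four-byte integer, and the end mark.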
I've got an app written in Java and some native C++ code with system hooks. These two have to communicate with each other. I mean the C++ subprogram must send some data to the Java one. I would've written the whole thing in one language if it was possible for me. What I'm doing now is really silly, but works: I'm hiding the C++ program's window and sending its data to its standard output, and then I'm reading that output with Java's standard input!!!
Ok, I know what JNI is but I'm looking for something easier for this (if any exists).
Can anyone give me any idea on how to do this?
Any help will be greatly appreciated.
Sockets & CORBA are two techniques that come to mind.
Also, try Google's Protocol Buffers or Apache Thrift.
If you don't find JNI 'easy' then you are in need of an IPC (Inter process communication) mechanism. So from your C++ process you could communicate with your Java one.
What you are doing with your console redirection is a form of IPC; in essence, that's what IPC is.
Since the nature of what you are sending isn't exactly clear, it's very hard to give you a good answer. But if you have 'simple' objects or 'commands' that could be serialised easily into a simple protocol, then you could use a communication protocol such as Protocol Buffers.
#include <cstring>
#include <fstream>
#include <boost/interprocess/file_mapping.hpp>
#include <boost/interprocess/mapped_region.hpp>

int main()
{
    // Create an IPC enabled file
    const int FileSize = 1000;
    std::filebuf fbuf;
    fbuf.open("cpp.out", std::ios_base::in | std::ios_base::out
                         | std::ios_base::trunc | std::ios_base::binary);
    // Set the size
    fbuf.pubseekoff(FileSize - 1, std::ios_base::beg);
    fbuf.sputc(0);
    fbuf.close();

    // Use Boost IPC to treat the file as a memory mapped region
    namespace ipc = boost::interprocess;
    ipc::file_mapping out("cpp.out", ipc::read_write);

    // Map the whole file with read-write permissions in this process
    ipc::mapped_region region(out, ipc::read_write);

    // Get the address and size of the mapped region
    void *addr = region.get_address();
    std::size_t size = region.get_size();

    // Fill the memory with 0x01 and flush the pages back to the file
    std::memset(addr, 0x01, size);
    region.flush();
}
Now your Java program could open 'cpp.out' and read the contents like a normal file.
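For instance, a small sketch of the Java side (it memory-maps the file to mirror the C++ code, but plain file I/O would work just as well):
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class ReadSharedFile {
    public static void main(String[] args) throws Exception {
        // Map the same file the C++ side wrote.
        try (RandomAccessFile raf = new RandomAccessFile("cpp.out", "r");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            // Every byte should be 0x01 after the C++ writer ran.
            System.out.println("first byte = " + buf.get(0));
        }
    }
}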
Two approaches off the top of my head:
1) create two processes and use any suitable IPC;
2) compile the C++ app into a dynamic library and export functions with a standard C interface; those should be callable from any language.
I've seen lots of examples of sending serialized data over sockets in Java, but all I want is to send some simple integers and a string. And, the problem is I'm trying to communicate these to a binary written in C.
So, bottom line: how can I just send some bytes over a socket in Java?
You can use the simple OutputStream given by the Socket; from there you can write bytes.
If you want, you can also wrap this stream in a BufferedOutputStream to get buffering.
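A small end-to-end sketch (the host, port, values and the little-endian byte order are illustrative assumptions for a C reader on x86):
import java.io.BufferedOutputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class RawSender {
    public static void main(String[] args) throws Exception {
        try (Socket s = new Socket("localhost", 6000);
             OutputStream out = new BufferedOutputStream(s.getOutputStream())) {
            // Two ints in the byte order the C program expects (assumed little-endian),
            // followed by the string bytes and a terminating NUL for C.
            ByteBuffer ints = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
            ints.putInt(17).putInt(42);
            out.write(ints.array());
            out.write("hello".getBytes(StandardCharsets.US_ASCII));
            out.write(0);        // '\0' so the C side can treat it as a C string
            out.flush();
        }
    }
}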
I would really recommend not using the Java Sockets library directly. I've found Netty (from JBoss) to be really easy to implement and really powerful. The Netty ChannelBuffer class comes with a whole host of options for writing different data types, and of course you can write your own encoders and decoders to write POJOs down the stream if you wish.
This page is a really good starter - I was able to make a fairly sophisticated client/server with custom encoders and decoders in under 30 minutes reading this: http://docs.jboss.org/netty/3.2/guide/html/start.html.
If you really want to use Java sockets, the socket output stream can be wrapped in a DataOutputStream, which allows you to write many different data types as well, for example:
new DataOutputStream(socket.getOutputStream()).writeInt(5);
I hope that's useful.
I would recommend looking into Protocol Buffers for the serialization and ZeroMQ for the data transfer.
Currently, I'm saving and loading some data in C/C++ structs to files by using fread()/fwrite(). This works just fine when working within this one C app (I can recompile whenever the structure changes to update the sizeof() arguments to fread()/fwrite()), but how can I load this file in other programs without knowing in advance the sizeof()s of the C struct?
In particular, I have written this other Java app that visualizes the data contained in that C struct binary file, but I'd like a general solution as to how read that binary file. (Instead of me having to manually put in the sizeof()s in the Java app source whenever the C structure changes...)
I'm thinking of serializing to text or XML of some sort, but I'm not sure where to start with that (how to serialize in C, then how to deserialize in Java and possibly other languages in the future), and if that is advisable here where one member of the struct is a float array that can go upwards of ~50 MB in binary format (and I have hundreds of these data files to read and write).
The C structure is simple (no severe nesting or pointer references) and looks like the following:
struct MyStructure {
char *title;
int id;
int param1;
int param2;
float *data;
};
The parts that are liable to change the most are the param integers.
What are my options here?
If you have control of both code bases, you should consider using Protocol Buffers.
You could use Java's DataInput/DataOutput format that is well described in the javadoc.
Take a look at JSON. http://www.json.org. If you go to or from JavaScript it's a big help. I don't know how good the Java support is, though.
If your structure isn't going to change (much), and your data is in a pretty consistent format, you could just write the values out to a CSV file, or some other plain format.
This can be easily read in Java, and you won't have to worry about serializing to XML. Sometimes going simple is the easiest route.
Take a look at Resin's Hessian/Burlap services. You may not want the whole service, just part of the API and an understanding of the wire protocol.
If:
your data is essentially a big array of floats;
you are able to test the writing/reading procedure in all the likely environments (=combinations of machines/OS/C compiler) that each end will be running on;
performance is important.
then I would probably just keep writing the data from C in the way that you are doing (maybe with a slight amendment -- see below) and turn the problem into how you read that data from Java.
To read the data back in from Java, use a ByteBuffer. Essentially, pull in slabs of bytes from your data, wrap a ByteBuffer around them, and then use the get(), getFloat(), getInt() etc methods. The NIO package also has "wrapper" buffers, e.g. FloatBuffer, which from tests I've done appear to be about 20% faster for reading large numbers of the same type.
Now, one thing you'll have to be careful about is byte ordering. From Java, you need to call order(ByteOrder.LITTLE_ENDIAN) or order(ByteOrder.BIG_ENDIAN) on your buffer before you start reading the data. To decide which to use, I'd recommend that at the very start of the stream, you write some known 16-bit value (e.g. 255 = 0x00ff). Then from Java, pull out these two bytes and check the order (0xff, 0x00 or 0x00, 0xff) to see whether you have little or big endian.
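A hedged Java sketch of that reading side (the file name is a placeholder; it assumes the writer put the 16-bit 0x00ff marker first, followed by raw floats):
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class FloatFileReader {
    public static void main(String[] args) throws IOException {
        try (FileChannel ch = FileChannel.open(Paths.get("data.bin"),
                                               StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) >= 0) { }
            buf.flip();

            // The writer stored the 16-bit marker 0x00ff first; its byte order
            // tells us whether the rest of the file is little or big endian.
            byte b0 = buf.get();
            buf.get();
            buf.order(b0 == (byte) 0xff ? ByteOrder.LITTLE_ENDIAN
                                        : ByteOrder.BIG_ENDIAN);

            // View the remaining bytes as floats and read them in bulk.
            FloatBuffer floats = buf.asFloatBuffer();
            float[] data = new float[floats.remaining()];
            floats.get(data);
            System.out.println("read " + data.length + " floats");
        }
    }
}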
One possibility is creating small XML files with title, ID, params, etc, and then a reference (by filename) to where the float data is contained. Assuming there's nothing special about the float data, and that Java and C are using the same floating point format, you can read that file in with readFloat() of a DataInputStream.
I like the CSV and "Protocol Buffers" answers (though, at a glance, the protocol buffer thing might be very similar to YAML for all I know).
If you need tightly packed records for high volume data, you might consider this:
Create a textual file header describing the current file structure: record sizes (types?) and field names/sizes. Read and parse the header, then use low-level binary I/O operations to load up each record's fields, er, object's properties or whatever we are calling it this year.
This gives you the ability to change the structure a bit and have it be self-describing, while still allowing you to pack a high volume into a smaller space than XML would allow.
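As a rough illustration of that idea in Java (the header syntax, file name and field types are invented here, and it assumes the record values were written big-endian, which is what DataInputStream expects):
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class SelfDescribingReader {
    public static void main(String[] args) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream("records.dat")))) {
            // Header: one text line such as "id:int,param1:int,score:float\n".
            StringBuilder header = new StringBuilder();
            int c;
            while ((c = in.read()) != -1 && c != '\n') {
                header.append((char) c);
            }
            String[] fields = header.toString().split(",");

            // Records: fixed-size binary values in the order the header lists them.
            while (in.available() > 0) {
                StringBuilder row = new StringBuilder();
                for (String field : fields) {
                    String type = field.split(":")[1];
                    if (type.equals("int"))        row.append(in.readInt());
                    else if (type.equals("float")) row.append(in.readFloat());
                    else                           row.append(in.readDouble());
                    row.append(' ');
                }
                System.out.println(row);
            }
        }
    }
}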
TMTOWTDI, I guess.