How does a receiving application know which serialization mechanism has been used? - java

I am receiving bytes of data on a Kafka topic, and those bytes can be sent (by another application) using plain Java serialization, JSON serialization, or Protocol Buffers.
So when my application reads those bytes from the Kafka topic, how does it know which serialization technique was used: Java serialization, JSON serialization, or Protocol Buffers?
Is there a way to check this? Does "Serialization format" differ by these different mechanisms?
Any information to understand this would be of great help.

The serialization technique must be agreed on and defined between the sender and the receiver.
It must be part of the contract between the two parties.
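One way to make that contract explicit is for the sender to tag every payload with a format identifier, for example in a Kafka record header or as a leading byte. Guessing the format from the bytes alone is unreliable: a Java serialization stream starts with the magic bytes 0xAC 0xED and JSON text usually starts with '{' or '[', but Protocol Buffers messages carry no magic number at all. The following is a minimal sketch of the leading-byte variant; the class name, the enum, and the tag values are all assumptions made for illustration, not part of Kafka or of any of the three formats.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical envelope: the first byte names the serialization format,
// the rest is the payload. The tag values are an assumed convention that
// sender and receiver must both agree on.
final class FormatEnvelope {
    enum Format {
        JAVA((byte) 1), JSON((byte) 2), PROTOBUF((byte) 3);

        final byte tag;

        Format(byte tag) {
            this.tag = tag;
        }

        static Format fromTag(byte tag) {
            for (Format f : values()) {
                if (f.tag == tag) {
                    return f;
                }
            }
            throw new IllegalArgumentException("Unknown format tag: " + tag);
        }
    }

    // Sender side: prepend the format tag to the already serialized payload.
    static byte[] wrap(Format format, byte[] payload) {
        byte[] out = new byte[payload.length + 1];
        out[0] = format.tag;
        System.arraycopy(payload, 0, out, 1, payload.length);
        return out;
    }

    // Receiver side: read the tag to decide which deserializer to use.
    static Format formatOf(byte[] message) {
        if (message.length == 0) {
            throw new IllegalArgumentException("Empty message");
        }
        return Format.fromTag(message[0]);
    }

    // Receiver side: everything after the tag is the original payload.
    static byte[] payloadOf(byte[] message) {
        if (message.length == 0) {
            throw new IllegalArgumentException("Empty message");
        }
        return Arrays.copyOfRange(message, 1, message.length);
    }

    public static void main(String[] args) {
        byte[] json = "{\"id\":7}".getBytes(StandardCharsets.UTF_8);
        byte[] message = wrap(Format.JSON, json);
        System.out.println(formatOf(message));
        System.out.println(new String(payloadOf(message), StandardCharsets.UTF_8));
    }
}
```

The receiver would switch on formatOf(message) and hand payloadOf(message) to the matching deserializer.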

Related

Sending Serialized Data

Basically I am doing some networking with a client and server sending "packets" back and forth to each other. I have it working with basic variable data such as ints or strings passing back and forth, however now I want to pass an object.
So I know I have to serialize the data of the object to pass it through the socket. That is working as well (as I can get the correct information if I serialize then de-serialize right away) but the problem comes in when my server receives a packet.
My server interprets packet data based on the first 2 characters of the packet. So 01foobar is a type of packet correlating to whatever "01" is assigned to, and 02foobar is a different type of packet. So I don't know the best way to do this with an object attached. What I mean is this...
The way I have tried to do it so far is to serialize my object and get its string, then prepend 03 to the front. So basically I have a string that looks like 03[B@3e9513b7 (or whatever), and I call getBytes() on that string, which gives me another byte[] (so I can send it through the socket). Then when the server receives that information, I can strip the 03 off and I'm left with just [B@3e9513b7. The problem is that [B@3e9513b7 is now a string and not a byte[], and in order to deserialize I need to pass in the same byte[] that I got when I serialized the data. So that got me looking into a way to make [B@3e9513b7 BE the byte[] (that is, so that calling toString() on the new byte[] returns [B@3e9513b7), but I was having issues assigning it like that because it would give me a new byte[] for [B@3e9513b7 as a string. So obviously, when I send it to be deserialized, it has a byte[] that it doesn't know what to do with and throws an error.
So I have to imagine there's a better way to do this, and I'm just making things more complicated than they should be. Any recommendations? I can provide code snippets if needed.
Thanks guys!
Edit: I guess I should mention that I am using Java with UDP sockets.
If you are looking for a reliable and efficient solution for client-server communication, I would suggest looking at Netty.
As for how to serialize/deserialize your objects, you have many choices, such as Java serialization, XML, JSON ...
You would have to pass your serialized objects in UDP datagrams. However, be aware that UDP datagram size is limited. If you're exchanging big objects, you may want to switch to TCP transport which is more reliable.
You may also want to look at SOAP/REST web services.
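The string in the question is the root of the problem: calling toString() on a byte[] returns only the array's type and identity hash, not its contents, so the serialized data is lost before it is sent. A minimal sketch of an alternative, assuming the same two-character type prefix, keeps everything as bytes by writing the prefix and the serialized object into one array. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical packet layout: two ASCII characters for the packet type,
// followed by the bytes produced by Java serialization.
final class ObjectPacket {
    // Builds the packet without ever converting the serialized bytes to a String.
    static byte[] encode(String type, Serializable payload) {
        if (type.length() != 2) {
            throw new IllegalArgumentException("type must be exactly 2 characters");
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            bytes.write(type.getBytes(StandardCharsets.US_ASCII));
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(payload);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the two-character type prefix.
    static String typeOf(byte[] packet) {
        return new String(packet, 0, 2, StandardCharsets.US_ASCII);
    }

    // Deserializes everything after the two-character prefix.
    static Object decode(byte[] packet) {
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(packet, 2, packet.length - 2))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class of the payload is not on the classpath", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> names = new ArrayList<>(List.of("alice", "bob"));
        byte[] packet = encode("03", names);
        System.out.println(typeOf(packet));
        System.out.println(decode(packet));
    }
}
```

The resulting array could be handed to a DatagramPacket as it is. On the receiving side, only the first getLength() bytes of the receive buffer belong to the packet.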

How to serialise msgpack over http

tl;dr: Is there an efficient way to convert msgpack data in Java and C# for transport over HTTP?
I've just discovered the msgpack data format. I use JSON for just about everything I send over the wire between client and server (that uses HTTP), so I'm keen to try this format out.
I have the Java library and the C# library, but I want to transport msgpacks over HTTP. Is this the intended use, or is it more for a local format?
I noticed a couple of RPC implementations, but they're whole RPC servers.
Thoughts?
-Shane
Transport and encoding are two very different things and it's entirely up to you to choose which transport to use and what data encoding to use, depending on the needs of your application. Sending msgpack data over HTTP is a perfectly valid use case and it is possible, but keep in mind the following two points:
msgpack is a binary encoding, which means it needs to be serialized into bytes before sending, and deserialized from the received bytes on the other end. It also means that it is not human-readable (or writable, for that matter), so it's really hard to inspect (or generate by hand) the HTTP traffic.
Unless you intend to stream msgpack-encoded data over HTTP, you'll incur a fairly high overhead cost, since the HTTP header size will most likely greatly overshadow the size of the data you're sending. Note that this also applies to JSON, but to a lesser extent since JSON is not as efficient in its encoding.
As far as implementation goes, the sending side would have to serialize your msgpack object into a byte[] before sending it as the request body in your HTTP request. You'll need to set the HTTP Content-Type to application/x-msgpack as well. On the receiving end, read the request body from the input stream (you probably can get your hands on a ByteArrayInputStream and deserialize into your msgpack object).
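The sending side described above might look roughly like this with the JDK's java.net.http client (Java 11 or later). This is only a sketch: the endpoint URL is made up, and the payload is hand-encoded here so the example needs no third-party library. In practice the byte[] would come from whichever MessagePack library you use.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class MsgpackHttp {
    // Builds (but does not send) a POST request carrying MessagePack bytes.
    static HttpRequest buildRequest(URI endpoint, byte[] msgpackBody) {
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/x-msgpack")
                .POST(HttpRequest.BodyPublishers.ofByteArray(msgpackBody))
                .build();
    }

    public static void main(String[] args) {
        // Hand-encoded MessagePack for the map {"a": 1}:
        // 0x81 = map with one entry, 0xa1 0x61 = the string "a", 0x01 = the integer 1.
        byte[] body = {(byte) 0x81, (byte) 0xa1, 0x61, 0x01};

        // The URL is a placeholder, not a real service.
        HttpRequest request = buildRequest(URI.create("http://localhost:8080/ingest"), body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().firstValue("Content-Type").orElse("(none)"));
    }
}
```

Sending it is then a call such as HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofByteArray()), which returns the response body as raw bytes ready to be decoded.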

Java Serialised object vs Non serialised object

1) Can a non-serialised java object be sent over the network to be executed by another JVM or stored in local file storage to get the data restored?
2) What is the difference between serialising and storing the java object vs storing the java object without serialising it?
Serialization is a way to represent a Java object as a series of bytes. It's just a format, nothing more.
The "built-in" Java serialization provides an API for converting a Java object to a series of bytes. That's it. Of course, deserialization is the "complementary" process that converts this binary stream back into the object.
Serialization/deserialization itself has nothing to do with the "sending over the network" thing. It's just convenient to send a binary stream that can be created from the object with serialization.
What's more, the built-in serialization is sometimes not the optimal way to get the binary stream, because the object can sometimes be converted using fewer bytes.
So you can use your own custom protocol, provide your own customization for serialization (for example, Externalizable), or even use third-party libraries like Apache Avro.
I think this effectively answers both of your questions:
You can turn the non-serialized object (I guess the one that doesn't implement the "Serializable" interface) into a series of bytes (a byte stream) yourself if you want, and then send it over the network, store it in a binary file, and so on.
Of course, you'll have to know how to read this binary format in order to convert it back.
Since serialization is just a protocol of conversion and not a "storage related thing", the answer is obvious.
Hope this helps.
In short, you don't store a non-serialized object in java. So I would say no to both questions.
Edit: ObjectOutputStream and ObjectInputStream can write primitives as well as serializable objects, if that's what you are using.
1) Can a non-serialised java object be sent over the network to be executed by another JVM or stored in local file storage to get the data restored?
An object is marshalled using ObjectOutputStream to be sent over the wire. Serialization is Java's standard way of storing the state of an object. You can devise your own way of doing the same, but there is no point in reinventing the wheel unless you see a big problem in the standard way.
2) What is the difference between serialising and storing the java object vs storing the java object without serialising it?
Serialization stores the state of the object using ObjectOutputStream, and the object can be deserialized using ObjectInputStream. A serialized object can be saved to a file or sent over the network. Serialization is the standard way to achieve all this, but you can always invent your own way of doing so if you really have a reason to.
The purpose of serialization is to store the state of objects in a self contained way that doesn't require raw memory references, run time state etc. In other words, objects can be represented as a string of bits that can be stored on disk, sent over a network etc.
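To make the answers above concrete, here is a small sketch with a class that does not implement Serializable. The built-in mechanism refuses to write it (ObjectOutputStream throws NotSerializableException), but nothing stops you from turning it into bytes yourself, as long as the reader knows the layout. The Point class and the field layout are assumptions made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class ManualEncoding {
    // Hypothetical class that deliberately does NOT implement Serializable.
    static final class Point {
        final int x;
        final int y;
        final String label;

        Point(int x, int y, String label) {
            this.x = x;
            this.y = y;
            this.label = label;
        }
    }

    // Returns false when built-in serialization rejects the object.
    static boolean builtInSerializationAccepts(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hand-rolled format: int x, int y, then the label as written by writeUTF.
    static byte[] toBytes(Point p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(p.x);
            out.writeInt(p.y);
            out.writeUTF(p.label);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // The reader must consume the fields in exactly the order they were written.
    static Point fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int x = in.readInt();
            int y = in.readInt();
            String label = in.readUTF();
            return new Point(x, y, label);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Point p = new Point(3, 4, "corner");
        System.out.println("built-in accepts Point: " + builtInSerializationAccepts(p));
        Point copy = fromBytes(toBytes(p));
        System.out.println(copy.x + "," + copy.y + " " + copy.label);
    }
}
```

The trade-off is that the hand-rolled format is compact but carries no type information, so both sides have to keep the field order and types in sync by hand.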

Java Chat system protocol design, how to determine message type?

I have a chat program implemented in Java. The client can send lots of different types of information to the server (e.g., joining the server and sending a username and password, requesting a private chat with another user on the server, disconnecting from the server, etc.).
I'm looking for the correct way to have the server/client differentiate between 'text' messages that are just meant to be chat text messages sent from one client to the others, and 'command' messages (disconnect, request private chat, request file transfer, etc) that are meant for the server or the client.
I see two options:
Use serialized objects, and determine what they are on the receiving end by doing an 'instanceof'
Send the data as a byte array, reserving the first N bytes of the array to specify the 'type' of the incoming data.
What is the 'correct' way to do this? How do real protocols (OSCAR, IRC) handle this situation?
I've googled around on this topic and only found examples/discussions centering on simple java chat applications. None that go into detail about protocol design (which I ultimately intend to practice).
Thanks for any help...
The second approach is much better, because serialization is a complex mechanism that can easily be used the wrong way (for example, you may bind yourself to the internal content of a concrete serialized class). Plus, your protocol will be bound to a JVM-specific mechanism.
Using some "protocol header" for message differentiation is a common way in network protocols (FTP, HTTP, etc). It is even better when it is in a text form (people will be able to read it).
You typically have a little message header identifying the type of content in all messages, including standard text/chat messages.
Either of your two suggestions is fine. (In your second approach, you probably want to reserve some bytes for the length of the array as well.)
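The header idea from these answers, including the length field, can be sketched as a small framing helper. The layout (1 byte for the type, 4 bytes for the length, then the payload), the type codes, and the size limit are all assumptions chosen for illustration, not taken from OSCAR, IRC, or any other real protocol.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical frame layout: 1 byte type, 4 byte big-endian length, then the payload.
final class ChatFrame {
    static final byte TYPE_TEXT = 1;          // assumed type codes
    static final byte TYPE_COMMAND = 2;
    static final int MAX_PAYLOAD = 64 * 1024; // arbitrary safety limit

    final byte type;
    final byte[] payload;

    ChatFrame(byte type, byte[] payload) {
        this.type = type;
        this.payload = payload;
    }

    // Writes one frame: header first, then the payload bytes.
    static void write(OutputStream stream, byte type, byte[] payload) {
        try {
            DataOutputStream out = new DataOutputStream(stream);
            out.writeByte(type);
            out.writeInt(payload.length);
            out.write(payload);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads exactly one frame, using the length field to find where it ends.
    static ChatFrame read(InputStream stream) {
        try {
            DataInputStream in = new DataInputStream(stream);
            byte type = in.readByte();
            int length = in.readInt();
            if (length < 0 || length > MAX_PAYLOAD) {
                throw new IOException("Invalid payload length: " + length);
            }
            byte[] payload = new byte[length];
            in.readFully(payload); // blocks until the whole payload has arrived
            return new ChatFrame(type, payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        write(wire, TYPE_TEXT, "hello".getBytes(StandardCharsets.UTF_8));
        write(wire, TYPE_COMMAND, "DISCONNECT".getBytes(StandardCharsets.UTF_8));

        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        for (int i = 0; i < 2; i++) {
            ChatFrame frame = read(in);
            String body = new String(frame.payload, StandardCharsets.UTF_8);
            System.out.println((frame.type == TYPE_TEXT ? "text: " : "command: ") + body);
        }
    }
}
```

With a real socket, the same two methods would be called on socket.getOutputStream() and socket.getInputStream(); the length field is what lets the reader separate back-to-back messages on a TCP stream.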

How can I send data in binary form over a Java socket?

I've seen lots of examples of sending serialized data over sockets in Java, but all I want is to send some simple integers and a string. And, the problem is I'm trying to communicate these to a binary written in C.
So, bottom line: how can I just send some bytes over a socket in Java?
You can use the simple OutputStream given by the Socket.
From there you can write bytes.
If you want you can also encapsulate this stream in a BufferedOutputStream to have a buffer.
I would really recommend not using the Java Sockets library directly. I've found Netty (from JBoss) to be really easy to implement and really powerful. The Netty ChannelBuffer class comes with a whole host of options for writing different data types, and of course you can write your own encoders and decoders to write POJOs down the stream if you wish.
This page is a really good starter - I was able to make a fairly sophisticated client/server with custom encoders and decoders in under 30 minutes reading this: http://docs.jboss.org/netty/3.2/guide/html/start.html.
If you really want to use Java sockets, the socket output stream can be wrapped in a DataOutputStream, which allows you to write many different data types as well, for example:
new DataOutputStream(socket.getOutputStream()).writeInt(5);
I hope that's useful.
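When the peer is a C program, the byte layout has to be spelled out on both sides. As a rough sketch, the integers and the string from the question could be packed with ByteBuffer using an explicit byte order. The layout below (two 32-bit integers, a 32-bit string length, then UTF-8 bytes, all big-endian) is only an assumption for illustration; the C side would have to read the same fields in the same order, for example converting the integers with ntohl.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class BinaryMessage {
    // Assumed layout: int32 id, int32 count, int32 text length, UTF-8 text bytes.
    // Everything is big-endian (network byte order), which is also what
    // DataOutputStream.writeInt produces.
    static byte[] encode(int id, int count, String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(12 + utf8.length).order(ByteOrder.BIG_ENDIAN);
        buf.putInt(id);
        buf.putInt(count);
        buf.putInt(utf8.length); // length prefix so the C side knows how much to read
        buf.put(utf8);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] message = encode(1, 258, "hi");
        StringBuilder hex = new StringBuilder();
        for (byte b : message) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim());
    }
}
```

The array can then be sent with socket.getOutputStream().write(message). If the C program expects little-endian integers instead, switching the buffer to ByteOrder.LITTLE_ENDIAN is the only change needed on the Java side.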
I would recommend looking into Protocol Buffers for the serialization and ZeroMQ for the data transfer.
