Serialization vs. Byte Code Translation


I'm a beginner with programming, and I was just wondering if there is a difference between the process of serialization and the process of converting to and from byte code (intermediate language).
I found this on javacodegeeks.com:
Serialization is usually used When the need arises to send your data
over network or stored in files. By data I mean objects and not text.
Now the problem is your Network infrastructure and your Hard disk are
hardware components that understand bits and bytes but not Java
objects. Serialization is the translation of your Java object’s
values/states to bytes to send it over network or save it. --> On
other hand, Deserialization is conversion of byte code to
corresponding java objects. <--
From my understanding of this paragraph, serialization may be the process by which java converts its programs to byte code for the ability to transport to different computer environments and still function correctly.
Am I correct in thinking this?

From my understanding of this paragraph, serialization may be the process by which java converts its programs to byte code for the ability to transport to different computer environments and still function correctly. Am I correct in thinking this?
No. Compiling with javac produces the bytecode that runs on the JVM. A VM such as the JVM interprets the bytecode and also uses just-in-time compilation (which is machine- and platform-dependent) to turn frequently executed code into native instructions. Bytecode is just a sequence of instructions for the JVM. Each bytecode opcode is one byte long, hence the name.
Serialization, on the other hand, converts the state of a Java object into a stream of bytes. These bytes are data, not instructions like bytecode. The primary purpose of Java serialization is to write an object to a stream so that it can be sent over a network and rebuilt on the other side. When two different parties are involved, they need a shared protocol to rebuild exactly the same object, and the Java serialization API provides one. Serialization can also be used to make a deep copy of an object.
Now the problem is your Network infrastructure and your Hard disk are hardware components that understand bits and bytes but not Java objects. Serialization is the translation of your Java object’s values/states to bytes to send it over network or save it. --> On other hand, Deserialization is conversion of byte code to corresponding java objects.
You can't just pass a Java object to the link layer of the network and expect it to be sent. Networks move bits and bytes across the physical medium. Serialization (in Java, enabled by the Serializable interface) lets you encode an object to binary in a standard way, pass it across the network, and decode it at the receiving end back into an object in the same state it had on the sending side.
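To make the distinction concrete, here is a minimal sketch of a serialization round trip using the standard ObjectOutputStream and ObjectInputStream classes. The Player class and its fields are hypothetical, invented only for this illustration. The resulting bytes describe the object's state; they are not JVM instructions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class RoundTripDemo {
    // Hypothetical example class; any class implementing Serializable would do.
    static class Player implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int score;

        Player(String name, int score) {
            this.name = name;
            this.score = score;
        }
    }

    // Object state -> bytes. In-memory streams are used here instead of a socket or file.
    static byte[] serialize(Serializable obj) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Bytes -> a new object with the same state.
    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Player original = new Player("Ada", 42);
        byte[] bytes = serialize(original);
        Player copy = (Player) deserialize(bytes);
        System.out.println(copy.name + " " + copy.score); // Ada 42
        System.out.println(copy != original);             // true: a distinct object
    }
}
```

Because the copy is a separate object with the same state, the same round trip is also one way to get the deep copy mentioned above.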

Related

Difference between serialization and normal object storage?

Serialization is the process of converting an object stored in memory into a stream of bytes to be transferred over a network, stored in a DB, etc.
But isn't the object already stored in memory as bits and bytes? Why do we need another process to convert the object stored as bytes into another byte representation? Can't we just transmit the object directly over the network?
I think I may be missing something in the way the objects are stored in memory, or the way the object fields are accessed.
Can someone please help me in clearing up this confusion?

Different systems don't store things in memory in the same way. The obvious example is endianness.
Serialization defines a way by which systems using different in-memory representations can communicate.
Another important fact is that the requirements on in-memory and serialized data may be different: when in-memory, fast read (and maybe write) access is desirable; when serialized, small size is desirable. It is easier to create two different formats to fit these two use cases than it is to create one format which is good for both.
An example which springs to mind is LinkedHashMap: this basically stores two versions of the mapping when in memory (one to capture insertion order; one as a traditional hash map). However, you don't need both of these representations to reconstruct the same map from a serialized form: you only need the insertion order of key/value pairs. As such, the serialized form does not store the same data as the in-memory form.
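The endianness point can be illustrated with a small sketch using java.nio.ByteBuffer. The class and method names are made up for the example. The same int produces different byte sequences depending on byte order, which is why both sides have to agree on one serialized form.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndiannessDemo {
    // Encode an int as 4 bytes in the given byte order.
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
    }

    // Decode 4 bytes as an int, interpreting them in the given byte order.
    static int decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        int value = 0x01020304;
        byte[] big = encode(value, ByteOrder.BIG_ENDIAN);       // 01 02 03 04
        byte[] little = encode(value, ByteOrder.LITTLE_ENDIAN); // 04 03 02 01
        System.out.println(big[0] + " " + little[0]);           // 1 4

        // Reading little-endian bytes as if they were big-endian yields a different number.
        System.out.println(Integer.toHexString(decode(little, ByteOrder.BIG_ENDIAN))); // 4030201
    }
}
```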

Serialization turns the pre-existing bytes from the memory into a universal form.
This is done because different systems allocate memory in different ways. Thus, we cannot ensure that the object can be saved directly from the memory on one machine and then be loaded back in properly into another, different machine.
Maybe you can find more information on this page of the Oracle docs.

An explanation of object serialization from the book Thinking in Java:
When you create an object, it exists for as long as you need it, but under no circumstances does it exist when the program terminates. While this makes sense at first, there are situations in which it would be incredibly useful if an object could exist and hold its information even while the program wasn’t running. Then, the next time you started the program, the object would be there and it would have the same information it had the previous time the program was running. Of course, you can get a similar effect by writing the information to a file or to a database, but in the spirit of making everything an object, it would be quite convenient to declare an object to be "persistent," and have all the details taken care of for you.
Java’s object serialization allows you to take any object that implements the Serializable interface and turn it into a sequence of bytes that can later be fully restored to regenerate the original object. This is even true across a network, which means that the serialization mechanism automatically compensates for differences in operating systems. That is, you can create an object on a Windows machine, serialize it, and send it across the network to a Unix machine, where it will be correctly reconstructed. You don’t have to worry about the data representations on the different machines, the byte ordering, or any other details.
Hope this helps you.

Let's go with that mindset: we take the object as is and send it as a byte array over the network. Another socket or HTTP handler receives that byte array.
Now two questions come to mind:
How many bytes should be sent?
What are these bytes? Which class do they represent?
You will have to provide this information as well, so this alone adds two extra steps.
Now, in C# and Java, as opposed to C++, objects are scattered throughout the heap and each object holds references to the objects it contains, so we have another requirement:
Recursively "catch" all the inner objects and pack them into the byte array.
Now we have a packed byte array that represents an object hierarchy, and we need to tell the other side how to unpack it back into the object plus the objects it holds, so:
Send information on how to unpack the byte array into the object hierarchy.
Some things an object has cannot be sent over the network, such as functions, so we have yet another step:
Strip away things that cannot be serialized, like functions.
This process goes on and on; for every new solution you will find more problems. Serialization is the process of taking that byte array you are talking about and turning it into something that can be handled in other environments, such as networks and files.
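Two of the steps above (following references to inner objects, and leaving out things that cannot be sent) can be seen in a short sketch with Java's built-in serialization. The Car and Engine classes are hypothetical. The referenced Engine is written along with the Car, while the field marked transient is skipped and comes back as null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ObjectGraphDemo {
    // Hypothetical inner object that the outer object refers to.
    static class Engine implements Serializable {
        private static final long serialVersionUID = 1L;
        final int horsepower;

        Engine(int horsepower) {
            this.horsepower = horsepower;
        }
    }

    static class Car implements Serializable {
        private static final long serialVersionUID = 1L;
        final String model;
        final Engine engine;        // reference: serialized recursively with the Car
        transient Runnable onStart; // behaviour: excluded from the byte stream

        Car(String model, Engine engine, Runnable onStart) {
            this.model = model;
            this.engine = engine;
            this.onStart = onStart;
        }
    }

    // Serialize to an in-memory buffer and read the whole graph back.
    static Car roundTrip(Car car) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(car);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                return (Car) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Car car = new Car("Roadster", new Engine(150), () -> System.out.println("vroom"));
        Car copy = roundTrip(car);
        System.out.println(copy.model + " " + copy.engine.horsepower); // Roadster 150
        System.out.println(copy.onStart == null);                      // true
    }
}
```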

Sizeof in c porting to Java

I have some C code like this:
skip=(unsigned long) (st_row-1)*tot_numcols;
fseek(infile,sizeof(cnum)*skip,0);
Now I have to port it to Java. How can I do that? "cnum" is a struct in C, so I created a class in Java. But what about the fseek: how can I seek to the exact position in the file in Java?

Your C design is broken, and you can't do what you apparently want in Java.
It appears that you're storing information out of C structs by blindly dumping their raw memory to disk. In addition to being difficult to debug, this is prone to break completely with any change that makes the compiler decide to pack the struct differently, including in particular compiling identical code for 32-bit and 64-bit or little- and big-endian targets. Instead, you should always explicitly serialize structured data. Human-readable formats are best unless there's a very large amount of data.
Java simply doesn't permit this kind of attempt. The Java memory model explicitly hides information about runtime memory packing, and the JVM has wide latitude to organize memory management as it sees fit.
Instead, define a clear format for saving your data, including endianness, and use that from both languages.
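Once the record format is defined explicitly, seeking to a record in Java becomes plain arithmetic on a known record size. The sketch below assumes, purely for illustration, that each record is two big-endian doubles; the real layout of the C struct "cnum" is not shown in the question, so RECORD_BYTES, the field meanings, and the method names are hypothetical. It uses java.io.RandomAccessFile, whose seek method plays the role of fseek.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

class RecordSeekDemo {
    // Assumption: one record = two doubles (re, im), written big-endian.
    // This replaces sizeof(cnum) with an explicitly defined size.
    static final int RECORD_BYTES = 2 * Double.BYTES;

    // Write rows x cols records; record (row, col) stores re = row and im = col.
    static void writeGrid(File target, int rows, int cols) {
        try (RandomAccessFile file = new RandomAccessFile(target, "rw")) {
            file.setLength(0);
            for (int row = 1; row <= rows; row++) {
                for (int col = 1; col <= cols; col++) {
                    file.writeDouble(row);
                    file.writeDouble(col);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Equivalent of: skip = (st_row - 1) * tot_numcols; fseek(infile, sizeof(cnum) * skip, 0);
    static double[] readFirstRecordOfRow(File source, int stRow, int totNumCols) {
        long skip = (long) (stRow - 1) * totNumCols;
        try (RandomAccessFile file = new RandomAccessFile(source, "r")) {
            file.seek(RECORD_BYTES * skip);
            return new double[] { file.readDouble(), file.readDouble() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File f = new File(System.getProperty("java.io.tmpdir"), "cnum-records-demo.bin");
        writeGrid(f, 3, 4);
        double[] record = readFirstRecordOfRow(f, 3, 4);
        System.out.println(record[0] + " " + record[1]); // 3.0 1.0
        f.delete();
    }
}
```

A C program writing the same file would need to emit the fields one by one in the same byte order rather than writing the struct's memory directly.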

How to communicate between Java and LabVIEW over TCP/IP and send buffers of floating-point data?

I am working on a university project that requires bidirectional communication between Java and LabVIEW, sending and receiving floating-point data in buffers. The LabVIEW application generates data at high speed, so I store it temporarily and send it when the array reaches a size of 100.
One of my difficulties is converting the data sent from LabVIEW to Java's format and vice versa.
Thanks!!

As far as I can see, you have two options:
Use a text-based protocol (XML, JSON, something of your own) and just send the literal "1.3454".
pro: it's probably human readable, which simplifies debugging/ asserting that the correct data is transferred. It is also simpler to have different types of messages.
con: This may mean a loss of precision and definitely means some kind of overhead.
If you just have this one kind of data, you could also extract the bytes of the float and send them, so that the other end can read exactly four bytes and reconstruct the float.
pro: no overhead
con: There might be a problem with endianness. I'm not sure if LabVIEW and Java handle all their data in a specific byte order or if it depends on the hardware. You might need to reorder the bytes you read before reassembling them into a float. Different kinds of messages can also get more complicated. For this, it is best to read the documentation on the TCP Read VI.
You can also mix both approaches: extract the bytes from the float, treat each byte as a character and assemble them as a string, which you put into your text based protocol.
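On the Java side, the binary option can be sketched with java.nio.ByteBuffer, which lets you state the byte order explicitly. The class and method names here are hypothetical. Big-endian is the default for ByteBuffer and DataOutputStream in Java; whether the LabVIEW side uses the same order is an assumption to check against its documentation, and if it differs, only the ORDER constant would need to change.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class FloatBufferCodec {
    // Assumption: both sides agree on this byte order for the wire format.
    static final ByteOrder ORDER = ByteOrder.BIG_ENDIAN;

    // Pack each float as 4 IEEE 754 bytes, one after another.
    static byte[] encode(float[] samples) {
        ByteBuffer buf = ByteBuffer.allocate(samples.length * Float.BYTES).order(ORDER);
        for (float sample : samples) {
            buf.putFloat(sample);
        }
        return buf.array();
    }

    // Rebuild the floats from a buffer whose length is a multiple of 4 bytes.
    static float[] decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ORDER);
        float[] samples = new float[bytes.length / Float.BYTES];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = buf.getFloat();
        }
        return samples;
    }

    public static void main(String[] args) {
        float[] samples = new float[100];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = i * 0.5f;
        }
        byte[] wire = encode(samples);   // these bytes would go to the socket's OutputStream
        float[] received = decode(wire); // the receiver reads exactly 400 bytes and decodes
        System.out.println(wire.length);  // 400
        System.out.println(received[99]); // 49.5
    }
}
```

Since the buffer always holds 100 floats, the receiver knows to read exactly 400 bytes per message; a variable-length protocol would need a length prefix as well.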

Consider using LabVIEW's standard TCP/IP library or WebSockets.

Data Exchange Formats and Java Serialization

A number of very useful answers posted in this thread helped clear up my questions around serialization. From the responses I understand that it is just a means to persist and re-create data in a JVM. So serialization is used for recreating a Java object from a byte stream. However data could be transferred by means of XML / JSON or via any other data format. So could this be called as serialization? I assume that the difference is that the relevant Java libraries would re-create the object using byte stream / XML data / JSON data etc. based on the format of the data passed. In case of communication between 2 java based systems, I assume bytestream would be useful where as in case of communication between 2 systems working in different technologies other standard data formats will be used. In the case of EJBs / Java RMI, I assume the objects that are transferred between client and server must be serialized, as I assume Java would be using the standard serialization APIs to deserialize the objects. Are all of the assumptions listed above correct?

Wikipedia sums it up well:
In computer science, in the context of data storage and transmission,
serialization is the process of translating data structures or
object state into a format that can be stored
So your first question
However data could be transferred by means of XML / JSON or via any other data format. So could this be called as serialization?
Yes, absolutely. Any format you like, as long as it's able to be stored.
Question two:
In case of communication between 2 java based systems, I assume bytestream would be useful where as in case of communication between 2 systems working in different technologies other standard data formats will be used.
Actually, Java's built-in serialization tends to be used only when it's largely invisible to the user and when speed doesn't matter. For example, some distributed products might send objects from one node to another using Java serialization. For any kind of web service, even from one JVM-backed service to another, a friendlier format like JSON or XML is far more common. Any product where speed is important or the payload must be as small as possible wouldn't use Java's serialization, but more likely some proprietary binary format.
Protocols like protobuf, avro and thrift were designed to try and give you the best of both worlds. They're somewhat popular but far from universal.
You might also hear the term marshalling, as in a marshaller or marshalling an object. The two terms basically mean the same thing, although in Java land it's more common to hear marshalling when you're talking about a non-binary format and serialization when it's binary.

Java - Object Stream efficiency over network

Quick design question: I need to implement a form of communication between a client-server network in my game-engine architecture in order to send events between one another.
I had opted to create event objects and as such, I was wondering how efficient it would be to serialize these objects and pass them through an object stream over the simple socket network?
That is, how efficient is it compared to creating a string representation of the object, sending the string over a char stream, and parsing the string on the client side?
The events will be sent every game loop, if not more; but the event object itself is just a simple wrapper for a few java primitives.
Thanks for your insight!
(tl;dr - are object streams over networks efficient?)

If performance is the primary issue, I suggest using Protocol Buffers over both your own custom serialization and Java's native serialization.
Jon Skeet gives a good explanation as well as benchmarks here: High performance serialization: Java vs Google Protocol Buffers vs ...?
If you can't use PBs, I suspect Java's native serialization will be more optimized than manually serializing/deserializing from a String. Whether or not this difference is significant is likely dependent on how complex of an object you're serializing. As always, you should benchmark to confirm your predictions.
The fact that you're sending things over a network shouldn't matter.
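As a starting point for such a benchmark, here is a rough sketch that compares only the encoded size of one event, not speed. The MoveEvent class and its fields are hypothetical. It writes the same event once through ObjectOutputStream and once field by field through DataOutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class EventSizeDemo {
    // Hypothetical event: a simple wrapper for a few primitives.
    static class MoveEvent implements Serializable {
        private static final long serialVersionUID = 1L;
        final int entityId;
        final float x;
        final float y;

        MoveEvent(int entityId, float x, float y) {
            this.entityId = entityId;
            this.x = x;
            this.y = y;
        }
    }

    // Built-in serialization: stream header, class description, then field values.
    static byte[] viaObjectStream(MoveEvent event) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(event);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Hand-written binary form: only the field values (4 + 4 + 4 bytes).
    static byte[] viaDataStream(MoveEvent event) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(event.entityId);
            out.writeFloat(event.x);
            out.writeFloat(event.y);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        MoveEvent event = new MoveEvent(7, 1.5f, -2.0f);
        System.out.println("object stream: " + viaObjectStream(event).length + " bytes");
        System.out.println("data stream:   " + viaDataStream(event).length + " bytes");
    }
}
```

A fresh ObjectOutputStream per event, as here, is the worst case for size: a long-lived stream writes a class description only the first time that class appears, so measurements on the real connection may look quite different.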

Edit: For time-critical applications Protocol Buffers appear a better choice. However, it appears to me that there is a significant increase in development time. Effectively you'll have to code every exchange message twice: Once as a .proto file which is compiled and spits out java wrappers, and once as a POJO which makes something useful out of these wrappers. But that's guessing from the documentation.
End of Edit
Abstract: Go for the Object Stream
So, what is less? The time it takes to code the object, send the byte stream, and decode it - all by hand - or the time it takes to code the object, send the byte stream, and decode it - all by the trusty and tried serialization mechanism?
You should make sure the objects you send are as small as possible. This can be achieved with enum values, lookup tables and the like, where possible, which might shave a few bytes off each transmission. The serialization algorithm appears very speedy to me, and anything you would code would do exactly the same. When you reinvent the wheel, more often than not you end up with triangles.
