Java serialization and deserialization when the state of the class changes - java

How does serialization and deserialization work in the following cases:
When a new field is added to the class.
When a non static member is converted to static
When a non transient field becomes transient
When a transient field becomes non transient

If the class does not declare a serialVersionUID explicitly, deserialization throws java.io.InvalidClassException in all of the cases described above. The reason is that the serial version of the class used for deserialization does not match the serial version of the class that was serialized. That is the default behaviour.
This serial version of the class is used to verify that the serialized and deserialized objects have the same attributes and thus are compatible (which is not the case in your examples in the question).
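The version check can be reproduced in a single JVM with a small sketch. It is not how a real class change happens: instead of recompiling the class, it flips one bit of the serialVersionUID stored in the serialized bytes, which has the same effect from the reader's point of view. The class Settings and the helper names are hypothetical, and the byte offsets assume the object is the first thing written to the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for this illustration.
class Settings implements Serializable {
    private static final long serialVersionUID = 1L;
    String name = "default";
}

class SuidMismatchDemo {

    // Serializes a Settings instance into a byte array.
    static byte[] serialize() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Settings());
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Flips one bit of the serialVersionUID stored in the stream.
    // Assumed layout: magic (2 bytes), version (2), TC_OBJECT (1),
    // TC_CLASSDESC (1), class name length (2), class name, 8-byte UID.
    static byte[] patchSuid(byte[] stream) {
        byte[] copy = stream.clone();
        int nameLength = ((copy[6] & 0xFF) << 8) | (copy[7] & 0xFF);
        int suidOffset = 8 + nameLength;
        copy[suidOffset + 7] ^= 1; // the stored UID no longer equals 1L
        return copy;
    }

    // Returns true if deserialization fails with InvalidClassException.
    static boolean failsWithInvalidClass(byte[] stream) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(stream))) {
            in.readObject();
            return false;
        } catch (InvalidClassException e) {
            return true;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] original = serialize();
        System.out.println("original fails: " + failsWithInvalidClass(original));
        System.out.println("patched fails:  " + failsWithInvalidClass(patchSuid(original)));
    }
}
```

The untouched stream deserializes normally; the patched one is rejected even though the fields are identical, which is the behaviour described above.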
If you don't explicitly declare a serialVersionUID field (of type long), the JVM will generate one automatically at run-time. However, if you're going to use Java serialization it is highly recommended to declare a serialVersionUID explicitly (because the generated one is compiler-dependent and thus may result in unexpected exceptions of java.io.InvalidClassException).
Suppose you explicitly declared serialVersionUID but didn't update it when making the changes. In your cases:
When a new field is added to the class. The object should be deserialized without any exceptions, and the new field would have a default value.
When a non static member is converted to static. The value in the stream is ignored, because serialization never assigns values to static fields; the static field keeps whatever value the class currently holds.
When a non transient field becomes transient. Your transient field would be ignored during deserialization and thus have a default value.
When a transient field becomes non transient. Because transient fields are ignored during serialization, this case is almost the same as the 1st case: your field would have a default value.
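How default serialization treats static and transient fields can be checked with a round trip inside one JVM. This is only a sketch with a hypothetical Counter class; it does not simulate a changed class definition, it just shows which kinds of fields travel through the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for this illustration.
class Counter implements Serializable {
    private static final long serialVersionUID = 1L;
    static int shared = 1;            // static: never written to the stream
    transient String cache = "warm";  // transient: skipped when writing
    int value = 7;                    // ordinary field: written and restored
}

class StaticTransientDemo {

    // Serializes a Counter, changes the static field, then deserializes.
    static Counter roundTrip() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Counter());
            }
            Counter.shared = 99; // changed after the object was written
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Counter) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Counter copy = roundTrip();
        System.out.println("value  = " + copy.value);     // restored from the stream
        System.out.println("cache  = " + copy.cache);     // null: initializer is not re-run
        System.out.println("shared = " + Counter.shared); // still 99: untouched by readObject
    }
}
```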

Related

Java non-persistent but serializable variable

In java, how can I declare a variable which is not persistent to a database but it is serializable so that the variable is present in JSON representation of the object containing the variable?
I used the annotation @javax.persistence.Transient, but it doesn't work the way I want since @Transient variables are not serializable.
The issue may be solved by a specific workaround using modifiers. In order to avoid persisting fields, you have 4 options: marking the field with the modifier static, final or transient; or adding the @Transient annotation. Each of these will prevent the field from being persisted into the DB (see here).
Not all these limitations also apply to serialization though. Static and transient modifiers will prevent serialization, but final modifier will not - it will not be persisted but will be serialized (Deserializing in this case is a bit longer, but possible).
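The serialization half of that claim can be sketched with plain java.io serialization (an assumption on my part; a JSON library may apply its own rules, and nothing here touches JPA). The Token class is hypothetical. The final field comes back with its value, the transient one does not.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for this illustration.
class Token implements Serializable {
    private static final long serialVersionUID = 1L;
    final String id;          // final: still written to and read from the stream
    transient String secret;  // transient: left out of the stream

    Token(String id, String secret) {
        this.id = id;
        this.secret = secret;
    }
}

class FinalFieldDemo {

    // Writes the token to a byte array and reads it back.
    static Token roundTrip(Token original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Token) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Token copy = roundTrip(new Token("abc", "hidden"));
        System.out.println("id     = " + copy.id);
        System.out.println("secret = " + copy.secret);
    }
}
```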
I hope this will be applicable to your issue.

SerializationVersionId same but class is modified?

I serialize an object and transfer it over the network. My serialized class has a serialVersionUID which I defined myself. Now in another JVM I keep the serialVersionUID the same but change some attributes.
What will happen and why? Will it be able to deserialize it?
You must refer to the Java Object Serialization Specification here.
Specifically, what you are NOT allowed to do:
Deleting fields - If a field is deleted in a class, the stream written will not contain its value. When the stream is read by an earlier class, the value of the field will be set to the default value because no value is available in the stream. However, this default value may adversely impair the ability of the earlier version to fulfill its contract.
Moving classes up or down the hierarchy - This cannot be allowed since the data in the stream appears in the wrong sequence.
Changing a nonstatic field to static or a nontransient field to transient - When relying on default serialization, this change is equivalent to deleting a field from the class. This version of the class will not write that data to the stream, so it will not be available to be read by earlier versions of the class. As when deleting a field, the field of the earlier version will be initialized to the default value, which can cause the class to fail in unexpected ways.
Changing the declared type of a primitive field - Each version of the class writes the data with its declared type. Earlier versions of the class attempting to read the field will fail because the type of the data in the stream does not match the type of the field.
Changing the writeObject or readObject method so that it no longer writes or reads the default field data or changing it so that it attempts to write it or read it when the previous version did not. The default field data must consistently either appear or not appear in the stream.
Changing a class from Serializable to Externalizable or vice versa is an incompatible change since the stream will contain data that is incompatible with the implementation of the available class.
Changing a class from a non-enum type to an enum type or vice versa since the stream will contain data that is incompatible with the implementation of the available class.
Removing either Serializable or Externalizable is an incompatible change since when written it will no longer supply the fields needed by older versions of the class.
Adding the writeReplace or readResolve method to a class is incompatible if the behavior would produce an object that is incompatible with any older version of the class.
What you are allowed to do instead:
Adding fields - When the class being reconstituted has a field that does not occur in the stream, that field in the object will be initialized to the default value for its type. If class-specific initialization is needed, the class may provide a readObject method that can initialize the field to nondefault values.
Adding classes - The stream will contain the type hierarchy of each object in the stream. Comparing this hierarchy in the stream with the current class can detect additional classes. Since there is no information in the stream from which to initialize the object, the class's fields will be initialized to the default values.
Removing classes - Comparing the class hierarchy in the stream with that of the current class can detect that a class has been deleted. In this case, the fields and objects corresponding to that class are read from the stream. Primitive fields are discarded, but the objects referenced by the deleted class are created, since they may be referred to later in the stream. They will be garbage-collected when the stream is garbage-collected or reset.
Adding writeObject/readObject methods - If the version reading the stream has these methods then readObject is expected, as usual, to read the required data written to the stream by the default serialization. It should call defaultReadObject first before reading any optional data. The writeObject method is expected as usual to call defaultWriteObject to write the required data and then may write optional data.
Removing writeObject/readObject methods - If the class reading the stream does not have these methods, the required data will be read by default serialization, and the optional data will be discarded.
Adding java.io.Serializable - This is equivalent to adding types. There will be no values in the stream for this class so its fields will be initialized to default values. The support for subclassing nonserializable classes requires that the class's supertype have a no-arg constructor and the class itself will be initialized to default values. If the no-arg constructor is not available, the InvalidClassException is thrown.
Changing the access to a field - The access modifiers public, package, protected, and private have no effect on the ability of serialization to assign values to the fields.
Changing a field from static to nonstatic or transient to nontransient - When relying on default serialization to compute the serializable fields, this change is equivalent to adding a field to the class. The new field will be written to the stream but earlier classes will ignore the value since serialization will not assign values to static or transient fields.
If you declare private static final long serialVersionUID for your class, then version changes will not affect deserialization of your class as long as they are backward compatible. If a change is not backward compatible, then you need to change the serial version ID.
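The rule quoted above for writeObject/readObject (default data first, optional data afterwards) might look like the following sketch. The Account class and its fields are made up for illustration; only defaultWriteObject, defaultReadObject, writeInt and readInt are actual java.io calls.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for this illustration.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    String owner;             // required data, handled by default serialization
    transient int loginCount; // written by hand as optional data

    Account(String owner, int loginCount) {
        this.owner = owner;
        this.loginCount = loginCount;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();  // required data first
        out.writeInt(loginCount);  // optional data afterwards
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();    // required data first
        loginCount = in.readInt(); // then the optional data, in the same order
    }
}

class OptionalDataDemo {

    // Writes the account to a byte array and reads it back.
    static Account roundTrip(Account original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account("alice", 3));
        System.out.println(copy.owner + " logged in " + copy.loginCount + " times");
    }
}
```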

Java deserialization, changing a field to transient

Background
I have a class, which has no serialization features overridden, and no serialVersionUID, but which still is serialized, stored, later deserialized. This is a configuration object, and when changing configuration, data is actually read from configuration UI, then object is created normally "from scratch" and serialized for storage. Only when it is used, object gets created by deserialization.
Now two fields got added to this class, which should not have been serialized, but were... This of course led to some deserialization problems (NullPointerException when the fields were left null after default deserialization, breaking class invariants), solved by opening configuration UI and saving configuration, thus saving correct serialized form of the object.
Question
Now, what happens in deserialisation of object from saved configuration data, if I modify the class in one of these ways, to do a quick fix:
remove these fields, and saved data is new version, with these fields in it?
change these fields to transient, and saved data is new version, with these fields in it?
change these fields to transient, and saved data is old version, without these fields?
To make this more concrete, let's say the added field is:
private final Map<String, String> extraProperties = new HashMap<String, String>();
And this is either removed from this class, or changed to private final transient field.
PS. No need to tell me, that custom serialization code should probably be added, and then the whole thing should probably be refactored, to separate persistent and transient data to different classes...
Remove (or make transient) these redundant fields you do not want to serialize. Then try to deserialize the old version, in which the now removed non transient fields were present. This will result in an error, of course, as the class serialVersionUID is now different. However, both the old and the new serialVersionUID should be included in the message.
Now just define private static final long serialVersionUID in your class, setting it to the old, previous value. The class content with excess fields in the file will be loaded, and the values of these excess fields will be ignored.
However you now have another problem: you probably have saved files of the two different types: old and new version. These will have different serialVersionUIDs, so we can load one or the other but not both. serialVersionUID is final, but maybe you can still set and probe different values as described here.
From the viewpoint of serialization, changing a field to transient is the same as removing the field. It will not be stored and will not be loaded. However, declaring the previously non transient field as transient will change the serialVersionUID if it is not fixed.
Item by item, if the serialVersionUID is now hardcoded and matches the serialVersionUID in the file, the answer to your question is:
Nothing.
Nothing.
Nothing.
By "nothing" I mean that the class is deserialized without assigning values to the transient fields (if these are present) and no error is reported.
If the serialVersionUIDs do not match, an exception is thrown, even if the rest of the class matches.
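One consequence worth checking before relying on the transient fix: field initializers are not run during deserialization, so a transient field comes back as null even if it is declared with = new HashMap<>(). The sketch below uses two hypothetical classes to show the difference a readObject method makes. Note that the field is non-final here; a final transient field cannot simply be reassigned in readObject.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical: transient field, no custom deserialization.
class ConfigPlain implements Serializable {
    private static final long serialVersionUID = 1L;
    transient Map<String, String> extraProperties = new HashMap<>();
}

// Hypothetical: same field, restored by hand after default deserialization.
class ConfigRestoring implements Serializable {
    private static final long serialVersionUID = 1L;
    transient Map<String, String> extraProperties = new HashMap<>();

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        extraProperties = new HashMap<>(); // initializers are skipped, so redo it here
    }
}

class TransientRestoreDemo {

    // Writes any serializable object to a byte array and reads it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                     new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("plain:     " + roundTrip(new ConfigPlain()).extraProperties);
        System.out.println("restoring: " + roundTrip(new ConfigRestoring()).extraProperties);
    }
}
```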

Serialization version uid in Java

How is Serialization id stored in the instance of the object ?
The Serialization id we declare in Java is static field;and static fields are not serialized.
There should be some way to store the static final field then. How does java do it ?
The serialVersionUID is not stored in the instance of a "serialized" object, as it is a static field (it is part of the class, not part of the object).
Therefore, it is stored in the compiled bytecode if it is actually defined, otherwise it is computed. In the java specification's words:
If the class has defined serialVersionUID it is retrieved from the class. If the serialVersionUID is not defined by the class, it is computed from the definition of the class in the virtual machine. If the specified class is not serializable or externalizable, null is returned.
In the Stream Unique Identifiers section, the algorithm for such computation is explained.
This paragraph is noteworthy (that's why IDEs usually show a warning when a class implementing Serializable has not explicitly defined a serialVersionUID).
Note: It is strongly recommended that all serializable classes explicitly declare serialVersionUID values, since the default serialVersionUID computation is highly sensitive to class details that may vary depending on compiler implementations, and can thus result in unexpected serialVersionUID conflicts during deserialization, causing deserialization to fail.
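The lookup described in the quoted passages is exposed through java.io.ObjectStreamClass, so it can be tried directly. A minimal sketch follows; the three classes and the helper uidOf are hypothetical. No value is shown for the computed UID because it depends on the class details.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical classes used only for this illustration.
class WithDeclaredUid implements Serializable {
    private static final long serialVersionUID = 42L;
}

class WithoutDeclaredUid implements Serializable {
    int x;
}

class NotSerializable {
}

class SuidLookupDemo {

    // Returns the declared or computed serialVersionUID of a serializable class.
    static long uidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            throw new IllegalArgumentException(type + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        // Declared in the class, so the declared value is returned.
        System.out.println("declared: " + uidOf(WithDeclaredUid.class));
        // Not declared, so the runtime computes it from the class definition.
        System.out.println("computed: " + uidOf(WithoutDeclaredUid.class));
        // lookup returns null for classes that are not serializable.
        System.out.println("lookup:   " + ObjectStreamClass.lookup(NotSerializable.class));
    }
}
```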
If you look in the java.io.ObjectStreamClass there you can see that it is actually being serialized. The following method:
java.io.ObjectOutputStream.writeClassDescriptor(ObjectStreamClass)
calls a method which calls the following method:
java.io.ObjectStreamClass.getSerialVersionUID()
Which either computes the serialVersionUID or uses the one declared in the class and found before in the call to the following method:
java.io.ObjectStreamClass.getDeclaredSUID(Class)
So it seems that this static field is an exception from the rule that static fields are not being serialized.
How to read it is described here.
The serial version UID is not stored in objects; it's a static field so it is stored in the class definition. What happens is that when you serialize an object, information about its class has to be stored too; otherwise there would be no way to un-serialize the object. The information stored about the class includes its name and its serial version UID.
You can read the entire protocol here: http://docs.oracle.com/javase/6/docs/platform/serialization/spec/protocol.html
In summary, the entry for a new object is exactly:
newObject:
TC_OBJECT classDesc newHandle classdata[]
Here classDesc is a descriptor of the class which can be either a declaration of a new class, a null reference, or a reference to a previously declared class:
classDesc:
newClassDesc
nullReference
(ClassDesc)prevObject
The declaration of a new class establishes the class's name and serial version UID, a handle that can be used to refer to it later, and additional information on the class encoded as classDescInfo:
newClassDesc:
TC_CLASSDESC className serialVersionUID newHandle classDescInfo
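To see the grammar above in actual bytes, the beginning of a stream can be read back with a plain DataInputStream. This is a rough sketch that only handles the simplest case, a single ordinary object written first, so that the stream starts with the header, TC_OBJECT and TC_CLASSDESC. Point, ClassHeader and readHeader are hypothetical names; the constants come from java.io.ObjectStreamConstants.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;

// Hypothetical class used only for this illustration.
class Point implements Serializable {
    private static final long serialVersionUID = 7L;
    int x = 1;
}

// Hypothetical holder for the two values taken from newClassDesc.
class ClassHeader {
    final String className;
    final long serialVersionUID;

    ClassHeader(String className, long serialVersionUID) {
        this.className = className;
        this.serialVersionUID = serialVersionUID;
    }
}

class StreamHeaderDemo {

    // Serializes one object into a byte array.
    static byte[] serialize(Serializable object) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(object);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads: magic, version, TC_OBJECT, TC_CLASSDESC, className, serialVersionUID.
    static ClassHeader readHeader(byte[] stream) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream))) {
            short magic = in.readShort();
            short version = in.readShort();
            byte objectTag = in.readByte();
            byte classDescTag = in.readByte();
            if (magic != ObjectStreamConstants.STREAM_MAGIC
                    || version != ObjectStreamConstants.STREAM_VERSION
                    || objectTag != ObjectStreamConstants.TC_OBJECT
                    || classDescTag != ObjectStreamConstants.TC_CLASSDESC) {
                throw new IllegalArgumentException("unexpected stream layout");
            }
            String className = in.readUTF();        // 2-byte length, then the name
            long serialVersionUID = in.readLong();  // 8 bytes right after the name
            return new ClassHeader(className, serialVersionUID);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ClassHeader header = readHeader(serialize(new Point()));
        System.out.println(header.className + " -> " + header.serialVersionUID);
    }
}
```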
The serialVersionUID is a special field used by the serialization runtime. It's all described in the Java Doc for java.lang.Serializable

When should I change a SerialUID?

I have a bunch of serialized classes. Normally I have generated serial UIDs for all of them, as the Java rules are rather restrictive and recreate serial numbers with basically any change. But this led me to a question that I haven't been able to find an answer for on the internet:
When does it make sense to break backwards compatibility and manually change the Serial Version UID in the class?
Section 5.6 of the Java Spec helps here:
http://download.oracle.com/javase/6/docs/platform/serialization/spec/version.html#6678
5.6 Type Changes Affecting Serialization
With these concepts, we can now describe how the design will cope with
the different cases of an evolving class. The cases are described in
terms of a stream written by some version of a class. When the stream
is read back by the same version of the class, there is no loss of
information or functionality. The stream is the only source of
information about the original class. Its class descriptions, while a
subset of the original class description, are sufficient to match up
the data in the stream with the version of the class being
reconstituted.
The descriptions are from the perspective of the stream being read in
order to reconstitute either an earlier or later version of the class.
In the parlance of RPC systems, this is a "receiver makes right"
system. The writer writes its data in the most suitable form and the
receiver must interpret that information to extract the parts it needs
and to fill in the parts that are not available.
5.6.1 Incompatible Changes
Incompatible changes to classes are those changes for which the
guarantee of interoperability cannot be maintained. The incompatible
changes that may occur while evolving a class are:
Deleting fields - If a field is deleted in a class, the stream written will not contain its value. When the stream is read by an
earlier class, the value of the field will be set to the default value
because no value is available in the stream. However, this default
value may adversely impair the ability of the earlier version to
fulfill its contract.
Moving classes up or down the hierarchy - This cannot be allowed since the data in the stream appears in the wrong sequence.
Changing a nonstatic field to static or a nontransient field to transient - When relying on default serialization, this change is
equivalent to deleting a field from the class. This version of the
class will not write that data to the stream, so it will not be
available to be read by earlier versions of the class. As when
deleting a field, the field of the earlier version will be initialized
to the default value, which can cause the class to fail in unexpected
ways.
Changing the declared type of a primitive field - Each version of the class writes the data with its declared type. Earlier versions of
the class attempting to read the field will fail because the type of
the data in the stream does not match the type of the field.
Changing the writeObject or readObject method so that it no longer writes or reads the default field data or changing it so that it
attempts to write it or read it when the previous version did not. The
default field data must consistently either appear or not appear in
the stream.
Changing a class from Serializable to Externalizable or vice versa is an incompatible change since the stream will contain data that is
incompatible with the implementation of the available class.
Changing a class from a non-enum type to an enum type or vice versa since the stream will contain data that is incompatible with the
implementation of the available class.
Removing either Serializable or Externalizable is an incompatible change since when written it will no longer supply the fields needed
by older versions of the class.
Adding the writeReplace or readResolve method to a class is incompatible if the behavior would produce an object that is
incompatible with any older version of the class.
5.6.2 Compatible Changes
The compatible changes to a class are handled as follows:
Adding fields - When the class being reconstituted has a field that does not occur in the stream, that field in the object will be
initialized to the default value for its type. If class-specific
initialization is needed, the class may provide a readObject method
that can initialize the field to nondefault values.
Adding classes - The stream will contain the type hierarchy of each object in the stream. Comparing this hierarchy in the stream with the
current class can detect additional classes. Since there is no
information in the stream from which to initialize the object, the
class fields will be initialized to the default values.
Removing classes - Comparing the class hierarchy in the stream with that of the current class can detect that a class has been deleted. In
this case, the fields and objects corresponding to that class are read
from the stream. Primitive fields are discarded, but the objects
referenced by the deleted class are created, since they may be
referred to later in the stream. They will be garbage-collected when
the stream is garbage-collected or reset.
Adding writeObject/readObject methods - If the version reading the stream has these methods then readObject is expected, as usual, to
read the required data written to the stream by the default
serialization. It should call defaultReadObject first before reading
any optional data. The writeObject method is expected as usual to call
defaultWriteObject to write the required data and then may write
optional data.
Removing writeObject/readObject methods - If the class reading the stream does not have these methods, the required data will be read by
default serialization, and the optional data will be discarded.
Adding java.io.Serializable - This is equivalent to adding types. There will be no values in the stream for this class so its fields
will be initialized to default values. The support for subclassing
nonserializable classes requires that the class supertype have a
no-arg constructor and the class itself will be initialized to default
values. If the no-arg constructor is not available, the
InvalidClassException is thrown.
Changing the access to a field - The access modifiers public, package, protected, and private have no effect on the ability of
serialization to assign values to the fields.
Changing a field from static to nonstatic or transient to nontransient - When relying on default serialization to compute the
serializable fields, this change is equivalent to adding a field to
the class. The new field will be written to the stream but earlier
classes will ignore the value since serialization will not assign
values to static or transient fields.
Never. You should organize yourself so that classes have the same serialVersionUID for their entire lifetime. You should (a) resist serialization-incompatible changes to the class; (b) write your own readObject()/writeObject()/readResolve()/writeReplace() methods so as to preserve the initial serialization format, and define an explicit serialVersionUID right at the beginning of the class's lifetime. The instant you change this value you have an enormous headache on your hands. Plan to avoid it.
From the JavaDoc of the Serializable interface:
The serialization runtime associates with each serializable class a version number, called a serialVersionUID, which is used during deserialization to verify that the sender and receiver of a serialized object have loaded classes for that object that are compatible with respect to serialization.
I think this is a good hint to answer your question: As soon as you change the class in a way, that serialization is affected (like adding/removing/changing serialized class members), then you really should change the value of serialVersionUID.
