Java: what happens when an already loaded class is deserialized

Let's say class com.Foo is loaded from a JAR, and later a class with the same name com.Foo but a different definition (other fields) is deserialized (e.g. loaded from a DB or received from a remote call).
What could the consequences be? Will the newly received class have any impact? Let's say that the class is used in other parts of the application, persisted in the DB, and serialized or JSON-encoded later.

If you deserialize a Class<?> object, the class with that fully-qualified name gets loaded. If it is already loaded, you get a reference to the existing class.
For a complete answer, read the Java Object Serialization Specification.
Here are some quotations from the spec that I think are interesting:
1.1 Overview
Special handling is required for arrays, enum constants, and objects of type Class, ObjectStreamClass, and String. Other objects must implement either the Serializable or the Externalizable interface to be saved in or restored from a stream.
2. Object Output Classes
If the object is a Class, the corresponding ObjectStreamClass is written to the stream, a handle is assigned for the class, and writeObject returns.
3. Object Input Classes
If the object in the stream is a Class, read its ObjectStreamClass descriptor, add it and its handle to the set of known objects, and return the corresponding Class object.

You have a mistaken picture of how serialization works. You can write a Class instance to an object stream just like any other object, but this writes neither the byte code of that class nor its definition to the stream. It only creates a symbolic reference to the class, which is resolved like any other class reference in the stream: by looking up its symbolic name in the context of the class that performs the deserialization. It does not create a new class.
In fact, an instance of java.lang.Class creates even fewer dependencies on the actual class than writing an instance of that class does. An instance depends on the serialized form, e.g. the non-transient fields of the class, while the symbolic reference represented by an instance of java.lang.Class does not.
The compatibility between the class present when a stream is written and the class present when it is deserialized is determined by the serialVersionUID. If it doesn't match, deserialization fails with an exception. If it matches, the implementation will try its best to recover: fields not present in the stream get their default values, while stream fields not present in the actual class, and any other unprocessed extra data, are ignored.
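To make the point about symbolic references concrete, here is a minimal sketch. The class name ClassReferenceRoundTrip and the helper roundTrip are made up for illustration. It writes a Class object to a stream and reads it back, and the result is the very same Class instance that was already loaded, not a new definition.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class ClassReferenceRoundTrip {

    // Writes a Class object to a byte array and reads it back.
    static Class<?> roundTrip(Class<?> type) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(type); // writes a class descriptor, not byte code
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Class<?>) in.readObject(); // resolved by name in this JVM
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> restored = roundTrip(java.util.ArrayList.class);
        // Identity comparison: the stream resolved to the already loaded class.
        System.out.println(restored == java.util.ArrayList.class);
    }
}
```

The identity check is the interesting part: nothing in the stream could have replaced the loaded class, because the stream only carried its name and descriptor.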

Related

What makes an Object serializable

I have an object with a HashMap field and a few methods that I am trying to serialize. However, at runtime, I am getting a java.io.NotSerializableException.
I was checking to see if HashMaps could be serialized and from what I have read they are so I am not sure what the problem is.
I was just wondering what makes an object serializable, and why this object, which seems to have only serializable fields, cannot be serialized.
This is defined in the Java platform Spec here:
https://docs.oracle.com/javase/7/docs/platform/serialization/spec/serial-arch.html
The basic rules are these:
"A Serializable class must do the following:
Implement the java.io.Serializable interface
Identify the fields that should be serializable (Use the serialPersistentFields member to explicitly declare them serializable or use the transient keyword to denote nonserializable fields.)
Have access to the no-arg constructor of its first nonserializable superclass"
Broadly, in the absence of any indication to the contrary, any field that is not explicitly marked "transient" is a candidate for serialization.
The entire object graph from the target object downwards has to be serializable, or nothing is. That is, every field that references an object (not a primitive) must reference a serializable object.
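A HashMap field is a typical way to run into this: the map itself is serializable, but its contents are part of the object graph too. The following sketch uses invented names (GraphSerializationCheck, Registry, Session, firstBlockingClass). It tries to serialize a graph and reports which class blocked it, relying on the fact that the NotSerializableException message carries the offending class name.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

class GraphSerializationCheck {

    // Hypothetical value type that does NOT implement Serializable.
    static class Session {
        final String user;
        Session(String user) { this.user = user; }
    }

    // Serializable holder: the HashMap is serializable, its values may not be.
    static class Registry implements Serializable {
        private static final long serialVersionUID = 1L;
        final Map<String, Object> entries = new HashMap<>();
    }

    // Returns null on success, otherwise the exception message naming the blocking class.
    static String firstBlockingClass(Object root) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(root);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.entries.put("ok", "a plain String is serializable");
        System.out.println(firstBlockingClass(registry));

        registry.entries.put("bad", new Session("alice"));
        System.out.println(firstBlockingClass(registry));
    }
}
```

In your own code, the class named in the exception is the one to make Serializable, mark transient, or remove from the map.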

If I change the package of a Java class, will deserialization of the old serialized version still work?

I am using Aerospike as a cache in my Java project.
I changed the package of one of my cached objects.
I have deployed the new code on one machine and am getting a serialization exception because old cached objects are still in the cache.
I want the old code to run on some machines and the new code on others, and both should be able to get/put objects in the Aerospike cache.
Is there a way to achieve this?
Why exactly am I getting this exception?
The package to which a class belongs is a fundamental part of that class's identity, as reflected by the package name being part of the class's fully-qualified name. For all practical intents and purposes, changing the package to which a class is assigned drops the original class and replaces it with a completely different class.
In particular, if you serialize an instance of a class named "my.package.MyClass" via Java serialization, then successfully deserializing the result always yields an instance of a class named "my.package.MyClass". If no such class can be loaded, or if the serialization version of the one that is loaded does not match that of the one that was serialized, then deserialization will fail.
If you retain the old class along with the new class then you can perhaps patch up the deserialization problem in the new version of the application. Simply be prepared for objects of the old class, and convert them to instances of the new class as soon as you deserialize them. But if you want the new version of the application to play well with the old, then you must also do the reverse when the new one serializes objects of the affected class, else you will just cause the old application to have deserialization errors. At this point, you should be considering whether changing the package name was all that important after all.
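One way to do the "convert as soon as you deserialize" part is a readResolve method on the retained old class. This is only a sketch under assumptions: OldCustomer and NewCustomer are hypothetical stand-ins for the class in its old and new package (nested here so the example fits in one file), and it covers the reading direction only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class PackageMigrationSketch {

    // Stand-in for the class in its new package (hypothetical).
    static class NewCustomer implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        NewCustomer(String name) { this.name = name; }
    }

    // Stand-in for the retained class in the old package (hypothetical).
    static class OldCustomer implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        OldCustomer(String name) { this.name = name; }

        // Invoked by the serialization runtime after an old instance is read;
        // the returned object is what readObject() hands to the caller.
        private Object readResolve() {
            return new NewCustomer(name);
        }
    }

    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object restored = roundTrip(new OldCustomer("Ada"));
        System.out.println(restored.getClass().getSimpleName());
    }
}
```

The reverse direction (new code writing something the old code can read) would need a matching writeReplace on the new class, which is exactly the extra work the paragraph above warns about.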
Overall, Java serialization is not well suited for object storage. It is primarily targeted at object communication. You might consider switching to an XML-, JSON-, or YAML-based serialization format, which at least you could twiddle between cache and consumer. Such a change would of course be incompatible with the old version of your application, but so, apparently, is the package change you have already performed.
In general, the serialization algorithm does the following:
It writes out the metadata of the class associated with an instance.
It recursively writes out the description of each superclass until it reaches java.lang.Object.
Once it finishes writing the metadata, it writes the actual data associated with the instance, this time starting from the topmost superclass.
It recursively writes the data associated with the instance, from the topmost superclass down to the most-derived class.
During deserialization it attempts to re-instantiate the class, but since the package name has changed it cannot find the class named in the metadata, and it fails.
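You can see the class metadata in the stream for yourself. The sketch below uses illustrative names (StreamMetadataPeek, Point, streamMentions). It serializes a small object and checks that the raw bytes contain the fully-qualified class name, which is why a renamed package cannot be resolved on the reading side.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

class StreamMetadataPeek {

    // Hypothetical payload class.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static byte[] serialize(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Plain ASCII class names appear byte-for-byte in the stream,
    // so a Latin-1 decode is enough to search for them.
    static boolean streamMentions(byte[] stream, String text) {
        return new String(stream, StandardCharsets.ISO_8859_1).contains(text);
    }

    public static void main(String[] args) {
        byte[] stream = serialize(new Point(1, 2));
        System.out.println(streamMentions(stream, Point.class.getName()));
    }
}
```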
It might be worthwhile to look at the following tool: jDeserialize. It does not instantiate any classes described in the stream; instead, it builds up an intermediate representation of the types, instances, and values. Because of this, it can analyze streams without access to the class code that generated them.
You can subclass ObjectInputStream and override readClassDescriptor to intercept class descriptors, and if the package of the class being deserialized differs from yours you can substitute your own. However, it gets more complicated if the class being deserialized contains other classes that also require deserialization.
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;

public class MyObjectInputStream extends ObjectInputStream {
    public MyObjectInputStream(InputStream in) throws IOException {
        super(in);
    }

    @Override
    protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFoundException {
        ObjectStreamClass resultClassDescriptor = super.readClassDescriptor();
        String className = resultClassDescriptor.getName();
        // Map the old fully-qualified name to the relocated class.
        if (className.equals("my.old.packagename.MyClass")) {
            return ObjectStreamClass.lookup(MyClass.class);
        }
        return resultClassDescriptor;
    }
}
To use it:
ObjectInputStream ois = new MyObjectInputStream(stream);
MyClass myObject = (MyClass) ois.readObject();

SerializationVersionId same but class is modified?

I serialize an object and transfer it over the network. My serialized class has a serialVersionUID which I defined myself. Now, in another JVM, I keep the serialVersionUID the same but change some attributes.
What will happen, and why? Will it be possible to deserialize it?
Refer to the Java Object Serialization Specification.
Specifically, these are the changes you are NOT allowed to make:
Deleting fields - If a field is deleted in a class, the stream written will not contain its value. When the stream is read by an earlier class, the value of the field will be set to the default value because no value is available in the stream. However, this default value may adversely impair the ability of the earlier version to fulfill its contract.
Moving classes up or down the hierarchy - This cannot be allowed since the data in the stream appears in the wrong sequence.
Changing a nonstatic field to static or a nontransient field to transient - When relying on default serialization, this change is equivalent to deleting a field from the class. This version of the class will not write that data to the stream, so it will not be available to be read by earlier versions of the class. As when deleting a field, the field of the earlier version will be initialized to the default value, which can cause the class to fail in unexpected ways.
Changing the declared type of a primitive field - Each version of the class writes the data with its declared type. Earlier versions of the class attempting to read the field will fail because the type of the data in the stream does not match the type of the field.
Changing the writeObject or readObject method so that it no longer writes or reads the default field data or changing it so that it attempts to write it or read it when the previous version did not. The default field data must consistently either appear or not appear in the stream.
Changing a class from Serializable to Externalizable or vice versa is an incompatible change since the stream will contain data that is incompatible with the implementation of the available class.
Changing a class from a non-enum type to an enum type or vice versa since the stream will contain data that is incompatible with the implementation of the available class.
Removing either Serializable or Externalizable is an incompatible change since when written it will no longer supply the fields needed by older versions of the class.
Adding the writeReplace or readResolve method to a class is incompatible if the behavior would produce an object that is incompatible with any older version of the class.
What you are allowed to do instead:
Adding fields - When the class being reconstituted has a field that does not occur in the stream, that field in the object will be initialized to the default value for its type. If class-specific initialization is needed, the class may provide a readObject method that can initialize the field to nondefault values.
Adding classes - The stream will contain the type hierarchy of each object in the stream. Comparing this hierarchy in the stream with the current class can detect additional classes. Since there is no information in the stream from which to initialize the object, the class's fields will be initialized to the default values.
Removing classes - Comparing the class hierarchy in the stream with that of the current class can detect that a class has been deleted. In this case, the fields and objects corresponding to that class are read from the stream. Primitive fields are discarded, but the objects referenced by the deleted class are created, since they may be referred to later in the stream. They will be garbage-collected when the stream is garbage-collected or reset.
Adding writeObject/readObject methods - If the version reading the stream has these methods then readObject is expected, as usual, to read the required data written to the stream by the default serialization. It should call defaultReadObject first before reading any optional data. The writeObject method is expected as usual to call defaultWriteObject to write the required data and then may write optional data.
Removing writeObject/readObject methods - If the class reading the stream does not have these methods, the required data will be read by default serialization, and the optional data will be discarded.
Adding java.io.Serializable - This is equivalent to adding types. There will be no values in the stream for this class so its fields will be initialized to default values. The support for subclassing nonserializable classes requires that the class's supertype have a no-arg constructor and the class itself will be initialized to default values. If the no-arg constructor is not available, the InvalidClassException is thrown.
Changing the access to a field - The access modifiers public, package, protected, and private have no effect on the ability of serialization to assign values to the fields.
Changing a field from static to nonstatic or transient to nontransient - When relying on default serialization to compute the serializable fields, this change is equivalent to adding a field to the class. The new field will be written to the stream but earlier classes will ignore the value since serialization will not assign values to static or transient fields.
If you declare a private static final long serialVersionUID for your class, you ensure that changes to the class do not affect its deserialization as long as they are backward compatible. If a change is not backward compatible, you need to increment the serialVersionUID.
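The "adding fields" rule from the list above can be sketched as follows. Since one JVM cannot easily hold two versions of the same class, this example makes an assumption: a transient field stands in for a field that is absent from the stream, and readObject gives it a non-default value. The names AddedFieldSketch, Settings and timeoutSeconds are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class AddedFieldSketch {

    static class Settings implements Serializable {
        private static final long serialVersionUID = 1L; // unchanged across versions

        String host;
        // Assumption: transient simulates a field that older streams do not contain.
        // Field initializers do not run during deserialization.
        transient int timeoutSeconds = 30;

        Settings(String host) { this.host = host; }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();   // reads the fields that are in the stream
            if (timeoutSeconds == 0) {
                timeoutSeconds = 30;  // replace the type default with a sensible value
            }
        }
    }

    static Settings roundTrip(Settings settings) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(settings);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Settings) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Settings restored = roundTrip(new Settings("cache-1"));
        System.out.println(restored.host + " " + restored.timeoutSeconds);
    }
}
```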

When should I change a SerialUID?

I have a bunch of serialized classes. Normally I have generated serial UIDs for all of them, as the Java rules are rather restrictive and recompute the serial number after basically any change. But this led me to a question that I haven't been able to find an answer to on the internet:
When does it make sense to break backwards compatibility and manually change the Serial Version UID in the class?
Section 5.6 of the Java Spec helps here:
http://download.oracle.com/javase/6/docs/platform/serialization/spec/version.html#6678
5.6 Type Changes Affecting Serialization
With these concepts, we can now describe how the design will cope with
the different cases of an evolving class. The cases are described in
terms of a stream written by some version of a class. When the stream
is read back by the same version of the class, there is no loss of
information or functionality. The stream is the only source of
information about the original class. Its class descriptions, while a
subset of the original class description, are sufficient to match up
the data in the stream with the version of the class being
reconstituted.
The descriptions are from the perspective of the stream being read in
order to reconstitute either an earlier or later version of the class.
In the parlance of RPC systems, this is a "receiver makes right"
system. The writer writes its data in the most suitable form and the
receiver must interpret that information to extract the parts it needs
and to fill in the parts that are not available.
5.6.1 Incompatible Changes
Incompatible changes to classes are those changes for which the
guarantee of interoperability cannot be maintained. The incompatible
changes that may occur while evolving a class are:
Deleting fields - If a field is deleted in a class, the stream written will not contain its value. When the stream is read by an
earlier class, the value of the field will be set to the default value
because no value is available in the stream. However, this default
value may adversely impair the ability of the earlier version to
fulfill its contract.
Moving classes up or down the hierarchy - This cannot be allowed since the data in the stream appears in the wrong sequence.
Changing a nonstatic field to static or a nontransient field to transient - When relying on default serialization, this change is
equivalent to deleting a field from the class. This version of the
class will not write that data to the stream, so it will not be
available to be read by earlier versions of the class. As when
deleting a field, the field of the earlier version will be initialized
to the default value, which can cause the class to fail in unexpected
ways.
Changing the declared type of a primitive field - Each version of the class writes the data with its declared type. Earlier versions of
the class attempting to read the field will fail because the type of
the data in the stream does not match the type of the field.
Changing the writeObject or readObject method so that it no longer writes or reads the default field data or changing it so that it
attempts to write it or read it when the previous version did not. The
default field data must consistently either appear or not appear in
the stream.
Changing a class from Serializable to Externalizable or vice versa is an incompatible change since the stream will contain data that is
incompatible with the implementation of the available class.
Changing a class from a non-enum type to an enum type or vice versa since the stream will contain data that is incompatible with the
implementation of the available class.
Removing either Serializable or Externalizable is an incompatible change since when written it will no longer supply the fields needed
by older versions of the class.
Adding the writeReplace or readResolve method to a class is incompatible if the behavior would produce an object that is
incompatible with any older version of the class.
5.6.2 Compatible Changes
The compatible changes to a class are handled as follows:
Adding fields - When the class being reconstituted has a field that does not occur in the stream, that field in the object will be
initialized to the default value for its type. If class-specific
initialization is needed, the class may provide a readObject method
that can initialize the field to nondefault values.
Adding classes - The stream will contain the type hierarchy of each object in the stream. Comparing this hierarchy in the stream with the
current class can detect additional classes. Since there is no
information in the stream from which to initialize the object, the
class fields will be initialized to the default values.
Removing classes - Comparing the class hierarchy in the stream with that of the current class can detect that a class has been deleted. In
this case, the fields and objects corresponding to that class are read
from the stream. Primitive fields are discarded, but the objects
referenced by the deleted class are created, since they may be
referred to later in the stream. They will be garbage-collected when
the stream is garbage-collected or reset.
Adding writeObject/readObject methods - If the version reading the stream has these methods then readObject is expected, as usual, to
read the required data written to the stream by the default
serialization. It should call defaultReadObject first before reading
any optional data. The writeObject method is expected as usual to call
defaultWriteObject to write the required data and then may write
optional data.
Removing writeObject/readObject methods - If the class reading the stream does not have these methods, the required data will be read by
default serialization, and the optional data will be discarded.
Adding java.io.Serializable - This is equivalent to adding types. There will be no values in the stream for this class so its fields
will be initialized to default values. The support for subclassing
nonserializable classes requires that the class supertype have a
no-arg constructor and the class itself will be initialized to default
values. If the no-arg constructor is not available, the
InvalidClassException is thrown.
Changing the access to a field - The access modifiers public, package, protected, and private have no effect on the ability of
serialization to assign values to the fields.
Changing a field from static to nonstatic or transient to nontransient - When relying on default serialization to compute the
serializable fields, this change is equivalent to adding a field to
the class. The new field will be written to the stream but earlier
classes will ignore the value since serialization will not assign
values to static or transient fields.
Never. You should organize yourself so that classes have the same serialVersionUID for their entire lifetime. You should (a) resist serialization-incompatible changes to the class; (b) write your own readObject()/writeObject()/readResolve()/writeReplace() methods so as to preserve the initial serialization format, and define an explicit serialVersionUID right at the beginning of the class's lifetime. The instant you change this value you have an enormous headache on your hands. Plan to avoid it.
From the JavaDoc of the Serializable interface:
The serialization runtime associates with each serializable class a version number, called a serialVersionUID, which is used during deserialization to verify that the sender and receiver of a serialized object have loaded classes for that object that are compatible with respect to serialization.
I think this is a good hint for answering your question: as soon as you change the class in a way that affects serialization (like adding, removing, or changing serialized class members), you really should change the value of serialVersionUID.
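If you want to check which version number the runtime actually associates with a class, ObjectStreamClass can tell you. A small sketch with hypothetical classes (VersionIdLookup, Invoice, Receipt): Invoice declares its serialVersionUID explicitly, while Receipt leaves it to the runtime to compute one from the class's structure.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class VersionIdLookup {

    // Explicit version number: stays the same however the class evolves.
    static class Invoice implements Serializable {
        private static final long serialVersionUID = 42L;
        int amount;
    }

    // No explicit version number: the runtime derives one from the class structure.
    static class Receipt implements Serializable {
        int amount;
    }

    // Returns the serialVersionUID the serialization runtime uses for a serializable class.
    static long uidOf(Class<? extends Serializable> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Invoice.class));
        System.out.println(uidOf(Receipt.class)); // computed value, depends on the class
    }
}
```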

Serialize static attributes in Java

What happens if I try to serialize an attribute which is static?
Thanks.
From this article:
Tip 1: Handling Static Variables
Java classes often hold some
globally relevant value in a static
class variable. We won't enter into
the long history of the debate over
the propriety of global variables -
let's just say that programmers
continue to find them useful and the
alternatives suggested by purists
aren't always practical.
For static variables that are
initialized when declared,
serialization doesn't present any
special problems. The first time the
class is used, the variable in
question will be set to the correct
value.
Some statics can't be initialized this
way. They may, for instance, be set by
a human during the running time of the
program. Let's say we have a static
variable that turns on debugging
output in a class. This variable can
be set on a server by sending it some
message, perhaps from a monitor
program. We'll also imagine that when
the server gets this message, the
operator wants debugging turned on in
all subsequent uses of the class in
the clients that are connected to that
server.
The programmer is now faced with a
difficulty. When the class in question
arrives at the client, the static
variable's value doesn't come with it.
However, it contains the default
static state that's set when the
class's no-argument constructor is
called by writeObject(). How can the
client programs receive the new
correct value?
The programmer could create another
message type and transmit that to the
client; however, this requires a
proliferation of message types,
marring the simplicity that the use of
serialization can achieve in
messaging. The solution we've come up
with is for the class that needs the
static transmitted to include a
"static transporter" inner class. This
class knows about all the static
variables in its outer class that must
be set. It contains a member variable
for each static variable that must be
serialized. StaticTransporter copies
the statics into its member variables
in the writeObject() method of the
class. The readObject() method
"unwraps" this bundle and transmits
the server's settings for the static
variables to the client. Since it's an
inner class, it'll be able to write to
the outer class's static variables,
regardless of the level of privacy
with which they were declared.
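The article does not show code for this, so here is my own rough sketch of the "static transporter" idea as I read it. All names are invented (StaticTransportSketch, debugEnabled, send, receive), and a static nested class is used instead of an inner class to keep the example small.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class StaticTransportSketch {

    // The static state that default serialization would never carry.
    static boolean debugEnabled = false;

    static class StaticTransporter implements Serializable {
        private static final long serialVersionUID = 1L;
        private boolean debugEnabledCopy;

        private void writeObject(ObjectOutputStream out) throws IOException {
            debugEnabledCopy = StaticTransportSketch.debugEnabled; // snapshot the static
            out.defaultWriteObject();
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            StaticTransportSketch.debugEnabled = debugEnabledCopy; // apply on the receiver
        }
    }

    // "Server" side: serialize a transporter carrying the current static value.
    static byte[] send() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new StaticTransporter());
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // "Client" side: reading the transporter updates the static as a side effect.
    static void receive(byte[] wire) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(wire))) {
            in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        debugEnabled = true;
        byte[] wire = send();
        debugEnabled = false;   // pretend we are now a client with the default value
        receive(wire);
        System.out.println(debugEnabled);
    }
}
```

Note that readObject mutating global state is a deliberate side effect here. Whether that is acceptable in your application is a design decision, not something serialization requires.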
And from another article:
Static or transient data
However, this "ease" is not true in
all cases. As we shall see,
serialization is not so easily applied
to classes with static or transient
data members. Only data associated
with a specific instance of a class is
serialized, therefore static data,
that is, data associated with a class
as opposed to an instance, is not
serialized automatically. To serialize
data stored in a static variable one
must provide class-specific
serialization.
Similarly, some classes may define
data members to use as scratch
variables. Serializing these data
members may be unnecessary. Some
examples of transient data include
runtime statistics or hash table
mapping references. These data should
be marked with the transient modifier
to avoid serialization. Transient, by
definition, is used to designate data
members that the programmer does not
want or need to be serialized. See
Java in a Nutshell, page 174: mouse
position, preferred size, file handles
(machine specific (native code)).
When writing code, if something is
declared transient, this signals
(to the programmer) that special
code for serialization may be
needed later.
To serialize an object, you create
some sort of OutputStream object and
then wrap it inside an
ObjectOutputStream object. At this
point you only need to call
writeObject() and your object is
magically serialized and sent to the
OutputStream. To reverse the process,
you wrap an InputStream inside an
ObjectInputStream and call
readObject(). What comes back is, as
usual, a handle to an upcast Object,
so you must downcast to set things
straight. If you need to dynamically
query the type of the object, you can
use the getClass method. Specifically
dk.getClass().getName() returns the name
of the class that dk is an instance
of. I.e., this asks the object for the
name of its corresponding class
object. (Hmmm, true, but what about
syntax? I still need to know what it
is to declare it...too bad.) (C++ can
do this in one operation (dynamic_cast
(gives null if wrong type)); Java can
use the instanceof operator to check if it
is what I think (see Core Java, Ch5
Inheritance, Casting section).)
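The write/read/downcast sequence described in that quote looks roughly like this. It is a minimal sketch using in-memory streams, and the names BasicRoundTrip and roundTrip are mine.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BasicRoundTrip {

    // Wraps an OutputStream in an ObjectOutputStream, writes the object,
    // then reverses the process with an ObjectInputStream.
    static Object roundTrip(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject(); // comes back typed as Object
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>(Arrays.asList("a", "b"));
        Object restored = roundTrip(original);

        // Ask the object for its runtime class name.
        System.out.println(restored.getClass().getName());

        // Check the type before downcasting.
        if (restored instanceof List) {
            List<?> list = (List<?>) restored;
            System.out.println(list.size());
        }
    }
}
```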
You can write the value of a static variable to the stream yourself, but default serialization skips static variables, and serializing them serves little purpose.
This is because static variables are not bound to any object.
We serialize objects to store them so they can be retrieved later.
Transient variables are likewise not serialized by default.
You can serialize the value of a static variable / attribute. But strictly speaking, you don't serialize a variable or attribute in its own right, whether it is class level, instance level, or local to a method.
Normally the instance level attributes of an object are serialized as part of the parent object; i.e. the object that they are attributes of. If you translate that to class level attributes, then the notional parent is the class. While there is a runtime object that denotes this class (i.e. the java.lang.Class returned by this.getClass()), serializing that object writes only a symbolic reference to the class, not the values of its static fields. So from that perspective, a class level (static) attribute is not serializable.
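To see the default behaviour for yourself, here is a small sketch with invented names (SkippedFieldsDemo, Counter). It serializes an object, changes the static afterwards, and deserializes: the instance field comes back, the transient field comes back as its type default, and the static simply keeps whatever value the class currently has.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SkippedFieldsDemo {

    static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        static int instancesCreated; // class-level: not part of any instance's data
        transient int scratch;       // explicitly excluded from the stream
        int value;                   // ordinary instance field: serialized
    }

    static byte[] serialize(Counter counter) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(counter);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Counter deserialize(byte[] stream) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stream))) {
            return (Counter) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Counter counter = new Counter();
        counter.value = 7;
        counter.scratch = 9;
        Counter.instancesCreated = 5;

        byte[] stream = serialize(counter);
        Counter.instancesCreated = 99; // changed after writing

        Counter restored = deserialize(stream);
        System.out.println(restored.value + " " + restored.scratch + " " + Counter.instancesCreated);
    }
}
```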
