I'm trying to implement an equivalent of String.intern(), but for other objects.
My goal is the following:
I have an object A which I will serialize and then deserialize.
If there is another reference to A somewhere, I want the result of the deserialization to be the same reference.
Here is one example of what I would expect.
MyObject A = new MyObject();
A.data1 = 1;
A.data2 = 2;
byte[] serialized = serialize(A);
A.data1 = 3;
MyObject B = deserialize(serialized); // B!=A and B.data1=1, B.data2=2
MyObject C = B.intern(); // Here we should have C == A. Consequently C.data1=3 AND C.data2=2
Here is my current implementation (the MyObject class extends InternableObject):
public abstract class InternableObject {
private static final AtomicLong maxObjectId = new AtomicLong();
private static final Map<Long, InternableObject> dataMap = new ConcurrentHashMap<>();
private final long objectId;
public InternableObject() {
this.objectId = maxObjectId.incrementAndGet();
dataMap.put(this.objectId, this);
}
@Override
protected void finalize() throws Throwable {
super.finalize();
dataMap.remove(this.objectId);
}
public final InternableObject intern() {
return intern(this);
}
public static InternableObject intern(InternableObject o) {
InternableObject r = dataMap.get(o.objectId);
if (r == null) {
throw new IllegalStateException();
} else {
return r;
}
}
}
My unit test (which fails):
private static class MyData extends InternableObject implements Serializable {
public int data;
public MyData(int data) {
this.data = data;
}
}
@Test
public void testIntern() throws Exception {
MyData data1 = new MyData(7);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(baos);
oos.writeObject(data1);
oos.flush();
baos.flush();
oos.close();
baos.close();
ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
ObjectInputStream ois = new ObjectInputStream(bais);
MyData data2 = (MyData) ois.readObject();
Assert.assertTrue(data1 == data2.intern()); // Fails here
}
The failure is due to the fact that, when deserializing, the no-arg constructor of InternableObject is called (it does not implement Serializable, so it is the closest non-serializable superclass), and thus objectId will be 2, even though the serialized data contains "1".
Any idea how to solve this particular problem, or another approach to handle the higher-level problem?
Thanks guys
Do not use the constructor to create instances. Use a factory method that first checks whether a matching instance already exists, and only creates a new one if it doesn't.
To get serialization to cooperate, your class will need to make use of readResolve() / writeReplace(). http://docs.oracle.com/javase/7/docs/platform/serialization/spec/serial-arch.html#4539
The way you implemented your constructor, you're leaking a this reference during construction, which can lead to very hard-to-track-down problems. Also, your instance map isn't protected by any locks, so it's not thread-safe.
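A minimal sketch of how the factory method and readResolve() could fit together, written against the question's MyObject directly (the newInstance factory and INSTANCES map are illustrative names, not part of the original code):
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class MyObject implements Serializable {
    // Note: entries are never evicted in this sketch.
    private static final Map<Long, MyObject> INSTANCES = new ConcurrentHashMap<>();
    private static final AtomicLong NEXT_ID = new AtomicLong();

    private final long objectId;
    public int data1;
    public int data2;

    private MyObject(long objectId) {
        this.objectId = objectId;
    }

    // Factory method: registers the instance only after construction has
    // completed, so no reference leaks out of the constructor.
    public static MyObject newInstance() {
        MyObject o = new MyObject(NEXT_ID.incrementAndGet());
        INSTANCES.put(o.objectId, o);
        return o;
    }

    // Invoked by ObjectInputStream after the fields have been read:
    // swap the freshly deserialized copy for the canonical instance.
    private Object readResolve() throws ObjectStreamException {
        MyObject canonical = INSTANCES.get(objectId);
        return canonical != null ? canonical : this;
    }
}
With readResolve() in place, readObject() already hands back the canonical instance, so the explicit intern() call becomes unnecessary.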
Typically intern() forms an aspect of its own, and maybe should not be realized as a base class, since that could be too restrictive in a more complex constellation.
There are two aspects:
1. Sharing the "same" object.
Internalizing an object only gives a profit when several objects can be "internalized" to the same object. So I think that an InternableObject with a new sequential number is not really adequate. More important is that the class defines a fitting equals and hashCode.
Then you can use a Map<Object, Object> that maps every object to its canonical instance:
public class InternMap {
    private final Map<Object, Object> identityMap = new HashMap<>();

    public <I extends Internalizable<?>> Object intern(I x) {
        Object first = identityMap.get(x);
        if (first == null) {
            first = x;
            identityMap.put(x, x);
        }
        return first;
    }
}
InternMap could be used for any class, but above we restrict it to Internalizable things.
2. Replacing a dynamically created non-shared object with its .intern().
Which in Java 8 could be realized with a default method in an interface:
interface Internalizable<T> {
    public static final InternMap interns = new InternMap();

    public default T intern(Class<T> klazz) {
        return klazz.cast(interns.intern(this));
    }
}

class C implements Internalizable<C> { ... }

C x = new C();
x = x.intern(C.class);
The Class<T> parameter is needed because of type erasure. Concurrency is disregarded here.
Prior to Java 8, just use an empty Internalizable marker interface and a static InternMap, as sketched below.
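A minimal sketch of that pre-Java-8 variant (the Interns holder class and its synchronized method are illustrative names, not part of the answer above; equals/hashCode must come from the interned class):
import java.util.HashMap;
import java.util.Map;

// Empty marker interface: tags classes whose instances may be interned.
interface Internalizable {
}

final class Interns {
    private static final Map<Object, Object> MAP = new HashMap<Object, Object>();

    // Returns the canonical instance equal to x, registering x if it is the first.
    static synchronized <T extends Internalizable> T intern(T x) {
        @SuppressWarnings("unchecked")
        T first = (T) MAP.get(x);
        if (first == null) {
            MAP.put(x, x);
            first = x;
        }
        return first;
    }
}
Usage would then be C canonical = Interns.intern(new C("a1"));, with C implementing Internalizable and defining equals/hashCode on its identifier.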
Related
In Java, given
Class c = ...
We can make an object of this class by first obtaining a constructor. For example, if we want to use the default (no parameters) constructor,
c.getConstructor().newInstance()
This seems straightforward, and seems to match how things are done in Java source code.
But, curiously, it is not how things are done in JVM byte code. There, creating an object is done in two steps: new to actually create the object, then invokespecial to call an appropriate constructor.
Is there a way to bypass the constructor when what you have is a Class (with the actual class to be determined at runtime)? If not, was the rationale for the difference between how this works, and how the byte code works, ever documented?
You want to allocate an uninitialized object.
You can try the Objenesis library.
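For example, a minimal sketch with Objenesis (assuming the org.objenesis dependency is on the classpath; TestClass is just a stand-in):
import org.objenesis.Objenesis;
import org.objenesis.ObjenesisStd;

public class ObjenesisDemo {
    static class TestClass {
        int val = 1; // field initializers run as part of a constructor
    }

    public static void main(String[] args) {
        Objenesis objenesis = new ObjenesisStd();
        // Bypasses every constructor, so the field initializer never runs.
        TestClass obj = objenesis.newInstance(TestClass.class);
        System.out.println(obj.val); // prints 0, not 1
    }
}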
Otherwise, you can create an object by serialization. This is a widely used way to create an object without invoking its constructor.
public class Serialization {
static class TestSerialization implements Serializable {
int val = 0;
public TestSerialization() {
System.out.println("constructor");
val = 1;
}
@Override
public String toString() {
return "val is " + val;
}
}
public static void main(String[] args) throws IOException, ClassNotFoundException {
TestSerialization testSerialization = new TestSerialization();
// constructor
// val is 1
System.out.println(testSerialization);
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(bos);
oos.writeObject(testSerialization);
oos.close();
ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
Object obj = ois.readObject();
// val is 1
System.out.println(obj);
}
}
Going one step further, you can use ReflectionFactory to create an empty, uninitialized object.
public class Main {
static class TestClass {
public int val = 0;
public TestClass() {
val = 1;
}
@Override
public String toString() {
return "value is " + val;
}
}
public static void main(String[] args) throws Exception {
// by constructor
TestClass obj = new TestClass();
// value is 1
System.out.println(obj);
// by reflect
Constructor<TestClass> constructor = TestClass.class.getConstructor();
obj = constructor.newInstance();
// value is 1
System.out.println(obj);
// by ReflectionFactory
ReflectionFactory reflectionFactory = ReflectionFactory.getReflectionFactory();
Constructor<Object> objectConstructor = Object.class.getDeclaredConstructor();
Constructor<?> targetConstructor = reflectionFactory.newConstructorForSerialization(TestClass.class, objectConstructor);
obj = (TestClass) targetConstructor.newInstance();
// value is 0
System.out.println(obj);
}
}
Why do I get the same hash code before serialization and after deserialization of an object, without using a readResolve() method in Java?
Here is my class
public class SerializedSingletonClass implements Serializable{
private static final long serialVersionUID = 18989987986l;
private SerializedSingletonClass() {}
private static class InstanceHelper {
private static SerializedSingletonClass obj = new SerializedSingletonClass();
}
public static SerializedSingletonClass getInstance(){
return InstanceHelper.obj;
}
}
Test Class --
public class TestSingleton {
public static void main(String[] args) throws FileNotFoundException, IOException, ClassNotFoundException {
    // Test serialization for the singleton pattern
    SerializedSingletonClass instance1 = SerializedSingletonClass.getInstance();
    ObjectOutputStream obs = new ObjectOutputStream(new FileOutputStream("filename1.ser"));
    obs.writeObject(instance1);
    obs.close();
    ObjectInputStream objInputStream = new ObjectInputStream(new FileInputStream("filename1.ser"));
    SerializedSingletonClass instance2 = (SerializedSingletonClass) objInputStream.readObject();
    objInputStream.close();
    System.out.println("instance1==" + instance1.getClass().hashCode());
    System.out.println("instance2==" + instance2.getClass().hashCode());
}
}
Output ::
instance1==1175576547
instance2==1175576547
Your objects are instances of the same class, SerializedSingletonClass, and you're getting the hashCode from the class, not from the instance. instance1.getClass() evaluates to the same thing as instance2.getClass(), so of course they produce the same hashCode.
To find the hash codes of the objects themselves, use instance1.hashCode() and instance2.hashCode().
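For instance, at the end of the test's main method (a sketch; without readResolve() the deserialized object is a distinct instance, so the default identity hash codes will normally differ):
System.out.println("instance1==" + instance1.hashCode());
System.out.println("instance2==" + instance2.hashCode());
System.out.println(instance1 == instance2); // false without readResolve()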
I am an experienced programmer but a Java beginner. I have a benchmarking method that accepts a parameter of type Map and performs some tests on it. It can be invoked on a HashMap, Hashtable, IdentityHashMap, TreeMap etc because these all implement Map. They also all implement Cloneable, but Eclipse tells me I am not allowed to invoke the clone() method.
private static double[] timeMapRemoves(Map<String,Integer> map,
Collection<String> data,
int reps) {
Map<String,Integer> map_clone = map.clone(); // OOPS -- "clone not accessible"
So I delve into the Oracle website and I come up with a solution of sorts
Map<String,Integer> map_clone = null;
Method clone = null;
try {
clone = map.getClass().getMethod("clone");
map_clone = (Map<String, Integer>) clone.invoke(map);
} catch (NoSuchMethodException | SecurityException
| IllegalAccessException | IllegalArgumentException
| InvocationTargetException e) {
e.printStackTrace();
}
I feel that I may, like Drool Rockworm, have delved too deep and missed the canonical solution.
clone() is protected, which means it is only accessible from a subclass or from within the very same package.
Reiteration from the comments:
It all depends on the context from which it is called, and if that context is the same type then you can call the protected method. Here the context is a different type so it cannot call it.
When you change the parameter to HashMap<K, V> for example you can call it because HashMap overrides the clone() method with a public modifier. So in short: you can't do that with a simple Map<K, V> declaration.
This means a situation like this will work:
class X {
    public X() throws CloneNotSupportedException {
        X newX = (X) new X().clone(); // protected clone() is accessible from within X
    }
}
but this won't:
class X {
public X(){
String newString = "hello".clone();
}
}
But then again, this will:
class X implements Map<String, String> {
    public X() {
        Map<String, String> map = (Map<String, String>) new HashMap<String, String>().clone();
    }
}
And so will this:
private static double[] timeMapRemoves(HashMap<String,Integer> map,
Collection<String> data,
int reps) {
Map<String, Integer> someMap = (Map<String, Integer>) map.clone();
}
Notice how I changed the parameter to HashMap<String,Integer>.
The reason for why this works is very simple: HashMap defines its own clone() method.
public Object clone() {
HashMap<K,V> result = null;
try {
result = (HashMap<K,V>)super.clone();
} catch (CloneNotSupportedException e) {
// assert false;
}
result.table = new Entry[table.length];
result.entrySet = null;
result.modCount = 0;
result.size = 0;
result.init();
result.putAllForCreate(this);
return result;
}
Consider the three following classes:
EntityTransformer contains a map associating an Entity with a String
Entity is an object containing an ID (used by equals/hashCode), which contains a reference to an EntityTransformer (note the circular dependency)
Wrapper contains an EntityTransformer, and maintains a Map associating each Entity's identifier with the corresponding Entity object.
The following code will create an EntityTransformer and a Wrapper, add two entities to the Wrapper, serialize it, deserialize it, and test for the presence of the two entities:
public static void main(String[] args)
throws Exception {
EntityTransformer et = new EntityTransformer();
Wrapper wr = new Wrapper(et);
Entity a1 = wr.addEntity("a1"); // a1 and a2 are created internally by the Wrapper
Entity a2 = wr.addEntity("a2");
byte[] bs = object2Bytes(wr);
wr = (Wrapper) bytes2Object(bs);
System.out.println(wr.et.map);
System.out.println(wr.et.map.containsKey(a1));
System.out.println(wr.et.map.containsKey(a2));
}
The output is:
{a1=whatever-a1, a2=whatever-a2}
false
true
So basically the serialization failed somehow, as the map should contain both entities as keys. I suspect the cyclic dependency between Entity and EntityTransformer, and indeed if I make the EntityTransformer instance variable of Entity static, it works.
Question 1: given that I'm stuck with this cyclic dependency, how could I overcome this issue?
Another very weird thing: if I remove the Map maintaining an association between identifiers and Entities in the Wrapper, everything works fine... ??
Question 2: does someone understand what's going on here?
Below is fully functional code if you want to test it:
Thanks in advance for your help :)
public class SerializeTest {
public static class Entity
implements Serializable
{
private EntityTransformer em;
private String id;
Entity(String id, EntityTransformer em) {
this.id = id;
this.em = em;
}
@Override
public boolean equals(Object obj) {
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
final Entity other = (Entity) obj;
if ((this.id == null) ? (other.id != null) : !this.id.equals(
other.id)) {
return false;
}
return true;
}
@Override
public int hashCode() {
int hash = 3;
hash = 97 * hash + (this.id != null ? this.id.hashCode() : 0);
return hash;
}
public String toString() {
return id;
}
}
public static class EntityTransformer
implements Serializable
{
Map<Entity, String> map = new HashMap<Entity, String>();
}
public static class Wrapper
implements Serializable
{
EntityTransformer et;
Map<String, Entity> eMap;
public Wrapper(EntityTransformer b) {
this.et = b;
this.eMap = new HashMap<String, Entity>();
}
public Entity addEntity(String id) {
Entity e = new Entity(id, et);
et.map.put(e, "whatever-" + id);
eMap.put(id, e);
return e;
}
}
public static void main(String[] args)
throws Exception {
EntityTransformer et = new EntityTransformer();
Wrapper wr = new Wrapper(et);
Entity a1 = wr.addEntity("a1"); // a1 and a2 are created internally by the Wrapper
Entity a2 = wr.addEntity("a2");
byte[] bs = object2Bytes(wr);
wr = (Wrapper) bytes2Object(bs);
System.out.println(wr.et.map);
System.out.println(wr.et.map.containsKey(a1));
System.out.println(wr.et.map.containsKey(a2));
}
public static Object bytes2Object(byte[] bytes)
throws IOException, ClassNotFoundException {
ObjectInputStream oi = null;
Object o = null;
try {
oi = new ObjectInputStream(new ByteArrayInputStream(bytes));
o = oi.readObject();
}
catch (IOException io) {
throw io;
}
catch (ClassNotFoundException cne) {
throw cne;
}
finally {
if (oi != null) {
oi.close();
}
}
return o;
}
public static byte[] object2Bytes(Object o)
throws IOException {
ByteArrayOutputStream baos = null;
ObjectOutputStream oo = null;
byte[] bytes = null;
try {
baos = new ByteArrayOutputStream();
oo = new ObjectOutputStream(baos);
oo.writeObject(o);
bytes = baos.toByteArray();
}
catch (IOException ex) {
throw ex;
}
finally {
if (oo != null) {
oo.close();
}
}
return bytes;
}
}
EDIT
There is a good summary of what is potentially in play for this issue:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4957674
The problem is that HashMap's readObject() implementation, in order
to re-hash the map, invokes the hashCode() method of some of its keys,
regardless of whether those keys have been fully deserialized.
If a key contains (directly or indirectly) a circular reference to the
map, the following order of execution is possible during
deserialization --- if the key was written to the object stream before
the hashmap:
1. Instantiate the key
2. Deserialize the key's attributes:
2a. Deserialize the HashMap (which was directly or indirectly pointed to by the key)
2a-1. Instantiate the HashMap
2a-2. Read keys and values
2a-3. Invoke hashCode() on the keys to re-hash the map
2b. Deserialize the key's remaining attributes
Since 2a-3 is executed before 2b, hashCode() may return the wrong
answer, because the key's attributes have not yet been fully
deserialized.
That still does not fully explain why the issue goes away if the HashMap in Wrapper is removed, or moved into the EntityTransformer class.
This is a problem with circular initialisation. Whilst Java Serialisation can handle arbitrary cycles, the initialisation has to happen in some order.
There's a similar problem in AWT where Component (Entity) contains a reference to its parent Container (EntityTransformer). What AWT does is to make the parent reference in Component transient.
transient Container parent;
So now each Component can complete its initialisation before Container.readObject adds it back in:
for (Component comp : component) {
    comp.parent = this;
}
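Applied to the classes from the question (nested in SerializeTest, imports as there), a minimal sketch of the same technique might look like this; the readObject() restore loop is an assumption about one way to re-establish the back-reference, not code from the question:
public static class Entity implements Serializable {
    // transient: the back-reference is not serialized, so the Entity's own
    // state (id) is complete before the map gets re-hashed
    transient EntityTransformer em;
    String id;

    Entity(String id, EntityTransformer em) {
        this.id = id;
        this.em = em;
    }
    // equals()/hashCode() on id, exactly as in the question
}

public static class EntityTransformer implements Serializable {
    Map<Entity, String> map = new HashMap<Entity, String>();

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // Restore the back-references that 'transient' skipped.
        for (Entity e : map.keySet()) {
            e.em = this;
        }
    }
}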
Even stranger, if you do
Map<Entity, String> map = new HashMap<>(wr.et.map);
System.out.println(map.containsKey(a1));
System.out.println(map.containsKey(a2));
After serializing and de-serializing, you will get the correct output, because the copy constructor re-hashes the keys after they have been fully deserialized.
Also:
for( Entity a : wr.et.map.keySet() ){
System.out.println(a.toString());
System.out.println(wr.et.map.containsKey(a));
}
Gives:
a1
false
a2
true
I think you found a bug. Most likely, serialization broke the hashing somehow.
In fact, I think you might have found this bug.
Can you override the serialization to transform the reference into a key value before serializing, and then transform it back on deserialization?
It seems like it would be pretty trivial to find the hash key of the EntityTransformer when serializing and use that value instead (maybe provide a value in the structure called parentKey), and null out the reference. Then, when deserializing, you find the EntityTransformer associated with that key value and restore the reference.
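A rough sketch of that idea, slotted into the question's Entity class (EntityTransformerRegistry and its keyOf/lookup methods are hypothetical; something would have to keep transformers addressable by a stable key):
public static class Entity implements Serializable {
    private transient EntityTransformer em; // nulled out of the stream
    private String id;
    private long parentKey; // stable key standing in for the reference

    private void writeObject(ObjectOutputStream out) throws IOException {
        parentKey = EntityTransformerRegistry.keyOf(em); // hypothetical registry
        out.defaultWriteObject();
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        em = EntityTransformerRegistry.lookup(parentKey); // hypothetical registry
    }
}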
Access to private transient object fields from any method in the class must be guarded by some code. What is the best practice?
private transient MyClass object = null;
An internal getter method:
private MyClass getObject() {
if (object == null)
object = new MyClass();
return object;
}
// use...
getObject().someWhat();
or "make sure" method:
private void checkObject() {
if (object == null)
object = new MyClass();
}
// use...
checkObject();
object.someWhat();
or something clever, more safe or more powerful?
Transient fields are lost at serialization, but you need them again after deserialization, so you have to restore them in the readObject method...
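A minimal sketch of that pattern, inside the owning class and assuming MyClass has a cheap no-arg constructor:
private void readObject(ObjectInputStream in)
        throws IOException, ClassNotFoundException {
    in.defaultReadObject();   // restores all non-transient fields
    object = new MyClass();   // re-create the transient field by hand
}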
I have to post a new answer about transient because it's too long for a comment. The following code prints:
Before: HELLO FOO BAR
After: HELLO null null
public class Test {
public static void main(String[] args) throws Exception {
final Foo foo1 = new Foo();
System.out.println("Before:\t" + foo1.getValue1() + "\t" + foo1.getValue2() + "\t" + foo1.getValue3());
final File tempFile = File.createTempFile("test", null);
// to arrange for a file created by this method to be deleted automatically
tempFile.deleteOnExit();
final FileOutputStream fos = new FileOutputStream(tempFile);
final ObjectOutputStream oos = new ObjectOutputStream(fos);
oos.writeObject(foo1);
oos.close();
final FileInputStream fis = new FileInputStream(tempFile);
final ObjectInputStream ois = new ObjectInputStream(fis);
final Foo foo2 = (Foo) ois.readObject();
ois.close();
System.out.println("After:\t" + foo2.getValue1() + "\t" + foo2.getValue2() + "\t" + foo2.getValue3());
}
static class Foo implements Serializable {
private static final long serialVersionUID = 1L;
private String value1 = "HELLO";
private transient String value2 = "FOO";
private transient String value3;
public Foo() {
super();
this.value3 = "BAR";
}
public String getValue1() {
return this.value1;
}
public String getValue2() {
return this.value2;
}
public String getValue3() {
return this.value3;
}
}
}
The safest (and most common) way would be either initializing it directly:
private transient MyClass object = new MyClass();
or using the constructor
public ParentClass() {
this.object = new MyClass();
}
Lazy loading in getters (as you did in your example) is only useful if the constructor and/or initialization blocks of MyClass do fairly expensive work, but it is not thread-safe.
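If the lazy getter must be thread-safe, the minimal fix is to synchronize it (a sketch of the getter from the question):
private synchronized MyClass getObject() {
    if (object == null) {
        object = new MyClass();
    }
    return object;
}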
The transient modifier doesn't make any difference. It only skips the field whenever the object is about to be serialized.
Edit: not relevant anymore. As proven by someone else, transient fields indeed don't get reinitialized on deserialization (field initializers only take effect again when the fields are declared static, since static fields are not part of the serialized state). I'd go ahead with the lazy loading approach, or reset them through their setters directly after deserialization.