Garbage collector behavior with two WeakHashMaps - Java

I have a cache implemented with a WeakHashMap, like this:
private static WeakHashMap<Object, WeakReference<Object>> objects = new WeakHashMap<>();
I have an instance of class City:
City c = new City();
I now add this instance to my map like this:
objects.put(c, new WeakReference<Object>(c));
According to the WeakHashMap implementation, if a key no longer has any strong references to it, its entry is eventually removed from the map.
So, if my object 'c' is no longer used anywhere in the program, it will be removed from the 'objects' map.
So far, so good.
But what happens if I have two maps?
private static WeakHashMap<Object, WeakReference<Object>> objects1 = new WeakHashMap<>();
private static WeakHashMap<Object, WeakReference<Object>> objects2 = new WeakHashMap<>();
City c = new City();
objects1.put(c, new WeakReference<Object>(c));
objects2.put(c, new WeakReference<Object>(c));
Will GC collect the object 'c' in this case?

Take a piece of paper, draw a graph with the objects as vertices, references as edges.
If you can't find a path of strong edges from a GC root (e.g. a static field or a local variable on the stack) to the object in question, then it is not strongly reachable and thus eligible for GC.
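As a minimal sketch of the two-map scenario (illustrative only, not part of the original answer; a plain Object stands in for the City instance): once the local variable no longer refers to the key, there is no strong path from any GC root, so both maps can drop their entries.

import java.lang.ref.WeakReference;
import java.util.WeakHashMap;

public class TwoWeakMapsDemo {
    public static void main(String[] args) throws InterruptedException {
        WeakHashMap<Object, WeakReference<Object>> objects1 = new WeakHashMap<>();
        WeakHashMap<Object, WeakReference<Object>> objects2 = new WeakHashMap<>();

        Object c = new Object();                 // stands in for the City instance
        objects1.put(c, new WeakReference<>(c));
        objects2.put(c, new WeakReference<>(c));

        c = null;      // drop the only strong reference
        System.gc();   // only a hint; collection is not guaranteed
        Thread.sleep(100);

        // Typically both maps are empty now, since the key was only weakly reachable.
        System.out.println(objects1.size() + " / " + objects2.size());
    }
}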

It will certainly be collected (when GC runs), because it is only referenced by WeakReferences and not by any strong reference, no matter how many WeakReferences point to it.
You can read about WeakReference here: WeakReference
Here is an example to demonstrate it:
import java.util.WeakHashMap;

public class WeakHashMapExample {

    // strongly referenced key to prevent GC from collecting it
    private static final Key stronglyRefKey1 = new Key(1);

    public static void main(String[] args) throws InterruptedException {
        WeakHashMap<Key, String> cache1 = new WeakHashMap<>();
        WeakHashMap<Key, String> cache2 = new WeakHashMap<>();

        // adding the same keys to both maps
        Key key2 = new Key(2);
        cache1.put(stronglyRefKey1, "val 1");
        cache1.put(key2, "val 2");
        cache2.put(stronglyRefKey1, "val 1");
        cache2.put(key2, "val 2");

        key2 = null; // remove the strong reference

        // may or may not still print the Key(2) entry, depending on whether GC has run at this point
        System.out.println("cache1 = " + cache1);
        System.out.println("cache2 = " + cache2);

        // request a GC so the weak references get cleared
        System.gc();

        // after GC has run, Key(2) is removed from both maps because it is
        // only referenced by the weak references inside the WeakHashMaps
        System.out.println("cache1 = " + cache1);
        System.out.println("cache2 = " + cache2);
    }

    private static class Key {
        int value;

        private Key(int value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Key key = (Key) o;
            return value == key.value;
        }

        @Override
        public int hashCode() {
            return value;
        }

        @Override
        public String toString() {
            return "Key{value=" + value + '}';
        }
    }
}

Weak is weak; it doesn't become strong just because it is joined by another weak reference.
The object will be garbage collected if there is no other strong reference to it. No doubt.


How to get the Java object itself when it's about to be collected

How can we execute a piece of code using the object (its state is needed) before it gets collected, if we don't have control over its source (we can't enforce implementing some interface or a finally block)?
Java reference types allow us to access an object as long as someone else keeps it strongly reachable, and if we use reference queues we can also be notified once the object has been collected. Unless my understanding is wrong, that's all you can do with reference types: no matter which one you use, at any point the object is either strongly reachable or it's gone and you have null.
All I really need is a way to get notified when a specific object is about to be collected.
There is a reason why the Reference API doesn't allow you to retrieve the collected object: making a collected object reachable again, as happens with the finalize() method, is exactly what is not intended.
The standard approach is to create subclasses of the reference types to store the information associated with the referent, e.g. everything necessary to perform the cleanup action, within the specialized reference object. Of course, this information must not include strong references to the referent itself.
private static final ReferenceQueue<Integer> QUEUE = new ReferenceQueue<>();

static class IntegerPhantomReference extends PhantomReference<Integer> {
    final int value;

    public IntegerPhantomReference(Integer ref) {
        super(ref, QUEUE);
        value = ref.intValue();
    }

    public String toString() {
        return "Integer[value=" + value + "]";
    }
}

private static final Set<IntegerPhantomReference> REGISTERED = new HashSet<>();

public static void main(String[] args) throws InterruptedException {
    List<Integer> stronglyReferenced = new ArrayList<>();
    for(int i = 0; i < 10; i++) {
        Integer object = new Integer(i);
        stronglyReferenced.add(object);
        REGISTERED.add(new IntegerPhantomReference(object));
    }
    gcAndPoll("initial");
    stronglyReferenced.removeIf(i -> i%2 == 0);
    gcAndPoll("after removing even");
    stronglyReferenced.clear();
    gcAndPoll("after remove all");
    if(REGISTERED.isEmpty()) System.out.println("all objects collected");
}

private static void gcAndPoll(String msg) throws InterruptedException {
    System.out.println(msg);
    System.gc(); Thread.sleep(100);
    for(;;) {
        Reference<?> r = QUEUE.poll();
        if(r == null) break;
        System.out.println("collected " + r);
        REGISTERED.remove(r);
    }
}
Running this prints something like:
initial
after removing even
collected Integer[value=4]
collected Integer[value=8]
collected Integer[value=6]
collected Integer[value=2]
collected Integer[value=0]
after remove all
collected Integer[value=1]
collected Integer[value=5]
collected Integer[value=3]
collected Integer[value=7]
collected Integer[value=9]
all objects collected
For completeness, there is a hack that allows resurrecting a collected object, which will stop working in Java 9.
The documentation of PhantomReference says:
Unlike soft and weak references, phantom references are not automatically cleared by the garbage collector as they are enqueued.
It’s not clear why this has been specified and the get() method of PhantomReference has been overridden to always return null, exactly to disallow taking any benefit from the fact that this reference has not been cleared. Since the purpose of this special behavior is unclear, it has been removed from the specification in Java 9 and these references are automatically cleared like any other.
But for previous versions, it is possible to use reflection with access override to access the referent, to do exactly what the API was not intended to allow. Needless to say, this is just for informational purposes and is strongly discouraged (and, as said, it stops working in Java 9).
private static final ReferenceQueue<Integer> QUEUE = new ReferenceQueue<>();
private static final Set<PhantomReference<Integer>> REGISTERED = new HashSet<>();

public static void main(String[] args)
        throws InterruptedException, IllegalAccessException {
    List<Integer> stronglyReferenced = new ArrayList<>();
    for(int i = 0; i < 10; i++) {
        Integer object = new Integer(i);
        stronglyReferenced.add(object);
        REGISTERED.add(new PhantomReference<>(object, QUEUE));
    }
    gcAndPoll("initial");
    stronglyReferenced.removeIf(i -> i%2 == 0);
    gcAndPoll("after removing even");
    stronglyReferenced.clear();
    gcAndPoll("after remove all");
    if(REGISTERED.isEmpty()) System.out.println("all objects collected");
}

static final Field REFERENT;
static {
    try {
        REFERENT = Reference.class.getDeclaredField("referent");
        REFERENT.setAccessible(true);
    } catch (NoSuchFieldException ex) {
        throw new ExceptionInInitializerError(ex);
    }
}

private static void gcAndPoll(String msg)
        throws InterruptedException, IllegalAccessException {
    System.out.println(msg);
    System.gc();
    Thread.sleep(100);
    for(;;) {
        Reference<?> r = QUEUE.poll();
        if(r == null) break;
        Object o = REFERENT.get(r);
        System.out.println("collected (and now resurrected) " + o);
        REGISTERED.remove(r);
    }
}

Infinite Loop in Hazelcast IMap for compute method

I'm trying to use the Set interface as the value of a Hazelcast IMap instance, and when I run my test I find that it hangs inside the ConcurrentMap#compute method.
Why do I get an infinite loop when I use Hazelcast IMap in this code:
import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.IMap;
import java.io.Serializable;
import java.util.*;
public class Main {
    public static void main(String[] args) {
        IMap<String, HashSet<StringWrapper>> store = Hazelcast.newHazelcastInstance(
                new Config().addMapConfig(new MapConfig("store"))
        ).getMap("store");

        store.compute("user", (k, value) -> {
            HashSet<StringWrapper> newValues = Objects.isNull(value) ? new HashSet<>() : new HashSet<>(value);
            newValues.add(new StringWrapper("user"));
            return newValues;
        });
        store.compute("user", (k, value) -> {
            HashSet<StringWrapper> newValues = Objects.isNull(value) ? new HashSet<>() : new HashSet<>(value);
            newValues.add(new StringWrapper("user"));
            return newValues;
        });

        System.out.println(store.keySet());
    }

    // Data class
    public static class StringWrapper implements Serializable {
        String value;

        public StringWrapper() {}

        public StringWrapper(String value) {
            this.value = value;
        }

        public String getValue() {
            return value;
        }

        public void setValue(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            if (!super.equals(o)) return false;
            StringWrapper value = (StringWrapper) o;
            return Objects.equals(this.value, value.value);
        }

        @Override
        public int hashCode() {
            return Objects.hash(super.hashCode(), value);
        }
    }
}
Hazelcast: 3.9.3
Java: build 1.8.0_161-b12
Operating system: macOS High Sierra 10.13.3
@Alykoff I reproduced the issue based on the above example & an ArrayList version, and it is reported as a GitHub issue: https://github.com/hazelcast/hazelcast/issues/12557.
There are two separate problems:
1 - When using HashSet, the problem is how Java deserializes the HashSet/ArrayList (collections) & how the compute method works. Inside the compute method (since Hazelcast is compiled against Java 6 & there is no compute method to override, the default implementation from ConcurrentMap is called), this block causes the infinite loop:
// replace
if (replace(key, oldValue, newValue)) {
    // replaced as expected.
    return newValue;
}

// some other value replaced old value. try again.
oldValue = get(key);
This replace method calls IMap's replace method. IMap checks whether the current value is equal to the user-supplied value. But because of a Java serialization optimization, the check fails. Please look at the HashSet.readObject method: you'll see that when deserializing a HashSet, since the element count is known, it creates the inner HashMap with a calculated capacity:
// Set the capacity according to the size and load factor ensuring that
// the HashMap is at least 25% full but clamping to maximum capacity.
capacity = (int) Math.min(size * Math.min(1 / loadFactor, 4.0f),
HashMap.MAXIMUM_CAPACITY);
But your HashSet, created without an initial capacity, has the default capacity of 16, while the deserialized one has an initial capacity of 1. This changes the serialized form: index 51 of the byte array contains the current capacity, & it seems the JDK recalculates it based on the size when deserializing the object, to minimize the size.
Please see the example below:
HazelcastInstance hz = Hazelcast.newHazelcastInstance();
IMap<String, Collection<String>> store = hz.getMap("store");
Collection<String> val = new HashSet<>();
val.add("a");
store.put("a", val);
Collection<String> oldVal = store.get("a");
byte[] dataOld = ((HazelcastInstanceProxy) hz).getSerializationService().toBytes(oldVal);
byte[] dataNew = ((HazelcastInstanceProxy) hz).getSerializationService().toBytes(val);
System.out.println(Arrays.equals(dataNew, dataOld));
This code prints false. But if you create the HashSet with an initial capacity of 1, both byte arrays are equal, and in your case you won't get an infinite loop.
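For illustration, a small variation of the snippet above (a sketch that reuses the hz and store variables from that snippet; the expected output follows the explanation above rather than anything verified here):

// Sketch: same comparison, but the HashSet is created with an initial capacity of 1,
// so its serialized form should match the deserialized copy.
Collection<String> val = new HashSet<>(1);
val.add("a");
store.put("a", val);
Collection<String> oldVal = store.get("a");
byte[] dataOld = ((HazelcastInstanceProxy) hz).getSerializationService().toBytes(oldVal);
byte[] dataNew = ((HazelcastInstanceProxy) hz).getSerializationService().toBytes(val);
System.out.println(Arrays.equals(dataNew, dataOld)); // expected to print true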
2 - When using ArrayList, or any other collection, there's another problem, which you pointed out above. Due to how the compute method is implemented in ConcurrentMap, when you assign the old value to newValue & then add a new element, you actually modify oldValue, thus causing the replace method to fail. But when you change the code to new ArrayList<>(value), you're creating a new ArrayList & the value collection is not modified. It's best practice to wrap a collection before using it if you don't want to modify the original one. The same works for HashSet if you create it with capacity 1, because of the first issue I explained.
So in your case, you should use
Collection<String> newValues = Objects.isNull(value) ? new HashSet<>(1) : new HashSet<>(value);
or
Collection<String> newValues = Objects.isNull(value) ? new ArrayList<>() : new ArrayList<>(value);
The HashSet case seems to be a JDK issue rather than an optimization. I don't know whether either of these cases can be solved/fixed in Hazelcast, unless Hazelcast overrides the Hash* collection serialization & overrides the compute method.

Best way to map multiple determinants to a value in Java

I have a requirement in which I need to map multiple determinants to values.
Each set of determinants in a given job execution is guaranteed to be unique. The value to be determined doesn't have to be unique but it probably is.
Depending on the input to the job execution, this could be either one key, or the combination of two keys, or the combination of n keys that will be mapped to a single value. In practice this n will probably be limited to no more than 5, although it is possible it could exceed that.
Each job execution will have a set number of determinants for all inputs (I.e., all inputs will have either 2 determinants, 3 determinants, or n determinants, and will not have a mix).
One key example: foo --> bar
Two keys: foo, bar --> baz
Three keys: foo, bar, baz --> hai
Prior to this, the requirement was that I would only ever map two values to another value. I created an immutable Key class with two member variables and the appropriate override of equals and hashCode.
public class Key {
    String determinant0;
    String determinant1;

    public Key(String d0, String d1) {
        determinant0 = d0;
        determinant1 = d1;
    }
    // ..
}
However, now that I may be dealing with n values, I want to take a look at using a list as the key.
Map<List<String>, String> map = new HashMap<>();
map.put(Arrays.asList("foo", "bar", "baz"), "hai");
String determined = map.get(Arrays.asList("foo", "bar", "baz"));
assert (determined.equals("hai"));
This question reminds me that it is bad to use a mutable object (like a List) as a key in a map. However, in my application, the key is only set once and is never altered. Here is an alternative from this question that forces it to be immutable:
HashMap<List<String>, String> map = new HashMap<>();
map.put(
    // unmodifiable so key cannot change hash code
    Collections.unmodifiableList(Arrays.asList("foo", "bar", "baz")),
    "hai"
);
In addition, I could always make a class like the following to prevent mutations on the list:
public class Key {
    List<String> determinants;

    public Key(List<String> determinants) {
        this.determinants = determinants;
    }

    @Override
    public boolean equals(Object obj) {
        //...
    }

    @Override
    public int hashCode() {
        //...
    }
}
Key key = new Key(Arrays.asList("foo","bar","baz"));
Using a plain array as the key won't work, because an array's equals method only checks for identity:
Map<String[], String> map = new HashMap<String[], String>();
String[] key = new String[]{"foo", "bar", "baz"};
map.put(key, "hai");
System.out.println(map.get(key)); // "hai" - same instance
System.out.println(map.get(new String[]{"foo", "bar", "baz"})); // null - equal contents, different instance
That could be fixed by the following:
public class Key {
    String[] determinants;

    public Key(String... determinants) {
        this.determinants = determinants;
    }

    @Override
    public boolean equals(Object obj) {
        //...
    }

    @Override
    public int hashCode() {
        //...
    }
}
How about concatenating all the determinants together into a string?
public class Key {
    String hash = "";

    public Key(String... determinants) {
        for (String determinant : determinants) {
            hash += determinant + "_";
        }
    }

    @Override
    public boolean equals(Object obj) {
        //...
    }

    @Override
    public int hashCode() {
        //...
    }
}
Which one of these solutions (or another one that I did not propose) is the best suited for these requirements?
As a side note, your question includes too much detail and could have been way shorter. Now to my answer.
I prefer using a wrapper class that completely hides its internal representation. One small optimization you can make is storing the hashCode of your keys to avoid computing it every time. The equals method will be called more rarely (on each collision in the map) and you can't do much about it:
import java.util.Arrays;
import java.util.Objects;

public class Key {
    private String[] determinants;
    private int hashCode;

    public Key(String... determinants) {
        if (determinants == null || determinants.length == 0) {
            throw new IllegalArgumentException("Please provide at least one value");
        }
        this.determinants = determinants;
        this.hashCode = Objects.hash(determinants);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Key)) return false;
        Key that = (Key) o;
        return Arrays.equals(determinants, that.determinants);
    }

    @Override
    public int hashCode() {
        return hashCode;
    }
}
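For completeness, a short usage sketch of the wrapper above as a map key (the values and the demo class name are made up; it assumes the Key class from the answer is available):

import java.util.HashMap;
import java.util.Map;

public class KeyUsageDemo {
    public static void main(String[] args) {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key("foo", "bar", "baz"), "hai");

        // The look-up works because equals/hashCode are based on the array contents.
        String determined = map.get(new Key("foo", "bar", "baz"));
        System.out.println("hai".equals(determined)); // prints true
    }
}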

Cache that has access to all existing items

I have a system where objects (for the purposes of this question they are immutable) are created based on a request object (could be as simple as a url or a long). They are created with a factory method and not with new.
If an object for a request already exists, the request could be served more efficiently by returning a reference to the existing instance instead of creating a new object.
To that end I have created a class, called UniversalCache<K, V> for lack of a better name at this time. It has an LruCache so that a certain number of strong references are kept, and a HashMap<K, SoftReference<V>> to keep track of all the objects that might still be kept alive by other strong references in the system (I'm not relying on a SoftReference keeping the objects from being GC'd).
When a new object is created that is not already in the cache, it is added to the cache along with its key. To search for it in the cache I use the key to get the reference and check if it still has a reference to an object.
The problem I'm having is how to remove these key/reference pairs once the objects get garbage collected. I don't want to scan the whole HashMap looking for references whose referent has been cleared. Since the referent is not always available, I can't use it to obtain or regenerate a key. So I'm extending SoftReference to store the key and use it to remove the pair from the HashMap. Is this a good idea? I have a KeyedSoftReference<K,Rt> that has an additional field of type K for the key, matching the key type of the cache (and Rt ends up being the same as V).
In particular I'd like advice on where to handle the ReferenceQueue (at the moment it's in get) and how to cast the object I get from ReferenceQueue.poll().
This is the code that I have up to now:
package com.frozenkoi.oss;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import android.util.LruCache;
public class UniversalCache<K, V> {
    private final LruCache<K, V> mStrongCache;
    private final HashMap<K, KeyedSoftReference<K, V>> mSoftCache;
    private final ReferenceQueue<V> mRefQueue;

    private static class KeyedSoftReference<K, Rt> extends SoftReference<Rt> {
        private final K mKey;

        public KeyedSoftReference(K key, Rt r, ReferenceQueue<? super Rt> q) {
            super(r, q);
            mKey = key;
        }

        public K getKey() {
            return mKey;
        }
    }

    public UniversalCache(int strongCacheMaxItemCount) {
        mStrongCache = new LruCache<K, V>(strongCacheMaxItemCount);
        mSoftCache = new HashMap<K, KeyedSoftReference<K, V>>();
        mRefQueue = new ReferenceQueue<V>();
    }

    private void solidify(K key, V value) {
        mStrongCache.put(key, value);
    }

    public void put(K key, V value) {
        solidify(key, value);
        mSoftCache.put(key, new KeyedSoftReference<K, V>(key, value, mRefQueue));
    }

    public V get(K key) {
        //if it's in Strong container, must also be in soft.
        //just check in one of them
        KeyedSoftReference<K, ? extends V> tempRef = mSoftCache.get(key);
        final V tempVal = (null != tempRef) ? tempRef.get() : null;
        V retVal = null;
        if (null == tempVal) {
            mSoftCache.remove(key);
            retVal = tempVal;
        } else {
            //if found in LruCache container, must be also in Soft one
            solidify(key, tempVal);
            retVal = tempVal;
        }

        //remove expired entries
        while (null != (tempRef = (KeyedSoftReference<K, V>) mRefQueue.poll())) //Cast
        {
            //how to get key from val?
            K tempKey = tempRef.getKey();
            mSoftCache.remove(tempKey);
        }
        return retVal;
    }
}

Java class that efficiently handles the contains operation

Inside an iterative algorithm I'm using a HashSet that is dynamically enlarged at each iteration by adding new objects (via the add method). Very frequently I check whether a generated object has already been put into the HashSet by using the contains method. Note that the HashSet may contain several thousand objects.
Here follows a citation from the doc about class HashSet:
"This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets."
Apart from other considerations provided inside the doc (not reported for simplicity), I see that add and contains are executed in constant time.
Please, can you suggest another data structure in Java that provides better performance for the "contains" operation with respect to my problem?
Classes from Apache Commons or Guava are also accepted.
The performance of HashSet.contains() will be as good as you can get provided your objects have a properly implemented hashCode() method. That will ensure proper distribution among the buckets.
See Best implementation for hashCode method
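For illustration, a typical equals/hashCode pair over the identifying fields (a generic sketch; the class and its fields are made up and not tied to the asker's actual objects) could look like this:

import java.util.Objects;

// Sketch of a value class whose hashCode spreads elements across buckets.
public final class Point {
    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y); // equal objects produce equal, well-distributed hash codes
    }
}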
As other answers have already stated, "constant time" is the best runtime behaviour you can get.
Whether you actually get it depends on your hashCode implementation, but since you use the NetBeans suggestion you shouldn't be too bad off there.
As to how to keep the "constant time" as small as possible:
- Try to allocate your HashSet large enough from the very beginning to avoid costly rehash operations.
- You can cache the calculated hash code the first time hashCode() is called and return the cached value later on. There should be no need to add a triggering mechanism to clear the cache on object updates, since your relevant fields should be immutable - if they aren't, you are bound to run into trouble using a HashSet anyway. (A sketch of both ideas follows below.)
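A small sketch combining both suggestions (the class, its fields, and the initial capacity are made up for illustration):

import java.util.HashSet;
import java.util.Set;

public class ContainsTuning {
    // sized up front to avoid costly rehash operations while the algorithm runs
    private final Set<Element> seen = new HashSet<>(100_000);

    public boolean firstTimeSeen(Element e) {
        return seen.add(e); // add returns false if an equal element was already present
    }

    static final class Element {
        private final String a;
        private final String b;
        private int hash; // cached lazily; valid because the fields are immutable

        Element(String a, String b) {
            this.a = a;
            this.b = b;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Element)) return false;
            Element e = (Element) o;
            return a.equals(e.a) && b.equals(e.b);
        }

        @Override
        public int hashCode() {
            int h = hash;
            if (h == 0) {             // compute once; same lazy-caching idea as java.lang.String
                h = 31 * a.hashCode() + b.hashCode();
                hash = h;
            }
            return h;
        }
    }
}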
You can let your object remember whether it has been put into that HashSet: just have a boolean field that stores whether it was added to the set. Then you don't need to call contains on the HashSet but can just read the field value of your object. This method only works if the object is put into exactly one HashSet that checks the boolean field.
It can be extended to a fixed number of HashSets by using a java.util.BitSet in the contained object, where every HashSet is identified by a unique integer, provided the number of HashSets is known before the algorithm starts.
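A rough sketch of that extension (all names are hypothetical):

import java.util.BitSet;

// Illustrative sketch: the element carries one membership bit per known set.
class TrackedObject {
    private final BitSet memberOf = new BitSet(); // bit i is set <=> contained in the set with id i

    void markAdded(int setId)        { memberOf.set(setId); }
    void markRemoved(int setId)      { memberOf.clear(setId); }
    boolean isContainedIn(int setId) { return memberOf.get(setId); }
}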
Since you say that you call contains frequently, it makes sense to replace newly generated objects with equal existing objects (object pooling), since that overhead is amortized by contains becoming only a single field read.
As requested, here is some sample code. The special set implementation is about 4 times faster than a normal HashSet on my machine. However, the question is how well this code reflects your use case.
public class FastSetContains {
public static class SetContainedAwareObject {
private final int state;
private boolean contained;
public SetContainedAwareObject(int state) {
this.state = state;
}
public void markAsContained() {
contained = true;
}
public boolean isContained() {
return contained;
}
public void markAsRemoved() {
contained = false;
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + state;
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
SetContainedAwareObject other = (SetContainedAwareObject) obj;
if (state != other.state)
return false;
return true;
}
}
public static class FastContainsSet extends
HashSet<SetContainedAwareObject> {
@Override
public boolean contains(Object o) {
SetContainedAwareObject obj = (SetContainedAwareObject) o;
if (obj.isContained()) {
return true;
}
return super.contains(o);
}
@Override
public boolean add(SetContainedAwareObject e) {
boolean add = super.add(e);
e.markAsContained();
return add;
}
@Override
public boolean addAll(Collection<? extends SetContainedAwareObject> c) {
boolean addAll = super.addAll(c);
for (SetContainedAwareObject o : c) {
o.markAsContained();
}
return addAll;
}
@Override
public boolean remove(Object o) {
boolean remove = super.remove(o);
((SetContainedAwareObject) o).markAsRemoved();
return remove;
}
@Override
public boolean removeAll(Collection<?> c) {
boolean removeAll = super.removeAll(c);
for (Object o : c) {
((SetContainedAwareObject) o).markAsRemoved();
}
return removeAll;
}
}
private static final Random random = new Random(1234L);
private static final int additionalObjectsPerIteration = 10;
private static final int iterations = 100000;
private static final int differentObjectCount = 100;
private static final int containsCountPerIteration = 50;
private static long nanosSpentForContains;
public static void main(String[] args) {
Map<SetContainedAwareObject, SetContainedAwareObject> objectPool = new HashMap<>();
// switch the comments to use a different Set implementation
//Set<SetContainedAwareObject> set = new FastContainsSet();
Set<SetContainedAwareObject> set = new HashSet<>();
//warm up
for (int i = 0; i < 100; i++) {
addAdditionalObjects(objectPool, set);
callSetContainsForSomeObjects(set);
}
objectPool.clear();
set.clear();
nanosSpentForContains = 0L;
for (int i = 0; i < iterations; i++) {
addAdditionalObjects(objectPool, set);
callSetContainsForSomeObjects(set);
}
System.out.println("nanos spent for contains: " + nanosSpentForContains);
}
private static void callSetContainsForSomeObjects(
Set<SetContainedAwareObject> set) {
int containsCount = set.size() > containsCountPerIteration ? set.size()
: containsCountPerIteration;
int[] indexes = new int[containsCount];
for (int i = 0; i < containsCount; i++) {
indexes[i] = random.nextInt(set.size());
}
Object[] elements = set.toArray();
long start = System.nanoTime();
for (int index : indexes) {
set.contains(elements[index]);
}
long end = System.nanoTime();
nanosSpentForContains += (end - start);
}
private static void addAdditionalObjects(
Map<SetContainedAwareObject, SetContainedAwareObject> objectPool,
Set<SetContainedAwareObject> set) {
for (int i = 0; i < additionalObjectsPerIteration; i++) {
SetContainedAwareObject object = new SetContainedAwareObject(
random.nextInt(differentObjectCount));
SetContainedAwareObject pooled = objectPool.get(object);
if (pooled == null) {
objectPool.put(object, object);
pooled = object;
}
set.add(pooled);
}
}
}
Another edit:
Using the following as the Set.contains implementation makes it about 8 times faster than a normal HashSet:
@Override
public boolean contains(Object o) {
    SetContainedAwareObject obj = (SetContainedAwareObject) o;
    return obj.isContained();
}
EDIT:
This technique has a bit in common with the class enhancement of OpenJPA. OpenJPA's enhancement enables a class to track its persistent state, which is used by the entity manager. The suggested method enables an object to track whether it is contained in a set that is used by the algorithm.
