Java SoftHashMap Implementation - java

I wanted to implement a SoftHashMap based on Java's SoftReference and HashMap. The Java docs for WeakHashMap say that the keys, rather than the values, are weak references. I was wondering which hashCode() would be used by the put and get methods of the underlying HashMap. I am assuming WeakHashMap's put works like this: hashMap.put(new WeakReference(key), value); If this is true, how would the entry be found for a key?
Wouldn't it be better if values were wrapped in a WeakReference rather than keys?

If you look at this IBM article, you'll see that in the possible implementation they give:
public class WeakHashMap<K,V> implements Map<K,V> {
private static class Entry<K,V> extends WeakReference<K>
implements Map.Entry<K,V> {
private V value;
private final int hash;
private Entry<K,V> next;
...
}
public V get(Object key) {
int hash = getHash(key);
Entry<K,V> e = getChain(hash);
while (e != null) {
K eKey= e.get();
if (e.hash == hash && (key == eKey || key.equals(eKey)))
return e.value;
e = e.next;
}
return null;
}
put adds an Entry normally - but the Entry is a WeakReference referring to the Key object. If the Key is garbage collected, the Entry will eventually be cleared out by WeakHashMap's expungeStaleEntries() method, called every so often from other WeakHashMap operations.
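On the original question of wrapping the values instead of the keys: that is a reasonable design for a cache, and it avoids the hashing question entirely because the keys stay ordinary strong references. Below is a minimal sketch under that assumption. The names SoftValueMap, SoftValue and expungeStaleEntries are made up for illustration (the last one only mirrors the WeakHashMap method mentioned above), and this is not how WeakHashMap itself is implemented.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a map whose VALUES are softly referenced.
final class SoftValueMap<K, V> {

    // A soft reference that remembers its key so a stale entry can be removed.
    private static final class SoftValue<K, V> extends SoftReference<V> {
        final K key;

        SoftValue(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, SoftValue<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    public V put(K key, V value) {
        expungeStaleEntries();
        SoftValue<K, V> old = map.put(key, new SoftValue<>(key, value, queue));
        return old == null ? null : old.get();
    }

    public V get(Object key) {
        expungeStaleEntries();
        SoftValue<K, V> ref = map.get(key);
        // ref.get() returns null once the collector has cleared the value.
        return ref == null ? null : ref.get();
    }

    public int size() {
        expungeStaleEntries();
        return map.size();
    }

    // Drop entries whose values the garbage collector has already cleared.
    @SuppressWarnings("unchecked")
    private void expungeStaleEntries() {
        SoftValue<K, V> stale;
        while ((stale = (SoftValue<K, V>) queue.poll()) != null) {
            // Remove only if the map still holds this exact reference.
            map.remove(stale.key, stale);
        }
    }
}
```

The sketch is not thread-safe, and a null returned by get can mean either "never stored" or "reclaimed under memory pressure", so callers have to be able to recompute the value.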


Java - how to get a key object (or entry) stored in HashMap by key?

I'd like to get the "canonical" key object for each key usable to query a map. See here:
Map<UUID, String> map = new HashMap();
UUID a = new UUID("ABC...");
map.put(a, "Tu nejde o zamykání.");
UUID b = new UUID("ABC...");
String string = map.get(b); // This gives that string.
// This is what I am looking for:
UUID againA = map.getEntry(b).key();
boolean thisIsTrue = a == againA;
A HashMap uses equals(), which is the same for multiple unique objects. So I want to get the actual key from the map, which will always be the same, no matter what object was used to query the map.
Is there a way to get the actual key object from the map? I don't see anything in the interface, but perhaps some clever trick I overlooked?
(Iterating all entries or keys doesn't count.)
Is there a way to get the actual key object from the map?
OK, so I am going to make some assumptions about what you mean. After all, you said that your question doesn't need clarification, so the obvious meaning that I can see must be the correct one. Right? :-)
The answer is No. There isn't a way.
Example scenario (not compileable!)
UUID uuid = UUID.fromString("xxxx-yyy-zzz");
UUID uuid2 = UUID.fromString("xxxx-yyy-zzz"); // same string
println(uuid == uuid2); // prints false
println(uuid.equals(uuid2)); // prints true
Map<UUID, String> map = new ...
map.put(uuid, "fred");
println(map.get(uuid)); // prints fred
println(map.get(uuid2)); // prints fred (because uuid.equals(uuid2) is true)
... but the Map API does not provide a way to find the actual key (in the example above, uuid) in the map, apart from iterating the key or entry sets. And I'm not aware of any existing Map class (standard or 3rd-party) that does provide this [1].
However, you could implement your own Map class with an additional method for returning the actual key object. There is no technical reason why you couldn't, though you would have more code to write, test, maintain, etcetera.
But I would add that I agree with Jim Garrison. If you have a scenario where you have UUID objects (with equality-by-value semantics) and you also want to implement equality by identity semantics, then there is probably something wrong with your application's design. The correct approach would be to change the UUID.fromString(...) implementation to always return the same UUID object for the same input string.
[1] - This is not to say that such a map implementation doesn't exist. But if it does, you should be able to find it if you look hard enough. Note that questions asking us to find or recommend a library are off-topic!
There is a (relatively) simple way of doing this. I’ve done so in my applications from time to time, when needed ... not for the purpose of == testing, but to reduce the number of identical objects being stored when tens of thousands of objects exist and are cross-referenced with each other. This significantly reduced my memory usage and improved performance ... while still using equals() for equality tests.
Just maintain a parallel map for interning the keys.
Map<UUID, UUID> interned_keys = ...
UUID key = ...
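if (interned_keys.containsKey(key))
key = interned_keys.get(key); // reuse the canonical instance
else
interned_keys.put(key, key); // first time seen: this instance becomes canonical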
Of course, it is far better when the object being stored knows what its own identity is. Then you get the interning basically for free.
class Item {
UUID key;
// ...
}
Map<UUID, Item> map = ...
map.put(item.key, item);
UUID key = ...
key = map.get(key).key; // get interned key
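The parallel-map idea above can be packaged into a small helper. This is only a sketch: KeyInterner and intern are hypothetical names, and the UUID string is an arbitrary example value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper: returns one canonical instance per distinct key value.
final class KeyInterner<K> {
    private final Map<K, K> canonical = new HashMap<>();

    public K intern(K key) {
        // putIfAbsent returns the instance already stored, or null if this
        // key was just inserted and is therefore the canonical one.
        K existing = canonical.putIfAbsent(key, key);
        return existing != null ? existing : key;
    }
}

final class KeyInternerDemo {
    public static void main(String[] args) {
        KeyInterner<UUID> interner = new KeyInterner<>();
        UUID a = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        UUID b = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");

        // a and b are equal but distinct objects; interning maps both to a.
        System.out.println(a.equals(b));
        System.out.println(interner.intern(a) == interner.intern(b));
    }
}
```

Note that the interner keeps every key alive for as long as it exists, so it trades one kind of memory use for another.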
I think there are valid reasons for wanting the actual key, for example to save memory. Also keep in mind that the actual key may hold references to other objects. For instance, suppose you have a vertex of a graph. The vertex can store the actual data (say, a String) as well as its incident vertices. A vertex's hash value can depend only on the data. So to look up a vertex with some data D, you look up a vertex with data D and no incident vertices. If the map can return the actual vertex it stores, you can then get the vertices incident to it.
It seems to me that many map implementations could easily provide a getEntry method. For example, the HashMap implementation for get is:
public V get(Object key) {
Node<K,V> e;
return (e = getNode(hash(key), key)) == null ? null : e.value;
}
final Node<K,V> getNode(int hash, Object key) {
Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
if ((tab = table) != null && (n = tab.length) > 0 &&
(first = tab[(n - 1) & hash]) != null) {
if (first.hash == hash && // always check first node
((k = first.key) == key || (key != null && key.equals(k))))
return first;
if ((e = first.next) != null) {
if (first instanceof TreeNode)
return ((TreeNode<K,V>)first).getTreeNode(hash, key);
do {
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
return e;
} while ((e = e.next) != null);
}
}
return null;
}
One could use the getNode method to return an Entry:
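public Map.Entry<K,V> getEntry(Object key){
Node<K,V> e = getNode(hash(key),key);
if(e == null) return null;
// Return a snapshot so callers cannot modify the map's internal node
return new AbstractMap.SimpleImmutableEntry<>(e.key,e.value);
}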
The easiest way is to duplicate the reference to the key in the value using a generic Pair type, like this:
HashMap<UUID,Pair<UUID,String>> myMap = new HashMap<>();
When you put them in the map, you provide the reference to the key to the pair. The cost is one reference per entry.
void add(UUID uuid, String str)
{
myMap.put(uuid,Pair.of(uuid,str));
}
Pair<UUID,String> get(UUID uuid)
{
return myMap.get(uuid);
}
Then getFirst() of the Pair is your key. getSecond() is the value.
Whatever you do, it's going to cost you in either time or space.
Your Pair class will be something like:
public class Pair<A,B>
{
private final A a;
private final B b;
public Pair(A a, B b)
{
this.a = a;
this.b = b;
}
/**
* @return the first argument of the Pair
*/
public A getFirst()
{
return this.a;
}
/**
* @return the second argument of the Pair
*/
public B getSecond()
{
return this.b;
}
/**
* Create a Pair.
*
* @param a The first argument (of type A)
* @param b The second argument (of type B)
*
* @return A Pair of A and B
*/
public static <A,B> Pair<A,B> of(A a, B b)
{
return new Pair<>(a,b);
}
// Don't forget to get your IDE to generate hashCode()
// and equals() methods for you, depending
// on whether you allow nulls or not, or DIY.
}
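If you would rather not write your own Pair, the JDK's AbstractMap.SimpleImmutableEntry can play the same role. The wrapper below is a hedged sketch of that variation; KeyRecoveringMap and getKey are hypothetical names, not part of any library.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper: stores each key next to its value so the original
// key object can be recovered from any equal key.
final class KeyRecoveringMap<K, V> {
    private final Map<K, Map.Entry<K, V>> entries = new HashMap<>();

    public void put(K key, V value) {
        // Keep the first key object seen, as HashMap itself does on overwrite.
        Map.Entry<K, V> existing = entries.get(key);
        K canonical = existing != null ? existing.getKey() : key;
        entries.put(canonical, new SimpleImmutableEntry<>(canonical, value));
    }

    public V get(Object key) {
        Map.Entry<K, V> e = entries.get(key);
        return e == null ? null : e.getValue();
    }

    // Returns the key object that is actually stored, or null if absent.
    public K getKey(Object key) {
        Map.Entry<K, V> e = entries.get(key);
        return e == null ? null : e.getKey();
    }
}
```

As with the Pair version, the cost is one extra small object per entry.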
This could help. You can use a for-each loop over the entry set, like below.
Map<String,Object> map = new HashMap<>();
map.put("hello1", new String("Hello"));
map.put("hello2", new String("World"));
map.put("hello3", new String("How"));
map.put("hello4", new String("Are u"));
for(Map.Entry<String,Object> e: map.entrySet()){
System.out.println(e.getKey());
}

TreeMap java implementation - putting 1st element

public V put(K key, V value) {
Entry<K,V> t = root;
if (t == null) {
compare(key, key); // type (and possibly null) check
root = new Entry<>(key, value, null);
size = 1;
modCount++;
return null;
}
int cmp;
...
}
final int compare(Object k1, Object k2) {
return comparator==null ? ((Comparable<? super K>)k1).compareTo((K)k2)
: comparator.compare((K)k1, (K)k2);
}
After facing a bug in my application, I had to debug TreeMap's put method. My issue was in comparing the objects that were put in the map. What is odd is that when I put the FIRST element into the map, its key gets compared with itself. I can't understand why it would work like that. Any insights (besides the comment "type (and possibly null) check")? Why wouldn't they just check whether the key was null? What kind of "type" check is made there, and what for?
As mentioned in the comment, https://bugs.openjdk.java.net/browse/JDK-5045147 is the issue where this was introduced. From the discussion in that issue, the original fix was the following:
BT2:SUGGESTED FIX
Doug Lea writes:
"Thanks! I have a strong sense of deja vu that I've
added this before(!) but Treemap.put should have the
following trap added."
public V put(K key, V value) {
Entry<K,V> t = root;
if (t == null) {
+ if (key == null) {
+ if (comparator == null)
+ throw new NullPointerException();
+ comparator.compare(key, key);
+ }
incrementSize();
root = new Entry<K,V>(key, value, null);
return null;
}
The intention seems to be to throw an NPE when the key is null and either the TreeMap has no comparator or its comparator does not accept null keys (which conforms to the API specification). It seems the fix was shortened to one line:
compare(key, key);
which is defined as:
@SuppressWarnings("unchecked")
final int compare(Object k1, Object k2) {
return comparator==null ? ((Comparable<? super K>)k1).compareTo((K)k2)
: comparator.compare((K)k1, (K)k2);
}
Hence this test will do both the null check and the type check, namely the cast to Comparable.
I believe this is where TreeMap<K,V> checks that K implements Comparable when no Comparator is supplied. Otherwise you get a ClassCastException.
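To see both checks fire on the very first put, here is a small demo. It assumes a JDK that contains the compare(key, key) call quoted above (Java 7 or later); TreeMapFirstPutDemo and firstPut are hypothetical names.

```java
import java.util.Comparator;
import java.util.TreeMap;

final class TreeMapFirstPutDemo {

    // Returns the simple name of the exception thrown by the first put,
    // or "none" if the put succeeds.
    static String firstPut(TreeMap<Object, String> map, Object key) {
        try {
            map.put(key, "value");
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Key type does not implement Comparable and no Comparator is supplied:
        // the cast to Comparable inside compare(key, key) fails.
        System.out.println(firstPut(new TreeMap<>(), new Object()));

        // Null key with natural ordering: compareTo is invoked on null.
        System.out.println(firstPut(new TreeMap<>(), null));

        // Null key with a Comparator that tolerates nulls: the put goes through.
        Comparator<Object> nullsFirst = Comparator.<Object>nullsFirst(
                (a, b) -> a.toString().compareTo(b.toString()));
        System.out.println(firstPut(new TreeMap<>(nullsFirst), null));
    }
}
```

Without the self-comparison, a bad first key would be accepted silently and the failure would only surface on the second put, far from the line that caused it.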

Iterate over ConcurrentHashMap while deleting entries

I want to periodically iterate over a ConcurrentHashMap while removing entries, like this:
for (Iterator<Entry<Integer, Integer>> iter = map.entrySet().iterator(); iter.hasNext(); ) {
Entry<Integer, Integer> entry = iter.next();
// do something
iter.remove();
}
The problem is that another thread may be updating or modifying values while I'm iterating. If that happens, those updates can be lost forever, because my thread only sees stale values while iterating, but the remove() will delete the live entry.
After some consideration, I came up with this workaround:
map.forEach((key, value) -> {
// delete if value is up to date, otherwise leave for next round
if (map.remove(key, value)) {
// do something
}
});
One problem with this is that it won't catch modifications to mutable values that don't implement equals() (such as AtomicInteger). Is there a better way to safely remove with concurrent modifications?
Your workaround works, but there is one potential problem: if certain entries are updated constantly, map.remove(key, value) may never return true for them until the updates stop.
If you use JDK 8, here is my solution:
for (Iterator<Entry<Integer, Integer>> iter = map.entrySet().iterator(); iter.hasNext(); ) {
Entry<Integer, Integer> entry = iter.next();
map.compute(entry.getKey(), (k, v) -> f(v));
//do something for prevValue
}
....
private Integer prevValue;
private Integer f(Integer v){
prevValue = v;
return null;
}
compute() applies f(v) to the current value; in our case f stores that value in the prevValue field and returns null, which removes the entry.
According to the Javadoc, it is atomic:
Attempts to compute a mapping for the specified key and its current mapped value (or null if there is no current mapping). The entire method invocation is performed atomically. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this Map.
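A variation on the same idea that avoids the shared prevValue field is to capture the value in a local holder. This is a sketch, not the original answer's code: DrainWithCompute and drain are hypothetical names, and the parameter is typed as ConcurrentHashMap because that is the class whose compute() Javadoc is quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

final class DrainWithCompute {

    // Removes every entry visible to the iteration and returns the values
    // that were current at the moment each entry was removed.
    static <K, V> List<V> drain(ConcurrentHashMap<K, V> map) {
        List<V> drained = new ArrayList<>();
        for (K key : map.keySet()) {
            // One-element array used as a holder that the lambda can write to.
            Object[] holder = new Object[1];
            map.compute(key, (k, current) -> {
                holder[0] = current; // capture the live value
                return null;         // returning null removes the mapping
            });
            if (holder[0] != null) { // null means another thread removed it first
                @SuppressWarnings("unchecked")
                V value = (V) holder[0];
                drained.add(value);
            }
        }
        return drained;
    }
}
```

If nothing else has to happen inside the atomic step, map.remove(key) is simpler: it also returns the value that was mapped at the moment of removal.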
Your workaround is actually pretty good. There are other facilities on top of which you can build a somewhat similar solution (e.g. using computeIfPresent() and tombstone values), but they have their own caveats and I have used them in slightly different use-cases.
As for using a type that doesn't implement equals() for the map values, you can use your own wrapper on top of the corresponding type. That's the most straightforward way to inject custom semantics for object equality into the atomic replace/remove operations provided by ConcurrentMap.
Update
Here's a sketch that shows how you can build on top of the ConcurrentMap.remove(Object key, Object value) API:
Define a wrapper type on top of the mutable type you use for the values, also defining your custom equals() method building on top of the current mutable value.
In your BiConsumer (the lambda you're passing to forEach), create a deep copy of the value (whose type is your new wrapper type) and, working on the copy, perform the logic that determines whether the value needs to be removed.
If the value needs to be removed, call remove(myKey, myValueCopy).
If there have been some concurrent changes while you were calculating whether the value needs to be removed, remove(myKey, myValueCopy) will return false (barring ABA problems, which are a separate topic).
Here's some code illustrating this:
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
public class Playground {
private static class AtomicIntegerWrapper {
private final AtomicInteger value;
AtomicIntegerWrapper(int value) {
this.value = new AtomicInteger(value);
}
public void set(int value) {
this.value.set(value);
}
public int get() {
return this.value.get();
}
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (!(obj instanceof AtomicIntegerWrapper)) {
return false;
}
AtomicIntegerWrapper other = (AtomicIntegerWrapper) obj;
if (other.value.get() == this.value.get()) {
return true;
}
return false;
}
public static AtomicIntegerWrapper deepCopy(AtomicIntegerWrapper wrapper) {
int wrapped = wrapper.get();
return new AtomicIntegerWrapper(wrapped);
}
}
private static final ConcurrentMap<Integer, AtomicIntegerWrapper> MAP
= new ConcurrentHashMap<>();
private static final int NUM_THREADS = 3;
public static void main(String[] args) throws InterruptedException {
for (int i = 0; i < 10; ++i) {
MAP.put(i, new AtomicIntegerWrapper(1));
}
Thread.sleep(1);
for (int i = 0; i < NUM_THREADS; ++i) {
new Thread(() -> {
Random rnd = new Random();
while (!MAP.isEmpty()) {
MAP.forEach((key, value) -> {
AtomicIntegerWrapper elem = MAP.get(key);
if (elem == null) {
System.out.println("Oops...");
} else if (elem.get() == 1986) {
elem.set(1);
} else if ((rnd.nextInt() & 128) == 0) {
elem.set(1986);
}
});
}
}).start();
}
Thread.sleep(1);
new Thread(() -> {
Random rnd = new Random();
while (!MAP.isEmpty()) {
MAP.forEach((key, value) -> {
AtomicIntegerWrapper elem =
AtomicIntegerWrapper.deepCopy(MAP.get(key));
if (elem.get() == 1986) {
try {
Thread.sleep(10);
} catch (Exception e) {}
boolean replaced = MAP.remove(key, elem);
if (!replaced) {
System.out.println("Bailed out!");
} else {
System.out.println("Replaced!");
}
}
});
}
}).start();
}
}
You'll see printouts of "Bailed out!", intermixed with "Replaced!" (removal was successful, as there were no concurrent updates that you care about) and the calculation will stop at some point.
If you remove the custom equals() method and continue to use a copy, you'll see an endless stream of "Bailed out!", because the copy is never considered equal to the value in the map.
If you don't use a copy, you won't see "Bailed out!" printed out, and you'll hit the problem you're explaining - values are removed regardless of concurrent changes.
Let us consider what options you have.
Create your own Container-class with isUpdated() operation and use your own workaround.
If your map contains just a few elements and you iterate over it much more often than you put or delete entries, CopyOnWriteArrayList could be a good choice:
CopyOnWriteArrayList<Entry<Integer, Integer>> lookupArray = ...;
The other option is to implement your own CopyOnWriteMap
public class CopyOnWriteMap<K, V> implements Map<K, V>{
private volatile Map<K, V> currentMap;
public V put(K key, V value) {
synchronized (this) {
Map<K, V> newOne = new HashMap<K, V>(this.currentMap);
V val = newOne.put(key, value);
this.currentMap = newOne; // atomic operation
return val;
}
}
public V remove(Object key) {
synchronized (this) {
Map<K, V> newOne = new HashMap<K, V>(this.currentMap);
V val = newOne.remove(key);
this.currentMap = newOne; // atomic operation
return val;
}
}
[...]
}
There is a negative side effect. With copy-on-write collections your updates will never be lost, but you may see a previously deleted entry again.
Worst case: a deleted entry is restored every time the map gets copied.

why HashMap.keySet() doesn't return an empty Set [closed]

Where do the keys in keySet come from?
The class KeySet is an inner class of HashMap, so it has access to HashMap's fields, but there is no field such as a Set<K> that stores only the keys of the map.
I can only find an Entry<K,V>[] table, but that stores both keys and values.
Does the keySet() method do something when new KeySet() is called to populate it? Maybe something like:
for(Entry e : table) {
keySet.add(e.getKey());
}
so that the keySet stores the keys, and when a key-value pair is added or removed, the key is also added to or removed from the keySet?
public Set<K> keySet() {
Set<K> ks = keySet;
return (ks != null ? ks : (keySet = new KeySet()));
}
The source code shows just a new KeySet(), so why isn't it empty? Why does it contain the keys? To make it clearer:
Map map = new HashMap();
map.put(1, 1); //null
map.keySet(); //[1]
map.put(2, 2); //[1,2]
map.remove(2); //[1]
If you debug with a breakpoint at each line and watch the keySet variable of the map, it will show the results above, right?
Once keySet() has been called, put and remove have the same effect on the keySet, right? I've stepped into the put and remove methods of HashMap.
For put(): addEntry -> createEntry -> after calling "table[bucketIndex] = new Entry<>(hash, key, value, e);" the keySet contains the new key.
For remove(): removeEntryForKey -> after calling "table[i] = next;" the key is gone from the keySet. So I think there must be some association between table[] and keySet, which is why I asked this question.
keySet() returns an internal Set implementation backed by the HashMap. So, for example, calling contains(key) on that Set calls containsKey(key) on the backing HashMap.
It doesn't create an independent Set holding the keys of the original HashMap (as you suggested in your code snippet), since such a Set wouldn't be backed by the original HashMap, so changes in the HashMap won't be reflected in the Set and vice versa.
Here's the Java 6 implementation:
/**
* Each of these fields are initialized to contain an instance of the
* appropriate view the first time this view is requested. The views are
* stateless, so there's no reason to create more than one of each.
*/
transient volatile Set<K> keySet = null;
public Set<K> keySet() {
Set<K> ks = keySet;
return (ks != null ? ks : (keySet = new KeySet()));
}
private final class KeySet extends AbstractSet<K> {
public Iterator<K> iterator() {
return newKeyIterator();
}
public int size() {
return size;
}
public boolean contains(Object o) {
return containsKey(o);
}
public boolean remove(Object o) {
return HashMap.this.removeEntryForKey(o) != null;
}
public void clear() {
HashMap.this.clear();
}
}
You can browse the source code of java.util.HashMap to understand how this works.
The keySet() method actually returns a member variable of the HashMap instance, as the following JDK source code shows:
public Set<K> keySet() {
Set<K> ks = keySet;
return (ks != null ? ks : (keySet = new KeySet()));
}
The keySet member variable holds an instance of KeySet, an inner class defined inside HashMap:
private final class KeySet extends AbstractSet<K> {
public Iterator<K> iterator() {
return newKeyIterator();
}
public int size() {
return size;
}
public boolean contains(Object o) {
return containsKey(o);
}
public boolean remove(Object o) {
return HashMap.this.removeEntryForKey(o) != null;
}
public void clear() {
HashMap.this.clear();
}
}
So as you can see, it simply defines another "view" to the same data held in the HashMap. Nothing is duplicated so the consistency between the keySet view and the original map view is guaranteed.
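A short demo of that "view" behaviour, using only the public Map API (KeySetViewDemo is a hypothetical class name):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class KeySetViewDemo {
    public static void main(String[] args) {
        Map<Integer, Integer> map = new HashMap<>();
        Set<Integer> keys = map.keySet(); // obtained while the map is still empty

        map.put(1, 1);
        map.put(2, 2);
        System.out.println(keys); // the view now reports both keys

        keys.remove(2);           // removing through the view...
        System.out.println(map);  // ...removes the mapping from the map
    }
}
```

The set never stores keys of its own; every call on it is answered from the map's table at that moment, which is why it looks "filled" in the debugger.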
OK, I get the reason. The method really does return an empty KeySet object, but when the debugger breaks there, Eclipse calls AbstractCollection.toString() on it, which in turn calls KeySet.iterator().

Mark an empty space into an array without using null

I am extending AbstractMap and I want to implement my own hash-map using two parallel arrays:
K[] keys;
V[] values;
Suppose I want to store null values as well, how could I initialize these two arrays so that I can differentiate between a space in the array where I could place some new key-value pairs and a space where I am storing a null?
Might I suggest not using two arrays, and instead do something along the lines of:
class Node {
K key;
V value;
}
Node[] nodes;
Then a non-entry is an element in nodes that is equal to null.
If the values can be null but the keys cannot be null then having a null key would mean that there is no key.
If the key can also be null you can use a parallel array of booleans to store whether each space is taken or not.
K[] keys;
V[] values;
boolean[] hasValue;
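Here is a rough sketch of how the flag array could be used, assuming a fixed capacity and linear probing. ParallelArrayMap, occupied and findSlot are hypothetical names; removal, resizing and the AbstractMap plumbing (entrySet and so on) are left out to keep it short.

```java
import java.util.Objects;

// Hypothetical fixed-capacity map using parallel arrays plus an "occupied"
// flag per slot, so both null keys and null values can be stored.
final class ParallelArrayMap<K, V> {
    private final Object[] keys;
    private final Object[] values;
    private final boolean[] occupied;
    private int size;

    ParallelArrayMap(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        keys = new Object[capacity];
        values = new Object[capacity];
        occupied = new boolean[capacity];
    }

    // Linear probing: returns the slot holding the key, or the first free
    // slot, or -1 if the table is full and the key is absent.
    private int findSlot(Object key) {
        int n = keys.length;
        int start = (Objects.hashCode(key) & 0x7fffffff) % n;
        for (int i = 0; i < n; i++) {
            int slot = (start + i) % n;
            if (!occupied[slot] || Objects.equals(keys[slot], key)) {
                return slot;
            }
        }
        return -1;
    }

    @SuppressWarnings("unchecked")
    public V put(K key, V value) {
        int slot = findSlot(key);
        if (slot < 0) {
            throw new IllegalStateException("map is full");
        }
        V old = occupied[slot] ? (V) values[slot] : null;
        if (!occupied[slot]) {
            occupied[slot] = true; // the flag, not null, marks the slot as used
            keys[slot] = key;
            size++;
        }
        values[slot] = value;
        return old;
    }

    public boolean containsKey(Object key) {
        int slot = findSlot(key);
        return slot >= 0 && occupied[slot];
    }

    @SuppressWarnings("unchecked")
    public V get(Object key) {
        int slot = findSlot(key);
        return (slot >= 0 && occupied[slot]) ? (V) values[slot] : null;
    }

    public int size() {
        return size;
    }
}
```

Because get returns null both for a missing key and for a stored null value, callers that need to tell the two apart should use containsKey. Supporting remove with this probing scheme would need an extra "deleted" marker per slot, which the sketch does not attempt.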
I'm not quite sure of the details of your question, but you could always have some special object for your "blank".
private static final Object BLANK = new Object();
Then if the item in the array == BLANK, then consider it to be an empty slot.
Since there can only be one null key, you can simply have a special reference value (not in the array) that holds the value of the object mapped from this null key (and possibly a boolean indicating if this value has been set). Unfortunately this will probably complicate iteration.
E.g.
private boolean isNullMapped = false;
private V nullValue = null;
public V put(K key, V value)
{
if (key == null) { V old = nullValue; nullValue = value; isNullMapped = true; return old; }
...
}
Alternatively, you can wrap all keys in a wrapper object (supposing you still want to use parallel arrays instead of entries), and if the value contained in this wrapper object is null, then it represents the null key.
E.g.
private static class KeyWrapper<K>
{
public K key;
}
Lastly, as a question for consideration, if you are not having entries in your arrays, but instead are directly holding arrays of K and V, then how are you accounting for different keys that happen to share the same hash code? The java.util implementation has arrays of entries that also act as linked lists to account for this possibility (and incidentally, the null key is always mapped to array index 0).
Storing a null value is not a problem in your scenario. So long as keys[n] != null, just return values[n] whether values[n] is null or not.
Remember that you are not being asked to key on n but objects of type K so every access of the Map will require a search through keys to find the key they are looking for.
However, if you want to allow a value to be stored against a null key, then a dedicated sentinel such as private static final Object NULL_KEY = new Object() would probably do the trick, as the other suggestions point out. (A String literal such as "NULL" would collide with a real key equal to "NULL".)
private static final Object NULL_KEY = new Object();
K[] keys;
V[] values;
private int find(K key) {
for (int i = 0; i < keys.length; i++) {
if (key.equals(keys[i])) { // compare with equals(), not identity; key is never null here
return i;
}
}
return -1;
}
public V put(K key, V value) {
V old = null;
if (key != null) {
int i = find(key);
if (i >= 0) {
old = values[i];
values[i] = value;
} else {
// ...
}
} else {
return put((K) NULL_KEY, value);
}
return old;
}
public V get(K key) {
if (key != null) {
int i = find(key);
if (i >= 0) {
return values[i];
}
return null;
} else {
return (get((K) NULL_KEY));
}
}
Some java.util implementations use a special object to represent the null key; WeakHashMap, for example, substitutes an internal NULL_KEY sentinel.
