Resizing a map implementation in java - java

I am writing my own hashmap. I am able to define put/get methods. When I am trying to resize the array, I am getting null values:
public void resize() {
if (100 * size >= 75 * capacity) {
int oldcap = capacity;
capacity = capacity * 2;
Entry[] resizedBuckets = new Entry[capacity];
for (int i = 0; i < oldcap; i++) {
resizedBuckets[i] = this.buckets[i];
}
this.buckets = resizedBuckets;
}
}
My whole code is as below:
package lib;
class Entry<K, V> {
K key;
V value;
Entry<K, V> next;
Entry(K key, V value, Entry next) {
this.key = key;
this.value = value;
}
}
public class MyMap<K, V> {
Entry[] buckets;
int size = 0;
static int capacity = 2;
MyMap(int capacity) {
this.buckets = new Entry[capacity];
}
public MyMap() {
this(capacity);
}
public void resize() {
if (100 * size >= 75 * capacity) {
int oldcap = capacity;
capacity = capacity * 2;
Entry[] resizedBuckets = new Entry[capacity];
for (int i = 0; i < oldcap; i++) {
resizedBuckets[i] = this.buckets[i];
}
this.buckets = resizedBuckets;
}
}
public void put(K key, V value) {
int bucket = key.hashCode() % capacity;
//i have bucket it
Entry newEntry = new Entry<K, V>(key, value, null);
if (this.buckets[bucket] == null) {
this.buckets[bucket] = newEntry;
size++;
} else {
Entry currentNode = this.buckets[bucket];
Entry prevNode = null;
while (currentNode.next != null && currentNode.key != key) {
prevNode = currentNode;
currentNode = currentNode.next;
}
if (currentNode.key == key) {
if (prevNode == null) {
newEntry.next = currentNode.next;
this.buckets[bucket] = newEntry;
} else {
newEntry.next = currentNode.next;
prevNode.next = newEntry;
}
} else {
currentNode.next = newEntry;
size++;
}
}
resize();
}
public V get(K key) {
int bucket = key.hashCode() % capacity;
Entry currentBucket = this.buckets[bucket];
while (currentBucket != null) {
if (currentBucket.key == key)
return (V) currentBucket.value;
currentBucket = currentBucket.next;
}
return null;
}
}
class MyMain {
public static void main(String[] args) {
MyMap<String, Integer> myMap = new MyMap<>();
myMap.put("name", 1);
myMap.put("2name", 2);
myMap.put("3name", 2);
myMap.put("4name", 3);
myMap.put("5name", 2);
myMap.put("6name", 2);
myMap.put("3name", 3);
System.out.println(myMap);
System.out.println(myMap.get("name"));
System.out.println(myMap.get("2name"));
System.out.println(myMap.get("3name"));
}
}
I am getting following output:
lib.MyMap@30f39991
null
null
3
I should be getting:
1
2
3
What's the reason? I think the issue is with resize method, but I am unable to figure it out.

As mentioned in the comments, your bucket index changes because capacity is increased during the resize operation. As a result, int bucket = key.hashCode() % capacity; inside put and get returns a different bucket index than before the resize.
I haven't tested it, but I think, if you do
resizedBuckets[this.buckets[i].key.hashCode() % capacity] = this.buckets[i];
inside the resize method, this might already work.
As @Voo pointed out, the above change only works as long as you don't have any hash collisions before resizing (and it needs a null check for empty buckets). For a complete solution you will need to recalculate the index for every key!
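To illustrate recalculating the index for every key, here is a minimal sketch of a chained map whose resize walks every chain and re-links each entry under the new capacity. The class name RehashSketch, its Node type and the use of Math.floorMod and Objects.equals are my own assumptions for illustration; this is not the question's exact code.

```java
import java.util.Objects;

class RehashSketch<K, V> {
    // Hypothetical node type, modelled on the question's Entry class.
    static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node<K, V>[] buckets = newTable(2);
    private int size = 0;

    @SuppressWarnings("unchecked")
    private static <K, V> Node<K, V>[] newTable(int capacity) {
        return (Node<K, V>[]) new Node[capacity];
    }

    // floorMod keeps the index non-negative even for negative hash codes.
    private static int indexFor(Object key, int capacity) {
        return Math.floorMod(Objects.hashCode(key), capacity);
    }

    public void put(K key, V value) {
        int i = indexFor(key, buckets.length);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                n.value = value; // existing key: overwrite, size unchanged
                return;
            }
        }
        buckets[i] = new Node<>(key, value, buckets[i]); // new key: prepend
        size++;
        if (100 * size >= 75 * buckets.length) {
            resize();
        }
    }

    public V get(K key) {
        for (Node<K, V> n = buckets[indexFor(key, buckets.length)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    private void resize() {
        Node<K, V>[] old = buckets;
        Node<K, V>[] fresh = newTable(old.length * 2);
        for (Node<K, V> head : old) {
            Node<K, V> n = head;
            while (n != null) {
                Node<K, V> next = n.next;              // remember the rest of the old chain
                int i = indexFor(n.key, fresh.length); // index under the NEW capacity
                n.next = fresh[i];
                fresh[i] = n;
                n = next;
            }
        }
        buckets = fresh;
    }

    public static void main(String[] args) {
        RehashSketch<String, Integer> map = new RehashSketch<>();
        map.put("name", 1);
        map.put("2name", 2);
        map.put("3name", 2);
        map.put("4name", 3);
        map.put("3name", 3);
        System.out.println(map.get("name"));  // 1
        System.out.println(map.get("2name")); // 2
        System.out.println(map.get("3name")); // 3
    }
}
```

Because every entry is re-linked individually, colliding entries that shared a bucket before the resize can end up in different buckets afterwards, which is exactly the case a plain array copy misses.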

Have a look at your code:
public void put(K key, V value) {
int bucket = key.hashCode() % capacity;
...
resize(); // this will modify capacity;
}
public V get(K key) {
int bucket = key.hashCode() % capacity;
...
}
You resize() the map, but the entries stay in the buckets computed from the old hashCode() % capacity. This invalidates the stored position for your get() method: if you put the same key-value pair again into the resized map, it would not necessarily land in the same bucket as before.
You have to re-insert the entries into your resized map using the new capacity instead of simply copying the buckets.
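A small demonstration of why the bucket moves: the same key gives a different index once the capacity changes. The class and method names below are made up for illustration; the formula is the one used in put and get above.

```java
class BucketIndexDemo {
    // Same formula as in put/get, but with the capacity passed in explicitly.
    static int bucketFor(Object key, int capacity) {
        return key.hashCode() % capacity;
    }

    public static void main(String[] args) {
        Integer key = 7; // Integer.hashCode() is just the int value
        System.out.println(bucketFor(key, 2)); // 1
        System.out.println(bucketFor(key, 4)); // 3
        System.out.println(bucketFor(key, 8)); // 7
    }
}
```

An entry stored at index 1 while the capacity was 2 is looked up at index 3 after one doubling, so a get on the copied array misses it.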

Related

Java HashMap: Changing Bucket Implementation to Linear Probing method

In advance, I apologize for my lack of experience, these are advanced concepts that are difficult to wrap my head around. From what I understand, linear probing is circular, it won't stop until it finds an empty cell.
However I am not sure how to implement it. Some example on how to would be greatly appreciated. Sorry again for the inexperience, I'm not some vetted programmer, I'm picking this up very slowly.
public boolean ContainsElement(V element)
{
for(int i = 0; i < capacity; i++)
{
if(table[i] != null)
{
LinkedList<Entry<K, V>> bucketMethod = table[i];
for(Entry<K, V> entry : bucketMethod)
{
if(entry.getElement().equals(element))
{
return true;
}
}
}
}
return false;
}
Here's a working hash table based on the pseudocode examples found in the Wikipedia article for open addressing.
I think the main differences between the Wikipedia example and mine are:
Treating the hashCode() a little bit due to the way Java does modulo (%) with negative numbers.
Implemented simple resizing logic.
Changed the logic in the remove method a little bit because Java doesn't have goto.
Otherwise, it's more or less just a direct translation.
package mcve;
import java.util.*;
import java.util.stream.*;
public class OAHashTable {
private Entry[] table = new Entry[16]; // Must be >= 4. See findSlot.
private int size = 0;
public int size() {
return size;
}
private int hash(Object key) {
int hashCode = Objects.hashCode(key)
& 0x7F_FF_FF_FF; // <- This is like abs, but it works
// for Integer.MIN_VALUE. We do this
// so that hash(key) % table.length
// is never negative.
return hashCode;
}
private int findSlot(Object key) {
int i = hash(key) % table.length;
// Search until we either find the key, or find an empty slot.
//
// Note: this becomes an infinite loop if the key is not already
// in the table AND every element in the array is occupied.
// With the resizing logic (below), this will only happen
// if the table is smaller than length=4.
while ((table[i] != null) && !Objects.equals(table[i].key, key)) {
i = (i + 1) % table.length;
}
return i;
}
public Object get(Object key) {
int i = findSlot(key);
if (table[i] != null) { // Key is in table.
return table[i].value;
} else { // Key is not in table
return null;
}
}
private boolean tableIsThreeQuartersFull() {
return ((double) size / (double) table.length) >= 0.75;
}
private void resizeTableToTwiceAsLarge() {
Entry[] old = table;
table = new Entry[2 * old.length];
size = 0;
for (Entry e : old) {
if (e != null) {
put(e.key, e.value);
}
}
}
public void put(Object key, Object value) {
int i = findSlot(key);
if (table[i] != null) { // We found our key.
table[i].value = value;
return;
}
if (tableIsThreeQuartersFull()) {
resizeTableToTwiceAsLarge();
i = findSlot(key);
}
table[i] = new Entry(key, value);
++size;
}
public void remove(Object key) {
int i = findSlot(key);
if (table[i] == null) {
return; // Key is not in the table.
}
int j = i;
table[i] = null;
--size;
while (true) {
j = (j + 1) % table.length;
if (table[j] == null) {
break;
}
int k = hash(table[j].key) % table.length;
// Determine if k lies cyclically in (i,j]
// | i.k.j |
// |....j i.k.| or |.k..j i...|
if ( (i<=j) ? ((i<k)&&(k<=j)) : ((i<k)||(k<=j)) ) {
continue;
}
table[i] = table[j];
i = j;
table[i] = null;
}
}
public Stream<Entry> entries() {
return Arrays.stream(table).filter(Objects::nonNull);
}
@Override
public String toString() {
return entries().map(e -> e.key + "=" + e.value)
.collect(Collectors.joining(", ", "{", "}"));
}
public static class Entry {
private Object key;
private Object value;
private Entry(Object key, Object value) {
this.key = key;
this.value = value;
}
public Object getKey() { return key; }
public Object getValue() { return value; }
}
public static void main(String[] args) {
OAHashTable t = new OAHashTable();
t.put("A", 1);
t.put("B", 2);
t.put("C", 3);
System.out.println("size = " + t.size());
System.out.println(t);
t.put("X", 4);
t.put("Y", 5);
t.put("Z", 6);
t.remove("C");
t.remove("B");
t.remove("A");
t.entries().map(e -> e.key)
.map(key -> key + ": " + t.get(key))
.forEach(System.out::println);
}
}
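The sign handling in hash() above can be checked in isolation. This is a small sketch of how Java's % behaves with negative operands and why masking the sign bit is safer than Math.abs; the class name and the maskedIndex helper are hypothetical.

```java
class NegativeModuloDemo {
    // Clears the sign bit before taking the remainder, as hash() does above.
    static int maskedIndex(int hashCode, int length) {
        return (hashCode & 0x7FFFFFFF) % length;
    }

    public static void main(String[] args) {
        System.out.println(-7 % 4);                             // -3: % keeps the sign of the dividend
        System.out.println(Math.floorMod(-7, 4));               // 1
        System.out.println(Math.abs(Integer.MIN_VALUE));        // -2147483648: abs overflows here
        System.out.println(maskedIndex(Integer.MIN_VALUE, 16)); // 0
        System.out.println(maskedIndex(-7, 4));                 // 1
    }
}
```

Math.floorMod is another way to get a non-negative index; it is not what the table above uses, but either approach avoids a negative array index.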
Note that java.util.HashMap, the standard implementation of java.util.Map, already resolves hash collisions internally. It does so with separate chaining (each bucket holds a list of entries), not with linear probing.

Implementation of Custom HashMap code issues

I am preparing my own custom HashMap implementation in Java. Below is my implementation.
public class Entry<K,V> {
private final K key;
private V value;
private Entry<K,V> next;
public Entry(K key, V value, Entry<K,V> next) {
this.key = key;
this.value = value;
this.next = next;
}
public V getValue() {
return value;
}
public void setValue(V value) {
this.value = value;
}
public Entry<K, V> getNext() {
return next;
}
public void setNext(Entry<K, V> next) {
this.next = next;
}
public K getKey() {
return key;
}
}
public class MyCustomHashMap<K,V> {
private int DEFAULT_BUCKET_COUNT = 10;
private Entry<K,V>[] buckets;
public MyCustomHashMap() {
buckets = new Entry[DEFAULT_BUCKET_COUNT];
for (int i = 0;i<DEFAULT_BUCKET_COUNT;i++)
buckets[i] = null;
}
public void put(K key,V value){
/**
* This is the new node.
*/
Entry<K,V> newEntry = new Entry<K,V>(key, value, null);
/**
* If key is null, then null keys always map to hash 0, thus index 0
*/
if(key == null){
buckets[0] = newEntry;
}
/**
* get the hashCode of the key.
*/
int hash = hash(key);
/**
* if the index does of the bucket does not contain any element then assign the node to the index.
*/
if(buckets[hash] == null) {
buckets[hash] = newEntry;
} else {
/**
* we need to traverse the list and compare the key with each of the keys till the keys match OR if the keys does not match then we need
* to add the node at the end of the linked list.
*/
Entry<K,V> previous = null;
Entry<K,V> current = buckets[hash];
while(current != null) {
boolean done = false;
while(!done) {
if(current.getKey().equals(key)) {
current.setValue(value);
done = true; // if the keys are same then replace the old value with the new value;
} else if (current.getNext() == null) {
current.setNext(newEntry);
done = true;
}
current = current.getNext();
previous = current;
}
}
previous.setNext(newEntry);
}
}
public V getKey(K key) {
int hash = hash(key);
if(buckets[hash] == null) {
return null;
} else {
Entry<K,V> temp = buckets[hash];
while(temp != null) {
if(temp.getKey().equals(key))
return temp.getValue(); // returns value corresponding to key.
temp = temp.getNext();
}
return null; //return null if key is not found.
}
}
public void display() {
for(int i = 0; i < DEFAULT_BUCKET_COUNT; i++) {
if(buckets[i] != null) {
Entry<K,V> entry = buckets[i];
while(entry != null){
System.out.print("{"+entry.getKey()+"="+entry.getValue()+"}" +" ");
entry=entry.getNext();
}
}
}
}
public int bucketIndexForKey(K key) {
int bucketIndex = key.hashCode() % buckets.length;
return bucketIndex;
}
/**
*
* @param key
* @return
*/
private int hash(K key){
return Math.abs(key.hashCode()) % buckets.length;
}
public static void main(String[] args) {
// TODO Auto-generated method stub
MyCustomHashMap<String, Integer> myCustomHashMap = new MyCustomHashMap<String, Integer>();
myCustomHashMap.put("S", 22);
myCustomHashMap.put("S", 1979);
myCustomHashMap.put("V", 5);
myCustomHashMap.put("R", 31);
System.out.println("Value corresponding to key R: "+myCustomHashMap.getKey("R"));
System.out.println("Value corresponding to key V: "+myCustomHashMap.getKey("V"));
System.out.println("Displaying the contents of the HashMap:: ");
myCustomHashMap.display();
}
}
1) I feel that put(K key, V value) is somewhat flawed. Please validate it and let me know what's wrong here. On entering the same key, it gives me the wrong result. I have not yet tested it for collision cases with different keys.
2) It is said that we rehash the hashCode to compensate for poor hashCode implementations. How do I do that? If I pass the hashCode of the key, i.e. hash(key.hashCode()), it is not accepted, because it can't compute the hashCode of an int. How do I do this?
Any help would be highly appreciated.
Thanks
Sid
You handle the null key incorrectly:
if(key == null){
buckets[0] = newEntry;
}
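It's possible that buckets[0] already contains entries, in which case you will lose those entries. The method also does not return after this branch, so the following call to hash(key) throws a NullPointerException for a null key.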
The following loop has some issues :
Entry<K,V> previous = null;
Entry<K,V> current = buckets[hash];
while(current != null) {
boolean done = false;
while(!done) {
if(current.getKey().equals(key)) {
current.setValue(value);
done = true;
} else if (current.getNext() == null) {
current.setNext(newEntry);
done = true;
}
current = current.getNext();
previous = current; // you are not really setting previous to
// the previous Entry in the list - you
// are setting it to the current Entry
}
}
previous.setNext(newEntry); // you don't need this statement. You
// already have a statement inside the
// loop that adds the new Entry to the list
It looks like removing any statements related to previous will fix this loop.
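For comparison, here is a minimal sketch of how the chain walk could look with a single loop and no previous variable. The names ChainPutSketch, Node and indexFor are mine, and routing the null key through Objects.hashCode (so it is chained in bucket 0 rather than overwriting it) is an assumption about the intended behaviour.

```java
import java.util.Objects;

class ChainPutSketch<K, V> {
    // Hypothetical node type standing in for the question's Entry class.
    static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        Node(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] buckets = (Node<K, V>[]) new Node[10];

    // Objects.hashCode(null) is 0, so a null key is chained in bucket 0.
    private int indexFor(K key) {
        return Math.floorMod(Objects.hashCode(key), buckets.length);
    }

    public void put(K key, V value) {
        int i = indexFor(key);
        if (buckets[i] == null) {
            buckets[i] = new Node<>(key, value);
            return;
        }
        Node<K, V> current = buckets[i];
        while (true) {
            if (Objects.equals(current.key, key)) {
                current.value = value; // same key: replace the value
                return;
            }
            if (current.next == null) {
                current.next = new Node<>(key, value); // end of chain: append
                return;
            }
            current = current.next;
        }
    }

    public V get(K key) {
        for (Node<K, V> n = buckets[indexFor(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

Each iteration either updates, appends, or advances, so the new entry is linked exactly once.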
EDIT:
As kolakao commented, in order for your implementation to be efficient (i.e. require expected constant time for get and put), you must resize the HashMap when the number of entries exceeds some threshold (so that the average number of entries in each bucket is bounded by a constant).
It is said that we rehash the hashCode to compensate for poor hashCode implementations. How do I do that? If I pass the hashCode of the key, i.e. hash(key.hashCode()), it is not accepted, because it can't compute the hashCode of an int. How do I do this?
The idea of re-hashing doesn't involve calling hashCode for the hashCode of the key. It involves running some hardcoded function on the value obtained by key.hashCode().
For example, in Java 7 implementation of HashMap, the following function is used :
static int hash(int h) {
// This function ensures that hashCodes that differ only by
// constant multiples at each bit position have a bounded
// number of collisions (approximately 8 at default load factor).
h ^= (h >>> 20) ^ (h >>> 12);
return h ^ (h >>> 7) ^ (h >>> 4);
}
Then you use it with :
int hash = hash(key.hashCode());
int bucket = hash % buckets.length;
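One caveat with the last line: the supplemental hash keeps the sign bit of its input, so hash can be negative and % would then produce a negative bucket index. A minimal sketch of one way to guard against that, assuming an arbitrary (not power-of-two) bucket count; the class name and bucketFor helper are hypothetical.

```java
class SpreadHashDemo {
    // Supplemental hash function quoted above.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Clears the sign bit so the remainder is always in [0, bucketCount).
    static int bucketFor(Object key, int bucketCount) {
        int spread = hash(key.hashCode());
        return (spread & 0x7FFFFFFF) % bucketCount;
    }

    public static void main(String[] args) {
        Object[] keys = { "S", "V", "R", -12345, Integer.MIN_VALUE };
        for (Object key : keys) {
            System.out.println(key + " -> bucket " + bucketFor(key, 10));
        }
    }
}
```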

(Amateur Programmer) Custom HashMap size is always one less than expected?

Good afternoon,
For my current Computer Science course, our professor has us implementing a HashMap class without using Java's built-in Map classes (besides implementing the Map interface). I've finished most of it and am in debugging mode now. My professor's grading server is telling me I'm always returning one "size" less than expected, i.e. my class returns size 48 when size 49 is expected. I think I've narrowed it down to the put() method, because it's the only method where the size is incremented, but I'm not sure. Any information about the class would be appreciated. Also, as an important side note, I'm using the "chaining" collision resolution technique.
Thanks everyone !
{package adt;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
@SuppressWarnings("unchecked")
public class HashMap<K , V> implements Map<K, V>{
//linked list style
private class Entry<K, V> {
private K key;
private V value;
private Entry<K, V> next;
private Entry( K key, V value){
this.key = key;
this.value = value;
this.next = null;
}
}
private Entry<K, V>[] table = new Entry[1024]; // Creates a table of Entries. Because of the chaining implementation, each Entry will be treated like a linked list, so each Entry has a .next.
int size = 0;
public HashMap(){
for(int i =0; i<table.length; i++){
table[i] = null;
}
}
@Override
public int size() {
return size;
}
@Override
public boolean isEmpty() {
return size ==0;
}
@Override
public boolean containsKey(Object key) {
int location = Hash(key) % table.length;
Entry<K, V> e = table[location];
if(table[location] == null){
return false;
}else{
while (e!= null && e.key.equals(key) == false){
e = e.next;
if( e.key.equals(key)){
return true;
}
}
}
return false;
}
@Override
public boolean containsValue(Object value) {
for (int i = 0; i<table.length; i++){
if(table[i] != null){
Entry<K, V> e = table[i];
while( e.value.equals(value)==false){
e = e.next;
if(e.value.equals(value)){
return true;
}
}
}
}
return false;
}
@Override
public V get(Object key) {
V value = null;
int location = Hash(key)% table.length;
if (table[location] == null){
value = null;
}else{
Entry<K, V> e = table[location];
while(e !=null && e.key.equals(key) == false){
e = e.next;
if (e == null){
value = null;
}
else{
value = (V) e.value;
}
}
}
return value;
}
@Override
public V put(K key, V value) {
V returnValue = null;
int location = Hash(key) % table.length;
if ( table[location]== null){
table[location] = new Entry< K, V>(key, value);
size++;
}
else{
Entry<K, V> e = table[location];
while ( e.next != null && e.key.equals(key) == false){
e = e.next;
if( e.key.equals(key)){
e.value = value;
returnValue = e.value;
}else if(e.next == null){
e.next = new Entry<K, V>(key, value);
returnValue = e.value;
size++;
}
}
}
return returnValue;
}
@Override
public V remove(Object key) {
int location = Hash(key) % table.length;
V value = null;
Entry<K, V> e = table[location];
if(table[location] == null){
value = null;
}else {
while(e.next != null && e.key.equals(key) == false){
e = e.next;
if(e.key.equals(key)){
value = e.value;
e = null;
size--;
}else{
value = null;
}
}
}
return value;
}
@Override
public void putAll(Map<? extends K, ? extends V> m) {
// TODO Auto-generated method stub
}
@Override
public void clear() {
for(int i =0; i<table.length; i++){
table[i] = null;
}
size = 0;
}
@Override
public Set<K> keySet() {
Set<K> s = new HashSet<K>();
Entry<K, V> e;
if(!isEmpty()){
for(int i = 0; i<table.length; i++){
e = table[i];
if(e != null){
s.add(e.key);
while(e.next != null){
s.add(e.key);
e=e.next;
}
}
}
}
return s;
}
@Override
public Collection<V> values() {
Collection<V> c = new ArrayList<V>();
Entry<K,V> e;
if(isEmpty()==false){
for(int i=0; i<table.length;i++){
e = table[i];
if(e != null){
c.add(e.value);
while(e.next !=null){
c.add(e.value);
}
}
}
}
return c;
}
@SuppressWarnings("rawtypes")
@Override
public Set entrySet() {
Set<Entry> s = new HashSet<Entry>();
Entry<K,V> e;
if(isEmpty()==false){
for(int i = 0; i<table.length; i++){
e = table[i];
if( e!=null){
while(e.next != null){
s.add(e);
}
}
}
}
return s;
}
public int Hash(Object k){
String key = k.toString();
int n = 13;
for(int i = 0; i<key.length(); i++){
n += n + key.charAt(i);
}
n = n*31;
return n;
}
public boolean equals(Object o){
Map<K, V> m2;
if( o instanceof Map){
m2 = (Map<K,V>)o;
if(entrySet().equals(m2.entrySet())){
return true;
}
}
return false;
}
}
}
In general, there seem to be several places where you advance e to e.next before ever comparing e.key.
Take put, for example, since that is what we are talking about.
You get an entry from the table.
While the entry after it is not null and its current key doesn't equal the key you want to put...
You move to the next entry??? Why? You haven't yet checked whether the first entry's key equals anything.
You are skipping the first Entry's key.
You should treat it like a for-loop: initialize, test some condition, then increment.
int location = Hash(key) % table.length;
Entry<K, V> e = table[location];
while ( e.next != null && !e.key.equals(key)) {
if( e.key.equals(key)) {
// do some stuff
}
e = e.next;
}
Or, you could just rewrite it as a for-loop like this
int location = Hash(key) % table.length;
for (Entry<K, V> e = table[location]; e.next != null && !e.key.equals(key); e = e.next) {
if( e.key.equals(key)) {
// do some stuff
}
}
Almost every method in your code seems to do the same pre-increment thing and it is causing you to skip the first Entry's key.
Another problem with the approach you have is the section of the while-loop that says e.key.equals(key) == false.
(Ignoring the fact that you can rewrite that as I have above)... if the keys were equal, then the entire while-loop will be skipped.
My suggestion would be to simply remove that condition from the while because you are comparing the keys inside the while loop anyways.
Back to put specifically, though.
Your if-else condition doesn't match up. The variables you are comparing are different, so there is no real "else" relationship. You need two separate if statements that handle the different conditions, like so:
if(e.next == null){
e.next = new Entry<K, V>(key, value);
size++;
returnValue = e.value;
}
if(e.key.equals(key)){
e.value = value;
returnValue = e.value;
}
@cricket_007 spotted where you got it wrong, but I wanted to post my version anyway. I think it is more readable, with clearer logic.
@Override
public V put(K key, V value) {
int index = hash(key) % table.length;
Entry<K, V> prev = null;
for (Entry<K, V> entry = table[index]; entry != null; prev = entry, entry = entry.next) {
if (Objects.equals(entry.key, key)) {
V res = entry.value;
entry.value = value;
return res;
}
}
Entry<K, V> newEntry = new Entry<K, V>(key, value);
if (prev == null) table[index] = newEntry;
else prev.next = newEntry;
size++;
return null;
}
This code has two clearly distinct parts: the first part is a search through the existing keys until you exhaust all of them. As soon as you find one matching the key passed as a parameter, you can return.
The second part is what happens when no match was found. In this case, the size is always incremented. This makes it very clear when you do or do not need to increment the size and insert an element.
Bonus: Objects.equals handles the case where the existing key is null. If you use it in conjunction with Objects.hashCode or any hash that accepts null, your map can support null keys.

How to implement a Least Frequently Used (LFU) cache?

Least Frequently Used (LFU) is a type of cache algorithm used to manage memory within a computer. The standard characteristics of this method involve the system keeping track of the number of times a block is referenced in memory. When the cache is full and requires more room the system will purge the item with the lowest reference frequency.
What would be the best way to implement a most-recently-used cache of objects, say in Java?
I've already implemented one using LinkedHashMap(by maintaining the no. of times objects are accessed) But I'm curious if any of the new concurrent collections would be better candidates.
Consider this case : Suppose cache is full and we need to make space for another one. Say two objects are noted in cache which are accessed for one time only. Which one to remove if we come to know that other(which is not in cache)object is being accessed for more than once ?
Thanks!
You might benefit from the LFU implementation of ActiveMQ: LFUCache
They have provided some good functionality.
I think, the LFU data structure must combine priority queue (for maintaining fast access to lfu item) and hash map (for providing fast access to any item by its key); I would suggest the following node definition for each object stored in cache:
class Node<T> {
// access key
private int key;
// counter of accesses
private int numAccesses;
// current position in pq
private int currentPos;
// item itself
private T item;
//getters, setters, constructors go here
}
You need key for referring to an item.
You need numAccesses as a key for priority queue.
You need currentPos to be able to quickly find a pq position of item by key.
Now you organize a hash map (key (Integer) -> node (Node<T>)) to access items quickly, and a min-heap-based priority queue using the number of accesses as priority. With these you can perform all operations efficiently (access, add a new item, update the number of accesses, remove the LFU item). You need to write each operation carefully so that it keeps all the nodes consistent (their number of accesses, their position in the pq and their existence in the hash map). Lookups by key take constant time on average, while operations that reorder the heap (updating a count, removing the LFU item) take O(log n) time.
In my opinion, the best way to implement a most-recently-used cache of objects would be to add a new variable, 'latestTS', to each object. TS stands for timestamp.
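A rough sketch of the hash map plus ordered structure combination described above. Note the substitution: instead of a hand-written heap with a currentPos index, it uses java.util.TreeSet ordered by access count, which gives O(log n) updates without position tracking. The class name, the seq tie-breaker and the eviction-by-insertion-order rule are assumptions of mine.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

class TreeSetLfuSketch<T> {
    private static final class Node<T> {
        final int key;
        final long seq;   // hypothetical tie-breaker: insertion order
        T item;
        int numAccesses;
        Node(int key, T item, long seq) {
            this.key = key;
            this.item = item;
            this.seq = seq;
        }
    }

    private final int capacity;
    private long nextSeq = 0;
    private final Map<Integer, Node<T>> byKey = new HashMap<>();
    // Ordered by access count, then by insertion order; first() is the LFU node.
    private final TreeSet<Node<T>> byFrequency = new TreeSet<>(
            Comparator.<Node<T>>comparingInt(n -> n.numAccesses)
                      .thenComparingLong(n -> n.seq));

    TreeSetLfuSketch(int capacity) {
        this.capacity = capacity;
    }

    public T get(int key) {
        Node<T> n = byKey.get(key);
        if (n == null) {
            return null; // cache miss
        }
        byFrequency.remove(n); // must remove BEFORE changing the sort key
        n.numAccesses++;
        byFrequency.add(n);
        return n.item;
    }

    public void put(int key, T item) {
        if (capacity <= 0) {
            return;
        }
        Node<T> existing = byKey.get(key);
        if (existing != null) {
            existing.item = item; // update in place; the count is left unchanged
            return;
        }
        if (byKey.size() >= capacity) {
            Node<T> victim = byFrequency.pollFirst(); // least frequently used
            byKey.remove(victim.key);
        }
        Node<T> n = new Node<>(key, item, nextSeq++);
        byKey.put(key, n);
        byFrequency.add(n);
    }
}
```

The important detail is removing a node from the ordered set before mutating numAccesses; changing the sort key of an element while it is inside a TreeSet would corrupt the ordering.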
// A static method that returns the current date and time as milliseconds since January 1st 1970
long latestTS = System.currentTimeMillis();
ConcurrentLinkedHashMap is not yet implemented in Concurrent Java Collections.
(Ref: Java Concurrent Collection API). However, you can try and use ConcurrentHashMap and DoublyLinkedList
About the case to be considered: in such case, as I have said that you can declare latestTS variable, based upon the value of latestTS variable, you can remove an entry and add the new object. (Don't forget to update frequency and latestTS of the new object added)
As you have mentioned, you can use LinkedHashMap as it gives element access in O(1) and also, you get the order traversal.
Please, find the below code for LFU Cache:
(PS: The below code is the answer for the question in the title i.e. "How to implement LFU cache")
import java.util.LinkedHashMap;
import java.util.Map;
public class LFUCache {
class CacheEntry
{
private String data;
private int frequency;
// default constructor
private CacheEntry()
{}
public String getData() {
return data;
}
public void setData(String data) {
this.data = data;
}
public int getFrequency() {
return frequency;
}
public void setFrequency(int frequency) {
this.frequency = frequency;
}
}
private static int initialCapacity = 10;
private static LinkedHashMap<Integer, CacheEntry> cacheMap = new LinkedHashMap<Integer, CacheEntry>();
/* LinkedHashMap is used because it has features of both HashMap and LinkedList.
* Thus, we can get an entry in O(1) and also, we can iterate over it easily.
* */
public LFUCache(int initialCapacity)
{
this.initialCapacity = initialCapacity;
}
public void addCacheEntry(int key, String data)
{
if(!isFull())
{
CacheEntry temp = new CacheEntry();
temp.setData(data);
temp.setFrequency(0);
cacheMap.put(key, temp);
}
else
{
int entryKeyToBeRemoved = getLFUKey();
cacheMap.remove(entryKeyToBeRemoved);
CacheEntry temp = new CacheEntry();
temp.setData(data);
temp.setFrequency(0);
cacheMap.put(key, temp);
}
}
public int getLFUKey()
{
int key = 0;
int minFreq = Integer.MAX_VALUE;
for(Map.Entry<Integer, CacheEntry> entry : cacheMap.entrySet())
{
if(minFreq > entry.getValue().frequency)
{
key = entry.getKey();
minFreq = entry.getValue().frequency;
}
}
return key;
}
public String getCacheEntry(int key)
{
if(cacheMap.containsKey(key)) // cache hit
{
CacheEntry temp = cacheMap.get(key);
temp.frequency++;
cacheMap.put(key, temp);
return temp.data;
}
return null; // cache miss
}
public static boolean isFull()
{
if(cacheMap.size() == initialCapacity)
return true;
return false;
}
}
Here's the O(1) implementation for LFU - http://dhruvbird.com/lfu.pdf
I have tried to implement the LFU cache below, using the LFU paper as a reference.
My implementation is working nicely.
If anyone has suggestions for improving it further, please let me know.
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;
public class LFUCache {
private Map<Integer, Node> cache = new HashMap<>();
private Map<Integer, Integer> counts = new HashMap<>();
private TreeMap<Integer, DoublyLinkedList> frequencies = new TreeMap<>();
private final int CAPACITY;
public LFUCache(int capacity) {
this.CAPACITY = capacity;
}
public int get(int key) {
if (!cache.containsKey(key)) {
return -1;
}
Node node = cache.get(key);
int frequency = counts.get(key);
frequencies.get(frequency).remove(new Node(node.key(), node.value()));
removeFreq(frequency);
frequencies.computeIfAbsent(frequency + 1, k -> new DoublyLinkedList()).add(new Node(node.key(), node.value()));
counts.put(key, frequency + 1);
return cache.get(key).value();
}
public void set(int key, int value) {
if (!cache.containsKey(key)) {
Node node = new Node(key, value);
if (cache.size() == CAPACITY) {
int l_count = frequencies.firstKey();
Node deleteThisNode = frequencies.get(l_count).head();
frequencies.get(l_count).remove(deleteThisNode);
int deleteThisKey = deleteThisNode.key();
removeFreq(l_count);
cache.remove(deleteThisKey);
counts.remove(deleteThisKey);
}
cache.put(key, node);
counts.put(key, 1);
frequencies.computeIfAbsent(1, k -> new DoublyLinkedList()).add(node);
}
}
private void removeFreq(int frequency) {
if (frequencies.get(frequency).size() == 0) {
frequencies.remove(frequency);
}
}
public Map<Integer, Node> getCache() {
return cache;
}
public Map<Integer, Integer> getCounts() {
return counts;
}
public TreeMap<Integer, DoublyLinkedList> getFrequencies() {
return frequencies;
}
}
class Node {
private int key;
private int value;
private Node next;
private Node prev;
public Node(int key, int value) {
this.key = key;
this.value = value;
}
public Node getNext() {
return next;
}
public void setNext(Node next) {
this.next = next;
}
public Node getPrev() {
return prev;
}
public void setPrev(Node prev) {
this.prev = prev;
}
public int key() {
return key;
}
public int value() {
return value;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof Node)) return false;
Node node = (Node) o;
return key == node.key &&
value == node.value;
}
@Override
public int hashCode() {
return Objects.hash(key, value);
}
@Override
public String toString() {
return "Node{" +
"key=" + key +
", value=" + value +
'}';
}
}
class DoublyLinkedList {
private int size;
private Node head;
private Node tail;
public void add(Node node) {
if (null == head) {
head = node;
} else {
tail.setNext(node);
node.setPrev(tail);
}
tail = node;
size++;
}
public void remove(Node node) {
if(null == head || null == node) {
return;
}
if(this.size() == 1 && head.equals(node)) {
head = null;
tail = null;
} else if (head.equals(node)) {
head = node.getNext();
head.setPrev(null);
} else if (tail.equals(node)) {
Node prevToTail = tail.getPrev();
prevToTail.setNext(null);
tail = prevToTail;
} else {
Node current = head.getNext();
while(!current.equals(tail)) {
if(current.equals(node)) {
Node prevToCurrent = current.getPrev();
Node nextToCurrent = current.getNext();
prevToCurrent.setNext(nextToCurrent);
nextToCurrent.setPrev(prevToCurrent);
break;
}
current = current.getNext();
}
}
size--;
}
public Node head() {
return head;
}
public int size() {
return size;
}
}
Client code to use the above cache implementation -
import java.util.Map;
public class Client {
public static void main(String[] args) {
Client client = new Client();
LFUCache cache = new LFUCache(4);
cache.set(11, function(11));
cache.set(12, function(12));
cache.set(13, function(13));
cache.set(14, function(14));
cache.set(15, function(15));
client.print(cache.getFrequencies());
cache.get(13);
cache.get(13);
cache.get(13);
cache.get(14);
cache.get(14);
cache.get(14);
cache.get(14);
client.print(cache.getCache());
client.print(cache.getCounts());
client.print(cache.getFrequencies());
}
public void print(Map<Integer, ? extends Object> map) {
for(Map.Entry<Integer, ? extends Object> entry : map.entrySet()) {
if(entry.getValue() instanceof Node) {
System.out.println("Cache Key => "+entry.getKey()+", Cache Value => "+((Node) entry.getValue()).toString());
} else if (entry.getValue() instanceof DoublyLinkedList) {
System.out.println("Frequency Key => "+entry.getKey()+" Frequency Values => [");
Node head = ((DoublyLinkedList) entry.getValue()).head();
while(null != head) {
System.out.println(head.toString());
head = head.getNext();
}
System.out.println(" ]");
} else {
System.out.println("Count Key => "+entry.getKey()+", Count Value => "+entry.getValue());
}
}
}
public static int function(int key) {
int prime = 31;
return key*prime;
}
}
How about a priority queue? You can keep elements sorted there with keys representing the frequency. Just update the object position in the queue after visiting it. You can update just from time to time for optimizing the performance (but reducing precision).
Many implementations I have seen have runtime complexity O(log(n)). This means that when the cache size is n, the time needed to insert an element into the cache or remove one from it is logarithmic. Such implementations usually use a min heap to maintain the usage frequencies of elements. The root of the heap contains the element with the lowest frequency and can be accessed in O(1) time. But to maintain the heap property we have to move an element inside the heap to its proper position every time it is used (and its frequency is incremented), or when we have to insert a new element into the cache (and so put it into the heap).
But the runtime complexity can be reduced to O(1) when we maintain a hashmap (Java) or unordered_map (C++) with the element as key. Additionally we need two sorts of lists: a frequency list and element lists. An element list contains the elements that have the same frequency, and the frequency list contains the element lists.
frequency list
1   3   6   7
a   k   y   x
c   l       z
m   n
Here in the example we see the frequency list that has 4 elements (4 elements lists). The element list 1 contains elements (a,c,m), the elements list 3 contains elements (k, l, n) etc.
Now, when we use say element y, we have to increment its frequency and put it in the next list. Because the elements list with frequency 6 becomes empty, we delete it. The result is:
frequency list
1 3 7
a k y
c l x
m n z
We place the element y at the beginning of the elements list 7. When we have to remove elements from the list later, we will start from the end (first z, then x and then y).
Now, when we use element n, we have to increment its frequency and put it into a new list, with frequency 4:
frequency list
1   3   4   7
a   k   n   y
c   l       x
m           z
I hope the idea is clear. Here is my C++ implementation of the LFU cache; I will add a Java implementation later.
The class has just 2 public methods, void set(key k, value v)
and bool get(key k, value &v). In the get method the value to retrieve is passed back by reference: when the element is found, it is assigned to v and the method returns true. When the element is not found, the method returns false.
#include <iterator>
#include <list>
#include <unordered_map>
#include <utility>
using namespace std;
typedef unsigned uint;
template<typename K, typename V = K>
struct Entry
{
K key;
V value;
};
template<typename K, typename V = K>
class LFUCache
{
// no typename here: list and Entry are not dependent qualified names
typedef list<Entry<K, V>> ElementList;
typedef list<pair<uint, ElementList>> FrequencyList;
private:
unordered_map <K, pair<typename FrequencyList::iterator, typename ElementList::iterator>> cacheMap;
FrequencyList elements;
uint maxSize;
uint curSize;
void incrementFrequency(pair<typename FrequencyList::iterator, typename ElementList::iterator> p) {
if (p.first == prev(elements.end())) {
//frequency list contains single list with some frequency, create new list with incremented frequency (p.first->first + 1)
elements.push_back({ p.first->first + 1, { {p.second->key, p.second->value} } });
// erase and insert the key with new iterator pair
cacheMap[p.second->key] = { prev(elements.end()), prev(elements.end())->second.begin() };
}
else {
// there exist element(s) with higher frequency
auto pos = next(p.first);
if (p.first->first + 1 == pos->first)
// same frequency in the next list, add the element in the begin
pos->second.push_front({ p.second->key, p.second->value });
else
// insert new list before next list
pos = elements.insert(pos, { p.first->first + 1 , {{p.second->key, p.second->value}} });
// update cacheMap iterators
cacheMap[p.second->key] = { pos, pos->second.begin() };
}
// if the element list with the old frequency contained only this single element, erase the list from the frequency list
if (p.first->second.size() == 1)
elements.erase(p.first);
else
// erase only the element with updated frequency from the old list
p.first->second.erase(p.second);
}
void eraseOldElement() {
if (elements.size() > 0) {
auto key = prev(elements.begin()->second.end())->key;
if (elements.begin()->second.size() < 2)
elements.erase(elements.begin());
else
elements.begin()->second.erase(prev(elements.begin()->second.end()));
cacheMap.erase(key);
curSize--;
}
}
public:
LFUCache(uint size) {
if (size > 0)
maxSize = size;
else
maxSize = 10;
curSize = 0;
}
void set(K key, V value) {
auto entry = cacheMap.find(key);
if (entry == cacheMap.end()) {
if (curSize == maxSize)
eraseOldElement();
if (elements.begin() == elements.end()) {
elements.push_front({ 1, { {key, value} } });
}
else if (elements.begin()->first == 1) {
elements.begin()->second.push_front({ key,value });
}
else {
elements.push_front({ 1, { {key, value} } });
}
cacheMap.insert({ key, {elements.begin(), elements.begin()->second.begin()} });
curSize++;
}
else {
entry->second.second->value = value;
incrementFrequency(entry->second);
}
}
bool get(K key, V &value) {
auto entry = cacheMap.find(key);
if (entry == cacheMap.end())
return false;
value = entry->second.second->value;
incrementFrequency(entry->second);
return true;
}
};
Here are examples of usage:
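#include <cassert>
#include <string>
int main()
{
bool rc; // true when a lookup hits
int r; // receives the looked-up value
LFUCache<int>cache(3); // cache of size 3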
cache.set(1, 1);
cache.set(2, 2);
cache.set(3, 3);
cache.set(2, 4);
rc = cache.get(1, r);
assert(rc);
assert(r == 1);
// evict old element, in this case 3
cache.set(4, 5);
rc = cache.get(3, r);
assert(!rc);
rc = cache.get(4, r);
assert(rc);
assert(r == 5);
LFUCache<int, string>cache2(2);
cache2.set(1, "one");
cache2.set(2, "two");
string val;
rc = cache2.get(1, val);
if (rc)
assert(val == "one");
else
assert(false);
cache2.set(3, "three"); // evict 2
rc = cache2.get(2, val);
assert(rc == false);
rc = cache2.get(3, val);
assert(rc);
assert(val == "three");
}
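The same O(1) idea can also be sketched in Java. This is not the Java implementation promised above, only an illustrative sketch under stated assumptions: the class name LfuCache and its fields are hypothetical, and instead of a linked frequency list it keeps one insertion-ordered java.util.LinkedHashSet of keys per frequency in a HashMap, plus a minFreq counter that points at the lowest non-empty frequency. Within one frequency the oldest key is evicted first, which matches removing from the end of an element list in the description above.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

// Hypothetical O(1) LFU cache: one insertion-ordered key set per frequency.
class LfuCache<K, V> {
    private final int capacity;
    private int minFreq = 0;
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Integer> freqs = new HashMap<>();
    private final Map<Integer, LinkedHashSet<K>> buckets = new HashMap<>();

    LfuCache(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    V get(K key) {
        if (!values.containsKey(key)) return null;
        promote(key);
        return values.get(key);
    }

    void set(K key, V value) {
        if (values.containsKey(key)) {
            values.put(key, value);
            promote(key);
            return;
        }
        if (values.size() == capacity) evict();
        values.put(key, value);
        freqs.put(key, 1);
        buckets.computeIfAbsent(1, f -> new LinkedHashSet<>()).add(key);
        minFreq = 1; // a new key always has the lowest frequency
    }

    // Move the key from its current frequency bucket to the next one.
    private void promote(K key) {
        int freq = freqs.get(key);
        LinkedHashSet<K> bucket = buckets.get(freq);
        bucket.remove(key);
        if (bucket.isEmpty()) {
            buckets.remove(freq); // drop the empty element list
            if (minFreq == freq) minFreq = freq + 1;
        }
        freqs.put(key, freq + 1);
        buckets.computeIfAbsent(freq + 1, f -> new LinkedHashSet<>()).add(key);
    }

    // Remove the oldest key in the lowest-frequency bucket.
    private void evict() {
        LinkedHashSet<K> bucket = buckets.get(minFreq);
        K victim = bucket.iterator().next();
        bucket.remove(victim);
        if (bucket.isEmpty()) buckets.remove(minFreq);
        values.remove(victim);
        freqs.remove(victim);
    }
}
```

Run against the first scenario of the C++ usage example (capacity 3; set 1, 2, 3; set 2 again; get 1; set 4), this sketch also evicts key 3.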
Here is a simple implementation of LFU cache in Go/Golang based on here.
import "container/list"
type LFU struct {
cache map[int]*list.Element
freqQueue map[int]*list.List
cap int
maxFreq int
lowestFreq int
}
type entry struct {
key, val int
freq int
}
func NewLFU(capacity int) *LFU {
return &LFU{
cache: make(map[int]*list.Element),
freqQueue: make(map[int]*list.List),
cap: capacity,
maxFreq: capacity - 1,
lowestFreq: 0,
}
}
// O(1)
func (c *LFU) Get(key int) int {
if e, ok := c.cache[key]; ok {
val := e.Value.(*entry).val
c.updateEntry(e, val)
return val
}
return -1
}
// O(1)
func (c *LFU) Put(key int, value int) {
if e, ok := c.cache[key]; ok {
c.updateEntry(e, value)
} else {
if len(c.cache) == c.cap {
c.evict()
}
if c.freqQueue[0] == nil {
c.freqQueue[0] = list.New()
}
e := c.freqQueue[0].PushFront(&entry{key, value, 0})
c.cache[key] = e
c.lowestFreq = 0
}
}
func (c *LFU) updateEntry(e *list.Element, val int) {
key := e.Value.(*entry).key
curFreq := e.Value.(*entry).freq
c.freqQueue[curFreq].Remove(e)
delete(c.cache, key)
nextFreq := curFreq + 1
if nextFreq > c.maxFreq {
nextFreq = c.maxFreq
}
if c.lowestFreq == curFreq && c.freqQueue[curFreq].Len() == 0 {
c.lowestFreq = nextFreq
}
if c.freqQueue[nextFreq] == nil {
c.freqQueue[nextFreq] = list.New()
}
newE := c.freqQueue[nextFreq].PushFront(&entry{key, val, nextFreq})
c.cache[key] = newE
}
func (c *LFU) evict() {
back := c.freqQueue[c.lowestFreq].Back()
delete(c.cache, back.Value.(*entry).key)
c.freqQueue[c.lowestFreq].Remove(back)
}

Linear Probing on Java HashTable implementation

So I have a HashTable implementation here that I wrote using only arrays, with a little bit of help on the code. Unfortunately, I don't quite understand one of the lines someone added to the "get" and "put" methods. What exactly is happening in the while loop below? It is a method for linear probing, correct? Also, why is the loop checking the conditions it's checking?
Specifically,
int hash = hashThis(key);
while(data[hash] != AVAILABLE && data[hash].key() != key) {
hash = (hash + 1) % capacity;
}
Here's the whole Java class below for full reference.
public class Hashtable2 {
private Node[] data;
private int capacity;
private static final Node AVAILABLE = new Node("Available", null);
public Hashtable2(int capacity) {
this.capacity = capacity;
data = new Node[capacity];
for(int i = 0; i < data.length; i++) {
data[i] = AVAILABLE;
}
}
public int hashThis(String key) {
return key.hashCode() % capacity;
}
public Object get(String key) {
int hash = hashThis(key);
while(data[hash] != AVAILABLE && data[hash].key() != key) {
hash = (hash + 1) % capacity;
}
return data[hash].element();
}
public void put(String key, Object element) {
if(key != null) {
int hash = hashThis(key);
while(data[hash] != AVAILABLE && data[hash].key() != key) {
hash = (hash + 1) % capacity;
}
data[hash] = new Node(key, element);
}
}
public String toString(){
String s="<";
for (int i=0;i<this.capacity;i++)
{
s+=data[i]+", ";
}
s+=">";
return s;
}
Thank you.
I just rewrote some part of the code and added the findHash-method - try to avoid code-duplication!
private int findHash(String key) {
int hash = hashThis(key);
// search for the next available element or for the next matching key
// (equals compares string contents; != would only compare references)
while(data[hash] != AVAILABLE && !data[hash].key().equals(key)) {
hash = (hash + 1) % capacity;
}
return hash;
}
public Object get(String key) {
return data[findHash(key)].element();
}
public void put(String key, Object element) {
data[findHash(key)] = new Node(key, element);
}
You asked what exactly the loop in findHash does. The data array was initialized with AVAILABLE, meaning that its slots do not (yet) contain any actual data. When we add an element with put, a hash value is calculated first; it is simply the index in the data array at which to store the element. If that position has already been taken by another element with the same hash value but a different key, we step forward until we find the next AVAILABLE position. The get method works essentially the same way: if an element with a different key is found, the next slot is probed, and so on.
The data array is treated as a so-called ring buffer. That is, the search runs to the end of the array and then continues at the beginning, starting with index 0. This wrap-around is done with the modulo operator %.
Alright?
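To make the wrap-around concrete, here is a small sketch in Java. ProbeDemo, startIndex and probeSequence are hypothetical names used only for illustration and are not part of Hashtable2. The sketch also uses Math.floorMod for the start index, on the assumption that keys may have a negative hashCode(), in which case plain % would produce a negative array index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that only illustrates the probe order.
class ProbeDemo {
    // Math.floorMod keeps the index in [0, capacity) even for negative hash codes.
    static int startIndex(String key, int capacity) {
        return Math.floorMod(key.hashCode(), capacity);
    }

    // Indices visited when every slot is occupied by another key: one full lap.
    static List<Integer> probeSequence(int start, int capacity) {
        List<Integer> visited = new ArrayList<>();
        int index = start;
        for (int i = 0; i < capacity; i++) {
            visited.add(index);
            index = (index + 1) % capacity; // wraps from capacity - 1 back to 0
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(probeSequence(3, 5)); // [3, 4, 0, 1, 2]
    }
}
```

The loop here is bounded by the capacity. The while loop in the question has no such bound, so it never terminates when the table is full and the key is absent.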
Here is a sample Hashtable implementation using generics, with linear probing for collision resolution. Some assumptions were made during implementation; they are documented in the javadoc above the class and the methods.
This implementation doesn't have all the methods of Hashtable, such as keySet and putAll, but it covers the most frequently used ones: get, put, remove, size etc.
The code that finds the index is repeated in get, put and remove; it could be factored out into a separate method.
class HashEntry<K, V> {
private K key;
private V value;
public HashEntry(K key, V value) {
this.key = key;
this.value = value;
}
public void setKey(K key) { this.key = key; }
public K getKey() { return this.key; }
public void setValue(V value) { this.value = value; }
public V getValue() { return this.value; }
}
/**
* Hashtable implementation ...
* - with linear probing
* - without loadfactor & without rehash implementation.
* - throws exception when table is full
* - returns null when trying to remove non existent key
*
* @param <K>
* @param <V>
*/
public class Hashtable<K, V> {
private final static int DEFAULT_CAPACITY = 16;
private int count;
private int capacity;
private HashEntry<K, V>[] table;
public Hashtable() {
this(DEFAULT_CAPACITY);
}
public Hashtable(int capacity) {
super();
this.capacity = capacity;
table = new HashEntry[capacity];
}
public boolean isEmpty() { return (count == 0); }
public int size() { return count; }
public void clear() { table = new HashEntry[this.capacity]; count = 0; }
/**
* Returns null if the probe count exceeds the capacity or the element is not found.
*
* @param key
* @return
*/
public V get(K key) {
V value = null;
int probeCount = 0;
int hash = this.hashCode(key);
while (table[hash] != null && !table[hash].getKey().equals(key) && probeCount <= this.capacity) {
hash = (hash + 1) % this.capacity;
probeCount++;
}
if (table[hash] != null && probeCount <= this.capacity) {
value = table[hash].getValue();
}
return value;
}
/**
* Counts the probes and stops once the probe count exceeds the capacity.
*
* Throws Exception if the table is full.
*
* @param key
* @param value
* @return
* @throws Exception
*/
public V put(K key, V value) throws Exception {
int probeCount = 0;
int hash = this.hashCode(key);
while (table[hash] != null && !table[hash].getKey().equals(key) && probeCount <= this.capacity) {
hash = (hash + 1) % this.capacity;
probeCount++;
}
if (probeCount <= this.capacity) {
if (table[hash] != null) {
table[hash].setValue(value);
} else {
table[hash] = new HashEntry<>(key, value);
count++;
}
return table[hash].getValue();
} else {
throw new Exception("Table Full!!");
}
}
/**
* If key present then mark table[hash] = null & return value, else return null.
*
* @param key
* @return
*/
public V remove(K key) {
V value = null;
int probeCount = 0;
int hash = this.hashCode(key);
while (table[hash] != null && !table[hash].getKey().equals(key) && probeCount <= this.capacity) {
hash = (hash + 1) % this.capacity;
probeCount++;
}
if (table[hash] != null && probeCount <= this.capacity) {
value = table[hash].getValue();
table[hash] = null;
count--;
}
return value;
}
public boolean contains(Object value) {
return this.containsValue(value);
}
public boolean containsKey(Object key) {
for (HashEntry<K, V> entry : table) {
if (entry != null && entry.getKey().equals(key)) {
return true;
}
}
return false;
}
public boolean containsValue(Object value) {
for (HashEntry<K, V> entry : table) {
if (entry != null && entry.getValue().equals(value)) {
return true;
}
}
return false;
}
@Override
public String toString() {
StringBuilder data = new StringBuilder();
data.append("{");
for (HashEntry<K, V> entry : table) {
if (entry != null) {
data.append(entry.getKey()).append("=").append(entry.getValue()).append(", ");
}
}
if (data.toString().endsWith(", ")) {
data.delete(data.length() - 2, data.length());
}
data.append("}");
return data.toString();
}
// floorMod keeps the index non-negative even when key.hashCode() is negative
private int hashCode(K key) { return Math.floorMod(key.hashCode(), this.capacity); }
public static void main(String[] args) throws Exception {
Hashtable<Integer, String> table = new Hashtable<Integer, String>(2);
table.put(1, "1");
table.put(2, "2");
System.out.println(table);
table.put(1, "3");
table.put(2, "4");
System.out.println(table);
table.remove(1);
System.out.println(table);
table.put(1, "1");
System.out.println(table);
System.out.println(table.get(1));
System.out.println(table.get(3));
// table is full so below line
// will throw an exception
table.put(3, "2");
}
}
Sample run of the above code.
{2=2, 1=1}
{2=4, 1=3}
{2=4}
{2=4, 1=1}
1
null
Exception in thread "main" java.lang.Exception: Table Full!!
at Hashtable.put(Hashtable.java:95)
at Hashtable.main(Hashtable.java:177)
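The repeated probe loop mentioned above could be factored into one helper. The following is only a sketch under assumptions: ProbingTable and findIndex are hypothetical names, the table stores keys and values in two plain Object arrays instead of HashEntry objects, and remove is left out because clearing a slot to null in the middle of a probe path would need extra handling.

```java
// Hypothetical, stripped-down table showing one shared probe helper.
class ProbingTable<K, V> {
    private final Object[] keys;
    private final Object[] values;

    ProbingTable(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        keys = new Object[capacity];
        values = new Object[capacity];
    }

    // Returns the slot holding the key, or the first empty slot on its probe
    // path, or -1 if the table is full and the key is absent.
    private int findIndex(K key) {
        int index = Math.floorMod(key.hashCode(), keys.length);
        for (int probes = 0; probes < keys.length; probes++) {
            if (keys[index] == null || keys[index].equals(key)) return index;
            index = (index + 1) % keys.length;
        }
        return -1;
    }

    @SuppressWarnings("unchecked")
    V get(K key) {
        int index = findIndex(key);
        return (index >= 0 && keys[index] != null) ? (V) values[index] : null;
    }

    void put(K key, V value) {
        int index = findIndex(key);
        if (index < 0) throw new IllegalStateException("table full");
        keys[index] = key;
        values[index] = value;
    }
}
```

With a capacity of 2 and the keys 1, 2 and 3 from the sample run, get(3) returns null and put(3, ...) throws, in line with the behaviour shown above.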
