I was to draw a diagram of a hash table, size 6, after these functions were run.
add(13) add(21) add(7) add(25)
I'm very unfamiliar with hash tables, but I came up with this.
(7)(13)(21)(25)( )( )
I know that when you add an element to a hash table, it is assigned a specific hash code, but I don't understand how to find this. Can someone explain this to me?
To build on @HovercraftFullOfEels's answer: not implementing hashCode() properly in an object will cause HashMap to behave badly when that object is used as a key. In other words, a prerequisite for using HashMap is that all potential keys properly implement hashCode() (and equals()).
Here is some more information on implementing hashCode(): https://www.google.com/search?q=implementing+hashcode+in+java
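As a rough illustration of what "properly implemented" means, here is a minimal sketch. The Point class and PointDemo wrapper are made up for this example; the only library calls are java.util.Objects.hash and the standard HashMap methods. The key point is that equals() and hashCode() use the same fields.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class PointDemo {
    // Hypothetical value class, used only for illustration.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Point)) return false;
            Point other = (Point) obj;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            // Built from the same fields that equals() compares.
            return Objects.hash(x, y);
        }
    }

    public static void main(String[] args) {
        Map<Point, String> labels = new HashMap<>();
        labels.put(new Point(1, 2), "start");
        // A different but equal instance finds the same entry.
        System.out.println(labels.get(new Point(1, 2)));
    }
}
```

Without the hashCode() override, the second Point would almost certainly land in a different bucket and the lookup would return null.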
Define your table like
HashMap<Integer, String> newMap = new HashMap<>();
then to place elements inside you do
newMap.put(1, "what ever");
Unfortunately, hash tables require two arguments to actually assign values. A hash table works like this:
hashtable.add(key, value);
The key is run through a hash function, and the value is placed into an array at the index provided by the hash function, like so:
public class FauxHashTable {
Object[] storage;
public FauxHashTable(int size)
{
storage = new Object[size];
}
public void add(Object key, Object value)
{
storage[hashFunction(key)] = value;
}
public int hashFunction(Object o)
{
// Maps the key's hashCode() to a valid array index. Different keys can end up with the same index (a collision), which this simplified table does not handle.
return Math.floorMod(o.hashCode(), storage.length);
}
}
So, your question doesn't make sense to me because you give an add function with only one argument. I think the fact that you added the Java tag is confusing a lot of the answerers here because when one says hash table with Java, one usually means the HashMap class. HashMap is an implementation of a hash table, but I don't think it's quite what you're asking about here.
Since you say "I was to draw", I assume this is homework and so I'm going to try and not straight up give you the answer.
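To show the mechanics without using the numbers from the exercise, here is a small sketch. It assumes the course uses value % tableSize as the hash function and resolves collisions by chaining (a list per slot); the question doesn't say, so check your notes. IntHashTable and its method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical teaching class: a hash table of ints using separate chaining.
class IntHashTable {
    private final List<List<Integer>> buckets = new ArrayList<>();

    IntHashTable(int size) {
        for (int i = 0; i < size; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Assumed hash function: the value modulo the table size.
    int indexFor(int value) {
        return Math.floorMod(value, buckets.size());
    }

    void add(int value) {
        buckets.get(indexFor(value)).add(value);
    }

    List<Integer> bucket(int index) {
        return buckets.get(index);
    }

    public static void main(String[] args) {
        IntHashTable table = new IntHashTable(5);
        table.add(12); // 12 % 5 = 2
        table.add(9);  // 9 % 5 = 4
        table.add(7);  // 7 % 5 = 2, so it shares slot 2 with 12
        for (int i = 0; i < 5; i++) {
            System.out.println(i + ": " + table.bucket(i));
        }
    }
}
```

Note that the slot depends on the hash function, not on the sorted order of the values. If your course uses open addressing (probing for the next free slot) instead of chaining, the diagram will look different.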
I have a HashMap where keys are mutable complex objects, so their hash changes over their lifetime. I know exactly which objects have changed, but only after the fact, so removing them with map.remove(object) will not work because the hash has already changed. The number of objects in the map is roughly in the range [10, 10 000]; the issue is rather the number of changes and accesses.
It would be demanding to do a "would you change" check on each object before changing it - double the work, not to mention the mess of a code necessary for it.
I do iterate entries in the map later on, so I figured I could simply mark objects for removal and get rid of them using iterator.remove(), but unfortunately HashMap$HashIterator#remove calls hash(key).
The one option that comes to my mind is to throw away the original map and rehash all objects that are not marked for removal into a new map, but that would generate a lot of extra time and memory garbage - would like to avoid it.
Another option would be writing my own HashMap that keeps track of exactly where every element is stored (say, a map formed by a 2D object array, so two int coordinates). This would be more efficient, but also a lot more to write and test.
Is there any easier way to do this that I have missed?
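For readers who have not hit this before, the failure can be reproduced in a few lines. This is only a sketch: MutableKey is a made-up stand-in for the complex objects described above, with a hash that depends on a mutable field.

```java
import java.util.HashMap;
import java.util.Map;

class MutableKeyDemo {
    // Hypothetical key whose hash depends on a mutable field.
    static final class MutableKey {
        int id;

        MutableKey(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof MutableKey && ((MutableKey) obj).id == id;
        }

        @Override
        public int hashCode() {
            return id;
        }
    }

    public static void main(String[] args) {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey(1);
        map.put(key, "payload");

        key.id = 2; // the hash changes while the key is still in the map

        // remove() looks the key up by its new hash and finds nothing,
        // so the stale entry stays in the map.
        System.out.println(map.remove(key));
        System.out.println(map.size());
    }
}
```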
Edit:
I use wrappers over the complex object that supply different hash/equals pairs depending on subset of properties. Each object may be in multiple maps. Say I look for red object in map that uses wrappers with hash/equals over color, create red dummy object, and do map.get(dummy).
Implementations of hash/equals and specific properties they touch are not part of my code.
All maps are objects mapped onto themselves (like Set implementation, but I do need map access methods). I can store hashes in those wrappers, and then they will adhere to the contract from hash perspective, but equals will still fail me.
I do understand that changing the hash/equals output leads to undefined behavior, but it really should not matter in theory: I change the object, and then I do not want to use the map until the changed object is gone from it. A hash map should not really need to call equals() or hash() for an object it is already pointing at with an iterator.
As others said, without seeing the code the general recommendations are either to find an immutable key (e.g. a generated one, or a subset of some immutable properties) or to look at other data structures.
I didn't quite understand why you can "store hashes in those wrappers" but still have trouble with the equals method. (I guess the stored hashes would not be unique, so they could not be checked in the equals method?)
But if you have immutable hashes and if you have only one instance per "equal" object (not one instance stored in the map and another but equal instance used for lookup), you could have a look at the IdentityHashMap class.
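A small sketch of how IdentityHashMap behaves here, assuming a made-up Item class with a content-based hash. IdentityHashMap compares keys with == and uses System.identityHashCode, so mutating the object does not break removal. The flip side is that looking up by an equal "dummy" object no longer works.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityMapDemo {
    // Hypothetical key with a content-based hash that can change.
    static final class Item {
        String color;

        Item(String color) {
            this.color = color;
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Item && ((Item) obj).color.equals(color);
        }

        @Override
        public int hashCode() {
            return color.hashCode();
        }
    }

    public static void main(String[] args) {
        Map<Item, Item> map = new IdentityHashMap<>();
        Item item = new Item("red");
        map.put(item, item);

        // An equal but distinct instance is NOT found: identity, not equals().
        System.out.println(map.get(new Item("red")));

        item.color = "blue"; // content hash changes, identity does not

        // Removal by the same reference still works after the mutation.
        System.out.println(map.remove(item) == item);
        System.out.println(map.isEmpty());
    }
}
```

So this only helps if every lookup can use the stored instance itself, which does not match the "red dummy object" lookup described in the question.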
Previous state:
The user supplies equals/hash lambdas that work over the complex object to put it in the correct place in each map (looking up objects with similar properties in constant time).
The complex object changed at inconvenient times, causing issues with reinsertion: the object changes, you pull it out, then return it with the new hash.
Current solution:
In theory could be solved with custom implementation of hash map (note NOT hash map interface, would not uphold its contract). This map would cache hashes for its contents for rehash purposes, and maintain coordinates in underlying structure so equals is not necessary for removal with values iterator. May implement it later to reduce memory footprint.
Used solution was forcing user to supply key that wraps all used properties and adds hash/equals that considers those properties. Now even though complex object changes, its key stays the same until prompted for update (not inside of the map at the time of the update).
public class Node {
public HashMap<Key, Node> map;
public Data data;
public Key key;
public Node parent;
public void update() {
if (parent != null) parent.map.remove(key);
key.update(data);
if (parent != null) parent.map.put(key, this);
}
}
public abstract class Key {
public abstract void update(Data data);
public abstract int hashCode();
public abstract boolean equals(Object obj);
}
public class MyKey extends Key {
private Object value = null;
public final void update(Data data) {
value = data.value;
}
public final boolean equals(Object obj) {
if (!(obj instanceof MyKey)) return false;
MyKey that = (MyKey) obj;
return this.value == that.value;
}
public final int hashCode() {
return value == null ? 0 : value.hashCode();
}
}
This requires a lot of primitive Key implementations, but at least it works. Will probably look for something better.
I'm having some trouble when using .put(Integer, String) in Java.
To my understanding, when a collision happens the HashMap asks whether the two values are the same with .equals(Object), and if they are not, the two values are stored in a LinkedList. Nevertheless, size() is 1 and the hash iterator only shows one result, the last one.
Apart from this, the Java HashMap API states:
public V put(K key, V value)
Associates the specified value with the specified key in this map. If
the map previously contained a mapping for the key, the old value is
replaced.
THIS IS NOT WHAT I HAVE READ EVERYWHERE.
Thoughts?
import java.util.HashMap;
public class HashProblema {
public static void main(String[] args) {
HashMap<Integer, String> hash = new HashMap<>();
hash.put(1, "sdaaaar");
hash.put(1, "bjbh");
System.out.println(hash.size());
for (Object value : hash.values()) {
System.out.println(value);
}
}
}
The output is:
1
bjbh
Since a mapping for the key already exists, its value is replaced and the size remains 1.
The old value gets overwritten by the new value for that key. The size remains one and only the value changes. This is how it works, as keys are always unique. You can't map multiple values to one key.
The API is the definitive reference and that is what you must believe.
A collision occurs when two different keys hash to the same bucket in the HashMap. The keys are then compared with equals(), and if they are different, both entries are kept in that bucket's linked list. If the keys are equal, the old value in the HashMap is overwritten.
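The difference between "same key" and "colliding keys" is easy to test. The sketch below relies on the fact that the strings "Aa" and "BB" have the same String.hashCode() (both are 2112); the class and method names are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    // Hypothetical helper that builds the map used in the demo.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2); // different key, same hash code: both entries are kept
        map.put("Aa", 3); // same key: the old value 1 is replaced
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode());
        Map<String, Integer> map = build();
        System.out.println(map.size());
        System.out.println(map.get("Aa"));
        System.out.println(map.get("BB"));
    }
}
```

In the question, both put calls use the key 1, so that is the "same key" case, not a collision.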
API documentation should normally be treated as authoritative unless there is very good reason to doubt its accuracy.
You should almost certainly ignore any claim that doesn't flag itself as 'knowingly' at odds with the documentation and provide testable evidence.
I humbly suggest you might be confused about the role of a linked 'collision' list. As it happens, HashMap in Java uses a linked list to store multiple entries whose keys' hash codes fall into the same 'bucket'. It does not use it to store several values for one key.
A HashMap in Java always stores key-value pairs, with exactly one value per key (the last one you put under that key). The linked lists inside HashMap only chain entries whose different keys land in the same bucket; they never hold several values for the same key.
However, you are free to define a HashMap that contains List objects. You then have to keep track of duplicates on your own.
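If several values per key is what you actually want, a map of lists could look roughly like this. The addValue helper is invented for this sketch; Map.computeIfAbsent is a standard method since Java 8.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MultiValueDemo {
    // Hypothetical helper: append a value to the list stored under the key,
    // creating the list on first use.
    static void addValue(Map<Integer, List<String>> map, Integer key, String value) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> map = new HashMap<>();
        addValue(map, 1, "sdaaaar");
        addValue(map, 1, "bjbh");
        System.out.println(map.size());  // still one key
        System.out.println(map.get(1));  // but both values are kept
    }
}
```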
This is a homework question so I'm not looking for specific implementation but more an understanding of how to implement the following:
I have to create a hash table class, I understand how a hash table works but I am confused about how it actually hashes objects. In the examples we've seen we generally see integers get stored in a hash table (for simplicity) and they are hashed using an algorithm such as value%10.
I'm fine with this but confused about the following. We have been asked to write a class that can take any object and provide methods for insertion etc. I'm not sure how I can call Object%10 considering I can't just find the modulus of an object. With this in mind given I'm not to know what sort of object a user could pass to this class (it could be one they have written themselves) how are you expected to write a hash function for all possible objects? Am I missing something here?
I've tried Googling but I'm not exactly sure what to Google so I'm coming up with not much, thanks
A hash code doesn't always have to be value%10. For an object, it is a number derived from the state of the object, i.e. its attributes.
If you have a class like
public class MyClass {
int a;
int b;
}
then hashCode can be as simple as
public int hashCode() {
int result = a + b;
return result;
}
Check the methods of the Object class. Every object in Java has those methods. See if one of them can help you.
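To make the hint a bit more concrete without writing the assignment, here is a rough sketch of the idea only. It assumes chaining with a fixed number of buckets and no resizing; SimpleHashSet and its methods are made-up names, not part of java.util. The point is that hashCode() gives you an int for any object, and that int can be reduced to an index just like value%10.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical teaching class, not java.util.HashSet. Null elements are not supported.
class SimpleHashSet<T> {
    private final List<List<T>> buckets = new ArrayList<>();

    SimpleHashSet(int capacity) {
        for (int i = 0; i < capacity; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    private List<T> bucketFor(Object o) {
        // hashCode() may be negative, so floorMod keeps the index in range.
        return buckets.get(Math.floorMod(o.hashCode(), buckets.size()));
    }

    boolean add(T element) {
        List<T> bucket = bucketFor(element);
        if (bucket.contains(element)) {
            return false; // contains() relies on equals()
        }
        bucket.add(element);
        return true;
    }

    boolean contains(Object o) {
        return bucketFor(o).contains(o);
    }

    public static void main(String[] args) {
        SimpleHashSet<String> set = new SimpleHashSet<>(10);
        System.out.println(set.add("apple"));
        System.out.println(set.add("apple"));
        System.out.println(set.contains("pear"));
    }
}
```

This works for user-written classes too, as long as they override hashCode() and equals() consistently.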
Imagine a simple case:
class B{
public final String text;
public B(String text){
this.text = text;
}
}
class A {
private List<B> bs = new ArrayList<B>();
public B getB(String text){
for(B b :bs){
if(b.text.equals(text)){
return b;
}
}
return null;
}
[getter/setter]
}
Imagine that for each instance of A, the List<B> is large and we need to call getB(String) often. However assume that it is also possible for the list to change (add/remove element, or even being reassigned).
At this stage, the average complexity for getB(String) is O(n). In order to improve that, I was wondering if we could use some clever caching.
Imagine we cache the List<B> in a Map<String, B> where the key is B.text. That would improve the performance but it won't work if the list is changed (new element or deleted element) or reassigned (A.bs points to a new reference).
To go around that I thought that, along with the Map<String, B>, we could store a hash of the list bs. When we call getB(String) method, we compute the hash of the list bs. If the hash hasn't changed, we fetch the result from the map, if it has we reload the map.
The problem is that computing the hash for a java.util.List goes through all the elements of the list and computes their hashes, which is at least O(n).
Question
What I'd like to know is whether the JVM will be faster at computing the hash for the List than going through my loop in the getB(String) method. Maybe that depends on the implementation of hash for B. If so, what kind of things could work? In a nutshell, I'd like to know whether this is stupid or could bring some performance improvement.
Without actually explaining why, you seem for some reason to believe that it is essential to keep the list structure as well. The only reasonable reason for this is that you need the order of the collection to be kept consistent. If you switch to a "plain" map, the order of the values is no longer constant, e.g. kept in the order in which you add the items to the map.
If you need both to keep the order (list behaviour) and access individual items using a key, you can use a LinkedHashMap, which essentially joins the behaviour of a LinkedList and a HashMap. Even if LinkedHashMap.values() returns a collection and not a list, the list behaviour is guaranteed within the collection.
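A quick sketch of that behaviour, using plain strings instead of the B class from the question. The index helper is made up for this example, and the upper-cased value is just a stand-in for the stored object.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OrderedLookupDemo {
    // Hypothetical helper: store each text under itself as the key.
    static Map<String, String> index(String... texts) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String text : texts) {
            map.put(text, text.toUpperCase());
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = index("cherry", "apple", "banana");
        // Iteration follows insertion order, not alphabetical or hash order.
        List<String> keys = new ArrayList<>(map.keySet());
        System.out.println(keys);
        // Lookup by key is still a hash lookup.
        System.out.println(map.get("apple"));
    }
}
```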
Another issue with your question is, that you cannot use the list's hash code to safely determine changes. If the hash code has changed, you are indeed sure that the list has changed as well. If two hash codes are identical, you can still not be sure that the lists are actually identical. E.g. if the hash code implementation is based on strings, the hash codes for "1a" and "2B" are identical.
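This is easy to check. The sketch below wraps the two strings in single-element lists; List.hashCode() is defined in terms of the element hash codes, so the two different lists end up with the same hash. ListHashDemo and sameHash are made-up names.

```java
import java.util.Arrays;
import java.util.List;

class ListHashDemo {
    // Hypothetical helper: true when both lists report the same hash code.
    static boolean sameHash(List<String> a, List<String> b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("1a");
        List<String> second = Arrays.asList("2B");
        System.out.println(sameHash(first, second)); // same hash code...
        System.out.println(first.equals(second));    // ...but different contents
    }
}
```

So an unchanged hash would not prove that the cached map is still valid.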
If so what kind of things could work?
Simply put: don't let anything else mutate your list without you knowing about it. I suspect you currently have something like:
public List<B> getAllBs() {
return bs;
}
... and a similar setter. If you stop doing that, and instead just have appropriate mutation methods, then you can make sure that your code is the only code to mutate the list... which means you can either remember that your map is "dirty" or just mutate the map at the same time that you mutate the list.
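A minimal sketch of the "mutate both at the same time" variant. It assumes that B.text is unique within one list (otherwise the index needs a list per key); IndexedBs and the method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IndexedBs {
    static final class B {
        final String text;

        B(String text) {
            this.text = text;
        }
    }

    private final List<B> bs = new ArrayList<>();
    // Index kept in sync with the list. Assumes texts are unique.
    private final Map<String, B> byText = new HashMap<>();

    void addB(B b) {
        bs.add(b);
        byText.put(b.text, b);
    }

    boolean removeB(B b) {
        boolean removed = bs.remove(b);
        if (removed) {
            byText.remove(b.text);
        }
        return removed;
    }

    // Hash lookup instead of the linear loop.
    B getB(String text) {
        return byText.get(text);
    }

    // Read-only view, so callers cannot change the list behind our back.
    List<B> getAllBs() {
        return Collections.unmodifiableList(bs);
    }

    public static void main(String[] args) {
        IndexedBs a = new IndexedBs();
        B hello = new B("hello");
        a.addB(hello);
        System.out.println(a.getB("hello") == hello);
        a.removeB(hello);
        System.out.println(a.getB("hello"));
    }
}
```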
You could implement your own class IndexedBArrayList which extends ArrayList<B>.
Then you add this functionality to it:
A private HashMap<String, B> index
All mutator methods of ArrayList are overridden to keep this index hash map updated in addition to calling the corresponding super-method.
A new public B getByString(String) method which uses the hash map
From your description it does not seem that you need a List<B>.
Replace the List with a HashMap. If you need to search for Bs the best data structure is the hashmap and not the list.
I've seen other questions about getting objects from Sets based on an index value, and I understand why that is not possible. But I haven't been able to find a good explanation for why a get by object is not allowed, so I thought I would ask.
HashSet is backed by a HashMap so getting an object from it should be pretty straightforward. As it is now, it appears I would have to iterate over each item in the HashSet and test for equality which seems unnecessary.
I could just use a Map but I have no need for a key:value pair, I just need a Set.
For example say I have Foo.java:
package example;
import java.io.Serializable;
public class Foo implements Serializable {
String _id;
String _description;
public Foo(String id){
this._id = id;
}
public void setDescription(String description){
this._description = description;
}
public String getDescription(){
return this._description;
}
public boolean equals(Object obj) {
//equals code, checks if id's are equal
}
public int hashCode() {
//hash code calculation
}
}
and Example.java:
package example;
import java.util.HashSet;
public class Example {
public static void main(String[] args){
HashSet<Foo> set = new HashSet<Foo>();
Foo foo1 = new Foo("1");
foo1.setDescription("Number 1");
set.add(foo1);
set.add(new Foo("2"));
//I want to get the object stored in the Set, so I construct a object that is 'equal' to the one I want.
Foo theFoo = set.get(new Foo("1")); //Is there a reason this is not allowed?
System.out.println(theFoo.getDescription()); //Should print Number 1
}
}
Is it because the equals method is meant to test for "absolute" equality rather than "logical" equality (in which case contains(Object o) would be sufficient)?
Java Map/Collection Cheat Sheet
Will it contain key/value pair or values only?
1) If it contains pairs, the choice is a map. Is order important?
. 1-1) If yes, follow insertion order or sort by keys?
. . 1-1-1) If ordered, LinkedHashMap
. . 1-1-2) If sorted, TreeMap
. 1-2) If order is not important, HashMap
2) If it stores only values, the choice is a collection. Will it contain duplicates?
. 2-1) If yes, ArrayList
. 2-2) If it will not contain duplicates, is the primary task searching for elements (contains/remove)?
. . 2-2-1) If no, ArrayList
. . 2-2-2) If yes, is order important?
. . . 2-2-2-1) If order is not important, HashSet
. . . 2-2-2-2) If yes, follow insertion order or sort by values?
. . . . 2-2-2-2-1) if ordered, LinkedHashSet
. . . . 2-2-2-2-2) if sorted, TreeSet
A Set is a Collection of objects which treats a.equals(b) == true as duplicates, so it doesn't make sense to try to get the same object you already have.
If you are trying to get(Object) from a collection, a Map is likely to be more appropriate.
What you should write is
Map<String, String> map = new LinkedHashMap<>();
map.put("1", "Number 1");
map.put("2", null);
String description = map.get("1");
if an object is not in the set (based on equals), add it, if it is in the set (based on equals) give me the set's instance of that object
In the unlikely event you need this you can use a Map.
Map<Bar, Bar> map = // LinkedHashMap or ConcurrentHashMap
Bar bar1 = new Bar(1);
map.put(bar1, bar1);
Bar bar1a = map.get(new Bar(1));
If you want to know whether an object equal to new Foo("1") is already present in the set, then you need to use the contains method:
boolean present = set.contains(new Foo("1"));
A get-style method, i.e. set.get(new Foo("1")), is not supported because it doesn't make sense. You already have the object, i.e. new Foo("1"), so what extra information would you be looking for through a get method?
Your last sentence is the answer.
get(Object o) would run through the HashSet looking for another object being equal to o (using equals(o) method). So it is indeed the same as contains(o), only not returning the same result.
HashSet is a little bit simpler than HashMap. If you don't need the features of HashMap, why use it? If a method like getObject(ObjectType o) were provided by Java, we wouldn't need to iterate over the set after calling the contains() method...
The reason why there is no get is simple:
If you need to get the object X from the set, it is because you need something from X and you don't have the object.
If you do not have the object, then you need some means (a key) to locate it: its name, a number, whatever. That's what maps are for, right?
map.get( "key" ) -> X!
Sets do not have keys; you need to traverse them to get the objects.
So, why not add a handy get( X ) -> X?
That makes no sense, right, because you have X already, a purist will say.
But now look at it as a non-purist, and see if you really want this:
Say I make object Y, which matches the equals of X, so that set.get(Y) -> X. Voilà, then I can access the data of X that I didn't have. Say for example X has a method called flag() and I want the result of that.
Now look at this code.
X = set.get( Y );
So Y.equals( X ) is true!
but..
Y.flag() == X.flag() is false. (Weren't they equal?)
So, you see, if a set allowed you to get objects like that, it would surely break the basic semantics of equals. Later you are going to live with little clones of X all claiming that they are the same when they are not.
You need a map, to store stuff and use a key to retrieve it.
If you only want to know what is in the HashSet, you can use the .toString() method to display all the HashSet contents separated by commas.
A common use case of a get method on Set might be to implement an intern set. If that's what you're trying to achieve, consider using the Interner interface and Interners factory from Google Guava.
I've got the same problem as the thread author, and I've got a real reason why a Set should have a get method:
I overrode equals of, e.g., X, the element type of the Set, so the contained object is not necessarily the same as the checked one. In my scenario I remove semantic duplicates in another collection and enrich the "original" with some relations of the "duplicate", so I need the "original" to be able to drop the duplicate.
get(Object o) is useful when one piece of information is linked to another, just like the key-value pairs in a HashMap. Using get() on one piece of information, we can get the second.
Now, if HashSet provided a get(Object o) method, you would need to pass an object. If you have the object to pass to get(Object o), you already have the object, so what is the need for a get(Object o) method?
As everyone mentioned before, there is no such method and for good reasons. That being said, if you wish to get a certain object from a HashSet in java 8 using a one-liner (almost), simply use streams. In your case, it would be something like:
Foo existing = set.stream().filter(o -> o.equals(new Foo("1"))).findFirst().orElse(null);
Note that this yields null if the element doesn't exist, so you still need a null check and it is technically not a one-liner. It is also still a linear scan over the set's elements, so it is not faster than a traditional iteration.