Difference between HashSet and HashMap? - java

Apart from the fact that HashSet does not allow duplicate values, what is the difference between HashMap and HashSet?
I mean implementation wise? It's a little bit vague because both use hash tables to store values.

HashSet is a set, e.g. {1,2,3,4,5}
HashMap is a key -> value (key to value) map, e.g. {a -> 1, b -> 2, c -> 2, d -> 1}
Notice in my example above that in the HashMap there must not be duplicate keys, but it may have duplicate values.
In the HashSet, there must be no duplicate elements.
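The two examples above can be reproduced in a few lines. This is only a minimal sketch; the class and method names (SetVsMapDemo, buildSet, buildMap) are made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not part of any library.
class SetVsMapDemo {
    // Builds the set {1,2,3,4,5}; adding 3 a second time changes nothing.
    static Set<Integer> buildSet() {
        Set<Integer> set = new HashSet<>();
        for (int i = 1; i <= 5; i++) {
            set.add(i);
        }
        set.add(3); // duplicate element, ignored
        return set;
    }

    // Builds the map {a -> 1, b -> 2, c -> 2, d -> 1}.
    static Map<String, Integer> buildMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 2); // duplicate value is fine
        map.put("d", 1); // duplicate value is fine
        return map;
    }

    public static void main(String[] args) {
        System.out.println(buildSet().size()); // number of distinct elements
        System.out.println(buildMap().get("c")); // value stored under key "c"
    }
}
```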

They are entirely different constructs. A HashMap is an implementation of Map. A Map maps keys to values. The key look up occurs using the hash.
On the other hand, a HashSet is an implementation of Set. A Set is designed to match the mathematical model of a set. A HashSet does use a HashMap to back its implementation, as you noted. However, it implements an entirely different interface.
When you are looking for what will be the best Collection for your purposes, this Tutorial is a good starting place. If you truly want to know what's going on, there's a book for that, too.

HashSet
HashSet class implements the Set interface
In HashSet, we store objects(elements or values)
e.g. If we have a HashSet of string elements then it could depict a
set of HashSet elements: {“Hello”, “Hi”, “Bye”, “Run”}
HashSet does not allow duplicate elements, which means you
cannot store the same value twice in a HashSet.
HashSet permits a single null element.
HashSet is not synchronized, which means it is not thread-safe unless it is synchronized explicitly.
         add   contains  next    notes
HashSet  O(1)  O(1)      O(h/n)  h is the table capacity
HashMap
HashMap class implements the Map interface
HashMap is
used for storing key & value pairs. In short, it maintains the
mapping of key & value (The HashMap class is roughly equivalent to
Hashtable, except that it is unsynchronized and permits nulls.) This
is how you could represent HashMap elements if it has integer key
and value of String type: e.g. {1->”Hello”, 2->”Hi”, 3->”Bye”,
4->”Run”}
HashMap does not allow duplicate keys; however, it allows duplicate values.
HashMap permits a single null key and any number of null values.
HashMap is not synchronized, which means it is not thread-safe unless it is synchronized explicitly.
         get   containsKey  next    notes
HashMap  O(1)  O(1)         O(h/n)  h is the table capacity
Please refer to this article for more information.

It's really a shame that both their names start with Hash. That's the least important part of them. The important parts come after the Hash - the Set and Map, as others have pointed out. What they are, respectively, are a Set - an unordered collection - and a Map - a collection with keyed access. They happen to be implemented with hashes - that's where the names come from - but their essence is hidden behind that part of their names.
Don't be confused by their names; they are deeply different things.

HashSet internally uses a HashMap. If you look at the implementation, the values inserted into a HashSet are stored as keys in the HashMap, and the value is a dummy instance of Object.
The differences between HashMap and HashSet are:
HashMap contains key-value pairs, and each value can be retrieved by its key with get(). HashSet has no get method: you can test membership with contains(), but to obtain the stored elements you must iterate.
HashMap implements the Map interface and allows one null key and multiple null values, whereas HashSet implements the Set interface and allows one null element and no duplicates. (Remember: one null key is allowed in a HashMap, hence one null element in a HashSet, since HashSet uses a HashMap internally.)
Neither HashSet nor HashMap maintains insertion order while iterating.

HashSet lets us store objects in a set, whereas HashMap lets us store objects as key-value pairs: every stored value has an associated key.

As the names imply, a HashMap is an associative Map (mapping from a key to a value), a HashSet is just a Set.

Differences between HashSet and HashMap in Java
1) The first and most significant difference between HashMap and HashSet is that HashMap is an implementation of the Map interface while HashSet is an implementation of the Set interface, which means HashMap is a key-value based data structure and HashSet guarantees uniqueness by not allowing duplicates. In reality, HashSet is a wrapper around HashMap in Java; if you look at the add(E e) method in HashSet.java you will see the following code:
public boolean add(E e) {
    return map.put(e, PRESENT) == null;
}
where it puts the object into the map as the key, and the value is a final dummy object, PRESENT.
2) The second difference between HashMap and HashSet is that we use the add() method to put elements into a Set, but we use the put() method to insert a key and value into a HashMap.
3) HashSet allows only one null element, but HashMap allows one null key plus multiple null values.
That's all on the difference between HashSet and HashMap in Java. In summary, HashSet and HashMap are two different types of collection, one being a Set and the other a Map.
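To make the "wrapper around HashMap" idea concrete, here is a minimal sketch of a set built on top of a HashMap. MiniHashSet is a hypothetical, simplified class written for illustration; it is not the JDK source, and only the add method mirrors the snippet quoted above.

```java
import java.util.HashMap;

// Hypothetical, simplified set; an illustration of the delegation idea only.
class MiniHashSet<E> {
    // Shared dummy value stored against every key.
    private static final Object PRESENT = new Object();

    // The elements of the set are the keys of this map.
    private final HashMap<E, Object> map = new HashMap<>();

    // put returns null when the key was absent, so true means "newly added".
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    // remove returns the old value (PRESENT) when the key existed.
    boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    int size() {
        return map.size();
    }
}
```

Because a HashMap permits exactly one null key, this sketch also accepts exactly one null element, which matches point 3 above.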

Differences between HashSet and HashMap in Java
HashSet internally uses a HashMap to store objects. When add(String) is called, it calls HashMap's put(key, value) method with key = the String object and value = a dummy Object. This is how it prevents duplicates: the set's elements are simply the keys of the backing map.
Objects stored as elements of a HashSet or keys of a HashMap should override hashCode() and equals() consistently with their contract.
Keys used to store and access values in a HashMap should be immutable (or at least not be modified while they are in the map): if a key's hash code changes after insertion, the value can no longer be located and get() returns null.
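The last point is easy to reproduce. The sketch below uses an ArrayList as a key purely because its hashCode() depends on its contents; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo: mutating a key after insertion "loses" its entry.
class MutableKeyDemo {
    static String lookupAfterMutation() {
        Map<List<String>, String> map = new HashMap<>();
        List<String> key = new ArrayList<>();
        key.add("a");
        map.put(key, "value");

        // Changing the list changes its hashCode(), so the map now
        // searches with a hash that differs from the one it stored.
        key.add("b");
        return map.get(key);
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterMutation()); // prints "null"
    }
}
```

The entry is still inside the map (its size is unchanged); it simply cannot be found through get() any more.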

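A HashMap is for adding, getting, removing, ... objects indexed by a custom key of any type.
A HashSet is for adding elements, removing elements, and checking whether elements are present; it locates them by hash code and confirms a match with equals().
So both contain the objects themselves: a HashMap stores them under a key, while a HashSet stores only the elements (their hashes are just used to find them).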

A HashSet uses a HashMap internally to store its entries. Each entry in the internal HashMap is keyed by a single Object, so all entries hash into the same bucket. I don't recall what the internal HashMap uses to store its values, but it doesn't really matter since that internal container will never contain duplicate values.
EDIT: To address Matthew's comment, he's right; I had it backwards. The internal HashMap is keyed with the Objects that make up the Set elements. The values of the HashMap are an Object that's just simply stored in the HashMap buckets.

Differences:
With respect to hierarchy:
HashSet implements Set.
HashMap implements Map and stores a mapping of keys and values.
A use of HashSet and HashMap with respect to database would help you understand the significance of each.
HashSet: is generally used for storing a collection of unique objects.
E.g. it might be used as the implementation class for storing a one-to-many relationship between
class Item and class Bid (where an Item has many Bids).
HashMap: is used to map a key to a value. The value may be null or any Object, including a list of Objects (which is an object itself).

A HashSet is implemented in terms of a HashMap. It's a mapping between the key and a PRESENT object.

HashMap is a Map implementation, allowing duplicate values but not duplicate keys. Adding an object requires a key/value pair. A null key and null values are allowed. E.g.:
{The->3,world->5,is->2,nice->4}
HashSet is a Set implementation, which does not allow duplicates. If you try to add a duplicate object with a call to the public boolean add(Object o) method, the set remains unchanged and the method returns false. E.g.:
[The,world,is,nice]
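A short sketch of how duplicates are reported by each class: Set.add returns a boolean, while Map.put returns the value previously stored under the key. The class and method names here are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo of the return values of add() and put().
class DuplicateDemo {
    // Returns the results of adding the same word twice.
    static boolean[] addTwice() {
        Set<String> words = new HashSet<>();
        boolean first = words.add("nice");  // true: element was new
        boolean second = words.add("nice"); // false: set unchanged
        return new boolean[] { first, second };
    }

    // Returns what put() hands back when the key already exists.
    static Integer putTwice() {
        Map<String, Integer> lengths = new HashMap<>();
        lengths.put("nice", 4);        // returns null, key was absent
        return lengths.put("nice", 5); // returns the old value, 4
    }
}
```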

Basically, in a HashMap the user has to provide both key and value, whereas in a HashSet you provide only the element. The element itself becomes the key of the internal HashMap (its hash code only selects the bucket), and a shared dummy object is used as the value. That is how a HashSet can be stored as a HashMap internally.

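Internally, both HashSet and HashMap store pairs. The difference is that in a HashMap you specify the key and the value, while in a HashSet the element you add is the key and the value is a placeholder; the object's hash code determines where it is stored.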

HashMaps allow one null key and null values. They are not synchronized, which avoids locking overhead. If required, you can get a synchronized wrapper using Collections.synchronizedMap().
Hashtables don't allow null keys or null values and are synchronized.
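A small sketch of these differences, using only java.util classes; the demo class and its method names are invented for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical demo of null handling and synchronized wrapping.
class NullAndSyncDemo {
    // HashMap accepts a null key.
    static boolean hashMapAcceptsNullKey() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "x");
        return map.containsKey(null);
    }

    // Hashtable throws NullPointerException for a null key.
    static boolean hashtableRejectsNullKey() {
        Map<String, String> table = new Hashtable<>();
        try {
            table.put(null, "x");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Wraps a HashMap so that each individual call is synchronized.
    static Map<String, String> synchronizedView() {
        return Collections.synchronizedMap(new HashMap<String, String>());
    }
}
```

Note that with the synchronized wrapper, iteration still has to be guarded manually by synchronizing on the returned map.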

The main difference between them you can find as follows:
HashSet
It does not allow duplicate elements.
It is not synchronized, which avoids locking overhead.
It allows one null element.
HashSet can be used when you want to maintain a collection of unique items.
HashSet implements the Set interface and is backed by a hash table (actually a HashMap instance).
HashSet stores objects.
HashSet doesn't allow duplicate elements, but a null element is allowed.
This class doesn't guarantee that the iteration order will remain constant over time.
HashMap
It does not allow duplicate keys, but it allows duplicate values.
It is not synchronized, which avoids locking overhead.
HashMap does not maintain insertion order.
The iteration order depends on the hash codes of the keys and the table size.
It is not thread-safe.
It allows null for both key and value:
one null key and as many null values as you like.
HashMap is a hash table-based implementation of the Map interface.
HashMap stores objects as key-value pairs.
HashMap does not allow duplicate keys, but a null key and null values are allowed.
The ordering of the elements is not guaranteed over time.

EDIT - this answer isn't correct. I'm leaving it here in case other people have a similar idea. b.roth and justkt have the correct answers above.
--- original ---
you pretty much answered your own question - hashset doesn't allow duplicate values. it would be trivial to build a hashset using a backing hashmap (and just a check to see if the value already exists). i guess the various java implementations either do that, or implement some custom code to do it more efficiently.

HashMap is an implementation of the Map interface.
HashSet is an implementation of the Set interface.
HashMap stores data in the form of key-value pairs.
HashSet stores only objects.
The put method is used to add an entry to a map.
The add method is used to add an element to a set.
In a HashMap, the hash code is calculated from the key object.
In a HashSet, the member object itself is used to calculate the hash code. Two different objects can have the same hash code, so the equals() method is used to check for equality; if it returns false, the two objects are different.
HashMap is sometimes said to be faster than HashSet, but since HashSet delegates to an internal HashMap,
the two have essentially the same performance for comparable operations.

Related

HashMaps in Java - Indexed or not

The question:
Is the HashMap(Map) we use in java indexed or not ? And if it is indexed, could you please explain it briefly ?
Try using LinkedHashMap instead of HashMap, which orders elements by insertion. Reference: Java - get index of key in HashMap?
HashMaps contain key&value pairs. Every key may only exist once.
You cannot get values using an index, but need to use the method
HashMap.get()
A decent tutorial can be found here:
https://beginnersbook.com/2013/12/hashmap-in-java-with-example/
HashMaps are an implementation of a hash table. The hashCode() method, which all classes inherit from the Object class, is used to produce an integer that is, ideally, different for different objects. The hashCode integer is used to map a key to a bucket where, assuming few collisions, search time is O(1). If the hashCode() method is poor and produces the same value for many keys within the HashMap, search time may degrade. (Separately, the HashMap will rehash itself once the number of entries in the map exceeds the product of the load factor and the current capacity.)
In a sense, HashMaps are indexed and that is what enables their good performance. However, they are not indexed in the manner you could obtain an item from index and the HashMap class will not ensure that order is retained; you will need to use a LinkedHashMap.
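If a position is really needed, one possible workaround (a sketch, not the only approach) is to use a LinkedHashMap and copy its keys into a list. The class and method names below are hypothetical, and the copy costs O(n) each time it is made.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo: deriving a position from a LinkedHashMap.
class InsertionOrderDemo {
    static Map<String, Integer> build() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("zebra", 1);
        map.put("apple", 2);
        map.put("mango", 3);
        return map;
    }

    // Keys come back in the order they were inserted.
    static List<String> keysInInsertionOrder() {
        return new ArrayList<>(build().keySet());
    }

    // Position of a key within the insertion order, or -1 if absent.
    static int indexOfKey(String key) {
        return keysInInsertionOrder().indexOf(key);
    }
}
```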

How does Java determine uniqueness of Hashmap key?

I want to maintain a list of objects such that each object in the list is unique. Also, I want to retrieve it at some point. There are thousands of objects and I can't modify their source to add a unique id. Also, hash codes are unreliable.
My approach was to utilize the key uniqueness of a map.
Say I maintain a map like:
HashMap<Object, Integer> uniqueObjectMap;
I will add each object to the map as a key and set a random int as the value. But how does Java determine if the object is unique when used as a key?
Say,
List<Object> listOne = new ArrayList<>();
List<Object> listTwo = new ArrayList<>();
Object o = new Object();
listOne.add(o);
listTwo.add(o);
uniqueObjectMap.put(listOne.get(0), randomInt()); // --> line 1
uniqueObjectMap.put(listTwo.get(0), randomInt()); // --> line 2
Will line 2 give an unique key violation error since both are referring to the same object o ?
Edit
So will uniqueObjectMap.containsKey(listTwo.get(0)) return true? How are objects determined to be equal here? Is a field by field comparison done? Can I rely on this to make sure only one copy of ANY type of object is maintained in the map as key?
Will line 2 give an unique key violation error since both are referring to the same object o ?
- No. If a key is found to be already present, then its value will be overwritten with the new one.
Next, HashMap has a separate hash() method which applies a supplemental hash function to a given hashCode (of key objects), which defends against poor quality hash functions.
It does so by calling the object's hashCode() method.
The default implementation is roughly equivalent to the object's unique identifier (much like a memory address); however, there are objects that are compare-by-value. For a compare-by-value object, hashCode() will be overridden to compute a number based on the values, such that two identical values yield the same hashCode() number.
As for the collections that are hash based, the put(...) operation is fine with putting something over the original location. In short, if two objects yield the same hashCode() and a positive equals(...) result, then operations will assume that they are for all practical purposes the same object. Thus, put on a map replaces the old value with the new one, and add on a set does nothing, as the object is considered the same.
It will not store two copies in the same "place", as it makes no sense to store two copies at the same location. Thus, sets will only contain one copy, as will map keys; however, lists may contain two copies, depending on how you added the second copy.
How are objects determined to be equal here ?
By using the equals and hashCode methods (inherited from the Object class unless overridden).
Is a field by field comparison done ?
No. If you don't implement equals and hashCode, Java will compare the references of your objects.
Can I rely on this to make sure only one copy of ANY type of object is maintained in the map as key ?
No.
Using a Set is a better approach than using a Map because it removes duplicates on its own, but in this case it won't work either, because a Set determines duplicates the same way a Map does with keys.
If you refer to the same object, it will not throw an error: when a HashMap gets the same key, the related value is overwritten.
If the same key already exists in the HashMap, its value will be overwritten.
If you want to check whether a key or value already exists, you can use:
containsKey() and containsValue().
ex :
hashMap.containsKey(0);
This will return true if the key 0 already exists, otherwise false.
By getting the hash value using hash(key.hashCode())
HashMap has an inner class Entry (named Node in Java 8 and later) with attributes
final K key;
V value;
Entry<K ,V> next;
final int hash;
The hash value is used to calculate the index in the array for storing the Entry object; there might be a scenario where 2 unequal objects have the same hash value.
Entry objects are stored in linked-list form. In case of a collision, all entry objects that map to the same index are stored in the same linked list, but the equals method will test for true equality. In this way, HashMap ensures the uniqueness of keys.
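The answers above can be checked with a small experiment. This is a sketch with made-up class and method names; it relies only on the documented behaviour that Object uses reference equality by default while String overrides equals() and hashCode().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo of how key uniqueness is decided.
class KeyUniquenessDemo {
    // Same reference used twice: the second put overwrites the first.
    static int sameReferenceTwice() {
        Map<Object, Integer> map = new HashMap<>();
        Object o = new Object();
        map.put(o, 1);
        map.put(o, 2); // no error, value replaced
        return map.size();
    }

    // Two distinct plain Objects are never equal, so both are kept.
    static int twoDistinctObjects() {
        Map<Object, Integer> map = new HashMap<>();
        map.put(new Object(), 1);
        map.put(new Object(), 2);
        return map.size();
    }

    // Two distinct String instances with equal content count as one key.
    static int twoEqualStrings() {
        Map<Object, Integer> map = new HashMap<>();
        map.put(new String("k"), 1);
        map.put(new String("k"), 2);
        return map.size();
    }
}
```

So a HashMap keeps "one copy" only in the sense defined by the key class's own equals() and hashCode(); for classes that do not override them, that means one entry per object reference.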

Java - what is returned when two keys map to same value?

In Java, I understand if two keys maps to one value , linear chaining occurs due to collision.
For Example:
 Map myMap= new HashMap(); //Lets says both of them get mapped to same bucket-A and
myMap.put("John", "Sydney");//linear chaining has occured.
myMap.put("Mary","Mumbai"); //{key1=John}--->[val1=Sydney]--->[val2=Mumbai]
So when I do:
myMap.get("John"); // or myMap.get("Mary")
What does the JVM return since bucket-A contains two values?
Does it return the ref to "chain"? Does it return "Sydney"? Or does it return "Mumbai"?
Chaining happens when your keys land in the same bucket (for example, because they have the same hash code), not when two keys map to one value.
So when I do: myMap.get("John"); // or myMap.get("Mary")
map.get("John") gives you Sydney
map.get("Mary") gives you Mumbai
What does the JVM return since bucket-A contains two values?
If the same bucket contains two entries, then the equals method of the key is used to determine the correct value to return.
It is worthwhile mentioning the worst-case scenario of storing (K,V) pairs whose keys all have the same hashCode. Your hashmap degrades to a linked list in that scenario.
The hashCode of your key determines what 'bucket' (aka list, aka 'linear chain') it will be put in. The equals method determines which object will actually be picked from the 'bucket' in the case of a collision. This is why it's important to properly implement both methods on all objects you intend to use as keys in any kind of hash map.
Your keys are different.
First some terminology
key: the first parameter in the put
value: the second parameter in the put
entry: an Object that holds both the key & the value
When you put into a HashMap, the map will call hashCode() on the key and work out which hash bucket the entry needs to go into. If there is something in this bucket already, then a linked list of the entries in the bucket is formed.
When you get from a HashMap, the map will call hashCode() on the key and work out which hash bucket to get the entry from. If there is more than one entry in the bucket, the map will walk along the linked list until it finds an entry with a key that equals() the key supplied.
A map will always return the Object tied to that key, the value from the entry. Map performance degrades rapidly if hashCode() returns the same (or similar) values for different keys.
You need to use java generics, so your code should really read
Map<String, String> myMap = new HashMap<String, String>();
This will tell the map that you want it to store String keys and values.
From my understanding, the Map first resolves the correct bucket (identified by the hash code of the key). If there's more than one key in the same bucket, the equals method is used to find the right entry in the bucket.
Looking at your example, what confuses you is that you think values are chained for a given key. In fact, Map.Entry objects are chained for a given bucket. The hashCode of the key gives you the bucket, then you look at the chained entries to find the one with the equal key.
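A real collision is easy to produce: the strings "Aa" and "BB" have the same hashCode() (both are 2112), so they always share a bucket. The sketch below, with hypothetical class and method names, reuses the city values from the question and shows that each key still returns its own value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo of two colliding keys in one HashMap.
class CollisionDemo {
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put("Aa", "Sydney"); // "Aa".hashCode() == 2112
        map.put("BB", "Mumbai"); // "BB".hashCode() == 2112 as well
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        // equals() on the key picks the right entry inside the bucket.
        System.out.println(map.get("Aa")); // prints "Sydney"
        System.out.println(map.get("BB")); // prints "Mumbai"
    }
}
```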

How list differ from map?

In Java, List and Map are both used as collections. But I couldn't understand in which situations we should use a List and when to use a Map. What is the major difference between them?
Now would be a good time to read the Java collections tutorial - but fundamentally, a list is an ordered sequence of elements which you can access by index, and a map is a usually unordered mapping from keys to values. (Some maps preserve insertion order, but that's implementation-specific.)
It's usually fairly obvious when you want a key/value mapping and when you just want a collection of elements. It becomes less clear if the key is part of the value, but you want to be able to get at an item by that key efficiently. That's still a good use case for a map, even though in some senses you don't have a separate collection of keys.
There's also Set, which is a (usually unordered) collection of distinct elements.
Map is for key:value pair kinds of data, for instance if you want to map student roll numbers to their names.
List is for a simple ordered collection of elements which allows duplicates,
for instance to represent a list of student names.
Map Interface
A Map cares about unique identifiers. You map a unique key (the ID) to a specific
value, where both the key and the value are, of course, objects.
The Map implementations let you do things like search for a
value based on the key, ask for a collection of just the values, or ask for a collection
of just the keys. Like Sets, Maps rely on the equals() method to determine whether
two keys are the same or different.
List Interface
A List cares about the index. The one thing that List has that non-lists don't have
is a set of methods related to the index. Those key methods include things like
get(int index), indexOf(Object o), add(int index, Object obj), and so
on. All three List implementations are ordered by index position—a position that
you determine either by setting an object at a specific index or by adding it without
specifying position, in which case the object is added to the end.
List is an interface with several implementations. LinkedList is a linked list, where every object is connected to the next one via pointers; inserting at the ends takes O(1), but access by index takes O(n). ArrayList is backed by an array; access by index is O(1), and appending is amortized O(1).
A linked list allocates one node per element, so it never reserves unused capacity, although each node carries extra pointers.
A hash-based Map such as HashMap is a data structure that has an array, and the slot for each entry is calculated with a hash function of the key. Almost every operation in a HashMap takes O(1) on average (it degrades when many keys collide in the same bucket), but the space overhead is fairly large.
For more reading, try Wikipedia's articles on hash tables and linked lists.
HashList (not a java.util class) is a data structure storing objects in a hash table and a list; it is a combination of a hash map and a doubly linked list, so access will be faster. HashMap is a hash table implementation of the Map interface; it is the same as Hashtable except that it is unsynchronized and allows null values. List is an ordered collection that allows nulls and duplicates; positional access is possible. Set is a collection that doesn't allow duplicates and may allow at most one null element, the same as a mathematical set.
List is just an ordered collection (a sequence). Check this list documentation. You can access elements by their integer index (position in the list), and search for elements in the list.
Also, lists allow duplicate elements and multiple null elements.
Map is an object that maps keys to values. Check this map documentation. A map cannot contain duplicate keys; each key can map to at most one value.
List - This data structure is used to contain a list of elements.
If you need a list of elements and the list may contain duplicate values,
then you have to use a List.
Map - It contains data as key-value pairs. When you have to store data
as key-value pairs, so that later you can retrieve data using the key,
you have to use a Map data structure.
List implementations - ArrayList, LinkedList
Map implementations - HashMap, TreeMap
Comparing HashMap to ArrayList -
A hash map is the faster structure for lookups by key, for example if you want to get all nodes for a page. The list of nodes can be fetched in constant time (O(1)) on average, while with lists the time is O(n) (n = number of pages; faster on sorted lists, but never getting near O(1)).
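As a rough illustration of the difference in access style, the sketch below stores student names in a List (accessed by position, duplicates allowed) and roll numbers mapped to names in a Map (accessed by key). The names and roll numbers are invented sample data, and the class is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo contrasting positional and keyed access.
class ListVsMapDemo {
    // A list keeps insertion order and allows the same name twice.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        names.add("Asha");
        names.add("Ravi");
        names.add("Asha"); // duplicate element is kept
        return names;
    }

    // A map holds one value per key; here roll number -> name.
    static Map<Integer, String> namesByRoll() {
        Map<Integer, String> byRoll = new HashMap<>();
        byRoll.put(101, "Asha");
        byRoll.put(102, "Ravi");
        return byRoll;
    }

    public static void main(String[] args) {
        System.out.println(names().get(2));         // access by index
        System.out.println(namesByRoll().get(102)); // access by key
    }
}
```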

How does Java order items in a HashMap or a HashTable?

I was wondering how Java orders items in the Map (HashMap or Hashtable) when they are added. Are the keys ordered by the hashcode, memory reference or by allocation precedence...?
It's because I've noticed the same pairs in the Map are not always in the same order.
java.util.HashMap is unordered; you can't and shouldn't assume anything beyond that.
This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.
java.util.LinkedHashMap uses insertion-order.
This implementation differs from HashMap in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is normally the order in which keys were inserted into the map (insertion-order).
java.util.TreeMap, a SortedMap, uses either natural or custom ordering of the keys.
The map is sorted according to the natural ordering of its keys, or by a Comparator provided at map creation time, depending on which constructor is used.
First of all: HashMap specifically doesn't provide a stable and/or defined ordering. So anything you observe is simply an implementation detail and you must not depend on it in any way.
Since it is sometimes useful to know the reason for the seemingly random ordering, here's the basic idea:
A HashMap has a number of buckets (implemented as an array) in which to store entries.
When an item is added to the map, it is assigned to a bucket based on a value derived from its hashCode and the number of buckets in the HashMap. (Note that it's possible that the bucket is already occupied, which is called a collision. That's handled gracefully and correctly, but I'll ignore that handling for the description because it doesn't change the concept).
The perceived ordering of the entries (such as returned by iterating over the Map) depends on the order of the entries in those buckets.
Whenever the map is rehashed (because it exceeded its fullness threshold), the number of buckets changes, which means that the position of each element might change, since the bucket position is derived from the number of buckets as well.
HashMap does not sort at all. For a map that sorts by key values you should use TreeMap instead.
From the JavaDocs for TreeMap:
Red-Black tree based implementation of the SortedMap interface. This class guarantees that the map will be in ascending key order, sorted according to the natural order for the key's class (see Comparable), or by the comparator provided at creation time, depending on which constructor is used.
From the documentation of HashMap:
This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.
A Map is not an ordered data structure - you should not rely on entries in a HashMap being in a certain order. Some Map implementations such as LinkedHashMap and TreeMap do guarantee a certain order, but HashMap does not.
If you really want to know what happens internally, lookup the source code of HashMap - you can find it in src.zip which should be in your JDK installation directory.
A HashMap has a number of "buckets" in which it stores its entries. Which bucket an entry is stored in is determined by the hash code of the key of the entry. The order in which you see the entries in the HashMap depends on the hash codes of the keys. But don't write programs that rely on entries being in a certain order in a HashMap - the implementation might change in a future version of Java and your program then would not work anymore.
A HashMap has no defined order of its elements.
There is no defined ordering in a hash table. Keys are placed into a slot based on the hash code, but even that isn't a trivial order-by-hash-code.
HashMap stores each entry in a bucket chosen from the hash value computed from the key. The hash value is not necessarily unique, but with a good hash function few keys share a bucket, which is how it achieves O(1) access on average.
LinkedHashMap, on the other hand, preserves the order in which you added entries to the map.
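The ordering guarantees of the three map classes mentioned in these answers can be compared with a sketch like the following. The class and method names are hypothetical; no order is asserted for HashMap, since none is guaranteed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical demo: fills any Map with the same keys and reports
// the order in which the map iterates over them.
class MapOrderingDemo {
    static List<String> keysAfterInsert(Map<String, Integer> map) {
        map.put("banana", 1);
        map.put("apple", 2);
        map.put("cherry", 3);
        return new ArrayList<>(map.keySet());
    }
}
```

Passing a LinkedHashMap yields the keys in insertion order (banana, apple, cherry), a TreeMap yields them in natural String order (apple, banana, cherry), and a HashMap yields whatever order its buckets happen to produce.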
