What are the differences between a HashMap and a Hashtable in Java?

What are the differences between a HashMap and a Hashtable in Java?
Which is more efficient for non-threaded applications?

There are several differences between HashMap and Hashtable in Java:
Hashtable is synchronized, whereas HashMap is not. This makes HashMap better for non-threaded applications, as unsynchronized objects typically perform better than synchronized ones.
Hashtable does not allow null keys or values. HashMap allows one null key and any number of null values.
One of HashMap's subclasses is LinkedHashMap, so in the event that you'd want predictable iteration order (which is insertion order by default), you could easily swap out the HashMap for a LinkedHashMap. This wouldn't be as easy if you were using Hashtable.
Since synchronization is not an issue for you, I'd recommend HashMap. If synchronization becomes an issue, you may also look at ConcurrentHashMap.
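For illustration, here is a minimal sketch of that LinkedHashMap swap (the map contents are made up): with a LinkedHashMap the keys come back in insertion order, while with a plain HashMap they may not.

import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class IterationOrderDemo {
    public static void main(String[] args) {
        // HashMap: iteration order depends on the hashes, not on insertion order.
        Map<String, Integer> plain = new HashMap<>();
        // LinkedHashMap: iterates in insertion order (its default ordering mode).
        Map<String, Integer> linked = new LinkedHashMap<>();

        for (String key : new String[] {"banana", "apple", "cherry"}) {
            plain.put(key, key.length());
            linked.put(key, key.length());
        }

        System.out.println(plain.keySet());   // order is not guaranteed
        System.out.println(linked.keySet());  // [banana, apple, cherry]
    }
}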

Note that a lot of the answers state that Hashtable is synchronized. In practice this buys you very little. The synchronization on the accessor/mutator methods will stop two threads adding to or removing from the map concurrently, but in the real world you will often need additional synchronization.
A very common idiom is to "check then put" — i.e. look for an entry in the Map, and add it if it does not already exist. This is not in any way an atomic operation whether you use Hashtable or HashMap.
An equivalently synchronised HashMap can be obtained by:
Collections.synchronizedMap(myMap);
But to correctly implement this logic you need additional synchronisation of the form:
synchronized (myMap) {
    if (!myMap.containsKey("tomato"))
        myMap.put("tomato", "red");
}
Even iterating over a Hashtable's entries (or a HashMap obtained by Collections.synchronizedMap) is not thread-safe unless you also guard the Map against being modified through additional synchronization.
Implementations of the ConcurrentMap interface (for example ConcurrentHashMap) solve some of this by including thread safe check-then-act semantics such as:
ConcurrentMap.putIfAbsent(key, value);
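As a rough sketch of what that looks like in practice (the map name and values are made up), the whole "check then put" collapses to a single atomic call on a ConcurrentMap:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PutIfAbsentDemo {
    public static void main(String[] args) {
        ConcurrentMap<String, String> colours = new ConcurrentHashMap<>();

        // Atomic check-then-put: only inserts if the key is not already mapped.
        colours.putIfAbsent("tomato", "red");
        colours.putIfAbsent("tomato", "green"); // no effect, key already present

        System.out.println(colours.get("tomato")); // red
    }
}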

Hashtable is considered legacy code. There's nothing about Hashtable that can't be done using HashMap or derivations of HashMap, so for new code, I don't see any justification for going back to Hashtable.

This question is often asked in interviews to check whether the candidate understands the correct usage of collection classes and is aware of alternative solutions available.
The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls. (HashMap allows nulls as keys and values, whereas Hashtable doesn't allow nulls.)
HashMap does not guarantee that the order of the map will remain constant over time.
HashMap is non synchronized whereas Hashtable is synchronized.
The Iterator in HashMap is fail-fast, while the Enumeration for Hashtable is not: the Iterator throws a ConcurrentModificationException if any other thread modifies the map structurally by adding or removing any element, except through the Iterator's own remove() method. But this is not a guaranteed behavior and is done by the JVM on a best-effort basis.
Note on Some Important Terms:
Synchronized means only one thread can modify a hash table at one point in time. Basically, it means that any thread before performing an update on a Hashtable will have to acquire a lock on the object while others will wait for the lock to be released.
Fail-fast is relevant within the context of iterators. If an iterator has been created on a collection object and some other thread tries to modify the collection object "structurally", a ConcurrentModificationException will be thrown. It is possible for other threads, though, to invoke the set method, since it doesn't modify the collection "structurally". However, if the collection has been modified structurally before set is called, a ConcurrentModificationException will be thrown.
Structural modification means deleting or inserting elements, which effectively changes the structure of the map.
HashMap can be synchronized by
Map m = Collections.synchronizedMap(hashMap);
Map provides Collection views instead of direct support for iteration via Enumeration objects. Collection views greatly enhance the expressiveness of the interface, as discussed later in this section. Map allows you to iterate over keys, values, or key-value pairs; Hashtable does not provide the third option. Map provides a safe way to remove entries in the midst of iteration; Hashtable did not. Finally, Map fixes a minor deficiency in the Hashtable interface. Hashtable has a method called contains, which returns true if the Hashtable contains a given value. Given its name, you'd expect this method to return true if the Hashtable contained a given key because the key is the primary access mechanism for a Hashtable. The Map interface eliminates this source of confusion by renaming the method containsValue. Also, this improves the interface's consistency — containsValue parallels containsKey.
The Map Interface
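For example, the entrySet() view makes removal in the midst of iteration straightforward (the map contents and the filtering condition are made up for the example):

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class MapViewsDemo {
    public static void main(String[] args) {
        Map<String, Integer> scores = new HashMap<>();
        scores.put("alice", 10);
        scores.put("bob", 3);
        scores.put("carol", 7);

        // The entry-set view supports safe removal in the midst of iteration
        // through the iterator's own remove() method.
        Iterator<Map.Entry<String, Integer>> it = scores.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() < 5) {
                it.remove();
            }
        }

        System.out.println(scores.containsValue(3)); // false; containsValue parallels containsKey
    }
}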

HashMap: An implementation of the Map interface that uses hash codes to index an array.
Hashtable: Hi, 1998 called. They want their collections API back.
Seriously though, you're better off staying away from Hashtable altogether. For single-threaded apps, you don't need the extra overhead of synchronisation. For highly concurrent apps, the paranoid synchronisation might lead to starvation, deadlocks, or unnecessary garbage collection pauses. Like Tim Howland pointed out, you might use ConcurrentHashMap instead.

Keep in mind that Hashtable was a legacy class that existed before the Java Collections Framework (JCF) was introduced; it was later retrofitted to implement the Map interface. So were Vector and Stack.
Therefore, always stay away from them in new code, since there is always a better alternative in the JCF, as others have pointed out.
Here is a Java collections cheat sheet you may find useful. Notice that the gray block contains the legacy classes Hashtable, Vector, and Stack.

There are many good answers already posted. I'm adding a few new points and summarizing.
HashMap and Hashtable are both used to store data in key-value form. Both use a hashing technique to store unique keys.
But there are many differences between the HashMap and Hashtable classes, given below.
HashMap
HashMap is non-synchronized. It is not thread-safe and can't be shared between many threads without proper synchronization code.
HashMap allows one null key and multiple null values.
HashMap is a newer class, introduced in JDK 1.2.
HashMap is fast.
We can make a HashMap synchronized by calling this code:
Map m = Collections.synchronizedMap(hashMap);
HashMap is traversed by Iterator.
Iterator in HashMap is fail-fast.
HashMap inherits AbstractMap class.
Hashtable
Hashtable is synchronized. It is thread-safe and can be shared with many threads.
Hashtable doesn't allow null key or value.
Hashtable is a legacy class.
Hashtable is slow.
Hashtable is internally synchronized and can't be unsynchronized.
Hashtable can be traversed by Enumeration and Iterator.
The Enumeration in Hashtable is not fail-fast.
Hashtable inherits Dictionary class.
Further reading: What is the difference between HashMap and Hashtable in Java?

Take a look at this chart. It provides comparisons between different data structures along with HashMap and Hashtable. The comparison is precise, clear and easy to understand.
Java Collection Matrix

In addition to what izb said, HashMap allows null values, whereas the Hashtable does not.
Also note that Hashtable extends the Dictionary class, which as the Javadocs state, is obsolete and has been replaced by the Map interface.

Hashtable is similar to HashMap and has a similar interface. It is recommended that you use HashMap unless you require support for legacy applications or you need synchronisation, as Hashtable's methods are synchronised. So in your case, as you are not multi-threading, HashMap is your best bet.

Hashtable is synchronized, whereas HashMap isn't. That makes Hashtable slower than HashMap.
For single thread applications, use HashMap since they are otherwise the same in terms of functionality.

Another key difference between Hashtable and HashMap is that the Iterator in HashMap is fail-fast while the Enumeration for Hashtable is not: the Iterator throws a ConcurrentModificationException if any other thread modifies the map structurally by adding or removing any element, except through the Iterator's own remove() method. But this is not a guaranteed behavior and is done by the JVM on a best-effort basis.
My source: http://javarevisited.blogspot.com/2010/10/difference-between-hashmap-and.html

Beside all the other important aspects already mentioned here, Collections API (e.g. Map interface) is being modified all the time to conform to the "latest and greatest" additions to Java spec.
For example, compare Java 5 Map iterating:
for (Elem elem : map.keySet()) {
elem.doSth();
}
versus the old Hashtable approach:
for (Enumeration en = htable.keys(); en.hasMoreElements(); ) {
Elem elem = (Elem) en.nextElement();
elem.doSth();
}
In Java 1.8 we are also promised to be able to construct and access HashMaps like in good old scripting languages:
Map<String,Integer> map = { "orange" : 12, "apples" : 15 };
map["apples"];
Update: No, they won't land in 1.8... :(
Are Project Coin's collection enhancements going to be in JDK8?
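For what it's worth, Java 9 eventually added static factory methods that come reasonably close to that literal style (the values below are just the ones from the example above):

import java.util.Map;

public class MapLiteralsDemo {
    public static void main(String[] args) {
        // Java 9+: compact, immutable map construction (no null keys or values allowed).
        Map<String, Integer> map = Map.of("orange", 12, "apples", 15);
        System.out.println(map.get("apples")); // 15
    }
}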

Hashtable is synchronized; if you are using it in a single thread you can use HashMap, which is an unsynchronized version. Unsynchronized objects are often a little more performant. By the way, if multiple threads access a HashMap concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally.
You can wrap an unsynchronized map in a synchronized one using:
Map m = Collections.synchronizedMap(new HashMap(...));
Hashtable can only contain non-null objects as keys or values. HashMap can contain one null key and null values.
The iterators returned by Map are fail-fast, if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future. Whereas the Enumerations returned by Hashtable's keys and elements methods are not fail-fast.
Hashtable and HashMap are members of the Java Collections Framework (since the Java 2 platform v1.2, Hashtable was retrofitted to implement the Map interface).
Hashtable is considered legacy code; the documentation advises using ConcurrentHashMap in place of Hashtable if a thread-safe, highly concurrent implementation is desired.
HashMap doesn't guarantee the order in which elements are returned. For Hashtable I guess it's the same, but I'm not entirely sure; I can't find a resource that clearly states it.

HashMap and Hashtable have significant algorithmic differences as well. No one has mentioned this before, so that's why I am bringing it up. HashMap will construct a hash table with a power-of-two size, increase it dynamically such that you have at most about eight elements (collisions) in any bucket, and stir the elements very well for general element types. However, the Hashtable implementation provides better and finer control over the hashing if you know what you are doing: namely, you can fix the table size using, e.g., the closest prime number to your value domain's size, and this will result in better performance than HashMap, i.e. fewer collisions, in some cases.
Separate from the obvious differences discussed extensively in this question, I see the Hashtable as a "manual drive" car where you have better control over the hashing and the HashMap as the "automatic drive" counterpart that will generally perform well.
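Both classes expose the initial capacity and load factor in their constructors; the prime-sized table below is only an illustration of the "manual drive" tuning described above (the numbers are made up):

import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class CapacityTuningDemo {
    public static void main(String[] args) {
        // Hashtable keeps the capacity you give it (here a prime close to the
        // expected domain size), which can reduce collisions for some key sets.
        Map<Integer, String> table = new Hashtable<>(1009, 0.75f);

        // HashMap rounds the requested capacity up to the next power of two
        // internally and re-spreads the hash codes itself.
        Map<Integer, String> map = new HashMap<>(1009, 0.75f);

        table.put(42, "answer");
        map.put(42, "answer");
    }
}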

Based on the info here, I'd recommend going with HashMap. I think the biggest advantage is that Java will prevent you from modifying it while you are iterating over it, unless you do it through the iterator.

A Collection — sometimes called a container — is simply an object that groups multiple elements into a single unit. Collections are used to store, retrieve, manipulate, and communicate aggregate data. A collections framework is a unified architecture for representing and manipulating collections.
HashMap (JDK 1.2) and Hashtable (JDK 1.0) are both used to represent a group of objects stored as <Key, Value> pairs. Each <Key, Value> pair is called an Entry object, and the collection of entries is referred to by the HashMap or Hashtable object. Keys in a map must be unique, as they are used to retrieve the mapped value for a particular key; values may be duplicated.
« Superclass, Legacy and Collection Framework member
Hashtable is a legacy class introduced in JDK 1.0, which is a subclass of the Dictionary class. From JDK 1.2, Hashtable was re-engineered to implement the Map interface, making it a member of the collections framework. HashMap has been a member of the Java Collections Framework from its introduction in JDK 1.2. HashMap is a subclass of the AbstractMap class.
public class Hashtable<K,V> extends Dictionary<K,V> implements Map<K,V>, Cloneable, Serializable { ... }
public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable { ... }
« Initial capacity and Load factor
The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. Note that the hash table is open: in the case of a "hash collision", a single bucket stores multiple entries, which must be searched sequentially. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased.
HashMap constructs an empty hash table with a default initial capacity of 16 and a default load factor of 0.75, whereas Hashtable constructs an empty hash table with a default initial capacity of 11 and a load factor (fill ratio) of 0.75.
« Structural modification in case of hash collision
In case of hash collisions, both HashMap and Hashtable store the map entries in linked lists. From Java 8, for HashMap, if a hash bucket grows beyond a certain threshold, that bucket switches from a linked list of entries to a balanced tree, which improves worst-case performance from O(n) to O(log n). While converting the list to a binary tree, the hash code is used as a branching variable. If there are two different hash codes in the same bucket, one is considered bigger and goes to the right of the tree and the other to the left. But when both hash codes are equal, HashMap assumes that the keys are comparable and compares the keys to determine the direction, so that some order can be maintained. It is a good practice to make the keys of a HashMap comparable. On adding entries, if the bucket size reaches TREEIFY_THRESHOLD = 8, the linked list of entries is converted to a balanced tree; on removing entries, once the bucket shrinks to UNTREEIFY_THRESHOLD = 6, the balanced tree is converted back to a linked list of entries. (Java 8 source, related Stack Overflow post.)
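A small sketch of the "comparable keys" advice above; the AccountId class is invented purely for illustration:

// Hypothetical key type: implementing Comparable lets HashMap order keys
// inside a treeified bucket even when their hash codes collide.
class AccountId implements Comparable<AccountId> {
    private final long id;

    AccountId(long id) {
        this.id = id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof AccountId && ((AccountId) o).id == id;
    }

    @Override
    public int compareTo(AccountId other) {
        return Long.compare(id, other.id);
    }
}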
« Collection-view iteration, Fail-Fast and Fail-Safe
+--------------------+-----------+-------------+
| | Iterator | Enumeration |
+--------------------+-----------+-------------+
| Hashtable | fail-fast | safe |
+--------------------+-----------+-------------+
| HashMap | fail-fast | fail-fast |
+--------------------+-----------+-------------+
| ConcurrentHashMap | safe | safe |
+--------------------+-----------+-------------+
Iterator is fail-fast in nature, i.e. it throws ConcurrentModificationException if the collection is modified while iterating, other than through its own remove() method. Enumeration, by contrast, is fail-safe in nature: it doesn't throw any exceptions if the collection is modified while iterating.
According to Java API Docs, Iterator is always preferred over the Enumeration.
NOTE: The functionality of Enumeration interface is duplicated by the Iterator interface. In addition, Iterator adds an optional remove operation, and has shorter method names. New implementations should consider using Iterator in preference to Enumeration.
Java 5 introduced the ConcurrentMap interface. ConcurrentHashMap is a highly concurrent, high-performance ConcurrentMap implementation backed by a hash table. This implementation never blocks when performing retrievals and allows the client to select the concurrency level for updates. It is intended as a drop-in replacement for Hashtable: in addition to implementing ConcurrentMap, it supports all of the "legacy" methods peculiar to Hashtable.
Each entry's value is volatile, thereby ensuring fine-grained consistency for contended modifications and subsequent reads; each read reflects the most recently completed update.
Iterators and Enumerations are fail-safe, reflecting the state at some point since the creation of the iterator/enumeration; this allows for simultaneous reads and modifications at the cost of reduced consistency. They do not throw ConcurrentModificationException. However, iterators are designed to be used by only one thread at a time.
Like Hashtable but unlike HashMap, this class does not allow null to be used as a key or value.
import java.util.Collections;
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Map;

// Demonstrates fail-fast vs fail-safe traversal of Hashtable / HashMap / ConcurrentHashMap.
public class FailFastFailSafeDemo {

public static void main(String[] args) {
//HashMap<String, Integer> hash = new HashMap<String, Integer>();
Hashtable<String, Integer> hash = new Hashtable<String, Integer>();
//ConcurrentHashMap<String, Integer> hash = new ConcurrentHashMap<>();
new Thread() {
@Override public void run() {
try {
for (int i = 10; i < 20; i++) {
sleepThread(1);
System.out.println("T1 :- Key"+i);
hash.put("Key"+i, i);
}
System.out.println( System.identityHashCode( hash ) );
} catch ( Exception e ) {
e.printStackTrace();
}
}
}.start();
new Thread() {
@Override public void run() {
try {
sleepThread(5);
// ConcurrentHashMap traverse using Iterator, Enumeration is Fail-Safe.
// Hashtable traverse using Enumeration is Fail-Safe, Iterator is Fail-Fast.
for (Enumeration<String> e = hash.keys(); e.hasMoreElements(); ) {
sleepThread(1);
System.out.println("T2 : "+ e.nextElement());
}
// HashMap traverse using Iterator, Enumeration is Fail-Fast.
/*
for (Iterator< Entry<String, Integer> > it = hash.entrySet().iterator(); it.hasNext(); ) {
sleepThread(1);
System.out.println("T2 : "+ it.next());
// ConcurrentModificationException at java.util.Hashtable$Enumerator.next
}
*/
/*
Set< Entry<String, Integer> > entrySet = hash.entrySet();
Iterator< Entry<String, Integer> > it = entrySet.iterator();
Enumeration<Entry<String, Integer>> entryEnumeration = Collections.enumeration( entrySet );
while( entryEnumeration.hasMoreElements() ) {
sleepThread(1);
Entry<String, Integer> nextElement = entryEnumeration.nextElement();
System.out.println("T2 : "+ nextElement.getKey() +" : "+ nextElement.getValue() );
//java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextNode
// at java.util.HashMap$EntryIterator.next
// at java.util.Collections$3.nextElement
}
*/
} catch ( Exception e ) {
e.printStackTrace();
}
}
}.start();
Map<String, Integer> unmodifiableMap = Collections.unmodifiableMap( hash );
try {
unmodifiableMap.put("key4", 4);
} catch (java.lang.UnsupportedOperationException e) {
System.err.println("UnsupportedOperationException : "+ e.getMessage() );
}
}
static void sleepThread( int sec ) {
try {
Thread.sleep( 1000 * sec );
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
« Null Keys And Null Values
HashMap allows at most one null key and any number of null values, whereas Hashtable doesn't allow even a single null key or null value; if the key or value is null, it throws a NullPointerException. Example
« Synchronized, Thread Safe
Hashtable is internally synchronized. Therefore, it is safe to use Hashtable in multi-threaded applications. HashMap, on the other hand, is not internally synchronized, so it is not safe to use in multi-threaded applications without external synchronization. You can externally synchronize a HashMap using the Collections.synchronizedMap() method.
« Performance
As Hashtable is internally synchronized, this makes Hashtable slightly slower than the HashMap.
« See
A red–black tree is a kind of self-balancing binary search tree
Performance Improvement for HashMap in Java 8

For threaded apps, you can often get away with ConcurrentHashMap, depending on your performance requirements.

1. HashMap and Hashtable both store keys and values.
2. HashMap can store one null key (and null values). Hashtable can't store nulls at all.
3. HashMap is not synchronized, but Hashtable is synchronized.
4. HashMap can be synchronized with Collections.synchronizedMap(map):
Map hashmap = new HashMap();
Map map = Collections.synchronizedMap(hashmap);

Apart from the differences already mentioned, it should be noted that since Java 8, HashMap dynamically replaces the Nodes (linked list) used in each bucket with TreeNodes (red-black tree), so that even if high hash collisions exist, the worst case when searching is
O(log(n)) for HashMap Vs O(n) in Hashtable.
*The aforementioned improvement has not been applied to Hashtable yet, but only to HashMap, LinkedHashMap, and ConcurrentHashMap.
FYI, currently,
TREEIFY_THRESHOLD = 8 : if a bucket contains more than 8 nodes, the linked list is transformed into a balanced tree.
UNTREEIFY_THRESHOLD = 6 : when a bucket becomes too small (due to removal or resizing) the tree is converted back to linked list.

There are 5 basic differences between Hashtable and HashMap.
1. Map allows you to iterate over and retrieve keys, values, and key-value pairs, whereas Hashtable doesn't offer all of these capabilities.
2. Hashtable has a contains() method, which is confusing to use because its meaning is ambiguous: does it mean "contains key" or "contains value"? Tough to tell. Map instead has containsKey() and containsValue() methods, which are much easier to understand.
3. In HashMap you can safely remove an element while iterating, whereas this was not possible with Hashtable's Enumeration.
4. Hashtable is synchronized by default, so it can easily be used with multiple threads, whereas HashMap is not synchronized by default and is only safe for a single thread unless you wrap it using the Collections utility class's synchronizedMap(Map m) method.
5. Hashtable won't allow null keys or null values, whereas HashMap allows one null key and multiple null values.

My small contribution:
The first and most significant difference between Hashtable and HashMap is that HashMap is not thread-safe while Hashtable is a thread-safe collection.
The second important difference between Hashtable and HashMap is performance; since HashMap is not synchronized, it performs better than Hashtable.
The third difference between Hashtable and HashMap is that Hashtable is an obsolete class and you should be using ConcurrentHashMap in place of Hashtable in Java.

HashMap: It is a class available in the java.util package, and it is used to store elements in key-value form.
Hashtable: It is a legacy class that was later retrofitted into the collections framework.

Hashtable is synchronized whereas HashMap is not.
Another difference is that the Iterator in HashMap is fail-fast
while the Enumeration for Hashtable isn't. If you change the map
while iterating, you'll know.
HashMap permits null values in it, while Hashtable doesn't.

Hashtable is a legacy class in the JDK that shouldn't be used anymore. Replace usages of it with ConcurrentHashMap. If you don't require thread safety, use HashMap, which isn't thread-safe but is faster and uses less memory.

HashMap and Hashtable
Some important points about HashMap and Hashtable; please read the details below.
1) Hashtable and HashMap implement the java.util.Map interface.
2) Both HashMap and Hashtable are hash-based collections and work on hashing.
So these are the similarities between HashMap and Hashtable.
What is the difference between HashMap and Hashtable?
1) The first difference is that HashMap is not thread-safe, while Hashtable is thread-safe.
2) HashMap performs better because it is not thread-safe, while Hashtable performs worse because it is thread-safe, so multiple threads cannot access a Hashtable at the same time.

HashMap and Hashtable both are used to store data in key and value form. Both are using hashing technique to store unique keys.
But there are many differences between the HashMap and Hashtable classes, given below.

Hashtable:
Hashtable is a data structure that retains values as key-value pairs. It doesn't allow null for either keys or values; you will get a NullPointerException if you add a null key or value. It is synchronized, so it comes with that cost: only one thread can access a Hashtable at a particular time.
Example :
import java.util.Map;
import java.util.Hashtable;
public class TestClass {
    public static void main(String args[]) {
        Map<Integer, String> states = new Hashtable<Integer, String>();
        states.put(1, "INDIA");
        states.put(2, "USA");
        states.put(3, null); // will throw NullPointerException at runtime
        System.out.println(states.get(1));
        System.out.println(states.get(2));
        // System.out.println(states.get(3));
    }
}
HashMap:
HashMap is like Hashtable in that it also stores key-value pairs, but it allows null for both keys and values. Its performance is better than Hashtable's, because it is unsynchronized.
Example:
import java.util.HashMap;
import java.util.Map;
public class TestClass {
    public static void main(String args[]) {
        Map<Integer, String> states = new HashMap<Integer, String>();
        states.put(1, "INDIA");
        states.put(2, "USA");
        states.put(3, null); // Okay
        states.put(null, "UK");
        System.out.println(states.get(1));
        System.out.println(states.get(2));
        System.out.println(states.get(3));
    }
}

Since Hashtable in Java is a subclass of the Dictionary class, which is now obsolete due to the existence of the Map interface, it is not used anymore. Moreover, there isn't anything you can do with a Hashtable that you can't do with a class that implements the Map interface.

Old and classic topic, just want to add this helpful blog that explains this:
http://blog.manishchhabra.com/2012/08/the-5-main-differences-betwen-hashmap-and-hashtable/
Blog by Manish Chhabra
The 5 main differences between HashMap and Hashtable
HashMap and Hashtable both implement java.util.Map interface but there
are some differences that Java developers must understand to write
more efficient code. As of the Java 2 platform v1.2, Hashtable class
was retrofitted to implement the Map interface, making it a member of
the Java Collections Framework.
One of the major differences between HashMap and Hashtable is that HashMap is non-synchronized whereas Hashtable is synchronized, which
means Hashtable is thread-safe and can be shared between multiple
threads but HashMap cannot be shared between multiple threads without
proper synchronization. Java 5 introduced ConcurrentHashMap which is
an alternative of Hashtable and provides better scalability than
Hashtable in Java. Synchronized means only one thread can modify a hash
table at one point of time. Basically, it means that any thread before
performing an update on a hashtable will have to acquire a lock on the
object while others will wait for lock to be released.
The HashMap class is roughly equivalent to Hashtable, except that it permits nulls. (HashMap allows nulls as keys and values, whereas
Hashtable doesn't allow nulls).
The third significant difference between HashMap vs Hashtable is that Iterator in the HashMap is a fail-fast iterator while the
enumerator for the Hashtable is not and throw
ConcurrentModificationException if any other Thread modifies the map
structurally by adding or removing any element except Iterator’s own
remove() method. But this is not a guaranteed behavior and will be
done by JVM on best effort. This is also an important difference
between Enumeration and Iterator in Java.
One more notable difference between Hashtable and HashMap is that because of thread-safety and synchronization Hashtable is much slower
than HashMap if used in Single threaded environment. So if you don’t
need synchronization and HashMap is only used by one thread, it
outperforms Hashtable in Java.
HashMap does not guarantee that the order of the map will remain constant over time.
Note that HashMap can be synchronized by
Map m = Collections.synchronizedMap(hashMap);
In Summary there are significant differences between Hashtable and
HashMap in Java e.g. thread-safety and speed and based upon that only
use Hashtable if you absolutely need thread-safety, if you are running
Java 5 consider using ConcurrentHashMap in Java.

Related

thread safe data structure to preserve order of insertion [duplicate]

I need a data structure that is a LinkedHashMap and is thread-safe.
How can I do that?
You can wrap the map in a Collections.synchronizedMap to get a synchronized hashmap that maintains insertion order. This is not as efficient as a ConcurrentHashMap (and doesn't implement the extra interface methods of ConcurrentMap) but it does get you the (somewhat) thread safe behavior.
Even the mighty Google Collections doesn't appear to have solved this particular problem yet. However, there is one project that does try to tackle the problem.
I say somewhat on the synchronization, because iteration is still not thread safe in the sense that concurrent modification exceptions can happen.
There's a number of different approaches to this problem. You could use:
Collections.synchronizedMap(new LinkedHashMap());
as the other responses have suggested, but this has several gotchas you'll need to be aware of. Most notable is that you will often need to hold the collection's synchronized lock when iterating over it, which in turn prevents other threads from accessing the collection until you've completed the iteration. (See Java theory and practice: Concurrent collections classes). For example:
synchronized (map) {
    for (Object key : map.keySet()) {
        // Do work here
    }
}
Using
new ConcurrentHashMap();
is probably a better choice as you won't need to lock the collection to iterate over it.
Finally, you might want to consider a more functional programming approach. That is, you could consider the map as essentially immutable. Instead of adding to an existing Map, you would create a new one that contains the contents of the old map plus the new addition. This sounds pretty bizarre at first, but it is actually the way Scala deals with concurrency and collections.
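A minimal sketch of that copy-on-write style, assuming the new map is published through a volatile reference (class and method names are invented for the example):

import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class CopyOnWriteMapHolder<K, V> {
    // Readers always see a consistent, fully built snapshot.
    private volatile Map<K, V> snapshot = Collections.emptyMap();

    public V get(K key) {
        return snapshot.get(key); // no locking needed for reads
    }

    public synchronized void put(K key, V value) {
        // Copy the old contents, add the new entry, then swap the reference.
        Map<K, V> next = new LinkedHashMap<>(snapshot); // keeps insertion order
        next.put(key, value);
        snapshot = Collections.unmodifiableMap(next);
    }
}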
There is one implementation available under Google code. A quote from their site:
A high performance version of java.util.LinkedHashMap for use as a software cache.
Design
A concurrent linked list runs through a ConcurrentHashMap to provide eviction ordering.
Supports insertion and access ordered eviction policies (FIFO, LRU, and Second Chance).
You can use a ConcurrentSkipListMap, only available in Java SE/EE 6 or later. It is order preserving in that keys are sorted according to their natural ordering. You need to have a Comparator or make the keys Comparable objects. In order to mimic linked hash map behavior (iteration order is the order in time in which entries were added), I implemented my key objects to always compare as greater than a given other object unless it is equal (whatever that means for your object).
A wrapped synchronized linked hash map did not suffice because as stated in
http://www.ibm.com/developerworks/java/library/j-jtp07233.html: "The synchronized collections wrappers, synchronizedMap and synchronizedList, are sometimes called conditionally thread-safe -- all individual operations are thread-safe, but sequences of operations where the control flow depends on the results of previous operations may be subject to data races. The first snippet in Listing 1 shows the common put-if-absent idiom -- if an entry does not already exist in the Map, add it. Unfortunately, as written, it is possible for another thread to insert a value with the same key between the time the containsKey() method returns and the time the put() method is called. If you want to ensure exactly-once insertion, you need to wrap the pair of statements with a synchronized block that synchronizes on the Map m."
So the only thing that helps is a ConcurrentSkipListMap, which is 3-5 times slower than a normal ConcurrentHashMap.
Collections.synchronizedMap(new LinkedHashMap())
Since the ConcurrentHashMap offers a few important extra methods that are not in the Map interface, simply wrapping a LinkedHashMap with a synchronizedMap won't give you the same functionality, in particular, they won't give you anything like the putIfAbsent(), replace(key, oldValue, newValue) and remove(key, oldValue) methods which make the ConcurrentHashMap so useful.
Unless there's some apache library that has implemented what you want, you'll probably have to use a LinkedHashMap and provide suitable synchronized{} blocks of your own.
I just tried a synchronized, bounded LRU map based on an insertion-ordered LinkedHashMap, here called LinkedConcurrentHashMap, with a read/write lock for synchronization.
So when you are using an iterator, you have to acquire the write lock to avoid ConcurrentModificationException. This is better than Collections.synchronizedMap.
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LinkedConcurrentHashMap<K, V> {

    private final LinkedHashMap<K, V> linkedHashMap;
    private final int cacheSize;
    private final ReadWriteLock readWriteLock;

    public LinkedConcurrentHashMap(LinkedHashMap<K, V> psCacheMap, int size) {
        this.linkedHashMap = psCacheMap;
        cacheSize = size;
        readWriteLock = new ReentrantReadWriteLock();
    }

    public void put(K key, V value) {
        Lock writeLock = readWriteLock.writeLock();
        try {
            writeLock.lock();
            // Evict the oldest entry (insertion order) when the cache is full.
            if (linkedHashMap.size() >= cacheSize && cacheSize > 0) {
                K oldAgedKey = linkedHashMap.keySet().iterator().next();
                remove(oldAgedKey);
            }
            linkedHashMap.put(key, value);
        } finally {
            writeLock.unlock();
        }
    }

    public V get(K key) {
        Lock readLock = readWriteLock.readLock();
        try {
            readLock.lock();
            return linkedHashMap.get(key);
        } finally {
            readLock.unlock();
        }
    }

    public boolean containsKey(K key) {
        Lock readLock = readWriteLock.readLock();
        try {
            readLock.lock();
            return linkedHashMap.containsKey(key);
        } finally {
            readLock.unlock();
        }
    }

    public V remove(K key) {
        Lock writeLock = readWriteLock.writeLock();
        try {
            writeLock.lock();
            return linkedHashMap.remove(key);
        } finally {
            writeLock.unlock();
        }
    }

    public ReadWriteLock getLock() {
        return readWriteLock;
    }

    public Set<Map.Entry<K, V>> entrySet() {
        return linkedHashMap.entrySet();
    }
}
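A possible usage sketch for the class above (the cache size, keys, and values are arbitrary):

import java.util.LinkedHashMap;

public class LinkedConcurrentHashMapDemo {
    public static void main(String[] args) {
        // Insertion-ordered backing map; the oldest inserted entry is evicted first.
        LinkedConcurrentHashMap<String, String> cache =
                new LinkedConcurrentHashMap<>(new LinkedHashMap<String, String>(), 100);

        cache.put("a", "1");
        cache.put("b", "2");
        System.out.println(cache.get("a")); // 1
    }
}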
The answer is pretty much no, there's nothing equivalent to a ConcurrentHashMap that is sorted (like the LinkedHashMap). As other people pointed out, you can wrap your collection using Collections.synchronizedMap(-yourmap-) however this will not give you the same level of fine grained locking. It will simply block the entire map on every operation.
Your best bet is to either use synchronized around any access to the map (where it matters, of course. You may not care about dirty reads, for example) or to write a wrapper around the map that determines when it should or should not lock.
How about this.
Take your favourite open-source concurrent HashMap implementation. Sadly it can't be Java's ConcurrentHashMap as it's basically impossible to copy and modify that due to huge numbers of package-private stuff. (Why do the Java authors always do that?)
Add a ConcurrentLinkedDeque field.
Modify all of the put methods so that if an insertion is successful the Entry is added to the end of the deque. Modify all of the remove methods so that any removed entries are also removed from the deque. Where a put method replaces the existing value, we don't have to do anything to the deque.
Change all iterator/spliterator methods so that they delegate to the deque.
There's no guarantee that the deque and the map have exactly the same contents at all times, but concurrent hash maps don't make those sort of promises anyway.
Removal won't be super fast (have to scan the deque). But most maps are never (or very rarely) asked to remove entries anyway.
You could also achieve this by extending ConcurrentHashMap, or decorating it (decorator pattern).

Java's hashmap: Is keys() indeed missing?

Java's Hashtable is a synchronized hash table (and has existed for quite a while), while HashMap is an unsynchronized one.
In Hashtable there are 2 ways to get the keys of the hashtable:
keys(), which is:
public Enumeration keys()
Returns an enumeration of the keys in this hashtable.
and
public Set keySet()
Returns a Set view of the keys contained in this
Hashtable. The Set is backed by the Hashtable, so changes to the
Hashtable are reflected in the Set, and vice-versa. The Set supports
element removal (which removes the corresponding entry from the
Hashtable), but not element addition.
In the latter it is explicitly stated that the keys are direct references to the hashtable (so beware of modifications etc.).
But there is no such mention for the keys().
So my question is:
Does keys(), which uses an Enumeration, return a copy of the keys (unlike keySet(), which returns the actual keys)?
And if yes, why is there no such method in HashMap, where only keySet() is provided?
Hashtable.keys returns references to the real keys. It does not copy them.
The method does not exist in HashMap because keySet already does the job. It exists in Hashtable because this class has been around since Java 1.0. The collections framework that defines the keySet method wasn't added until 1.2.
In general Iterators on unsynchronized collections don't behave particularly well (they tend to throw ConcurrentModificationException or behave in an unspecified manner)
By looking at the source code for Hashtable, you can see that the key set's iterator and keys() enumeration are in fact implemented by the same inner class, which will attempt to throw a ConcurrentModificationException if the Hashtable changes. So, no, it is not going to make a copy of the keys.
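A small sketch showing that both views expose the live keys rather than copies (the keys are arbitrary):

import java.util.Enumeration;
import java.util.Hashtable;

public class HashtableKeysDemo {
    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);

        // keySet() is backed by the Hashtable: removing through it removes the entry.
        table.keySet().remove("a");
        System.out.println(table.containsKey("a")); // false

        // keys() enumerates the remaining (real) keys; it does not copy them.
        for (Enumeration<String> e = table.keys(); e.hasMoreElements(); ) {
            System.out.println(e.nextElement()); // b
        }
    }
}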

Java Collection: Does concrete class HashSet which uses Hash table implemented using LinkedList as its data structure?

In the book 'A Programmer's Guide to Java SCJP Certification' by Khalid Mughal (3rd ed.), on page 782, I noticed that it says the concrete class HashSet is implemented using a hash table and a linked list. When I browse the main Java tutorial website http://download.oracle.com/javase/tutorial/collections/implementations/index.html, it doesn't seem to be true. Please advise. Thanks.
HashSet is a wrapper for HashMap, which in turn uses an array. HashMap is a hash table, but not the Hashtable class. HashSet doesn't have anything to do with a list except to resolve collisions.
LinkedHashSet also has a linked list of its own, but does not use the LinkedList class.
According to code of HashSet,
public HashSet() {
map = new HashMap<>();
}
Hash table based implementation of the Map interface. This implementation provides all of the optional map operations, and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees as to the order of the map, in particular, it does not guarantee that the order will remain constant over time.

Why HashSet internally implemented as HashMap [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Why does HashSet implementation in Sun Java use HashMap as its backing?
I know what a hashset and hashmap is - pretty well versed with them.
There is 1 thing which really puzzled me.
Example:
Set <String> testing= new HashSet <String>();
Now if you debug it using Eclipse right after the above statement, under the debugger's Variables tab, you will notice that the set 'testing' is internally implemented as a HashMap.
Why does it need a HashMap, since there are no key-value pairs involved in a Set collection?
It's an implementation detail. The HashMap is actually used as the backing store for the HashSet. From the docs:
This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.
(emphasis mine)
The answer is right in the API docs
"This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.
This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets. Iterating over this set requires time proportional to the sum of the HashSet instance's size (the number of elements) plus the "capacity" of the backing HashMap instance (the number of buckets). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important."
So you don't even need the debugger to know this.
In answer to your question: it is an implementation detail. It doesn't need to use a HashMap, but it is probably just good code re-use. If you think about it, in this case the only difference is that a Set has different semantics from a Map. Namely, maps have a get(key) method, and Sets do not. Sets do not allow duplicates, Maps allow duplicate values, but they must be under different keys.
It is probably really easy to use a HashMap as the backing of a HashSet, because all you would have to do would be to use hashCode (defined on all objects) on the value you are putting in the Set to determine if a dupe, i.e., it is probably just doing something like
backingHashMap.put(toInsert.hashCode(), toInsert);
to insert items into the Set.
In most cases the Set is implemented as wrapper for the keySet() of a Map. This avoids duplicate implementations. If you look at the source you will see how it does this.
You might find the method Collections.newSetFromMap() which can be used to wrap ConcurrentHashMap for example.
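For example, to get a concurrent Set view backed by a ConcurrentHashMap (the element type is chosen arbitrarily):

import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class SetFromMapDemo {
    public static void main(String[] args) {
        // Each element becomes a key in the backing map; Boolean.TRUE is the dummy value.
        Set<String> concurrentSet =
                Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());

        concurrentSet.add("alpha");
        System.out.println(concurrentSet.contains("alpha")); // true
    }
}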
The very first sentence of the class's Javadoc states that it is backed by a HashMap:
This class implements the Set interface, backed by a hash table (actually a HashMap instance).
If you look at the source code of HashSet, you'll see that what it stores in the map is this: the key is the element you are adding, and the value is a mere marker Object (named PRESENT).
Why is it backed by a HashMap? Because this is the simplest way to store a set of items in a (conceptual) hashtable and there is no need for HashSet to re-invent an implementation of a hashtable data structure.
It's just a matter of convenience that the standard Java class library implements HashSet using a HashMap -- they only need to implement one data structure and then HashSet stores its data in a HashMap with the actual set objects as the key and a dummy value (typically Boolean.TRUE) as the value.
HashMap has already all the functionality that HashSet requires. There would be no sense to duplicate the same algorithms.
it allows you to easily and quickly determine whether an object is already in the set or not.

Implementing a concurrent LinkedHashMap

I'm trying to create a concurrent LinkedHashMap for a multithreaded architecture.
If I use Collections#synchronizedMap(), I would have to use synchronized blocks for iteration. This implementation would lead to sequential addition of elements.
If I use ConcurrentSkipListMap, is there any way to implement a Comparator so that entries are stored sequentially, as they would be in a linked list or queue?
I would like to use java's built in instead of third party packages.
EDIT:
In this concurrent LinkedHashMap, if the keys are names, I wish to put the keys in the sequence of their arrival, i.e. a new value would be appended either at the start or at the end, but sequentially.
While iterating, the LinkedHashMap could have new entries added or removed, but the iteration should follow the sequence in which the entries were added.
I understand that by using Collections#synchronizedMap(), a synchronized block for iteration would have to be implemented, but would the map be modifiable (entries could be added/removed) while it is being iterated?
If you use synchronizedMap, you don't have to synchronize externally, except for iteration. If you need to preserve the ordering of the map, you should use a SortedMap. You could use ConcurrentSkipListMap, which is thread-safe, or another SortedMap in combination with synchronizedSortedMap.
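In code, those two options look roughly like this (the key and value types are just placeholders):

import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class SortedMapOptions {
    public static void main(String[] args) {
        // Thread-safe and sorted by the keys' natural ordering, no external locking needed.
        ConcurrentNavigableMap<String, Integer> skipList = new ConcurrentSkipListMap<>();

        // Sorted, but only conditionally thread-safe: iteration still needs a synchronized block.
        SortedMap<String, Integer> syncSorted =
                Collections.synchronizedSortedMap(new TreeMap<String, Integer>());

        skipList.put("a", 1);
        syncSorted.put("b", 2);
    }
}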
A LinkedHashMap has a doubly linked list running through a hashtable. A FIFO only mutates the links on a write (insertion or removal). This makes implementing a version fairly straightforward.
Write a LHM with only insertion order allowed.
Switch to a ConcurrentHashMap as the hashtable.
Protect #put() / #putIfAbsent() / #remove() with a lock.
Make the "next" field volatile.
On iteration, no lock is needed as you can safely follow the "next" field. Reads can be lock-free by just delegating to the CHM on a #get().
Use Collections#synchronizedMap().
As per my belief, if I use Collections.synchronizedMap(), I would have to use synchronized blocks for getter/setter.
This is not true. You only need to synchronize the iteration on any of the views (keySet, values, entrySet). Also see the above-linked API documentation.
Until now, my project used LRUMap from Apache Commons Collections, but it is based on SequencedHashMap. Commons Collections also proposes ListOrderedMap, but none of them are thread-safe.
I have switched to MapMaker from Google Guava. You can look at CacheBuilder too.
Um, simple answer would be to use a monotonically increasing key provider that your Comparator operates on. Think AtomicInteger, and every time you insert, you create a new key to be used for comparisons. If you pool your real key, you can make an internal map of OrderedKey<MyRealKeyType>.
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;

class OrderedKey<T> implements Comparable<OrderedKey<T>> {
    T realKey;
    int index;

    OrderedKey(AtomicInteger source, T key) {
        index = source.getAndIncrement();
        realKey = key;
    }

    public int compareTo(OrderedKey<T> other) {
        if (Objects.equals(realKey, other.realKey)) {
            return 0;
        }
        return index - other.index;
    }
}
This would obviate the need for a custom comparator, and give you a nice O(1) method to compute size (unless you allow removes, in which case, count those as well, so you can just subtract "all successful removes" from "all successful adds", where successful means an entry was actually created or removed).
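A hypothetical usage of the OrderedKey idea with a ConcurrentSkipListMap, assuming the demo class lives in the same package as OrderedKey (names and values are invented):

import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicInteger;

public class InsertionOrderedMapDemo {
    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        ConcurrentNavigableMap<OrderedKey<String>, Integer> map = new ConcurrentSkipListMap<>();

        // Keys compare by their monotonically increasing index, so iteration
        // follows insertion order.
        map.put(new OrderedKey<>(counter, "first"), 1);
        map.put(new OrderedKey<>(counter, "second"), 2);

        map.keySet().forEach(k -> System.out.println(k.realKey));
    }
}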
