HashMaps and Lists reconciliation in Java - java

My issue is that I need a HashMap which returns a reference to an internal LinkedList when hashMap.get(key) is called— not simply return the value that corresponds to the key.
From what I've gathered, a LinkedHashMap enables a doubly-linked list to occupy each map entry for collision handling. However, I want to be able to get a reference to the overarching LinkedList that encapsulates all values mapped into it (each object that shares a LinkedList also share a particular feature I'm very interested in due to my overridden hash code function).
Put differently, I aim to avoid the linked list auto-traversal built into the LinkedHashMap class and just want the reference of the list itself to operationalize.
I want this reference to be returned in addition to having the capacity to add new values to the end of the LinkedLists with linkedHashMap.put(key, value) invocation.
Any pointers would be appreciated.

LinkedHashMap just stores its keys in a defined order (A LinkedList backs the KeySet). It isn't anything about how it handles collisions.
For what you've described, I think you'll have to implement things yourself. You're basically making a Map<KeyType, List<EntryType>>, with a "put" function that appends to the associated list. It isn't too much code.
I probably wouldn't make it actually extend Map, though, because what you've described doesn't really match the interface for that.

Related

HashMap has backing array, then why is it unordered

I read that HashMap has a backing array, where entries are stored (marked with bucket number, initial size 16). Arrays are ordered, and I can call get(n) to get the element at nth position. Then why is HashMap unordered and has no get(n) method?
It depends on your view of what ordered means.
Indeed HashMapss internally use an array or another collection that has a fixed ordering. However the order has nothing to do with insertion order or something like that. The elements are ordered, for example, in increasing size of their hash-values and they have nothing to do with some actual ordering on the elements themselves.
So HashMaps indeed have something like a get(n) method if you think of n being the hash-value of the key-element. The method is called get(*key*) and it first computes the hash-value of the given key-element and then looks the value up on the internal structure by using get(*hash-value*) on it.
Here is an image a quick search yield that shows the structure of HashSets:
Note that HashSets are kinda the same than HashMaps, they use the same technique and the same image applies. But instead of just inserting an element a map inserts a container that is identified by the key and additionally holds a value.
Just as a small overview. A hash-function is a function that given an object computes a small value, the hash-value out of it, using its properties. The computation usually can be done fast and a lookup on the internal array at the position given by the hash-value is thus also fast.
To your specific question, as an user of a HashMap you generally are not interested in what elements specifically hide behind hash-value 1 or 2 and so on, that is why they did not include such a method. However if you truly need to do that for a special application or so than you can always try to use Reflection to access the internals of your HashMap or you could also just write a small wrapper around the class that provides such a method.
A HashMap is divided into individual buckets. Buckets are initially backed by an array, however if the buckets get too large then they are converted to tree structures which are sorted based on hash codes. That fact alone destroys any guarantee it could make about preserving insertion order.
If you'd like to know more about how it's implemented, you can look at my answer to this question: HashMap Java 8 implementation

Is it possible to implement a MyHashMap backed by a given HashSet implementation?

As we all known, in Sun(Oracle) JDK, HashSet is implemented backed by a HashMap, to reuse the complicated algorithm and data structure.
But, is it possible to implement a MyHashMap using java.util.HashSet as its back?
If possible, how? If not, why?
Please note that this question is only a discussion of coding skill, not applicable for production scenarios.
Trove bases it's Map on it's Set implementation. However, it has one critical method which is missing from Set which is a get() method.
Without a get(Element) method, HashSet you cannot perform a lookup which is a key function of a Map. (pardon the pun) The only option Set has is a contains which could be hacked to perform a get() but it would not be ideal.
You can have;
a Set where the Entry is a key and a value.
you define entries as being equal when the keys are the same.
you hack the equals() method so when there is a match, that on a "put" the value portion of an entry is updated, and on a "get" the value portion is copied.
Set could have been designed to be extended as Map, but it wasn't and it wouldn't be a good idea to use HashSet or the existing Set implementations to create a Map.

java constantly sorted list with quick retrieval

I'm looking for a constantly sorted list in java, which can also be used to retrieve an object very quickly. PriorityQueue works great for the "constantly sorted" requirement, and HashMap works great for the fast retrieval by key, but I need both in the same list. At one point I had wrote my own, but it does not implement the collections interfaces (so can't be used as a drop-in replacement for a java.util.List etc), and I'd rather stick to standard java classes if possible.
Is there such a list out there? Right now I'm using 2 lists, a priority queue and a hashmap, both contain the same objects. I use the priority queue to traverse the first part of the list in sorted order, the hashmap for fast retrieval by key (I need to do both operations interchangeably), but I'm hoping for a more elegant solution...
Edit: I should add that I need to have the list sorted by a different comparator then what is used for retrieval by key; the list is sorted by a long value, the key retrieval is a String.
Since you're already using HashMap, that implies that you have unique keys. Assuming that you want to order by those keys, TreeMap is your answer.
It sounds like what you're talking about is a collection with an automatically-maintained index.
Try looking at GlazedLists which use "list pipelines" to efficiently propagate changes -- their SortedList class should do the job.
edit: missed your retrieval-by-key requirement. That can be accomplished with GlazedLists.syncEventListToMap and GlazedLists.syncEventListToMultimap -- syncEventListToMap works if there are no duplicate keys, and syncEventListToMultimap works if there are duplicate keys. The nice part about this approach is that you can create multiple maps based on different indices.
If you want to use TreeMaps for indices -- which may give you better performance -- you need to keep your TreeMaps privately encapsulated within a custom class of your choosing, that exposes the interfaces/methods you want, and create accessors/mutators for that class to keep the indices in sync with the collection. Be sure to deal with concurrency issues (via synchronized methods or locks or whatever) if you access the collection from multiple threads.
edit: finally, if fast traversal of the items in sorted order is important, consider using ConcurrentSkipListMap instead of TreeMap -- not for its concurrency, but for its fast traversal. Skip lists are linked lists with multiple levels of linkage, one that traverses all items, the next that traverses every K items on average (for a given constant K), the next that traverses every K2 items on average, etc.
TreeMap
http://download.oracle.com/javase/6/docs/api/java/util/TreeMap.html
Go with a TreeSet.
A NavigableSet implementation based on a TreeMap. The elements are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.
This implementation provides guaranteed log(n) time cost for the basic operations (add, remove and contains).
I haven't tested this so I might be wrong, so consider this just an attempt.
Use TreeMap, wrap the key of this map as an object which has two attributes (the string which you use as the key in hashmap and the long which you use to maintain the sort order in PriorityQueue). Now for this object, override the equals and hashcode method using the string. Implement the comparable interface using the long.
Why don't you encapsulate your solution to a class that implements Collection or Map?
This way you could simply delegate the retrieval methods to the faster/better suiting collection. Just make sure that calls to write-methods (add/remove/put) will be forwarded to both collections. Remember indirect accesses, like iterator.remove(). Most of these methods are optional to implement, but you have to deactivate them (Collections.unmodifiableXXX will help here in most cases).

Why HashSet internally implemented as HashMap [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Why does HashSet implementation in Sun Java use HashMap as its backing?
I know what a hashset and hashmap is - pretty well versed with them.
There is 1 thing which really puzzled me.
Example:
Set <String> testing= new HashSet <String>();
Now if you debug it using eclipse right after the above statements, under debugger variables tab, you will noticed that the set 'testing' internally is implemented as a hashmap.
Why does it need a hashmap since there is no key,value pair involved in sets collection
It's an implementation detail. The HashMap is actually used as the backing store for the HashSet. From the docs:
This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.
(emphasis mine)
The answer is right in the API docs
"This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.
This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets. Iterating over this set requires time proportional to the sum of the HashSet instance's size (the number of elements) plus the "capacity" of the backing HashMap instance (the number of buckets). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important."
So you don't even need the debugger to know this.
In answer to your question: it is an implementation detail. It doesn't need to use a HashMap, but it is probably just good code re-use. If you think about it, in this case the only difference is that a Set has different semantics from a Map. Namely, maps have a get(key) method, and Sets do not. Sets do not allow duplicates, Maps allow duplicate values, but they must be under different keys.
It is probably really easy to use a HashMap as the backing of a HashSet, because all you would have to do would be to use hashCode (defined on all objects) on the value you are putting in the Set to determine if a dupe, i.e., it is probably just doing something like
backingHashMap.put(toInsert.hashCode(), toInsert);
to insert items into the Set.
In most cases the Set is implemented as wrapper for the keySet() of a Map. This avoids duplicate implementations. If you look at the source you will see how it does this.
You might find the method Collections.newSetFromMap() which can be used to wrap ConcurrentHashMap for example.
The very first sentence of the class's Javadoc states that it is backed by a HashMap:
This class implements the Set interface, backed by a hash table (actually a HashMap instance).
If you'll look at the source code of HashSet you'll see that what it stores in the map is as the key is the entry you are using, and the value is a mere marker Object (named PRESENT).
Why is it backed by a HashMap? Because this is the simplest way to store a set of items in a (conceptual) hashtable and there is no need for HashSet to re-invent an implementation of a hashtable data structure.
It's just a matter of convenience that the standard Java class library implements HashSet using a HashMap -- they only need to implement one data structure and then HashSet stores its data in a HashMap with the actual set objects as the key and a dummy value (typically Boolean.TRUE) as the value.
HashMap has already all the functionality that HashSet requires. There would be no sense to duplicate the same algorithms.
it allows you to easily and quickly determine whether an object is already in the set or not.

confusing java data structures

Maybe the title is not appropriate but I couldn't think of any other at this moment. My question is what is the difference between LinkedList and ArrayList or HashMap and THashMap .
Is there a tree structure already for Java(ex:AVL,red-black) or balanced or not balanced(linked list). If this kind of question is not appropriate for SO please let me know I will delete it. thank you
ArrayList and LinkedList are implementations of the List abstraction. The first holds the elements of the list in an internal array which is automatically reallocated as necessary to make space for new elements. The second constructs a doubly linked list of holder cells, each of which refers to a list element. While the respective operations have identical semantics, they differ considerably in performance characteristics. For example:
The get(int) operation on an ArrayList takes constant time, but it takes time proportional to the length of the list for a LinkedList.
Removing an element via the Iterator.remove() takes constant time for a LinkedList, but it takes time proportional to the length of the list for an ArrayList.
The HashMap and THashMap are both implementations of the Map abstraction that are use hash tables. The difference is in the form of hash table data structure used in each case. The HashMap class uses closed addressing which means that each bucket in the table points to a separate linked list of elements. The THashMap class uses open addressing which means that elements that hash to the same bucket are stored in the table itself. The net result is that THashMap uses less memory and is faster than HashMap for most operations, but is much slower if you need the map's set of key/value pairs.
For more detail, read a good textbook on data structures. Failing that, look up the concepts in Wikipedia. Finally, take a look at the source code of the respective classes.
Read the API docs for the classes you have mentioned. The collections tutorial also explains the differences fairly well.
java.util.TreeMap is based on a red-black tree.
Regarding the lists:
Both comply with the List interface, but their implementation is different, and they differ in the efficiency of some of their operations.
ArrayList is a list stored internally as an array. It has the advantage of random access, but a single item addition is not guaranteed to run in constant time. Also, removal of items is inefficient.
A LinkedList is implemented as a doubly connected linked list. It does not support random access, but removing an item while iterating through it is efficient.
As I remember, both (LinkedList and ArrayList) are the lists. But they have defferent inner realization.

Categories

Resources