Lots of classes in Java are prefixed with the "Linked" identifier, e.g. LinkedList, LinkedBlockingQueue, LinkedHashMap. What does the term "linked" mean?
A Java LinkedList is a List implementation that uses a linked list. In contrast, a list can also be implemented with, for instance, a dynamic array, which is what ArrayList does.
A LinkedBlockingQueue follows much the same idea as a LinkedList.
A LinkedHashMap is a normal hash table (which provides efficient lookup by key), combined with a doubly-linked list (which provides a consistent iteration order).
Thus, the Linked prefix means that a linked structure (such as a singly- or doubly-linked list) is a key part of the underlying implementation.
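To make the "consistent iteration order" point concrete, here is a minimal sketch. The class and method names (LinkedOrderDemo, insertionOrderKeys) are made up for illustration; only LinkedHashMap and its documented insertion-order iteration come from the JDK.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LinkedOrderDemo {
    // Returns the keys of a LinkedHashMap in iteration order.
    static List<String> insertionOrderKeys() {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("banana", 2);
        map.put("apple", 1);
        map.put("cherry", 3);
        // Iteration follows the internal linked list, so the keys come
        // back in the order in which they were inserted.
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(insertionOrderKeys()); // [banana, apple, cherry]
    }
}
```

With a plain HashMap in place of the LinkedHashMap, the iteration order would not be guaranteed.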
Here, the term "linked" means that each member of the collection is aware of the next member in the collection via a "link"; therefore, each member can be stored in a non-sequential location in memory.
The above, very simple contribution is derived from the Linked List Wikipedia article mentioned in a comment made on the original question.
There are some collections in Java whose names start with Linked. Regardless of what follows Linked, these collections have some common properties:
1. These collections are always ordered.
2. An element can be inserted at any position. For example, you can insert an item anywhere in a LinkedList.
3. Each one maintains links that connect an item with the next/previous item, where each item is called a node. You may consider a simplified version of a node like this:
class Node{
int value;
Node next;
Node previous;
}
Here Node next and Node previous work as links from the current node to the next/previous node.
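As a rough illustration of how such links are used, the sketch below chains three nodes and walks them in both directions. The wrapper class NodeLinkDemo, the constructor, and the helper methods link, forward and backward are assumptions added for the example; they are not part of any JDK class.

```java
class NodeLinkDemo {
    // Same shape as the simplified Node above, plus a constructor for convenience.
    static class Node {
        int value;
        Node next;
        Node previous;

        Node(int value) {
            this.value = value;
        }
    }

    // Connects b directly after a by setting the link in both directions.
    static void link(Node a, Node b) {
        a.next = b;
        b.previous = a;
    }

    // Follows the next links from the given node to the end of the chain.
    static String forward(Node start) {
        StringBuilder sb = new StringBuilder();
        for (Node n = start; n != null; n = n.next) {
            sb.append(n.value);
        }
        return sb.toString();
    }

    // Follows the previous links from the given node back to the beginning.
    static String backward(Node start) {
        StringBuilder sb = new StringBuilder();
        for (Node n = start; n != null; n = n.previous) {
            sb.append(n.value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node a = new Node(1), b = new Node(2), c = new Node(3);
        link(a, b);
        link(b, c);
        System.out.println(forward(a));  // 123
        System.out.println(backward(c)); // 321
    }
}
```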
If we are implementing an LRU cache using a HashMap and a doubly linked list, what is the best way to implement the evict() method with O(1) time complexity?
Java's LinkedList does not expose its Node type (which is a private static nested class).
So you can't remove a given element in O(1), because a sequential scan is required to find its node.
To get O(1), you need access to the Node itself, so that you can unlink it without a scan.
You have to write the list yourself. Fortunately, a doubly linked list is relatively easy to write, and it's a pretty beneficial and fun task.
How to remove a given Node?
Refer to this answer: https://stackoverflow.com/a/54593530
The removeNode() method in LinkedList.java there removes a given node without a sequential scan.
The code in that answer is for a singly linked list; removal from a doubly linked list is even simpler in some cases.
Tips:
If the given node is the last node in the linked list, then you need the previous node too.
But that's for a singly linked list. In a doubly linked list, the node itself references the previous node, so you don't have to pass the previous node to the removeNode() method.
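A minimal sketch of the idea, assuming String keys, int values and a capacity of at least 1. The names SimpleLruCache, unlink and appendNewest are hypothetical. The HashMap maps each key to its node, so both the lookup and the unlinking avoid a scan (the hash lookup is O(1) on average).

```java
import java.util.HashMap;
import java.util.Map;

class SimpleLruCache {
    private static final class Node {
        final String key;
        int value;
        Node prev, next;

        Node(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    private final int capacity; // assumed to be >= 1
    private final Map<String, Node> index = new HashMap<>();
    // Sentinel nodes: head.next is the eldest entry, tail.prev is the newest.
    private final Node head = new Node(null, 0);
    private final Node tail = new Node(null, 0);

    SimpleLruCache(int capacity) {
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    // Returns the cached value, or null if absent; marks the entry as most recently used.
    Integer get(String key) {
        Node n = index.get(key);
        if (n == null) return null;
        unlink(n);
        appendNewest(n);
        return n.value;
    }

    void put(String key, int value) {
        Node n = index.get(key);
        if (n != null) { // existing key: update the value and refresh its position
            n.value = value;
            unlink(n);
            appendNewest(n);
            return;
        }
        if (index.size() == capacity) evict();
        n = new Node(key, value);
        index.put(key, n);
        appendNewest(n);
    }

    // Removes the least recently used entry without scanning the list.
    void evict() {
        Node eldest = head.next;
        if (eldest == tail) return; // cache is empty
        unlink(eldest);
        index.remove(eldest.key);
    }

    // The node knows both neighbours, so unlinking touches only two references.
    private void unlink(Node n) {
        n.prev.next = n.next;
        n.next.prev = n.prev;
    }

    private void appendNewest(Node n) {
        n.prev = tail.prev;
        n.next = tail;
        tail.prev.next = n;
        tail.prev = n;
    }
}
```

The sentinel head and tail nodes are a design choice that removes the null checks for "first" and "last" node; a version with nullable head/tail references works just as well.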
BTW
Why is it beneficial?
A linked list is the most basic structure (apart from arrays and bits) on which other very basic structures can be built.
E.g. both a queue and a stack can be implemented easily with a linked list.
Concurrent access
java.util.LinkedList is not thread-safe, so your LRU cache might need some concurrency control, but I'm not sure.
If it does, java.util.concurrent.ConcurrentLinkedDeque is a good example to refer to.
Update: LinkedHashMap
java.util.LinkedHashMap is a combination of a hash table and a doubly linked list.
Mechanism:
It extends HashMap to get O(1) complexity for the common operations.
And it uses a doubly linked list to keep track of insertion order.
head is the eldest item, and tail is the newest item.
It can be used to implement some kinds of cache, though I am not sure whether it fully meets your requirements.
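For the cache use case, one possible sketch relies on two documented LinkedHashMap features: the three-argument constructor with accessOrder = true and the protected removeEldestEntry hook. The subclass name LruMap and the maxEntries field are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruMap(int maxEntries) {
        // accessOrder = true: iteration runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by put/putAll after an insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

Usage: with new LruMap<String, Integer>(2), putting "a" and "b", reading "a", then putting "c" leaves the keys "a" and "c", because "b" was the least recently accessed. Like HashMap, this is not thread-safe.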
Java offers a LinkedList implementation of the List interface which is actually a doubly-linked list. If we do the following:
linkedlist.remove(obj);
and then:
linkedlist.add(obj);
we actually remove the object obj from the linked list and reinsert it at the right end (tail).
We can also manually implement a linked list with nodes, a head, and a tail. In languages such as C++, which have some low-level characteristics, one can use pointers to the next and previous objects of obj. Thus we don't actually have to remove the item, but just update the previous and next pointers.
Is there any data structure in Java with which we can have the same effect (and thus the same performance gain of removing only "pointers" instead of the objects themselves)?
Note that I would like to use a ready-to-go data structure instead of manually writing my own linked list implementation (and perhaps reinventing the wheel). Moreover, please note that it does not necessarily have to be a linked list; it might be, for example, some kind of queue such as an ArrayDeque.
EDIT: To put it a little differently, if the LinkedList implementation of the List interface in Java internally makes use of prev and next pointers, then why is l.remove(obj) O(n) and not O(1)? And why, in practice, when you have a LinkedList with many millions of objects (as in my case), does this removal and re-insertion take so long? (Same with ArrayList, same with ArrayDeque: a very long time.)
Java does exactly the same thing as C++. All references to objects are pointers. So, in a Node like
public class Node {
private Object value;
private Node next;
private Node previous;
}
value, next and previous are pointers (called references in Java) respectively to the value of the node, the next node and the previous node.
The difference from C++ is that you don't have pointer arithmetic: value++, for example, doesn't compile and doesn't make the pointer reference whatever is located at the next memory address.
EDIT:
The LinkedList class doesn't expose its nodes to the outside. They're completely private to the LinkedList. So, removing an object consists of iterating over all the nodes to find the one whose value is equal (in terms of Object.equals()) to the given object, and removing the found node from the list. Removing the node consists of making the previous node point to the next one, and vice versa. This is why removing an object is O(n). Of course, if you had access to the Node and were able to remove it, the operation would be O(1). If you need that, you'll have to implement your own LinkedList, exactly the same way as you would do it in C++. References are pointers.
To answer the question about why remove(Object obj) is O(n):
First, in order for it to be O(1), you'd need to give remove the actual Node reference, not the Object. The Object won't point back to the Node. In order to find the correct Node to remove, the code must either search the list to find the Node that contains the object, or keep a hash map that would let it find the Node. The actual LinkedList implementation does a simple linear search. However, even a hash map would, technically, be O(n) in the worst case, since there's the possibility that all the objects on the list have the same hash code, although it would still be much faster than a linear search in practice.
Second, remove(Object obj) is defined in terms of equals. If there is a different object obj2 that was added to the list, and obj2.equals(obj) is true, then remove(obj) will remove obj2 if the obj reference itself was never added to the list.
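A small sketch of that second point, using two distinct String objects with equal content. The class and method names are made up for the example; List.remove(Object) is the real API being demonstrated.

```java
import java.util.LinkedList;
import java.util.List;

class RemoveByEqualsDemo {
    // Returns true if removing with an equal-but-distinct object empties the list.
    static boolean removesEqualButDistinctObject() {
        List<String> list = new LinkedList<>();
        String stored = new String("item");
        String probe = new String("item"); // equals(stored), but a different object
        list.add(stored);
        boolean removed = list.remove(probe); // matches via equals(), not identity
        return removed && list.isEmpty() && stored != probe;
    }

    public static void main(String[] args) {
        System.out.println(removesEqualButDistinctObject()); // true
    }
}
```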
To really do this right, you'd need either an add method that returns a node reference, so that your program could keep track of the node reference and use that as the remove argument; or you could require that objects on the list implement some sort of NodePointer interface:
interface NodePointer {
void setNodePointer(Object node);
Object getNodePointer();
}
that the list would then use to stuff the node pointers into the objects. (But that would probably mean an object could only live on one linked list at a time, a restriction that LinkedList doesn't impose.) In either case, I don't think this is something the Java library supports.
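The first option (an add method that returns a node reference) could look roughly like the sketch below. HandleList, its nested Node type, addLast, remove and toList are all hypothetical names, not JDK API. The sketch also assumes the caller only passes handles that came from the same list.

```java
import java.util.ArrayList;
import java.util.List;

class HandleList<E> {
    // Opaque handle returned by addLast; callers keep it to remove in O(1).
    static final class Node<E> {
        private final E value;
        private Node<E> prev, next;

        private Node(E value) {
            this.value = value;
        }
    }

    private final Node<E> head = new Node<>(null); // sentinel before the first element
    private final Node<E> tail = new Node<>(null); // sentinel after the last element

    HandleList() {
        head.next = tail;
        tail.prev = head;
    }

    // Appends the value and hands back the node so it can be removed later without a search.
    Node<E> addLast(E value) {
        Node<E> n = new Node<>(value);
        n.prev = tail.prev;
        n.next = tail;
        tail.prev.next = n;
        tail.prev = n;
        return n;
    }

    // Unlinks the node in O(1); a handle that was already removed is ignored.
    void remove(Node<E> n) {
        if (n.prev == null) return;
        n.prev.next = n.next;
        n.next.prev = n.prev;
        n.prev = null;
        n.next = null;
    }

    // Copies the current contents, front to back, for inspection.
    List<E> toList() {
        List<E> out = new ArrayList<>();
        for (Node<E> n = head.next; n != tail; n = n.next) {
            out.add(n.value);
        }
        return out;
    }
}
```

Removing a handle and calling addLast again with the same value gives the "move to the tail" effect from the question in constant time, at the cost of the caller having to store the handles somewhere.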
The question "How can I efficiently select a Standard Library container in C++11?" has a handy flow chart to use when choosing C++ collections.
I thought that this was a useful resource for people who are not sure which collection they should be using so I tried to find a similar flow chart for Java and was not able to do so.
What resources and "cheat sheets" are available to help people choose the right Collection to use when programming in Java? How do people know what List, Set and Map implementations they should use?
Since I couldn't find a similar flowchart I decided to make one myself.
This flow chart does not try and cover things like synchronized access, thread safety etc or the legacy collections, but it does cover the 3 standard Sets, 3 standard Maps and 2 standard Lists.
This image was created for this answer and is licensed under a Creative Commons Attribution 4.0 International License. The simplest attribution is by linking to either this question or this answer.
Other resources
Probably the most useful other reference is the following page from the Oracle documentation, which describes each Collection.
HashSet vs TreeSet
There is a detailed discussion of when to use HashSet or TreeSet here:
Hashset vs Treeset
ArrayList vs LinkedList
Detailed discussion: When to use LinkedList over ArrayList?
Summary of the major non-concurrent, non-synchronized collections
Collection: An interface representing an unordered "bag" of items, called "elements". The "next" element is undefined (random).
Set: An interface representing a Collection with no duplicates.
HashSet: A Set backed by a hash table (internally, a HashMap). Fastest and smallest memory usage, when ordering is unimportant.
LinkedHashSet: A HashSet with the addition of a linked list to associate elements in insertion order. The "next" element is the next-most-recently inserted element.
TreeSet: A Set where elements are ordered by a Comparator (typically natural ordering). Slowest and largest memory usage, but necessary for comparator-based ordering.
EnumSet: An extremely fast and efficient Set customized for a single enum type.
List: An interface representing a Collection whose elements are ordered and each has a numeric index representing its position, where zero is the first element, and (size - 1) is the last.
ArrayList: A List backed by an array, where the array has a length (called "capacity") that is at least as large as the number of elements (the list's "size"). When size exceeds capacity (when the (capacity + 1)-th element is added), the array is recreated with a larger capacity (roughly 1.5 times the old capacity in the OpenJDK implementation)--this recreation is fast, since it uses System.arraycopy(). Deleting and inserting/adding elements requires all neighboring elements (to the right) be shifted into or out of that space. Accessing any element is fast, as it only requires the calculation (element-zero-address + desired-index * element-size) to find its location. In most situations, an ArrayList is preferred over a LinkedList.
LinkedList: A List backed by a set of objects, each linked to its "previous" and "next" neighbors. A LinkedList is also a Queue and Deque. Accessing elements is done starting at the first or last element, and traversing until the desired index is reached. Insertion and deletion, once the desired index is reached via traversal, is a trivial matter of re-mapping only the immediate-neighbor links to point to the new element or bypass the now-deleted element.
Map: An interface representing a collection of key-value pairs, where each value is identified by a "key". (Map does not extend the Collection interface.)
HashMap: A Map where keys are unordered, backed by a hash table.
LinkedHashMap: A HashMap whose keys are additionally kept in insertion order.
TreeMap: A Map where keys are ordered by a Comparator (typically natural ordering).
Queue: An interface that represents a Collection where elements are, typically, added to one end, and removed from the other (FIFO: first-in, first-out).
Stack: A legacy class (it extends Vector) that represents a Collection where elements are, typically, both added (pushed) and removed (popped) from the same end (LIFO: last-in, first-out). Its own documentation recommends using a Deque instead.
Deque: Short for "double ended queue", usually pronounced "deck". An interface for a queue that is typically only added to and read from either end (not the middle). ArrayDeque and LinkedList both implement it.
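To illustrate the Queue, Stack and Deque entries, here is a small sketch that uses a single ArrayDeque type both ways. The class DequeDemo and its two methods are made up for the example; push, pop, offer and poll are standard Deque methods.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DequeDemo {
    // LIFO: push and pop both work on the head of the deque.
    static String stackOrder() {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("a");
        stack.push("b");
        stack.push("c");
        return stack.pop() + stack.pop() + stack.pop();
    }

    // FIFO: offer adds at the tail, poll removes from the head.
    static String queueOrder() {
        Deque<String> queue = new ArrayDeque<>();
        queue.offer("a");
        queue.offer("b");
        queue.offer("c");
        return queue.poll() + queue.poll() + queue.poll();
    }

    public static void main(String[] args) {
        System.out.println(stackOrder()); // cba
        System.out.println(queueOrder()); // abc
    }
}
```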
Basic collection diagrams:
Comparing the insertion of an element with an ArrayList and LinkedList:
Even simpler picture is here. Intentionally simplified!
Collection is anything holding data called "elements" (of the same type). Nothing more specific is assumed.
List is an indexed collection of data where each element has an index. Something like the array, but more flexible.
Data in the list keep the order of insertion.
Typical operation: get the n-th element.
Set is a bag of elements, each element just once (the elements are distinguished using their equals() method).
Data in the set are stored mostly just to know what data are there.
Typical operation: tell if an element is present in the set.
Map is something like the List, but instead of accessing the elements by their integer index, you access them by their key, which is any object. Like the array in PHP :)
Data in Map are searchable by their key.
Typical operation: get an element by its ID (where ID is of any type, not only int as in case of List).
The differences
Set vs. Map: in Set you search data by themselves, whilst in Map by their key.
N.B. The standard library Sets are indeed implemented exactly like this: a map where the keys are the Set elements themselves, and with a dummy value.
List vs. Map: in List you access elements by their int index (position in List), whilst in Map by their key, which is of any type (typically: ID)
List vs. Set: in List the elements are bound by their position and can be duplicate, whilst in Set the elements are just "present" (or not present) and are unique (in the meaning of equals(), or compareTo() for SortedSet)
It is simple: if you need to store values with keys mapped to them go for the Map interface, otherwise use List for values which may be duplicated and finally use the Set interface if you don’t want duplicated values in your collection.
Here is the complete explanation http://javatutorial.net/choose-the-right-java-collection , including flowchart etc
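A tiny sketch of that rule of thumb, with made-up data and a hypothetical ListSetMapDemo class: the same three values keep their duplicates and positions in a List, collapse to unique values in a Set, and are looked up by key in a Map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ListSetMapDemo {
    static String summary() {
        // List: duplicates are kept and each element has an index.
        List<String> list = new ArrayList<>(List.of("x", "y", "x"));
        // Set: the duplicate "x" collapses into one element.
        Set<String> set = new HashSet<>(list);
        // Map: values are found by key rather than by position.
        Map<Integer, String> byId = new HashMap<>();
        byId.put(42, "x");
        return list.get(2) + " " + list.size() + " " + set.size()
                + " " + set.contains("y") + " " + byId.get(42);
    }

    public static void main(String[] args) {
        System.out.println(summary()); // x 3 2 true x
    }
}
```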
Map
If choosing a Map, I made this table summarizing the features of each of the ten implementations bundled with Java 11.
In the implementation of HashMap, linked lists are used to represent elements in buckets.
Each Entry has a reference to the next Entry. See: Ref. However, in the implementation of the LinkedList class, each element has a reference to its previous element and its next element; see Ref. Just trying to figure out why previous is important in one linked list and not the other?
Entry (an internal class of HashMap) is not part of a general-use linked list (as LinkedList is). Its sole purpose is to be iterated over in the forward direction while looking for an element. So it does not need a previous reference.
The previous reference makes the LinkedList a bidirectional list, which makes it possible to iterate over the list in reverse.
Strictly speaking, the reference to the previous element is not needed in a linked list. java.util.LinkedList is actually a doubly-linked list. This is needed for an efficient implementation of the following operations:
add(E), which appends at the end of the list;
getLast(), which retrieves the last element of the list;
ListIterator.previous(), which allows traversal of the list in reverse order.
Said operations are of no use for the linked list of Map.Entry.
Note that while getLast() is a LinkedList addition to the List interface, the other two are required by said interface.
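For illustration, a reverse traversal using only standard API (listIterator(int), hasPrevious() and previous()) might look like this. The class ReverseWalkDemo and the method reversed are names invented for the sketch.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class ReverseWalkDemo {
    // Collects the elements from last to first by following the previous links.
    static List<Integer> reversed(LinkedList<Integer> list) {
        List<Integer> out = new ArrayList<>();
        // Start the iterator positioned after the last element.
        ListIterator<Integer> it = list.listIterator(list.size());
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(reversed(new LinkedList<>(List.of(1, 2, 3)))); // [3, 2, 1]
    }
}
```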
The LinkedList is a general-purpose implementation. You may want to iterate over it backwards. For Maps, searching a bucket only iterates forward. Since there is no need to iterate backward, it is not implemented.
I am looking for a library with a Red-black tree and Linked list implementation offering iterators which are not fail-fast. I would like to have the same functionality as I do in C++ using STL and that is:
insertion in the tree/list does not invalidate any iterators
removal invalidates only the iterator that points at the element being removed
it is possible to somehow store the "position" of the iterator and refer to the value it is pointing at
This implementation would be nice as it would offer the ability to modify the list/tree while using a part of it. Here are some examples:
obtaining adjacent element in linked list / red-black tree to some stored value in O(1)
batch insertions / removals (no constraints such as one removal per position increment)
splitting linked list in O(1) via position of the iterator
more efficient removals when having stored position of the iterator (e.g. by keeping iterators to a position in linked list, removal is O(1), not O(N))
I would also like that library/source code/implementation to have an Apache/GPL-friendly licence and to be fairly extensible (so I can make my own modifications to implement operations such as the ones from the examples above).
If there exists no such library, is there some other library which could help me in implementing those two data structures on my own?
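For context, this is the fail-fast behaviour in question, shown on java.util.LinkedList. The class and method names are made up for the sketch. Note that the JDK documents fail-fast detection as best-effort; the sketch relies on the usual single-threaded behaviour.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class FailFastDemo {
    // Returns true if an insertion made outside the iterator invalidates it.
    static boolean insertionInvalidatesIterator() {
        List<Integer> list = new LinkedList<>(List.of(1, 2, 3));
        Iterator<Integer> it = list.iterator();
        it.next();
        list.add(4); // structural modification that bypasses the iterator
        try {
            it.next(); // the iterator notices the modification here
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(insertionInvalidatesIterator()); // true
    }
}
```

With C++ std::list, the equivalent insertion would leave the iterator valid, which is the behaviour being asked for.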
These are both pretty easy to implement yourself. The iterator has to do three things:
Only store two references, one to the "current" element and one to the outer container object (the blob storing the root reference (tree) or the head/tail references (list)).
Be able to determine whether the referred-to element is valid (easy, if parent pointer is null (tree) or prev/next pointers are null (list) then this better be the tree root or the head/tail of the list). Throw something when any operations at all are attempted on an invalid iterator.
Be able to find the prev/next element.
There are a couple of gotchas: for the list, it would be a pain to support java.util.ListIterator's nextIndex() and previousIndex() efficiently, and, of course, now you have the problem of dealing with concurrent modifications! Here's an example of how things could go bad:
while (it.hasNext()) {
    potentially_delete_the_last_element();
    it.next(); // oops, may throw NoSuchElementException.
}
The way to avoid this would be never to modify the list between checking if it has next/prev and actually retrieving next/prev.