My main question is whether the ListIterator or Iterator class reduces the time taken to remove elements from a given LinkedList, and whether the same can be said for adding elements to that LinkedList using either of those classes. What's the point of using the built-in methods of the LinkedList class itself? Why should we perform any of these operations through LinkedList methods when we can use the ListIterator methods for better performance?
A ListIterator can indeed efficiently remove the node on which it is positioned. You can thus create a ListIterator, call next() twice to move the cursor, and then remove that node in constant time. But evidently you did a lot of work before the actual removal.
Using ListIterator.remove is not more efficient, time-complexity-wise, than removing through LinkedList.remove(int index) if you still need to construct the iterator. The LinkedList.remove method takes O(k) time, with k the index of the item you wish to remove. Removing that element with the ListIterator has the same time complexity, since: (a) we create a ListIterator in constant time; (b) we advance with .next() until it has returned the element at index k, each call taking O(1); and (c) we call .remove(), which is again O(1). But since we advance O(k) times, this is an O(k) operation as well.
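As a minimal sketch of steps (a) to (c), with hypothetical values, here is the element at index 2 being removed both ways:

import java.util.Arrays;
import java.util.LinkedList;
import java.util.ListIterator;

public class RemoveAtIndex {
    public static void main(String[] args) {
        int k = 2;

        // Convenience method: walks to index k internally, then unlinks the node. O(k).
        LinkedList<String> viaMethod = new LinkedList<>(Arrays.asList("a", "b", "c", "d"));
        viaMethod.remove(k);
        System.out.println(viaMethod); // [a, b, d]

        // Iterator equivalent: (a) create in O(1), (b) advance until next() has
        // returned the element at index k, (c) remove() in O(1). Still O(k) overall.
        LinkedList<String> viaIterator = new LinkedList<>(Arrays.asList("a", "b", "c", "d"));
        ListIterator<String> it = viaIterator.listIterator();
        for (int i = 0; i <= k; i++) {
            it.next();
        }
        it.remove(); // removes the element last returned by next(), i.e. the one at index k
        System.out.println(viaIterator); // [a, b, d]
    }
}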
A similar situation happens for .add(..) at an arbitrary location (an "insert"), except that here of course we insert a node rather than remove one.
Now since the two have the same time complexity, one might wonder why a LinkedList has methods such as remove(int index) in the first place. The main reason is programmer convenience. It is more convenient to call mylist.remove(5) than to create an iterator, advance it to that position in a loop, and then call remove. Furthermore, the methods on a linked list guard against some edge cases, like a negative index. By doing this manually you might end up removing the first element, which might not be the intended behaviour. Finally, code is often read more times than it is written. If a future reader sees mylist.remove(5), they understand that it removes the element at index 5, whereas a solution with looping will require some extra brain cycles to understand what that part is doing.
Furthermore, as @Andreas says, the List interface defines these methods, and hence LinkedList<T> has to implement them.
Related
I have had this question for a while but I have been unsatisfied with the answers because the distinctions appear to be arbitrary and more like conventional wisdom that is sort of blindly accepted rather than assessed critically.
In an ArrayList it is said that insertion cost (for a single element) is linear. If we are inserting at index p for 0 <= p < n where n is the size of the list, then the remaining n-p elements are shifted over first before the new element is copied into position p.
In a LinkedList, it is said that insertion cost (for a single element) is constant. For instance if we already have a node and we want to insert after it, we rearrange some pointers and it's done quickly. But getting this node in the first place, I don't see how it can be done other than a linear search first (assuming it isn't a trivial case like prepending at the start of the list or appending at the end).
And yet in the case of the LinkedList, we don't count that initial search time. To me this is confusing because it's sort of like saying "The ice cream is free... after you pay for it." It's like, well, of course it is... but that sort of skips the hard part of paying for it. Of course inserting in a LinkedList is going to be constant time if you already have the node you want, but getting that node in the first place may take some extra time! I could easily say that inserting in an ArrayList is constant time... after I move the remaining n-p elements.
So I don't understand why this distinction is made for one but not the other. You could argue that insertion is considered constant for LinkedLists because of the cases where you insert at the front or back where linear time operations are not required, whereas in an ArrayList, insertion requires copying of the suffix array after position p, but I could easily counter that by saying if we insert at the back of an ArrayList, it is amortized constant time and doesn't require extra copying in most cases unless we reach capacity.
In other words we separate the linear stuff from the constant stuff for LinkedList, but we don't separate them for the ArrayList, even though in both cases the linear operations may or may not be needed.
So why do we consider them separate for LinkedList and not for ArrayList? Or are they only being defined here in the context where LinkedList is overwhelmingly used for head/tail appends and prepends as opposed to elements in the middle?
This is basically a limitation of the Java interface for List and LinkedList, rather than a fundamental limitation of linked lists. That is, in Java there is no convenient concept of "a pointer to a list node".
Every type of list has a few different concepts loosely associated with the idea of pointing to a particular item:
The idea of a "reference" to a specific item in a list
The integer position of an item in the list
The value of an item that may be in the list (possibly multiple times)
The most general concept is the first one, and is usually encapsulated in the idea of an iterator. As it happens, the simple way to implement an iterator for an array-backed list is to wrap an integer which refers to the position of the item in the list. So for array lists only, the first and second ways of referring to items are pretty tightly bound.
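As an illustration, a minimal sketch of such an iterator for a hypothetical array-backed list (not the JDK's actual implementation):

import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: for an array-backed list, the "reference to an item"
// the iterator holds really is just an integer position.
class ArrayIterator<E> implements Iterator<E> {
    private final E[] elements;
    private int cursor = 0;   // the position doubles as the item reference

    ArrayIterator(E[] elements) {
        this.elements = elements;
    }

    @Override
    public boolean hasNext() {
        return cursor < elements.length;
    }

    @Override
    public E next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return elements[cursor++];
    }
}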
For other list types, however, and even for most other container types (trees, hashes, etc) that is not the case. The generic reference to an item is usually something like a pointer to the wrapper structure around one item (e.g., HashMap.Entry or LinkedList.Entry). For these structures the idea of accessing the nth element isn't necessarily natural or even possible (e.g., unordered collections like sets and many hash maps).
Perhaps unfortunately, Java made the idea of getting an item by its index a first-class operation. Many of the operations directly on List objects are implemented in terms of list indexes: remove(int index), add(int index, ...), get(int index), etc. So it's kind of natural to think of those operations as being the fundamental ones.
For LinkedList though it's more fundamental to use a pointer to a node to refer to an object. Rather than passing around a list index, you'd pass around the pointer. After inserting an element, you'd get a pointer to the element.
In C++ this concept is embodied in the concept of the iterator, which is the first class way to refer to items in collections, including lists. So does such a "pointer" exist in Java? It sure does - it's the Iterator object! Usually you think of an Iterator as being for iteration, but you can also think of it as pointing to a particular object.
So the key observation is: given a pointer (iterator) to an object, you can remove and add from linked lists in constant time, but from an array-like list this takes linear time in general. There is no inherent need to search for an object before deleting it: there are plenty of scenarios where you can maintain or take as input such a reference, or where you are processing the entire list, and here the constant time deletion of linked lists does change the algorithmic complexity.
Of course, if you need to do something like delete the first entry containing the value "foo", that implies both a search and a delete operation. Both array-based and linked lists take O(n) for the search, so they don't vary here - but you can meaningfully separate the search and delete operations.
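For example, a minimal sketch of keeping those two concerns separate (hypothetical values):

import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

public class SearchThenDelete {
    public static void main(String[] args) {
        List<String> list = new LinkedList<>(Arrays.asList("bar", "foo", "baz"));

        // The search part: O(n) for any list type.
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            if (it.next().equals("foo")) {
                // The delete part: O(1) on a LinkedList, O(n) on an ArrayList.
                it.remove();
                break;
            }
        }
        System.out.println(list); // [bar, baz]
    }
}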
So you could, in principle, pass around Iterator objects rather than list indexes or object values - at least if your use case supports it. However, at the top I said that "Java has no convenient notion of a pointer to a list node". Why?
Well, because actually using an Iterator this way is very inconvenient. First of all, it's tough to get an Iterator to an object in the first place: for example, and unlike C++, the add() methods don't return an Iterator - so to get a pointer to the item you just added, you need to go ahead and iterate over the list or use the listIterator(int index) call, which is inherently inefficient for linked lists. Many methods (e.g., subList()) support only a version that takes indexes, but not Iterators - even when such a method could be efficiently supported.
Add to that the restrictions around iterator invalidation when the list is modified, and they actually become pretty useless for referring to elements except in immutable lists.
So Java's support of pointers to list elements is pretty half-hearted, and so it's tough to leverage the constant time operations that linked lists offer, except in cases such as adding to the front of a list, or deleting items during iteration.
It's not limited to lists, either - ConcurrentLinkedQueue is also a linked structure which supports constant time deletes internally, but you can't reliably use that ability from Java.
If you're using a LinkedList, chances are you're not going to use it for a random access insert. LinkedList offers constant time for push (insert at the beginning) or add at the end (because it keeps a reference to the final node, IIRC). You are correct in your suspicion that an insert at a random index (e.g. a sorted insert) will take linear time - not constant.
ArrayList, by contrast, is worst case linear for an insert at an arbitrary index. It does an arraycopy to shift the subsequent elements over, which is a low-level bulk copy that is very fast in practice but still linear in the number of elements shifted. Appending at the end is amortized constant time; only when the backing array needs to be resized does an append take linear time.
.NET's LinkedList has a nice basic linked list feature that allows me to keep a node reference, a "pointer" into a linked list so to speak, and use that reference to navigate and manipulate the linked list from there in an O(1) fashion. To wit:
LinkedList<string> linkedList = new LinkedList<string>();
LinkedListNode<string> cur = linkedList.First;
LinkedListNode<string> rememberThis = null;
do
{
    if (...)
        rememberThis = cur;
} while ((cur = cur.Next) != null);
if (rememberThis != null)
    linkedList.AddAfter(rememberThis, "added-value");
I'm failing to see how I can do the same in Java, namely
Iterating through a LinkedList (this of course is O(n))
Making note of a list node
Using that node reference even after further iteration for O(1) insertion
Java does give me access to a ListIterator, which allows me to manipulate the list around the item I'm currently at, but I cannot seem to keep iterating while holding on to a previous position.
Am I missing something?
Am I missing something?
No. The LinkedList#ListItr class doesn't have a bookmark, so you cannot keep iterating while holding on to a previous node.
There's no O(1) method addAfter(Node node, E element) in LinkedList, because LinkedList#Node is private. There's add(int index, E element) which is O(n). Too sad.
A workaround is to use two ListIterators: one keeps iterating while the other stops at the position you want to remember. Then you can call ListIterator#add(E e) on the remembered one at the end, which is O(1). But the first iterator must not modify the list, otherwise it will break the second one.
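A minimal sketch of that workaround (hypothetical values; the bookmark iterator stops advancing once the spot is found, while the scanner keeps going):

import java.util.Arrays;
import java.util.LinkedList;
import java.util.ListIterator;

public class TwoIteratorBookmark {
    public static void main(String[] args) {
        LinkedList<String> list = new LinkedList<>(Arrays.asList("a", "b", "c", "d"));

        ListIterator<String> scanner = list.listIterator();   // keeps iterating
        ListIterator<String> bookmark = list.listIterator();  // parks at the remembered spot
        boolean found = false;

        while (scanner.hasNext()) {
            String value = scanner.next();
            if (!found) {
                bookmark.next();          // keep the bookmark in lockstep until we find the spot
                if (value.equals("b")) {
                    found = true;         // bookmark now sits just after "b"
                }
            }
        }

        if (found) {
            bookmark.add("added-value"); // O(1); inserts right after "b"
        }
        System.out.println(list);        // [a, b, added-value, c, d]
    }
}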
No, don't do that; it will break. If you ever modify the LinkedList structurally later, a ConcurrentModificationException will be thrown the next time you move the ListIterator's cursor, and there is no way around it. This is known as fail-fast behavior.
Anyway, Iterators aren't meant to hold a cursor in a list for a long time. And currently there is no way to hold a cursor to a certain position in a list, including LinkedLists, for a long time, even for openjdk 9 ea IIRC. The reason behind it may be the ambiguity of how to move the existing cursors. This may be obvious in many situations, but not always.
After all, it's (almost) impossible to add such a capability to a superinterface of LinkedList (Queue, Deque, List) now. (This is clearly an API design fault!) You can create your own version of LinkedList to implement it.
If you really want to keep a reference somehow, you will have to hack at the internals with reflection, which is probably not worth it at all.
I am implementing a public method that needs a data structure that can handle insertion at both ends. Since ArrayList.add(0, key) will take O(N) time, I decided to use a LinkedList instead - the add and addFirst methods should both take O(1) time.
However, in order to work with existing API, my method needs to return an ArrayList.
So I have two approaches:
(1) use LinkedList,
do all the addition of N elements where N/2 will be added to the front and N/2 will be added to the end.
Then convert this LinkedList to ArrayList by calling the ArrayList constructor:
return new ArrayList<key>(myLinkedList);
(2) use ArrayList and call ArrayList.add(key) to add N/2 elements to the back and call ArrayList.add(0,key) to add N/2 elements to the front. Return this ArrayList.
Can anyone comment on which option is more optimized in terms of time complexity? I am not sure how Java implements the constructor of ArrayList - which is the key factor that decides which option is better.
thanks.
The first method iterates across the list:
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/ArrayList.html#ArrayList(java.util.Collection)
Constructs a list containing the elements of the specified collection, in the order they are returned by the collection's iterator.
which, you can reasonably infer, uses the iterator interface.
The second method will shift elements every time you add to the front (and resize every once in a while):
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/ArrayList.html#add(int, E)
Inserts the specified element at the specified position in this list. Shifts the element currently at that position (if any) and any subsequent elements to the right (adds one to their indices).
Given the documented behavior of these methods, the first approach is more efficient.
FYI: you may get more mileage using LinkedList.toArray
I would suggest that you use an ArrayDeque which is faster than a LinkedList to insert elements at two ends and consumes less memory. Then convert it to an ArrayList using method #1.
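A minimal sketch of that suggestion (hypothetical values; addFirst and addLast on ArrayDeque are both amortized O(1), and the final conversion is a single O(N) pass):

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class TwoEndedBuild {
    public static void main(String[] args) {
        Deque<String> deque = new ArrayDeque<>();

        // N/2 elements appended to the back...
        deque.addLast("c");
        deque.addLast("d");
        // ...and N/2 elements prepended to the front.
        deque.addFirst("b");
        deque.addFirst("a");

        // Single O(N) conversion at the end, as in approach (1).
        List<String> result = new ArrayList<>(deque);
        System.out.println(result); // [a, b, c, d]
    }
}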
Main Question:
I'm seeking some way to give an object within a LinkedList a reference to itself within the list so that it can (efficiently) remove itself from said list (without searching through the list looking for itself; I'd like it to just cut itself out of the list directly and tie the previous and next items together).
Less Necessary Details:
I've done a reasonable amount of googling and not found anything other than people advising not to use circular references.
I'd like to do this as I'm designing a game, and in the game objects can implement various interfaces which allow them to be in various lists which are looped through in a prioritized manner. A single object might be in a draw loop, a loop which steps it through the frames of its animation, a high priority logic loop, and a low priority logic loop all at the same time. I would like to implement a removeFrom|TypeOfLoop| method in each appropriate interface so that if an object decides that it no longer needs to be in a loop it can directly remove itself. This keeps the objects that do the actual looping pleasantly simple.
Alternatively, If there is no way to do this, I'm thinking of implementing a flagging system where the list checks to see if each item wants to be removed based on a variable within the item. However, I dislike the idea of doing this enough to possibly just make my own LinkedList that is capable of removing by reference.
I did this recently. I was looking for an O(1) add O(1) remove lock-free Collection. Eventually I wrote my own Ring because I wanted a fixed-size container but you may find the technique I used for my first attempt of value.
I don't have the code in front of me but if memory serves:
Take a copy of Doug Lea's excellent Concurrent Doubly LinkedList and:
Expose the Node class. I used an interface but that is up to you.
Change the add, offer ... methods to return a Node instead of boolean. It is now no longer a Java Collection, but see my comment later.
Expose the delete method of the Node class or add a remove method that takes a Node.
You can now remove elements from the list in O(1) time, and it is Lock Free.
Added
Here's an implementation of the remove(Node) method taken from his Iterator implementation. Note that you have to keep trying until you succeed.
public void remove(Node<E> n) {
    // Keep retrying: delete() can fail under contention, so loop until it
    // succeeds or some other thread has already deleted the node.
    while (!n.delete() && !n.isDeleted())
        ;
}
I think your alternative is much better than letting the item remove itself from the loop. It reduces the responsibilities of the objects in the list, and avoids circular references.
Moreover, you could use Guava's Iterables.filter() method and iterate over a filtered view, rather than explicitly checking at each iteration whether the object should be rendered or not.
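A minimal sketch of that idea, assuming a hypothetical GameObject type with an isActive() flag (all names here are made up for illustration):

import com.google.common.base.Predicate;
import com.google.common.collect.Iterables;
import java.util.LinkedList;
import java.util.List;

// Hypothetical game-object type used only for this sketch.
class GameObject {
    boolean active = true;
    boolean isActive() { return active; }
    void draw() { /* render the object */ }
}

class DrawLoop {
    private final List<GameObject> drawLoop = new LinkedList<>();

    void drawAll() {
        // Lazily filtered view: objects that flagged themselves inactive are skipped.
        Iterable<GameObject> visible = Iterables.filter(drawLoop, new Predicate<GameObject>() {
            @Override
            public boolean apply(GameObject obj) {
                return obj.isActive();
            }
        });
        for (GameObject obj : visible) {
            obj.draw();
        }
    }
}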
Even if what you want to do were possible, you would get a ConcurrentModificationException when removing an object from the list while iterating over it. The only safe way to do that is to remove the current object through the iterator.
If you're using LinkedList, there's no more efficient way to remove an item than to iterate over it and do iterator.remove() when you find your element.
If you're using google collections or guava, you can do it in a one-liner:
Iterators.removeIf(list.iterator(), Predicates.equalTo(this));
The easiest way would be changing your algorithm to use an Iterator to iterate over the List and use the Iterator.remove() method to remove the current element.
If I am using a for loop (the standard for loop, not an enhanced for statement), I fail to see how an iterator increases efficiency when searching through a collection. If I have a statement such as:
(Assuming that aList is a List of generic objects, type E, nextElement refers to the next element within the list)
for (int index = 0; index < aList.size(); index++){
    E nextElement = aList.get(index);
    // do something with nextElement...
}
and I have the get method that looks something like:
Node<E> nodeRef = head;
for (int i = 0; i < index; i++){
    nodeRef = nodeRef.next;
    // possible other code
}
this would essentially be searching through the List, one element at a time. However, if I use an iterator, will it not be doing the same operation? I know an iterator is supposed to be O(1) speed, but wouldn't it be O(n) if it has to search through the entire list anyway?
It's not primarily about efficiency, IMO. It's about abstraction. Using an index ties you to collections which can retrieve an item for a given index efficiently (so it won't work well with a linked list, say)... and it doesn't express what you're trying to do, which is iterate over the list.
With an iterator, you can express the idea of iterating over a sequence of items whether that sequence can easily be indexed or not, whether the size is known in advance or not, and even in cases where it's effectively infinite.
Your second case is still written using a for loop which increments an index, which isn't the idiomatic way of thinking about it - it should simply be testing whether or not it's reached the end. For example, it might be:
for (Node<E> nodeRef = head; nodeRef != null; nodeRef = nodeRef.next)
{
}
Now we have the right abstraction: the loop expresses where we start (the head), when we stop (when there are no more elements) and how we go from one element to the next (using the next field). This expresses the idea of iterating more effectively than "I've got a counter starting at 0, and I'm going to ask for the value at the particular counter on each iteration until the value of the counter is greater than some value which happens to be the length of the list."
We're fairly used to the latter way of expressing things, but it doesn't really say what we mean nearly as well as the iterator approach.
Iterators are not about increasing efficiency, they're about abstraction in the object-oriented sense. Implementation-wise, the iterator is doing something similar to what you're doing, going through your collection one element at a time, at least if the collection is index-based. It's supposed to be O(1) when retrieving the next element, not the entire list. Iterators help mask what collection is underneath as well, it could be a linked list or a set, etc, but you don't have to know.
Also, notice how connected your for loop is to your specific logic that you want to do on each element, while with an iterator you can abstract out the looping logic from whatever action you want to do.
I think the question you are asking refers to the efficiency of iterators vs. a for-loop using an explicit get on the collection object.
If you write code with a naive version of get, and you iterate through your list using it, then it takes you
one step to "get" the first element
two steps to "get" the second
three steps to get the third
...
n steps to get the last
for a total of n(n+1)/2 operations, which is O(n^2).
But if you used an iterator which internally kept track of the next element (i.e. one step to advance), then iterating the whole list is O(n), a big improvement.
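To make the contrast concrete, here's a minimal sketch on a LinkedList (hypothetical values):

import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class GetVersusIterator {
    public static void main(String[] args) {
        List<String> list = new LinkedList<>(Arrays.asList("a", "b", "c"));

        // get(i) on a LinkedList has to walk to index i on every call:
        // O(n) per call, O(n^2) for the whole loop.
        for (int i = 0; i < list.size(); i++) {
            System.out.println(list.get(i));
        }

        // The iterator remembers where it is, so each next() is one pointer hop:
        // O(1) per call, O(n) for the whole loop.
        for (Iterator<String> it = list.iterator(); it.hasNext(); ) {
            System.out.println(it.next());
        }
    }
}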
Like Jon said, iterators have nothing to do with efficiency; they just abstract the concept of being able to iterate over a collection. So you are right: if you are just searching through a list there is no real benefit to an iterator over a for loop, but in some cases iterators provide convenient ways of doing things that would be difficult with a simple for loop. For example:
while (itr.hasNext()) {
    if (itr.next().equals(somethingBad))
        itr.remove();
}
In other cases iterators provide a way to traverse the elements of a collection that you cannot fetch by index (e.g. a HashSet). In that case an index-based for loop is not an option.
Remember that it's also a Design Pattern.
"The Iterator Pattern allows traversal of the elements of an aggregate without exposing the underlying implementation. It also places the task of traversal on the iterator object, not on the aggregate, which simplifies the aggregate interface and implementation, and places the responsibility where it should be." (From: Head First Design Pattern)
It's about encapsulation and also the 'single responsibility' principle.
Cheers,
Wim
You are using a linked list here. Iterating over that list without an iterator takes O(n^2) steps, where n is the size of the list: the loop runs n times, and each get(i) call takes up to O(n) to walk to the requested element.
The iterator, on the other hand, remembers the node it visited last time, and therefore needs only O(1) to find the next element. So overall the complexity is O(n), which is faster.
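As an illustration, a minimal sketch of such an iterator for a hypothetical singly linked node class (not the JDK's internals):

import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical singly linked node used only for this sketch.
class ListNode<E> {
    E value;
    ListNode<E> next;
}

class NodeIterator<E> implements Iterator<E> {
    private ListNode<E> current;   // the remembered position in the list

    NodeIterator(ListNode<E> head) {
        this.current = head;
    }

    @Override
    public boolean hasNext() {
        return current != null;
    }

    @Override
    public E next() {
        if (current == null) {
            throw new NoSuchElementException();
        }
        E value = current.value;
        current = current.next;    // one pointer hop instead of a walk from the head
        return value;
    }
}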