Which is faster in accessing elements from Java collections [closed] - java

I am trying to understand which is faster in accessing elements from collections in Java like ArrayList, LinkedList, HashSet, TreeSet, HashMap, TreeMap etc.
From this question: Suitable java collection for fast get and fast removal, I learned that ArrayList access is O(1) and TreeMap access is O(log n),
whereas this one: Map/ArrayList: which one is faster to search for an element shows that ArrayList is O(n), HashMap is O(1) and TreeMap is O(log n),
whereas this one: Why is it faster to process a sorted array than an unsorted array? says that a sorted array is faster than an unsorted array. Since the elements in a TreeMap are sorted, can I assume that all sorted collections are faster than unsorted collections?
Please help me understand which of the Java list, set, and map implementations is fastest for accessing elements.

Every collection type is suitable for a particular scenario. There is no fastest or best collection.
If you need fast access to elements using index, ArrayList is your answer.
If you need fast access to elements using a key, use HashMap.
If you need fast addition and removal of elements at the ends of the list or through an iterator, use LinkedList (but it has very poor index access performance).
and so on.
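The two access patterns mentioned above can be sketched as follows. This is only a minimal illustration; the class name `AccessPatterns`, the method names, and the sample data are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AccessPatterns {
    // Index-based access: ArrayList.get(int) reads directly from the backing array.
    static String byIndex(List<String> names, int index) {
        return names.get(index);
    }

    // Key-based access: HashMap.get hashes the key to locate its bucket.
    // Returns null when the key is absent.
    static Integer byKey(Map<String, Integer> ages, String name) {
        return ages.get(name);
    }

    public static void main(String[] args) {
        // Sample data, invented for illustration.
        List<String> names = new ArrayList<>(Arrays.asList("ann", "bob", "cid"));
        Map<String, Integer> ages = new HashMap<>();
        ages.put("ann", 31);
        ages.put("bob", 27);

        System.out.println(byIndex(names, 1));
        System.out.println(byKey(ages, "ann"));
    }
}
```

If the same lookup-by-name were done on the list instead, it would have to scan the elements one by one, which is the O(n) case the question refers to.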

It depends on whether you want to access an element by index (in the case of a list) or check whether an object exists in the collection.
If you want to access an element by index, then ArrayList is faster, as it implements the RandomAccess marker interface and is internally backed by an array.
HashSet and TreeSet are internally backed by the corresponding Map, so the performance of a Set matches that of its Map (the Set stores a dummy object as the value in each key-value pair). I would suggest using a HashSet.
What many programmers don't notice is that HashSet and HashMap only reach their best-case O(1) performance when the hash function of the key object is good, i.e. it produces different values for different objects (though this is not a strict requirement).
NOTE: If your hash function is poor, many keys land in the same bucket, which behaves like a linked list internally, and performance degrades to O(n). (Since Java 8, HashMap converts large buckets into balanced trees, which softens this worst case.)
My personal preference, when the keys are enum constants, is to use EnumMap or EnumSet. They simply use the enum values themselves, and programmers don't have to worry about the enum's hashCode/equals methods. In all other cases, use HashSet or HashMap (if you don't need ordering).
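As a rough sketch of the EnumMap/EnumSet suggestion: the enum `Day` and the helper methods below are hypothetical and exist only for this example.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

class EnumCollections {
    // Hypothetical enum, used only for illustration.
    enum Day { MON, TUE, WED, THU, FRI, SAT, SUN }

    // EnumMap needs the key class up front; no hashCode is involved in lookups.
    static Map<Day, Integer> hoursWorked() {
        Map<Day, Integer> hours = new EnumMap<>(Day.class);
        hours.put(Day.MON, 8);
        hours.put(Day.FRI, 6);
        return hours;
    }

    // EnumSet offers factory methods instead of public constructors.
    static Set<Day> weekend() {
        return EnumSet.of(Day.SAT, Day.SUN);
    }

    public static void main(String[] args) {
        System.out.println(hoursWorked().get(Day.MON));
        System.out.println(weekend().contains(Day.SUN));
    }
}
```

Both types are declared against the Map and Set interfaces here, so they can be swapped for HashMap or HashSet later without touching the callers.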

Related

Benefits of linked list over array [duplicate]

This question already has answers here:
When to use LinkedList over ArrayList in Java?
(33 answers)
Closed 10 months ago.
I really don't see how linked lists are better than arrays. The insertion and deletion complexities are the same: for example, in an array insertion at the rear is O(1), while for a linked list insertion at the head is O(1); similarly, insertion at the front of an array is O(n), but for the latter it is O(n) to insert at the rear end.
Apart from the fact that linked lists are dynamic in nature, I don't see any benefits of linked lists over arrays. Moreover, I can use a dynamic array to counter that problem.
Arrays also perform better when we want to access an element.
So can anybody please tell me why linked lists are better than arrays? And if they are not better, then why do we use them?
No data structure is universally better than another data structure. There are benefits and drawbacks and which is better depends on what benefits and drawbacks are more important for your use case.
the insertion and deletion complexities are same
Firstly, it isn't possible to insert or delete elements of arrays at all. Their size remains constant. But I'll assume that you meant the "dynamic array" data structure, such as std::vector and java.util.ArrayList (not to be confused with a dynamically allocated array, which is also called a "dynamic array").
The insertion and deletion complexities between linked lists and dynamic array are not the same.
but for the later it is O(n) to insert at the rear end.
Given an iterator to a linked list, you can insert after the pointed-to element in constant time. Hence, if you maintain an iterator to the end of the list, you can insert there in constant time.
An important advantage of linked list over the dynamic array, besides the complexity, is that the elements remain stable in memory. In vector, if adding an element exceeds the capacity, then all iterators and references to existing elements are invalidated. Iterators as well as references to linked list elements remain valid until the element is erased. If you need this property, then using a dynamic array is not an option.
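The answer above speaks in C++ terms, but the constant-time insertion through an iterator can be sketched in Java with a ListIterator. The class and method names are invented for the example. Note one difference: Java's iterators are fail-fast, so this only works while all modifications go through the same iterator.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class TailInsert {
    // Appends 0..count-1 through a single iterator kept at the end of the list.
    static List<Integer> appendViaIterator(int count) {
        LinkedList<Integer> list = new LinkedList<>();
        // Position the cursor at the end once; each add() then links a new
        // node at the cursor without walking the list again.
        ListIterator<Integer> tail = list.listIterator(list.size());
        for (int i = 0; i < count; i++) {
            tail.add(i); // inserts before the cursor, cursor stays at the end
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(appendViaIterator(3));
    }
}
```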
Linked lists, as independent data structures, are rarely used.
Quite often, however, objects are linked into a list by incorporating next and maybe prev pointers directly into the objects themselves. As a data structure, that is often referred to as an "intrusive" linked list.
Sometimes this is used to add features to existing data structures. In Java you can see this in LinkedHashMap, which links the map entries together into a list, preserving their insertion or access order and allowing you to use it as an LRU cache. Similarly, leaf nodes in a B+tree are often linked onto a list to simplify traversal. There are many examples.
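The LRU cache use of LinkedHashMap mentioned above is usually written along these lines. This is a minimal sketch; the class name `LruCache` and the `capacity` field are assumptions made for the example, while the three-argument constructor and `removeEldestEntry` are part of LinkedHashMap itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity; // maximum number of entries to keep

    LruCache(int capacity) {
        // Third argument true selects access order instead of insertion order.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by put(); returning true evicts the least recently accessed entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // over capacity, so the eldest entry "b" is evicted
        System.out.println(cache.keySet());
    }
}
```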

Stack implementation java - LinkedList vs Vector [closed]

I wanted to know why Stack is implemented using Vector and not LinkedList. As far as I know, LinkedList provides a more efficient structure for deletion and insertion of elements. So why is Stack implemented using Vector and not LinkedList? Java implements the Queue interface with LinkedList, and since insertion and deletion are the primary operations in both a stack and a queue, why not use LinkedList for Stack?
Stack and Vector are both old classes.
If you read the Javadoc of Stack, you'll see that it suggests using Deque instead:
A more complete and consistent set of LIFO stack operations is provided by the Deque interface and its implementations, which should be used in preference to this class.
And LinkedList does implement the Deque interface.
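A small sketch of using a Deque as a stack, as the Javadoc quoted above recommends. The choice of ArrayDeque here is an assumption for the example (LinkedList would work through the same interface), and the `reverse` helper is invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DequeStack {
    // Reverses a string by pushing every character and popping them back off.
    static String reverse(String text) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : text.toCharArray()) {
            stack.push(c); // LIFO: adds at the head of the deque
        }
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) {
            out.append(stack.pop()); // removes from the head
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("abc"));
    }
}
```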
As far as I know, LinkedList provides more efficient structure for deletion and insertion of elements.
That is not actually true ... in the context of stack operations. (Or in general).
A Vector is a form of array list. It works by allocating an array to hold a number of elements, and using array indexing to access and update the list. Insertion and deletion at a random position in the Vector is expensive because it entails copying multiple element references.
However, that is not what a Stack requires. It actually requires insertion and deletion exclusively at the end of the Vector, and that is cheap. In most cases, insertion and deletion simply involve assigning an element into an array cell and adjusting the Vector object's length field. It only gets expensive if there is not enough space in the array. Then the array has to be "grown" by creating a new one and copying the elements. But when an array list grows the array exponentially (e.g. by doubling its size), the math says that the amortized cost is O(1) over the lifetime of the array list.
By contrast, every time you insert an element into a LinkedList, it involves allocating a new internal "node" object to hold the element. That is more expensive on average than a Vector insertion, especially when you take into account the GC costs incurred over the lifetime of the "node" object.
It also turns out a LinkedList uses up to 4 times as much memory per element as a Vector does, assuming we are using 64 bit references.
In short, a Vector is more efficient and uses less space than a LinkedList for a stack data structure. The correct design choice was made [1].
[1] As you would expect. We can assume that the engineers who designed and maintained Java over the last ~25 years knew what they were doing. Or that the tens of thousands of other people who have looked at that code since it was written would also have noticed a (hypothetical!) mistake of that magnitude and logged a bug report.

sort while inserting or copy and sort

I have an iterator that gives me n elements. Currently I copy them one by one into an ArrayList and then call Collections.sort() on that list to obtain a sorted ArrayList. This takes n log(n) + n operations. Is there a faster way to do it, i.e. can I already use the insertion operation to a certain degree?
The iterator does not give any sorting, the elements occur pretty much randomly.
If you have only that iterator, I don't see a faster solution. Note that n log n + n is also O(n log n).
If you want to "sort while inserting", you need to do a binary search on each insertion, which is O(n log n) comparisons in total as well (and inserting into the middle of an ArrayList additionally shifts the elements after it). I don't think it would be much faster than what you have.
TreeSet can save you from the binary search implementation, but basically it is the same logic.
Since an iterator is not a collection nor container, it is not possible to sort directly in the iterator, like you already noticed. The method that you are using seems to be the best solution in this case.
If your elements are unique you could drop them into a TreeSet and then copy them out of the TreeSet into an ArrayList. That may not actually be any faster than what you are already doing though.
Beyond that you are unlikely to be able to optimise further than you already have. Writing your own insertion sort would almost certainly be slower than just using the highly optimised Java sort routines.
You could consider looking at the new Java Streams API in Java 8 though. That would allow you to do this by opening the iterator as a stream, sorting it, then collecting it into your final collection.
http://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html
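The copy-then-sort approach from the question and the stream-based variant can be compared side by side. This is a sketch only; the class and method names are made up, and the stream version is just one way to wrap an Iterator, not necessarily a faster one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

class SortFromIterator {
    // The approach from the question: drain the iterator, then sort once.
    static <T extends Comparable<? super T>> List<T> copyAndSort(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        Collections.sort(out);
        return out;
    }

    // Stream variant: wrap the iterator in a sequential stream and sort it.
    static <T extends Comparable<? super T>> List<T> streamSort(Iterator<T> it) {
        return StreamSupport
                .stream(Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(3, 1, 2); // sample data for illustration
        System.out.println(copyAndSort(input.iterator()));
        System.out.println(streamSort(input.iterator()));
    }
}
```

Both variants still buffer all n elements before sorting, so the overall cost remains O(n log n).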
If your array holds objects rather than primitive types (such as int or double), the cost of copying the objects must be considered. In that situation, sorting an array of indices may be a better approach. Using a search data structure such as a map or set is better only when you need to sort and insert simultaneously.

When to use each Java Collections data structure [duplicate]

This question already has answers here:
When to use LinkedList over ArrayList in Java?
(33 answers)
Closed 8 years ago.
I see that there are a ton of generic data structures provided in Java. They all implement List, so they can be used almost interchangeably, but when would I want to use each? Personally, I stick to LinkedList because it's something I'm "familiar" with. I'm not asking for an explanation of every single structure, but can you explain some of the more common ones and give their uses, as well as compare and contrast the uses of "Vector-like" structures?
It depends on the performance characteristics and behavior you are looking for.
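For example, in a LinkedList, add, delete, and retrieve are O(1), O(1), and O(n), provided the add or delete happens at either end or at a position already reached with an iterator. For an ArrayList, the same operations at an arbitrary position are O(n), O(n), and O(1) when retrieving with get(int); appending at the end is amortized O(1), and searching for an element with indexOf(Object) or contains(Object) is O(n). However, ArrayList uses less memory per entry than LinkedList.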
One often uses Vector<type> to add elements to the structure that are part of the same collection, but do not have any relationship to other members (other than being part of the same collection). A LinkedList indicates that there is some sort of ordering that is important among the members of the collection.

What are the pros and cons of a TreeSet [closed]

Just wondering what the pros and cons of a TreeSet are, if anyone could tell me please? Thanks!
One of the Collection classes. It lets you access the elements in your collection by key, or sequentially by key. It has considerably more overhead than ArrayList or HashMap. Use HashSet when you don’t need sequential access, just lookup by key. Use an ArrayList with Collections.sort (or an array with Arrays.sort) if you just want the elements in order. TreeSet keeps the elements in order at all times. With ArrayList you just sort when you need to.
With TreeSets the key must be embedded in the object you store in the collection. Often you might have a TreeSet of Strings. All you can do then is tell if a given String is in the Set. It won’t find you an associated object the way a TreeMap will. With a TreeMap the keys and the objects they are associated with are separate.
TreeSet and its sibling TreeMap, oddly, have nothing to do with representing trees. Internally they use a tree organisation to give you a sorted Set/Map (alphabetical for Strings, for example), but you have no control over the links between parents and children.
Internally TreeSet uses red-black trees. There is no need to presort the data to get a well-balanced tree. On the other hand, if the data are sorted (ascending or descending), it won’t hurt as it does with some other types of tree.
If you don’t supply a Comparator to define the ordering you want, TreeSet requires a Comparable implementation on the item class to define the natural order.
Cons: One pitfall with TreeSet is that it implements the Set interface in an unexpected way.
If a TreeSet contains object a, then object b is considered part of the set if a.compareTo(b) returns 0, even if a.equals(b) is false, so if compareTo and equals aren't implemented in a consistent way, you are in for a bad ride.
This is especially a problem when a method returns a Set, and you don't know if the implementation is a TreeSet or, for instance, a HashSet.
The lesson to learn here is, always avoid implementing compareTo and equals inconsistently. If you need to order objects in a way that is inconsistent with equals, use a Comparator.
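A well-known case of this inconsistency in the standard library is BigDecimal, where `1.0` and `1.00` compare as equal under compareTo but are not equal under equals. The sketch below shows how the same calls behave differently depending on the Set implementation; the class and method names are invented for the example.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class CompareToVsEquals {
    // Adds 1.0 to the given set, then asks whether it contains 1.00.
    static boolean containsAfterAdd(Set<BigDecimal> set) {
        set.add(new BigDecimal("1.0"));
        return set.contains(new BigDecimal("1.00"));
    }

    public static void main(String[] args) {
        // TreeSet decides membership with compareTo, which ignores the scale.
        System.out.println(containsAfterAdd(new TreeSet<BigDecimal>()));
        // HashSet decides membership with hashCode/equals, which include the scale.
        System.out.println(containsAfterAdd(new HashSet<BigDecimal>()));
    }
}
```

A caller that only sees the Set interface cannot tell which of the two answers it will get, which is the pitfall described above.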
TreeSet:
Pros: sorted, based on a red/black tree algorithm, provides O(log(N)) complexity for operations.
Cons: value must either be Comparable or you need to provide Comparator in the constructor. Moreover, the HashSet implementation provides better performance as it provides ~O(1) complexity.
TreeSet fragments memory and has additional memory overheads. You can look at the sources and calculate amount of additional memory and amount of additional objects it creates. Of course it depends on the nature of stored objects and you also can suspect me to be paranoiac about memory :) but it's better to not spend it here and there - you have GC, you have cache misses and all of these things are slooow.
Often you can use PriorityQueue instead of TreeSet. And in your typical use case it's better just to sort the array of strings.
I guess this data structure uses a binary tree to maintain the data so that retrieval in ascending order is possible. In that case, if it tries to keep the tree balanced, the remove operation would be somewhat costly.
