What is the difference between Collection and List in Java? - java

What is the difference between Collection and List in Java? When should I use which?

First off: a List is a Collection. It is a specialized Collection, however.
A Collection is just that: a collection of items. You can add stuff, remove stuff, iterate over stuff and query how much stuff is in there.
A List adds the information about a defined sequence of stuff to it: You can get the element at position n, you can add an element at position n, you can remove the element at position n.
In a Collection you can't do that: "the 5th element in this collection" isn't defined, because there is no defined order.
There are other specialized Collections as well, for example a Set which adds the feature that it will never contain the same element twice.
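To make the distinction concrete, here is a small sketch (class and method names are made up for illustration, and `List.of` assumes Java 9 or newer). One method only needs what every Collection offers; the other needs positional access and therefore asks for a List.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

class CollectionVsList {
    // Works with any Collection: it only iterates over the elements.
    static int totalLength(Collection<String> items) {
        int total = 0;
        for (String item : items) {
            total += item.length();
        }
        return total;
    }

    // Needs a List, because it relies on positional access.
    static String middle(List<String> items) {
        return items.get(items.size() / 2);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(List.of("a", "bb", "ccc"));
        Collection<String> set = new HashSet<>(list);

        System.out.println(totalLength(list)); // 6
        System.out.println(totalLength(set));  // 6
        System.out.println(middle(list));      // bb
        // middle(set) would not compile: a Set has no notion of position.

        list.add(1, "x"); // insert at position 1 (List only)
        list.remove(0);   // remove the element at position 0 (List only)
        System.out.println(list); // [x, bb, ccc]
    }
}
```

Declaring a parameter as Collection keeps the method usable with Sets and Queues too; declaring it as List is a way of saying that order and indices matter.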

Collection is the root interface of the Java Collections hierarchy. List is one subinterface, which defines an ordered Collection; other subinterfaces include Queue, which typically holds elements waiting to be processed (its subinterface Deque can also serve as a stack).
The following diagram demonstrates the relationship between the different java collection types:

The Java API documentation answers this best:
Collection
The root interface in the collection hierarchy. A collection represents a group of objects, known as its elements. Some collections allow duplicate elements and others do not. Some are ordered and others unordered. The JDK does not provide any direct implementations of this interface: it provides implementations of more specific subinterfaces like Set and List. This interface is typically used to pass collections around and manipulate them where maximum generality is desired.
List (extends Collection)
An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list. Unlike sets, lists typically allow duplicate elements. More formally, lists typically allow pairs of elements e1 and e2 such that e1.equals(e2), and they typically allow multiple null elements if they allow null elements at all. It is not inconceivable that someone might wish to implement a list that prohibits duplicates, by throwing runtime exceptions when the user attempts to insert them, but we expect this usage to be rare.

List and Set are two subinterfaces of Collection.
In a List, data is kept in a particular order.
A Set cannot contain the same data twice.
A Collection by itself just stores data; it guarantees no particular order and may or may not allow duplicate data, depending on the implementation.

Collection is the superinterface of List, so every Java List is also an instance of Collection. A plain Collection can only be traversed sequentially through its iterator (with no order guaranteed by the interface itself), whereas a List allows access to an element at a certain position via the get(int index) method.

Collection is the main interface of Java Collections hierarchy and List(Sequence) is one of the sub interfaces that defines an ordered collection.

Collection is a high-level interface describing Java objects that can contain collections of other objects. It's not very specific about how they are accessed, whether multiple copies of the same object can exist in the same collection, or whether the order is important. List is specifically an ordered collection of objects. If you put objects into a List in a particular order, they will stay in that order.
And deciding where to use these two interfaces is much less important than deciding what the concrete implementation you use is. This will have implications for the time and space performance of your program. For example, if you want a list, you could use an ArrayList or a LinkedList, each of which is going to have implications for the application. For other collection types (e.g. Sets), similar considerations apply.
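As a rough sketch of that point (the class and method names are hypothetical), code written against the List interface works unchanged with either implementation; what differs between the two is the time and space cost, not the result.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChooseImplementation {
    // The caller picks the implementation; this method only depends on List.
    static List<Integer> fill(List<Integer> target, int count) {
        for (int i = 0; i < count; i++) {
            target.add(i);
        }
        return target;
    }

    public static void main(String[] args) {
        List<Integer> arrayBacked = fill(new ArrayList<>(), 5);
        List<Integer> linked = fill(new LinkedList<>(), 5);

        // Same contents and same order; only the performance characteristics differ.
        System.out.println(arrayBacked.equals(linked)); // true
        System.out.println(arrayBacked.get(3));         // 3 (direct array access)
        System.out.println(linked.get(3));              // 3 (found by walking the links)
    }
}
```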

Related

List backed up by a HashSet

I need to store objects in a collection in the order they were added, that's why I need a List. However, the list should contain no duplicates. I also need to quickly determine if an object already exists in the collection. Instead of iterating the list every time, it would be better to have something like a HashSet. I can quickly both find and add elements and preserve the insertion order.
The question is - should I:
1. extend ArrayList by adding a HashSet field?
2. implement one of the Java collection interfaces (List or Set)?
3. simply create a new class with two fields - ArrayList and HashSet?
The 1st option has the disadvantage - I don't need all of the ArrayList methods, so I'd have to override all of them so that users of my class don't call base class methods that would simply mess things up (for instance, one could remove an object from the list but the object would still exist in the set). And there's no way to remove the base class methods (except from overriding it and throwing an exception).
Similarly for 2, I'd really have to implement all methods of the interface.
The 3rd option looks the best to me, but it makes the code implementation dependent, because my class doesn't implement any interface.
What should I do in this case? I'd like to have all add methods the List interface has. - LinkedHashSet is not an option.
You could use a LinkedHashSet, which is a Set implementation that ensures that the iteration order is the same as the order in which you added the elements.
Hash table and linked list implementation of the Set interface, with predictable iteration order. ... This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order).
No need to implement anything on your own. Use LinkedHashSet, which maintains encounter order.
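A minimal sketch of what the answers describe (the helper name is invented for illustration, and `List.of` assumes Java 9 or newer): a LinkedHashSet drops duplicates while keeping first-insertion order. It does not offer the positional add methods of List that the question asks for.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class InsertionOrderedSet {
    // Returns the distinct elements of the input in first-seen order.
    static List<String> distinctInOrder(List<String> input) {
        Set<String> seen = new LinkedHashSet<>();
        for (String s : input) {
            seen.add(s); // returns false and changes nothing for a duplicate
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> result = distinctInOrder(List.of("pear", "apple", "pear", "fig", "apple"));
        System.out.println(result); // [pear, apple, fig]
    }
}
```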

Properly defining collection in java

While reading collections in java and browsing some of the questions on stackoverflow, I came across this question:
Method for adding Objects into a fixed collection(Array) in Java
Here an Array has been referred to as fixed collection. Conceptually, is it legitimate to call an Array a 'fixed collection' or is it a self-contradicting phrase?
A collection framework is basically a framework to store and retrieve the collection of java objects efficiently.
A very good link about overview of data structure is here
As per this link
There are fourteen collection interfaces. The most basic interface is Collection. These interfaces extend Collection: Set, List, SortedSet, NavigableSet, Queue, Deque,
BlockingQueue and BlockingDeque.
The other collection interfaces, Map, SortedMap, NavigableMap, ConcurrentMap and ConcurrentNavigableMap do not extend Collection, as they represent mappings
rather than true collections. However, these interfaces contain collection-view operations, which allow them to be manipulated as collections.
Now, coming back to the array: it's not part of the collections framework, but logically it's a collection, as it can store a collection of objects. Even if you develop a custom class that can store a bunch of objects, you can logically call it a collection object.
An array is a collection if you define a collection as a container of elements.
Of course an array does not implement the Collection interface, but calling Arrays.asList(arr) on an array actually gives you a fixed size List view of that array, so you can say an array is almost equivalent to a fixed length random access List (A List is a Collection).
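The fixed-size view can be tried out with a short sketch like the following (the helper `rejectsAdd` is made up for illustration): replacing an element writes through to the array, while changing the size is refused.

```java
import java.util.Arrays;
import java.util.List;

class FixedSizeView {
    // Returns true if the list refuses to grow, i.e. add() is unsupported.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] arr = {"a", "b", "c"};
        List<String> view = Arrays.asList(arr);

        view.set(0, "z");                     // writes through to the array
        System.out.println(arr[0]);           // z
        System.out.println(rejectsAdd(view)); // true
    }
}
```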

Which Java Collection should I use?

In this question How can I efficiently select a Standard Library container in C++11? is a handy flow chart to use when choosing C++ collections.
I thought that this was a useful resource for people who are not sure which collection they should be using so I tried to find a similar flow chart for Java and was not able to do so.
What resources and "cheat sheets" are available to help people choose the right Collection to use when programming in Java? How do people know what List, Set and Map implementations they should use?
Since I couldn't find a similar flowchart I decided to make one myself.
This flow chart does not try and cover things like synchronized access, thread safety etc or the legacy collections, but it does cover the 3 standard Sets, 3 standard Maps and 2 standard Lists.
This image was created for this answer and is licensed under a Creative Commons Attribution 4.0 International License. The simplest attribution is by linking to either this question or this answer.
Other resources
Probably the most useful other reference is the following page from the oracle documentation which describes each Collection.
HashSet vs TreeSet
There is a detailed discussion of when to use HashSet or TreeSet here:
Hashset vs Treeset
ArrayList vs LinkedList
Detailed discussion: When to use LinkedList over ArrayList?
Summary of the major non-concurrent, non-synchronized collections
Collection: An interface representing a "bag" of items, called "elements", with no ordering guarantee of its own. The "next" element is unspecified unless a subinterface or implementation defines it.
Set: An interface representing a Collection with no duplicates.
HashSet: A Set backed by a hash table (internally a HashMap). Fastest and smallest memory usage, when ordering is unimportant.
LinkedHashSet: A HashSet with the addition of a linked list to associate elements in insertion order. The "next" element is the next-most-recently inserted element.
TreeSet: A Set where elements are ordered by a Comparator (typically natural ordering). Slowest and largest memory usage, but necessary for comparator-based ordering.
EnumSet: An extremely fast and efficient Set customized for a single enum type.
List: An interface representing a Collection whose elements are ordered and each have a numeric index representing its position, where zero is the first element, and (length - 1) is the last.
ArrayList: A List backed by an array, where the array has a length (called "capacity") that is at least as large as the number of elements (the list's "size"). When an element is added to a full array, the array is replaced by a larger one (the growth policy is not specified, but current OpenJDK implementations grow the capacity by roughly 50%); the copy is fast, since it uses System.arraycopy(). Deleting and inserting elements requires all neighboring elements (to the right) to be shifted into or out of that space. Accessing any element is fast, as it only requires the calculation (element-zero-address + desired-index * element-size) to find its location. In most situations, an ArrayList is preferred over a LinkedList.
LinkedList: A List backed by a set of objects, each linked to its "previous" and "next" neighbors. A LinkedList is also a Queue and Deque. Accessing elements is done starting at the first or last element, and traversing until the desired index is reached. Insertion and deletion, once the desired index is reached via traversal is a trivial matter of re-mapping only the immediate-neighbor links to point to the new element or bypass the now-deleted element.
Map: An interface representing a group of key-value pairs, where each value is identified by its "key". Unlike the other interfaces listed here, Map does not extend Collection.
HashMap: A Map where keys are unordered, backed by a hash table.
LinkedHashMap: A HashMap whose keys are ordered by insertion order.
TreeMap: A Map where keys are ordered by a Comparator (typically natural ordering).
Queue: An interface that represents a Collection where elements are, typically, added to one end, and removed from the other (FIFO: first-in, first-out).
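Stack: Not an interface in Java but a legacy class (it extends Vector) representing a collection where elements are, typically, both added (pushed) and removed (popped) from the same end (LIFO: last-in, first-out). Its Javadoc recommends using a Deque implementation instead.
Deque: Short for "double ended queue", usually pronounced "deck". An interface for a queue that is typically only added to and read from at either end (not the middle); ArrayDeque and LinkedList are the standard implementations.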
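The ordering differences between the Set implementations in the summary above can be seen in a short sketch (the class and helper names are invented, and `List.of` assumes Java 9 or newer):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class SetOrdering {
    // Copies the iteration order of any collection into a list for easy comparison.
    static List<String> iterationOrder(Collection<String> c) {
        return new ArrayList<>(c);
    }

    public static void main(String[] args) {
        List<String> input = List.of("cherry", "apple", "banana", "apple");

        // Insertion order, duplicates dropped.
        System.out.println(iterationOrder(new LinkedHashSet<>(input))); // [cherry, apple, banana]
        // Natural (alphabetical) order, duplicates dropped.
        System.out.println(iterationOrder(new TreeSet<>(input)));       // [apple, banana, cherry]
        // A HashSet of the same input holds the same three elements,
        // but its iteration order is unspecified.
    }
}
```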
Basic collection diagrams:
Comparing the insertion of an element with an ArrayList and LinkedList:
Even simpler picture is here. Intentionally simplified!
Collection is anything holding data called "elements" (of the same type). Nothing more specific is assumed.
List is an indexed collection of data where each element has an index. Something like the array, but more flexible.
Data in the list keep the order of insertion.
Typical operation: get the n-th element.
Set is a bag of elements, each element occurring just once (the elements are distinguished using their equals() method).
Data in the set are stored mostly just to know what data are there.
Typical operation: tell if an element is present in the set.
Map is something like the List, but instead of accessing the elements by their integer index, you access them by their key, which is any object. Like the array in PHP :)
Data in Map are searchable by their key.
Typical operation: get an element by its ID (where ID is of any type, not only int as in case of List).
The differences
Set vs. Map: in Set you search data by themselves, whilst in Map by their key.
N.B. The standard library's HashSet, LinkedHashSet and TreeSet are indeed implemented exactly like this: a map where the keys are the Set elements themselves, and with a dummy value.
List vs. Map: in List you access elements by their int index (position in List), whilst in Map by their key, which is of any type (typically: ID)
List vs. Set: in List the elements are bound by their position and can be duplicate, whilst in Set the elements are just "present" (or not present) and are unique (in the meaning of equals(), or compareTo() for SortedSet)
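The typical operations named above might look like this in code (a sketch with made-up names; `List.of` assumes Java 9 or newer):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TypicalOperations {
    // Builds a Map from each word (the key) to how often it occurs (the value).
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = List.of("to", "be", "or", "not", "to", "be");
        Set<String> unique = new HashSet<>(words);
        Map<String, Integer> counts = countWords(words);

        System.out.println(words.get(2));           // List: n-th element -> or
        System.out.println(unique.contains("not")); // Set: is it present? -> true
        System.out.println(unique.size());          // duplicates collapsed -> 4
        System.out.println(counts.get("to"));       // Map: lookup by key -> 2
    }
}
```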
It is simple: if you need to store values with keys mapped to them go for the Map interface, otherwise use List for values which may be duplicated and finally use the Set interface if you don’t want duplicated values in your collection.
Here is the complete explanation http://javatutorial.net/choose-the-right-java-collection , including flowchart etc
Map
If choosing a Map, I made this table summarizing the features of each of the ten implementations bundled with Java 11.

Two java.util.Iterators to the same collection: do they have to return elements in the same order?

This is more of a theoretical question. If I have an arbitrary collection c that isn't ordered and I obtain two java.util.Iterators by calling c.iterator() twice, do both iterators have to return c's elements in the same order?
I mean, in practice they probably always will, but are they forced to do so by contract?
Thanks,
Jan
No they are not.
"There are no guarantees concerning the order in which the elements are
returned (unless this collection is an instance of some class that
provides a guarantee)."
See the Collection#iterator api contract.
That includes from one iterator to the next (as it doesn't say anything about requiring that).
Also consider that something could have changed in the underlying collection between getting those two iterators! Something added or removed.
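The Iterator implementation is provided by the specific Collection class. The iterator of a List returns the elements in list order, while the iterator of a general Set makes no such promise.
Many data structures are not ordered by default, so it is not certain that they will iterate in the same order every time.
If you need a stable order, use a collection that guarantees one (a List, a LinkedHashSet, or a sorted collection such as TreeSet), or copy the elements into a List and sort it.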
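As an illustration (the helper names are invented, and `List.of` assumes Java 9 or newer), the check below obtains two iterators back to back and compares what they return. For types that document their iteration order the result is guaranteed; for an arbitrary Collection it is only an observation, not part of the contract.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;

class IteratorOrder {
    // Drains an iterator into a list so that two traversals can be compared.
    static <T> List<T> drain(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    // True if two iterators obtained back to back yield the same sequence.
    static <T> boolean sameOrderTwice(Collection<T> c) {
        return drain(c.iterator()).equals(drain(c.iterator()));
    }

    public static void main(String[] args) {
        // List and LinkedHashSet document their iteration order.
        System.out.println(sameOrderTwice(List.of(3, 1, 2)));                      // true
        System.out.println(sameOrderTwice(new LinkedHashSet<>(List.of(3, 1, 2)))); // true
    }
}
```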

Do Maps belong to the Collection Framework?

Maps in Java do not inherit from the interface "Collection", though in most online tutorials Maps are explained in the same category as Sets, Lists and Queues.
Do the Maps nevertheless belong to the Collection Framework?
The best description of the Collection is at the beginning of Java Collection Tutorial.
A collection — sometimes called a container — is simply an object that groups multiple elements into a single unit. Collections are used to store, retrieve, manipulate, and communicate aggregate data. Typically, they represent data items that form a natural group, such as a poker hand (a collection of cards), a mail folder (a collection of letters), or a telephone directory (a mapping of names to phone numbers).
Furthermore, the tutorial lists the core collection interfaces, all of which follow the paradigm stated above:
The following list describes the core collection interfaces:
Collection — the root of the collection hierarchy. A collection represents a group of objects known as its elements. The Collection interface is the least common denominator that all collections implement and is used to pass collections around and to manipulate them when maximum generality is desired. Some types of collections allow duplicate elements, and others do not. Some are ordered and others are unordered. The Java platform doesn't provide any direct implementations of this interface but provides implementations of more specific subinterfaces, such as Set and List. Also see The Collection Interface section.
Set — a collection that cannot contain duplicate elements. This interface models the mathematical set abstraction and is used to represent sets, such as the cards comprising a poker hand, the courses making up a student's schedule, or the processes running on a machine. See also The Set Interface section.
List — an ordered collection (sometimes called a sequence). Lists can contain duplicate elements. The user of a List generally has precise control over where in the list each element is inserted and can access elements by their integer index (position). If you've used Vector, you're familiar with the general flavor of List. Also see The List Interface section.
Queue — a collection used to hold multiple elements prior to processing. Besides basic Collection operations, a Queue provides additional insertion, extraction, and inspection operations.
Queues typically, but do not necessarily, order elements in a FIFO (first-in, first-out) manner. Among the exceptions are priority queues, which order elements according to a supplied comparator or the elements' natural ordering. Whatever the ordering used, the head of the queue is the element that would be removed by a call to remove or poll. In a FIFO queue, all new elements are inserted at the tail of the queue. Other kinds of queues may use different placement rules. Every Queue implementation must specify its ordering properties. Also see The Queue Interface section.
Map — an object that maps keys to values. A Map cannot contain duplicate keys; each key can map to at most one value. If you've used Hashtable, you're already familiar with the basics of Map. Also see The Map Interface section.
So a Map counts as a collection in the tutorial's sense, and belongs to the Collections Framework, although it does not implement the Collection interface.
The Map interface is not an extension of the Collection interface. Instead, it starts its own interface hierarchy for maintaining key-value associations.
Check out the official tutorial, especially the Lesson: Interfaces:
[...]Core collection interfaces are the foundation of the Java Collections Framework. As you can see in the following figure, the core collection interfaces form a hierarchy.
and further:
The following list describes the core collection interfaces:
Collection [...]
Set [...]
List [...]
Queue [...]
Map [...]
Conceptually maps are definitely collections, have been since Smalltalk. Java's type-hierarchy is not meant to manage conceptual relations but rather a pragmatic relationship, specifically to say which methods have to be implemented.
For map-like collections these are very different than for non-map-like ones. For example, maps have to have put(key, value) and get(key) (or similar, if you are working with association objects), whereas non-map-like collections have to have iterator() and add().
The reason is that Collections work with a set of values, whereas Maps work with key-value pairs.
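Although Map does not extend Collection, its keys, values and entries can be handled as collections through view methods. A small sketch (the class name and sample data are made up):

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class MapViews {
    // Sums the values of a map through its Collection view.
    static int sumValues(Map<String, Integer> map) {
        int sum = 0;
        for (int v : map.values()) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = new LinkedHashMap<>();
        stock.put("apple", 3);
        stock.put("pear", 5);

        Set<String> keys = stock.keySet();           // a Set view of the keys
        Collection<Integer> values = stock.values(); // a Collection view of the values

        System.out.println(keys);             // [apple, pear]
        System.out.println(values);           // [3, 5]
        System.out.println(sumValues(stock)); // 8

        keys.remove("apple");      // the view writes through to the map
        System.out.println(stock); // {pear=5}
    }
}
```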
