I have data in which the sequence is as important as the uniqueness of its elements. That is, if something has already been added it should not be added again, and the sequence must be remembered.
A Set does not remember the sequence in which elements were added (it either hashes or sorts them), and a List does not enforce uniqueness.
What is the best solution to this problem?
Should one have a list and loop through it to test for uniqueness - which I'm trying to avoid?
Or should one have two collections, one a List and one a Set - which I'm also trying to avoid?
Or is there a different solution to this problem altogether?
The code below is for your reference:
LinkedHashSet<String> al=new LinkedHashSet<String>();
al.add("guru");
al.add("karthik");
al.add("raja");
al.add("karthik");
Iterator<String> itr = al.iterator();
while (itr.hasNext()) {
    System.out.println(itr.next());
}
Output:
guru
karthik
raja
Use LinkedHashSet. It combines the uniqueness of a Set with the predictable ordering you would expect from a List (although it does not implement List and has no index-based access). It remembers the order in which you inserted items, which lets you iterate over it in insertion order.
From the Docs:
Hash table and linked list implementation of the Set interface, with predictable iteration order. This implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order). Note that insertion order is not affected if an element is re-inserted into the set. (An element e is reinserted into a set s if s.add(e) is invoked when s.contains(e) would return true immediately prior to the invocation.)
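As a minimal sketch of the behaviour described in the docs, the following removes duplicates from a list while keeping the order of first occurrence. The class and method names (OrderedUnique, distinctInOrder) are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class OrderedUnique {
    // Hypothetical helper: returns the distinct elements of 'items'
    // in the order in which each element first appeared.
    static <T> List<T> distinctInOrder(List<T> items) {
        // Re-inserting an element that is already present does not move it.
        Set<T> seen = new LinkedHashSet<>(items);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("guru", "karthik", "raja", "karthik");
        System.out.println(distinctInOrder(input)); // prints [guru, karthik, raja]
    }
}
```

This keeps a single collection, so there is no need to loop over a List to test for uniqueness or to maintain a List and a Set side by side.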
You can use SortedSet (if you want the elements sorted) or LinkedHashSet (if you want them in insertion order).
LinkedHashSet is the best option here.
Related
Does a Java Set retain order? A method is returning a Set to me and supposedly the data is ordered but iterating over the Set, the data is unordered. Is there a better way to manage this? Does the method need to be changed to return something other than a Set?
The Set interface does not provide any ordering guarantees.
Its sub-interface SortedSet represents a set that is sorted according to some criterion. In Java 6, there are two standard containers that implement SortedSet. They are TreeSet and ConcurrentSkipListSet.
In addition to the SortedSet interface, there is also the LinkedHashSet class. It remembers the order in which the elements were inserted into the set, and returns its elements in that order.
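To make the difference concrete, here is a small sketch that feeds the same elements to a LinkedHashSet and to a TreeSet. The class and method names are hypothetical; only the java.util classes come from the text above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class SetOrderDemo {
    // Distinct elements in the order they were first inserted.
    static List<String> insertionOrder(List<String> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }

    // Distinct elements in their natural (here: alphabetical) order.
    static List<String> sortedOrder(List<String> items) {
        return new ArrayList<>(new TreeSet<>(items));
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("pear", "apple", "fig", "apple");
        System.out.println(insertionOrder(items)); // prints [pear, apple, fig]
        System.out.println(sortedOrder(items));    // prints [apple, fig, pear]
    }
}
```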
LinkedHashSet is what you need.
As many of the members suggested, use LinkedHashSet to retain the order of the collection.
You can wrap your set using this implementation.
A SortedSet implementation can be used for sorted order, but for your purpose use LinkedHashSet.
Also from the docs,
"This implementation spares its clients from the unspecified, generally chaotic ordering provided by HashSet, without incurring the increased cost associated with TreeSet. It can be used to produce a copy of a set that has the same order as the original, regardless of the original set's implementation:"
Source : http://docs.oracle.com/javase/6/docs/api/java/util/LinkedHashSet.html
Set is just an interface. In order to retain order, you have to use a specific implementation of that interface, such as LinkedHashSet (insertion order), or an implementation of its sub-interface SortedSet, such as TreeSet (sorted order). You can wrap your Set this way:
Set myOrderedSet = new LinkedHashSet(mySet);
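A hedged sketch of what such a wrap does: the copy iterates in whatever order the source set's iterator produced at copy time, and later additions go to the end. Copying a HashSet this way therefore only freezes its current order; it cannot recover the original insertion order. The names CopyKeepsOrder and orderedCopy are invented for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

class CopyKeepsOrder {
    // Hypothetical helper: copies 'source' into a LinkedHashSet, which
    // then keeps the iteration order the source had at copy time.
    static <T> Set<T> orderedCopy(Set<T> source) {
        return new LinkedHashSet<>(source);
    }

    public static void main(String[] args) {
        Set<Integer> sorted = new TreeSet<>(Arrays.asList(3, 1, 2));
        Set<Integer> copy = orderedCopy(sorted);
        copy.add(0); // new elements are appended, not sorted in
        System.out.println(copy); // prints [1, 2, 3, 0]
    }
}
```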
To retain the order use List or a LinkedHashSet.
Here is a quick summary of the order characteristics of the standard Set implementations available in Java:
1. keep the insertion order: LinkedHashSet and CopyOnWriteArraySet (thread-safe)
2. keep the items sorted within the set: TreeSet, EnumSet (specific to enums) and ConcurrentSkipListSet (thread-safe)
3. does not keep the items in any specific order: HashSet (the one you tried)
For your specific case, you can either sort the items first and then use any of 1 or 2 (most likely LinkedHashSet or TreeSet). Or alternatively and more efficiently, you can just add unsorted data to a TreeSet which will take care of the sorting automatically for you.
A LinkedHashSet is an ordered version of HashSet that maintains a doubly-linked List across all elements. Use this class instead of HashSet when you care about the iteration order.
From the javadoc for Set.iterator():
Returns an iterator over the elements in this set. The elements are returned in no particular order (unless this set is an instance of some class that provides a guarantee).
And, as already stated by shuuchan, a TreeSet is an implementation of Set that has a guaranteed order:
The elements are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.
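The two constructor choices mentioned in that quote can be sketched as follows. The class and method names are hypothetical, and Collections.reverseOrder() is just one convenient Comparator picked for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class TreeSetOrdering {
    // Natural ordering: Integer implements Comparable, so no Comparator is needed.
    static List<Integer> ascending(List<Integer> values) {
        return new ArrayList<>(new TreeSet<>(values));
    }

    // Ordering supplied by a Comparator at set creation time.
    static List<Integer> descending(List<Integer> values) {
        Comparator<Integer> reverse = Collections.reverseOrder();
        TreeSet<Integer> set = new TreeSet<>(reverse);
        set.addAll(values);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(2, 0, 1, 2);
        System.out.println(ascending(values));  // prints [0, 1, 2]
        System.out.println(descending(values)); // prints [2, 1, 0]
    }
}
```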
Normally a set, such as HashSet, does not keep the order, because it is organized to find an element quickly. You can try LinkedHashSet instead; it keeps the elements in the order in which you put them in.
There are 2 different things.
Sort the elements in a set. For which we have SortedSet and similar implementations.
Maintain insertion order in a set. For which LinkedHashSet and CopyOnWriteArraySet (thread-safe) can be used.
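For the thread-safe option named above, a minimal sketch could look like this. The class and method names are invented for the example. Note that CopyOnWriteArraySet copies its backing array on every modification, so it is generally suited to small sets that are read far more often than they are changed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

class CowSetDemo {
    // Hypothetical helper: collects 'items' into a CopyOnWriteArraySet,
    // which rejects duplicates and iterates in the order elements were added.
    static List<String> uniqueInInsertionOrder(List<String> items) {
        Set<String> set = new CopyOnWriteArraySet<>();
        for (String item : items) {
            set.add(item);
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(uniqueInInsertionOrder(Arrays.asList("b", "a", "b", "c"))); // prints [b, a, c]
    }
}
```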
The Set interface itself does not stipulate any particular order. The SortedSet does however.
The Iterator returned by a Set is not required to return the data in any particular order.
See this question: Two java.util.Iterators to the same collection: do they have to return elements in the same order?
Only a SortedSet keeps the elements of the Set sorted.
When I add values to a Set<Integer>, the elements come out sorted.
Please refer to this example:
Set<Integer> generated = new HashSet<Integer>();
generated.add(2);
generated.add(1);
generated.add(0);
Here I get the sorted Set [0, 1, 2]. I would like to get the values in the order in which I added them to the generated object.
A HashSet does not have a predictable order for elements. Use a LinkedHashSet to preserve insertion order of elements in a set:
Hash table and linked list implementation of the Set interface, with predictable iteration order.
Set<Integer> generated = new LinkedHashSet<Integer>();
generated.add(2);
generated.add(1);
generated.add(0);
First, it's just a coincidence that you get sorted values here. The hash code of an Integer is its own value, so in common HashSet implementations small integers happen to land in ascending buckets; with other values the output will be in no recognizable order. A HashSet doesn't enforce any ordering on the elements you add.
Now to get the elements in the order you inserted, you can use LinkedHashSet, that maintains the insertion order.
The HashSet does not guarantee the order of the elements. From the JavaDoc:
It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time.
So, in order to guarantee the order, a LinkedHashSet can be used. From the JavaDoc:
Hash table and linked list implementation of the Set interface, with predictable iteration order.
This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order).
Simply instantiate your Set like this:
Set<Integer> generated = new LinkedHashSet<>();
First, regarding the title of your question: Set<Integer> is only the declared type and is not responsible for any sorting or non-sorting behavior. The main reason for using the Set interface is uniqueness, since it doesn't allow duplicates. From the Javadoc:
A Set is a Collection that cannot contain duplicate elements.
Second, it's pure coincidence that you got a sorted set. Use HashSet when you don't care about the order when iterating through it. From the Javadoc:
It makes no guarantees as to the iteration order of the set; in
particular, it does not guarantee that the order will remain constant
over time. This class permits the null element.
Third, regarding what you are looking for:
I would like to get the values in the order in which I added them to the generated object.
then you need to use LinkedHashSet which takes care of the order in which elements were inserted, again from javadocs:
This linked list defines the iteration ordering, which is the order in
which elements were inserted into the set (insertion-order). Note that
insertion order is not affected if an element is re-inserted into the
set
you may use it simply like this:
Set<Integer> generated = new LinkedHashSet<Integer>();
Fourth and last, as additional information, another important collection to be aware of is TreeSet, which guarantees that the elements will be sorted in ascending order according to their natural ordering. From the Javadoc:
The elements are ordered using their natural ordering, or by a
Comparator provided at set creation time, depending on which
constructor is used
Which Java collection doesn't allow duplicates and also preserves the insertion order of the data?
LinkedHashSet
As per the documentation
This implementation differs from HashSet in that it maintains a
doubly-linked list running through all of its entries. This linked
list defines the iteration ordering, which is the order in which
elements were inserted into the set (insertion-order)
LinkedHashSet does both of them
Set set = new LinkedHashSet();
A LinkedHashSet should fit the bill.
Hash table and linked list implementation of the Set interface, with predictable iteration order. This implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order).
You can check LinkedHashSet for this purpose.
A Set will not allow duplicate values. And LinkedHashSet will preserve insertion order.
Hash table and linked list implementation of the Set interface, with
predictable iteration order. This implementation differs from HashSet
in that it maintains a doubly-linked list running through all of its
entries. This linked list defines the iteration ordering, which is the
order in which elements were inserted into the set (insertion-order).
Note that insertion order is not affected if an element is re-inserted
into the set. (An element e is reinserted into a set s if s.add(e) is
invoked when s.contains(e) would return true immediately prior to the
invocation.)
Use
public class LinkedHashSet<E> extends HashSet<E>
Basically Set won't allow duplicates and
This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order)
http://docs.oracle.com/javase/6/docs/api/java/util/LinkedHashSet.html
You want an ordered set, which is implemented by LinkedHashSet.
I have a java ArrayList to which I add 5 objects.
If I iterate over the list and print the objects, then iterate over the list and print them again, will the retrieval order be the same in both cases? (I know it may be different from the insertion order)
Yes, assuming you haven't modified the list in-between. From http://docs.oracle.com/javase/6/docs/api/java/util/List.html:
iterator
Iterator<E> iterator()
Returns an iterator over the elements in this list in proper sequence.
A bit vague, perhaps, but in other portions of that page, this term is defined:
proper sequence (from first to last element)
(I know it may be different from the insertion order)
No it won't. The contract of List requires that the add order is the same as the iteration order, since add inserts at the end, and iterator produces an iterator that iterates from start to end in order.
Set doesn't require this, so you may be confusing the contract of Set and List regarding iteration order.
From the Javadoc:
Iterator<E> iterator()
Returns an iterator over the elements in this list in proper sequence.
It's in the specification of the List interface to preserve order.
It's the Set classes that don't preserve order.
If you're not mutating the list, then the iteration order will stay the same. Lists have a contractually specified ordering, and the iterator specification guarantees that it iterates over elements in that order.
Yes, an ArrayList guarantees iteration order over its elements - that is, they will come out in the same order you inserted them, provided that you don't make any insertions while iterating over the ArrayList.
Retrieval does not vary unless you change the iterator you are using. As long as you are using the same method for retrieval and have not changed the list itself then the items will be returned in the same order.
When you add an element to an ArrayList using add(E e), the element is appended to the end of the list. Consequently, if all you do is call the single-argument add method a number of times and then iterate, the iteration will be in exactly the same order as the calls to add.
The iteration order will be the same every time you iterate over the same unmodified list.
Also, assuming you add the elements using the add() method, the iteration order will be the same as the insertion order since this method appends elements to the end of the list.
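A short sketch of both points, assuming nothing beyond java.util (the class and method names are made up for the example): iterating an unmodified ArrayList twice gives the same sequence, and that sequence matches the order of the add(E) calls. Only the two-argument add(int, E) places an element somewhere other than the end.

```java
import java.util.ArrayList;
import java.util.List;

class ListOrderDemo {
    // Hypothetical helper: walks the list with its iterator and
    // records the elements in the order the iterator returns them.
    static List<String> snapshot(List<String> list) {
        List<String> seen = new ArrayList<>();
        for (String element : list) {
            seen.add(element);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        list.add("c");

        List<String> first = snapshot(list);
        List<String> second = snapshot(list);
        System.out.println(first.equals(second)); // prints true
        System.out.println(first);                // prints [a, b, c]

        list.add(0, "z"); // positional insert: not appended to the end
        System.out.println(list); // prints [z, a, b, c]
    }
}
```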
Yes, the retrieval order is guaranteed to be the same as long as the list is not mutated and you use the same iterator, but having to rely on retrieval order indicates something fishy with the design. It is generally not a good idea to base business logic on a certain retrieval order.
Even Sets will return the same result, if you don't modify them (adding or removing items to them).
Is there any way to know which entries were most recently added to a HashSet? In my program the first cycle adds [Emmy, Carl] and then the second cycle adds [Emmy, Dan, Carl]. Is there any way I can use just Dan, and not the rest of them, for cycle three?
java.util.HashSet does not preserve order, but java.util.LinkedHashSet does. Can you use that instead? From the Javadoc:
This implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order). Note that insertion order is not affected if an element is re-inserted into the set.
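As a rough sketch of how that insertion order could be used here, the most recently inserted element is simply the last one the iterator returns. The helper name lastInserted is hypothetical, and walking the whole set costs time proportional to its size.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;

class LastInserted {
    // Hypothetical helper: returns the most recently inserted element,
    // or null if the set is empty.
    static <T> T lastInserted(LinkedHashSet<T> set) {
        T last = null;
        for (T element : set) {
            last = element; // the final assignment is the newest element
        }
        return last;
    }

    public static void main(String[] args) {
        LinkedHashSet<String> people = new LinkedHashSet<>();
        people.addAll(Arrays.asList("Emmy", "Carl"));        // first cycle
        people.addAll(Arrays.asList("Emmy", "Dan", "Carl")); // second cycle
        // Emmy and Carl were re-inserted, so their positions did not change.
        System.out.println(people);               // prints [Emmy, Carl, Dan]
        System.out.println(lastInserted(people)); // prints Dan
    }
}
```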
HashSets do not carry information about the order in which you add elements. You need to replace it with a Collection that does (e.g. ArrayList).
HashSets are backed by hash tables, and there is no guarantee on the order of retrieval. The order of retrieval is not guaranteed to match the order of insertion. So no, it's not possible to know which item was added last.
Workarounds: you could use two HashSets, compare the old one with the new one and get the new entries; or keep some sort of indicator that records the particular iteration in which each item was added; or use an ArrayList or anything else that fits your design.
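// 'input' is assumed to be a collection of Set<String>, one per cycle
HashSet<String> oldPeople = new HashSet<String>();
HashSet<String> newPeople = new HashSet<String>();
for (Set<String> cycle : input)
{
    // start from everything seen in this cycle
    newPeople = new HashSet<String>(cycle);
    // drop whatever earlier cycles already contained
    newPeople.removeAll(oldPeople);
    // remember this cycle for the next one
    oldPeople.addAll(cycle);
}
Now newPeople always contains the entries that were new in the most recent cycle.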
Well, if I understand your post and the comments correctly (that's quite hard, so try to be a bit more precise :) ), what you actually want is: a) not to add any item to the HashSet several times, and b) to see whether the set already contains the given item when trying to add it.
a) is trivially true for every set, and for b) you can just use the return value of add: if it returns false, the item is already contained in the set.
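Using the return value of add could look roughly like this. The class and method names are hypothetical; the names in the data are taken from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class NewEntries {
    // Hypothetical helper: adds every element of 'cycle' to 'known' and
    // returns only those that were not in 'known' before, in cycle order.
    static <T> List<T> addAndReportNew(Set<T> known, Collection<T> cycle) {
        List<T> added = new ArrayList<>();
        for (T item : cycle) {
            // add returns true only if the set did not already contain the item
            if (known.add(item)) {
                added.add(item);
            }
        }
        return added;
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>();
        System.out.println(addAndReportNew(known, Arrays.asList("Emmy", "Carl")));        // prints [Emmy, Carl]
        System.out.println(addAndReportNew(known, Arrays.asList("Emmy", "Dan", "Carl"))); // prints [Dan]
    }
}
```

This needs only one set plus a short-lived list per cycle, and it works with a plain HashSet because it does not depend on the set's iteration order.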