LinkedHashSet - insertion order and duplicates - keep newest "on top"

I need a collection that keeps insertion order and has unique values. LinkedHashSet looks like the way to go, but there's one problem: when two items are equal, it discards the newest one (which makes sense). Here's an example:
set.add("one");
set.add("two");
set.add("three");
set.add("two");
The LinkedHashSet will print:
one, two, three
But what I need is:
one, three, two
What would be the best solution here? Is there any collection/collections method that can do this or should I implement it manually?

Most of the Java Collections can be extended for tweaking.
Subclass LinkedHashSet, overriding the add method.
class TweakedHashSet<T> extends LinkedHashSet<T> {
    @Override
    public boolean add(T e) {
        // Get rid of old one.
        boolean wasThere = remove(e);
        // Add it.
        super.add(e);
        // Contract is "true if this set did not already contain the specified element"
        return !wasThere;
    }
}
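A small sketch to check how such a subclass behaves. The names MoveToEndSet and MoveToEndSetDemo are made up for illustration; the add logic is the same as in the class above.

```java
import java.util.LinkedHashSet;

// Hypothetical name; same idea as the subclass in the answer.
class MoveToEndSet<T> extends LinkedHashSet<T> {
    @Override
    public boolean add(T e) {
        boolean wasThere = remove(e); // drop the old position, if any
        super.add(e);                 // re-insert at the end
        return !wasThere;             // true only if the element is new to the set
    }
}

class MoveToEndSetDemo {
    static MoveToEndSet<String> sample() {
        MoveToEndSet<String> set = new MoveToEndSet<>();
        set.add("one");
        set.add("two");
        set.add("three");
        set.add("two"); // already present: moved to the end
        return set;
    }

    public static void main(String[] args) {
        System.out.println(sample()); // prints [one, three, two]
    }
}
```

Re-adding an existing element still returns false, so callers that rely on the Set contract keep working, while the iteration order reflects the most recent add.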

You can simply use a special feature of LinkedHashMap:
Set<String> set = Collections.newSetFromMap(new LinkedHashMap<>(16, 0.75f, true));
set.add("one");
set.add("two");
set.add("three");
set.add("two");
System.out.println(set); // prints [one, three, two]
In Oracle’s JRE, LinkedHashSet is backed by a LinkedHashMap anyway, so there’s not much functional difference, but the special constructor used here configures the LinkedHashMap to change the order on every access, not only on insertion. This might sound like too much, but in fact it only affects the insertion of keys that are already contained (values, in the sense of the Set). The other affected Map operations (namely get) are not used by the returned Set.
If you’re using a version before Java 8, you have to help the compiler a bit due to the limited type inference:
Set<String> set
= Collections.newSetFromMap(new LinkedHashMap<String, Boolean>(16, 0.75f, true));
but the functionality is the same.
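To see that only re-insertion changes the order, here is a minimal sketch (the class name AccessOrderSetDemo is made up). It assumes, as the LinkedHashMap documentation describes, that containsKey does not count as an access, so contains() on the set leaves the order alone.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Set;

class AccessOrderSetDemo {
    static Set<String> build() {
        // accessOrder = true: putting an existing key again moves it to the end
        Set<String> set =
                Collections.newSetFromMap(new LinkedHashMap<String, Boolean>(16, 0.75f, true));
        set.add("one");
        set.add("two");
        set.add("three");
        set.contains("one"); // a lookup via containsKey: does not reorder
        set.add("two");      // existing element: moved to the end
        return set;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints [one, three, two]
    }
}
```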

When initializing your LinkedHashSet you could override the add method.
Set<String> set = new LinkedHashSet<String>(){
    @Override
    public boolean add(String s) {
        if(contains(s))
            remove(s);
        return super.add(s);
    }
};
Now it gives you:
set.add("1");
set.add("2");
set.add("3");
set.add("1");
set.addAll(Collections.singleton("2"));
// [3, 1, 2]
Even the addAll method works, because the inherited addAll calls add for each element.

All the solutions provided above are excellent, but if we don't want to override already implemented collections, we can solve this problem simply by using an ArrayList with a little trick.
We can create a method to use whenever we insert data into the list:
public static <T> void addToList(List<T> list, T element) {
    list.remove(element); // Will remove element from list, if list contains it
    list.add(element); // Will add element again, at the end of the list
}
And we can call this method to add elements to our list:
List<String> list = new ArrayList<>();
addToList(list, "one");
addToList(list, "two");
addToList(list, "three");
addToList(list, "two");
The only disadvantage is that we need to call our custom addToList() method every time instead of list.add().

Related

Is Set sorted in some manner by default?

I have this set with elements added in the given order.
Set<String> nations = new HashSet<String>();
nations.add("Australia");
nations.add("Japan");
nations.add("Taiwan");
nations.add("Cyprus");
nations.add("Cuba");
nations.add("India");
When I print the record -
for (String s : nations) {
System.out.print(s + " ");
}
It always gives this output in the order
Cuba Cyprus Japan Taiwan Australia India
As far as I know a Set is not sorted by default, but why do I get the same result in a particular sorted manner?
Update : Here is the actual question -
public static Function<String,String> swap = s -> {
if(s.equals("Australia"))
return "New Zealand";
else
return s;
};
Set<String> islandNations = Set.of("Australia", "Japan", "Taiwan", "Cyprus", "Cuba");
islandNations = islandNations.stream()
.map(swap)
.map(n -> n.substring(0, 1))
.collect(Collectors.toSet());
for(String s : islandNations){
System.out.print(s);
}
and the possible answers are:
CTJN
TJNC
TCNJ

HashSet's documentation says:
It makes no guarantees as to the iteration order of the set.
No guarantees means no guarantees. For example, it could be sorted order, reverse sorted order, random order, or sorted order except on Tuesdays when it's random.
(In practice, the iteration order is usually the same for the same Java version, or at least within the same run of the JVM, and that order is produced by a deliberately convoluted algorithm based on the hash codes of the elements. However, if you depend on that behavior, it will usually change at the worst possible time.)
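If the output has to be predictable, one option is to collect into a set that does define an order. The following is only a sketch (the class and method names are made up): it collects the first letters into a TreeSet, so they come out in natural (alphabetical) order regardless of how the source set iterates.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class DeterministicOrderDemo {
    static String firstLetters(Set<String> nations) {
        // TreeSet keeps its elements sorted and drops duplicates ("Cyprus" and "Cuba" both give "C")
        Set<String> letters = nations.stream()
                .map(n -> n.substring(0, 1))
                .collect(Collectors.toCollection(TreeSet::new));
        return String.join("", letters);
    }

    public static void main(String[] args) {
        Set<String> nations = Set.of("New Zealand", "Japan", "Taiwan", "Cyprus", "Cuba");
        System.out.println(firstLetters(nations)); // prints CJNT
    }
}
```

With Collectors.toSet() instead, no particular order is promised, which is the point made above.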

HashSet does not preserve the insertion order of its elements. The order is determined by the hashing mechanism of the underlying Map, because the add() method internally inserts the element as a key in a Map.
HashSet
//add method implementation for HashSet
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
Here, map is declared as private transient HashMap<E,Object> map;
Because the map is a HashMap, no sorting is done.
TreeSet
//add method implementation for TreeSet
public boolean add(E e) {
return m.put(e, PRESENT)==null;
}
Here, m is declared as private transient NavigableMap<E,Object> m;
Because the map is a NavigableMap (a TreeMap by default), the elements are kept sorted.

The HashSet uses the hash value of each element for storage.
The important points about Java HashSet class are:
HashSet stores the elements by using a mechanism called hashing.
HashSet contains unique elements only.
HashSet allows null value.
HashSet class is non-synchronized.
HashSet doesn't maintain the insertion order. Elements are placed according to their hash codes.
HashSet offers constant-time lookups on average, which makes it a good fit for search operations.
The initial default capacity of HashSet is 16, and the load factor is 0.75.
If you need to store (and display) the elements in sorted order, you can use the SortedSet interface, like this:
SortedSet<String> orderedList = new TreeSet<String>();
orderedList.add("C");
orderedList.add("D");
orderedList.add("E");
orderedList.add("A");
orderedList.add("B");
orderedList.add("Z");
for (String value : orderedList)
System.out.print(value + ", ");
Output:
A, B, C, D, E, Z,
Remember: a SortedSet such as TreeSet uses the Comparable interface and its compareTo() method to sort the String values. If you have a custom class, it should implement this interface/method to be used with this approach.
Or, you can define the comparator that must be used:
SortedSet<Person> persons = new TreeSet<>(Comparator.comparing(Person::getName));
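A runnable sketch of the comparator variant. The Person class here is hypothetical (only getName appears in the answer; the age field is added for illustration). One thing worth knowing: a TreeSet treats two elements as duplicates when the comparator returns 0, even if equals() would say otherwise.

```java
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical class for illustration.
class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String getName() { return name; }
    int getAge() { return age; }
}

class ComparatorSetDemo {
    static SortedSet<Person> byName() {
        SortedSet<Person> persons = new TreeSet<>(Comparator.comparing(Person::getName));
        persons.add(new Person("Carol", 41));
        persons.add(new Person("Alice", 30));
        persons.add(new Person("Alice", 55)); // compares equal to the first Alice, so it is not added
        return persons;
    }

    public static void main(String[] args) {
        for (Person p : byName()) {
            System.out.println(p.getName() + " " + p.getAge());
        }
        // prints:
        // Alice 30
        // Carol 41
    }
}
```

If both Alices should be kept, the comparator needs a tie-breaker, for example Comparator.comparing(Person::getName).thenComparingInt(Person::getAge).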

Check for duplicate in an ArrayList [duplicate]

I am new to Java. I have an ArrayList and I want to avoid duplicates on insertion. My ArrayList is
ArrayList<kar> karList = new ArrayList<kar>();
and the field I want to check is:
kar.getinsertkar().
I have read that I can use a HashSet or HashMap, but I have no clue how.

Whenever you want to prevent duplicates, you want to use a Set.
In this case, a HashSet would be just fine for you.
HashSet karSet = new HashSet();
karSet.add(foo);
karSet.add(bar);
karSet.add(foo);
System.out.println(karSet.size());
//Output is 2
For completeness, I would also suggest you use the generic (parameterized) version of the class, assuming Java 5 or higher.
HashSet<String> stringSet = new HashSet<String>();
HashSet<Integer> intSet = new HashSet<Integer>();
...etc...
This will give you some type safety as well for getting items in and out of your set.

A set is simply a collection that can contain no duplicates so it sounds perfect for you.
It is also very simple to implement. For example:
Set<String> mySet = new HashSet<String>();
This would provide you a set that can hold Objects of type String.
To add to the set is just as simple:
mySet.add("My first entry!");
By definition of a set, you can add whatever you want and never run into a duplicate.
Have fun!
EDIT : If you decide you are dead-set on using an ArrayList, it is simple to see if an object is already in the list before adding it. For example:
public void addToList(String newEntry){
if(!myList.contains(newEntry))
myList.add(newEntry);
}
Note: All my examples assume you are using String objects but they can easily be swapped to any other Object type.

Use a HashSet instead of an ArrayList. But to make the HashSet work well, you must override the equals() and hashCode() methods of the class whose objects are inserted into the HashSet.
For example:
Set<MyObject> set = new HashSet<MyObject>();
set.add(foo);
set.add(bar);
public class MyObject {
    @Override
    public boolean equals(Object obj) {
        if (obj instanceof MyObject)
            // compare the identifying field (assumes id is a primitive such as int)
            return this.id == ((MyObject) obj).id;
        else
            return false;
    }
    // now override hashCode()
}
Please see the following documentation for overriding hashCode() and equals().
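For reference, a complete sketch of such a class with both methods overridden. The long id field is an assumption made for illustration; use whatever field identifies your objects. Equal objects must produce equal hash codes, otherwise the HashSet will not detect the duplicate.

```java
import java.util.HashSet;
import java.util.Set;

class MyObject {
    private final long id; // assumed identifying field

    MyObject(long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof MyObject)) return false;
        return this.id == ((MyObject) obj).id;
    }

    @Override
    public int hashCode() {
        // must be consistent with equals: same id, same hash code
        return Long.hashCode(id);
    }
}

class MyObjectSetDemo {
    static int distinctCount() {
        Set<MyObject> set = new HashSet<>();
        set.add(new MyObject(1));
        set.add(new MyObject(2));
        set.add(new MyObject(1)); // equal to the first element, so it is ignored
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount()); // prints 2
    }
}
```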

You can use LinkedHashSet, to avoid duplicated elements and keep the insertion order.
http://docs.oracle.com/javase/7/docs/api/java/util/LinkedHashSet.html

You need to use a Set implementation, e.g. HashSet.
If you want to add your custom kar objects to the HashSet, you need to override the equals and hashCode methods.
You can read more about equals and hashcode, see

You can implement your own List that extends LinkedList and overrides its add methods:
public boolean add(E e)
public void add(int index, E element)
public boolean addAll(Collection collection)
public boolean addAll(int index, Collection collection)

An example removing repeated Strings in an ArrayList:
var list = new ArrayList<>(List.of(
"hello",
"java",
"test",
"hello"
));
System.out.println(list);
System.out.println(new ArrayList<>(new HashSet<>(list)));
Output:
[hello, java, test, hello]
[java, test, hello]
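Note that going through a HashSet loses the original order. If the first occurrences should stay where they were, a LinkedHashSet can be used in the same way. A short sketch (the class and method names are made up):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class OrderedDedupDemo {
    static <T> List<T> dedupKeepingOrder(List<T> input) {
        // LinkedHashSet keeps the first occurrence of each element, in insertion order
        Set<T> seen = new LinkedHashSet<>(input);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> list = List.of("hello", "java", "test", "hello");
        System.out.println(dedupKeepingOrder(list)); // prints [hello, java, test]
    }
}
```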

How to sort a HashSet?

For lists, we use the Collections.sort(List) method. What if we want to sort a HashSet?

A HashSet does not guarantee any order of its elements. If you need this guarantee, consider using a TreeSet to hold your elements.
However if you just need your elements sorted for this one occurrence, then just temporarily create a List and sort that:
// String stands in for your element type, which must implement Comparable
Set<String> yourHashSet = new HashSet<>();
...
List<String> sortedList = new ArrayList<>(yourHashSet);
Collections.sort(sortedList);

Add all your objects to the TreeSet, you will get a sorted Set. Below is a raw example.
HashSet myHashSet = new HashSet();
myHashSet.add(1);
myHashSet.add(23);
myHashSet.add(45);
myHashSet.add(12);
TreeSet myTreeSet = new TreeSet();
myTreeSet.addAll(myHashSet);
System.out.println(myTreeSet); // Prints [1, 12, 23, 45]
Update
You can also use TreeSet's constructor that takes a HashSet as a parameter.
HashSet myHashSet = new HashSet();
myHashSet.add(1);
myHashSet.add(23);
myHashSet.add(45);
myHashSet.add(12);
TreeSet myTreeSet = new TreeSet(myHashSet);
System.out.println(myTreeSet); // Prints [1, 12, 23, 45]
Thanks @mounika for the update.

Java 8 way to sort it would be:
fooHashSet.stream()
.sorted(Comparator.comparing(Foo::getSize)) //comparator - how you want to sort it
.collect(Collectors.toList()); //collector - what you want to collect it to
*Foo::getSize is an example of a sort key: here the Foo objects in the HashSet are ordered by their size.
*Collectors.toList() collects the sorted result into a List, which you then need to capture, for example with List<Foo> sortedListOfFoo =

You can use a TreeSet instead.

Use java.util.TreeSet as the actual object. When you iterate over this collection, the values come back in a well-defined order.
If you use java.util.HashSet then the order depends on an internal hash function which is almost certainly not lexicographic (based on content).

Just in case you don't want to use a TreeSet, you could try this, using Java streams for concise code.
set = set.stream().sorted().collect(Collectors.toCollection(LinkedHashSet::new));

You can use Java 8 collectors and TreeSet
list.stream().collect(Collectors.toCollection(TreeSet::new))

Based on the answer given by @LazerBanana, I will add my own example of a Set sorted by the Id of the object:
Set<Clazz> yourSet = [...];
yourSet.stream().sorted(new Comparator<Clazz>() {
    @Override
    public int compare(Clazz o1, Clazz o2) {
        return o1.getId().compareTo(o2.getId());
    }
}).collect(Collectors.toList()); // Returns the sorted List (using toSet() won't work)

Elements in HashSet can't be sorted. Whenever you put elements into HashSet, it can mess up the ordering of the whole set. It is deliberately designed like that for performance. When you don't care about the order, HashSet will be the most efficient set for frequent insertions and queries.
TreeSet is the alternative that you can use. When you iterate on the tree set, you will get sorted elements automatically.
But it will adjust the tree to try to remain sorted every time you insert an element.
Perhaps, what you are trying to do is to sort just once. In that case, TreeSet is not the most efficient option because it needs to determine the placing of newly added elements all the time. Use TreeSet only when you want to sort often.
If you only need to sort once, use ArrayList. Create a new list and add all the elements then sort it once. If you want to retain only unique elements (remove all duplicates), then put the list into a LinkedHashSet, it will retain the order you have already sorted.
List<Integer> list = new ArrayList<>();
list.add(6);
list.add(4);
list.add(4);
list.add(5);
Collections.sort(list);
Set<Integer> unique = new LinkedHashSet<>(list); // 4 5 6
Now you have a sorted set. If you want it in list form, convert it into a list.

You can use TreeSet as mentioned in other answers.
Here's a little more elaboration on how to use it:
TreeSet<String> ts = new TreeSet<String>();
ts.add("b1");
ts.add("b3");
ts.add("b2");
ts.add("a1");
ts.add("a2");
System.out.println(ts);
for (String s: ts)
System.out.println(s);
Output:
[a1, a2, b1, b2, b3]
a1
a2
b1
b2
b3

In my humble opinion, LazerBanana's answer should be the top-rated and accepted answer, because all the other answers pointing to java.util.TreeSet (or converting to a list first and then calling Collections.sort(...) on it) didn't bother to ask the OP what kind of objects the HashSet holds, i.e. whether those elements have a predefined natural ordering or not. That is not an optional question but a mandatory one.
You can't just start putting your HashSet elements into a TreeSet if the element type doesn't already implement the Comparable interface, unless you explicitly pass a Comparator to the TreeSet constructor.
From the TreeSet JavaDoc:
Constructs a new, empty tree set, sorted according to the natural
ordering of its elements. All elements inserted into the set must
implement the Comparable interface. Furthermore, all such elements
must be mutually comparable: e1.compareTo(e2) must not throw a
ClassCastException for any elements e1 and e2 in the set. If the user
attempts to add an element to the set that violates this constraint
(for example, the user attempts to add a string element to a set whose
elements are integers), the add call will throw a ClassCastException.
That is why only the Java 8 stream-based answers, where you define your comparator on the spot, make sense: implementing Comparable in the POJO becomes optional, and the programmer defines a comparator as and when needed. Trying to collect into a TreeSet without asking this fundamental question is also incorrect (Ninja's answer). Assuming the object types to be String or Integer is also incorrect.
Having said that, other concerns such as
Sorting performance
Memory footprint (retaining the original set and creating a new sorted set each time sorting is done, or sorting the set in place, etc.)
should be relevant points too. Just pointing to the API shouldn't be the only intention.
Since the original set already contains only unique elements, and that constraint is also maintained by the sorted set, the original set should be released once the sorted copy exists, because the data is duplicated.
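The point about Comparable can be demonstrated with a small sketch. Box is a hypothetical element type that deliberately does not implement Comparable; the other names are made up as well.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

class NaturalOrderDemo {
    // Hypothetical element type that does NOT implement Comparable.
    static final class Box {
        final int size;

        Box(int size) {
            this.size = size;
        }
    }

    // Returns true if a TreeSet without a comparator rejects Box elements.
    static boolean naturalOrderFails() {
        Set<Box> boxes = new TreeSet<>();
        try {
            boxes.add(new Box(2));
            boxes.add(new Box(1));
            return false;
        } catch (ClassCastException e) {
            return true; // Box cannot be cast to Comparable
        }
    }

    // With an explicit comparator the same elements sort fine.
    static int smallestWithComparator() {
        TreeSet<Box> boxes = new TreeSet<>(Comparator.comparingInt((Box b) -> b.size));
        boxes.add(new Box(2));
        boxes.add(new Box(1));
        return boxes.first().size;
    }

    public static void main(String[] args) {
        System.out.println(naturalOrderFails());      // prints true
        System.out.println(smallestWithComparator()); // prints 1
    }
}
```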

1. Add all set element in list -> al.addAll(s);
2. Sort all the elements in list using -> Collections.sort(al);
import java.util.*;
public class SortSetProblem {
public static void main(String[] args) {
ArrayList<String> al = new ArrayList<>();
Set<String> s = new HashSet<>();
s.add("ved");
s.add("prakash");
s.add("sharma");
s.add("apple");
s.add("ved");
s.add("banana");
System.out.println("Before Sorting");
for (String s1 : s) {
System.out.print(" " + s1);
}
System.out.println("After Sorting");
al.addAll(s);
Collections.sort(al);
for (String set : al) {
System.out.print(" " + set);
}
}
}
input - ved prakash sharma apple ved banana
Output - apple banana prakash sharma ved

If you want the final collection to be a Set, and you want to define your own order rather than rely on the natural order used by TreeSet, then:
Convert the HashSet into List
Custom sort the List using Comparator
Convert back the List into LinkedHashSet to maintain order
Display the LinkedHashSet
Sample program -
package demo31;
import java.util.*;
public class App26 {
public static void main(String[] args) {
Set<String> set = new HashSet<>();
addElements(set);
List<String> list = convertToList(set);
Collections.sort(list, new Comparator<String>() {
@Override
public int compare(String s1, String s2) {
int flag = s2.length() - s1.length();
if(flag != 0) {
return flag;
} else {
return -s1.compareTo(s2);
}
}
});
Set<String> set2 = convertToSet(list);
displayElements(set2);
}
public static void addElements(Set<String> set) {
set.add("Hippopotamus");
set.add("Rhinocerous");
set.add("Zebra");
set.add("Tiger");
set.add("Giraffe");
set.add("Cheetah");
set.add("Wolf");
set.add("Fox");
set.add("Dog");
set.add("Cat");
}
public static List<String> convertToList(Set<String> set) {
List<String> list = new LinkedList<>();
for(String element: set) {
list.add(element);
}
return list;
}
public static Set<String> convertToSet(List<String> list) {
Set<String> set = new LinkedHashSet<>();
for(String element: list) {
set.add(element);
}
return set;
}
public static void displayElements(Set<String> set) {
System.out.println(set);
}
}
Output -
[Hippopotamus, Rhinocerous, Giraffe, Cheetah, Zebra, Tiger, Wolf, Fox, Dog, Cat]
Here the collection has been sorted as -
First - Descending order of String length
Second - Descending order of String alphabetical hierarchy

You can do this in the following ways:
Method 1:
Create a list and store all the HashSet values in it
Sort the list using Collections.sort()
Store the list back into a LinkedHashSet, as it preserves the insertion order
Method 2:
Create a TreeSet and store all the values in it.
Method 2 is preferable because Method 1 spends extra time copying the data back and forth between the set and the list.

A HashSet does not sort its elements automatically, but we can sort them by copying them into a TreeSet or into any List such as an ArrayList or LinkedList.
// Create a TreeSet object of class E
TreeSet<E> ts = new TreeSet<E> ();
// Convert your HashSet into TreeSet
ts.addAll(yourHashSet);
System.out.println(ts.toString() + "\t Sorted Automatically");

You can use the Guava library for the same purpose:
Set<String> sortedSet = FluentIterable.from(myHashSet).toSortedSet(new Comparator<String>() {
@Override
public int compare(String s1, String s2) {
// descending order of relevance
//required code
}
});

The SortedSet interface is part of the standard library (it has been available since Java 1.2):
https://docs.oracle.com/javase/8/docs/api/java/util/SortedSet.html

You can wrap it in a TreeSet like this:
Set mySet = new HashSet();
mySet.add(4);
mySet.add(5);
mySet.add(3);
mySet.add(1);
System.out.println("mySet items "+ mySet);
TreeSet treeSet = new TreeSet(mySet);
System.out.println("treeSet items "+ treeSet);
output :
mySet items [1, 3, 4, 5]
treeSet items [1, 3, 4, 5]
Set mySet = new HashSet();
mySet.add("five");
mySet.add("elf");
mySet.add("four");
mySet.add("six");
mySet.add("two");
System.out.println("mySet items "+ mySet);
TreeSet treeSet = new TreeSet(mySet);
System.out.println("treeSet items "+ treeSet);
output:
mySet items [six, four, five, two, elf]
treeSet items [elf, five, four, six, two]
The requirement for this method is that the objects in the set/list are comparable (implement the Comparable interface).

Below is my sample code. The question has already been answered in the comments, but I am sharing this anyway because it is a complete example.
package Collections;
import java.util.*;
public class TestSet {
public static void main(String[] args) {
Set<String> objset = new HashSet<>();
objset.add("test");
objset.add("abc");
objset.add("abc");
objset.add("mas");
objset.add("vas");
Iterator<String> itset = objset.iterator();
while(itset.hasNext())
{
System.out.println(itset.next());
}
TreeSet<String> treeobj = new TreeSet<>(objset);
System.out.println(treeobj);
}
}
Calling new TreeSet(objset) invokes the TreeSet constructor, which calls the addAll method to add the objects.
See the code below from the TreeSet class:
public TreeSet(Collection<? extends E> c) {
this();
addAll(c);
}

Convert the HashSet to a List, then sort it using Collections.sort():
List<String> list = new ArrayList<String>(hset);
Collections.sort(list);

This simple command did the trick for me (note that this is Scala syntax, not Java):
myHashSet.toList.sorted
I used this within a print statement, so if you need to actually persist the ordering, you may need to use TreeSets or other structures proposed on this thread.

Adding items to empty List at specific locations in java

Is there any way I can make the code below work without uncommenting the 3rd line?
List<Integer> list = new ArrayList<Integer>();
list.add(0,0);
//list.add(1,null);
list.add(2,2);
I want to add items to the list at specific positions, but unless the list has already been filled up to the Nth position, I can't add at index N, as explained in this answer.
I can't use a map because I don't want to lose a value when the keys are the same. Also, adding null values to pad a large list would be an overhead. When there is a collision, I want the item to take the next position (the nearest to where it should have been).
Is there any List implementation that shifts the index before it tries to add the item?

Use something like a MultiMap if your only concern is not "missing a value" if the keys are the same.
I'm not sure how doing a shift/insert helps if I understand your problem statement--if the "key" is the index, inserting will lose the same information.

You can use Vector and call setSize to prepopulate with null elements.
However, your comment about the overhead of the nulls speaks to an associative container as the right solution.

This still smells like you should be using a Map. Why not use a Map<Integer, List<Integer>>?
Something like:
private Map<Integer, List<Integer>> myMap = new HashMap<Integer, List<Integer>>();
public void addItem(int key, int value) {
List<Integer> list = myMap.get(key);
if (list == null) {
list = new ArrayList<Integer>();
myMap.put(key, list);
}
list.add(value);
}
public List<Integer> getItems(int key) {
return myMap.get(key);
}
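On Java 8 or later the same idea can be written a bit more compactly with computeIfAbsent. This is only a sketch of that variant; the class name MultiValueMapDemo is made up, and returning an empty list for unknown keys is a design choice not taken from the answer above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MultiValueMapDemo {
    private final Map<Integer, List<Integer>> myMap = new HashMap<>();

    void addItem(int key, int value) {
        // creates the list for this key on first use, then appends to it
        myMap.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    List<Integer> getItems(int key) {
        // empty list instead of null when nothing was stored under the key
        return myMap.getOrDefault(key, Collections.emptyList());
    }

    public static void main(String[] args) {
        MultiValueMapDemo demo = new MultiValueMapDemo();
        demo.addItem(2, 20);
        demo.addItem(2, 21); // same key: both values are kept
        System.out.println(demo.getItems(2)); // prints [20, 21]
        System.out.println(demo.getItems(5)); // prints []
    }
}
```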

Well, there are a couple of ways I can think of to do this. If you are not adding items too frequently, it might be a good idea to simply check whether there is already an item at that location before adding one.
if(list.get(X) == null)
{
list.add(X,Y);
}
Otherwise, if you are going to be doing this often, I would recommend creating your own custom List class that extends ArrayList (or whatever you are using) and overrides the add method to deal with collisions.
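As a rough sketch of the collision handling, here is one possible shape for such a class. It is not a List subclass: it swaps in a TreeMap keyed by position, so gaps cost nothing and no null padding is needed. The name NextFreeSlotList and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class NextFreeSlotList<T> {
    // position -> value, kept sorted by position; unused positions take no space
    private final TreeMap<Integer, T> slots = new TreeMap<>();

    /** Stores value at index, or at the next free index after it. Returns the index used. */
    int put(int index, T value) {
        int i = index;
        while (slots.containsKey(i)) {
            i++; // collision: try the next position
        }
        slots.put(i, value);
        return i;
    }

    T get(int index) {
        return slots.get(index); // null if nothing is stored there
    }

    /** All stored values in position order. */
    List<T> values() {
        return new ArrayList<>(slots.values());
    }

    public static void main(String[] args) {
        NextFreeSlotList<String> list = new NextFreeSlotList<>();
        list.put(0, "a");
        list.put(2, "c");
        int used = list.put(2, "d");       // position 2 is taken, so 3 is used
        System.out.println(used);          // prints 3
        System.out.println(list.values()); // prints [a, c, d]
    }
}
```

The linear probe in put is the simplest option; for long runs of occupied positions a smarter search for the next gap would be needed.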

Find objects in a list where some attributes have equal values

Given a list of objects (all of the same type), how can I make sure that it contains only one element for each value of a certain attribute, even though equals() may return false for such elements due to more attributes being checked? In code:
private void example() {
List<SomeType> listWithDuplicates = new ArrayList<SomeType>();
/*
* create the "duplicate" objects. Note that both attributes passed to
* the constructor are used in equals(), though for the purpose of this
* question they are considered equal if the first argument was equal
*/
SomeType someObject1 = new SomeType("hello", "1");
SomeType someObject2 = new SomeType("hello", "2");
listWithDuplicates.add(someObject1);
listWithDuplicates.add(someObject2);
List<SomeType> listWithoutDuplicates = removeDuplicates(listWithDuplicates);
//listWithoutDuplicates should not contain someObject2
}
private List<SomeType> removeDuplicates(List<SomeType> listWithDuplicates) {
/*
* remove all but the first entry in the list where the first constructor-
* arg was the same
*/
}

Could use a Set as an intermediary placeholder to find the duplicates as Bozho suggested. Here's a sample removeDuplicates() implementation.
private List<SomeType> removeDuplicates(List<SomeType> listWithDuplicates) {
    /* Set of all attributes seen so far */
    Set<AttributeType> attributes = new HashSet<AttributeType>();
    /* All confirmed duplicates go in here */
    List<SomeType> duplicates = new ArrayList<SomeType>();
    for(SomeType x : listWithDuplicates) {
        if(attributes.contains(x.firstAttribute())) {
            duplicates.add(x);
        }
        attributes.add(x.firstAttribute());
    }
    /* Clean list without any dups; removeAll returns a boolean, so return the list itself */
    listWithDuplicates.removeAll(duplicates);
    return listWithDuplicates;
}

Maybe a HashMap can be used like this:
private List<SomeType> removeDuplicates(List<SomeType> listWithDuplicates) {
    /*
     * remove all but the first entry in the list where the first constructor-
     * arg was the same
     */
    Iterator<SomeType> iter = listWithDuplicates.iterator();
    Map<String, SomeType> map = new HashMap<String, SomeType>();
    while (iter.hasNext()) {
        SomeType i = iter.next();
        if (!map.containsKey(i.getAttribute())) {
            map.put(i.getAttribute(), i);
        }
    }
    // At this point map.values() is a collection of objects that are not duplicates
    // (a plain HashMap does not keep them in their original order).
    return new ArrayList<SomeType>(map.values());
}

If equals() were suitable, I could recommend some "standard" Collections classes/methods. As it is, I think your only option will be to either
copy each element to another list after first checking all preceding elements in the original list for duplicates; or
delete from your list any element for which you've found a duplicate at a preceding location. For in-list deletion, you'd be best off with using a LinkedList, where deletion isn't so expensive.
In either case, checking for duplicates will be an O(n^2) operation, alas.
If you're going to be doing a lot of this kind of operation, it might be worthwhile to wrap your list elements inside another class that returns a hashcode based on your own defined criteria.
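A related sketch that avoids the O(n^2) scan without a wrapper class: remember the attribute values already seen in a HashSet and keep only the first element for each. The names distinctBy and keyOf are made up, and the approach assumes the attribute type itself has a sensible equals()/hashCode() (String does).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class DistinctByDemo {
    /** Keeps the first element for each distinct key, preserving the original order. */
    static <T, K> List<T> distinctBy(List<T> items, Function<? super T, ? extends K> keyOf) {
        Set<K> seen = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T item : items) {
            // Set.add returns false when the key was already present
            if (seen.add(keyOf.apply(item))) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // the first five characters stand in for "the first attribute"
        List<String> items = Arrays.asList("hello1", "hello2", "world1");
        System.out.println(distinctBy(items, s -> s.substring(0, 5))); // prints [hello1, world1]
    }
}
```

Each element is looked at once, so the whole pass is linear in the list size on average.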

I'd look at implementing the Comparator interface for something like this. If there's a simple attribute or two that you wish to use for your comparison, that makes it pretty straightforward.
Related question: How Best to Compare Two Collections in Java and Act on Them?
