Remove List<String> duplicates using equals

I'm fairly new to Java and I've been trying to solve the following problem unsuccessfully.
Write a Java method that will remove duplicates from a given list.
Assuming:
Method accepts type List
Return type is void
Duplicates are determined using equals()
Main:
Creates an instance of List and loads it with duplicate String values
Invoke removeDuplicates(), pass in this list
Outputs modified list to the console.
I can solve the problem by passing my list to a new HashSet and copying it back.
But the problems are:
The question asks me to solve it using equals()...
If the return type is void, how can I output the list in main?
import java.util.*;
public class Question1 {
public static void main(String[] args) {
String[] words = {"good","better", "best", "best", "first" , "last", "last", "last", "good"};
List<String> list = new ArrayList<String>();
for (String s : words) {
list.add(s);
}
removeDuplicates(list);
}
static void removeDuplicates(List<String> array){
HashSet<String> hs = new HashSet<>();
hs.addAll(array);
array.clear();
array.addAll(hs);
for (String x : array){
System.out.println(x);
}
}
}
EDIT: Well, this one works, but as you can see I'm not using equals() and I'm printing from my static method, not from main.
Also, is there any way I can populate the List faster than using a String[]?

java.util.HashSet uses Object.hashCode() and Object.equals(Object) in its implementation of Set.add(Object) to determine whether the element being inserted is unique (that is, not equal to any element already in the set). HashSet also has the advantage of letting you de-duplicate in expected O(n) time, versus O(n^2) for the more naive approach of comparing every element to every other element.
The code in main will see the modified list, because the List object is mutable. When a method changes the state of a passed in argument the calling code will see those changes.
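To make both points concrete, here is a minimal sketch (the class name DedupDemo is made up for illustration). It swaps in LinkedHashSet, which the question did not use, only so that the surviving elements keep their first-seen order; like HashSet it relies on hashCode() and equals(). The method returns void and main prints the list afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DedupDemo {
    // Removes duplicates in place; the caller's list object is the one being changed.
    static void removeDuplicates(List<String> list) {
        Set<String> unique = new LinkedHashSet<>(list); // relies on hashCode() and equals()
        list.clear();
        list.addAll(unique);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList(
                "good", "better", "best", "best", "first", "last", "last", "last", "good"));
        removeDuplicates(list);
        System.out.println(list); // prints [good, better, best, first, last]
    }
}
```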

removeDuplicates creates a set, then iterates over the input list. If it encounters an element in the input list, that is also in the set, removeDuplicates removes the element from the input list, otherwise it adds the element to the set.
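Java passes arguments by value, but for objects the value that gets passed is a reference to the object (which is why it is sometimes loosely described as call-by-reference). This means the method removeDuplicates can modify the List<String> array that it receives, and the caller will see that modified list after the call to removeDuplicates has returned.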
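A small sketch of what the caller does and does not see (PassingDemo, mutate and reassign are hypothetical names, not part of the question): changing the object a parameter refers to is visible to the caller, while pointing the parameter at a new object is not.

```java
import java.util.ArrayList;
import java.util.List;

class PassingDemo {
    // Mutates the object the caller passed in: the caller sees this change.
    static void mutate(List<String> list) {
        list.add("added");
    }

    // Rebinds the local parameter only: the caller's variable still refers to the old list.
    static void reassign(List<String> list) {
        list = new ArrayList<>();
        list.add("lost");
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        mutate(list);
        reassign(list);
        System.out.println(list); // prints [added]
    }
}
```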

Here's how you would do the same thing without using a Set and just using equals() (also to somewhat answer your "EDIT" question about initializing a List) :
public static void main(String[] args) {
List<String> list = new ArrayList<String>(Arrays.asList(new String[] {
"good", "better", "best", "best", "first", "last", "last", "last",
"good"}));
removeDuplicates(list);
for (String x : list) {
System.out.println(x);
}
}
static void removeDuplicates(List<String> array) {
for (int i = 0; i < array.size(); i++) {
String next = array.get(i);
// check if this has already appeared before
for (int j = 0; j < i; j++) {
// if it has, stop the search and remove it
if (next.equals(array.get(j))) {
array.remove(i);
// decrement i since we just removed the i'th element
i--;
// stop the search
break;
}
}
}
}
That said, using HashSet is a better idea since as has already been pointed out it is more efficient.
If you want the efficiency of HashSet but still preserve the order of the List you could do something like this :
static void removeDuplicates(List<String> array) {
Set<String> set = new HashSet<String>();
for (int i = 0; i < array.size(); i++) {
String next = array.get(i);
// check if this has already appeared before
if (!set.add(next)) {
// if it has then remove it
array.remove(i);
// decrement i since we just removed the i'th element
i--;
}
}
}

Probably the easiest way would be to use a Set in the first place, which by definition does not allow duplicates.
For your actual problem, you can do several approaches:
The easy but slow approach: compare each element A with each other element N in the list. If A.equals(N) remove N. Hint: you only need to compare A to each further element, as you have already checked each element before A.
The faster approach: sort the list using the natural comparator. Now you no longer need to compare each element A vs N, but only A vs the next few elements. To be exact: until you find the first element that is not equal to A. In this case you can assume that there is no further duplicate of A (thanks to the sorting) and continue with this next element as A.
The Map approach (fast but takes more memory): for each element put into the list, put the same element into a Map with any Object as value. Now you can just look-up whether or not that element is already in the map, and if it is, it is a duplicate.
The best way would be the 2nd approach, provided the original order of the list does not have to be preserved: the sorting is very fast, you only need to visit each element once, and no second list is necessary.
Edit: The 2nd approach in code:
static void removeDuplicates(List<String> array) {
if (array.size() <= 1) {
return;
}
Collections.sort(array);
final Iterator<String> it = array.iterator();
String a = it.next(), n;
while (it.hasNext()) {
n = it.next();
if (((a == null) && (n != null))
|| ((a != null) && (a.equals(n) == false))) {
a = n;
} else {
it.remove();
}
}
}

import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;
public class Main {
public static void main(String[] args) {
List<String> list = new ArrayList<String>();
list.add("good");
list.add("better");
list.add("best");
list.add("best");
list.add("first");
list.add("last");
list.add("last");
list.add("last");
list.add("good");
removeDuplicates(list);
System.out.println(list.toString());
}
public static void removeDuplicates(List<String> list) {
if (list != null) {
for (int i = 0; i < list.size(); i++) {
ListIterator<String> listIterator = list.listIterator();
while (listIterator.hasNext()) {
int nextIndex = listIterator.nextIndex();
String nextElement = listIterator.next();
if (list.get(i).equals(nextElement) && i != nextIndex)
listIterator.remove();
}
}
}
}
}

Related

How can I tell if the elements in an Array List are same or different?

I'm trying to verify if all the elements in an array list are same or not. This is my code:
ArrayList<Integer> arr = new ArrayList<>(Arrays.asList(2,2,4,2));
for (int z = 0; z < arr.size(); z++) {
if(!arr.get(z++).equals(arr.get(z--))) {
System.out.println("same");
}else {
System.out.println("differnt");
}
}

Put the elements into a Set. If the resulting set has a size of 1, then all elements have been the same. One line of code, no loops, no indices, works with every collection:
boolean allTheSame = new HashSet<Integer>(list).size() == 1;
System.out.println(allTheSame ? "same" : "different");
(Edited:)
It might be worth noting that if the list is large, and likely contains many different elements, then constructing a Set will impose some memory overhead that can be avoided, if desired. In this case, you'd iterate over the list and compare all elements to the first one. But you should not check the elements for identity with ==. Instead, you should compare them using their equals method, or, if you graciously want to handle null entries, using Objects#equals.
An example of how to solve this efficiently and generically is given in the answer by Zabuza

There are various solutions to this.
Compare any with others
You just need to pick any element (the first, for example) and then compare this to all other elements. A single simple loop is enough:
public static <E> boolean areElementsEquals(List<E> list) {
// Edge cases
if (list == null || list.size() <= 1) {
return true;
}
// Pick any element
E any = list.get(0);
// Compare against others
for (E other : list) {
// Use Objects#equals for null-safety
if (!Objects.equals(any, other)) {
return false;
}
}
return true;
}
Or a Stream-API version:
return list.stream()
.allMatch(other -> Objects.equals(any, other));
If you checked that any is not null, you could also use a method reference:
return list.stream()
.allMatch(any::equals);
Set
Sets do not have duplicates. You can put all your elements into a Set and check if the size is 1, then all other elements were duplicates.
return new HashSet<>(list).size() == 1;
While this code is pretty compact, I would favor the more straightforward solution of iterating. It is a bit more readable and also more efficient, since it does not have the additional overhead of setting up a set.
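One behavioural difference between the two variants is worth seeing side by side. The sketch below (AllSameDemo and its method names are made up for illustration) wraps the stream version and the Set version in methods; for an empty list the first returns true while the second returns false, because an empty set has size 0.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;

class AllSameDemo {
    // Stream variant: an empty list counts as "all the same".
    static <E> boolean allSameByStream(List<E> list) {
        if (list.isEmpty()) {
            return true;
        }
        E any = list.get(0);
        return list.stream().allMatch(other -> Objects.equals(any, other));
    }

    // Set variant: an empty list yields a set of size 0, so this returns false.
    static <E> boolean allSameBySet(List<E> list) {
        return new HashSet<>(list).size() == 1;
    }

    public static void main(String[] args) {
        System.out.println(allSameByStream(Arrays.asList(2, 2, 2)));             // true
        System.out.println(allSameBySet(Arrays.asList(2, 2, 4, 2)));             // false
        System.out.println(allSameByStream(Collections.<Integer>emptyList()));   // true
        System.out.println(allSameBySet(Collections.<Integer>emptyList()));      // false
    }
}
```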

You only have to compare the 1st item against all the others:
int a = arr.get(0);
boolean allSame = true;
for (int z = 1; z < arr.size(); z++) {
allSame = (a == arr.get(z));
if (!allSame) break;
}
if (allSame)
System.out.println("Same");
else
System.out.println("Different");

. . . and does your code work? What sort of output do you get? Are you suffering any exceptions?
Don't declare your List as ArrayList; declare it as List. Don't call a List arr; it isn't an array. Call it numbers or something like that.
Why have you got the bang sign/not operator in line 3? I think that shouldn't be there.
If you think about the different kinds of collection/data structure available, which you can read about here, you will find a collection type whose size() method will tell you how many distinct elements you have.

You just have to compare the current element with the next, if they are different that means you don't have all elements the same:
for(int i = 0; i < list.size() - 1; i++) {
if (!list.get(i).equals(list.get(i + 1))) { // compare by value, not by reference
return false; // elements are different
}
}
return true; // all elements are the same
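When the list holds boxed values such as Integer, the neighbour comparison should go through equals(), because != on two Integer objects compares references. A minimal sketch, assuming a List<Integer> without null elements (AdjacentEqualsDemo is a made-up name):

```java
import java.util.Arrays;
import java.util.List;

class AdjacentEqualsDemo {
    // Compares neighbours by value; returns true for empty and single-element lists.
    static boolean allSame(List<Integer> list) {
        for (int i = 0; i < list.size() - 1; i++) {
            if (!list.get(i).equals(list.get(i + 1))) {
                return false; // elements are different
            }
        }
        return true; // all elements are the same
    }

    public static void main(String[] args) {
        // 1000 lies outside the range of boxed values that is guaranteed to be cached,
        // so a reference comparison of these elements could report a difference.
        List<Integer> big = Arrays.asList(1000, 1000, 1000);
        System.out.println(allSame(big));                    // true
        System.out.println(allSame(Arrays.asList(1000, 7))); // false
    }
}
```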

Try this :
Integer first = arr.get(0); // arr is an ArrayList<Integer>
boolean allTheSame = true;
if (arr.size() > 1) {
for (int z = 1; z < arr.size(); z++) {
if (!arr.get(z).equals(first)) {
allTheSame = false;
break;
}
}
}

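This method uses a BitSet to decide whether all elements in the list are the same. For small non-negative values it can need less memory and run faster than a HashSet. Note that BitSet.set throws IndexOutOfBoundsException for a negative index, so it only works for non-negative integers.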
public static boolean areAllElementsSame(List<Integer> numbers) {
BitSet set = new BitSet();
numbers.forEach(new Consumer<Integer>() {
@Override
public void accept(Integer integer) {
set.set(integer);
}
});
return set.cardinality() == 1;
}
This method can also be used to find out how many distinct elements there are (set.cardinality()).

same is a flag that stores the result.
uv is the reference value (the first element) that every other element is compared against.
Object is the type of the objects stored in the list (the ArrayList).
import java.util.*;
class Main{
public static void main(String args[]){
ArrayList<Integer> arr = new ArrayList<>(Arrays.asList(2,2,2,2));
boolean same=true;
Object uv=arr.get(0);
for (Object i: arr){
if(!i.equals(uv)){
same=false;
break;
}
}
System.out.print("Result:"+same);
}
}

You will have to check for each element, if all the elements on later indexes are same as that one or different than it.
You can do it using a nested loop like this:
public static void main(String[] args) {
// write your code here
ArrayList<Integer> arr = new ArrayList<Integer>(Arrays.asList(2,2,4,2));
boolean result=true;
for (int i = 0; i < arr.size(); i++) {
for (int j=i; j<arr.size(); j++){
if (!arr.get(i).equals(arr.get(j))){
result=false;
}
}
}
System.out.println(result);
}
the 2nd loop starts from j=i and goes till the right end of the array because you don't need to check the left side of that index as it is already checked in the previous iterations and the result would already have been updated to false.

If you want to ensure that the list contains at least two different elements, you have to "walk" the array once: you compare the first element against all others, and stop on the first mismatch. On mismatch: not all elements are the same, otherwise they are all the same!
But the initial question was a bit unclear. If you want to determine whether there are no two equal elements in the array, you have to compare all entries against all others! Then you need two loops: you pick all elements in order and compare each one to all others (or rather to all following ones: you have already compared slot 1 to all other slots, so you only have to compare slot 2 to slot 3 ... until the end).
Another approach would be to use a Set implementation, for example HashSet! Sets have unique members. So when you turn your list into a set, and the set has less entries than the list, you know that the list contains duplicates.
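The Set idea can be sketched in two ways (DuplicateCheckDemo and both method names are hypothetical): a size comparison, and an early-exit loop that relies on Set.add returning false for an element that is already present.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DuplicateCheckDemo {
    // A set drops repeated elements, so a smaller set means the list had duplicates.
    static <E> boolean hasDuplicates(List<E> list) {
        return new HashSet<>(list).size() < list.size();
    }

    // Stops at the first repeated element instead of copying the whole list.
    static <E> boolean hasDuplicatesEarlyExit(List<E> list) {
        Set<E> seen = new HashSet<>();
        for (E element : list) {
            if (!seen.add(element)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicates(Arrays.asList(2, 2, 4, 2)));       // true
        System.out.println(hasDuplicatesEarlyExit(Arrays.asList(1, 2, 3))); // false
    }
}
```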

How to check multiple contains operations faster?

I have a String list as below. I want to do some calculations based on if this list has multiple elements with same value.
I got nearly 120k elements and when I run this code it runs too slow. Is there any faster approach than contains method?
List<String> words= getWordsFromDB(); //words list has nearly 120k elements
List<String> tempWordsList = new LinkedList<String>(); //empty list
String[] keys = getKeysFromDB();
List<String> tempKeysList = new LinkedList<String>();
for (int x = 0; x < words.size(); x++) {
if (!tempWordsList.contains(words.get(x))) {
tempWordsList.add(words.get(x));
String key= keys[x];
tempKeysList.add(key);
} else {
int index = tempWordsList.indexOf(words.get(x));
String m = tempKeysList.get(index);
String n = keys[x];
if (!m.contains(n)) {
String newWord = m + ", " + n;
tempKeysList.set(index, newWord);
}
}
}
EDIT: words list comes from database and problem is there is a service continuously updating and inserting data to this table. I don't have any access to this service and there are other applications who is using the same table.
EDIT2: I have updated for full code.

You are searching the list twice per word: once for contains() and once for indexOf(). You could replace contains() with indexOf() and test the result for -1; otherwise reuse the result instead of calling indexOf() again. But you are almost certainly using the wrong data structure. What exactly do you need the index a for? Do you need a at all? I would use a HashSet, or a HashMap if you need to associate other data with each word.
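A minimal sketch of the HashMap idea, assuming the goal is to collect the keys that belong to each word (GroupKeysDemo and group are made-up names, and the sample data is invented). One difference from the question's code is labeled here on purpose: the question skips a key when it is a substring of the joined text, while this sketch skips a key only when the exact same key was already recorded for that word.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class GroupKeysDemo {
    // Groups keys[i] under the i-th word, keeping first-seen order of words and keys.
    static Map<String, Set<String>> group(List<String> words, String[] keys) {
        Map<String, Set<String>> keysByWord = new LinkedHashMap<>();
        int i = 0;
        for (String word : words) {
            // One hash lookup per word instead of contains() plus indexOf() on a list.
            keysByWord.computeIfAbsent(word, w -> new LinkedHashSet<>()).add(keys[i++]);
        }
        return keysByWord;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "pear", "apple", "apple");
        String[] keys = {"k1", "k2", "k3", "k1"};
        for (Map.Entry<String, Set<String>> entry : group(words, keys).entrySet()) {
            System.out.println(entry.getKey() + " -> " + String.join(", ", entry.getValue()));
        }
        // prints:
        // apple -> k1, k3
        // pear -> k2
    }
}
```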

//1) if you can avoid using linked list use below solution
List<String> words= getWordsFromDB(); //words list has nearly 120k elements
//if you can avoid using linked list, use set instead
Set<String> set=new HashSet<>();
for (String s:words) {
if (!set.add(s)) {
//do some calculations
}
}
//2) if you can't avoid using linked list use below code
List<String> words= getWordsFromDB(); //words list has nearly 120k elements
List<String> tempList = new LinkedList<String>(); //empty list
//if you can't avoid the LinkedList (tempList), you need to use a set as well
Set<String> set=new HashSet<>();
for (String s:words) {
if (set.add(s)) {
tempList.add(s);
} else {
int a = tempList.indexOf(s);
//do some calculations
}
}

LinkedList.get() runs in O(N) time. Either use ArrayList with O(1) lookup time, or avoid indexed lookups altogether by using an iterator:
for (String word : words) {
if (!tempList.contains(word)) {
tempList.add(word);
} else {
int firstIndex = tempList.indexOf(word);
//do some calculations
}
}
Disclaimer: The above was written under the questionable assumption that words is a LinkedList. I would still recommend the enhanced-for loop, since it's more conventional and its time complexity is not implementation-dependent. Either way, the suggestion below still stands.
You can further improve by replacing tempList with a HashMap. This will avoid the O(N) cost of contains() and indexOf():
Map<String, Integer> indexes = new HashMap<>();
int index = 0;
for (String word : words) {
Integer firstIndex = indexes.putIfAbsent(word, index++);
if (firstIndex != null) {
//do some calculations
}
}
Based on your latest update, it looks like you're trying to group "keys" by their corresponding "word". If so, you might give streams a spin:
List<String> words = getWordsFromDB();
String[] keys = getKeysFromDB();
Collection<String> groupedKeys = IntStream.range(0, words.size())
.boxed()
.collect(Collectors.groupingBy(
words::get,
LinkedHashMap::new, // if word order is significant
Collectors.mapping(
i -> keys[i],
Collectors.joining(", "))))
.values();
However, as mentioned in the comments, it would probably be best to move this logic into your database query.

Actually, tempList is using methods with linear time complexity:
if (!tempList.contains(words.get(x))) {
and
int a = tempList.indexOf(words.get(x));
This means that each invocation iterates over half of the list on average.
Besides, the two calls are redundant.
Calling indexOf() alone is enough:
for (int x = 0; x < words.size(); x++) {
int indexWord = tempList.indexOf(words.get(x));
if (indexWord == -1) {
tempList.add(words.get(x));
} else {
//do some calculations by using indexWord
}
}
But to speed up all of these accesses, you should change your data structure: wrap or replace the LinkedList with a LinkedHashSet.
LinkedHashSet would keep the current behavior because, like a List, it defines an iteration order (the order in which elements were inserted into the set), and it also uses hashing to speed up access to its elements.

java.util.ArrayList$Itr.checkForComodification exception thrown

When running the code below I get the above exception, but I don't know why or how to fix it.
I'm pretty sure it comes from
for(int node : adjacent(currentnode))
{
//System.out.println(adjacent(currentnode));
//System.out.println(node);
if (remainingnodes.contains(getNode(node)))
{
adjacent.add(node);
remainingnodes.remove(getNode(node));
//System.out.println(remainingnodes);
}
}
getNode just takes an integer and returns the corresponding node. I did not get the exception before I used getNode in remainingnodes.contains, but at that time it was removing the components, so I had to change it, and now I get the exception.
public int distance(int target, List<Integer> detectives)
{
List<Integer> adjacent = new ArrayList<>();
Set<Node<Integer>> remainingnodes = new HashSet<Node<Integer>>();
List<Integer> currentnodes = new ArrayList<>();
int distance = 0;
int i = 0;
currentnodes.add(target);
remainingnodes = graph.getNodes();
remainingnodes.remove(getNode(target));
while (detectives.size() != 0)
{
for (int currentnode : currentnodes)
{
for(int node : adjacent(currentnode))
{
//System.out.println(adjacent(currentnode));
//System.out.println(node);
if (remainingnodes.contains(getNode(node)))
{
adjacent.add(node);
remainingnodes.remove(getNode(node));
//System.out.println(remainingnodes);
}
}
for (int detective : detectives)
{
if (currentnode == detective)
{
distance = distance + i;
detectives.remove(detective);
}
}
}
currentnodes.clear();
currentnodes = adjacent;
i++;
}
Thanks
Arthur

You can't add to or remove from a List inside a for-each loop over that same list. If you want to remove elements during the loop, use an Iterator: iterator.remove() deletes the element most recently returned by next().

Because you are iterating over a list and removing elements from that same list, a ConcurrentModificationException occurs.
Instead, iterate over the list first, note the position of the element you want to remove in a temporary variable, and remove that element by its stored position once the iteration is complete.

Your assumption is correct. While iterating over a List, you cannot alter it, like you are doing when calling adjacent.add(node) inside the for loop.
This is called concurrent modification, thus the exception ConcurrentModificationException.
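A minimal sketch of removing elements safely while iterating, assuming a List<Integer> without null elements (SafeRemovalDemo and removeAllEqual are hypothetical names). It also sidesteps a second pitfall in code like detectives.remove(detective): when the argument is an int, List.remove(int) removes the element at that index rather than the element with that value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SafeRemovalDemo {
    // Removes every element equal to target while iterating, via Iterator.remove().
    static void removeAllEqual(List<Integer> values, int target) {
        Iterator<Integer> it = values.iterator();
        while (it.hasNext()) {
            if (it.next() == target) { // the Integer is unboxed, so this compares by value
                it.remove();           // safe: the iterator performs the removal itself
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> detectives = new ArrayList<>(Arrays.asList(5, 12, 5, 16));
        removeAllEqual(detectives, 5);
        System.out.println(detectives); // prints [12, 16]
    }
}
```

For additions, the usual counterpart is to collect new elements in a separate list and swap it in after the loop, rather than adding to the list that is being iterated.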

Linked List Constructor with initial value

I'm working on a linked list project, and I'm having a lot of trouble with the constructor.
I have already implemented the default constructor (it creates an empty list, i.e. data = null, size = 0), but the other constructor is really confusing me.
I want to implement a constructor that creates a linked list with values/elements in it (String[]). My first thought was: "Piece of cake, all I have to do is:
Use the default constructor to create an empty linked list
Use a for-each loop within a for loop.
The for-each loop is to iterate the string array and add them to my empty linked list.
The for loop is needed to keep a track of the index."
Here is my Code:
public LinkedList(String[] data)
{
LinkedList l = new LinkedList();
for (int i = 0; i <= data.length; i++)
{
for (String d : data)
{
l.add(d, i);
i++;
}
}
}
I tested my code using this constructor, but it does not work.
I know there is a silly mistake somewhere, but I can't see it.

Well you're not really referring to "this" anymore in the constructor you've written. You create a linked list l and modify that one, but you never actually work on "this". Also I agree with the others, the second for loop is unnecessary.
This also lets you use this(), which is a cool functionality to get to know. Helps you keep your code DRY and bug free.
public LinkedList(String[] data){
this(); //Call the default constructor to set up default properties
for (String d : data){
add(d); //Call on this
}
}
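Since the question's own LinkedList class is not shown, here is a self-contained sketch of the same pattern on a made-up class, SimpleLinkedList, with an assumed add(String) that appends to the end. The array constructor calls this() and then adds to the list being constructed, not to a separate local list.

```java
class SimpleLinkedList {
    private static class Node {
        final String value;
        Node next;

        Node(String value) {
            this.value = value;
        }
    }

    private Node head;
    private Node tail;
    private int size;

    // Default constructor: creates an empty list.
    SimpleLinkedList() {
        head = null;
        tail = null;
        size = 0;
    }

    // Delegates to the default constructor, then appends each element to this list.
    SimpleLinkedList(String[] data) {
        this();
        for (String d : data) {
            add(d);
        }
    }

    // Appends a value to the end of the list.
    void add(String value) {
        Node node = new Node(value);
        if (head == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
        size++;
    }

    int size() {
        return size;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("[");
        for (Node n = head; n != null; n = n.next) {
            sb.append(n.value);
            if (n.next != null) {
                sb.append(", ");
            }
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        SimpleLinkedList list = new SimpleLinkedList(new String[] {"foo", "bar", "baz"});
        System.out.println(list + " size=" + list.size()); // prints [foo, bar, baz] size=3
    }
}
```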

You need to change the arguments to the add method, it expects the index to be the first argument. See http://docs.oracle.com/javase/7/docs/api/java/util/LinkedList.html#add(int,%20E).
for (String d : data)
{
l.add(i, d);
i++;
}

Another way of doing this would be to just add l.add(d) in the loop. With this new elements are guaranteed to be inserted at the end of the list.
I would use l.add(i, d) when I want to insert specifically at a given location.

Does your class extend LinkedList? If so, here is what I would do:
public class MyLinkedList extends LinkedList<String> {
...
public MyLinkedList(String... array) {
super();
if (array != null && array.length > 0) {
for (String s : array) {
add(s);
}
}
}
...
}
It isn't a great idea to extend LinkedList. If you want an easy way to create a new LinkedList with elements use the following method:
public static <E> LinkedList<E> newLinkedList(
@SuppressWarnings("unchecked") final E... elements) {
final LinkedList<E> list = new LinkedList<E>();
Collections.addAll(list, elements);
return list;
}
....
LinkedList<String> yourList = newLinkedList("foo", "bar", "baz");

Trying to remove the first occurrence of the number in the ArrayList

public static void main(String [] args){
Scanner input = new Scanner(System.in);
System.out.println("Enter some numbers (all on one line, separated by spaces):");
String line = input.nextLine();
String[] numbers = line.split(" +");
ArrayList<Integer> a = new ArrayList<Integer>();
for(int i=0; i<numbers.length; i++)
a.add(new Integer(numbers[i]));
System.out.println("The numbers are stored in an ArrayList");
System.out.println("The ArrayList is "+a);
System.out.print("\nEnter a number: ");
int p = input.nextInt();
System.out.println(removeNumber(a,p));
System.out.println(removeNumber2(a,p));
}
public static <T> ArrayList<T> removeNumber(ArrayList<T> a, Integer e)
{
ArrayList<T> b = new ArrayList<T>();
for(int i = 0; i< a.size();i++)
{
if (a.get(i).equals(e))
a.remove(e);
}
return a;
}
For example, if ex.value is 4, I want to remove 4 from the ArrayList. If my ArrayList contains [5,12,4,16,4], I want to remove the first occurrence of 4 from it and save the result to another ArrayList.
I don't want to use Iterators.

Without using an iterator, here's what you could do to fix your code :
for(int i = 0; i< a.size();i++) {
if (a.get(i).equals(e.value)) {
a.remove(e.value);
i--;
}
}
Beyond the change of == to equals, you have to decrement i whenever you remove an element from the ArrayList. The reason for that is that removing the ith element from an ArrayList decreases the indices of all the elements that follow it by one. Therefore, the i+1th element will become the new ith element, so you must decrement i in order not to skip the next element.
EDIT: For some reason I was sure you wanted to remove all occurrences of the number from the list, and not just the first one. If you only want to remove one element from the list, you don't have to worry about iterating over the rest of the list after removing that element.

You cannot safely modify an ArrayList while iterating over it with a for-each loop; use an explicit Iterator instead.
Do it like this:
for(Iterator<T> i = a.iterator(); i.hasNext();)
{
if (i.next().equals(e.value))
{
i.remove();
}
}
BTW, your b ArrayList is useless.

Along with the suggestion of using iterators, you should not be using the operator == which checks for reference equality. This will always be false in your scenario. You want to use the equals method instead which checks for value equality.
if (a.get(i).equals((Integer) e.value))

You don't need to iterate through the list, either with an iterator or an index. ArrayList has a remove method that searches for and removes an element, and returns true or false depending on whether it found an element to remove. So you can say
while (true) {
boolean found = a.remove(e.value);
if (!found) {
break;
}
}
or
boolean found;
do {
found = a.remove(e.value);
} while(found);
or if you value compactness over readability,
do { } while(a.remove(e.value));
Note that a.remove(e), as you have in your original code, won't work at all. The ArrayList is an ArrayList of Integer, but e is an EX6. Thus, a.remove(e) won't find e in the list at all, since it isn't even the correct type. It should compile fine, since remove is defined to allow any Object as a parameter, but it will never find anything (since the equals method of Integer always returns false for a non-Integer).
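For the stated goal (remove only the first occurrence and keep the result in another list), no loop is needed at all. A minimal sketch, assuming a plain int value rather than the question's own wrapper type (RemoveFirstDemo and withoutFirst are made-up names): remove(Object) deletes the first element that equals() the argument, whereas remove(int) would delete by index, so the value is boxed explicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveFirstDemo {
    // Returns a copy of source with the first element equal to value removed.
    static List<Integer> withoutFirst(List<Integer> source, int value) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(Integer.valueOf(value)); // remove(Object): first match by equals()
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(5, 12, 4, 16, 4);
        System.out.println(withoutFirst(a, 4)); // prints [5, 12, 16, 4]
        System.out.println(a);                  // prints [5, 12, 4, 16, 4]
    }
}
```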
