removing duplicates from an arraylist - java

I am trying to remove duplicate objects from an ArrayList.
See the code below:
ArrayList<Customer> customers=new ArrayList<Customer>();
for(int i=0;i<accounts.size();i++){
customers.add(accounts.get(i).getCustomer());
}
for(int i=0;i<customers.size();i++){
for(int j=i+1;j<customers.size();j++){
if(customers.get(i).getSocialSecurityNo().compareTo(customers.get(j).getSocialSecurityNo())==0){
if(customers.get(i).getLastName().compareToIgnoreCase(customers.get(j).getLastName())==0){
if(customers.get(i).getFirstName().compareToIgnoreCase(customers.get(j).getFirstName())==0){
customers.remove(j);
}
}
}
}
}
However, it seems that the last object in the list is not being processed. Can someone pinpoint the error?

Try adding j--; after removing an item. That will reindex for you and solve your issue.

The basic flaw is that, since the ArrayList is mutable, once you remove one element the indexes of the elements after it shift down by one, so your loop index has to be readjusted.
if(customers.get(i).getFirstName().compareToIgnoreCase(customers.get(j).getFirstName())==0){
customers.remove(j--);
}
Also try subtracting one from the upper bound of your i loop:
for(int i=0;i<customers.size()-1;i++){
for(int j=i+1;j<customers.size();j++){
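
To see the index adjustment in isolation, here is a minimal sketch that uses plain strings instead of Customer objects. The class and method names are made up for illustration, and it assumes the elements have a usable equals().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the asker's code.
class IndexedDedupSketch {
    // Removes later duplicates in place and keeps the first occurrence.
    static <T> void removeDuplicatesInPlace(List<T> list) {
        for (int i = 0; i < list.size() - 1; i++) {
            for (int j = i + 1; j < list.size(); j++) {
                if (list.get(i).equals(list.get(j))) {
                    // After remove(j) the next element slides into slot j,
                    // so step j back to examine that slot again.
                    list.remove(j--);
                }
            }
        }
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>(Arrays.asList("a", "a", "a", "b", "b"));
        removeDuplicatesInPlace(ids);
        System.out.println(ids); // prints [a, b]
    }
}
```

Without the j-- the same input would leave a second "a" behind, because the element that moves into the freed slot is never compared.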

public static void removeDuplicates(ArrayList list) {
HashSet set = new HashSet(list);
list.clear();
list.addAll(set);
}
Override equals and hashCode appropriately.

customers = new ArrayList<Customer>(new HashSet<Customer>(customers));
Ensure that the equals and hashCode methods are correctly implemented.

The code below worked for me. Give it a try. You can manipulate the compare method to suit your taste
ArrayList<Customer> customers = .....;
Set<Customer> customerlist = new TreeSet<Customer>(new Comparator<Customer>(){
@Override
public int compare(Customer c1, Customer c2) {
return c1.getSocialSecurityNo().compareTo(c2.getSocialSecurityNo());
}
});
customerlist.addAll(customers);
customers.clear();
customers.addAll(customerlist);

It's your int j=i+1 that causes trouble. You need to test with the last value of the customers list for each iteration.

Before you add them to the list in the above loop, why don't you check
if(!customers.contains(accounts.get(i).getCustomer()))
{
//add them if it doesn't contain
}
It should save you from doing the second loop
Edit: Need to override the equals method.

So, about doing this right:
Your Customer objects should have an equals() and hashCode() method, which do the comparison. (Or you simply would have only one Customer object for each customer, which would mean your data model would have to be adjusted. Then the default hashCode/equals would do.)
If you have this, you can replace your three nested ifs with one:
if(customers.get(i).equals(customers.get(j))) {
customers.remove(j);
}
This does not yet solve your problem, but it makes the problem easier to see. If
you look at which objects are compared with which others, you will see that after each removal
of an object from the list, the next one takes the index of the one you just removed,
so the current object is never compared with it. As said, j-- after the removal will solve this.
A more performant solution would be using a Set (which is guaranteed not to contain duplicates).
In your case, a HashSet<Customer> or LinkedHashSet<Customer> (if you care about the order)
will do fine.
Then your whole code comes down to this:
Set<Customer> customerSet = new HashSet<Customer>();
for(Account acc : accounts){
customerSet.add(acc.getCustomer());
}
List<Customer> customers = new ArrayList<Customer>(customerSet);
If you don't really need a list (i.e. indexed access), omit the last line and simply
use the set instead.
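
For the set-based approach to work, equals() and hashCode() have to agree with each other. Here is a rough sketch of what that could look like. CustomerSketch is a hypothetical stand-in for the asker's Customer class, and lower-casing the names with Locale.ROOT is an assumption that only approximates the compareToIgnoreCase checks in the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Objects;

// Hypothetical stand-in for the asker's Customer class.
final class CustomerSketch {
    final String ssn;
    final String firstName;
    final String lastName;
    // Normalized copies, so equals() and hashCode() use exactly the same data.
    private final String firstKey;
    private final String lastKey;

    CustomerSketch(String ssn, String firstName, String lastName) {
        this.ssn = ssn;
        this.firstName = firstName;
        this.lastName = lastName;
        this.firstKey = firstName.toLowerCase(Locale.ROOT);
        this.lastKey = lastName.toLowerCase(Locale.ROOT);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CustomerSketch)) return false;
        CustomerSketch other = (CustomerSketch) o;
        return ssn.equals(other.ssn)
                && firstKey.equals(other.firstKey)
                && lastKey.equals(other.lastKey);
    }

    @Override
    public int hashCode() {
        return Objects.hash(ssn, firstKey, lastKey);
    }

    // LinkedHashSet drops duplicates and keeps first-seen order.
    static List<CustomerSketch> distinct(List<CustomerSketch> customers) {
        return new ArrayList<>(new LinkedHashSet<>(customers));
    }
}
```

Calling CustomerSketch.distinct(list) returns a new list and leaves the input untouched. If order does not matter, a plain HashSet would do the same job.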

My first thought was to use Sets, as others have mentioned. Another approach would be to use Java's version of the foreach, instead of using indexes. A general approach:
public static ArrayList removeDuplicates(ArrayList origList) {
ArrayList newList = new ArrayList();
for (Object m : origList) {
if (!newList.contains(m)) {
newList.add(m);
}
}
return newList;
}
In testing, I just used Strings; I'd recommend inserting Customer into the code where appropriate for type safety.

Related

Java, Sorting an ArrayList by using Comparator

While using a Comparator to sort a library by the authors' names, I coincidentally "found" something that works perfectly, but I don't understand why. First, please have a look at my code:
public class Bookshelf{
Collection<Literature> shelf = new ArrayList<Literature>();
ArrayList<Literature> unsorted = (ArrayList<Literature>)shelf;
public void printShelf() {
Comparator<Literature> compareBySurname= new Comparator<Literature>() {
@Override
public int compare(Literature o1, Literature o2) {
return o1.author.surname.compareTo(o2.author.surname);
}
};
unsorted.sort(compareBySurname);
for (Literature c : shelf)
System.out.println(c);
}
}
As you can see, I am sorting the ArrayList "unsorted". But after I sort it, I am iterating through the Collection "shelf" and printing the elements of the Collection "shelf". And the output is a list of elements sorted by surname.
To achieve what I intended, I should actually iterate through the ArrayList "unsorted" and print its elements (of course this option works too). So my question is: why does the first method work too? :D I am not sorting the Collection "shelf" directly, but I still get a sorted list.
Thanks in advance!

ArrayList<Literature> unsorted = (ArrayList<Literature>)shelf; does not create a new ArrayList. It simply makes unsorted refer to the same ArrayList as shelf. They are not different objects. You want something like
ArrayList<Literature> unsorted = new ArrayList<>(shelf); // <-- a different List.

Because both variables refer to the same list object when you assign one to the other with the "=" operator. To get a separate list, you must create one with the keyword "new".
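
A small sketch of the difference between casting and copying, using strings in place of Literature objects. The class and method names here are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class, not taken from the question.
class AliasSketch {
    // Takes a copy first, then sorts through a cast reference.
    // Returns the copy so the caller can compare it with the original.
    static List<String> copyThenSortAlias(Collection<String> shelf) {
        // The copy constructor builds an independent list with the same elements.
        List<String> copy = new ArrayList<>(shelf);
        // A cast only yields another reference to the same object
        // (this assumes shelf really is an ArrayList).
        ArrayList<String> alias = (ArrayList<String>) shelf;
        alias.sort(Comparator.naturalOrder());
        return copy;
    }

    public static void main(String[] args) {
        Collection<String> shelf = new ArrayList<>();
        shelf.add("Tolkien");
        shelf.add("Austen");
        List<String> copy = copyThenSortAlias(shelf);
        System.out.println(shelf); // prints [Austen, Tolkien]
        System.out.println(copy);  // prints [Tolkien, Austen]
    }
}
```

Sorting through the cast reference reorders shelf itself, while the copy keeps the original order.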

Using Set<Integer> to remove indexes in an List<String>

I have an ArrayList of strings TrackArray which define unique ids of tracks.
I also have a Set<Integer> indexSet to save the indexes of the tracks ids I need to remove from the TrackArray.
I tried the following:
public void deleteAllTracks(){
if (!indexSet.isEmpty() ) {
TrackArray.removeAll(indexSet);
}
notifyDataSetChanged();
indexSet.clear();
}
The code is not working, probably because it doesn't cast the Integer into int in removeAll. I haven't found another workaround except deleting one by one.

The ideal solution would be to change indexSet from Set<Integer> to a Set<String>. Otherwise you can't use it to remove elements from your ArrayList<String>.
If you can't change it, you can convert it to a Set<String> with:
Set<String> strIndexSet = indexSet.stream().map(Integer::toString).collect(Collectors.toSet());
So you can write:
public void deleteAllTracks(){
if (!indexSet.isEmpty() ) {
TrackArray.removeAll(indexSet.stream().map(Integer::toString).collect(Collectors.toSet()));
}
notifyDataSetChanged();
indexSet.clear();
}
However, it would be less efficient to run this conversion every time your method is called.
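
If the integers really are positions in the list rather than ids in string form, a different technique from the conversion above is to remove by index, highest index first. This is only a sketch under that assumption. The class and method names are invented, and it also assumes every index is valid for the list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the original question or answer.
class IndexRemovalSketch {
    static void removeAtIndexes(List<String> tracks, Set<Integer> indexes) {
        // Work from the highest index down so earlier removals
        // do not shift the positions that are still to be removed.
        List<Integer> descending = new ArrayList<>(indexes);
        descending.sort(Collections.reverseOrder());
        for (Integer index : descending) {
            // intValue() selects remove(int index); passing the Integer itself
            // would call remove(Object) and find nothing in a List<String>.
            tracks.remove(index.intValue());
        }
    }
}
```

The overload choice is the subtle part: on a List<String>, remove(Integer) compiles but looks for an equal element instead of a position.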

Adding items to empty List at specific locations in java

Is there any way I can make the code below work while leaving the 3rd line commented out?
List<Integer> list = new ArrayList<Integer>();
list.add(0,0);
//list.add(1,null);
list.add(2,2);
I want to add items to the list at specific locations. But unless the list already has N elements, I am not able to add at the Nth position, as explained in this answer.
I can't use a map because I don't want to miss a value when the keys are same. Also adding null values to a list for large lists will be an overhead. When there is a collision I want the item to take the next position(nearest to where it should have been).
Is there any List implementation that shifts index before it tries to add the item?

Use something like a MultiMap if your only concern is not "missing a value" if the keys are the same.
I'm not sure how doing a shift/insert helps if I understand your problem statement--if the "key" is the index, inserting will lose the same information.

You can use Vector and call setSize to prepopulate with null elements.
However, your comment about the overhead of the nulls speaks to an associative container as the right solution.

This still smells like you should be using a Map. Why not use a Map<Integer, List<Integer>>?
something like,
private Map<Integer, List<Integer>> myMap = new HashMap<Integer, List<Integer>>();
public void addItem(int key, int value) {
List<Integer> list = myMap.get(key);
if (list == null) {
list = new ArrayList<Integer>();
myMap.put(key, list);
}
list.add(value);
}
public List<Integer> getItems(int key) {
return myMap.get(key);
}
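
On Java 8 or later the same idea can be written a little more compactly with computeIfAbsent. This is a sketch rather than a drop-in replacement. The class name is made up, and returning an empty list for a missing key is a design choice that the code above does not make (it returns null).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper class for illustration.
class MultiValueSketch {
    private final Map<Integer, List<Integer>> items = new HashMap<>();

    void addItem(int key, int value) {
        // Creates the per-key list on first use, then appends to it.
        items.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    List<Integer> getItems(int key) {
        // An empty list instead of null when nothing was stored under the key.
        return items.getOrDefault(key, Collections.emptyList());
    }
}
```

Values that collide on the same key are kept in insertion order, so nothing is lost when two items want the same position.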

Well, there are a couple of ways I can think of to do this. If you are not adding items too frequently, it might be a good idea to simply check whether there is an item at that location before adding it.
if(list.get(X) == null)
{
list.add(X,Y);
}
Otherwise if you are going to be doing this too often...then I would recommend creating your own custom List class, and extending ArrayList or whatever you are using, and simply override the add method, to deal with collisions.

How to lowercase every element of a collection efficiently?

What's the most efficient way to lower case every element of a List or Set?
My idea for a List:
final List<String> strings = new ArrayList<String>();
strings.add("HELLO");
strings.add("WORLD");
for(int i=0,l=strings.size();i<l;++i)
{
strings.add(strings.remove(0).toLowerCase());
}
Is there a better, faster way? How would this example look like for a Set? As there is currently no method for applying an operation to each element of a Set (or List) can it be done without creating an additional temporary Set?
Something like this would be nice:
Set<String> strings = new HashSet<String>();
strings.apply(
function (element)
{ this.replace(element, element.toLowerCase();) }
);
Thanks,

Yet another solution, but with Java 8 and above:
List<String> result = strings.stream()
.map(String::toLowerCase)
.collect(Collectors.toList());

This seems like a fairly clean solution for lists. It should allow for the particular List implementation being used to provide an implementation that is optimal for both the traversal of the list--in linear time--and the replacing of the string--in constant time.
public static void replace(List<String> strings)
{
ListIterator<String> iterator = strings.listIterator();
while (iterator.hasNext())
{
iterator.set(iterator.next().toLowerCase());
}
}
This is the best that I can come up with for sets. As others have said, the operation cannot be performed in-place in the set for a number of reasons. The lower-case string may need to be placed in a different location in the set than the string it is replacing. Moreover, the lower-case string may not be added to the set at all if it is identical to another lower-case string that has already been added (e.g., "HELLO" and "Hello" will both yield "hello", which will only be added to the set once).
public static void replace(Set<String> strings)
{
String[] stringsArray = strings.toArray(new String[0]);
for (int i=0; i<stringsArray.length; ++i)
{
stringsArray[i] = stringsArray[i].toLowerCase();
}
strings.clear();
strings.addAll(Arrays.asList(stringsArray));
}

You can do this with Google Collections:
Collection<String> lowerCaseStrings = Collections2.transform(strings,
new Function<String, String>() {
public String apply(String str) {
return str.toLowerCase();
}
}
);

If you are fine with changing the input list here is one more way to achieve it.
strings.replaceAll(String::toLowerCase);
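
A self-contained sketch of the in-place variant, assuming Java 8 or later and a modifiable list. The class and method names are invented, and Locale.ROOT is added here so the result does not depend on the machine's default locale.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class.
class LowerCaseListSketch {
    // Replaces every element with its lower-case form, in place.
    static void lowerCaseInPlace(List<String> strings) {
        strings.replaceAll(s -> s.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>(Arrays.asList("HELLO", "WORLD"));
        lowerCaseInPlace(strings);
        System.out.println(strings); // prints [hello, world]
    }
}
```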

Well, there is no real elegant solution due to two facts:
Strings in Java are immutable
Java gives you no real nice map(f, list) function as you have in functional languages.
Asymptotically speaking, you can't get a better run time than your current method. You will have to create a new string using toLowerCase() and you will need to iterate by yourself over the list and generate each new lower-case string, replacing it with the existing one.

Try CollectionUtils#transform in Commons Collections for an in-place solution, or Collections2#transform in Guava if you need a live view.

This is probably faster:
for(int i=0,l=strings.size();i<l;++i)
{
strings.set(i, strings.get(i).toLowerCase());
}

I don't believe it is possible to do the manipulation in place (without creating another Collection) if you change strings to be a Set. This is because you can only iterate over the Set using an iterator or a for each loop, and cannot insert new objects whilst doing so (it throws an exception)
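
Given that limitation, one way around it is to build a new set instead of mutating the old one. The following is a sketch with invented names, assuming Java 8 streams. It uses LinkedHashSet to keep first-seen order, which a plain HashSet would not.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical demo class.
class LowerCaseSetSketch {
    // Returns a new set; entries that differ only by case collapse into one.
    static Set<String> lowerCased(Set<String> strings) {
        return strings.stream()
                .map(s -> s.toLowerCase(Locale.ROOT))
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }
}
```

The result can be smaller than the input: "HELLO" and "Hello" both map to "hello", which is stored once.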

Referring to the ListIterator method in the accepted (Matthew T. Staebler's) solution. How is using the ListIterator better than the method here?
public static Set<String> replace(List<String> strings) {
Set<String> set = new HashSet<>();
for (String s: strings)
set.add(s.toLowerCase());
return set;
}

I was looking for similar stuff, but was stuck because my ArrayList object was not declared as GENERIC and it was available as raw List type object from somewhere. I was just getting an ArrayList object "_products". So, what I did is mentioned below and it worked for me perfectly ::
List<String> dbProducts = _products;
for(int i = 0; i<dbProducts.size(); i++) {
dbProducts.set(i, dbProducts.get(i).toLowerCase());
}
That is, I first took my available _products and made a GENERIC list object (As I were getting only strings in same) then I applied the toLowerCase() method on list elements which was not working previously because of non-generic ArrayList object.
And the method toLowerCase() we are using here is of String class.
String java.lang.String.toLowerCase()
not of ArrayList or Object class.
Please correct me if I'm wrong. Newbie in Java seeks guidance. :)

Using a Java 8 parallel stream, this can be faster for large lists:
List<String> input= new ArrayList<>();
input.add("A");
input.add("B");
input.add("C");
input.add("D");
List<String> output= input.parallelStream()
.map(String::toLowerCase)
.collect(Collectors.toList());

Compare new Integer Objects in ArrayList Question

I am storing Integer objects representing an index of objects I want to track. Later in my code I want to check to see if a particular object's index corresponds to one of those Integers I stored earlier. I am doing this by creating an ArrayList and creating a new Integer from the index of a for loop:
ArrayList<Integer> courseselectItems = new ArrayList<Integer>();
//Find the course elements that are within a courseselect element and add their indices to the ArrayList
for(int i=0; i<numberElementsInNodeList; i++) {
if (nodeList.item(i).getParentNode().getNodeName().equals("courseselect")) {
courseselectItems.add(new Integer(i));
}
}
I then want to check later if the ArrayList contains a particular index:
//Cycle through the namedNodeMap array to find each of the course codes
for(int i=0; i<numberElementsInNodeList; i++) {
if(!courseselectItems.contains(new Integer(i))) {
//Do Stuff
}
}
My question is, when I create a new Integer by using new Integer(i) will I be able to compare integers using ArrayList.contains()? That is to say, when I create a new object using new Integer(i), will that be the same as the previously created Integer object if the int value used to create them are the same?
I hope I didn't make this too unclear. Thanks for the help!

Yes, you can use List.contains() as that uses equals() and an Integer supports that when comparing to other Integers.
Also, because of auto-boxing you can simply write:
List<Integer> list = new ArrayList<Integer>();
...
if (list.contains(37)) { // auto-boxed to Integer
...
}
It's worth mentioning that:
List list = new ArrayList();
list.add(new Integer(37));
if (list.contains(new Long(37))) {
...
}
will always return false because an Integer is not a Long. This trips up most people at some point.
Lastly, try and make your variables that are Java Collections of the interface type not the concrete type so:
List<Integer> courseselectItems = new ArrayList<Integer>();
not
ArrayList<Integer> courseselectItems = new ArrayList<Integer>();

My question is, when I create a new Integer by using new Integer(i) will I be able to compare integers using ArrayList.contains()? That is to say, when I create a new object using new Integer(i), will that be the same as the previously created Integer object if the int value used to create them are the same?
The short answer is yes.
The long answer is ...
That is to say, when I create a new object using new Integer(i), will that be the same as the previously created Integer object if the int value used to create them are the same?
I assume you mean "... will that be the same instance as ..."? The answer to that is no - calling new will always create a distinct instance separate from the previous instance, even if the constructor parameters are identical.
However, despite having separate identity, these two objects will have equivalent value, i.e. calling .equals() between them will return true.
Collection.contains()
It turns out that having separate instances of equivalent value (.equals() returns true) is okay. The .contains() method is in the Collection interface. The Javadoc description for .contains() says:
http://java.sun.com/javase/6/docs/api/java/util/Collection.html#contains(java.lang.Object)
boolean contains(Object o)
Returns true if this collection contains the specified element. More formally, returns true if and only if this collection contains at least one element e such that (o==null ? e==null : o.equals(e)).
Thus, it will do what you want.
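
The contract quoted above can be checked with a few lines. This is only an illustrative sketch with invented class and method names. It uses Integer.valueOf and autoboxing instead of new Integer(...), since that constructor is deprecated in recent Java versions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class.
class ContainsSketch {
    // The int argument is auto-boxed, and contains() compares with equals().
    static boolean hasIndex(List<Integer> indexes, int i) {
        return indexes.contains(i);
    }

    // contains() accepts any Object, so a value of another type is simply not found.
    static boolean holdsEqualValue(List<?> list, Object candidate) {
        return list.contains(candidate);
    }

    public static void main(String[] args) {
        List<Integer> indexes = new ArrayList<>();
        indexes.add(Integer.valueOf(17));
        System.out.println(hasIndex(indexes, 17));                       // prints true
        System.out.println(holdsEqualValue(indexes, Long.valueOf(17)));  // prints false
    }
}
```

The second call returns false because Integer.equals() only accepts another Integer, even when the numeric value matches.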
Data Structure
You should also consider whether you have the right data structure.
Is the list solely about containment? Is the order important? Do you care about duplicates? Since a list is ordered, using a list can imply that your code cares about ordering, or that you need to maintain duplicates in the data structure.
However, if order is not important, if you don't want or won't have duplicates, and if you really only use this data structure to test whether it contains a specific value, then you might want to consider whether you should be using a Set instead.

Short answer is yes, you should be able to do ArrayList.contains(new Integer(14)), for example, to see if 14 is in the list. The reason is that Integer overrides the equals method to compare itself correctly against other instances with the same value.

Yes it will, because List.contains() uses the equals() method of the object to be compared, and Integer.equals() does compare the integer value.

As cletus and DJ mentioned, your approach will work.
I don't know the context of your code, but if you don't care about the particular indices, consider the following style also:
List<Node> courseSelectNodes = new ArrayList<Node>();
//Find the course elements that are within a courseselect element
//and add them to the ArrayList
for(int i=0; i<nodeList.getLength(); i++) {
Node node = nodeList.item(i);
if (node.getParentNode().getNodeName().equals("courseselect")) {
courseSelectNodes.add(node);
}
}
// Do stuff with courseSelectNodes
for(Node node : courseSelectNodes) {
//Do Stuff
}

I'm putting my answer in the form of a (passing) test, as an example of how you might research this yourself. Not to discourage you from using SO - it's great - just to try to promote characterization tests.
import java.util.ArrayList;
import junit.framework.TestCase;
public class ContainsTest extends TestCase {
public void testContains() throws Exception {
ArrayList<Integer> list = new ArrayList<Integer>();
assertFalse(list.contains(new Integer(17)));
list.add(new Integer(17));
assertTrue(list.contains(new Integer(17)));
}
}

Yes, automatic boxing occurs, but this results in a performance penalty. It's not clear from your example why you would want to solve the problem in this manner.
Also, because of boxing, creating the Integer class by hand is superfluous.
