My question is about iteration and performance. Consider the following case:
public class Car {
private String name;
private int type;
private int horsePower;
String getKey() {
return type + "_" + horsePower;
}
private final int NUM_OF_CARS = 50000;
public void test() {
List<Car> cars = new ArrayList<Car>(NUM_OF_CARS);
for (int i = 0; i < NUM_OF_CARS; i++) {
Car c = new Car();
if (i == 0 || i == 176 || i == 895 || i == 1500 || i == 4600) {
c.name = "Audi A4 " + i;
c.type = 1;
c.horsePower = 200;
} else {
c.name = "Default";
c.type = 2 + i;
c.horsePower = 201;
}
cars.add(c);
}
// Matches should contain all Audi's since they have same type and horse
// power
long time = SystemClock.currentThreadTimeMillis();
HashMap<String, List<Car>> map = new HashMap<String, List<Car>>();
for (Car c : cars) {
if (map.get(c.getKey()) != null) {
map.get(c.getKey()).add(c);
} else {
List<Car> list = new ArrayList<Car>();
list.add(c);
map.put(c.getKey(), list);
}
}
Iterator<Entry<String, List<Car>>> iterator = map.entrySet().iterator();
while (iterator.hasNext()) {
if (iterator.next().getValue().size() == 1) {
iterator.remove();
}
}
Log.d("test", String.valueOf((SystemClock.currentThreadTimeMillis() - time)));
}
}
Is this the most efficient way of finding all the Audis here?
This took 1700 ms on my device.
Thanks.
It depends why you're iterating. If you really do need to visit every bottom-level Car, then you don't really have a choice. However, if you're looking for specific matches to the String, then you might consider using a Map.
Why don't you try (Map):
http://docs.oracle.com/javase/7/docs/api/java/util/Map.html
Map is an interface; HashMap is one of its implementations:
http://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html
Here's an example:
Map<String, List<Car>> map = new HashMap<String, List<Car>>();
If you want to look up cars by a String key, use a HashMap. Otherwise, as far as I know, you cannot avoid iterating over the whole list.
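A minimal sketch of the grouping step from the question, written with Map.computeIfAbsent so that the key is built and looked up only once per car. The class name CarGrouping, the helper findMatches and the three-argument Car constructor are assumptions made for this example, not part of the original code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CarGrouping {
    // Hypothetical stand-in for the Car class in the question.
    static final class Car {
        final String name;
        final int type;
        final int horsePower;

        Car(String name, int type, int horsePower) {
            this.name = name;
            this.type = type;
            this.horsePower = horsePower;
        }

        String getKey() {
            return type + "_" + horsePower;
        }
    }

    // Groups the cars by key in one pass, then drops every group with a single car.
    static Map<String, List<Car>> findMatches(List<Car> cars) {
        Map<String, List<Car>> groups = new HashMap<>();
        for (Car c : cars) {
            // One key computation and one map lookup per car.
            groups.computeIfAbsent(c.getKey(), k -> new ArrayList<>()).add(c);
        }
        groups.values().removeIf(list -> list.size() < 2);
        return groups;
    }

    public static void main(String[] args) {
        List<Car> cars = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            boolean audi = (i == 0 || i == 4 || i == 7);
            cars.add(audi ? new Car("Audi A4 " + i, 1, 200)
                          : new Car("Default", 2 + i, 201));
        }
        Map<String, List<Car>> matches = findMatches(cars);
        System.out.println(matches.keySet());            // [1_200]
        System.out.println(matches.get("1_200").size()); // 3
    }
}
```

Whether this is measurably faster than the original loop depends on the device; it mainly removes the repeated getKey() and get() calls.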
Use a hashing collection such as HashSet, which relies on hashCode() and equals() to speed up lookups. How?
You must override hashCode() and equals() in your class. hashCode() returns an integer derived from your object (you are free in how you compute it, but equal objects must return the same value, and different objects should return different values as often as possible).
Conceptually, HashSet then reduces that value modulo the size of its internal table to find the slot for your object. For example, if the table has 5000 slots and hashCode() returns 10252, then 10252 % 5000 = 252, so the object is stored at index 252.
Finally, when you ask whether the set contains a "BMW x6", hashCode() is called on the object you pass in and returns 10252 again, so HashSet only has to look at index 252.
If two objects end up with the same hash code, they are compared with equals().
I hope my explanations were clear. In short: implement hashCode() and equals(), and keep hashCode() cheap and well distributed so that filling and searching the HashSet stays fast.
You will probably also be interested in HashMap, which stores key/value pairs and applies the same hashing mechanism to the keys, so you can find an object by its key.
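Here is a small sketch of what that looks like in code. The class Model and the helper bucketIndex are made up for illustration; bucketIndex only mirrors the simplified modulo arithmetic described above, and the real HashSet implementation differs in detail.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashingSketch {
    // Hypothetical value class: two models are equal when brand and name match.
    static final class Model {
        private final String brand;
        private final String name;

        Model(String brand, String name) {
            this.brand = brand;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Model)) return false;
            Model other = (Model) o;
            return brand.equals(other.brand) && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            // Built from the same fields as equals(), so equal objects hash alike.
            return Objects.hash(brand, name);
        }
    }

    // Simplified slot selection: hash code modulo table size.
    static int bucketIndex(int hash, int tableSize) {
        return Math.floorMod(hash, tableSize);
    }

    public static void main(String[] args) {
        Set<Model> models = new HashSet<>();
        models.add(new Model("BMW", "x6"));
        System.out.println(models.contains(new Model("BMW", "x6"))); // true
        System.out.println(models.contains(new Model("BMW", "x5"))); // false
        System.out.println(bucketIndex(10252, 5000));                // 252
    }
}
```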
Related
How could I go about detecting (returning true/false) whether an ArrayList contains more than one of the same element in Java?
Many thanks,
Terry
Edit
Forgot to mention that I am not looking to compare "Blocks" with each other but their integer values. Each "block" has an int and this is what makes them different.
I find the int of a particular Block by calling a method named "getNum" (e.g. table1[0][2].getNum()).
Simplest: dump the whole collection into a Set (using the HashSet(Collection) constructor or Set.addAll), then see if the Set has the same size as the ArrayList.
List<Integer> list = ...;
Set<Integer> set = new HashSet<Integer>(list);
if(set.size() < list.size()){
/* There are duplicates */
}
Update: If I'm understanding your question correctly, you have a 2d array of Block, as in
Block table[][];
and you want to detect if any row of them has duplicates?
In that case, I would do the following, assuming that Block implements "equals" and "hashCode" correctly:
for (Block[] row : table) {
    Set<Block> set = new HashSet<Block>();
    for (Block cell : row) {
        set.add(cell);
    }
    if (set.size() < row.length) { // this row has a duplicate
    }
}
I'm not 100% sure of that for-each syntax, so it might be safer to write it as
for (int i = 0; i < 6; i++) {
    Set<Block> set = new HashSet<Block>();
    for (int j = 0; j < 6; j++)
        set.add(table[i][j]);
    ...
Set.add returns false if the item being added is already in the set, so you could even short-circuit and bail out on the first add that returns false, if all you want to know is whether there are any duplicates.
Improved code, using return value of Set#add instead of comparing the size of list and set.
public static <T> boolean hasDuplicate(Iterable<T> all) {
Set<T> set = new HashSet<T>();
// Set#add returns false if the set does not change, which
// indicates that a duplicate element has been added.
for (T each: all) if (!set.add(each)) return true;
return false;
}
With Java 8+ you can use Stream API:
boolean areAllDistinct(List<Block> blocksList) {
return blocksList.stream().map(Block::getNum).distinct().count() == blocksList.size();
}
If you are looking to avoid having duplicates at all, then you should just cut out the middle process of detecting duplicates and use a Set.
Improved code that returns the duplicate elements. It can find duplicates in any Collection and returns them as a list; the unique elements can be obtained from the Set.
public static <T> List<T> getDuplicate(Collection<T> list) {
final List<T> duplicatedObjects = new ArrayList<T>();
Set<T> set = new HashSet<T>() {
@Override
public boolean add(T e) {
if (contains(e)) {
duplicatedObjects.add(e);
}
return super.add(e);
}
};
for (T t : list) {
set.add(t);
}
return duplicatedObjects;
}
public static <T> boolean hasDuplicate(Collection<T> list) {
if (getDuplicate(list).isEmpty())
return false;
return true;
}
I needed to do a similar operation for a Stream, but couldn't find a good example. Here's what I came up with.
public static <T> boolean areUnique(final Stream<T> stream) {
final Set<T> seen = new HashSet<>();
return stream.allMatch(seen::add);
}
This has the advantage of short-circuiting when duplicates are found early rather than having to process the whole stream and isn't much more complicated than just putting everything in a Set and checking the size. So this case would roughly be:
List<T> list = ...
boolean allDistinct = areUnique(list.stream());
If your elements are Comparable (whether the order has any real meaning does not matter; it just needs to be consistent with your definition of equality), another duplicate-removal solution is to sort the list (O(n log(n))) and then do a single pass looking for repeated elements, that is, equal elements that follow each other (this is O(n)).
The overall complexity is O(n log(n)). A hash-based Set needs only expected O(n) time for n insertions, so it wins asymptotically, but the sort-based approach can have a smaller constant: its cost comes from comparing elements, whereas the cost of the set comes from a hash computation plus one (possibly several) equality comparisons per element. A tree-based Set is also O(n log(n)), typically with a larger constant than sorting.
As I understand it, however, you do not need to remove duplicates, but merely test for their existence. So you should hand-code a merge or heap sort algorithm on your array, that simply exits returning true (i.e. "there is a dup") if your comparator returns 0, and otherwise completes the sort, and traverse the sorted array testing for repeats. In a merge or heap sort, indeed, when the sort is completed, you will have compared every duplicate pair unless both elements were already in their final positions (which is unlikely). Thus, a tweaked sort algorithm should yield a huge performance improvement (I would have to prove that, but I guess the tweaked algorithm should be in the O(log(n)) on uniformly random data)
If you want the set of duplicate values:
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
public class FindDuplicateInArrayList {
public static void main(String[] args) {
Set<String> uniqueSet = new HashSet<String>();
List<String> dupesList = new ArrayList<String>();
for (String a : args) {
if (uniqueSet.contains(a))
dupesList.add(a);
else
uniqueSet.add(a);
}
System.out.println(uniqueSet.size() + " distinct words: " + uniqueSet);
System.out.println(dupesList.size() + " dupesList words: " + dupesList);
}
}
And probably also think about trimming values or using lowercase ... depending on your case.
Simply put:
1) make sure all items are comparable
2) sort the array
3) iterate over the array and find duplicates
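The steps above can be sketched as follows. This is only one way to write it, assuming the elements are Comparable and the list contains no nulls; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortedDuplicateCheck {
    // Sorts a copy of the list, then compares each element with its neighbour.
    static <T extends Comparable<? super T>> boolean hasDuplicate(List<T> items) {
        List<T> sorted = new ArrayList<>(items); // copy, so the caller's list is untouched
        Collections.sort(sorted);                // O(n log n)
        for (int i = 1; i < sorted.size(); i++) { // O(n)
            if (sorted.get(i - 1).compareTo(sorted.get(i)) == 0) {
                return true; // equal elements end up next to each other
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicate(Arrays.asList(3, 1, 2, 3))); // true
        System.out.println(hasDuplicate(Arrays.asList(3, 1, 2)));    // false
    }
}
```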
To find the duplicates in a List, use the following code. It returns a set containing the duplicated elements.
public Set<?> findDuplicatesInList(List<?> beanList) {
System.out.println("findDuplicatesInList::"+beanList);
Set<Object> duplicateRowSet=null;
duplicateRowSet=new LinkedHashSet<Object>();
for(int i=0;i<beanList.size();i++){
Object superString=beanList.get(i);
System.out.println("findDuplicatesInList::superString::"+superString);
for(int j=0;j<beanList.size();j++){
if(i!=j){
Object subString=beanList.get(j);
System.out.println("findDuplicatesInList::subString::"+subString);
if(superString.equals(subString)){
duplicateRowSet.add(beanList.get(j));
}
}
}
}
System.out.println("findDuplicatesInList::duplicationSet::"+duplicateRowSet);
return duplicateRowSet;
}
The best way to handle this is to use a HashSet, which drops the duplicates:
ArrayList<String> listGroupCode = new ArrayList<>();
listGroupCode.add("A");
listGroupCode.add("A");
listGroupCode.add("B");
listGroupCode.add("C");
HashSet<String> set = new HashSet<>(listGroupCode);
ArrayList<String> result = new ArrayList<>(set);
Just print the result list to see the contents without duplicates :) (Note that a HashSet does not guarantee the original order.)
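If the original order matters, a LinkedHashSet can be swapped in for the HashSet. The sketch below shows that variant; the class and method names are only illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class OrderedDedup {
    // LinkedHashSet keeps the first occurrence of each element in insertion order.
    static <T> List<T> withoutDuplicates(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<String> listGroupCode = Arrays.asList("A", "A", "B", "C");
        System.out.println(withoutDuplicates(listGroupCode)); // [A, B, C]
    }
}
```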
This answer is written in Kotlin, but can easily be translated to Java.
If your ArrayList's size stays within a small fixed range, this is a perfectly good solution.
var duplicateDetected = false
if(arrList.size > 1){
for(i in 0 until arrList.size){
for(j in 0 until arrList.size){
if(i != j && arrList.get(i) == arrList.get(j)){
duplicateDetected = true
}
}
}
}
private boolean isDuplicate() {
for (int i = 0; i < arrayList.size(); i++) {
for (int j = i + 1; j < arrayList.size(); j++) {
if (arrayList.get(i).getName().trim().equalsIgnoreCase(arrayList.get(j).getName().trim())) {
return true;
}
}
}
return false;
}
for (int i = 0; i < l.size(); i++) {
    String tempVal = l.get(i); // take the ith element of the list
    // walk backwards so that removals do not shift the indices still to be checked
    for (int j = l.size() - 1; j > i; j--) {
        if (java.util.Objects.equals(tempVal, l.get(j))) {
            l.remove(j); // remove every later occurrence of tempVal
        }
    }
}
Note: this is O(n²), and every removal from an ArrayList shifts the elements behind it, so it will be slow on large lists.
To reduce the cost of the removals, there are two options: 1) iterate in reverse order when removing elements, as above; 2) use a LinkedList instead of an ArrayList. This kind of answer is mostly useful for interview questions that ask you to remove duplicates from a List without using any other collection. In the real world, I would simply put the elements of the List into a Set.
/**
* Method to detect presence of duplicates in a generic list.
* Depends on the equals method of the concrete type. make sure to override it as required.
*/
public static <T> boolean hasDuplicates(List<T> list){
int count = list.size();
T t1,t2;
for(int i=0;i<count;i++){
t1 = list.get(i);
for(int j=i+1;j<count;j++){
t2 = list.get(j);
if(t2.equals(t1)){
return true;
}
}
}
return false;
}
An example of a concrete class that has overridden equals() :
public class Reminder{
private long id;
private int hour;
private int minute;
public Reminder(long id, int hour, int minute){
this.id = id;
this.hour = hour;
this.minute = minute;
}
@Override
public boolean equals(Object other){
if(other == null) return false;
if(this.getClass() != other.getClass()) return false;
Reminder otherReminder = (Reminder) other;
if(this.hour != otherReminder.hour) return false;
if(this.minute != otherReminder.minute) return false;
return true;
}
}
ArrayList<String> withDuplicates = new ArrayList<>();
withDuplicates.add("1");
withDuplicates.add("2");
withDuplicates.add("1");
withDuplicates.add("3");
// seen.add(word) returns false when the word has already been encountered
Set<String> seen = new HashSet<>();
ArrayList<String> withoutDuplicates = new ArrayList<>();
ArrayList<String> duplicates = new ArrayList<>();
for (String word : withDuplicates) {
    if (seen.add(word)) {
        withoutDuplicates.add(word);
    } else {
        duplicates.add(word);
    }
}
System.out.println(duplicates);        // [1]
System.out.println(withoutDuplicates); // [1, 2, 3]
A simple solution for learners.
//Method to find the duplicates.
public static List<Integer> findDublicate(List<Integer> numList){
List<Integer> dupLst = new ArrayList<Integer>();
//Compare one number against all the other number except the self.
for(int i =0;i<numList.size();i++) {
for(int j=0 ; j<numList.size();j++) {
if(i!=j && numList.get(i).equals(numList.get(j))) {
boolean isNumExist = false;
//The loop below avoids adding the same duplicate to the result list twice
for(Integer aNum: dupLst) {
if(aNum.equals(numList.get(i))) {
isNumExist = true;
break;
}
}
if(!isNumExist) {
dupLst.add(numList.get(i));
}
}
}
}
return dupLst;
}
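For larger lists, the nested loops can be replaced by a single counting pass. The following is a sketch of that idea, not a drop-in replacement for the method above; DuplicateCounter and findDuplicates are names chosen for the example. It compares through equals(), so it also works for Integer values outside the small-integer cache, where == on boxed values fails.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateCounter {
    // Counts each value once, then reports the values seen more than once,
    // in the order of their first appearance.
    static <T> List<T> findDuplicates(List<T> items) {
        Map<T, Integer> counts = new LinkedHashMap<>();
        for (T item : items) {
            counts.merge(item, 1, Integer::sum);
        }
        List<T> duplicates = new ArrayList<>();
        for (Map.Entry<T, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<Integer> numList = Arrays.asList(1000, 2, 1000, 3, 2, 2);
        System.out.println(findDuplicates(numList)); // [1000, 2]
    }
}
```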
I have a list which I need to sort based on two criteria.
The first criterion is a Boolean, let's say isBig. The second one is a Long, which represents a timestamp.
I need to order the elements of the list in this way: first the ones with isBig = true, then the ones with isBig = false. Within these groups, the elements should be ordered by timestamp, descending.
Basically, I expect the result to be something like this:
isBig - 2015/10/29
isBig - 2015/10/28
isBig - 2015/10/27
!isBig - 2015/10/30
!isBig - 2015/10/27
!isBig - 2015/10/26
Let's say the object is this:
public class Item {
Boolean isBig;
Long timestamp;
// ...
}
and the list is just List<Item> list.
I figured out that one method would be to use three for loops: the first to build the two groups, isBig and !isBig, and the second and third to sort the elements within each group. Finally I would merge the two lists.
Is there a more efficient algorithm for sorting lists on the basis of two criteria?
You can sort the list directly using a custom comparison method which checks both criteria.
Use the Collections.sort method and pass a custom comparator whose compare method is overridden as follows (big items first, then newest timestamp first):
int compare(Item o1, Item o2) {
    if (o1.isBig && !o2.isBig)
        return -1;
    if (!o1.isBig && o2.isBig)
        return 1;
    if (o1.timestamp > o2.timestamp)
        return -1;
    if (o1.timestamp < o2.timestamp)
        return 1;
    return 0;
}
If you are obsessed with performance you could possibly speed it up by a few percent with a more sophisticated approach, but for a list of a few hundred elements the gains would be negligible.
An optimized comparison method:
int compare(Item o1, Item o2) {
    int bigness = (o2.isBig ? 2 : 0) - (o1.isBig ? 2 : 0);
    // newest first; assumes the difference of two timestamps does not overflow a long
    long diff = o2.timestamp - o1.timestamp;
    return bigness + Long.signum(diff);
}
It avoids the chain of if statements, which means it may be faster than the naive version above.
That's probably everything that can be done for performance. If we knew something more about your data (for instance that there are always more big objects than small ones, that all the timestamps are unique, or that all the timestamps are from a certain narrow range) we could probably propose a better solution. However, if we assume that your data is arbitrary and has no specific pattern, then the best solution is to use a standard sort utility as shown above.
Splitting the list into two sublists and sorting them separately will definitely be slower. Actually the sorting algorithm will most probably divide the data into two groups and then recursively into four groups, and so on. However, the division won't follow the isBig criterion. If you want to learn more, read how quick sort or merge sort work.
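With Java 8 or later, the same two-key ordering can also be expressed with the Comparator combinators instead of a hand-written compare method. The sketch below assumes a simplified Item with primitive fields and a made-up constant name; it is an alternative formulation, not the code from the answer above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class TwoKeySort {
    // Hypothetical stand-in for the Item class in the question.
    static final class Item {
        final boolean isBig;
        final long timestamp;

        Item(boolean isBig, long timestamp) {
            this.isBig = isBig;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return (isBig ? "isBig" : "!isBig") + " - " + timestamp;
        }
    }

    // First key: isBig, true before false. Second key: timestamp, newest first.
    static final Comparator<Item> BIG_FIRST_NEWEST_FIRST =
            Comparator.comparing((Item i) -> i.isBig).reversed()
                    .thenComparing(Comparator.comparingLong((Item i) -> i.timestamp).reversed());

    static List<Item> sorted(List<Item> items) {
        List<Item> copy = new ArrayList<>(items);
        copy.sort(BIG_FIRST_NEWEST_FIRST);
        return copy;
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item(false, 30), new Item(true, 27),
                new Item(true, 29), new Item(false, 26));
        // prints [isBig - 29, isBig - 27, !isBig - 30, !isBig - 26]
        System.out.println(sorted(items));
    }
}
```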
To sort on two parameters, you need a Comparator that compares both fields, here one Boolean and one timestamp.
Pass that comparator to Collections.sort() together with the list; because the elements are objects rather than primitives, the sort needs the comparator to know how to order them.
/**
 * Comparator that puts the items with isBig = true first and,
 * within each group, the newest timestamp first
 */
public static Comparator<Item> booleanComparator = new Comparator<Item>() {
    @Override
    public int compare(Item e1, Item e2) {
        if (e1.isBig && !e2.isBig)
            return -1;
        if (!e1.isBig && e2.isBig)
            return 1;
        return e2.timestamp.compareTo(e1.timestamp);
    }
};
Use this object as Collections.sort(list, booleanComparator);
In theory, the approach using two separate lists should be faster than the approach using a two-step Comparator, because a comparison based on one field is obviously faster than a comparison based on two. By using two lists you are speeding up the part of the algorithm that has O(n log n) time complexity (the sort), at the expense of an additional initial stage (splitting into two pieces) which has time complexity O(n). Since n log n > n, the two lists approach should be faster for very, very large values of n.
However, in practice we are talking about such tiny differences in times that you have to have extremely long lists before the two lists approach wins out, and so it's very difficult to demonstrate the difference using lists before you start running into problems such as an OutOfMemoryError.
However, if you use arrays rather than lists, and use clever tricks to do it in place rather than using separate data structures, it is possible to beat the two-step Comparator approach, as the code below demonstrates. Before anybody complains: yes I know this is not a proper benchmark!
Even though sort2 is faster than sort1, I would probably not use it in production code. It is better to use familiar idioms and code that obviously works, rather than code that is harder to understand and maintain, even if it is slightly faster.
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;
public class Main {
static Random rand = new Random();
static Compound rand() {
return new Compound(rand.nextBoolean(), rand.nextLong());
}
static Compound[] randArray() {
int length = 100_000;
Compound[] temp = new Compound[length];
for (int i = 0; i < length; i++)
temp[i] = rand();
return temp;
}
static class Compound {
boolean bool;
long time;
Compound(boolean bool, long time) {
this.bool = bool;
this.time = time;
}
@Override
public boolean equals(Object o) {
if (this == o)
return true;
if (o == null || getClass() != o.getClass())
return false;
Compound compound = (Compound) o;
return bool == compound.bool && time == compound.time;
}
@Override
public int hashCode() {
int result = (bool ? 1 : 0);
result = 31 * result + (int) (time ^ (time >>> 32));
return result;
}
}
static final Comparator<Compound> COMPARATOR = new Comparator<Compound>() {
@Override
public int compare(Compound o1, Compound o2) {
int result = (o1.bool ? 0 : 1) - (o2.bool ? 0 : 1);
return result != 0 ? result : Long.compare(o1.time, o2.time);
}
};
static final Comparator<Compound> LONG_ONLY_COMPARATOR = new Comparator<Compound>() {
@Override
public int compare(Compound o1, Compound o2) {
return Long.compare(o1.time, o2.time);
}
};
static void sort1(Compound[] array) {
Arrays.sort(array, COMPARATOR);
}
static void sort2(Compound[] array) {
int secondIndex = array.length;
if (secondIndex == 0)
return;
int firstIndex = 0;
for (Compound c = array[0];;) {
if (c.bool) {
array[firstIndex++] = c;
if (firstIndex == secondIndex)
break;
c = array[firstIndex];
} else {
Compound c2 = array[--secondIndex];
array[secondIndex] = c;
if (firstIndex == secondIndex)
break;
c = c2;
}
}
Arrays.sort(array, 0, firstIndex, LONG_ONLY_COMPARATOR);
Arrays.sort(array, secondIndex, array.length, LONG_ONLY_COMPARATOR);
}
public static void main(String... args) {
// Warm up the JVM and check the algorithm actually works.
for (int i = 0; i < 20; i++) {
Compound[] arr1 = randArray();
Compound[] arr2 = arr1.clone();
sort1(arr1);
sort2(arr2);
if (!Arrays.equals(arr1, arr2))
throw new IllegalStateException();
System.out.println(i);
}
// Begin the test proper.
long normal = 0;
long split = 0;
for (int i = 0; i < 100; i++) {
Compound[] array1 = randArray();
Compound[] array2 = array1.clone();
long time = System.nanoTime();
sort1(array1);
normal += System.nanoTime() - time;
time = System.nanoTime();
sort2(array2);
split += System.nanoTime() - time;
System.out.println(i);
System.out.println("COMPARATOR: " + normal);
System.out.println("LONG_ONLY_COMPARATOR: " + split);
}
}
}
This is called sorting by multiple keys, and it's easy to do. If you're working with a sort library function that takes a comparator callback function to decide the relative ordering of two elements, define the comparator function so that it first checks whether the two input values a and b have equal isBig values, and, if not, immediately returns a.isBig > b.isBig (I'm assuming here that > is defined for boolean values; if not, substitute the obvious test). But if the isBig values are equal, you should return a.timestamp > b.timestamp.
You can define a custom comparator and use it to sort the List. E.g.
class ItemComparator implements Comparator<Item> {
    @Override
    public int compare(Item a, Item b) {
        // true before false
        int bc = Boolean.compare(b.isBig, a.isBig);
        if (bc != 0)
            return bc;
        // newest timestamp first
        return Long.compare(b.timestamp, a.timestamp);
    }
}
and use it like this
Collections.sort(list, new ItemComparator());
My program takes in thousands of PL/SQL functions, procedures and views, saves them as objects and then adds them to an array list. My array list stores objects with the following format:
ArrayList<PLSQLItemStore> storedList = new ArrayList<>();
storedList.add(new PLSQLItemStore(String, String, String, Long ));
storedList.add(new PLSQLItemStore(Name, Type, FileName, DatelastModified));
What I want to do is remove duplicate objects from the ArrayList based on their Name, keeping the newest one according to its dateLastModified variable. The approach I took was to have an outer loop and an inner loop, with each object compared to every other object and its name changed to "remove" if it was found to be older. The program then does one final pass backwards through the ArrayList, removing any objects whose name is set to "remove". While this works fine, it seems extremely inefficient: 1000 objects mean 1,000,000 comparisons. I was wondering if someone could help me make it more efficient? Thanks.
Sample Input:
storedList.add(new PLSQLItemStore("a", "function", "players.sql", 1234));
storedList.add(new PLSQLItemStore("a", "function", "team.sql", 2345));
storedList.add(new PLSQLItemStore("b", "function", "toon.sql", 1111));
storedList.add(new PLSQLItemStore("c", "function", "toon.sql", 2222));
storedList.add(new PLSQLItemStore("c", "function", "toon.sql", 1243));
storedList.add(new PLSQLItemStore("d", "function", "toon.sql", 3333));
ArrayList Iterator:
for(int i = 0; i < storedList.size();i++)
{
for(int k = 0; k < storedList.size();k++)
{
if (storedList.get(i).getName().equalsIgnoreCase("remove"))
{
System.out.println("This was already removed");
break;
}
if (storedList.get(i).getName().equalsIgnoreCase(storedList.get(k).getName()) && // checks to see if it is valid to be removed
!storedList.get(k).getName().equalsIgnoreCase("remove") &&
i != k )
{
if(storedList.get(i).getLastModified() >= storedList.get(k).getLastModified())
{
storedList.get(k).setName("remove");
System.out.println("Set To Remove");
}
else
{
System.out.println("Not Older");
}
}
}
}
Final Pass to remove Objects:
System.out.println("size: " + storedList.size());
for (int i= storedList.size() - 1; i >= 0; i--)
{
if (storedList.get(i).getName().equalsIgnoreCase("remove"))
{
System.out.println("removed: " + storedList.get(i).getName());
storedList.remove(i);
}
}
System.out.println("size: " + storedList.size());
You need to make PLSQLItemStore implement hashCode and equals methods and then you can use Set to remove the duplicates.
public class PLSQLItemStore {
private String name;
@Override
public int hashCode() {
int hash = 7;
hash = 47 * hash + (this.name != null ? this.name.hashCode() : 0);
return hash;
}
@Override
public boolean equals(Object obj) {
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
final PLSQLItemStore other = (PLSQLItemStore) obj;
if ((this.name == null) ? (other.name != null) : !this.name.equals(other.name)) {
return false;
}
return true;
}
}
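And then just do Set<PLSQLItemStore> withoutDups = new HashSet<>(storedList); Note that this keeps the first object encountered for each name, not necessarily the newest, so sort the list by date (newest first) beforehand if that matters.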
P.S. equals and hashCode are generated by NetBeans IDE.
Put them in a Guava ArrayListMultimap<String,PLSQLItemStore>.
Add each PLSQLItemStore using name as the key.
When you're done adding, loop through the multimap, sort each List with a Comparator<PLSQLItemStore> which sorts by dateLastModified, and pull out the last entry of each List - it will be the latest PLSQLItemStore.
Put these entries in another Map<String,PLSQLItemStore> (or List<PLSQLItemStore>, if you no longer care about the name) and throw away the ArrayListMultimap.
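The same "keep the latest entry per name" idea can also be sketched without Guava, using a plain Map and Map.merge instead of a multimap plus sorting. This swaps in a different technique from the one described above. StoredItem is a hypothetical stand-in for PLSQLItemStore, and names are compared case-sensitively here, unlike the equalsIgnoreCase in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LatestByName {
    // Hypothetical stand-in for PLSQLItemStore with only the fields needed here.
    static final class StoredItem {
        final String name;
        final String fileName;
        final long lastModified;

        StoredItem(String name, String fileName, long lastModified) {
            this.name = name;
            this.fileName = fileName;
            this.lastModified = lastModified;
        }
    }

    // One pass: for each name, keep whichever item was modified most recently.
    static List<StoredItem> keepLatest(List<StoredItem> items) {
        Map<String, StoredItem> latest = new LinkedHashMap<>();
        for (StoredItem item : items) {
            latest.merge(item.name, item,
                    (old, cur) -> cur.lastModified >= old.lastModified ? cur : old);
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<StoredItem> storedList = Arrays.asList(
                new StoredItem("a", "players.sql", 1234),
                new StoredItem("a", "team.sql", 2345),
                new StoredItem("b", "toon.sql", 1111),
                new StoredItem("c", "toon.sql", 2222),
                new StoredItem("c", "toon.sql", 1243),
                new StoredItem("d", "toon.sql", 3333));
        for (StoredItem item : keepLatest(storedList)) {
            System.out.println(item.name + " " + item.fileName + " " + item.lastModified);
        }
    }
}
```

This makes a single pass over the list instead of comparing every pair.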
Building off of Petr Mensik's answer, you should implement equals and hashCode. From there, you can put items into the map. If you come across a duplicate, you can decide then what to do:
Map<String, PLSQLItemStore> storeMap = new HashMap<String, PLSQLItemStore>();
for(PLSQLItemStore currentStore : storedList) {
// See if an item exists in the map with this name
PLSQLItemStore buffStore = storeMap.get(currentStore.getName());
// If this value was never in the map, put it in the map and move on
if(buffStore == null) {
storeMap.put(currentStore.getName(), currentStore);
continue;
}
// If we've gotten here, then an item with this name is already in the map.
// Keep the newer of the two: replace the stored item only if the
// current one was modified more recently.
if(currentStore.getLastModified() > buffStore.getLastModified())
storeMap.put(currentStore.getName(), currentStore);
}
Your map is dup-free. Map itself is not a Collection, but its values() view is, so you can iterate through the items later in your code:
for(PLSQLItemStore currentStore : storeMap.values()) {
// Do whatever you want with your items
}
Hello, I cannot find any info on what I need to do to make two keys seem equal. That is, I need to provide a custom comparison that will be used by map.put(). Implementing Comparable doesn't help.
For example, this code doesn't work as intended: for the purposes of my program, the two keys n and n2 ARE the same.
private class N implements Comparable<N> {
int value;
int stuff;
String z;
@Override
public int compareTo(N arg0) {
if (arg0.z.equals(z))
return 0;
return 1;
}
}
public void dostuff() {
HashMap m = new HashMap();
N n = new N();
n.z = "1";
N n2 = new N();
n2.z = "1";
m.put(n, "one");
m.put(n2, "two");
// will print refs to two instances! - wrong
Iterator it = m.keySet().iterator();
while (it.hasNext()) {
System.err.println(it.next());
}
}
You need to override equals and hashCode - HashMap doesn't use compareTo, which is meant for sorting.
Note that your compareTo implementation is already broken, as it's really only testing for equality. In particular, x.compareTo(y) and y.compareTo(x) both returning 1 violates the contract of compareTo:
The implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) for all x and y.
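A minimal sketch of the fix, assuming that only the field z decides whether two keys are the same (as in the question). The wrapper class KeyEquality and the one-argument constructor are added purely for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class KeyEquality {
    static final class N {
        int value;
        int stuff;
        String z;

        N(String z) {
            this.z = z;
        }

        // Two keys are the same when their z fields are equal.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof N)) return false;
            return Objects.equals(z, ((N) o).z);
        }

        // Must be consistent with equals(): equal keys return the same hash code.
        @Override
        public int hashCode() {
            return Objects.hashCode(z);
        }
    }

    public static void main(String[] args) {
        Map<N, String> m = new HashMap<>();
        m.put(new N("1"), "one");
        m.put(new N("1"), "two"); // replaces the value stored under the equal key
        System.out.println(m.size());           // 1
        System.out.println(m.get(new N("1")));  // two
    }
}
```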
How could I go about detecting (returning true/false) whether an ArrayList contains more than one of the same element in Java?
Many thanks,
Terry
Edit
Forgot to mention that I am not looking to compare "Blocks" with each other but their integer values. Each "block" has an int and this is what makes them different.
I find the int of a particular Block by calling a method named "getNum" (e.g. table1[0][2].getNum();
Simplest: dump the whole collection into a Set (using the Set(Collection) constructor or Set.addAll), then see if the Set has the same size as the ArrayList.
List<Integer> list = ...;
Set<Integer> set = new HashSet<Integer>(list);
if(set.size() < list.size()){
/* There are duplicates */
}
Update: If I'm understanding your question correctly, you have a 2d array of Block, as in
Block table[][];
and you want to detect if any row of them has duplicates?
In that case, I could do the following, assuming that Block implements "equals" and "hashCode" correctly:
for (Block[] row : table) {
Set set = new HashSet<Block>();
for (Block cell : row) {
set.add(cell);
}
if (set.size() < 6) { //has duplicate
}
}
I'm not 100% sure of that for syntax, so it might be safer to write it as
for (int i = 0; i < 6; i++) {
Set set = new HashSet<Block>();
for (int j = 0; j < 6; j++)
set.add(table[i][j]);
...
Set.add returns a boolean false if the item being added is already in the set, so you could even short circuit and bale out on any add that returns false if all you want to know is whether there are any duplicates.
Improved code, using return value of Set#add instead of comparing the size of list and set.
public static <T> boolean hasDuplicate(Iterable<T> all) {
Set<T> set = new HashSet<T>();
// Set#add returns false if the set does not change, which
// indicates that a duplicate element has been added.
for (T each: all) if (!set.add(each)) return true;
return false;
}
With Java 8+ you can use Stream API:
boolean areAllDistinct(List<Block> blocksList) {
return blocksList.stream().map(Block::getNum).distinct().count() == blockList.size();
}
If you are looking to avoid having duplicates at all, then you should just cut out the middle process of detecting duplicates and use a Set.
Improved code to return the duplicate elements
Can find duplicates in a Collection
return the set of duplicates
Unique Elements can be obtained from the Set
public static <T> List getDuplicate(Collection<T> list) {
final List<T> duplicatedObjects = new ArrayList<T>();
Set<T> set = new HashSet<T>() {
#Override
public boolean add(T e) {
if (contains(e)) {
duplicatedObjects.add(e);
}
return super.add(e);
}
};
for (T t : list) {
set.add(t);
}
return duplicatedObjects;
}
public static <T> boolean hasDuplicate(Collection<T> list) {
if (getDuplicate(list).isEmpty())
return false;
return true;
}
I needed to do a similar operation for a Stream, but couldn't find a good example. Here's what I came up with.
public static <T> boolean areUnique(final Stream<T> stream) {
final Set<T> seen = new HashSet<>();
return stream.allMatch(seen::add);
}
This has the advantage of short-circuiting when duplicates are found early rather than having to process the whole stream and isn't much more complicated than just putting everything in a Set and checking the size. So this case would roughly be:
List<T> list = ...
boolean allDistinct = areUnique(list.stream());
If your elements are somehow Comparable (the fact that the order has any real meaning is indifferent -- it just needs to be consistent with your definition of equality), the fastest duplicate removal solution is going to sort the list ( 0(n log(n)) ) then to do a single pass and look for repeated elements (that is, equal elements that follow each other) (this is O(n)).
The overall complexity is going to be O(n log(n)), which is roughly the same as what you would get with a Set (n times long(n)), but with a much smaller constant. This is because the constant in sort/dedup results from the cost of comparing elements, whereas the cost from the set is most likely to result from a hash computation, plus one (possibly several) hash comparisons. If you are using a hash-based Set implementation, that is, because a Tree based is going to give you a O( n logĀ²(n) ), which is even worse.
As I understand it, however, you do not need to remove duplicates, but merely test for their existence. So you should hand-code a merge or heap sort algorithm on your array, that simply exits returning true (i.e. "there is a dup") if your comparator returns 0, and otherwise completes the sort, and traverse the sorted array testing for repeats. In a merge or heap sort, indeed, when the sort is completed, you will have compared every duplicate pair unless both elements were already in their final positions (which is unlikely). Thus, a tweaked sort algorithm should yield a huge performance improvement (I would have to prove that, but I guess the tweaked algorithm should be in the O(log(n)) on uniformly random data)
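A minimal sketch of the sort-then-scan idea, assuming the elements are Comparable, non-null, and that compareTo is consistent with equals. It uses the library sort rather than the hand-tweaked early-exit sort suggested above, so it always pays the full O(n log(n)); the class and method names are only illustrative.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class SortedDuplicateCheck {
    // Returns true if at least two elements compare as equal.
    public static <T extends Comparable<? super T>> boolean hasDuplicate(Collection<T> items) {
        // Sort a copy so the caller's collection is left untouched.
        List<T> sorted = new ArrayList<>(items);
        Collections.sort(sorted);
        // After sorting, equal elements sit next to each other.
        for (int i = 1; i < sorted.size(); i++) {
            if (sorted.get(i - 1).compareTo(sorted.get(i)) == 0) {
                return true;
            }
        }
        return false;
    }
}
```

For example, SortedDuplicateCheck.hasDuplicate(Arrays.asList(3, 1, 3)) reports a duplicate, while a list of distinct strings does not.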
If you want the set of duplicate values:
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
public class FindDuplicateInArrayList {
public static void main(String[] args) {
Set<String> uniqueSet = new HashSet<String>();
List<String> dupesList = new ArrayList<String>();
for (String a : args) {
if (uniqueSet.contains(a))
dupesList.add(a);
else
uniqueSet.add(a);
}
System.out.println(uniqueSet.size() + " distinct words: " + uniqueSet);
System.out.println(dupesList.size() + " dupesList words: " + dupesList);
}
}
And probably also think about trimming values or using lowercase ... depending on your case.
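As a rough sketch of that normalisation, the values can be compared on a trimmed, lower-cased key while the original strings are reported. The class name is made up for illustration, and whether trimming and case-folding are appropriate depends on your data.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class NormalizedDuplicates {
    // Returns the values whose trimmed, lower-cased form was already seen earlier in the list.
    public static List<String> findDuplicates(List<String> values) {
        Set<String> seenKeys = new HashSet<>();
        List<String> dupes = new ArrayList<>();
        for (String value : values) {
            // Locale.ROOT keeps the case mapping independent of the default locale.
            String key = value.trim().toLowerCase(Locale.ROOT);
            // Set.add returns false when the key is already present.
            if (!seenKeys.add(key)) {
                dupes.add(value);
            }
        }
        return dupes;
    }
}
```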
Simply put:
1) make sure all items are comparable
2) sort the array
3) iterate over the array and find duplicates
To find the duplicates in a List, use the following code. It returns a Set containing the duplicated elements.
public Set<?> findDuplicatesInList(List<?> beanList) {
System.out.println("findDuplicatesInList::"+beanList);
Set<Object> duplicateRowSet=new LinkedHashSet<Object>();
for(int i=0;i<beanList.size();i++){
Object superString=beanList.get(i);
System.out.println("findDuplicatesInList::superString::"+superString);
for(int j=0;j<beanList.size();j++){
if(i!=j){
Object subString=beanList.get(j);
System.out.println("findDuplicatesInList::subString::"+subString);
if(superString.equals(subString)){
duplicateRowSet.add(beanList.get(j));
}
}
}
}
System.out.println("findDuplicatesInList::duplicationSet::"+duplicateRowSet);
return duplicateRowSet;
}
The best way to remove the duplicates is to use a HashSet:
ArrayList<String> listGroupCode = new ArrayList<>();
listGroupCode.add("A");
listGroupCode.add("A");
listGroupCode.add("B");
listGroupCode.add("C");
HashSet<String> set = new HashSet<>(listGroupCode);
ArrayList<String> result = new ArrayList<>(set);
Just print the result ArrayList to see the list without duplicates :)
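One caveat: a HashSet makes no guarantee about iteration order, so the result may not come back in the original order. If order matters, a LinkedHashSet keeps the first occurrence of each element in place. The sketch below also shows the size comparison that answers the "are there any duplicates?" question; the class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class OrderedDedup {
    // Removes duplicates while keeping the first occurrence of each element in its original order.
    public static <T> List<T> withoutDuplicates(List<T> list) {
        Set<T> unique = new LinkedHashSet<>(list);
        return new ArrayList<>(unique);
    }

    // True if the list contains at least one repeated element.
    public static <T> boolean hasDuplicates(List<T> list) {
        Set<T> unique = new LinkedHashSet<>(list);
        return unique.size() < list.size();
    }
}
```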
This answer is written in Kotlin, but can easily be translated to Java.
If your ArrayList's size is within a fixed small range, then this is a great solution.
var duplicateDetected = false
if(arrList.size > 1){
for(i in 0 until arrList.size){
for(j in 0 until arrList.size){
if(i != j && arrList.get(i) == arrList.get(j)){
duplicateDetected = true
}
}
}
}
private boolean isDuplicate() {
for (int i = 0; i < arrayList.size(); i++) {
for (int j = i + 1; j < arrayList.size(); j++) {
if (arrayList.get(i).getName().trim().equalsIgnoreCase(arrayList.get(j).getName().trim())) {
return true;
}
}
}
return false;
}
for (int i = 0; i < l.size(); i++) {
String tempVal = l.get(i); //keep the first occurrence at index i
int j = i + 1;
while (j < l.size()) {
if (tempVal.equals(l.get(j))) {
l.remove(j); //remove a later matching entry; do not advance j
} else {
j++;
}
}
}
Note: this takes a major performance hit, though, because every removal from an ArrayList shifts the elements behind it.
To address this, there are two options: 1) iterate in reverse order and remove elements, or 2) use a LinkedList instead of an ArrayList. The example above is the answer to the biased interview question of removing duplicates from a List without using any other collection. In the real world, though, I would simply put the elements of the List into a Set.
/**
* Method to detect presence of duplicates in a generic list.
* Depends on the equals method of the concrete type. Make sure to override it as required.
*/
public static <T> boolean hasDuplicates(List<T> list){
int count = list.size();
T t1,t2;
for(int i=0;i<count;i++){
t1 = list.get(i);
for(int j=i+1;j<count;j++){
t2 = list.get(j);
if(t2.equals(t1)){
return true;
}
}
}
return false;
}
An example of a concrete class that has overridden equals() :
public class Reminder{
private long id;
private int hour;
private int minute;
public Reminder(long id, int hour, int minute){
this.id = id;
this.hour = hour;
this.minute = minute;
}
@Override
public boolean equals(Object other){
if(other == null) return false;
if(this.getClass() != other.getClass()) return false;
Reminder otherReminder = (Reminder) other;
if(this.hour != otherReminder.hour) return false;
if(this.minute != otherReminder.minute) return false;
return true;
}
}
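The pairwise hasDuplicates above only calls equals(), but the HashSet-based approaches elsewhere in this thread also rely on hashCode(). A sketch of a variant of the class that overrides both, using the same fields (hour and minute) in each; HashableReminder is a hypothetical name, not part of the original answer.

```java
import java.util.Objects;

public class HashableReminder {
    private final long id;
    private final int hour;
    private final int minute;

    public HashableReminder(long id, int hour, int minute) {
        this.id = id;
        this.hour = hour;
        this.minute = minute;
    }

    // Two reminders are equal when they fire at the same hour and minute; id is ignored.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (other == null || getClass() != other.getClass()) return false;
        HashableReminder that = (HashableReminder) other;
        return hour == that.hour && minute == that.minute;
    }

    // Built from exactly the fields compared in equals(), so equal objects share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(hour, minute);
    }
}
```

With this in place, adding two reminders for the same time to a HashSet keeps only the first one.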
ArrayList<String> withDuplicates = new ArrayList<>();
withDuplicates.add("1");
withDuplicates.add("2");
withDuplicates.add("1");
withDuplicates.add("3");
HashSet<String> seen = new HashSet<>();
ArrayList<String> withoutDuplicates = new ArrayList<>();
ArrayList<String> duplicates = new ArrayList<>();
for (String word : withDuplicates) {
// add() returns false if the word is already in the set
if (seen.add(word)) {
withoutDuplicates.add(word); // first occurrence
} else {
duplicates.add(word); // seen before
}
}
System.out.println(duplicates); // [1]
System.out.println(withoutDuplicates); // [1, 2, 3]
A simple solution for learners.
//Method to find the duplicates.
public static List<Integer> findDublicate(List<Integer> numList){
List<Integer> dupLst = new ArrayList<Integer>();
//Compare each number against all the other numbers except itself.
for(int i =0;i<numList.size();i++) {
for(int j=0 ; j<numList.size();j++) {
//Use equals(): == on Integer compares references and only appears to work for small cached values.
if(i!=j && numList.get(i).equals(numList.get(j))) {
boolean isNumExist = false;
//The loop below avoids adding the same duplicate to the result list twice
for(Integer aNum: dupLst) {
if(aNum.equals(numList.get(i))) {
isNumExist = true;
break;
}
}
if(!isNumExist) {
dupLst.add(numList.get(i));
}
}
}
}
return dupLst;
}