JUnit - compare unknown Collection and ArrayList


I want to compare a Collection (products), which in my case is a LinkedHashMap$LinkedValues, with an ArrayList.
The test
assertThat(products, equalTo(Lists.newArrayList(product1, product2, product3)));
doesn't work because LinkedValues doesn't override the equals method.
So I changed my test to:
assertThat(new ArrayList<>(products), equalTo(Lists.newArrayList(product1, product2, product3)));
Is there a better solution where I do not have to check whether the collection implements the equals method?
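A minimal sketch of the behaviour described above, using only the JDK and strings in place of products (the class name ValuesEqualityDemo and the sample data are made up for illustration). The values() view of a LinkedHashMap inherits equals from Object, so it is only equal to itself, whereas a List copy is compared element by element:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ValuesEqualityDemo {

    // Builds a values() view like the one in the question, with strings standing in for products.
    static Collection<String> sampleValues() {
        Map<String, String> map = new LinkedHashMap<>();
        map.put("a", "A");
        map.put("b", "B");
        return map.values();
    }

    public static void main(String[] args) {
        Collection<String> values = sampleValues();
        List<String> expected = Arrays.asList("A", "B");

        // The view does not override equals, so this is an identity comparison.
        System.out.println(values.equals(expected));                  // false
        // Copying into a List gives element-by-element, order-sensitive equality.
        System.out.println(new ArrayList<>(values).equals(expected)); // true
    }
}
```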

Since you're using Hamcrest, you should use the slightly confusingly named method Matchers.contains(). It checks whether the target collection contains the same elements in the same order as the original collection.
Given
Map<String, String> linkedHashMap = new LinkedHashMap<>();
linkedHashMap.put("a", "A");
linkedHashMap.put("b", "B");
This will pass:
assertThat(linkedHashMap.values(), contains("A", "B"));
and this would fail:
assertThat(linkedHashMap.values(), contains("B", "A"));
Note that Hamcrest went a long time without active development. It works fine and is okay for 99% of usages, but you may be surprised by how good AssertJ is, how much functionality it provides and how easy asserting can be.
With AssertJ:
assertThat(linkedHashMap.values()).containsExactly("A", "B");

Assuming the data type you're using already has an equals method, then there's no need to check for an (un-)implemented equals() function. Otherwise, you would have to create something that compares the data you're using.
On a side note, the two lines of code you have are identical. Did you mean to put something else in the second line?

You can use Arrays.equals:
assertTrue(Arrays.equals(products.toArray(), new Product[] {product1, product2, product3}));
This checks array sizes and ordering of items. Your Product class should implement equals() in a meaningful way.
Note that you can use ArrayList.toArray() to get an array if needed.
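As a rough illustration of the Arrays.equals approach, here is a small sketch with strings instead of products. The helper name sameElementsInOrder is hypothetical and not part of JUnit or the JDK:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;

class ArrayCompareDemo {

    // True when the collection yields exactly the expected elements, in iteration order.
    static boolean sameElementsInOrder(Collection<?> actual, Object... expected) {
        return Arrays.equals(actual.toArray(), expected);
    }

    public static void main(String[] args) {
        // Any Collection works here, including ones that do not override equals.
        Collection<String> products = new LinkedHashSet<>(Arrays.asList("A", "B"));

        System.out.println(sameElementsInOrder(products, "A", "B")); // true
        System.out.println(sameElementsInOrder(products, "B", "A")); // false: order differs
        System.out.println(sameElementsInOrder(products, "A"));      // false: size differs
    }
}
```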

Related

Incomprehensible JUnit error

Trying to test equality of two Maps (including order) by turning them into lists beforehand. There are probably better ways to do it, but I'd like to know why this error comes up. Here is the test:
@Test
public void sortedEntriesTest() {
    List<Map.Entry<String, AtomicInteger>> actualList = stream.sortedEntries(stream.getMap());
    List<Map.Entry<String, AtomicInteger>> expectedList =
            expectedMap.entrySet()
                    .stream()
                    .sorted(Comparator.comparingInt(e -> -e.getValue().get()))
                    .collect(Collectors.toList());
    Assert.assertThat(expectedList, is(actualList));
}
Here is the error:
java.lang.AssertionError:
Expected: is <[file=1, for=1, project=1, is=1, an=1, just=1, example=1, this=2]>
but: was <[file=1, for=1, project=1, is=1, an=1, just=1, example=1, this=2]>
Expected :is <[file=1, for=1, project=1, is=1, an=1, just=1, example=1, this=2]>
Actual :<[file=1, for=1, project=1, is=1, an=1, just=1, example=1, this=2]>

Try
Assert.assertThat(expectedList, is(equalTo(actualList)));
instead.

Explanation:
You are comparing references to two different objects, and those references differ just as the objects do. That is why you are getting the AssertionError: the first reference is not the second reference.
Solution:
Use the equals method (see the Java documentation for List.equals()). It compares the contents of the lists by calling equals on each pair of elements, here the Map.Entry instances.
Assert.assertTrue(expectedList.equals(actualList));
Documentation on Assert.assertTrue
Also, check this StackOverflow question and the first (selected) answer - comparing two maps.
Edit
Since you said that the error is still there, the problem might be in the list's items. You should check how the Map.Entry instances in expectedList and actualList are being created. Their actual types might be different, since Map.Entry is just an interface.
Also, I suggest using a simpler method of getting the desired values for comparison.
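One possible cause, which the question does not confirm: AtomicInteger does not override equals, so two entries holding different AtomicInteger instances with the same count are not equal even though they print identically. A small sketch, using AbstractMap.SimpleEntry as a stand-in for the real map entries (the class and method names are made up for illustration):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class EntryEqualityDemo {

    // An entry shaped like the ones in the test: a word and its mutable counter.
    static Map.Entry<String, AtomicInteger> atomicEntry(String word, int count) {
        return new SimpleEntry<String, AtomicInteger>(word, new AtomicInteger(count));
    }

    // Same entry with the counter unwrapped to an Integer, which does define equals.
    static Map.Entry<String, Integer> plainEntry(Map.Entry<String, AtomicInteger> e) {
        return new SimpleEntry<String, Integer>(e.getKey(), e.getValue().get());
    }

    public static void main(String[] args) {
        Map.Entry<String, AtomicInteger> a = atomicEntry("this", 2);
        Map.Entry<String, AtomicInteger> b = atomicEntry("this", 2);

        System.out.println(a + " vs " + b);                       // this=2 vs this=2
        System.out.println(a.equals(b));                          // false: the counters are distinct objects
        System.out.println(plainEntry(a).equals(plainEntry(b)));  // true: Integer compares by value
    }
}
```

If this is what happens in the test, comparing the unwrapped int values (or the keys and counts separately) would sidestep the problem.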

How to avoid duplicate strings in Java?

I want to be able to add specific words from a text into a vector. The problem is that I want to avoid adding duplicate strings. The first thing that comes to mind is to compare all strings before adding them, but as the number of entries grows this becomes a really inefficient solution. The only "time efficient" solution that I can think of is the unordered_multimap container that was introduced in C++11. I couldn't find a Java equivalent of it. I was thinking of adding the strings to the map and at the end just copying all entries to the vector; that way it would be a lot more efficient than the first solution. Is there any Java library that does what I want? If not, is there a Java equivalent of the C++ unordered_multimap container that I couldn't find?

You can use a Set<String> collection. It does not allow duplicates. You can then choose an implementation:
1) HashSet if you do not care about the order of elements (Strings).
2) LinkedHashSet if you want to keep the elements in the inserting order.
3) TreeSet if you want the elements to be sorted.
For example:
Set<String> mySet = new TreeSet<String>();
mySet.add("a_String");
...
Vector is "old-fashioned" in Java. You had better avoid it.
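To make the difference between the three implementations concrete, here is a short sketch. The class name, helper names and sample words are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class UniqueWordsDemo {

    // Drops duplicates and keeps the first occurrence of each word, in the order seen.
    static List<String> uniqueInInsertionOrder(List<String> words) {
        return new ArrayList<>(new LinkedHashSet<>(words));
    }

    // Drops duplicates and returns the words in natural (alphabetical) order.
    static List<String> uniqueSorted(List<String> words) {
        return new ArrayList<>(new TreeSet<>(words));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "apple", "pear", "fig", "apple");

        System.out.println(uniqueInInsertionOrder(words)); // [pear, apple, fig]
        System.out.println(uniqueSorted(words));           // [apple, fig, pear]
        // HashSet also removes the duplicates, but its iteration order is unspecified.
        System.out.println(new HashSet<>(words).size());   // 3
    }
}
```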

You can use a set (java.util.Set):
Set<String> i_dont_allow_duplicates = new HashSet<String>();
i_dont_allow_duplicates.add(my_string);
i_dont_allow_duplicates.add(my_string); // won't add 'my_string' this time.
HashSet will do the job most efficiently, and if you want to keep insertion order you can use LinkedHashSet.

Use a Set. A HashSet will do fine if you do not need to preserve order. A LinkedHashSet works if you need that.

You should consider using a Set:
A collection that contains no duplicate elements. More formally, sets
contain no pair of elements e1 and e2 such that e1.equals(e2), and at
most one null element. As implied by its name, this interface models
the mathematical set abstraction.
HashSet should be good for your use:
HashSet class implements the Set interface, backed by a hash table
(actually a HashMap instance). It makes no guarantees as to the
iteration order of the set; in particular, it does not guarantee that
the order will remain constant over time. This class permits the null
element.
So simply define a Set like this and use it appropriately:
Set<String> myStringSet = new HashSet<String>();

Set<String> set = new HashSet<String>();
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified.
This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

Comparing an array and getting the difference

How would I compare two arrays that might have different lengths and get the difference between each array?
For example:
Cat cat = new Cat();
Dog dog = new Dog();
Alligator alligator = new Alligator();
Animal animals[] = { cat, dog };
Animal animals2[] = { cat, dog, alligator };
How would I compare these two arrays and make it return the instance of Alligator?

I would suggest that your question needs to be clarified. Currently, everyone is guessing about what you are actually asking.
Are the arrays intended to represent sets, or lists, or something in between? In other words, does element order matter, and can there be duplicates?
What does "equal" mean? Does new Cat() "equal" new Cat()? Your example suggests that it does!!
What do you mean by the "difference"? Do you mean set difference?
What do you want to happen if the two arrays have the same length?
Is this a once-off comparison or does it occur repeatedly for the same arrays?
How many elements are there in the arrays (on average)?
Why are you using arrays at all?
Making the assumption that these arrays are intended to be true sets, then you probably should be using HashSet instead of arrays, and using collection operations like removeAll and retainAll to calculate the set difference.
On the other hand, if the arrays are meant to represent lists, it is not at all clear what "difference" means.
If it is critical that the code runs fast, then you most certainly need to rethink your data structures. If you always start with arrays, you are not going to be able to calculate the "differences" fast ... at least in the general case.
Finally, if you are going to use anything that depends on the equals(Object) method (and that includes any of the Java collection types), you really need to have a clear understanding of what "equals" is supposed to mean in your application. Are all Cat instances equal? Are they all different? Are some Cat instances equal and others not? If you don't figure this out, and implement the equals and hashCode methods accordingly, you will get confusing results.

I suggest that you put your objects in sets and then use an intersection of the sets:
// Considering you put your objects in setA and setB
Set<Object> intersection = new HashSet<Object>(setA);
intersection.retainAll(setB);
After that you can use removeAll to get a difference to any of the two sets:
setA.removeAll(intersection);
setB.removeAll(intersection);
Inspired by: http://hype-free.blogspot.com/2008/11/calculating-intersection-of-two-java.html
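A small sketch of the same idea that works on copies, so the input sets are left untouched. Strings stand in for the animals here, and the class and method names are only illustrative:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetDifferenceDemo {

    // Elements present in both sets; neither input is modified.
    static <T> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Elements of a that are not in b; neither input is modified.
    static <T> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> animals = new HashSet<>(Arrays.asList("cat", "dog"));
        Set<String> animals2 = new HashSet<>(Arrays.asList("cat", "dog", "alligator"));

        System.out.println(difference(animals2, animals)); // [alligator]
        System.out.println(difference(animals, animals2)); // []
    }
}
```

With real Animal objects the result depends on how equals and hashCode are defined for them.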

Well, you could maybe use Set instead and use the removeAll() method.
Or you could use the following simple but slow algorithm:
List<Animal> differences = new ArrayList<Animal>();
for (Animal a1 : animals) {
    boolean isInSecondArray = false;
    for (Animal a2 : animals2) {
        // identity comparison; use a1.equals(a2) if Animal defines equals()
        if (a1 == a2) {
            isInSecondArray = true;
            break;
        }
    }
    if (!isInSecondArray)
        differences.add(a1);
}
Then differences will have all the objects that are in animals array but not in animals2 array. In a similar way you can do the opposite (get all the objects that are in animals2 but not in animals).

You may want to look at this article for more information:
http://download-llnw.oracle.com/javase/tutorial/collections/interfaces/set.html
As was mentioned, removeAll() is made for this, but you will want to do it twice, so that you can create a list of all that are missing in both, and then you could combine these two results to have a list of all the differences.
But, this is a destructive operation, so if you don't want to lose the information, copy the Set and operate on that one.
UPDATE:
It appears that my assumption about what is in the array was wrong, so removeAll() won't work; and with a 5ms requirement, depending on the number of items to search, speed could be a problem.
So, it would appear a HashMap<String, Animal> would be the best option, as it is fast in searching.
Animal is an interface with at least one property, String name. For each class that implements Animal write code for equals and hashCode. You can find some discussion here: http://www.ibm.com/developerworks/java/library/j-jtp05273.html. This way, if you want the hash value to be a combination of the type of animal and the name then that will be fine.
So, the basic algorithm is to keep everything in the hashmaps, and then to search for differences, just get an array of keys, and search through to see if that key is contained in the other list, and if it isn't, put it into a List<Object>, storing the value there.
You will want to do this twice, so, if you have at least a dual-core processor, you may get some benefit out of having both searches being done in separate threads, but then you will want to use one of the concurrent datatypes added in JDK5 so that you don't have to worry about synchronization in the combined list of differences.
So, I would write it first as a single thread and test, to get some idea of how much faster it is, also comparing it to the original implementation.
Then, if you need it faster, try using threads, again, compare to see if there is a speed increase.
Before making any optimization ensure you have some metrics on what you already have, so that you can compare and see if the one change will lead to an increase in speed.
If you make too many changes at a time, one may have a large improvement on speed, but others may lead to a performance decrease, and it wouldn't be seen, which is why each change should be one at a time.
Don't lose the other implementations though, by using unit tests and testing perhaps 100 times each, you can get an idea as to what improvement each change gives you.
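The single-threaded version of the map-based search could look roughly like this. It is only a sketch: the maps are keyed by the animal's name as assumed above, strings stand in for the Animal values, and missingFrom is a made-up helper name:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MapDifferenceDemo {

    // Values of 'source' whose keys do not occur in 'other'.
    static <K, V> List<V> missingFrom(Map<K, V> source, Map<K, V> other) {
        List<V> missing = new ArrayList<>();
        for (Map.Entry<K, V> entry : source.entrySet()) {
            // containsKey on a HashMap is a constant-time lookup on average
            if (!other.containsKey(entry.getKey())) {
                missing.add(entry.getValue());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, String> first = new HashMap<>();
        first.put("cat", "a cat");
        first.put("dog", "a dog");

        Map<String, String> second = new HashMap<>(first);
        second.put("alligator", "an alligator");

        // Run the search in both directions to collect all differences.
        System.out.println(missingFrom(second, first)); // [an alligator]
        System.out.println(missingFrom(first, second)); // []
    }
}
```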

I don't care about perf for my usages (and you shouldn't either, unless you have a good reason to, and you find out via your profiler that this code is the bottleneck).
What I do is similar to functional's answer. I use LINQ set operators to get the exception on each list:
http://msdn.microsoft.com/en-us/library/bb397894.aspx
Edit:
Sorry, I didn't notice this is Java. Sorry, I'm off in C# la-la land, and they look very similar :)

Problem with implementing removeAll for List of custom object

I have a scenario in my code where I need to compare two Lists and remove from the first list the objects which are present in the second list, akin to how the "removeAll" method works for List. Since my List holds a custom object, the removeAll method won't work for me.
I have tried various methods to make this work:
- implemented equals() and hashCode for the custom object comprising the list
- implemented the Comparable Interface for the custom object
- implemented the Comparator Interface for the custom object
I've even tried using the Apache Commons CollectionUtils and ListUtils methods (subtract, intersect, removeAll). None seem to work.
I understand I will perhaps need to write some custom removal code. But not sure how to go about doing that. Any pointers helping me move in the right direction will be really appreciated.
Thanks,
Jay

Java Collections already cater for your scenario. Call Collection.removeAll(Collection) and it'll remove all the items from the passed in collection using the equals() method to test for equality.
List<String> list1 = new ArrayList<String>();
Collections.addAll(list1, "one", "two", "three", "four");
List<String> list2 = new ArrayList<String>();
Collections.addAll(list2, "three", "four", "five");
list1.removeAll(list2); // now contains "one", "two"
To make this work the objects you're storing just need to properly implement the equals/hashCode contract, which is: given any two objects a and b:
a.equals(b) == b.equals(a)
and:
a.hashCode() == b.hashCode() if a.equals(b)
Improperly defined equals and hashCode methods create undefined behaviour and are the common cause of collections related issues.
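For a custom object, a minimal sketch might look like the following. The Item class and its fields are hypothetical, since the question does not show the real class:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class RemoveAllDemo {

    // Hypothetical value class; two items are equal when id and name match.
    static final class Item {
        final int id;
        final String name;

        Item(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Item)) return false;
            Item other = (Item) o;
            return id == other.id && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            // Must use the same fields as equals so equal items share a hash code.
            return Objects.hash(id, name);
        }
    }

    // Removes from a copy of list1 every item that equals some item in list2.
    static List<String> remainingNames() {
        List<Item> list1 = new ArrayList<>(Arrays.asList(
                new Item(1, "one"), new Item(2, "two"), new Item(3, "three")));
        List<Item> list2 = Arrays.asList(new Item(3, "three"), new Item(4, "four"));

        list1.removeAll(list2);

        List<String> names = new ArrayList<>();
        for (Item item : list1) {
            names.add(item.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(remainingNames()); // [one, two]
    }
}
```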

Overriding equals and hashCode methods is enough to make method removeAll work on custom objects.
It's likely that you didn't override them in a proper way. Some code will help us a lot.

You said:
... Since my List is created on a custom object, the removeAll method won't work for me.
As others have stated, .removeAll() should work for the scenario you described, even for custom objects, as long as the custom objects obey the contracts that Java Collections expects of its objects, including properly implemented equals() and hashCode() methods.
I have tried various methods to make this work: - implemented equals() and hashCode for the custom object comprising the list - implemented the Comparable Interface for the custom object - implemented the Comparator Interface for the custom object ...
It sounds like you are shot-gunning different approaches: coding one, trying it, quickly coding another, trying it, coding yet another, ... It may be worthwhile to slow down and try to understand why each approach failed and/or determine why that approach won't work for your situation, before moving on to the next. If you've already investigated and determined why each approach won't work, please explain in your question. If you haven't, then let us help by posting code.
Since most people agree the first approach (.removeAll()) should have worked and since custom objects are involved, why not take a quick look at this StackOverflow question to see if anything jumps out at you:
Overriding equals and hashCode in Java
"What issues / pitfalls do I need to consider when overriding equals and hashCode in a java class?"

I have found that his original statement is true. removeAll works automatically only if you override remove in the iterator. Just overriding remove in the Collection is not enough, because removeAll (like clear and retainAll) uses the iterator to do its work. Since you should not change the underlying collection while iterating except through the iterator's own remove, removeAll, clear and retainAll will not work unless the iterator overrides remove. If the iterator's remove method throws an UnsupportedOperationException, that is what you will see when you call any of the three methods discussed.
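This applies to collections built on AbstractCollection, whose removeAll walks the iterator and calls its remove. A small sketch with a made-up FixedBag class whose iterator keeps the default remove (which throws UnsupportedOperationException since Java 8):

```java
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorRemoveDemo {

    // Hypothetical collection whose iterator does not support remove().
    static final class FixedBag extends AbstractCollection<String> {
        private final List<String> items;

        FixedBag(String... items) {
            this.items = Arrays.asList(items);
        }

        @Override
        public Iterator<String> iterator() {
            final Iterator<String> it = items.iterator();
            return new Iterator<String>() {
                @Override
                public boolean hasNext() {
                    return it.hasNext();
                }

                @Override
                public String next() {
                    return it.next();
                }
                // remove() is left as the interface default, which throws UnsupportedOperationException.
            };
        }

        @Override
        public int size() {
            return items.size();
        }
    }

    // Returns true when removeAll fails because the iterator cannot remove.
    static boolean removeAllIsUnsupported() {
        try {
            new FixedBag("one", "two").removeAll(Arrays.asList("two"));
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removeAllIsUnsupported()); // true
    }
}
```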

Removing duplicates without overriding hash method

I have a List which contains a list of objects and I want to remove from this list all the elements which have the same values in two of their attributes. I had thought about doing something like this:
List<Class1> myList;
....
Set<Class1> mySet = new HashSet<Class1>();
mySet.addAll(myList);
and overriding the hash method in Class1 so it returns a number which depends only on the attributes I want to consider.
The problem is that I need to do a different filtering in another part of the application so I can't override hash method in this way (I would need two different hash methods).
What's the most efficient way of doing this filtering without overriding hash method?
Thanks

Overriding hashCode and equals in Class1 (just to do this) is problematic. You end up with your class having an unnatural definition of equality, which may turn out to be wrong for other current and future uses of the class.
Review the Comparator interface and write a Comparator<Class1> implementation to compare instances of your Class1 based on your criteria; e.g. based on those two attributes. Then instantiate a TreeSet<Class1> for duplicate detection using the TreeSet(Comparator) constructor.
EDIT
Comparing this approach with @Tom Hawtin's approach:
The two approaches use roughly comparable space overall. The treeset's internal nodes roughly balance the hashset's array and the wrappers that support the custom equals / hash methods.
The wrapper + hashset approach is O(N) in time (assuming good hashing) versus O(NlogN) for the treeset approach. So that is the way to go if the input list is likely to be large.
The treeset approach wins in terms of the lines of code that need to be written.
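A sketch of the TreeSet approach, assuming a stand-in Item class with two significant attributes (attr1, attr2) and one that is ignored (note). These names are invented for illustration and attr1 is assumed to be non-null:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ComparatorDedupDemo {

    // Hypothetical stand-in for Class1; equals and hashCode are deliberately not overridden.
    static final class Item {
        final String attr1;
        final int attr2;
        final String note;

        Item(String attr1, int attr2, String note) {
            this.attr1 = attr1;
            this.attr2 = attr2;
            this.note = note;
        }
    }

    // Keeps one item per (attr1, attr2) pair; the result is ordered by the comparator.
    static List<Item> dedupe(List<Item> items) {
        Comparator<Item> byTwoAttributes =
                Comparator.comparing((Item i) -> i.attr1).thenComparingInt(i -> i.attr2);
        Set<Item> unique = new TreeSet<>(byTwoAttributes);
        unique.addAll(items); // items comparing as 0 to an existing element are not added
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("a", 1, "first"),
                new Item("b", 2, "second"),
                new Item("a", 1, "third"));

        for (Item item : dedupe(items)) {
            System.out.println(item.note); // prints "first", then "second"
        }
    }
}
```

A different filtering elsewhere in the application only needs a different Comparator; Class1 itself does not change.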

Let your Class1 implement Comparable. Then use a TreeSet as in your example (i.e. use the addAll method).

As an alternative to what Roman said you can have a look at this SO question about filtering using Predicates. If you use Google Collections anyway this might be a good fit.

I would suggest introducing a class for the concept of the parts of Class1 that you want to consider significant in this context. Then use a HashSet or HashMap.

Sometimes programmers make things too complicated trying to use all the nice features of a language, and the answers to this question are an example. Overriding anything on the class is overkill. What you need is this:
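class MyClass {
    Object attr1;
    Object attr2;

    // HashSet needs value-based equals/hashCode on this helper key; Class1 itself stays untouched.
    @Override
    public boolean equals(Object o) {
        return o instanceof MyClass
                && java.util.Objects.equals(attr1, ((MyClass) o).attr1)
                && java.util.Objects.equals(attr2, ((MyClass) o).attr2);
    }

    @Override
    public int hashCode() {
        return java.util.Objects.hash(attr1, attr2);
    }
}
List<Class1> list;
Set<Class1> set=....
Set<MyClass> tempset = new HashSet<MyClass>();
for (Class1 c : list) {
    MyClass myc = new MyClass();
    myc.attr1 = c.attr1;
    myc.attr2 = c.attr2;
    if (!tempset.contains(myc)) {
        tempset.add(myc);
        set.add(c);
    }
}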
Feel free to fix up minor irregularities. There will be some issues depending on what you mean by equality for the attributes (and obvious changes if the attributes are primitive). Sometimes we need to write code, not just use the builtin libraries.
