Set created from Arrays.asList storing duplicate elements - java

I used a structure like the following to get unique elements from an array of objects.
dataList.put(column, new LinkedList<Object>(new HashSet<Object>(Arrays.asList(entry.getValue()))));
The array from entry.getValue() is a 100 element array containing values from 1 to 99, with 1 being repeated twice.
The documentation says that Arrays.asList(arr[]) method returns a fixed-length list of the same length as the array.
I have observed that the set created also contains the duplicate values given by the original array.
Please explain this behaviour.
More details.
I have also tried using set.addAll(Arrays.asList(entry.getValue()); , where set is a HashSet and got the same results.
The array returned by entry.getValue() is an array of type java.lang.Short

Most likely, you didn't override equals() in the class of the objects in the array returned by entry.getValue(). And especially since you are using a HashSet, you should override hashCode() too, so that it "agrees" with equals(), as per the javadoc of equals():
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
If you don't override equals(), each instance will not be equal() to any other instance despite its "value" being the same, because that's the default implementation of equals(), so the Set will see both "1" objects as "different".

After running the above code fragment with arrays of various data types I found that arrays of primitive types are stored as a single array object, whereas arrays of Java Classes like String, Integer, Float etc. are stored as a collection of their elements.
In my case I had an array of int[] passed into the HashSet as a List, which was taken as a single Object and no filtering out of duplicate elements was done. When the array contained Integer objects, the HashSet could filter out the duplicates.

Related

What happens if I change an Object that is inside a HashSet?

I made an own class called Region and I store instances of Region in a HashSet. I use a HashSet, that there are no Objects which are equal in the list. The String name of a Region should be unique in the HashSet, so I have overriden the equals method.
My Question:
What happens if I store two regions with different names into the HashSet and then I make the different names equal (by a setter for the name)?
This is no duplicate. The other question is about equal HashSets and not about equal objects in HashSets.
The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set.
-- the Set Javadoc

How does Java determine uniqueness of Hashmap key?

I want to maintain a list of objects such that each object in the list is unique.Also I want to retrieve it at one point. Objects are in thousands and I can't modify their source to add a unique id. Also hascodes are unreliable.
My approach was to utilize the key uniqueness of a map.
Say a maintain a map like :
HashMap<Object,int> uniqueObjectMap.
I will add object to map with as a key and set a random int as value. But how does java determine if the object is unique when used as a key ?
Say,
List listOne;
List listTwo;
Object o = new Object;
listOne.add(o);
listTwo.add(o);
uniqueObjectMap.put(listOne.get(0),randomInt()); // -- > line 1
uniqueObjectMap.put(listTw0.get(0),randomInt()); // --> line 2
Will line 2 give an unique key violation error since both are referring to the same object o ?
Edit
So if will unqiueObjectMap.containsKey(listTwo.get(0)) return true ? How are objects determined to be equal here ? Is a field by field comparison done ? Can I rely on this to make sure only one copy of ANY type of object is maintained in the map as key ?
Will line 2 give an unique key violation error since both are referring to the same object o ?
- No. If a key is found to be already present, then its value will be overwritten with the new one.
Next, HashMap has a separate hash() method which Applies a supplemental hash function to a given hashCode (of key objects), which defends against poor quality hash functions.
It does so by calling the Object's hashcode() function.
The default implementation is roughly equivalent to the object's unique identifier (much like a memory address); however, there are objects that are compare-by-value. If dealing with a compare-by-value object, hashcode() will be overridden to compute a number based on the values, such that two identical values yield the same hashcode() number.
As for the collection items that are hash based, the put(...) operation is fine with putting something over the original location. In short, if two objects yeild the same hashcode() and a positive equals(...) result, then operations will assume that they are for all practical purposes the same object. Thus, put may replace the old with the new, or do nothing, as the object is considered the same.
It may not store two copies in the same "place" as it makes no sense to store two copies at the same location. Thus, sets will only contain one copy, as will map keys; however, lists will possibly contain two copies, depending on how you added the second copy.
How are objects determined to be equal here ?
By using equals and Hashcode function of Object class.
Is a field by field comparison done ?
No, if you dont implement equals and hashcode, java will compare the references of your objects.
Can I rely on this to make sure only one copy of ANY type of object is maintained in the map as key ?
No.
Using a Set is a better approch than using Map because it removes duplicates by his own, but in this case it wont work either because Set determinates duplicates the same way like a Map does it with Keys.
If you will refer to same then it ll not throw an error because when HashMap get same key then it's related value will be overwrite.
If the same key is exist in HashMap then it will be overwritten.
if you want to check if the key or value is already exist or not then you can use:
containsKey() and containsValue().
ex :
hashMap.containsKey(0);
this will return true if the key named 0 is already exist otherwise false.
By getting hashcode value using hash(key.hashCode())
HashMap has an inner class Entry with attributes
final K key;
V value;
Entry<K ,V> next;
final int hash;
Hash value is used to calculate the index in the array for storing Entry object, there might be the scenario where 2 unequal object can have same equal hash value.
Entry objects are stored in linked list form, in case of collision, all entry object with same hash value are stored in same Linkedlist but equal method will test for true equality. In this way, HashMap ensure the uniqueness of keys.

ArrayList<int[]> containing a value

if i have an ArrayList<int[]> example and I want to check if {2,4} is in it how would I do this?
exaple.contains({2,4}); //doesn't work
and
exaple.contains(2,4); //doesn't work either
what is wrong with the code?
Arrays don't override the Object.equals() method (that ArrayList.contains() uses to compare objects). So an array is only equal to itself. You'll have to loop through the list and compare each element with your array using Arrays.equals().
What I suspect, though, is that you shouldn't have a List<int[]>, but a List<Coordinate>. The Coordinate class could then override equals() and hashCode(), and you would be able to use
example.contains(new Coordinate(2, 4))
You could also use a List<List<Integer>>, but if what I suspect is true (i.e. you're using arrays to hold two coordinates that should be in class), then go with the custom Coordinate class.
Even if you did it properly, ie:
int[] vals = {2, 4};
exaple.contains(vals);
You would not return true, because the contains method will use the equals method. Unless you have overridden the equals method, then this will always resolve to false, unless you pass in the exact same array.
You might want to take a look at this. And please post a bit of you're code.
i need to find a integer data in arraylist?
The contains() method checks, if ArrayList contains a specific object, it doesn't concern itself with the values inside an object. In other words, if you instantiate two {2,4} arrays, it will be two different objects and contains() method will diferentiate between them.
What you need to do is either override the contains() method to look on the content of arrays, instead of only the reference, or you can drop the contains() method completely and check it by hand.

Contains and equals

I'm a little befuddled by some code:
for (AbstractItem item : mSetOfItems) {
if (item.equals(pPrimaryItem))
{
System.out.println("Contains? " + mSetOfItems.contains(pPrimaryItem));
}
}
How could it be possible that item.equals(pPrimaryItem) resolves as true, and mSetOfItems.contains(pPrimaryItem) resolves as false? Because that's what I'm seeing in my code.
In other words, if I iterate through my set, I can find an element equal to my test element. But if I use contains, my test elements is not reported being in the set. I'm baffled because I thought contains used equals. What could I be overlooking?
You didn't give the type of mSetOfItems, but I'm guessing that AbstractItem overrides .equals() but not .hashcode(). This is bad.
If mSetOfItems uses hashcode for lookup, which it could based on its type, you'll get the behavior you described.
Your assumption is that .contains() is implemented with iteration and .equals(). There's no list interface which guarantees that.
What is the implementation of mSetOfItems?
If it's a tree, it could be that your comparison function returns inconsistent values.
If it's a hash, it could be that your equals() returns true for objects with different hash codes, or that the object's hashCode() has changed since it was inserted into the set.
If your set is a TreeSet or some other set where you're using a custom comparator, then you could see this if the comparator was broken, either by not returning a valid sorted order or by having objects that are actually equal compare unequal. When the set internally looks up an element and uses the comparator, it would make a wrong choice and not see the element.
If your set is a HashSet, your hash function could be broken and cause two objects that are equal to have different hash code. Internally as the HashSet uses the object's hash code to figure out where to look, it might end up looking in the wrong bucket.
Alternatively, if you store objects in a Set of any sort and then modify them, you might end up breaking some internal invariant of the Set. For instance, if you store something in a HashSet and then change its value, it will be in the wrong bucket, and if you have a TreeSet and change the value it may appear in the wrong spot in sorted order.
If you are concurrently modifying the set, it's possible that you might have added the element in another thread but not had any guarantees that the operation that made that change be visible in another thread. The second thread would then not see the element even if it were added.
Check the hashcode() method of your class
If mSetOfItems is a java.util.HashTable (or similar 'Hash' Collection, Set, etc) then you must implement hashCode() as well. boolean contains(Object elem) will first try to find the passed object by calculating its hash and retrieving it in the Collection. Once contains finds something, it will then use the equals method to verify that the two objects are the same objects according to your implementation.
If not properly overridden, hashCode() will return an unpredictable int that is usually the integer representation of the internal address of the object itself. This will always be different for two distinct objects no matter the the values of their instance variables. If not overridden, contains won't be able to find it any objects...
When implementing hashCode() remind that:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Also, make sure that you properly overridden the equals function by respecting its signature:
public boolean equals(Object obj);

Setting key type in HashMap, how?

hi
I want to create a HashMap (java) that stores Expression, a little object i've created.
How do I choose what type of key to use? What's the difference for me between integer and String? I guess i just don't fully understand the idea behind HashMap so i'm not sure what keys to use.
Thanks!
Java HashMap relies on two things:
the hashCode() method, which returns an integer that is generated from the key and used inside the map
the equals(..) method, which should be consistent to the hash calculated, this means that if two keys has the same hashcode than it is desiderable that they are the same element.
The specific requirements, taken from Java API doc are the following:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
If you don't provide any kind of specific implementation, then the memory reference of the object is used as the hashcode. This is usually good in most situations but if you have for example:
Expression e1 = new Expression(2,4,PLUS);
Expression e2 = new Expression(2,4,PLUS);
(I don't actually know what you need to place inside your hashmap so I'm just guessing)
Then, since they are two different object although with same parameters, they will have different hashcodes. This could be or not be a problem for your specific situation.
In case it isn't just use the hasmap without caring about these details, if it is you will need to provide a better way to compute the hashcode and equality of your Expression class.
You could do it in a recursive way (by computing the hashcode as a result of the hashcodes of children) or in a naive way (maybe computing the hashcode over a toString() representation).
Finally, if you are planning to use just simple types as keys (like you said integers or strings) just don't worry, there's no difference. In both cases two different items will have the same hashcode. Some examples:
assert(new String("hello").hashCode() == new String("hello").hashCode());
int x = 123;
assert(new Integer(x).hashCode() == new Integer(123).hashCode());
Mind that the example with strings is not true in general, like I explained you before, it is just because the hashcode method of strings computes the value according to the content of the string itself.
The key is what you use to identify objects. You might have a situation where you want to identify numbers by their name.
Map<String,Integer> numbersByName = new HashMap<String,Integer>();
numbersByName.put("one",Integer.valueOf(1));
numbersByName.put("two",Integer.valueOf(2));
numbersByName.put("three",Integer.valueOf(3));
... etc
Then later you can get them out by doing
Integer three = numbersByName.get("three");
Or you might have a need to go the other way. If you know you're going to have integer values, and want the names, you can map integers to strings
Map<String,Integer> numbersByValue = new HashMap<String,Integer>();
numbersByValue.put(Integer.valueOf(1),"one");
numbersByValue.put(Integer.valueOf(2),"two");
numbersByValue.put(Integer.valueOf(3),"three");
... etc
And get it out
String three = numbersByValue.get(Integer.valueOf(3));
Keys and their associated values are both objects. When you get something from a HashMap, you have to cast it to the actual type of object it represents (we can do this because all objects in Java inherit the Object class). So, if your keys are strings and your values are Integers, you would do something like:
Integer myValue = (Integer)myMap.get("myKey");
However, you can use Java generics to tell the compiler that you're only going to be using Strings and Integers:
HashMap<String,Integer> myMap = new HashMap<String,Integer>();
See http://download.oracle.com/javase/1.4.2/docs/api/java/util/HashMap.html for more details on HashMap.
If you do not want to look up the expressions, why do you want them to store in a map?
But if you want to, then the key is that item you use for lookup.

Categories

Resources