Data structure to hold just keys (not caring about value)

I need to store a list of Strings and need to check if a String exists in the list.
I would normally just use a Map with a String key and a Boolean value, i.e.
HashMap<String, Boolean> map = new HashMap<String, Boolean>();
and then just call map.containsKey(string).
This is how I have always done this kind of lookup in the past, because I know that a hash map gives O(1) access.
I know this might be nitpicky and unimportant, but I was curious whether there is a structure that would save me from storing that boolean value. It seems like a waste of memory: I don't care about the false value, because a missing key already equates to false.
I was thinking that mapping each key to null might do what I want, but I was wondering whether there is a data structure designed for this.

This is what the Set<T> collection is for. The HashSet<T> implementation is O(1) and internally does just what you propose: it is backed by a HashMap<T,Object> in which the value for every key is the same internal Object instance. That is, the source contains
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
and the value of each entry is set to PRESENT.
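As a minimal sketch of the lookup from the question done with a Set instead of a Map<String, Boolean> (the class name KeyOnlyLookup and the sample strings are made up for illustration):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class KeyOnlyLookup {
    // Copies the given names into a HashSet so membership checks are O(1) on average.
    static Set<String> buildKnown(List<String> names) {
        return new HashSet<>(names);
    }

    public static void main(String[] args) {
        Set<String> known = buildKnown(Arrays.asList("alpha", "beta"));
        System.out.println(known.contains("alpha")); // true
        System.out.println(known.contains("gamma")); // false
        // add reports whether the set changed, so a repeated key yields false
        System.out.println(known.add("alpha"));      // false
    }
}
```

There is no Boolean to store or read back here: presence in the set is the "true", absence is the "false".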

Maybe a HashSet<E>?
Or indeed, anything that implements Set<E>, although they don't all have O(1) expected lookup.

HashSet, or some other implementation of the Set interface, may give you the functionality you are looking for.

You should use HashSet<E> - "data structure to hold just keys (not caring about value)". Its implementation is based on HashMap<K,V>, where each element is stored as a key and every value is the same dummy Object, which is exactly what you need.
public class HashSet<E> ... {
    ...
    private transient HashMap<E,Object> map;
    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();
    public HashSet() {
        map = new HashMap<E,Object>();
    }
    ...
    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }
    ...
}

Why not use the usual ArrayList<String>? It has the contains method and everything else...

Related

Efficiently Comparing Checking Maps inside a list of Objects in Java

So, I have an object that has a HashMap inside it (with the corresponding getter and setter, of course).
public class ObjA {
    private HashMap<String, Boolean> mymap;
    ...
}
Then I have another class with a list of those objects (and the corresponding getters/setters):
public class ObjB {
    private List<ObjA> list;
    ...
}
Then, a third class, CheckerClass, needs to check how many ObjA in the list inside ObjB have the same map (meaning the same keys and the same value for each key).
My initial thought was: loop over the list, and for each ObjA convert its map to a String. Collect all the strings in a Collection, and then count duplicates. But I want something more memory efficient and more elegant, that is, easy to read. I'm not sure how to get there.

Here's a way of creating a count of each unique map:
list.stream().collect(Collectors.groupingBy(ObjA::getMyMap, Collectors.counting()));
This works because Map.equals does exactly what you want: checks if the keysets are the same and map to equal values.
So then for each unique map you can find the number of times it appears in the list. Or if you specifically want to know how many duplicates there are, then use .values().stream().filter(c -> c > 1).count() on the resulting map.
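A self-contained sketch of that grouping, assuming Java 8 or later. To keep it short it groups plain maps rather than ObjA instances, so the class name MapGroupingSketch and the helper mapOf are hypothetical stand-ins for the question's classes:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class MapGroupingSketch {
    // Groups the maps by content (Map.equals/hashCode) and counts each group.
    static Map<Map<String, Boolean>, Long> countByContent(List<Map<String, Boolean>> maps) {
        return maps.stream().collect(Collectors.groupingBy(m -> m, Collectors.counting()));
    }

    // Number of distinct maps that occur more than once in the list.
    static long groupsWithDuplicates(List<Map<String, Boolean>> maps) {
        return countByContent(maps).values().stream().filter(c -> c > 1).count();
    }

    // Hypothetical helper that builds a one-entry map for the demo.
    static Map<String, Boolean> mapOf(String key, boolean value) {
        Map<String, Boolean> m = new HashMap<>();
        m.put(key, value);
        return m;
    }

    public static void main(String[] args) {
        List<Map<String, Boolean>> maps =
                Arrays.asList(mapOf("x", true), mapOf("x", true), mapOf("x", false));
        System.out.println(countByContent(maps).get(mapOf("x", true))); // 2
        System.out.println(groupsWithDuplicates(maps));                 // 1
    }
}
```

Note that the maps are used as keys of the result, so they should not be modified while the result is in use.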

My suggestion is to make ObjA a Comparable object. You could do it like this:
public class ObjA implements Comparable<ObjA> {
    private HashMap<String, Boolean> mymap;
    ...
    @Override
    public int compareTo(ObjA o) {
        // override it by comparing keys and values
    }
}
Then, in CheckerClass, you can sort the ObjA list, which places equal objects next to each other, and count the duplicates:
Collections.sort(list);

HashMap with ArrayList key can not find it when Arraylist grows

My problem is that in some part of my code I use an ArrayList as a key in a HashMap, for example:
ArrayList<Integer> array = new ArrayList<Integer>();
And then I put my array as a key in a hash map (I need it this way, I'm sure of that):
HashMap<ArrayList<Integer>, String> map = new HashMap<ArrayList<Integer>, String>();
map.put(array, "value1");
Here comes the problem: when I add a value to my array and then try to retrieve the data using the same array, the hash map can't find it.
array.add(23);
String value = map.get(array);
At this point value is null instead of the string "value1".
While testing I discovered that the hashCode changes when the array list grows, and this is the central point of my problem, but I want to know how I can fix this.

Use an IdentityHashMap. Then that same array instance will always map to the same value, no matter how its contents (and therefore hash code) are changed.
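A small sketch comparing the two map types on the scenario from the question (the class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class MutableKeySketch {
    // Stores a value under an empty list, grows the list, then looks it up again.
    static String lookupAfterMutation(Map<List<Integer>, String> map) {
        List<Integer> key = new ArrayList<>();
        map.put(key, "value1");
        key.add(23); // changes key.hashCode() and key.equals(...)
        return map.get(key);
    }

    public static void main(String[] args) {
        // HashMap looks the key up by its current hashCode, which no longer matches
        System.out.println(lookupAfterMutation(new HashMap<List<Integer>, String>()));         // null
        // IdentityHashMap compares keys with == and ignores the list contents
        System.out.println(lookupAfterMutation(new IdentityHashMap<List<Integer>, String>())); // value1
    }
}
```

The trade-off is that an IdentityHashMap only finds the entry when you pass the very same list instance; a different list with equal contents will not match.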

You can't use a mutable object (that is, one whose hashCode changes) as the key of a HashMap. See if you can find something else to use as the key instead. It's somewhat unusual to map a collection to a string; the other way around is much more common.
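If the list contents really are the key, one possible approach (an assumption on my part, not something the question asked for) is to store an unmodifiable snapshot of the list, so the key can no longer change after it goes into the map. The names SnapshotKeySketch and snapshot are hypothetical:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SnapshotKeySketch {
    // Copies the list and wraps the copy so the key cannot be modified later.
    static List<Integer> snapshot(List<Integer> source) {
        return Collections.unmodifiableList(new ArrayList<>(source));
    }

    public static void main(String[] args) {
        List<Integer> working = new ArrayList<>();
        Map<List<Integer>, String> map = new HashMap<>();
        map.put(snapshot(working), "value1");

        working.add(23); // only the working list changes; the stored key is untouched

        // Any list equal to the stored snapshot (here: an empty list) finds the value
        System.out.println(map.get(Collections.emptyList())); // value1
        // The grown list is a different key by content, so it does not match
        System.out.println(map.get(snapshot(working)));       // null
    }
}
```

This keeps lookups by content working, at the cost of one copy per key.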

It's a weird use case, but if you must do it then you can subclass ArrayList and override the hashCode method.

It's a bit of an odd thing to try to do, in my opinion.
I assume what you are trying to model is a variable-length key made up of n integers, and you assume that the hash of the ArrayList will stay consistent, but I'm not sure that is the case.
I would suggest that you either subclass ArrayList and override the hashCode() & equals() methods, or wrap the ArrayList in a key class.

I'm almost certain you would not want to do that. It's more likely you would want a Map<String, List<Integer>>. However, if you absolutely must do this, use a holder class:
public class ListHolder {
    private List<Integer> list = new ArrayList<Integer>();
    public List<Integer> getList() { return list; }
}
Map<ListHolder, String> map = new HashMap<ListHolder, String>();

The basic reason: when we call HashMap.put(k, v), the map computes k.hashCode() to decide where to store the entry, and it later locates the value using that same number.
ArrayList inherits its hashCode() implementation from the abstract class AbstractList, shown below. Clearly, the hashCode value changes after we add an element. So HashMap.get(k) can no longer find the value, because no entry was stored under the new hash code.
public int hashCode() {
    int hashCode = 1;
    for (E e : this)
        hashCode = 31*hashCode + (e==null ? 0 : e.hashCode());
    return hashCode;
}
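To see the effect of that formula on the list from the question, here is a short sketch (the class name is made up): an empty list hashes to 1, and after add(23) the hash becomes 31 * 1 + 23 = 54.

```java
import java.util.ArrayList;
import java.util.List;

final class ListHashCodeSketch {
    // Returns the list's hash code before and after appending one element.
    static int[] hashBeforeAndAfter(int element) {
        List<Integer> list = new ArrayList<>();
        int before = list.hashCode(); // the loop body never runs, so this is 1
        list.add(element);
        int after = list.hashCode();  // 31 * 1 + Integer.hashCode(element)
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] hashes = hashBeforeAndAfter(23);
        System.out.println(hashes[0]); // 1
        System.out.println(hashes[1]); // 54
    }
}
```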

Java HashSet vs HashMap

I understand that HashSet is based on the HashMap implementation but is used when you need a unique set of elements. So why, in the following code, when putting the same objects into the map and the set, is the size of both collections equal to 1? Shouldn't the map size be 2? If the sizes of both collections are equal, I don't see any difference between using these two collections.
Set<SimpleObject> testSet = new HashSet<SimpleObject>();
Map<Integer, SimpleObject> testMap = new HashMap<Integer, SimpleObject>();
SimpleObject simpleObject1 = new SimpleObject("Igor", 1);
SimpleObject simplObject2 = new SimpleObject("Igor", 1);
testSet.add(simpleObject1);
testSet.add(simplObject2);
Integer key = Integer.valueOf(10);
testMap.put(key, simpleObject1);
testMap.put(key, simplObject2);
System.out.println(testSet.size());
System.out.println(testMap.size());
The output is 1 and 1.
SimpleObject code:
public class SimpleObject {
    private String dataField1;
    private int dataField2;
    public SimpleObject(){}
    public SimpleObject(String data1, int data2){
        this.dataField1 = data1;
        this.dataField2 = data2;
    }
    public String getDataField1() {
        return dataField1;
    }
    public int getDataField2() {
        return dataField2;
    }
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result
                + ((dataField1 == null) ? 0 : dataField1.hashCode());
        result = prime * result + dataField2;
        return result;
    }
    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        SimpleObject other = (SimpleObject) obj;
        if (dataField1 == null) {
            if (other.dataField1 != null)
                return false;
        } else if (!dataField1.equals(other.dataField1))
            return false;
        if (dataField2 != other.dataField2)
            return false;
        return true;
    }
}

The map holds unique keys. When you invoke put with a key that exists in the map, the object under that key is replaced with the new object. Hence the size 1.
The difference between the two should be obvious:
in a Map you store key-value pairs
in a Set you store only the keys
In fact, a HashSet has a HashMap field, and whenever add(obj) is invoked, the put method is invoked on the underlying map: map.put(obj, DUMMY), where the dummy object is a private static final Object DUMMY = new Object(). So the map is populated with your object as the key and a value that is of no interest.

A key in a Map can only map to a single value. So the second time you put in to the map with the same key, it overwrites the first entry.

In the case of the HashSet, adding an equal object again is more or less a no-op. In the case of a HashMap, putting a new key-value pair with an existing key overwrites the existing value for that key. Below I've added equals() checks to your code:
SimpleObject simpleObject1 = new SimpleObject("Igor", 1);
SimpleObject simplObject2 = new SimpleObject("Igor", 1);
// If the line below prints true, the 2nd add will not add anything
System.out.println("Are the objects equal? " + simpleObject1.equals(simplObject2));
testSet.add(simpleObject1);
testSet.add(simplObject2);
Integer key = Integer.valueOf(10);
// This is a no-brainer as you have the exact same key, but let's keep it consistent.
// If the check below prints true, the 2nd put overwrites the 1st key-value pair.
testMap.put(key, simpleObject1);
testMap.put(key, simplObject2);
System.out.println("Are the keys equal? " + key.equals(key));
System.out.println(testSet.size());
System.out.println(testMap.size());

I just wanted to add to these great answers an answer to your last dilemma. You wanted to know what the difference between these two collections is if they report the same size after your insertions. Well, you can't really see the difference here, because you are inserting two values into the map under the same key, and hence replacing the first value with the second. You would see the real difference (among others) if you inserted the same value into the map under different keys. Then you would see that a map can hold duplicate values but not duplicate keys, while a set cannot hold duplicate elements at all. This is the main difference here.
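A short sketch of that difference, using Strings instead of SimpleObject to stay self-contained (the class and method names are made up for illustration):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class SetVersusMapSketch {
    // Adding an equal element twice leaves a single element in the set.
    static int setSizeAfterEqualAdds() {
        Set<String> set = new HashSet<>();
        set.add("Igor");
        set.add("Igor");
        return set.size();
    }

    // The same value stored under two different keys gives two entries.
    static int mapSizeWithDistinctKeys() {
        Map<Integer, String> map = new HashMap<>();
        map.put(10, "Igor");
        map.put(11, "Igor");
        return map.size();
    }

    // Putting under an existing key replaces the value and keeps one entry.
    static String valueAfterSameKeyPut() {
        Map<Integer, String> map = new HashMap<>();
        map.put(10, "first");
        map.put(10, "second");
        return map.get(10);
    }

    public static void main(String[] args) {
        System.out.println(setSizeAfterEqualAdds());   // 1
        System.out.println(mapSizeWithDistinctKeys()); // 2
        System.out.println(valueAfterSameKeyPut());    // second
    }
}
```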

The answer is simple: it is the nature of a HashSet.
HashSet internally uses a HashMap with a dummy object named PRESENT as the value; the key of this HashMap is your object.
simpleObject1.hashCode() and simplObject2.hashCode() return the same int, and the two objects are equal according to equals(). So?
When you add simpleObject1 to the HashSet, it is put into the internal HashMap with simpleObject1 as the key. Then, when you call add(simplObject2), you get false because an equal key is already present in the internal HashMap.
As a little extra info, HashSet relies on the objects' equals() and hashCode() contract to provide O(1) performance. Note that HashSet does permit the null element.

I think the major difference is this:
HashSet is stable in the sense that it doesn't replace a duplicate element (once the first unique element is inserted, all later duplicates are simply discarded), whereas HashMap replaces the old value with the new one when a duplicate key is put. So HashMap has the extra work of storing the new value for a duplicate key.

public class HashSet<E>
extends AbstractSet<E>
implements Set<E>, Cloneable, Serializable
This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. This class permits the null element.
This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets. Iterating over this set requires time proportional to the sum of the HashSet instance's size (the number of elements) plus the "capacity" of the backing HashMap instance (the number of buckets). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.
Note that this implementation is not synchronized. If multiple threads access a hash set concurrently, and at least one of the threads modifies the set, it must be synchronized externally. This is typically accomplished by synchronizing on some object that naturally encapsulates the set. If no such object exists, the set should be "wrapped" using the Collections.synchronizedSet method. This is best done at creation time, to prevent accidental unsynchronized access to the set.
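A minimal sketch of the wrapping described above, with two threads adding disjoint ranges to one shared set. The class name, method name and thread setup are assumptions made for the demo:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class SynchronizedSetSketch {
    // Two threads add disjoint ranges of integers to one wrapped set.
    static int addFromTwoThreads(int perThread) {
        // Wrap at creation time so no unsynchronized reference to the HashSet escapes
        Set<Integer> shared = Collections.synchronizedSet(new HashSet<Integer>());
        Thread first = new Thread(() -> {
            for (int i = 0; i < perThread; i++) shared.add(i);
        });
        Thread second = new Thread(() -> {
            for (int i = perThread; i < 2 * perThread; i++) shared.add(i);
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(addFromTwoThreads(1000)); // 2000
    }
}
```

Individual calls such as add and contains are synchronized by the wrapper; iterating over the set still requires synchronizing on the wrapper manually.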

Java Hashmap/Hashtable and numbering

The question is simple: I have to implement the JTree TreeModel interface, which requires that every object has a number. The tree will represent data that is kept in a HashMap/Hashtable. The keys in that map are client objects and the values are arrays of resources (or ArrayLists), so numbering is only a problem at the top level. What would be the easiest way to number the keys in a HashMap/Hashtable?

public class IndexedMap<V> extends HashMap<Long, V> {
    private final AtomicLong index = new AtomicLong();
    public void put(V value) {
        put(index.getAndIncrement(), value);
    }
}
IndexedMap<Object> objects = new IndexedMap<Object>();
objects.put("foo");
objects.put("bar");
// ...
But why don't you just use an ArrayList? It holds objects by an index, exactly what you need.

Sounds like the user-object keys need to be ordered - their "number" would be derived from their spot in the ordering.
Are the keys Comparable? If so, maybe use a TreeMap. If not, I suppose insertion order is your best bet (LinkedHashMap)
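A rough sketch of the insertion-order idea: keep the data in a LinkedHashMap and take a list snapshot of its keys, so that a key's position in the list serves as its number. That would cover what TreeModel's getChild(parent, index) and getIndexOfChild(parent, child) need at the top level. The class name, method names and the sample client names are hypothetical:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class KeyNumberingSketch {
    // Snapshot of the keys in the map's iteration order; list positions act as the numbers.
    static <K> List<K> numberedKeys(Map<K, ?> map) {
        return new ArrayList<>(map.keySet());
    }

    // Sample data with made-up client names, inserted in a deliberate order.
    static Map<String, List<String>> sampleData() {
        Map<String, List<String>> data = new LinkedHashMap<>();
        data.put("clientB", new ArrayList<String>());
        data.put("clientA", new ArrayList<String>());
        return data;
    }

    public static void main(String[] args) {
        List<String> keys = numberedKeys(sampleData());
        System.out.println(keys.get(0));             // clientB (first inserted)
        System.out.println(keys.indexOf("clientA")); // 1
    }
}
```

The snapshot has to be rebuilt whenever keys are added or removed. With a TreeMap the same helper would number the keys in sorted order instead.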

What's the best way to detach a Collection from a Map in Java?

I obtain a HashSet from a HashMap and I don't want my modifications to the HashSet to be reflected in the HashMap values.
What's the best way of doing something like this:
HashSet<Object> hashset = new HashSet<Object>(hashmap.values());
// Something like ...
hashset.detach();
// Then I can modify the HashSet without modifying the HashMap values
Edit:
I have to modify an element in the HashSet but I don't want to modify this same element in the HashMap.
Thanks!

If you're creating a new HashSet as per the first line of your code snippet, that's already a separate collection. Adding or removing items from the set won't change your hashMap. Modifying the existing items will, of course - but that's a different matter, and will almost always be a Very Bad Thing (assuming your modifications affect object equality).

When you create the HashSet from hashMap.values() like this, then it's already "detached" in the sense that modifying the HashSet will not influence the map it was constructed from.
However, if you modify an object inside the set (for example calling a setter on it), then those changes will be reflected inside the HashMap as well (since the Set and the Map will refer to the same object).
One way around this is to make defensive copies of each element (using clone() or by using a copy constructor).
Another way is to use immutable objects.
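A minimal sketch of the defensive-copy approach using a copy constructor. The Item class is a hypothetical stand-in for whatever the map actually holds, and it deliberately keeps the default identity-based equals/hashCode so that mutating a copy does not disturb the HashSet:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class DetachSketch {
    // Hypothetical mutable value type with a copy constructor.
    static final class Item {
        private int value;
        Item(int value) { this.value = value; }
        Item(Item other) { this.value = other.value; } // copy constructor
        int getValue() { return value; }
        void setValue(int value) { this.value = value; }
    }

    // Builds a set of copies, so changes to its elements never reach the map's values.
    static Set<Item> detachedCopies(Map<String, Item> source) {
        Set<Item> copies = new HashSet<>();
        for (Item item : source.values()) {
            copies.add(new Item(item));
        }
        return copies;
    }

    public static void main(String[] args) {
        Map<String, Item> map = new HashMap<>();
        map.put("a", new Item(1));
        Set<Item> copies = detachedCopies(map);
        copies.iterator().next().setValue(99);       // mutate the copy only
        System.out.println(map.get("a").getValue()); // 1
    }
}
```

If Item held references to other mutable objects, the copy constructor would have to copy those as well.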

You are close:
Collection<Object> values = hashmap.values(); // is backed by the map
// create a new HashSet seeded from that collection
Set<Object> hashset = new HashSet<Object>(values);

If you are trying to copy the values and change the state of the values, you need to create a deep copy, which relies on knowing how to create copies of the objects held in the Map as values. Hopefully this test illustrates what I mean.
@Test
public void testHashMap() throws Exception {
    final Map<Integer, TestContainer<Double>> hashmap = new HashMap<Integer, TestContainer<Double>>();
    final TestContainer<Double> t1 = new TestContainer<Double>(1d);
    final TestContainer<Double> t2 = new TestContainer<Double>(2d);
    hashmap.put(1, t1);
    hashmap.put(2, t2);
    // create a separate collection which can be modified
    final Set<TestContainer<Double>> hashset = new HashSet<TestContainer<Double>>(hashmap.values());
    assertEquals(2, hashmap.size());
    assertEquals(2, hashset.size());
    hashset.remove(t2);
    assertEquals(2, hashmap.size());
    assertEquals(1, hashset.size());
    // the set shares its element objects with the map, so mutating the
    // remaining element through the set also changes t1 (1d becomes 2d)
    hashset.iterator().next().o += 1;
    assertEquals(2d, t1.o, 0d);
}
private static final class TestContainer<T> {
    private T o;
    private TestContainer(final T o) {
        this.o = o;
    }
}

Try this:
public MyType cloneObject(MyType o) {
    MyType clone = new MyType();
    // TODO copy the attributes of 'o' to 'clone'
    return clone;
}
public HashSet<MyType> populateHashSet(HashMap<Object,MyType> hashMap) {
    HashSet<MyType> hashSet = new HashSet<MyType>();
    for (MyType o : hashMap.values()) {
        hashSet.add(cloneObject(o));
    }
    return hashSet;
}
That said, I would be very careful about making copies of objects unless all the attributes of the object are primitive/immutable types. If you just copy an attribute object reference to an object reference in the clone then your 'clone' can still produce side-effects in the original object by changing the objects it references.
