composite keys in map

Hello, I cannot find any information on how to make two keys be treated as equal. That is, I need to provide a custom comparison method that map.put() will use. Implementing Comparable doesn't help.
For example, this code doesn't work as intended. For the purposes of my program, the two keys n and n2 ARE the same.
private class N implements Comparable<N> {
int value;
int stuff;
String z;
@Override
public int compareTo(N arg0) {
if (arg0.z.equals(z))
return 0;
return 1;
}
}
public void dostuff() {
HashMap m = new HashMap();
N n = new N();
n.z = "1";
N n2 = new N();
n2.z = "1";
m.put(n, "one");
m.put(n2, "two");
// will print refs to two instances! - wrong
Iterator it = m.keySet().iterator();
while (it.hasNext()) {
System.err.println(it.next());
}
}

You need to override equals and hashCode - HashMap doesn't use compareTo, which is meant for sorting.
Note that your compareTo implementation is already broken, as it's really only testing for equality. In particular, x.compareTo(y) and y.compareTo(x) both returning 1 violates the contract of compareTo:
The implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) for all x and y.
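A minimal sketch of what overriding equals and hashCode could look like for the class in the question, assuming that z is the only field that defines key identity. The names ZKey and ZKeyDemo are hypothetical and not from the original post.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key class: two keys are equal when their z fields are equal.
final class ZKey {
    final String z;
    int value; // not part of the key identity (assumption)
    int stuff; // not part of the key identity (assumption)

    ZKey(String z) {
        this.z = z;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ZKey)) return false;
        return Objects.equals(z, ((ZKey) o).z);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals: equal keys produce equal hash codes.
        return Objects.hashCode(z);
    }
}

class ZKeyDemo {
    static Map<ZKey, String> build() {
        Map<ZKey, String> m = new HashMap<>();
        m.put(new ZKey("1"), "one");
        m.put(new ZKey("1"), "two"); // equal key: replaces the value "one"
        return m;
    }

    public static void main(String[] args) {
        Map<ZKey, String> m = build();
        System.out.println(m.size() + " entry, value = " + m.get(new ZKey("1")));
    }
}
```

With both methods in place the map holds a single entry, because the second put finds the first key through hashCode and equals.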

Related

TreeSet to find k most frequent words in a book?

The commonly asked question of finding the k most frequent words in a book (where words can be added dynamically) is usually solved using a combination of a trie and a heap.
However, I think a TreeSet alone should suffice and be cleaner, with O(log n) performance for inserts and retrievals.
The treeset would contain a custom object:
class MyObj implements Comparable<MyObj> {
    String value;
    int count;
    public void incrementCount() { count++; }
    // override equals and hashCode to make this object unique by string 'value'
    // override compareTo to compare count
}
Whenever we insert an object into the TreeSet, we first check whether the element is already present. If it is, we get the object and increment its count variable.
Whenever we want to find the k most frequent words, we just iterate over the first k elements of the TreeSet.
What are your views on the above approach? I feel it is easier to code and understand, and that it matches the time complexity of the trie and heap approach for getting the k largest elements.
EDIT: As stated in one of the answers, incrementing the count variable after a MyObj has been inserted does not re-sort the TreeSet/TreeMap. So, after incrementing the count, I will additionally need to remove and reinsert the object in the TreeSet/TreeMap.

Once you add an object to the TreeSet, if the properties used by its compareTo method change, the TreeSet (or the underlying TreeMap) does not reorder the elements. Hence, this approach does not work as you expect.
Here's a simple example to demonstrate it:
public static class MyObj implements Comparable<MyObj> {
String value;
int count;
MyObj(String v, int c) {
this.value = v;
this.count = c;
}
public void incrementCount(){
count++;
}
@Override
public int compareTo(MyObj o) {
return Integer.compare(this.count, o.count); // Orders by ascending frequency (most frequent last)
}
}
public static void main(String[] args) {
Set<MyObj> set = new TreeSet<>();
MyObj o1 = new MyObj("a", 1);
MyObj o2 = new MyObj("b", 4);
MyObj o3 = new MyObj("c", 2);
set.add(o1);
set.add(o2);
set.add(o3);
System.out.println(set);
//The above prints [a-1, c-2, b-4]
//Increment the count of c 4 times
o3.incrementCount();
o3.incrementCount();
o3.incrementCount();
o3.incrementCount();
System.out.println(set);
//The above prints [a-1, c-6, b-4]
// As we can see, the object corresponding to c-6 does not get moved to the end.
//Insert a new object
set.add(new MyObj("d", 3));
System.out.println(set);
//this prints [a-1, d-3, c-6, b-4]
}
EDIT:
Caveats/Problems:
Using only count when comparing two words would drop one of the words if both have the same frequency. So you need to compare the actual words when their frequencies are the same.
It would work if we removed and reinserted the object with the updated frequency. But for that, you need to get that object (the MyObj instance for a given value, to know its frequency so far) from the TreeSet. A Set does not have a get method. Its contains method just delegates to the underlying TreeMap's containsKey method, which identifies the object using the compareTo logic (and not equals). The compareTo function also takes the word's frequency into account, so we cannot identify the word in the set to remove it (unless we iterate over the whole set on each add).
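One possible way around the lookup problem described above is to keep the current counts in a separate HashMap, so the old TreeSet element can be rebuilt and removed before the updated one is inserted. The sketch below is an assumption-laden illustration, not the approach from the question: the class names WordCounter and WordCount are hypothetical, and the set elements are immutable so they never change while stored.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper: a HashMap tracks counts, a TreeSet keeps entries ordered.
class WordCounter {
    // Immutable entry, so an element never changes while it is inside the TreeSet.
    static final class WordCount {
        final String word;
        final int count;

        WordCount(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    private final Map<String, Integer> counts = new HashMap<>();

    // Highest count first; ties are broken by word so that two distinct
    // words never compare as equal and none is silently dropped.
    private final TreeSet<WordCount> ordered = new TreeSet<>(
            Comparator.comparingInt((WordCount e) -> e.count).reversed()
                      .thenComparing((WordCount e) -> e.word));

    void add(String word) {
        int old = counts.getOrDefault(word, 0);
        if (old > 0) {
            // The old element is located through the comparator (count + word).
            ordered.remove(new WordCount(word, old));
        }
        counts.put(word, old + 1);
        ordered.add(new WordCount(word, old + 1));
    }

    List<String> top(int k) {
        List<String> result = new ArrayList<>();
        for (WordCount e : ordered) {
            if (result.size() == k) break;
            result.add(e.word);
        }
        return result;
    }
}
```

Each add costs one hash lookup plus a logarithmic remove and insert, and top(k) only walks the first k elements of the set.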

A TreeMap should work if you remove and reinsert the object, using the frequency as an Integer key and a list of MyObj as the value; the keys are then sorted by frequency. An updated version of the above code demonstrates it:
public class MyObj {
String value;
int count;
MyObj(String v, int c) {
this.value = v;
this.count = c;
}
public int getCount() {
return count;
}
public void incrementCount() {
count++;
}
@Override
public String toString() {
return value + " " + count;
}
public static void put(Map<Integer, List<MyObj>> map, MyObj value) {
List<MyObj> myObjs = map.get(value.getCount());
if (myObjs == null) {
myObjs = new ArrayList<>();
map.put(value.getCount(),myObjs);
}
myObjs.add(value);
}
public static void main(String[] args) {
TreeMap<Integer, List<MyObj>> set = new TreeMap<>();
MyObj o1 = new MyObj("a", 1);
MyObj o2 = new MyObj("b", 4);
MyObj o3 = new MyObj("c", 2);
MyObj o4 = new MyObj("f", 4);
put(set,o1);
put(set,o2);
put(set,o3);
System.out.println(set);
put(set,o4);
System.out.println(set);
}
}

What is the best algorithm to sort a list on the basis of two criteria?

I have a list which I need to sort based on two criteria.
The first criterion is a Boolean, let's say isBig. The second one is a Long, which represents a timestamp.
I need to order the elements of the list in this way: first the elements with isBig = true, then the ones with isBig = false. Within each group, the elements should be ordered by timestamp, descending.
Basically, I expect the result to be something like this:
isBig - 2015/10/29
isBig - 2015/10/28
isBig - 2015/10/27
!isBig - 2015/10/30
!isBig - 2015/10/27
!isBig - 2015/10/26
Let's say the object is this:
public class Item {
Boolean isBig;
Long timestamp;
// ...
}
and the list is just List<Item> list.
I figured out that one method would be to use three for loops: the first to build the two groups, isBig and !isBig, and the second and third to sort the elements within each group. Finally I would merge the two lists.
Is there a more efficient algorithm for sorting lists on the basis of two criteria?

You can sort the list directly using a custom comparison method which checks both criteria.
Use the Collections.sort method and pass a custom comparator with the compare method overridden as follows:
int compare(Item o1, Item o2) {
    if (o1.isBig && !o2.isBig)
        return -1;
    if (!o1.isBig && o2.isBig)
        return 1;
    // Newer timestamps first (descending), as the question requires
    if (o1.timestamp > o2.timestamp)
        return -1;
    if (o1.timestamp < o2.timestamp)
        return 1;
    return 0;
}
If you are obsessed with performance, you could possibly speed it up by a few percent with a more sophisticated approach, but for a list of a few hundred elements the gains would be negligible.
An optimized comparison method:
int compare(Item o1, Item o2) {
    int bigness = (o2.isBig ? 2 : 0) - (o1.isBig ? 2 : 0);
    // Descending by timestamp; assumes the difference does not overflow a long
    long diff = o2.timestamp - o1.timestamp;
    return bigness + Long.signum(diff);
}
It contains no if statements, which means it will probably be faster than the naive version above.
That's probably everything that can be done for performance. If we knew something more about your data (for instance that there are always more big objects than small ones, or that all the timestamps are unique, or that all the timestamps fall within a certain narrow range) we could probably propose a better solution. However, if we assume that your data is arbitrary and has no specific pattern, then the best solution is to use a standard sort utility as I've shown above.
Splitting the list into two sublists and sorting them separately will definitely be slower. Actually the sorting algorithm will most probably divide the data into two groups and then recursively into four groups, and so on. However, the division won't follow the isBig criterion. If you want to learn more, read how quick sort or merge sort work.

To sort on two parameters, you need to do the following.
Implement a Comparator for your objects that compares the fields you have, the Boolean and the timestamp.
Pass this comparator to Collections.sort(), because the list holds objects rather than primitives and they are compared on your own keys.
/**
 * Comparator to sort a list of items so that isBig items come first
 */
public static Comparator<Item> booleanComparator = new Comparator<Item>() {
    @Override
    public int compare(Item e1, Item e2) {
        if (e1.isBig && !e2.isBig)
            return -1;
        if (!e1.isBig && e2.isBig)
            return 1;
        else
            return 0;
    }
};
Use this object in Collections.sort(list, booleanComparator);

In theory, the approach using two separate lists should be faster than the approach using a two-step Comparator, because a comparison based on one field is obviously faster than a comparison based on two. By using two lists you are speeding up the part of the algorithm that has O(n log n) time complexity (the sort), at the expense of an additional initial stage (splitting into two pieces) which has time complexity O(n). Since n log n > n, the two lists approach should be faster for very, very large values of n.
However, in practice we are talking about such tiny differences in times that you have to have extremely long lists before the two lists approach wins out, and so it's very difficult to demonstrate the difference using lists before you start running into problems such as an OutOfMemoryError.
However, if you use arrays rather than lists, and use clever tricks to do it in place rather than using separate data structures, it is possible to beat the two-step Comparator approach, as the code below demonstrates. Before anybody complains: yes I know this is not a proper benchmark!
Even though sort2 is faster than sort1, I would probably not use it in production code. It is better to use familiar idioms and code that obviously works, rather than code that is harder to understand and maintain, even if it is slightly faster.
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;
public class Main {
static Random rand = new Random();
static Compound rand() {
return new Compound(rand.nextBoolean(), rand.nextLong());
}
static Compound[] randArray() {
int length = 100_000;
Compound[] temp = new Compound[length];
for (int i = 0; i < length; i++)
temp[i] = rand();
return temp;
}
static class Compound {
boolean bool;
long time;
Compound(boolean bool, long time) {
this.bool = bool;
this.time = time;
}
@Override
public boolean equals(Object o) {
if (this == o)
return true;
if (o == null || getClass() != o.getClass())
return false;
Compound compound = (Compound) o;
return bool == compound.bool && time == compound.time;
}
@Override
public int hashCode() {
int result = (bool ? 1 : 0);
result = 31 * result + (int) (time ^ (time >>> 32));
return result;
}
}
static final Comparator<Compound> COMPARATOR = new Comparator<Compound>() {
@Override
public int compare(Compound o1, Compound o2) {
int result = (o1.bool ? 0 : 1) - (o2.bool ? 0 : 1);
return result != 0 ? result : Long.compare(o1.time, o2.time);
}
};
static final Comparator<Compound> LONG_ONLY_COMPARATOR = new Comparator<Compound>() {
@Override
public int compare(Compound o1, Compound o2) {
return Long.compare(o1.time, o2.time);
}
};
static void sort1(Compound[] array) {
Arrays.sort(array, COMPARATOR);
}
static void sort2(Compound[] array) {
int secondIndex = array.length;
if (secondIndex == 0)
return;
int firstIndex = 0;
for (Compound c = array[0];;) {
if (c.bool) {
array[firstIndex++] = c;
if (firstIndex == secondIndex)
break;
c = array[firstIndex];
} else {
Compound c2 = array[--secondIndex];
array[secondIndex] = c;
if (firstIndex == secondIndex)
break;
c = c2;
}
}
Arrays.sort(array, 0, firstIndex, LONG_ONLY_COMPARATOR);
Arrays.sort(array, secondIndex, array.length, LONG_ONLY_COMPARATOR);
}
public static void main(String... args) {
// Warm up the JVM and check the algorithm actually works.
for (int i = 0; i < 20; i++) {
Compound[] arr1 = randArray();
Compound[] arr2 = arr1.clone();
sort1(arr1);
sort2(arr2);
if (!Arrays.equals(arr1, arr2))
throw new IllegalStateException();
System.out.println(i);
}
// Begin the test proper.
long normal = 0;
long split = 0;
for (int i = 0; i < 100; i++) {
Compound[] array1 = randArray();
Compound[] array2 = array1.clone();
long time = System.nanoTime();
sort1(array1);
normal += System.nanoTime() - time;
time = System.nanoTime();
sort2(array2);
split += System.nanoTime() - time;
System.out.println(i);
System.out.println("COMPARATOR: " + normal);
System.out.println("LONG_ONLY_COMPARATOR: " + split);
}
}
}

This is called sorting by multiple keys, and it's easy to do. If you're working with a sort library function that takes a comparator callback function to decide the relative ordering of two elements, define the comparator function so that it first checks whether the two input values a and b have equal isBig values, and, if not, immediately returns a.isBig > b.isBig (I'm assuming here that > is defined for boolean values; if not, substitute the obvious test). But if the isBig values are equal, you should return a.timestamp > b.timestamp.
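In Java 8 or later, the multi-key comparison described above can also be written as a comparator chain. This is only a sketch: SortItem is a hypothetical stand-in for the Item class from the question (with primitive fields), and SortItemDemo is an illustrative helper.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the Item class from the question.
class SortItem {
    final boolean isBig;
    final long timestamp;

    SortItem(boolean isBig, long timestamp) {
        this.isBig = isBig;
        this.timestamp = timestamp;
    }
}

class SortItemDemo {
    // First key: isBig, reversed so that true sorts before false.
    // Second key: timestamp, reversed so that newer entries come first.
    static final Comparator<SortItem> BIG_FIRST_THEN_NEWEST =
            Comparator.comparing((SortItem i) -> i.isBig).reversed()
                      .thenComparing(
                              Comparator.comparingLong((SortItem i) -> i.timestamp).reversed());

    // Returns a sorted copy and leaves the input list untouched.
    static List<SortItem> sorted(List<SortItem> items) {
        List<SortItem> copy = new ArrayList<>(items);
        copy.sort(BIG_FIRST_THEN_NEWEST);
        return copy;
    }
}
```

The second key is only consulted when the first key compares as equal, which is exactly the rule stated in the answer.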

You can define a custom comparator and use it to sort the List. E.g.
class ItemComparator implements Comparator<Item> {
    @Override
    public int compare(Item a, Item b) {
        // isBig == true first
        int bc = Boolean.compare(b.isBig, a.isBig);
        if (bc != 0)
            return bc;
        // then newest timestamp first
        return Long.compare(b.timestamp, a.timestamp);
    }
}
and use it like this
Collections.sort(list, new ItemComparator());

Java iteration for possible matches in ArrayList

My question is about iteration and performance. Let's think of following case:
public class Car {
private String name;
private int type;
private int horsePower;
String getKey() {
return type + "_" + horsePower;
}
private final int NUM_OF_CARS = 50000;
public void test() {
List<Car> cars = new ArrayList<Car>(NUM_OF_CARS);
for (int i = 0; i < NUM_OF_CARS; i++) {
Car c = new Car();
if (i == 0 || i == 176 || i == 895 || i == 1500 || i == 4600) {
c.name = "Audi A4 " + i;
c.type = 1;
c.horsePower = 200;
} else {
c.name = "Default";
c.type = 2 + i;
c.horsePower = 201;
}
cars.add(c);
}
// Matches should contain all Audi's since they have same type and horse
// power
long time = SystemClock.currentThreadTimeMillis();
HashMap<String, List<Car>> map = new HashMap<String, List<Car>>();
for (Car c : cars) {
if (map.get(c.getKey()) != null) {
map.get(c.getKey()).add(c);
} else {
List<Car> list = new ArrayList<Car>();
list.add(c);
map.put(c.getKey(), list);
}
}
Iterator<Entry<String, List<Car>>> iterator = map.entrySet().iterator();
while (iterator.hasNext()) {
if (iterator.next().getValue().size() == 1) {
iterator.remove();
}
}
Log.d("test", String.valueOf((SystemClock.currentThreadTimeMillis() - time)));
}
}
Is this the most efficient way of finding all Audi's here?
This took me 1700 ms
Thanks.

It depends why you're iterating. If you really do need to visit every bottom-level Car, then you don't really have a choice. However, if you're looking for specific matches to the String, then you might consider using a Map.

Why don't you try a Map?
http://docs.oracle.com/javase/7/docs/api/java/util/Map.html
Map is the interface, and HashMap is one implementation of it:
http://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html
Here's an example:
Map<String, List<Car>> map = new HashMap<String, List<Car>>();

If you want to look something up by a String, you should use a HashMap. Otherwise, as far as I know, you cannot avoid that kind of iteration.
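As a sketch of the map-based grouping these answers point to, the loop from the question can be condensed with computeIfAbsent and removeIf (both Java 8+). GroupedCar and CarGrouping are hypothetical names; the key format mirrors getKey() from the question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified version of the Car class from the question.
class GroupedCar {
    final String name;
    final int type;
    final int horsePower;

    GroupedCar(String name, int type, int horsePower) {
        this.name = name;
        this.type = type;
        this.horsePower = horsePower;
    }

    String getKey() {
        return type + "_" + horsePower;
    }
}

class CarGrouping {
    // Groups cars by key in a single pass, then drops groups of size one.
    static Map<String, List<GroupedCar>> findMatches(List<GroupedCar> cars) {
        Map<String, List<GroupedCar>> groups = new HashMap<>();
        for (GroupedCar c : cars) {
            // One lookup per car: creates the list on first use, then appends.
            groups.computeIfAbsent(c.getKey(), k -> new ArrayList<>()).add(c);
        }
        groups.values().removeIf(list -> list.size() == 1);
        return groups;
    }
}
```

This performs the same work as the original loop but builds the key once per car instead of up to three times.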

Use hashing collections: HashSet uses Object.hashCode() and Object.equals() to optimize search. How?
You must define your own implementation of MyClass.hashCode() and equals(). hashCode() gives an integer representation of your object (you can do what you want here, but ideally do it in a way where two different objects get different values).
Roughly speaking, HashSet will then do a modulo on your result. For example, if the HashSet has 5000 buckets it will do a modulo 5000 to find the index where to put your object: if your hashCode() returns 10252, then 10252 % 5000 = 252, and your object will be put in an array at index 252.
Finally, when you ask "do I have an instance of BMW x6?", the object you ask for will have its hashCode() method called, which will return 10252 again. The HashSet will then only check whether it has a matching object at index 252.
If two objects ever give the same hashCode, they will be compared through the equals() method.
I hope my explanation was clear. In short, implement hashCode() and equals(), and try to make the implementation of hashCode() efficient so you save time when filling your HashSet.
You will probably also be interested in HashMap, which stores keys and values where the keys use the same hashing mechanism, so you can find an object by its key.
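The lookup described above can be sketched as follows. CarModel and HashLookupDemo are hypothetical names, and bucketIndex only mirrors the simplified modulo explanation in this answer; the real HashSet implementation selects buckets differently.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class identified by its model name.
final class CarModel {
    final String name;

    CarModel(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CarModel)) return false;
        return Objects.equals(name, ((CarModel) o).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

class HashLookupDemo {
    // Simplified bucket selection as described in the text (an illustration only).
    static int bucketIndex(int hashCode, int bucketCount) {
        return Math.floorMod(hashCode, bucketCount);
    }

    // contains() hashes the probe object and then confirms the match with equals().
    static boolean hasModel(Set<CarModel> models, String name) {
        return models.contains(new CarModel(name));
    }

    public static void main(String[] args) {
        Set<CarModel> models = new HashSet<>();
        models.add(new CarModel("BMW x6"));
        System.out.println(hasModel(models, "BMW x6"));
        System.out.println(bucketIndex(10252, 5000));
    }
}
```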

HashMap with Overriding hashcode() and equals() is not working in my case

This is the first time I am posting a problem.
Please help me solve it.
In this code I am using a HashMap to store key-value pairs. Here the key is a String made of three substrings separated by a " " (blank space) delimiter.
For example,
String t1 = new String("A B C");
and stored in the HashMap as
m.put(t1,27);
Here, A, B and C are three different Strings, and different orderings of A, B and C are assumed to be the same key.
So "A B C", "B A C" and "C B A" are all treated as equal.
I implemented hashCode() and equals() for this.
The code below should print only
A B C:61046662
But it is not even calling hashCode() or equals(). Please give me some suggestions.
public class Test {
public int hashCode(){
System.out.println("hashcode method called");
return this.toString().length();
}
public boolean equals(Object obj) {
System.out.println("equal method called ");
int count = 0;
if(!(obj instanceof String))
return false;
if (obj == this)
return true;
count = 0;
StringTokenizer st = new StringTokenizer(((String)obj).toString(), " ");
while(st.hasMoreTokens()){
if(this.toString().contains(st.nextToken())){
count ++;
}
}
return (count == 3);
}
public static void main(String[] args) {
HashMap<String, Integer> m = new HashMap<String, Integer>();
String t1 = new String("A B C");
String t2 = new String("B A C");
String t3 = new String("C B A");
m.put(t1, 27);
m.put(t2, 34);
m.put(t3, 45);
System.out.println(m.get("A B C"));
for(Entry e : m.entrySet()){
System.out.println(((String)e.getKey())+":" +e.getKey().hashCode());
}
}
}

Your equals() and hashCode() methods don't come into the picture because the map keys are of type String, not of type Test. Thus the standard string comparison and hashcode are being used.
You'll need to modify Test so that it holds the string, and change equals() and hashCode() accordingly. You'll then need to change the map to be of type HashMap<Test,Integer>.

The hashCode of the key is used to determine the position and uniqueness of the key. You are adding String objects to the map, so the String.hashCode method is used.
Though you have implemented hashCode for your Test class, these are not used.
To solve your problem you would create your own class used as key, with your own hashCode implementation.
Using your example, you would add a property to your Test class which can hold a string, and use the Test class as key in your map.
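A minimal sketch of such a key class, assuming that two keys should be equal whenever they contain the same space-separated tokens in any order. TokenKey and TokenKeyDemo are hypothetical names; the tokens are sorted once in the constructor so that equals and hashCode stay consistent with each other.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper: the key identity is the sorted list of tokens.
final class TokenKey {
    private final String original;
    private final String[] sortedTokens;

    TokenKey(String text) {
        this.original = text;
        this.sortedTokens = text.trim().split("\\s+");
        Arrays.sort(this.sortedTokens);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TokenKey)) return false;
        return Arrays.equals(sortedTokens, ((TokenKey) o).sortedTokens);
    }

    @Override
    public int hashCode() {
        // Derived from the same sorted tokens that equals compares.
        return Arrays.hashCode(sortedTokens);
    }

    @Override
    public String toString() {
        return original;
    }
}

class TokenKeyDemo {
    static Map<TokenKey, Integer> build() {
        Map<TokenKey, Integer> m = new HashMap<>();
        m.put(new TokenKey("A B C"), 27);
        m.put(new TokenKey("B A C"), 34); // equal key: value replaced
        m.put(new TokenKey("C B A"), 45); // equal key: value replaced
        return m;
    }
}
```

Because HashMap keeps the first key object when an equal key is put again, the map ends up with one entry whose key still prints as "A B C".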

how to swap key in map?

Is there a way to sort these numbers, which are stored in a String variable?
TreeMap<String,List<QBFElement>> qbfElementMap = new TreeMap<String, List<QBFElement>>();
This is the map, where the keys are:
27525-1813,
27525-3989,
27525-4083,
27525-4670,
27525-4911,
27526-558,
27526-1303,
27526-3641,
27526-4102,
27527-683,
27527-2411,
27527-4342
This is the list of keys, and the value for each key is a list.
Now, how can I sort these keys in ascending numeric order?
For example, if I want to sort: 1,2,11,20,31,3,10
the output I want is: 1,2,3,10,11,20,31
but when I use the automatic sorting of TreeMap the output is: 1,10,11,2,20,3,31
How can I sort the keys in ascending numeric order?
The language is Java. Thank you.

The keys in your map are not Integer but String values. That's why the keys are sorted the way you observed.
Either change the Map to
TreeMap<Long,List<QBFElement>> qbfElementMap
or create it with a specialized Comparator that provides the expected numerical order for the String keys.
A mapping from your String values to Longs could be done like this:
private Long convertToLongTypeKey(String key) {
String[] parts = key.split("-");
// next lines assumes, that the second part is in range 0...9999
return Long.parseLong(parts[0]) * 10000 + Long.parseLong(parts[1]);
}
An implementation of Comparator<String> could use the same mapping to create a numerical comparison of two String-based keys:
new TreeMap<String,List<QBFElement>>(new Comparator<String>(){
@Override
public int compare(String key1, String key2) {
String[] parts1 = key1.split("-");
Long long1 = Long.parseLong(parts1[0]) * 10000 + Long.parseLong(parts1[1]);
String[] parts2 = key2.split("-");
Long long2 = Long.parseLong(parts2[0]) * 10000 + Long.parseLong(parts2[1]);
return long1.compareTo(long2);
}
});

You can change the way that the TreeMap sorts its keys by providing a custom comparator to the constructor. If you want, you can define a new Comparator that compares strings by breaking them up into numeric components.
It seems like a better idea, though, would be to not use Strings as your keys. The data you're using as keys is clearly not textual - it's numeric - and you might want to define a custom type to represent it. For example:
public class KeyType implements Comparable<KeyType> {
private final int first;
private final int second;
public KeyType(int first, int second) {
this.first = first;
this.second = second;
}
@Override
public boolean equals(Object other) {
if (!(other instanceof KeyType)) return false;
KeyType realOther = (KeyType) other;
return realOther.first == first && realOther.second == second;
}
@Override
public int hashCode() {
return first + 31 * second;
}
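@Override
public int compareTo(KeyType other) {
// Integer.compare avoids the overflow that subtraction can cause
if (first != other.first)
return Integer.compare(first, other.first);
return Integer.compare(second, other.second);
}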
}
This approach is the most expressive and robust. It gives you better access to the individual fields of the keys you're using, and also prevents you from adding nonsensical keys into the map like the string "Lalalalala". I'd strongly suggest using this approach, or at least one like it. The type system is your friend.

A TreeMap can take a custom comparator for custom sorting. Write a comparator that sorts the keys the way you want and use it when you create the TreeMap:
TreeMap<String,List<QBFElement>> qbfElementMap = new TreeMap<String, List<QBFElement>>(myComparator);
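One way such a comparator might look for keys like "27525-1813", assuming every key consists of exactly two numeric parts separated by a hyphen. NumericKeyOrder is a hypothetical helper class; it compares the parts one after the other, so it does not depend on the second part staying below 10000.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical helper that orders "number-number" keys numerically.
class NumericKeyOrder {
    // Parses one numeric component of a key such as "27525-1813".
    private static long part(String key, int index) {
        return Long.parseLong(key.split("-")[index]);
    }

    // Compare the first number, then the second number on ties.
    static final Comparator<String> BY_NUMERIC_PARTS =
            Comparator.comparingLong((String k) -> part(k, 0))
                      .thenComparingLong(k -> part(k, 1));

    // Creates an empty TreeMap whose String keys are ordered numerically.
    static <V> TreeMap<String, V> newMap() {
        return new TreeMap<>(BY_NUMERIC_PARTS);
    }
}
```

A key that does not match the assumed format makes Long.parseLong throw a NumberFormatException when the key is inserted.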
