Efficient way to exclude a specific value from a TreeMap - java

I am trying to get the best performance out of my app. At some point in the code, I want to retrieve all the values from a map except the one that corresponds to a specific key.
Now, if I wanted to retrieve all the values, I would use this:
map.values();
and, assuming that the TreeMap class is implemented efficiently, the values() method just returns a reference, so it is O(1).
In my case, though, I want to exclude the value of a specific key. This code:
Set<String> set = new ...
for (String key: map.keySet()) {
if (!key.equals("badKey")) {
set.add(map.get(key));
}
}
has a complexity of O(N log N), which is much slower than the initial O(1), and all of that is caused by the need to leave out a single value.
Is there a better way to do this?

You can use entrySet instead of keySet. This way it would take O(1) to find out if a given value belongs to the key you wish to exclude.
You can call entrySet any time you need to iterate over the values, and exclude the bad key while iterating over them. This would give you the same complexity as iterating over the values() Collection would.
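A minimal sketch of that idea, assuming a plain loop is acceptable. The class name ExcludeKeyExample and the helper valuesExcept are made up for illustration. It walks entrySet() once and never calls get(), so the whole pass is O(n):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ExcludeKeyExample {
    // Collects every value except the one stored under excludedKey.
    // One pass over entrySet() with no per-key get(), so O(n) overall.
    static <K, V> List<V> valuesExcept(Map<K, V> map, K excludedKey) {
        List<V> result = new ArrayList<>(map.size());
        for (Map.Entry<K, V> entry : map.entrySet()) {
            if (!entry.getKey().equals(excludedKey)) {
                result.add(entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> map = new TreeMap<>();
        map.put("a", "1");
        map.put("badKey", "2");
        map.put("c", "3");
        System.out.println(valuesExcept(map, "badKey")); // [1, 3]
    }
}
```

A List is used here rather than a Set so that duplicate values are kept; swap in a HashSet if you want them collapsed.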

How about this?
map.entrySet().stream()
.filter(e -> !e.getKey().equals(keyToFilter))
.map(Map.Entry::getValue);
Finish with either forEach or collect(Collectors.toSet()), or simply return the stream.
Sorry if the code doesn't compile exactly, it's from memory and I haven't touched the Java 8 APIs in a few months, but you should get the drift. ;)

You can create a set from map.values() and then remove the value of "badKey" from that set. Note that this also drops the value when another key maps to an equal value.
Set<String> set = new HashSet<String>(map.values());
String badValue = map.get("badKey");
set.remove(badValue);

Related

Is there an efficient way of checking if HashMap contains keys that map to the same value?

I basically need to know if my HashMap has different keys that map to the same value. I was wondering if there is a way other than checking each keys value against all other values in the map.
Update:
Just some more information that will hopefully clarify what I'm trying to accomplish. Consider a String "azza". Say that I'm iterating over this String and storing each character as a key, and its corresponding value is some other String. Let's say I eventually get to the last occurrence of 'a' and the value is already in the map. This would be fine if the key corresponding to the value that is already in the map is also 'a'. My issue occurs when 'a' and 'z' both map to the same value, that is, only when different keys map to the same value.
Sure, the fastest to both code and execute is:
boolean hasDupeValues = new HashSet<>(map.values()).size() != map.size();
which executes in O(n) time.
Sets don't allow duplicates, so the set will be smaller than the values list if there are dupes.
Very similar to EJP's and Bohemian's answer above but with streams:
boolean hasDupeValues = map.values().stream().distinct().count() != map.size();
You could create a HashMap that maps values to lists of keys. This would take more space and require (slightly) more complex code, but with the benefit of greatly higher efficiency (amortized O(1) vs. O(n) for the method of just looping all values).
For example, say you currently have HashMap<Key, Value> map1, and you want to know which keys have the same value. You create another map, HashMap<Value, List<Key>> map2.
Then you just modify map1 and map2 together.
map1.put(key, value);
if(!map2.containsKey(value)) {
map2.put(value, new ArrayList<Key>());
}
map2.get(value).add(key);
Then to get all keys that map to value, you just do map2.get(value).
If you need to put/remove in many different places, to make sure that you don't forget to use map2 you could create your own data structure (i.e. a separate class) that contains 2 maps and implement put/remove/get/etc. for that.
Edit: I may have misunderstood the question. If you don't need an actual list of keys, just a simple "yes/no" answer to "does the map already contain this value?", and you want something better than O(n), you could keep a separate HashMap<Value, Integer> that simply counts up how many times the value occurs in the map. This would take considerably less space than a map of lists.
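Here is a rough sketch of that counting idea, assuming all writes go through one wrapper class. The name CountingMap and its methods are hypothetical, not an existing library type:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper: a key->value map plus a value->occurrence-count map.
final class CountingMap<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final Map<V, Integer> valueCounts = new HashMap<>();

    void put(K key, V value) {
        if (map.containsKey(key)) {
            decrement(map.get(key)); // the old value loses one key
        }
        map.put(key, value);
        valueCounts.merge(value, 1, Integer::sum);
    }

    void remove(K key) {
        if (map.containsKey(key)) {
            decrement(map.remove(key));
        }
    }

    // True if at least one key currently maps to this value; amortized O(1).
    boolean containsValue(V value) {
        return valueCounts.containsKey(value);
    }

    // True if two or more different keys map to this value.
    boolean isShared(V value) {
        return valueCounts.getOrDefault(value, 0) > 1;
    }

    private void decrement(V value) {
        // Returning null from the remapping function removes the entry.
        valueCounts.computeIfPresent(value, (v, n) -> n == 1 ? null : n - 1);
    }
}
```

Overwriting a key first gives back the count of its old value, so the counts stay in step with the main map.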
You can check whether a map contains a value already by calling map.values().contains(value). This is not as efficient as looking up a key in the map, but still, it's O(n), and you don't need to create a new set just in order to count its elements.
However, what you seem to need is a BiMap. There is no such thing in the Java standard library, but you can build one relatively easily by using two HashMaps: one which maps keys to values and one which maps values to keys. Every time you map a key to a value, you can then check in amortized O(1) whether the value already is mapped to, and if it isn't, map the key to the value in the one map and the value to the key in the other.
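As a sketch only, the two-HashMap approach could look like the following. SimpleBiMap and putIfValueFree are names invented for this example, and the code assumes non-null keys and values:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal bidirectional map built from two HashMaps.
// Assumes keys and values are never null.
final class SimpleBiMap<K, V> {
    private final Map<K, V> forward = new HashMap<>();
    private final Map<V, K> backward = new HashMap<>();

    // Maps key to value unless a different key already owns the value.
    // Returns false (and changes nothing) in that case.
    boolean putIfValueFree(K key, V value) {
        K owner = backward.get(value);
        if (owner != null && !owner.equals(key)) {
            return false;
        }
        V old = forward.put(key, value);
        if (old != null) {
            backward.remove(old); // the key's previous value is free again
        }
        backward.put(value, key);
        return true;
    }

    V get(K key) {
        return forward.get(key);
    }

    K keyOf(V value) {
        return backward.get(value);
    }
}
```

With the "azza" example from the question, putting 'a' twice with the same value succeeds, while putting 'z' with a value that 'a' already owns returns false.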
If it is an option to create a new dependency for your project, some third-party libraries contain ready-made bimaps, such as Guava (BiMap) and Apache Commons (BidiMap).
You could iterate over the keys and save the current value in the Set.
But, before inserting that value in a Set, check if the Set already contains that value.
If this is true, it means that a previous key already contains the same value.
Map<Integer, String> map = new HashMap<>();
Set<String> values = new HashSet<>();
Set<Integer> keysWithSameValue = new HashSet<>();
for(Integer key : map.keySet()) {
if(values.contains(map.get(key))) {
keysWithSameValue.add(key);
}
values.add(map.get(key));
}

Search a Map for multiple keys in parallel

Given a Map<String, Collection<String>> with up to 1M items, I want to query that Map for 5K keys, and I'm unsure whether they are in the map or not.
Currently, I'm using a TreeMap and searching for each item, one by one, which seems sub-optimal. Is there an already implemented way to query a Map for X keys?
The result of the search should be a subset of items, which are found in the Map, for further querying - ordering is irrelevant.
I was hoping to use a stream, but apparently that's only for Collections.
Note: the numbers are estimates based on what I've seen in the map, probably not the upper limit...
There is no better way than querying your map for each element:
List<V> vs = keysToSearch.stream()
.map(k -> map.get(k))
.filter(Objects::nonNull)
.collect(Collectors.toList());
You can also try using a parallelStream if your data structures work in a concurrent environment.
Assuming memory is not a problem for you, here is one way of doing it, using retainAll:
Set<String> mapKeys = new HashSet<String>(myMap.keySet());
mapKeys.retainAll(my5kKeys); //<--- all keys that match the my5kKeys...
If you have M items in your map, and K keys you are searching for, then your best-case efficiency is O(min(M, K)). If M is very large, the best you can do is to check each K (perhaps in parallel, but you must do each).
If it were the case that M turned out to be much smaller than K, then you could do better by only checking through all M values to see if they existed in K. In any event, you want to check the smaller set's values against the larger.
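One way to sketch "check the smaller set against the larger" is below. KeyLookup and presentKeys are illustrative names, and the code assumes the wanted keys are already in a Set with fast contains (for example a HashSet):

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class KeyLookup {
    // Returns the wanted keys that are present in the map,
    // iterating over whichever side is smaller.
    static <K> Set<K> presentKeys(Map<K, ?> map, Set<K> wanted) {
        Set<K> found = new HashSet<>();
        if (wanted.size() <= map.size()) {
            for (K key : wanted) {
                if (map.containsKey(key)) {
                    found.add(key);
                }
            }
        } else {
            for (K key : map.keySet()) {
                if (wanted.contains(key)) {
                    found.add(key);
                }
            }
        }
        return found;
    }
}
```

With 5K keys against a 1M-entry map, the first branch runs and performs 5K lookups.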
There is no better way than to create a loop and search for all the keys individually.
A method like retainAll is just a wrapper around such a loop written by somebody else.
However, the important thing is to use a HashMap instead of a TreeMap. A HashMap's containsKey is O(1) on average, while a TreeMap's takes O(log n).
If you need the sorted collection for something else, you could put the data in both a TreeMap and a HashMap.

Which data structure should I use?

I want to store some words and their occurrence times in a website, and I don't know which structure I should use.
Every time I add a word to the structure, it first checks whether the word already exists. If it does, its occurrence count is increased by one; if not, the word is added to the structure. So I need to be able to find an element very quickly in this structure. I guess I should use a Hashtable or HashMap, right?
I also want to get a sorted list, so the structure should be rankable in a short time.
Forgot to mention, I am using Java to write it.
Thanks guys! :)
A HashMap seems like it would suit you well. If you need a thread-safe option, then go with ConcurrentHashMap.
For example:
Map<String, Integer> wordOccurenceMap = new HashMap<>();
"TreeMap provides guaranteed O(log n) lookup time (and insertion etc), whereas HashMap provides O(1) lookup time if the hash code disperses keys appropriately. Unless you need the entries to be sorted, I'd stick with HashMap." -part of Jon Skeet's answer in TreeMap or HashMap.
TreeMap is the better solution if you want both sorting functionality and word counting.
A custom trie can be more efficient, but it's not required unless you are modifying the words.
Define a HashMap with the word as the key and the counter as the value:
Map<String,Integer> wordsCountMap = new HashMap<String,Integer>();
Then add the logic like this:
When you get a word, check for it in the map using the containsKey method.
If the key (word) is found, fetch the value using get and increment it.
If the key (word) is not found, add it using put, with the word as the key and a count of 1 as the value.
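The steps above can be sketched as follows. WordCounter is a made-up class name. Map.merge does the "insert 1 or increment" step in one call, and the ranking is an assumption about what "sorted list" means here (by descending count, sorted on demand):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WordCounter {
    private final Map<String, Integer> counts = new HashMap<>();

    // Inserts the word with count 1, or adds 1 to its existing count.
    void add(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    int count(String word) {
        return counts.getOrDefault(word, 0);
    }

    // Words ordered by descending count; ties broken alphabetically.
    List<String> wordsByCount() {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> words = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            words.add(e.getKey());
        }
        return words;
    }
}
```

Counting stays O(1) on average per word, and the O(n log n) sort is only paid when a ranking is actually requested.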
So, you could use a HashMap, but don't forget about multithreading. Could this data structure be accessed from several threads? Also, you could use a TreeMap in case the data has some hierarchy (e.g. if you rank it and sort it by time). You could also look through the Google Guava collections; they may be more suitable for you.
Any Map implementation will do. If the changes are local to one thread, prefer HashMap; otherwise use ConcurrentHashMap for multithreading.
Remember to use a stemming library:
stemming library in java
For example, "working" and "work" are logically the same word.
Remember that Integer is immutable; see the example below.
Example:
Map<String, Integer> occurrence = new ConcurrentHashMap<String, Integer>();
synchronized void addWord(String word) { // may need to synchronize this method
String stemmedWord = stem(word);
Integer count = occurrence.get(stemmedWord);
if (count == null) {
count = 0;
}
count++; // unboxes, adds one, and boxes a new Integer
occurrence.put(stemmedWord, count);
// the put above is necessary as Integer is immutable
}

How to check if a key in a Map starts with a given String value

I'm looking for a method like:
myMap.containsKeyStartingWith("abc"); // returns true if there's a key starting with "abc" e.g. "abcd"
or
MapUtils.containsKeyStartingWith(myMap, "abc"); // same
I wondered if anyone knew of a simple way to do this
Thanks
This can be done with a standard SortedMap:
SortedMap<String,V> tailMap = myMap.tailMap(prefix);
boolean result = (!tailMap.isEmpty() && tailMap.firstKey().startsWith(prefix));
Unsorted maps (e.g. HashMap) don't intrinsically support prefix lookups, so for those you'll have to iterate over all keys.
From the map, you can get a Set of Keys, and in case they are String, you can iterate over the elements of the Set and check for startsWith("abc")
To build on Adel Boutros's answer/comment about the efficiency of iterating keys, you could encapsulate key iteration in a Map subclass or decorator.
Extending HashMap would give you a class to put the method in and keep map-specific code out of your method, so lowering complexity and making the code more natural to read.
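A rough sketch of such a subclass is below. PrefixHashMap is a hypothetical name, and containsKeyStartingWith is the method name the question wished for, not something HashMap provides:

```java
import java.util.HashMap;

// Hypothetical HashMap subclass; the class and method names are illustrative.
class PrefixHashMap<V> extends HashMap<String, V> {
    // Linear scan over all keys: O(n), since hashing gives no prefix order.
    boolean containsKeyStartingWith(String prefix) {
        for (String key : keySet()) {
            if (key != null && key.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

The null check is there because HashMap permits one null key. If prefix lookups are frequent, a sorted map avoids the full scan.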

Java Map question

I have one Map that contains some names and numbers
Map<String,Integer> abc = new HashMap<String,Integer>();
It works fine. I can put some values in it, but when I access it from a different class it gives me the wrong order. For example:
I put
abc.put("a",1);
abc.put("b",5);
abc.put("c",3);
Iterator<String> iter = abc.keySet().iterator();
while (iter.hasNext()) {
String name = iter.next();
System.out.println(name);
}
Sometimes it returns the order (b,a,c) and sometimes (a,c,b).
What is wrong with it? Is there any step that I am missing when I call this map?
Edit:
I changed to HashMap and the result is still the same.
The only thing that's wrong is your expectations. The Map interface makes no guarantees about iteration order, and the HashMap implementation is based on hash functions which means the iteration order is basically random, and will sometimes change completely when new elements are added.
If you want a specific iteration order, you have three options:
The SortedMap interface with its TreeMap implementation - these guarantee an iteration order according to the natural ordering of the keys (or an ordering imposed by a Comparator instance)
The LinkedHashMap class iterates in the order the elements were added to the map.
Use a List instead of a Map - this has a well-defined iteration order that you can influence in detail.
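To make the difference between the first two options concrete, here is a small sketch. IterationOrderDemo and keysAfterInserts are names made up for this example. The keys are deliberately inserted out of alphabetical order:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class IterationOrderDemo {
    // Fills the given map with the same three entries, then lists its keys
    // in the map's own iteration order.
    static List<String> keysAfterInserts(Map<String, Integer> map) {
        map.put("b", 5);
        map.put("a", 1);
        map.put("c", 3);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterInserts(new LinkedHashMap<>())); // [b, a, c]
        System.out.println(keysAfterInserts(new TreeMap<>()));       // [a, b, c]
    }
}
```

Passing a HashMap to the same method compiles fine, but the resulting order is unspecified and should not be relied on.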
I think you need LinkedHashMap.
A TreeMap will always have keys in their natural order (unless you provide a comparator). If you are seeing a different order, it is down to the way you are looking at the map and what you are doing with it. If in doubt, use a debugger and you will see that the order is sorted.
If you wish to get map values in the same order you used to insert them, use LinkedHashMap instead.
