Get least accessed element in a List (Java 8) - java

I am trying to create an API wrapper.
This API requires an API key, like most do. My goal is to spread out the usage as evenly as possible between a list of API keys. This is needed to reduce the possibility of rate limiting.
Needs:
Immutable List
A solution I could think of is to somehow get the least accessed element maybe with an object that keeps track of only the uses and the actual data? And then sort it and get the first element?
class Key {
private int uses;
private UUID key;
public Key(UUID key) {
this.key = key;
this.uses = 0;
}
public UUID get() {
this.uses++;
return this.key;
}
public int getUses() {
return this.uses;
}
}
I am up for using maven libraries such as Google Guava (which I am already using) if needed or for a more elegant solution. Here is an example of what it might look like.
List<UUID> keys = new ArrayList<>();
public Data getDataFromApi(String name) {
return getData(ENDPOINT_URL_STR + "key=" + keys.getLeastAccessed().toString() + "&name=" + name);
}

Given that the set of keys is immutable, I suggest round-robin: use the 1st key, then the 2nd, the 3rd and so on until you reach the nth, then start over from the 1st.
This way the usage counts of any two keys never differ by more than 1.
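A minimal sketch of the round-robin idea, assuming Java 8 and a fixed list of UUID keys as in the question. The class name RoundRobinKeys and its next() method are made up for illustration; the AtomicInteger counter is one way to keep the rotation safe if several threads ask for keys.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: hands out keys from a fixed list in round-robin order. */
final class RoundRobinKeys {
    private final List<UUID> keys;                            // defensive, unmodifiable copy
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinKeys(List<UUID> keys) {
        if (keys.isEmpty()) {
            throw new IllegalArgumentException("at least one key is required");
        }
        this.keys = Collections.unmodifiableList(new ArrayList<>(keys));
    }

    /** Returns the next key, wrapping around after the last one. */
    UUID next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), keys.size());
        return keys.get(index);
    }
}
```

With something like this in place, the call in getDataFromApi would use the result of next() where the question wrote keys.getLeastAccessed().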

I tend to use Apache's LRUMap which removes the least recently used entry if an entry is added when full.
It sounds like what you are looking for. Documentation is here

If there is a list of mentioned Key instances, Stream API would be sufficient to get the least used elements using Stream::filter:
public static List<Key> getLeastUsedKeys(List<Key> keys) {
if (null == keys || keys.size() < 2) {
return keys;
}
int minUsage = keys.stream().mapToInt(Key::getUses).min().getAsInt();
return keys.stream().filter(k -> minUsage == k.getUses()).collect(Collectors.toList());
}
If a single key is needed, Collectors.minBy may be used:
public static Key getLeastUsedKey(List<Key> keys) {
if (null == keys || keys.isEmpty()) {
return null;
}
return keys.stream()
.collect(Collectors.minBy(Comparator.comparingInt(Key::getUses)))
.orElse(null);
}

Related

Why is my HashMap implementation 10 times slower than the JDK's?

I would like to know what makes the difference and what I should be aware of when I'm writing code.
Used the same parameters and the same put() and get() calls when testing
without printing
Used System.nanoTime() to measure runtime
I tried it with int keys 1-10 and 10 values, so every key hashes to a unique index, which is the best-case scenario
My HashSet implementation which is based on this is almost as fast as the JDK's
Here's my simple implementation:
public MyHashMap(int s) {
this.TABLE_SIZE=s;
table = new HashEntry[s];
}
class HashEntry {
int key;
String value;
public HashEntry(int k, String v) {
this.key=k;
this.value=v;
}
public int getKey() {
return key;
}
}
int TABLE_SIZE;
HashEntry[] table;
public void put(int key, String value) {
int hash = key % TABLE_SIZE;
while(table[hash] != null && table[hash].getKey() != key)
hash = (hash +1) % TABLE_SIZE;
table[hash] = new HashEntry(key, value);
}
public String get(int key) {
int hash = key % TABLE_SIZE;
while(table[hash] != null && table[hash].key != key)
hash = (hash+1) % TABLE_SIZE;
if(table[hash] == null)
return null;
else
return table[hash].value;
}
Here's the benchmark:
public static void main(String[] args) {
long start = System.nanoTime();
MyHashMap map = new MyHashMap(11);
map.put(1,"A");
map.put(2,"B");
map.put(3,"C");
map.put(4,"D");
map.put(5,"E");
map.put(6,"F");
map.put(7,"G");
map.put(8,"H");
map.put(9,"I");
map.put(10,"J");
map.get(1);
map.get(2);
map.get(3);
map.get(4);
map.get(5);
map.get(6);
map.get(7);
map.get(8);
map.get(9);
map.get(10);
long end = System.nanoTime();
System.out.println(end-start+" ns");
}
If you read the documentation of the HashMap class, you will see that it is a hash table built on the hashCode of the keys. This is dramatically more efficient than a brute-force search once the map contains a non-trivial number of entries, assuming the keys are distributed reasonably evenly amongst the "buckets" that it sorts the entries into.
That said, benchmarking on the JVM is non-trivial and easy to get wrong. If you're seeing big differences with small numbers of entries, it could easily be a benchmarking error rather than the code.
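To make the benchmarking point concrete: the timed region in the question covers a single cold pass, including construction of the map, so class loading and interpretation are probably a large part of what it measures. Below is a rough sketch of a hand-rolled measurement with a warm-up phase; the class and method names are invented for illustration, and a dedicated harness such as JMH is the more reliable tool for this kind of comparison.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical micro-benchmark sketch: warm up first, then average many rounds. */
final class WarmedUpBenchmark {
    /** One round of the question's workload: 10 puts followed by 10 gets. */
    static int workload() {
        Map<Integer, String> map = new HashMap<>(16);
        for (int k = 1; k <= 10; k++) {
            map.put(k, String.valueOf((char) ('A' + k - 1)));
        }
        int found = 0;
        for (int k = 1; k <= 10; k++) {
            if (map.get(k) != null) {
                found++;
            }
        }
        return found; // returned so the work cannot simply be discarded
    }

    /** Average nanoseconds per round, measured only after the warm-up rounds. */
    static double averageNanos(int warmupRounds, int measuredRounds) {
        long sink = 0;
        for (int i = 0; i < warmupRounds; i++) {
            sink += workload();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRounds; i++) {
            sink += workload();
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) { // never true; keeps 'sink' observably used
            throw new IllegalStateException();
        }
        return (double) elapsed / measuredRounds;
    }

    public static void main(String[] args) {
        System.out.println(averageNanos(100_000, 1_000_000) + " ns per round");
    }
}
```

Swapping MyHashMap into workload() and running both variants the same way should give a fairer comparison than one timed pass.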
When it comes to performance, never assume anything.
Your assumption was "My HashSet implementation which is based on this is almost as fast as the JDK's". No, obviously it is not.
That is the tricky part of performance work: doubt everything unless you have measured it with great accuracy. Worse, you did measure, and the measurement told you that your implementation is slower; but instead of checking your source, and the source of the thing you are measuring against, you decided that the measuring process must be wrong ...

Bidirectional multimap equivalent data structure

I know that Guava has a BiMultimap class internally but didn't outsource the code. I need a data structure which is bi-directional, i.e. lookup by key and by value and also accepts duplicates.
i.e. something like this: (in my case, values are unique, but two values can point to the same key)
0 <-> 5
1 <-> 10
2 <-> 7
2 <-> 8
3 <-> 11
I want to be able to get(7) -> returning 2 and get(2) returning [7, 8].
Is there another library out there which has a data structure I can make use of?
If not, what do you suggest is the better option to handle this case? Is keeping two Multimaps in memory, one with <K, V> and the other with <V, K>, a bad practice?
P.S.: I have read this question: Bidirectional multi-valued map in Java but considering it is dated in 2011, I thought I'll open a more recent question
What do you mean by
Guava has a BiMultimap class internally but didn't outsource the code
The code of an implementation is here.
I didn't check if this is a working implementation, nor if it made it into a release or if I'm just looking at some kind of snapshot. Everything is out in the open, so you should be able to get it.
From a quick glance at the source code, it looks like the implementation does maintain two Multimaps, and this should be fine for the general case.
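For reference, the two-map idea can be sketched with plain JDK collections. This is not Guava's code; the class TwoMapBiMultimap and its method names are assumptions made for illustration, and it simply keeps a forward and a reverse index in step.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical bidirectional multimap backed by a forward and a reverse index. */
final class TwoMapBiMultimap<K, V> {
    private final Map<K, Set<V>> forward = new HashMap<>();
    private final Map<V, Set<K>> reverse = new HashMap<>();

    /** Records the pair in both directions. */
    void put(K key, V value) {
        forward.computeIfAbsent(key, k -> new HashSet<>()).add(value);
        reverse.computeIfAbsent(value, v -> new HashSet<>()).add(key);
    }

    /** All values stored for the key; empty (never null) if there are none. */
    Set<V> getValues(K key) {
        return Collections.unmodifiableSet(forward.getOrDefault(key, Collections.emptySet()));
    }

    /** All keys that map to the value; empty (never null) if there are none. */
    Set<K> getKeys(V value) {
        return Collections.unmodifiableSet(reverse.getOrDefault(value, Collections.emptySet()));
    }

    /** Removes one pair from both indexes; returns false if it was not present. */
    boolean remove(K key, V value) {
        Set<V> values = forward.get(key);
        if (values == null || !values.remove(value)) {
            return false;
        }
        if (values.isEmpty()) {
            forward.remove(key);
        }
        Set<K> keysForValue = reverse.get(value);
        keysForValue.remove(key);
        if (keysForValue.isEmpty()) {
            reverse.remove(value);
        }
        return true;
    }
}
```

The cost is roughly double the memory of a single multimap, in exchange for hash lookups in both directions.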
If you don't need the full Guava HashBiMultimap functionality, but just getByKey() and getByValue() as you specified, I can suggest an approach where only one multimap is used as the storage.
The idea is to treat the provided key and value as equal partners and put each of them into the storage map both as a key and as a value.
For example: after multiMap.put(0, 5), the storage map should contain something like [[key:0, value:5], [key:5, value:0]].
Since we need our BiMultiMap to be generic, we also need a wrapper class to use as the storage map's type parameter.
Here is this wrapper class:
public class ObjectHolder {
public static ObjectHolder newLeftHolder(Object object) {
return new ObjectHolder(object, false);
}
public static ObjectHolder newRightHolder(Object object) {
return new ObjectHolder(object, true);
}
private Object object;
private boolean flag;
private ObjectHolder(Object object, boolean flag) {
this.object = object;
this.flag = flag;
}
public Object getObject() {
return object;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof ObjectHolder)) return false;
ObjectHolder that = (ObjectHolder) o;
if (flag != that.flag) return false;
if (!object.equals(that.object)) return false;
return true;
}
@Override
public int hashCode() {
int result = object.hashCode();
result = 31 * result + (flag ? 1 : 0);
return result;
}
}
And here is the MultiMap:
public class BiHashMultiMap<L, R> {
private Map<ObjectHolder, Set<ObjectHolder>> storage;
public BiHashMultiMap() {
storage = new HashMap<ObjectHolder, Set<ObjectHolder>>();
}
public void put(L left, R right) {
ObjectHolder leftObjectHolder = ObjectHolder.newLeftHolder(left);
ObjectHolder rightObjectHolder = ObjectHolder.newRightHolder(right);
put(leftObjectHolder, rightObjectHolder);
put(rightObjectHolder, leftObjectHolder);
}
private void put(ObjectHolder key, ObjectHolder value) {
if (!storage.containsKey(key)) {
storage.put(key, new HashSet<ObjectHolder>());
}
storage.get(key).add(value);
}
public Set<R> getRight(L left) {
return this.get(ObjectHolder.newLeftHolder(left));
}
public Set<L> getLeft(R right) {
return this.get(ObjectHolder.newRightHolder(right));
}
private <V> Set<V> get(ObjectHolder key) {
Set<ObjectHolder> values = storage.get(key);
if (values == null || values.isEmpty()) {
return null;
}
Set<V> result = new HashSet<V>();
for (ObjectHolder value : values) {
result.add((V)value.getObject());
}
return result;
}
}
The left and right prefixes on the variables everywhere may seem strange. Think of left as the original key that was put into the map and right as the value.
Usage example:
BiHashMultiMap<Integer, Integer> multiMap = new BiHashMultiMap<Integer, Integer>();
multiMap.put(0,5);
multiMap.put(1,10);
multiMap.put(2,7);
multiMap.put(3,7);
multiMap.put(2,8);
multiMap.put(3,11);
Set<Integer> left10 = multiMap.getLeft(10); // [1]
Set<Integer> left7 = multiMap.getLeft(7); // [2, 3]
Set<Integer> right0 = multiMap.getRight(0); // [5]
Set<Integer> right3 = multiMap.getRight(3); // [7, 11]
So to get the left values we provide a right value as the key, and to get the right values we provide a left value as the key.
And of course, to make the map fully functional we need to provide other methods, like remove(), contains() and so on.

Get specific objects from ArrayList when objects were added anonymously?

I have created a short example of my problem. I'm creating a list of objects anonymously and adding them to an ArrayList. Once items are in the ArrayList I later come back and add more information to each object within the list. Is there a way to extract a specific object from the list if you do not know its index?
I know only the Object's 'name' but you cannot do a list.get(ObjectName) or anything. What is the recommended way to handle this? I'd rather not have to iterate through the entire list every time I want to retrieve one specific object.
public class TestCode{
public static void main (String args []) {
Cave cave = new Cave();
// Loop adds several Parties to the cave's party list
cave.parties.add(new Party("FirstParty")); // all anonymously added
cave.parties.add(new Party("SecondParty"));
cave.parties.add(new Party("ThirdParty"));
// How do I go about setting the 'index' value of SecondParty for example?
}
}
class Cave {
ArrayList<Party> parties = new ArrayList<Party>();
}
class Party extends CaveElement{
int index;
public Party(String n){
name = n;
}
// getter and setter methods
public String toString () {
return name;
}
}
class CaveElement {
String name = "";
int index = 0;
public String toString () {
return name + "" + index;
}
}
Given the use of List, there's no way to "lookup" a value without iterating through it...
For example...
Cave cave = new Cave();
// Loop adds several Parties to the cave's party list
cave.parties.add(new Party("FirstParty")); // all anonymously added
cave.parties.add(new Party("SecondParty"));
cave.parties.add(new Party("ThirdParty"));
for (Party p : cave.parties) {
if (p.name.equals("SecondParty")) {
p.index = ...;
break;
}
}
Now, this will take time. If the element you are looking for is at the end of the list, you will have to iterate to the end of the list before you find a match.
It might be better to use a Map of some kind...
So, if we update Cave to look like...
class Cave {
Map<String, Party> parties = new HashMap<String, Party>(25);
}
We could do something like...
Cave cave = new Cave();
// Loop adds several Parties to the cave's party list
cave.parties.put("FirstParty", new Party("FirstParty")); // all anonymously added
cave.parties.put("SecondParty", new Party("SecondParty"));
cave.parties.put("ThirdParty", new Party("ThirdParty"));
if (cave.parties.containsKey("SecondParty")) {
cave.parties.get("SecondParty").index = ...;
}
Instead...
Ultimately, this will all depend on what it is you want to achieve...
List.indexOf() will give you what you want, provided you know precisely what you're after, and provided that the equals() method for Party is well-defined.
Party searchCandidate = new Party("FirstParty");
int index = cave.parties.indexOf(searchCandidate);
This is where it gets interesting - subclasses shouldn't be examining the private properties of their parents, so we'll define equals() in the superclass.
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (!(o instanceof CaveElement)) {
return false;
}
CaveElement that = (CaveElement) o;
if (index != that.index) {
return false;
}
if (name != null ? !name.equals(that.name) : that.name != null) {
return false;
}
return true;
}
It's also wise to override hashCode if you override equals - the general contract for hashCode mandates that, if x.equals(y), then x.hashCode() == y.hashCode().
@Override
public int hashCode() {
int result = name != null ? name.hashCode() : 0;
result = 31 * result + index;
return result;
}
If you want to lookup objects based on their String name, this is a textbook case for a Map, say a HashMap. You could use a LinkedHashMap and convert it to a List or Array later (Chris has covered this nicely in the comments below).
LinkedHashMap because it lets you access the elements in the order you insert them if you want to do so. Otherwise HashMap or TreeMap will do.
You could get this to work with a List as the others are suggesting, but that feels hacky to me, and a Map will be cleaner in both the short and the long run.
If you MUST use a list for the object, you could still store a Map of the object name to the index in the array. This is a bit uglier, but you get almost the same performance as a plain Map.
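A rough sketch of that list-plus-index idea, assuming parties are only ever appended (removing or inserting in the middle would shift positions and the index would have to be rebuilt). IndexedParties, findByName and the nested Party stand-in are illustrative names, not part of the question's code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical append-only list of parties with a name -> position index. */
final class IndexedParties {
    /** Minimal stand-in for the question's Party class. */
    static final class Party {
        final String name;
        int index;

        Party(String name) {
            this.name = name;
        }
    }

    private final List<Party> parties = new ArrayList<>();
    private final Map<String, Integer> positionByName = new HashMap<>();

    /** Appends a party and remembers where it landed in the list. */
    void add(Party party) {
        if (positionByName.containsKey(party.name)) {
            throw new IllegalArgumentException("duplicate name: " + party.name);
        }
        positionByName.put(party.name, parties.size());
        parties.add(party);
    }

    /** Returns the party with this name, or null if none was added. */
    Party findByName(String name) {
        Integer position = positionByName.get(name);
        return position == null ? null : parties.get(position);
    }

    /** The list is still available for ordered, positional access. */
    Party get(int position) {
        return parties.get(position);
    }
}
```

Lookup by name is then a hash lookup followed by one positional get, instead of a scan of the whole list.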
You could use list.indexOf(Object), but in all honesty what you're describing sounds like you'd be better off using a Map.
Try this:
Map<String, Object> mapOfObjects = new HashMap<String, Object>();
mapOfObjects.put("objectName", object);
Then later when you want to retrieve the object, use
mapOfObjects.get("objectName");
Assuming you do know the object's name as you stated, this will be both cleaner and will have faster performance besides, particularly if the map contains large numbers of objects.
If you need the objects in the Map to stay in order, you can use
Map<String, Object> mapOfObjects = new LinkedHashMap<String, Object>();
instead
Based on the requirements in your question, I would suggest a Map, which will solve your problem efficiently and without any hassle.
In a Map you can use the name as the key and your original object as the value.
Map<String,Party> myMap=new HashMap<String,Party>();
I would suggest overriding the equals(Object) of your Party class. It might look something like this:
public boolean equals(Object o){
if(o == null)
return false;
if(o instanceof String)
return name.equalsIgnoreCase((String)o);
else if(o instanceof Party)
return equals(((Party)o).name);
return false;
}
After you do that, you could use the indexOf(Object) method to retrieve the index of the party specified by its name. ArrayList.indexOf calls equals on the argument you pass in, not on the list's elements, so pass a Party rather than a bare String:
int index = cave.parties.indexOf(new Party("SecondParty"));
This would return the index of the Party with the name SecondParty.
Note: This only works because you are overriding the equals(Object) method.
You could simply create a method to get the object by its name.
public Party getPartyByName(String name) {
for(Party party : parties) {
if(name.equalsIgnoreCase(party.name)) {
return party;
}
}
return null;
}

Java: Priority Queue implementation iterable in proper order

I'm looking for some implementation of PQ in Java which allows iteration in PQ order - top element first, next one next etc. I tried using TreeSet (which implements NavigableSet) but it causes one problem. In my case:
I'm using Comparator for my objects
priority changes due to some external actions
if priority changes I know for which object, but I don't know its previous priority
As a result of the last point, I can't find my element in the TreeSet when I would like to update its priority :/
Do you happen to know: a smart way to work around this? or some implementation of PQ that is iterable in a "good" way? or should I create some linked data structure that will match objects with their positions in the tree?
UPDATE:
concurrency is not an issue
the object can't be removed from the TreeSet because its priority has changed, so the Comparator evaluates differently and the object won't be found in this data structure. Inserting is not a problem.
I can't use the compareTo method, as this priority is not the proper way to compare those objects. That is why I need to use a Comparator
POSSIBLE SOLUTION:
create class PrioritizedObject which will be compared by priority and keep my object
use map: my object -> PrioritizedObject
keep PrioritizedObject in some NavigableSet
I would use this map to remove objects from NavigableSet. And of course update it with new elements if I add something.
Problem is that I will have to wrap iterator from this NavigableSet to get iterator returning my objects.
Is there any better solution?
if priority changes I know for which object, but I don't know its previous priority
You don't need to know its previous priority. All you have to do is remove it and re-insert it.
If concurrency is not an issue all you need to do is to reorder the tree right after updating an element's priority. If I understood the problem right, this sketch should suit you.
Example element:
public class Element implements Comparable<Element> {
private final Integer id;
private Integer priority;
public Element(Integer id, Integer priority) {
this.id = id;
this.priority = priority;
}
@Override
public String toString() {
return "Element{" + "id=" + id + ", priority=" + priority + '}';
}
public Integer getPriority() {
return priority;
}
public void setPriority(Integer priority) {
this.priority = priority;
}
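@Override
public int compareTo(Element o) {
if (o == null) {
throw new NullPointerException();
}
// Break ties on id: a TreeSet treats elements that compare as 0 as duplicates and drops them
int byPriority = priority.compareTo(o.priority);
return byPriority != 0 ? byPriority : id.compareTo(o.id);
}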
}
The sketch:
public class Tree {
public static TreeSet<Element> priorityQueue = new TreeSet<Element>();
public static void dump(TreeSet<Element> in) {
for (Element e : in) {
System.out.println(e);
}
}
public static void updatePriority(Element e, int newPriority) {
if (priorityQueue.remove(e)) {
e.setPriority(newPriority);
priorityQueue.add(e);
}
}
public static void main(String[] args) {
int id;
Element lastElement = null;
for (int i = 0;i < 10 ; i++) {
id = (int)(Math.random()*1000);
priorityQueue.add(lastElement = new Element(id, id));
}
dump(priorityQueue);
updatePriority(lastElement, 0);
System.out.println("updating "+lastElement+ " priority to 0");
dump(priorityQueue);
}
}
You update the element by removing it from the treeset, setting the new priority and then reinserting it. The complexity of the update operation with this scenario is 2*O(log(n)) = O(log(n))
UPDATE:
The best I could understand is: you have two criteria by which you need to sort/index. When I had the same problem I used this approach, but this is a very interesting approach that I strongly recommend reading.
I recommend ConcurrentSkipListSet instead of TreeSet since it's thread-safe. If you know the object whose priority is changing, you can call remove(objToChange), change its priority, then re-add it to the set.
Be very careful when adding to a set any objects whose equals, hashCode, and compareTo methods depend on mutable fields.
Edit: I think any solution will end up looking like your PrioritizedObject which seems fine to me. If you want to iterate through your objects, use Map.keySet.
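One way the PrioritizedObject idea could look, as a sketch rather than a definitive implementation. The wrapper is immutable, so a priority change replaces the wrapper instead of mutating something the TreeSet already holds. UpdatablePriorityQueue, Entry and the sequence tie-breaker are names and choices made up for this example, and it assumes that equals and hashCode of the wrapped objects do not depend on the priority.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical priority queue that supports re-prioritizing and ordered iteration. */
final class UpdatablePriorityQueue<T> implements Iterable<T> {
    /** Immutable wrapper pairing an item with its current priority. */
    private static final class Entry<T> {
        final T item;
        final int priority;
        final long sequence; // tie-breaker so equal priorities are not seen as duplicates

        Entry(T item, int priority, long sequence) {
            this.item = item;
            this.priority = priority;
            this.sequence = sequence;
        }
    }

    private final Map<T, Entry<T>> entryByItem = new HashMap<>();
    private final TreeSet<Entry<T>> ordered = new TreeSet<>(
            Comparator.<Entry<T>>comparingInt(e -> e.priority)
                    .thenComparingLong(e -> e.sequence));
    private long nextSequence = 0;

    /** Inserts the item, or moves it to the new priority if it is already present. */
    void put(T item, int priority) {
        Entry<T> old = entryByItem.get(item);
        if (old != null) {
            ordered.remove(old); // found reliably: the old wrapper still carries the old priority
        }
        Entry<T> fresh = new Entry<>(item, priority, nextSequence++);
        entryByItem.put(item, fresh);
        ordered.add(fresh);
    }

    /** Removes the item; returns false if it was not present. */
    boolean remove(T item) {
        Entry<T> old = entryByItem.remove(item);
        return old != null && ordered.remove(old);
    }

    /** Iterates the items from the lowest priority value to the highest. */
    @Override
    public Iterator<T> iterator() {
        Iterator<Entry<T>> it = ordered.iterator();
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return it.hasNext();
            }

            @Override
            public T next() {
                return it.next().item;
            }
        };
    }
}
```

The caller never needs to know the previous priority: the map supplies the old wrapper, and the wrapped iterator hands back the original objects in priority order.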

Null-free "maps": Is a callback solution slower than tryGet()?

In comments to "How to implement List, Set, and Map in null free design?", Steven Sudit and I got into a discussion about using a callback, with handlers for "found" and "not found" situations, vs. a tryGet() method, taking an out parameter and returning a boolean indicating whether the out parameter had been populated. Steven maintained that the callback approach was more complex and almost certain to be slower; I maintained that the complexity was no greater and the performance at worst the same.
But code speaks louder than words, so I thought I'd implement both and see what I got. The original question was fairly theoretical with regard to language ("And for argument sake, let's say this language don't even have null") -- I've used Java here because that's what I've got handy. Java doesn't have out parameters, but it doesn't have first-class functions either, so style-wise, it should suck equally for both approaches.
(Digression: As far as complexity goes: I like the callback design because it inherently forces the user of the API to handle both cases, whereas the tryGet() design requires callers to perform their own boilerplate conditional check, which they could forget or get wrong. But having now implemented both, I can see why the tryGet() design looks simpler, at least in the short term.)
First, the callback example:
class CallbackMap<K, V> {
private final Map<K, V> backingMap;
public CallbackMap(Map<K, V> backingMap) {
this.backingMap = backingMap;
}
void lookup(K key, Callback<K, V> handler) {
V val = backingMap.get(key);
if (val == null) {
handler.handleMissing(key);
} else {
handler.handleFound(key, val);
}
}
}
interface Callback<K, V> {
void handleFound(K key, V value);
void handleMissing(K key);
}
class CallbackExample {
private final Map<String, String> map;
private final List<String> found;
private final List<String> missing;
private Callback<String, String> handler;
public CallbackExample(Map<String, String> map) {
this.map = map;
found = new ArrayList<String>(map.size());
missing = new ArrayList<String>(map.size());
handler = new Callback<String, String>() {
public void handleFound(String key, String value) {
found.add(key + ": " + value);
}
public void handleMissing(String key) {
missing.add(key);
}
};
}
void test() {
CallbackMap<String, String> cbMap = new CallbackMap<String, String>(map);
for (int i = 0, count = map.size(); i < count; i++) {
String key = "key" + i;
cbMap.lookup(key, handler);
}
System.out.println(found.size() + " found");
System.out.println(missing.size() + " missing");
}
}
Now, the tryGet() example -- as best I understand the pattern (and I might well be wrong):
class TryGetMap<K, V> {
private final Map<K, V> backingMap;
public TryGetMap(Map<K, V> backingMap) {
this.backingMap = backingMap;
}
boolean tryGet(K key, OutParameter<V> valueParam) {
V val = backingMap.get(key);
if (val == null) {
return false;
}
valueParam.value = val;
return true;
}
}
class OutParameter<V> {
V value;
}
class TryGetExample {
private final Map<String, String> map;
private final List<String> found;
private final List<String> missing;
private final OutParameter<String> out = new OutParameter<String>();
public TryGetExample(Map<String, String> map) {
this.map = map;
found = new ArrayList<String>(map.size());
missing = new ArrayList<String>(map.size());
}
void test() {
TryGetMap<String, String> tgMap = new TryGetMap<String, String>(map);
for (int i = 0, count = map.size(); i < count; i++) {
String key = "key" + i;
if (tgMap.tryGet(key, out)) {
found.add(key + ": " + out.value);
} else {
missing.add(key);
}
}
System.out.println(found.size() + " found");
System.out.println(missing.size() + " missing");
}
}
And finally, the performance test code:
public static void main(String[] args) {
int size = 200000;
Map<String, String> map = new HashMap<String, String>();
for (int i = 0; i < size; i++) {
String val = (i % 5 == 0) ? null : "value" + i;
map.put("key" + i, val);
}
long totalCallback = 0;
long totalTryGet = 0;
int iterations = 20;
for (int i = 0; i < iterations; i++) {
{
TryGetExample tryGet = new TryGetExample(map);
long tryGetStart = System.currentTimeMillis();
tryGet.test();
totalTryGet += (System.currentTimeMillis() - tryGetStart);
}
System.gc();
{
CallbackExample callback = new CallbackExample(map);
long callbackStart = System.currentTimeMillis();
callback.test();
totalCallback += (System.currentTimeMillis() - callbackStart);
}
System.gc();
}
System.out.println("Avg. callback: " + (totalCallback / iterations));
System.out.println("Avg. tryGet(): " + (totalTryGet / iterations));
}
On my first attempt, I got 50% worse performance for callback than for tryGet(), which really surprised me. But, on a hunch, I added some garbage collection, and the performance penalty vanished.
This fits with my instinct, which is that we're basically talking about taking the same number of method calls, conditional checks, etc. and rearranging them. But then, I wrote the code, so I might well have written a suboptimal or subconsciously penalized tryGet() implementation. Thoughts?
Updated: Per comment from Michael Aaron Safyan, fixed TryGetExample to reuse OutParameter.
I would say that neither design makes sense in practice, regardless of the performance. I would argue that both mechanisms are overly complicated and, more importantly, don't take into account actual usage.
Actual Usage
If a user looks up a value in a map and it isn't there, most likely the user wants one of the following:
To insert some value with that key into the map
To get back some default value
To be informed that the value isn't there
Thus I would argue that a better, null-free API would be:
has(key) which indicates if the key is present (if one only wishes to check for the key's existence).
get(key) which reports the value if the key is present; otherwise, throws NoSuchElementException.
get(key,defaultval) which reports the value for the key, or defaultval if the key isn't present.
setdefault(key,defaultval) which inserts (key,defaultval) if key isn't present, and returns the value associated with key (which is defaultval if there is no previous mapping, otherwise prev mapping).
The only way to get back null is if you explicitly ask for it as in get(key,null). This API is incredibly simple, and yet is able to handle the most common map-related tasks (in most use cases that I have encountered).
I should also add that in Java, has() would be called containsKey(), while setdefault() corresponds roughly to putIfAbsent(). Because get() signals an object's absence via a NoSuchElementException, it is then possible to associate a key with null and treat it as a legitimate association: if get() returns null, it means the key has been associated with the value null, not that the key is absent. (You can instead define your API to disallow a value of null if you so choose, in which case you would throw an IllegalArgumentException from the functions that add associations if the value given is null.)
Another advantage of this API is that setdefault() only needs to perform the lookup once instead of twice, which would be the case if you used if( ! dict.has(key) ){ dict.set(key,val); }.
Another advantage is that you do not surprise developers who write something like dict.get(key).doSomething() and assume that get() will always return a non-null object (because they have never inserted a null value into the dictionary). Instead, they get a NoSuchElementException if there is no value for that key, which is more consistent with the rest of the error checking in Java and which is also much easier to understand and debug than a NullPointerException.
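A small sketch of such an API in Java, assuming the variant that disallows null values altogether. NullFreeMap, has, set and setDefault are illustrative names taken from the description above rather than an existing library; only the java.util calls are real.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical map wrapper that never returns null unless the caller asks for it. */
final class NullFreeMap<K, V> {
    private final Map<K, V> backing = new HashMap<>();

    /** True if the key has a value. */
    boolean has(K key) {
        return backing.containsKey(key);
    }

    /** Returns the value for the key, or throws if there is none. */
    V get(K key) {
        V value = backing.get(key);
        if (value == null) { // null values are rejected on insert, so null means "absent"
            throw new NoSuchElementException("no value for key: " + key);
        }
        return value;
    }

    /** Returns the value for the key, or defaultValue if there is none. */
    V get(K key, V defaultValue) {
        V value = backing.get(key);
        return value != null ? value : defaultValue;
    }

    /** Inserts defaultValue if the key is absent; returns the value now associated with the key. */
    V setDefault(K key, V defaultValue) {
        requireValue(defaultValue);
        return backing.computeIfAbsent(key, k -> defaultValue); // single lookup
    }

    /** Associates the key with a non-null value. */
    void set(K key, V value) {
        requireValue(value);
        backing.put(key, value);
    }

    private static void requireValue(Object value) {
        if (value == null) {
            throw new IllegalArgumentException("null values are not allowed");
        }
    }
}
```

The only way a caller sees null here is by passing null as the default to the two-argument get.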
Answer To Question
To answer the original question: yes, you are unfairly penalizing the tryGet version. In your callback-based mechanism you construct the callback object only once and use it in all subsequent calls, whereas in your tryGet example you construct your out parameter object in every single iteration. Take the line:
OutParameter<String> out = new OutParameter<String>();
out of the for-loop and see if that improves the performance of the tryGet example. In other words, place that line above the for-loop and re-use the out parameter in each iteration.
David, thanks for taking the time to write this up. I'm a C# programmer, so my Java skills are a bit vague these days. Because of this, I decided to port your code over and test it myself. I found some interesting differences and similarities, which are pretty much worth the price of admission as far as I'm concerned. Among the major differences are:
I didn't have to implement TryGet because it's built into Dictionary.
In order to use the native TryGet, instead of inserting nulls to simulate misses, I simply omitted those values. This still means that v = map[k] would have set v to null, so I think it's a proper porting. In hindsight, I could have inserted the nulls and changed (_map.TryGetValue(key, out value)) to (_map.TryGetValue(key, out value) && value != null), but I'm glad I didn't.
I want to be exceedingly fair. So, to keep the code as compact and maintainable as possible, I used lambda notation, which let me define the callbacks painlessly. This hides much of the complexity of setting up anonymous delegates, and allows me to use closures seamlessly. Ironically, the implementation of Lookup uses TryGet internally.
Instead of declaring a new type of Dictionary, I used an extension method to graft Lookup onto the standard dictionary, much simplifying the code.
With apologies for the less-than-professional quality of the code, here it is:
using System;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApplication1
{
static class CallbackDictionary
{
public static void Lookup<K, V>(this Dictionary<K, V> map, K key, Action<K, V> found, Action<K> missed)
{
V v;
if (map.TryGetValue(key, out v))
found(key, v);
else
missed(key);
}
}
class TryGetExample
{
private Dictionary<string, string> _map;
private List<string> _found;
private List<string> _missing;
public TryGetExample(Dictionary<string, string> map)
{
_map = map;
_found = new List<string>(_map.Count);
_missing = new List<string>(_map.Count);
}
public void TestTryGet()
{
for (int i = 0; i < _map.Count; i++)
{
string key = "key" + i;
string value;
if (_map.TryGetValue(key, out value))
_found.Add(key + ": " + value);
else
_missing.Add(key);
}
Console.WriteLine(_found.Count() + " found");
Console.WriteLine(_missing.Count() + " missing");
}
public void TestCallback()
{
for (int i = 0; i < _map.Count; i++)
_map.Lookup("key" + i, (k, v) => _found.Add(k + ": " + v), k => _missing.Add(k));
Console.WriteLine(_found.Count() + " found");
Console.WriteLine(_missing.Count() + " missing");
}
}
class Program
{
static void Main(string[] args)
{
int size = 2000000;
var map = new Dictionary<string, string>(size);
for (int i = 0; i < size; i++)
if (i % 5 != 0)
map.Add("key" + i, "value" + i);
long totalCallback = 0;
long totalTryGet = 0;
int iterations = 20;
TryGetExample tryGet;
for (int i = 0; i < iterations; i++)
{
tryGet = new TryGetExample(map);
long tryGetStart = DateTime.UtcNow.Ticks;
tryGet.TestTryGet();
totalTryGet += (DateTime.UtcNow.Ticks - tryGetStart);
GC.Collect();
tryGet = new TryGetExample(map);
long callbackStart = DateTime.UtcNow.Ticks;
tryGet.TestCallback();
totalCallback += (DateTime.UtcNow.Ticks - callbackStart);
GC.Collect();
}
Console.WriteLine("Avg. callback: " + (totalCallback / iterations));
Console.WriteLine("Avg. tryGet(): " + (totalTryGet / iterations));
}
}
}
My performance expectations, as I said in the article that inspired this one, would be that neither one is much faster or slower than the other. After all, most of the work is in the searching and adding, not in the simple logic that structures it. In fact, it varied a bit among runs, but I was unable to detect any consistent advantage.
Part of the problem is that I used a low-precision timer and the test was short, so I increased the count by 10x to 2000000 and that helped. Now callbacks are about 3% slower, which I do not consider significant. On my fairly slow machine, callbacks took 17773437 while tryget took 17234375.
Now, as for code complexity, it's a bit unfair because TryGet is native, so let's just ignore the fact that I had to add a callback interface. At the calling spot, lambda notation did a great job of hiding the complexity. If anything, it's actually shorter than the if/then/else used in the TryGet version, although I suppose I could have used a ternary operator to make it equally compact.
On the whole, I found the C# to be more elegant, and only some of that is due to my bias as a C# programmer. Mainly, I didn't have to define and implement interfaces, which cut down on the plumbing overhead. I also used pretty standard .NET conventions, which seem to be a bit more streamlined than the sort of style favored in Java.
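For comparison, on Java 8 or later the callback side no longer needs a hand-written interface either. The sketch below uses the standard BiConsumer and Consumer functional interfaces; LambdaLookup and its lookup method are hypothetical names, and like the original CallbackMap it treats a null value as "missing".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

/** Hypothetical lambda-based counterpart to the CallbackMap example. */
final class LambdaLookup {
    /** Calls exactly one of the two handlers, depending on whether the key has a non-null value. */
    static <K, V> void lookup(Map<K, V> map, K key,
                              BiConsumer<K, V> found, Consumer<K> missing) {
        V value = map.get(key);
        if (value == null) {
            missing.accept(key);
        } else {
            found.accept(key, value);
        }
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("key1", "value1");
        // One call that takes the "found" branch and one that takes the "missing" branch.
        lookup(map, "key1",
                (k, v) -> System.out.println(k + ": " + v),
                k -> System.out.println(k + " is missing"));
        lookup(map, "key2",
                (k, v) -> System.out.println(k + ": " + v),
                k -> System.out.println(k + " is missing"));
    }
}
```

At the call site this reads much like the C# Lookup extension method above, with both cases handled in one statement.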
