Why is getEntry(Object key) not exposed on HashMap? - java

Here is my use case: I have an object that is logically equal to my HashMap key but not the same object (not ==). I need to get the actual key object out of the HashMap so that I can synchronise on it. I am aware that I can iterate over the keySet, but this is slow in comparison to hashing.
Looking through the java.util.HashMap implementation I see a getEntry(Object key) method that is exactly what I need. Any idea why this has not been exposed?
Can you think of any other way I can get the key out?

I think you would be better off putting in an extra layer of indirection on the value. The key should also be a "pure" value. Instead of:
Map<ReferenceObjectKey,Thing> map;
Use:
Map<ValueObjectKey,ReferenceObject<Thing>> map;
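A minimal sketch of that indirection (the Holder wrapper, the ConcurrentHashMap and computeIfAbsent are my choices for illustration, not part of the original suggestion): the key stays a plain value object, and the shared wrapper is the canonical thing to lock on.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// "Thing" and "Holder" are illustrative stand-ins, not types from the question.
class Thing {}

class Holder<T> {          // the wrapper you actually synchronize on
    T value;
}

class Registry {
    private final Map<String, Holder<Thing>> map = new ConcurrentHashMap<>();

    Holder<Thing> holderFor(String key) {
        // every logically equal key maps to the same Holder instance
        return map.computeIfAbsent(key, k -> new Holder<>());
    }

    void update(String key, Thing newValue) {
        Holder<Thing> h = holderFor(key);
        synchronized (h) {   // lock the shared wrapper rather than the key object
            h.value = newValue;
        }
    }
}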

I can't answer your actual question (why is the method not exposed) beyond the rather obvious, "because the authors decided not to expose it."
However, your question leads me to believe that you have a rather strange synchronization scheme going on; from my understanding, you're only trying to call it to get a canonical representation of equal objects for synchronization. That sounds like a really bad idea, as I noted in my comment to the question.
A better approach would be to revisit how and why you want to synchronize on these key objects, and rework your synchronization to be clearer and saner, preferably at a level higher up or by using an alternative approach altogether.
It might help if you posted a code snippet of what you want to do with this synchronization so that others can give their opinions on a cleaner way to implement it. One example would simply be to use a thread-safe map class (such as ConcurrentHashMap), if this is indeed what you're trying to achieve here.
Edit: Have a look at How To Ask Questions The Smart Way, in particular the bullet point I've linked, as this is a classic example of that deficiency. It seems likely that your overall design is a bit off and needs to go in a different direction; while you're stuck on this specific issue, it's a symptom of a larger problem. Giving us the broader context will get you much better answers overall.

Actually, the method the caller is asking for would have been useful. It was arguably a mistake that it, or something like it, was not included.
As it is, supposing you wish to increment the Integer value that's mapped from key "a" -- you end up having to do a hash lookup on "a" twice. Supposing you want to distinguish between a value being not present and the value being present but mapped to null -- again, two hash lookups.
In practice the world hasn't ended because of this, though.
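To make the cost concrete, here is a small sketch of the two patterns mentioned above; an exposed getEntry (or some other Entry-returning lookup) would let each of them be done with a single hash lookup.
import java.util.HashMap;
import java.util.Map;

public class DoubleLookup {
    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("a", 1);

        // Incrementing means two hash lookups on "a": one get, one put.
        Integer old = counts.get("a");
        counts.put("a", old == null ? 1 : old + 1);

        // Distinguishing "absent" from "present but mapped to null" also takes two lookups.
        if (counts.containsKey("b")) {
            System.out.println("b -> " + counts.get("b"));   // may legitimately print null
        } else {
            System.out.println("b is not mapped");
        }
    }
}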

I stumbled upon this problem myself recently. When I boiled it down, I was essentially using two different mechanisms to associate data with the part of the key object that was used for determining equality:
1. With the value the key mapped to, via the Map.
2. With the data contained in the key object but not used in its equals()/hashCode() methods, via composition.
I was using a List in the key class to determine equality and hash code, and there were three other fields in it: a boolean and two Strings. In the end, I remade the map as a Map<List<String>, ...> and refactored the other three fields into their own class, then had the original class as a composition of the List and the new class. I felt that the code was better after this.
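A rough sketch of that split, with invented field and class names, assuming the equality-defining part is the List<String>:
import java.util.List;

// Invented names; the point is the split, not the domain.
class Attributes {                 // the fields that never took part in equals()/hashCode()
    final boolean enabled;
    final String label;
    final String category;

    Attributes(boolean enabled, String label, String category) {
        this.enabled = enabled;
        this.label = label;
        this.category = category;
    }
}

class Item {                       // the original class, now a plain composition
    final List<String> key;        // the part that defines equality
    final Attributes attributes;   // the part that is just associated data

    Item(List<String> key, Attributes attributes) {
        this.key = key;
        this.attributes = attributes;
    }
}

// The map is then keyed directly by the value-like part:
// Map<List<String>, Attributes> index = new HashMap<>();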

This sounds like a deeper problem you're having. Why do you need such a thing? Why is the key not unique to its object?
What do you mean by "so that I can synchronise on it"?

I'm sorry, but you seem to have a conceptual break here.
If your problem is that you hold an object that is equivalent to a key (.equals() is true but == is false) and need to find the key, an Object-based lookup would not help you, because the only equals that Object itself supports is identity (==).
What you need to do is implement equals() and, of course, hashCode() in your key class.
This will make it trivial to obtain the entry.
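A minimal sketch of what that looks like, using an invented AccountKey class: once equals() and hashCode() agree, any logically equal instance will retrieve the mapping.
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class AccountKey {
    private final String id;

    AccountKey(String id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof AccountKey)) return false;
        return id.equals(((AccountKey) o).id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id);
    }
}

class Demo {
    public static void main(String[] args) {
        Map<AccountKey, String> owners = new HashMap<>();
        owners.put(new AccountKey("42"), "Alice");

        // A different instance that is equals() to the stored key still finds the entry.
        System.out.println(owners.get(new AccountKey("42")));   // prints Alice
    }
}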

Related

Nesting collections in Java (list of hashmaps containing String key, arrayList value)

Is nesting collections in Java something that I should be doing?
I'm currently working on a project where I want to have a bunch of hashmaps, each with a String key and an ArrayList value. That way, when I create an object of another class and add it to the collection, it can carry some piece of information that, if it matches one of the keys of one of the hashmaps, gets it deposited in the associated ArrayList value. The list can then be accessed later through the correct key of the specific hashmap.
Is this a good idea? Or is it too convoluted and if so is there a better way to do this?
There are times to nest, for sure. But in the humble opinion of this seasoned dev, you shouldn't do it unless you have a good reason. All too often you would be much better off with some class that represents the inner collection.
So if you find yourself with a Map<String,List<Foo>>, ask yourself what that List<Foo> really represents. If it's Map<String,List<Student>>, then maybe you need Map<String, Roster> or Map<String, Team>. I find this yields faster time to market and fewer bugs. The fact that you're asking the question means you think there's a chance that might be true too.
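As a sketch of that idea (Student, Roster and School are invented names): the wrapper gives the inner list a name and a natural place to hang behaviour.
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Invented domain classes illustrating the wrapper idea.
class Student {
    final String name;
    Student(String name) { this.name = name; }
}

class Roster {
    private final List<Student> students = new ArrayList<>();

    void enroll(Student s)   { students.add(s); }
    int size()               { return students.size(); }
    List<Student> asList()   { return Collections.unmodifiableList(students); }
}

class School {
    // Map<String, Roster> says more than Map<String, List<Student>>
    private final Map<String, Roster> rostersByCourse = new HashMap<>();

    void enroll(String course, Student s) {
        rostersByCourse.computeIfAbsent(course, c -> new Roster()).enroll(s);
    }
}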

Why null is not allowed in ConcurrentHashMap? [duplicate]

The JavaDoc of ConcurrentHashMap says this:
Like Hashtable but unlike HashMap, this class does not allow null to be used as a key or value.
My question: Why?
2nd question: Why doesn't Hashtable allow null?
I've used a lot of HashMaps for storing data. But when changing to ConcurrentHashMap, I ran into trouble several times because of NullPointerExceptions.
From the author of ConcurrentHashMap himself (Doug Lea):
The main reason that nulls aren't allowed in ConcurrentMaps (ConcurrentHashMaps, ConcurrentSkipListMaps) is that ambiguities that may be just barely tolerable in non-concurrent maps can't be accommodated. The main one is that if map.get(key) returns null, you can't detect whether the key explicitly maps to null vs the key isn't mapped. In a non-concurrent map, you can check this via map.contains(key), but in a concurrent one, the map might have changed between calls.
I believe it is, at least in part, to allow you to combine containsKey and get into a single call. If the map can hold nulls, there is no way to tell if get is returning a null because there was no key for that value, or just because the value was null.
Why is that a problem? Because there is no safe way to do that yourself. Take the following code:
if (m.containsKey(k)) {
    return m.get(k);
} else {
    throw new KeyNotPresentException();
}
Since m is a concurrent map, key k may be deleted between the containsKey and get calls, causing this snippet to return a null that was never in the table, rather than the desired KeyNotPresentException.
Normally you would solve that by synchronizing, but with a concurrent map that of course won't work. Hence the meaning of get's null return had to be unambiguous, and the only way to achieve that in a backwards-compatible way was to prevent the user from inserting null values in the first place and to keep null as the placeholder for "key not found".
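To illustrate why the restriction pays off (my own sketch, not part of the answer above): with nulls banned, a single get is a complete presence check, and the atomic primitives cover the check-then-act patterns.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class SingleLookup {
    public static void main(String[] args) {
        ConcurrentMap<String, Integer> scores = new ConcurrentHashMap<>();
        scores.put("alice", 10);

        // Because null values are banned, a single atomic get is unambiguous:
        // a null result can only mean "no mapping".
        Integer score = scores.get("bob");
        if (score == null) {
            System.out.println("bob is not in the map");
        }

        // For check-then-act patterns, the atomic primitives avoid the race entirely.
        scores.putIfAbsent("bob", 0);
    }
}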
Josh Bloch designed HashMap; Doug Lea designed ConcurrentHashMap. I hope that isn't libelous. Actually I think the problem is that nulls often require wrapping so that the real null can stand for uninitialized. If client code requires nulls then it can pay the (admittedly small) cost of wrapping nulls itself.
You can't synchronize on a null.
Edit: This isn't exactly why in this case. I initially thought there was something fancy going on with locking things against concurrent updates or otherwise using the Object monitor to detect if something was modified, but upon examining the source code it appears I was wrong - they lock using a "segment" based on a bitmask of the hash.
In that case, I suspect they did it to copy Hashtable, and I suspect Hashtable did it because in the relational database world, null != null, so using a null as a key has no meaning.
I guess that the following snippet of the API documentation gives a good hint:
"This class is fully interoperable with Hashtable in programs that rely on its thread safety but not on its synchronization details."
They probably just wanted to make ConcurrentHashMap fully compatible with, and interchangeable for, Hashtable. And Hashtable does not allow null keys and values...
ConcurrentHashMap is thread-safe. I believe that not allowing null keys and values was a part of making sure that it is thread-safe.
I don't think disallowing null values is the right option.
In many cases, we do want to put a key with a null value into a concurrent map. However, with ConcurrentHashMap we cannot do that.
I suggest that a coming version of the JDK support it.

Are there advantages to using an enum where you could use a map and vice versa?

Say, for example, I want to make a cash register program. Ignoring, for the sake of being compact, that one wouldn't use floats for currency, my first instinct is to use an enum for the denominations, something along the lines of:
private enum Currency {
    ONE_HUNDRED(100.00f),
    FIFTY( 50.00f),
    TWENTY( 20.00f),
    TEN( 10.00f),
    FIVE( 5.00f),
    TWO( 2.00f),
    ONE( 1.00f),
    HALF_DOLLAR( 0.50f),
    QUARTER( 0.25f),
    DIME( 0.10f),
    NICKEL( 0.05f),
    PENNY( 0.01f);

    private final float value;

    Currency(float value) {
        this.value = value;
    }

    public float getValue() {
        return this.value;
    }

    @Override
    public String toString() {
        return this.name().replace("_", " ");
    }
}
But the last time I followed instinct, sans forethought, and did something similar for a Morse code converter, someone suggested that I use a map instead, specifically a BiMap. I see the appeal of that collection in that particular scenario, but generally speaking I wanted to ask whether there is any reason to prefer one when the other could be used. If instead of the above code I did this:
Map<String, Float> currency = new LinkedHashMap<>();
currency.put("One Hundred", 100.00f);
currency.put("Fifty", 50.00f);
currency.put("Twenty", 20.00f);
currency.put("Ten", 10.00f);
currency.put("Five", 5.00f);
currency.put("Two", 2.00f);
currency.put("One", 1.00f);
currency.put("Half Dollar", 0.50f);
currency.put("Quarter", 0.25f);
currency.put("Dime", 0.10f);
currency.put("Nickel", 0.05f);
currency.put("Penny", 0.01f);
Would it be superior for any reason?
In cases like these, where either could be utilized, are there any performance advantages to using one over the other? Is one more preferable/conventional? More maintainable/adaptable?
Is there any rule of thumb I could use for when I should use one over the other?
Here are things I like to keep in mind:
Enums are best used (and in the languages I know of, may only be used) to define a known set of items ahead of time. This has a nice benefit of treating what really boils down to frequently used "data" as code in a very readable way.
In my opinion, any code that relies on frequently hardcoded strings, like you would need if implementing data like that in a map, is more difficult to read and maintain. This leads to "magic strings", which are a no-no when avoidable.
It's not immediately clear what should exist in the map until you go and check, and it's not clear whether it's potentially being modified elsewhere. Consider that if you get an enum value wrong, the code will not even compile. Get a string key wrong, and you might not notice until much later.
Regarding performance, I doubt there is a large difference between the two. Enums are treated largely the same as objects; I suppose the benefit comes from accessing the data as a field on the object rather than through a hash lookup.
This article doesn't go in depth as I would like, but may be a good starting point: Memory Consumption of Java Data Types
It is quite common practice to use an enum as the keys of a map, which offers another way of associating data with a set of specific items (rather than setting them as fields on the enum). This would be my preferred approach, since setting lots of fields on an enum makes it feel too much like a class rather than a means of referencing. This doesn't have the same problems as a normal map because, since the keys must be enum constants, you don't need to worry about other keys "accidentally" being added to the map. Java as a whole supports this approach, as it provides the EnumMap class.
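A short sketch of that EnumMap approach, reusing a trimmed-down version of the question's denominations (float is kept only to mirror the question; real money code would use integer cents or BigDecimal):
import java.util.EnumMap;
import java.util.Map;

public class Register {
    // The denominations are the fixed set; their data hangs off an EnumMap
    // instead of fields on the enum itself.
    enum Denomination { TWENTY, TEN, FIVE, ONE, QUARTER, DIME, NICKEL, PENNY }

    private static final Map<Denomination, Float> VALUES = new EnumMap<>(Denomination.class);
    static {
        VALUES.put(Denomination.TWENTY, 20.00f);
        VALUES.put(Denomination.TEN,    10.00f);
        VALUES.put(Denomination.FIVE,    5.00f);
        VALUES.put(Denomination.ONE,     1.00f);
        VALUES.put(Denomination.QUARTER, 0.25f);
        VALUES.put(Denomination.DIME,    0.10f);
        VALUES.put(Denomination.NICKEL,  0.05f);
        VALUES.put(Denomination.PENNY,   0.01f);
    }

    public static void main(String[] args) {
        System.out.println(VALUES.get(Denomination.QUARTER));   // 0.25
    }
}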
I would say that the main difference between your two pieces of code is that in the case of the enum you have a fixed list of denominations which is "type-safe". While operating with strings and maps, it is very easy to misspell some string, introducing bugs that are hard to spot.
I would use an enum in this case; it is more sensible. And if this were something to be used by other people, enums have the associated values displayed for you in pretty much any IDE, whereas with a map neither the key nor the value is readily available to you. There are other reasons, but that was the one that came to mind.
Would it be superior for any reason?
The Map design would be appropriate for dynamic data, whereas the enum design would be appropriate for fixed data.
In cases like these, where either could be utilized, are there any performance advantages to using one over the other?
Insignificant.
Is one more preferable/conventional?
Only when considering the specific problem to be solved.
More maintainable/adaptable?
Again, it depends on the problem you're trying to solve.
Is there any rule of thumb I could use for when I should use one over the other?
Whether you're working with a limited, non-varying dataset known at compile time.

What would be a good way to implement a set collection with weak references, compares by reference, and is also sortable in Java?

I want to have an object that allows other objects of a specific type to register themselves with it. Ideally it would store the references to them in some sort of set collection and have .equals() compare by reference rather than value. It shouldn't have to maintain a sort at all times, but it should be able to be sorted before the collection is iterated over.
Looking through the Java Collection Library, I've seen the various features I'm looking for on different collection types, but I am not sure about how I should go about using them to build the kind of collection I'm looking for.
This is Java in the context of Android if that is significant.
Java's built-in tree-based collections won't work.
To illustrate, consider a tree containing weak references to nodes 'B', 'C', and 'D':
  C
B   D
Now let the weak reference 'C' get collected, leaving null behind:
  -
B   D
Now insert an element into the tree. The TreeMap/TreeSet doesn't have sufficient information to select the left or right subtree. If your comparator says null is a small value, then it will be incorrect when inserting 'A'. If it says null is a large value, it will be incorrect when inserting 'E'.
Sort on demand is a good choice.
A more robust solution is to use an ArrayList<WeakReference<T>> and to implement a Comparator<WeakReference<T>> that delegates to a Comparator<T>. Then call Collections.sort() prior to iteration.
Android's Collections.sort uses TimSort behind the scenes, so it runs quite efficiently if the input is already partially sorted.
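A minimal sketch of that approach (the class and method names are invented): cleared references are pruned, and the surviving referents are sorted just before iteration.
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sort-on-demand registry sketch; the element type T and the names are invented.
class SortedWeakRegistry<T> {
    private final List<WeakReference<T>> refs = new ArrayList<>();
    private final Comparator<T> order;

    SortedWeakRegistry(Comparator<T> order) {
        this.order = order;
    }

    void register(T item) {
        refs.add(new WeakReference<>(item));
    }

    // Drop cleared references, then sort the live referents just before iteration.
    List<T> snapshotSorted() {
        List<T> live = new ArrayList<>();
        for (WeakReference<T> ref : refs) {
            T item = ref.get();
            if (item != null) {
                live.add(item);
            }
        }
        refs.removeIf(r -> r.get() == null);
        live.sort(order);
        return live;
    }
}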
Perhaps the collections classes are a level of abstraction below what you're looking for? It sounds like the end product you want is a cache with the ability to iterate in a user-defined sort order. If so, perhaps the cache interface in the Google Guava library is close enough to what you want:
http://code.google.com/p/guava-libraries/source/browse/trunk/guava/src/com/google/common/cache/Cache.java
At a glance, it looks like CacheBuilder in that package doesn't allow you to build an implementation with user-defined iteration order. However, it does provide a Map view that might be good enough for your needs:
List<Thing> cachedThings = Lists.newArrayList(cache.asMap().values());
Collections.sort(cachedThings, YOUR_THING_COMPARATOR);
for (Thing thing : cachedThings) { ... }
Even if this isn't exactly what you want, the classes in that package might give you some useful insights re: using References with Collections.
DISCLAIMER: This was a comment but it got kinda big, sorry if it doesn't solve your problem:
References in Java
Just to clarify what I mean when I say reference, since it isn't really a term commonly used in Java: Java does not really use references or pointers. It uses a kind of pseudo-reference that can be (and is by default) assigned the special null value. That's one way to explain it, anyway. In Java, these pseudo-references are the only way that an Object can be handled. When I say reference, I mean these pseudo-references.
Sets
Any Set implementation will not allow two references to the same object to be included in it, since that would violate the mathematical concept of a set; the Java Sets simply ignore any attempt to add a duplicate reference.
You mention a Map in your comment though... Could you clarify what kind of collection you are after? And why you need that kind of equality checking within it? Are you thinking in C++ terms? I'll try to edit my answer to be more helpful then :)
EDIT: I thought that might have been your goal ;) So a TreeSet should do the trick then! I would not get concerned about performance until there is a performance issue. Simplicity is fantastic for readability, maintenance and preventing bugs. If performance does become a problem, ideally you should profile your code and only optimize the areas that are proven to be the problem.

Can hashCode() have dynamically changeable content?

In my implementation, I have a class A which overrides equals(Object) and hashCode(). But I have a small doubt: when adding an instance of A to a HashSet/HashMap, the value of hashCode() is x; after some time, the value of the same hashCode() changes to y. Will it affect anything?
The hash code mustn't change after it's been added to a map / set. It's okay for it to change before that, although it generally makes the type easier to work with if it doesn't change.
If the hash code changes, the key won't be found in the map/set: the lookup will usually go to the wrong bucket, and even if it happens to land in the same bucket, the stored hash code is checked first and will no longer match.
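A small sketch of that failure mode, with an invented MutableKey class (it also mirrors the hotel analogy given further down):
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Mutating a field that feeds hashCode() after the object is in a HashSet
// strands the entry: lookups stop finding it.
class MutableKey {
    String name;

    MutableKey(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && Objects.equals(name, ((MutableKey) o).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }

    public static void main(String[] args) {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey("Mike");
        set.add(key);

        key.name = "GreatMike";                 // hash code changes while in the set

        System.out.println(set.contains(key));  // false: the stored hash no longer matches
        System.out.println(set.size());         // 1, the entry is stranded inside
    }
}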
When the return value of hashCode() or equals() changes while the object is contained in HashMap/HashSet etc., the behavior is undefined (you could get all kinds of strange behavior). So one must avoid such mutation of keys while the object is contained in such collections etc.
It is considered best to use only immutable objects as keys (or to place in a HashSet, etc.). In fact Python, for example, does not allow mutable objects to be used as keys in maps. It is permitted and common to use mutable objects as keys in Java, but in that case it is advisable to make such objects "effectively immutable", i.e. do not change the state of such objects at all after instantiation.
To give an example, using a list as a key in a Map is usually considered okay, but you should avoid mutating such lists at any point of your application to avoid getting bitten by nasty bugs.
As long as you don't change the return value of hashCode() and equals() while the objects are in the container, you should be ok on paper. But one could easily introduce nasty, hard to find bugs by mistake so it's better to avoid the situation altogether.
Yes, the hash code of an object must not change during its lifetime. If it does, you need to notify the container (if that's possible); otherwise you can get wrong results.
Edit: As pointed out, it depends on the container. Obviously, if the container never uses your hashCode or equals methods, nothing will go wrong. But as soon as it tries to compare things for equality (all maps and sets), you'll get yourself in trouble.
Yes. Many people have answered the question here; I just want to offer an analogy. A hash code is something like an address in a hash-based collection:
Imagine you check into a hotel under the name "Mike", and afterwards you change your name to "GreatMike" on the registration paper. Then, when someone looks for you by the name "Mike", they cannot find you anymore.
