FindBugs: Detect invocation of Object.hashCode() - java

If a class doesn't implement its own hashCode() method, it uses the default implementation Object.hashCode() (provided no superclass in between overrides it). Object.hashCode() doesn't guarantee that the same hash code is generated in different JVM instances. We are having some problems because of this in a clustered environment.
In addition to some fixes that we applied, we would like static analysis to detect this case. We are already using FindBugs, but unfortunately I have no experience extending the default ruleset.
I've done some research and I know that you can implement your own custom detectors, but I have not found much documentation on how to do this.
I guess my questions are:
Before I invest too much work here, is this approach reasonable, can FindBugs do this?
What's the best resources to get me started writing custom detectors?
Thanks for your input!

Findbugs has some checks for hashCode already: (see also http://findbugs.sourceforge.net/bugDescriptions.html )
HE_EQUALS_NO_HASHCODE
HE_EQUALS_USE_HASHCODE
HE_HASHCODE_NO_EQUALS
HE_HASHCODE_USE_OBJECT_EQUALS
HE_INHERITS_EQUALS_USE_HASHCODE (this might be of interest for your case)
If those are not sufficient for you, they might be a good starting point for creating a custom detector.
UPDATE. The source code of the detectors can be found in https://code.google.com/p/findbugs/source/browse/findbugs/src/java/edu/umd/cs/findbugs/ and the other packages of that repo.

You may try it the other way around:
Add a hashCode() method to all your (entity-like) classes. The nonexistence of that method can easily be verified with FindBugs. The implementation would look something like:
    @Override
    public int hashCode() {
        throw new UnsupportedOperationException("hashCode() not supported.");
    }
That way you can ensure that there is no Object.hashCode() "fallback": the class cannot be used in a HashMap, Hashtable, HashSet or any other situation where hashCode() is called.
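To illustrate the effect, here is a minimal sketch. The Account class and the HashCodeGuardDemo helper are made-up names for this example only; the point is just that a hash-based collection calls hashCode() as soon as the object is added, so the guard fires immediately.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical entity used only for illustration.
class Account {
    private final String id;

    Account(String id) {
        this.id = id;
    }

    // Fail fast instead of silently falling back to Object.hashCode().
    @Override
    public int hashCode() {
        throw new UnsupportedOperationException("hashCode() not supported.");
    }
}

class HashCodeGuardDemo {
    // Returns true if adding the object to a HashSet triggers the guard.
    static boolean rejectedByHashSet(Object candidate) {
        Set<Object> set = new HashSet<>();
        try {
            set.add(candidate); // HashSet needs the hash code to pick a bucket
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectedByHashSet(new Account("42")));
        System.out.println(rejectedByHashSet("a plain string"));
    }
}
```

The downside is that the problem only shows up at runtime, so this works best together with tests that exercise the collections in question.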

I think that will be really difficult (if it's even possible). But I have two other things you could try to find places where classes should override hashCode():
At least Netbeans has a hint for "overrides equals but not hashCode" that could help a little.
Place a breakpoint on Object.hashCode() and run a more or less representative testset.
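A third option that could run inside a unit test is a reflective check. This is only a sketch and not a FindBugs detector; HashCodeCheck, Plain and WithHash are hypothetical names. It asks which class declares the hashCode() method that a given type ends up with.

```java
import java.lang.reflect.Method;

// Hypothetical sample classes for the demonstration.
class Plain {
}

class WithHash {
    @Override
    public int hashCode() {
        return 7;
    }
}

class HashCodeCheck {
    // True if the type, or a superclass other than Object, declares hashCode().
    static boolean overridesHashCode(Class<?> type) {
        try {
            Method m = type.getMethod("hashCode");
            return m.getDeclaringClass() != Object.class;
        } catch (NoSuchMethodException e) {
            // Happens for interfaces: getMethod() does not report the
            // methods they implicitly take over from Object.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(overridesHashCode(Plain.class));
        System.out.println(overridesHashCode(WithHash.class));
    }
}
```

You would still need a way to enumerate the classes you care about (for example a hand-maintained list of entity classes), which this sketch leaves out.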

If you want to find ALL classes which do not implement hashCode(), couldn't you just use a simple text-search / grep approach?
Just search your project for all files ending in .java which do not contain the string "public int hashCode"; it would be fairly easy to write a script for this. For example, use a simple search tool to find all Java files containing the text and subtract this list from a list of all .java files. The resulting list will have all .java files which do not declare hashCode().
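A rough sketch of such a script, written in Java to stay in one language. HashCodeGrep and its demo() method are made-up names, and the demo writes two throwaway files into a temporary directory. Note that a plain text search will miss classes that inherit hashCode() from a superclass and anything formatted differently from the search string.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class HashCodeGrep {
    private static final String NEEDLE = "public int hashCode";

    // Lists .java files under root whose text does not contain NEEDLE.
    static List<Path> filesWithoutHashCode(Path root) {
        try (Stream<Path> files = Files.walk(root)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".java"))
                    .filter(p -> !contains(p, NEEDLE))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean contains(Path file, String needle) {
        try {
            byte[] bytes = Files.readAllBytes(file);
            return new String(bytes, StandardCharsets.UTF_8).contains(needle);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates two sample sources in a temp directory and returns the flagged file names.
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("hashcode-grep");
            Files.write(dir.resolve("WithHash.java"),
                    "class WithHash { public int hashCode() { return 1; } }"
                            .getBytes(StandardCharsets.UTF_8));
            Files.write(dir.resolve("NoHash.java"),
                    "class NoHash { }".getBytes(StandardCharsets.UTF_8));
            return filesWithoutHashCode(dir).stream()
                    .map(p -> p.getFileName().toString())
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```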


Testing a NIO.2 Path instance for "empty path"

So this trivial question has generated a disproportionate amount of discussion here. It feels more like a playful puzzle but not full-on codegolf.
The javadoc for the NIO.2 Path class includes this part:
A Path is considered to be an empty path if it consists
solely of one name element that is empty.
followed by the "empty path maps to default directory" clause -- that behavior is well understood and not relevant to the question here.
The discussion arose from some of our junior developers asking: given a Path instance p, how should they test for an empty path condition? Turns out the rest of their team (with more experience) had each been doing their own thing, and while all of their approaches "worked", they wanted to converge on the officially correct way; I believe there may have been a round of beers at stake.
Testing for consists solely of one name element is trivial (p.getNameCount() == 1). Testing for that is empty means obtaining that name element (p.getName(0) or p.getFileName()), which... is also a Path instance that needs to be tested for emptiness...
Calling p.toString() and then testing for isEmpty() felt distasteful, because the emptiness test is being done on a String representation of the path, not the path instance itself. This sparked some philosophical debate about the completeness of the Path API and the meaning of canonical representations. I think they were already two beers in by then.
One developer pointed to the Path#resolve(Path other) method's javadocs, which contain the note If other is an empty path then this method trivially returns this path. So his emptiness test uses an isolated Path instance, and tests for isolated.resolve(p).equals(isolated), which seemed suspiciously too clever and apparently led to raised voices.
Another developer admitted to testing whether p was an instance of sun.nio.fs.UnixPath and then abusing reflection to access its private isEmpty() method. I wasn't present to ask what he does for Windows platforms, and suspect this wouldn't work in Java 9+ anyway.
In the end, they said they grudgingly settled on p.toString().length() == 0 but nobody was happy about it. None of them liked the idea that the Path class depends on an "emptiness" quality that they could apparently only measure using methods of the String class, either before construction or after conversion. Presumably this solution was good enough for them to figure out who bought the beers, anyway.
Anyhow, once I heard about it I had to admit I was at a loss as to the best practice. What do the experts do for this case? Convert to String and be done with it, or stay within the NIO.2 API and take advantage of the resolve behavior, or...? (If you live near our other team, they might buy you a beer.)
Ideally, toString() should not be used for overall comparisons. And while you could use the resolve method, you really shouldn’t. (I won’t even address the use of reflection for this.)
I believe you all are over-thinking this problem. Just write what you mean. If you want to test if a Path is equal to the empty path, then do exactly that:
if (somePath.equals(Paths.get("")))
I suppose you could store the empty path in a constant, but it’s so trivial that I wouldn’t bother. It might even make the code harder to read instead of making it easier.
If you don’t want to do that, then your first instinct was correct: test for the conditions described in the documentation.
if (somePath.getNameCount() == 1 &&
somePath.getFileName().toString().isEmpty())
Or:
if (somePath.getNameCount() == 1 && somePath.endsWith(""))
I would prefer using equals, because when someone else reads the code, they will see code that shows your intent: to check whether a path is equal to the empty path.
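Both variants can be wrapped in small helpers if you find yourself repeating the test. This is just a sketch; EmptyPaths and its method names are made up for this example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class EmptyPaths {
    // Direct comparison against the empty path of the default file system.
    static boolean isEmpty(Path p) {
        return p.equals(Paths.get(""));
    }

    // Written against the javadoc wording: exactly one name element, and it is empty.
    static boolean isEmptyByDefinition(Path p) {
        return p.getNameCount() == 1 && p.getFileName().toString().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isEmpty(Paths.get("")));
        System.out.println(isEmpty(Paths.get("a")));
        System.out.println(isEmptyByDefinition(Paths.get("")));
        System.out.println(isEmptyByDefinition(Paths.get("a", "b")));
    }
}
```

One difference worth knowing: Path.equals() returns false for paths associated with a different FileSystem, so the equals variant only matches empty paths of the default file system, while the definition-based variant does not care which provider created the path.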

Is it ok to add toString() to ease debugging?

I work a lot in IntelliJ and it can be quite convenient for classes to have their own toString() (the one IntelliJ generates works fine) so you can see something more informative than MyClass@1345 when trying to figure out what something is.
My question is: Is that ok? I am adding code that has no business value and doesn't affect my test cases or the execution of my software(I am not using toString() for anything more than debugging). Still, it is a part of my process. What is correct here?
The toString() method is mainly intended for debugging.
Except in a few special cases, you should use it for debugging rather than for displaying information to clients: what clients need may happen to match the toString() output today, but it could differ tomorrow.
From the toString() javadoc, you can read :
Returns a string representation of the object. In general, the
toString method returns a string that "textually represents" this
object. The result should be a concise but informative representation
that is easy for a person to read. It is recommended that all
subclasses override this method.
The parts that matter for you are:
The result should be a concise but informative representation
that is easy for a person to read.
and
It is recommended that all
subclasses override this method.
You said that :
Still, it is a part of my process. What is correct here?
Good thing : the specification recommends it.
Besides the excellent points by davidxxx, the following things apply:
Consistency matters. People working with your code should not be surprised by what is happening within your classes. So either "all/most" classes @Override toString() using similar implementations, or "none" does.
Thus: make sure everybody agrees if/how to implement toString()
Specifically ensure that your toString() implementation is robust
Meaning: you absolutely have to avoid that your implementation throws any exception (for example a NPE because you happen to do someString + fieldX.name() for some fieldX that might be null).
You also have to avoid creating an "expensive" implementation (for example code that does a "deep dive" into some database to return a value from there).
My 2 cents of personal opinion: I find toString() to be of great value when debugging things, but I have also seen real performance impacts from toString() implementations that were too expensive. Thing is: you have no idea how often some trace code might be calling toString() on your objects, so you had better make sure it returns quickly.
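As a small sketch of the robustness point (the Order class and its fields are hypothetical): string concatenation and Objects.toString() both cope with null, whereas calling a method such as status.name() on a field that may be null would throw.

```java
import java.util.Objects;

class Order {
    enum Status { NEW, SHIPPED }

    private final String id;
    private final Status status; // may be null in this hypothetical class

    Order(String id, Status status) {
        this.id = id;
        this.status = status;
    }

    @Override
    public String toString() {
        // No field is dereferenced here: concatenation prints "null" for a null id,
        // and Objects.toString() substitutes "n/a" for a null status.
        return "Order{id=" + id + ", status=" + Objects.toString(status, "n/a") + "}";
    }

    public static void main(String[] args) {
        System.out.println(new Order("A1", Status.NEW));
        System.out.println(new Order(null, null));
    }
}
```

It also only touches fields that are already in memory, which keeps it cheap.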
The docs explain the function of this method:
Returns a string representation of the object. In general, the toString method returns a string that "textually represents" this object. The result should be a concise but informative representation that is easy for a person to read. It is recommended that all subclasses override this method.
As you see, they don't specify a particular use for this method or discourage you from using it for debugging; they only state what it is expected to do and recommend implementing this method in subclasses of Object.
Therefore, strictly speaking, how you use this method is up to you. In the university course I am taking, overriding the toString method is required for some tasks and in some cases we are asked to use it to demonstrate debugging.
It is perfectly OK and even a good idea. Most classes don't specify the content of toString so it's not wise to use it for logic (the content may change in a future version of the class). But some classes do, for example StringBuilder. And then it is also OK to use the return value for logic.
So for your own classes you may even opt to specify the content and use (and let your users use) the return value for logic.

Forcing devs to explicitly define keys for configuration data

We are working in a project with multiple developers and currently the retrieval of values from a configuration file is somewhat "wild west":
Everybody uses some string to retrieve a value from the Config object
Those keys are spread across multiple classes and packages
Sometimes they are not even declared as constants
Naming of the keys is inconsistent and the config file (.properties) looks messy
I would like to sort that out and force everyone to explicitly define their configuration keys. Ideally in one place to streamline how config keys actually look.
I was thinking of using an enum as the key and changing my retrieval method from
getConfigValue(String key)
to something like
getConfigValue(ConfigKey key)
NOTE: I am using this approach since the Preferences API seems a bit overkill to me plus I would actually like to have the configuration in a simple file.
What are the cons of this approach?
First off, FWIW, I think it's a good idea. But you did specifically ask what the "cons" are, so:
The biggest "con" is that it ties any class that needs to use configuration data to the ConfigKey class. Adding a config key used to mean adding a string to the code you were working on; now it means adding to the enum and to the code you were working on. This is (marginally) more work.
You're probably not markedly increasing inter-dependence otherwise, since I assume the class that getConfigValue is part of is the one on which you'd define the enum.
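For reference, the approach under discussion might look roughly like the sketch below. The key names and the propertyName() accessor are assumptions made up for this example; only getConfigValue(ConfigKey) comes from the question.

```java
import java.util.Properties;

// Every key the application may read is declared here, in one place.
enum ConfigKey {
    DB_URL("db.url"),         // hypothetical key
    CACHE_SIZE("cache.size"); // hypothetical key

    private final String propertyName;

    ConfigKey(String propertyName) {
        this.propertyName = propertyName;
    }

    String propertyName() {
        return propertyName;
    }
}

class Config {
    private final Properties properties;

    Config(Properties properties) {
        this.properties = properties;
    }

    // Callers can only pass keys that are declared in ConfigKey.
    String getConfigValue(ConfigKey key) {
        return properties.getProperty(key.propertyName());
    }
}
```

This makes the coupling visible: every class that reads configuration now compiles against ConfigKey.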
The other downside to consolidation is if you have multiple projects on different parts of the same code base. When you develop, you have to deal with delivery dependencies, which can be a PITA.
Say Project A and Project B are scheduled to get released in that order. Suddenly political forces change in the 11th hour and you have to deliver B before A. Do you repackage the config to deal with it? Can your QA cycles deal with repackaging, or does it force a reset in their timeline?
Typical release issues, but just one more thing you have to manage.
From your question, it is clear that you intend to write a wrapper class for the raw Java Properties API, with the intention that your wrapper class provides a better API. I think that is a good approach, but I'd like to suggest some things that I think will improve your wrapper API.
My first suggested improvement is that an operation that retrieves a configuration value should take two parameters rather than one, and be implemented as shown in the following pseudocode:
    import java.util.Properties;

    class Configuration {
        private final Properties properties = new Properties();

        public String getString(String namespace, String localName) {
            return properties.getProperty(namespace + "." + localName);
        }
    }
You can then encourage each developer to define a string constant value to denote the namespace for whatever class/module/component they are developing. As long as each developer (somehow) chooses a different string constant for their namespace, you will avoid accidental name clashes and promote a somewhat organised collection of property names.
My second suggested improvement is that your wrapper class should provide type-safe access to property values. For example, provide getString(), but also provide methods with names such as getInt(), getBoolean(), getDouble() and getStringList(). The int/boolean/double variants should retrieve the property value as a string, attempt to parse it into the appropriate type, and throw a descriptive error message if that fails. The getStringList() method should retrieve the property value as a string and then split it into a list of strings based on using, say, a comma as a separator. Doing this will provide a consistent way for developers to get a list value.
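A minimal sketch of what those typed accessors could look like. TypedConfig is a made-up name, and throwing IllegalArgumentException for a missing or malformed value is my assumption about what a "descriptive error" might be.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class TypedConfig {
    private final Properties properties;

    TypedConfig(Properties properties) {
        this.properties = properties;
    }

    String getString(String namespace, String localName) {
        String key = namespace + "." + localName;
        String value = properties.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value.trim();
    }

    int getInt(String namespace, String localName) {
        String raw = getString(namespace, localName);
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    namespace + "." + localName + " is not an integer: '" + raw + "'", e);
        }
    }

    boolean getBoolean(String namespace, String localName) {
        String raw = getString(namespace, localName);
        if (raw.equalsIgnoreCase("true")) {
            return true;
        }
        if (raw.equalsIgnoreCase("false")) {
            return false;
        }
        throw new IllegalArgumentException(
                namespace + "." + localName + " is not a boolean: '" + raw + "'");
    }

    // Splits on commas, trims each item and drops empty items.
    List<String> getStringList(String namespace, String localName) {
        List<String> result = new ArrayList<>();
        for (String item : getString(namespace, localName).split(",")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }
}
```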
My third suggested improvement is that your wrapper class should provide some additional methods such as:
int getDurationMilliseconds(String namespace, String localName);
int getDurationSeconds(String namespace, String localName);
int getMemorySizeBytes(String namespace, String localName);
int getMemorySizeKB(String namespace, String localName);
int getMemorySizeMB(String namespace, String localName);
Here are some examples of their intended use:
cacheSize = cfg.getMemorySizeBytes(MY_NAMESPACE, "cache_size");
timeout = cfg.getDurationMilliseconds(MY_NAMESPACE, "cache_timeout");
The getMemorySizeBytes() method should convert string values such as "2048 bytes" or "32MB" into the appropriate number of bytes, and getMemorySizeKB() does something similar but returns the specified size in terms of KB rather than bytes. Likewise, the getDuration<units>() methods should be able to handle string values like "500 milliseconds", "2.5 minutes", "3 hours" and "infinite" (which is converted into, say, -1).
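The parsing behind such methods could be sketched as follows. UnitParser is a hypothetical helper, not part of any library; it returns long rather than int to leave room for large values, and it assumes 1 KB = 1024 bytes and only the unit spellings listed in the switch statements.

```java
import java.util.Locale;

class UnitParser {
    // Parses values such as "2048 bytes", "4 KB" or "32MB" into a number of bytes.
    static long parseMemorySizeBytes(String text) {
        String[] parts = split(text);
        double amount = Double.parseDouble(parts[0]);
        switch (parts[1]) {
            case "":
            case "byte":
            case "bytes":
                return (long) amount;
            case "kb":
                return (long) (amount * 1024);
            case "mb":
                return (long) (amount * 1024 * 1024);
            default:
                throw new IllegalArgumentException("Unknown memory unit in '" + text + "'");
        }
    }

    // Parses values such as "500 milliseconds" or "2.5 minutes"; "infinite" becomes -1.
    static long parseDurationMilliseconds(String text) {
        if (text.trim().equalsIgnoreCase("infinite")) {
            return -1;
        }
        String[] parts = split(text);
        double amount = Double.parseDouble(parts[0]);
        switch (parts[1]) {
            case "millisecond":
            case "milliseconds":
                return (long) amount;
            case "second":
            case "seconds":
                return (long) (amount * 1_000);
            case "minute":
            case "minutes":
                return (long) (amount * 60_000);
            case "hour":
            case "hours":
                return (long) (amount * 3_600_000);
            default:
                throw new IllegalArgumentException("Unknown duration unit in '" + text + "'");
        }
    }

    // Splits "2.5 minutes" into {"2.5", "minutes"}; the unit is lower-cased.
    private static String[] split(String text) {
        String s = text.trim();
        int i = 0;
        while (i < s.length() && (Character.isDigit(s.charAt(i)) || s.charAt(i) == '.')) {
            i++;
        }
        if (i == 0) {
            throw new IllegalArgumentException("No leading number in '" + text + "'");
        }
        return new String[] {s.substring(0, i), s.substring(i).trim().toLowerCase(Locale.ROOT)};
    }
}
```

The getDuration<units>() and getMemorySize<units>() variants would then just divide the result by the appropriate factor.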
Some people may think that the above suggestions have nothing to do with the question that was asked. Actually, they do, but in a sneaky sort of way. The above suggestions will result in a configuration API that developers will find to be much easier to use than the "raw" Java Properties API. They will use it to obtain that ease-of-use benefit. But using the API will have the side effect of forcing the developers to adopt a namespace convention, which will help to solve the problem that you are interested in addressing.
Or to look at it another way, the main con of the approach described in the question is that it offers a win-lose situation: you win (by imposing a property-naming convention on developers), but developers lose because they swap the familiar Java Properties API for another API that doesn't offer them any benefits. In contrast, the improvements I have suggested are intended to provide a win-win situation.

Modify code at runtime to log return values in Java?

Is there any way of inserting code at runtime to log return values, for instance, using instrumentation?
So far, I managed to insert code when a method exits, but I would like to log something like "method foo returned HashMap { 1 -> 2, 2 -> 3 }"
I'm looking for a general approach that can also deal with, for instance, java.io.* classes. (So in general I'll have no access to the code).
I tried using a custom classloader too, but lot of difficulties arise as I cannot modify java.* classes.
Thanks for the help!
Sergio
Check out BTrace. It's Java, and I believe it'll do what you want.
Have you considered AOP (aspect-oriented programming)? If by "I cannot modify java.* classes" you mean you don't have access to the source code and cannot add configuration, etc., then that probably won't work for you. In any other case, check this link for examples using Spring AOP:
http://static.springsource.org/spring/docs/2.5.x/reference/aop.html
If not, you could consider solutions based on remote-debugging, or profiling. But they all involve "some" access to the original code, if only to enable / disable JMX access.
Well, since you're looking for everything, the only thing I can think of is using a machine agent. Machine agents hook into the low levels of the JVM itself and can be used to monitor these things.
I have not used DTrace, but it sounds like it would be able to do what you need. Adam Leventhal wrote a nice blog post about it. The link to DTrace in the blog is broken, but I'm sure a quick search and you'll come up with it.
Take a look at Spring AOP, which is quite powerful, and flexible. To start you off on the method foo, you can apply an AfterReturning advice to it as:
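    import org.aspectj.lang.annotation.AfterReturning;
    import org.aspectj.lang.annotation.Aspect;

    @Aspect
    public class AfterReturningExample {
        // com.example.YourClassName is a placeholder for the class you want to observe
        @AfterReturning(
            pointcut = "execution(* com.example.YourClassName.foo(..))",
            returning = "retVal")
        public void logTheFoo(Object retVal) {
            // ... logger.trace("method 'foo' returned " + retVal);
            // might need to convert "retVal" to a String representation if needed
        }
    }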
The pointcut syntax is really flexible so you can target all the sub packages, components, methods, return values given the expression.
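By default Spring AOP builds on JDK dynamic proxies for beans that implement interfaces, and the same idea can be sketched without Spring. Everything named below (Greeter, ReturnValueLogger, the LOG list) is made up for this example, and the in-memory list stands in for a real logger.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface whose calls we want to observe.
interface Greeter {
    String greet(String name);
}

class ReturnValueLogger {
    // Collected messages; a real setup would hand these to a logging framework.
    static final List<String> LOG = new ArrayList<>();

    // Wraps target in a proxy that records the return value of every call.
    @SuppressWarnings("unchecked")
    static <T> T wrap(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            try {
                Object result = method.invoke(target, args);
                LOG.add("method " + method.getName() + " returned " + result);
                return result;
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the target itself threw
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        Greeter real = name -> "hello " + name;
        Greeter logged = wrap(real, Greeter.class);
        logged.greet("world");
        System.out.println(LOG);
    }
}
```

The limitation is the same as for proxy-based AOP: it only covers calls made through an interface on an object you hand out yourself, so it does not help with concrete java.io.* classes. Those need bytecode instrumentation as discussed in the other answers.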

Can I add and remove elements of enumeration at runtime in Java

It is possible to add and remove elements from an enum in Java at runtime?
For example, could I read in the labels and constructor arguments of an enum from a file?
@saua, it's just a question of whether it can be done out of interest really. I was hoping there'd be some neat way of altering the running bytecode, maybe using BCEL or something. I've also followed up with this question because I realised I wasn't totally sure when an enum should be used.
I'm pretty convinced that the right answer would be to use a collection that ensured uniqueness instead of an enum if I want to be able to alter the contents safely at runtime.
No, enums are supposed to be a complete static enumeration.
At compile time, you might want to generate your enum .java file from another source file of some sort. You could even create a .class file like this.
In some cases you might want a set of standard values but allow extension. The usual way to do this is to define an interface for the API and an enum that implements that interface for the standard values. Of course, you lose the ability to switch when you only have a reference to the interface.
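A minimal sketch of that interface-plus-enum arrangement. Color, StandardColor, CustomColor and Palette are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// The type that client code depends on.
interface Color {
    String label();
}

// The fixed, compile-time set of standard values.
enum StandardColor implements Color {
    RED, GREEN, BLUE;

    @Override
    public String label() {
        return name();
    }
}

// An extension value created at runtime, e.g. from a line in a file.
final class CustomColor implements Color {
    private final String label;

    CustomColor(String label) {
        this.label = label;
    }

    @Override
    public String label() {
        return label;
    }
}

class Palette {
    // Combines the standard values with extra labels supplied at runtime.
    static List<Color> load(List<String> extraLabels) {
        List<Color> all = new ArrayList<>(Arrays.asList(StandardColor.values()));
        for (String extra : extraLabels) {
            all.add(new CustomColor(extra));
        }
        return all;
    }
}
```

Code that holds a StandardColor can still use switch; code that only holds a Color has to go through the interface methods.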
Behind the curtain, enums are POJOs with a private constructor and a bunch of public static final values of the enum's type (see here for an example). In fact, up until Java5, it was considered best-practice to build your own enumeration this way, and Java5 introduced the enum keyword as a shorthand. See the source for Enum<T> to learn more.
So it should be no problem to write your own 'TypeSafeEnum' with a public static final array of constants, that are read by the constructor or passed to it.
Also, do yourself a favor and override equals, hashCode and toString, and if possible create a values method
The question is how to use such a dynamic enumeration... you can't read the value "PI=3.14" from a file to create enum MathConstants and then go ahead and use MathConstants.PI wherever you want...
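A rough sketch of such a hand-written type-safe enumeration, filled at runtime. MathConstant, register() and the "NAME=value" line format are assumptions for this example; lookups have to go through valueOf("PI") because there is no compile-time MathConstants.PI.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MathConstant {
    // Registration order is preserved, similar to the declaration order of a real enum.
    private static final Map<String, MathConstant> BY_NAME = new LinkedHashMap<>();

    private final String name;
    private final double value;

    private MathConstant(String name, double value) {
        this.name = name;
        this.value = value;
    }

    // Registers a constant from a line such as "PI=3.14"; duplicate names are rejected.
    static synchronized MathConstant register(String line) {
        String[] parts = line.split("=", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected NAME=value but got '" + line + "'");
        }
        String name = parts[0].trim();
        if (BY_NAME.containsKey(name)) {
            throw new IllegalArgumentException("Duplicate constant: " + name);
        }
        MathConstant constant = new MathConstant(name, Double.parseDouble(parts[1].trim()));
        BY_NAME.put(name, constant);
        return constant;
    }

    static synchronized MathConstant valueOf(String name) {
        MathConstant constant = BY_NAME.get(name);
        if (constant == null) {
            throw new IllegalArgumentException("No constant named " + name);
        }
        return constant;
    }

    static synchronized List<MathConstant> values() {
        return Collections.unmodifiableList(new ArrayList<>(BY_NAME.values()));
    }

    double value() {
        return value;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof MathConstant && ((MathConstant) other).name.equals(name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    @Override
    public String toString() {
        return name;
    }
}
```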
I needed to do something like this (for unit testing purposes), and I came across this - the EnumBuster:
http://www.javaspecialists.eu/archive/Issue161.html
It allows enum values to be added, removed and restored.
Edit: I've only just started using this, and found that there's some slight changes needed for java 1.5, which I'm currently stuck with:
Add array copyOf static helper methods (e.g. take these 1.6 versions: http://www.docjar.com/html/api/java/util/Arrays.java.html)
Change EnumBuster.undoStack to a Stack<Memento>
In undo(), change undoStack.poll() to undoStack.isEmpty() ? null : undoStack.pop();
The string VALUES_FIELD needs to be "ENUM$VALUES" for the java 1.5 enums I've tried so far
I faced this problem on the formative project of my young career.
The approach I took was to save the values and the names of the enumeration externally, and the end goal was to be able to write code that looked as close to a language enum as possible.
I wanted my solution to look like this:
enum HatType
{
BASEBALL,
BRIMLESS,
INDIANA_JONES
}
HatType mine = HatType.BASEBALL;
// prints "BASEBALL"
System.out.println(mine.toString());
// prints true
System.out.println(mine.equals(HatType.BASEBALL));
And I ended up with something like this:
// in a file somewhere:
// 1 --> BASEBALL
// 2 --> BRIMLESS
// 3 --> INDIANA_JONES
HatDynamicEnum hats = HatEnumRepository.retrieve();
HatEnumValue mine = hats.valueOf("BASEBALL");
// prints "BASEBALL"
System.out.println(mine.toString());
// prints true
System.out.println(mine.equals(hats.valueOf("BASEBALL")));
Since my requirements were that it had to be possible to add members to the enum at run-time, I also implemented that functionality:
hats.addEnum("BATTING_PRACTICE");
HatEnumRepository.storeEnum(hats);
hats = HatEnumRepository.retrieve();
HatEnumValue justArrived = hats.valueOf("BATTING_PRACTICE");
// file now reads:
// 1 --> BASEBALL
// 2 --> BRIMLESS
// 3 --> INDIANA_JONES
// 4 --> BATTING_PRACTICE
I dubbed it the Dynamic Enumeration "pattern", and you can read about the original design and its revised edition.
The difference between the two is that the revised edition was designed after I really started to grok OO and DDD. The first one I designed when I was still writing nominally procedural DDD, under time pressure no less.
You can load a Java class from source at runtime. (Using JCI, BeanShell or JavaCompiler)
This would allow you to change the Enum values as you wish.
Note: this wouldn't change any classes which referred to these enums so this might not be very useful in reality.
A working example in widespread use is in modded Minecraft. See EnumHelper.addEnum() methods on Github
However, note that in rare situations practical experience has shown that adding Enum members can lead to some issues with the JVM optimiser. The exact issues may vary with different JVMs. But broadly it seems the optimiser may assume that some internal fields of an Enum, specifically the size of the Enum's .values() array, will not change. See issue discussion. The recommended solution there is not to make .values() a hotspot for the optimiser. So if adding to an Enum's members at runtime, it should be done once and once only when the application is initialised, and then the result of .values() should be cached to avoid making it a hotspot.
The way the optimiser works and the way it detects hotspots is obscure and may vary between different JVMs and different builds of the JVM. If you don't want to take the risk of this type of issue in production code, then don't change Enums at runtime.
You could try to assign properties to the enum you're trying to create and statically construct it by using a loaded properties file. Big hack, but it works :)
