Java Arrays.hashcode() weird behaivor - java

I have an object which has one field- double[] _myField
it's hashcode is
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + Arrays.hashCode(_myField);
return result;
}
However if I used the object as a key in a Map I get the following weird behavior:
for (Map.Entry<MyObject, String> entry: _myMap.entrySet())
{
if (entry.getValue() != _myMap.get(entry.getKey()))
{
System.out.println("found the problem the value is null");
}
}
The only reason I can think of that the above IF statement is true is that I get a different hashcode for the key.
in fact, I have changed the hashcode function to return 1 in all cases. Not efficient, but good for debugging and indeed the IF statement is always false.
What's wrong with Arrays.hashcode()?
Pls note (after reading some comments):
1) As for the usage of != in the IF statement, indeed it compares references but in the above case it should have been the same. Anyhow the weird thing is that the right hand side returns NULL
2) As for posting Equals function. Of course I've implemented it. But it is irrelevant. Tracing the code in debug reveals that only hashcode is called. The reason for that is presumably the weird thing, the returned hashcode is different from the orignal one. In such cases the Map doesn't find a matching entry and therefore there is no need to call Equals.

Is the array being changed while it's in the map? Because that will change the outcome of the result.

Implementing hashCode is not enough. You also need to implement equals object. In fact whenever you implement hashCode for an object, you MUST implement equals as well. Those 2 work together.
You need to implement equals for your object and ensure that whenever equals is true for 2 objects, their hashCode also matches.

Trying to reproduce the problem on a clean slate revealed it is not reproducible.
Hence, further investigation revealed that the problem was
the _myField used for Hash was changed while the object was stored in the map.
As expected the map is corrupted.
Sorry for the time wasted by those who tried to answer the wrong question.

Related

How do I prove that Object.hashCode() can produce same hash code for two different objects in Java?

Had a discussion with an interviewer regarding internal implementation of Java Hashmaps and how it would behave if we override equals() but not the HashCode() method for an Employee<Emp_ID, Emp_Name> object.
I was told that hashCode for two different objects would never be the same for the default object.hashCode() implementation, unless we overrode the hashCode() ourselves.
From what I remembered, I told him that Java Hashcode contracts says that two different objects "may" have the same hashcode() not that it "must".
According to my interviewer, the default object.hashcode() never returns the same hashcode() for two different objects, Is this true?
Is it even remotely possible to write a code that demonstrates this. From what I understand, Object.hashcode() can produce 2^30 unique values, how does one produce a collision, with such low possibility of collision to demonstrate that two different objects can get the same hashcode() with the Object classes method.
Or is he right, with the default Object.HashCode() implementation, we will never have a collision i.e two different objects can never have the same HashCode. If so, why do so many java manuals don't explicitly say so.
How can I write some code to demonstrate this? Because on demonstrating this, I can also prove that a bucket in a hashmap can contain different HashCodes(I tried to show him the debugger where the hashMap was expanded but he told me that this is just logical Implementation and not the internal algo?)
2^30 unique values sounds like a lot but the birthday problem means we don't need many objects to get a collision.
The following program works for me in about a second and gives a collision between objects 196 and 121949. I suspect it will heavily depend on your system configuration, compiler version etc.
As you can see from the implementation of the Hashable class, every one is guarenteed to be unique and yet there are still collisions.
class HashCollider
{
static class Hashable
{
private static int curr_id = 0;
public final int id;
Hashable()
{
id = curr_id++;
}
}
public static void main(String[] args)
{
final int NUM_OBJS = 200000; // birthday problem suggests
// this will be plenty
Hashable objs[] = new Hashable[NUM_OBJS];
for (int i = 0; i < NUM_OBJS; ++i) objs[i] = new Hashable();
for (int i = 0; i < NUM_OBJS; ++i)
{
for (int j = i + 1; j < NUM_OBJS; ++j)
{
if (objs[i].hashCode() == objs[j].hashCode())
{
System.out.println("Objects with IDs " + objs[i].id
+ " and " + objs[j].id + " collided.");
System.exit(0);
}
}
}
System.out.println("No collision");
}
}
If you have a large enough heap (assuming 64 bit address space) and objects are small enough (the smallest object size on a 64 bit JVM is 8 bytes), then you will be able to represent more than 2^32 objects that are reachable at the same time. At that point, the objects' identity hashcodes cannot be unique.
However, you don't need a monstrous heap. If you create a large enough pool of objects (e.g. in a large array) and randomly delete and recreate them, it is (I think) guaranteed that you will get a hashcode collision ... if you continue doing this long enough.
The default algorithm for hashcode in older versions of Java is based on the address of the object when hashcode is first called. If the garbage collector moves an object, and another one is created at the original address of the first one, and identityHashCode is called, then the two objects will have the same identity hashcode.
The current (Java 8) default algorithm uses a PRNG. The "birthday paradox" formula will tell you the probability that one object's identity hashcode is the same as one more of the other's.
The -XXhashCode=n option that #BastianJ mentioned has the following behavior:
hashCode == 0: Returns a freshly generated pseudo-random number
hashCode == 1: XORs the object address with a pseudo-random number that changes occasionally.
hashCode == 2: The hashCode is 1! (Hence #BastianJ's "cheat" answer.)
hashCode == 3: The hashcode is an ascending sequence number.
hashCode == 4: the bottom 32 bits of the object address
hashCode >= 5: This is the default algorithm for Java 8. It uses Marsaglia's xor-shift PRNG with a thread specific seed.
If you have downloaded the OpenJDK Java 8 source code, you will find the implementation in hotspot/src/share/vm/runtime/synchronizer.cp. Look for the get_next_hash() method.
So that is another way to prove it. Show him the source code!
Use Oracle JVM and set -XX:hashCode=2. If I remember corretly, this chooses the Default implementation to be "constant 1". Just for the purpose of proving you're right.
I have little to add to Michael's answer (+1) except a bit of code golfing and statistics.
The Wikipedia article on the Birthday problem that Michael linked to has a nice table of the number of events necessary to get a collision, with a desired probability, given a value space of a particular size. For example, Java's hashCode has 32 bits, giving a value space of 4 billion. To get a collision with a probability of 50%, about 77,000 events are necessary.
Here's a simple way to find two instances of Object that have the same hashCode:
static int findCollision() {
Map<Integer,Object> map = new HashMap<>();
Object n, o;
do {
n = new Object();
o = map.put(n.hashCode(), n);
} while (o == null);
assert n != o && n.hashCode() == o.hashCode();
return map.size() + 1;
}
This returns the number of attempts it took to get a collision. I ran this a bunch of times and generated some statistics:
System.out.println(
IntStream.generate(HashCollisions::findCollision)
.limit(1000)
.summaryStatistics());
IntSummaryStatistics{count=1000, sum=59023718, min=635, average=59023.718000, max=167347}
This seems quite in line with the numbers from the Wikipedia table. Incidentally, this took only about 10 seconds to run on my laptop, so this is far from a pathological case.
You were right in the first place, but it bears repeating: hash codes are not unique!

Explaination for IntelliJ default hashCode() implementation

I took a look at the IntelliJ default hashCode() implementation and was wondering, why they implemented it the way they did. I'm quite new to the hash concept and found some contradictory statements, that need clarification:
public int hashCode(){
// creationDate is of type Date
int result = this.creationDate != null ? this.creationDate.hashCode() : 0;
// id is of type Long (wrapper class)
result = 31 * result + (this.id != null ? this.id.hashCode() : 0);
// code is of type String
result = 31 * result + (this.code != null ? this.code.hashCode() : 0);
// revision is of type int
result = 31 * result + this.revision;
return result;
}
Imo, the best source about this topic seemed to be this Java world article because I found their arguments most convincing. So I was wondering:
Among other arguments, above source states that multiplication is one of the slower operations. So, wouldn't it be better to skip the multiplication with a prime number whenever I call the hashCode() method of a reference type? Because most of the time this already includes such a multiplication.
Java world states that bitwise XOR ^ also improves the computation due to not mentioned reasons : ( What exactly might be an advantage in comparison to regular addition?
Wouldn't it be better to return different values when the respective class field is null? It would make the result more distinguishable, wouldn't it? Are there any huge disadvantages to use non-zero values?
Their example code looks more appealing to my eye, tbh:
public boolean hashCode() {
return
(name == null ? 17 : name.hashCode()) ^
(birth == null ? 31 : name.hashCode());
}
But I'm not sure if that's objectively true. I'm also a little bit suspicious of IntelliJ because their default code for equals(Object) compares by instanceof instead of comparing the instance classes directly. And I agree with that Java world article that this doesn't seem to fulfill the contract correctly.
As for hashCode(), I would consider it more important to minimize collisions (two different objects having same hashCode()) than the speed of the hashCode() computation. Yes, the hashCode() should be fast (constant-time if possible), but for huge data structures using hashCode() (maps, sets etc.) the collisions are more important factor.
If your hashCode() function performs in constant time (independent on data and input size) and produces a good hashing function (few collisions), asymptotically the operations (get, contains, put) on map will perform in constant time.
If your hashCode() function produces a lot of collisions, the performance will suffer. In extreme case, you can always return 0 from hashCode() - the function itself will be super-fast, but the map operations will perform in linear time (i.e. growing with map size).
Multiplying the hashCode() before adding another field's sub-hashCode should usually provide for less collisions - this is a heuristic based on that often the fields contain similar data / small numbers.
Consider an example of class Person:
class Person {
int age;
int heightCm;
int weightKg;
}
If you just added the numbers together to compute the hashCode, the result would be somewhere between 60 and 500 for all persons. If you multiply it the way Idea does, you will get hashCodes between 2000 and more than 100000 - much bigger space and therefore lower chance of collisions.
Using XOR is not a very good idea, for example if you have class Rectangle with fields height and width, all squares would have the same hashCode - 0.
As for equals() using instanceof vs. getClass().equals(), I've never seen a conclusive debate on this. Both have their advantages and disadvantages, and both ways can cause troubles if you're not careful:
If you use instanceof, any subclass that overrides your equals() will likely break the symmetry requirement
If you use getClass().equals(), this will not work well with some frameworks like Hibernate that produce their own subclasses of your classes to store their own technical information

Effective java item 9: overriding hashcode example

I was reading Effective Java Item 9 and decided to run the example code by myself. But it works slightly different depending on how I insert a new object that I don't understand what exactly is going on inside. The PhoneNumber class looks:
public class PhoneNumber {
private final short areaCode;
private final short prefix;
private final short lineNumber;
public PhoneNumber(int areaCode, int prefix, int lineNumber) {
this.areaCode = (short)areaCode;
this.prefix = (short) prefix;
this.lineNumber = (short)lineNumber;
}
#Override public boolean equals(Object o) {
if(o == this) return true;
if(!(o instanceof PhoneNumber)) return false;
PhoneNumber pn = (PhoneNumber)o;
return pn.lineNumber == lineNumber && pn.prefix == prefix && pn.areaCode == areaCode;
}
}
Then according to the book and as is when I tried,
public static void main(String[] args) {
HashMap<PhoneNumber, String> phoneBook = new HashMap<PhoneNumber, String>();
phoneBook.put(new PhoneNumber(707,867,5309), "Jenny");
System.out.println(phoneBook.get(new PhoneNumber(707,867,5309)));
}
This prints "null" and it's explained in the book because HashMap has an optimization that caches the hash code associated with each entry and doesn't check for object equality if the hash codes don't match. It makes sense to me. But when I do this:
public static void main(String[] args) {
PhoneNumber p1 = new PhoneNumber(707,867,5309);
phoneBook.put(p1, "Jenny");
System.out.println(phoneBook.get(new PhoneNumber(707,867,5309)));
}
Now it returns "Jenny". Can you explain why it didn't fail in the second case?
The experienced behaviour might depend on the Java version and vendor that was used to run the application, because since the general contract of Object.hashcode() is violated, the result is implementation dependent.
A possible explanation (taking one possible implementation of HashMap):
The HashMap class in its internal implementation puts objects (keys) in different buckets based on their hashcode. When you query an element or you check if a key is contained in the map, first the proper bucket is looked for based on the hashcode of the queried key. Inside the bucket objects are checked in a sequencial way, and inside a bucket only the equals() method is used to compare elements.
So if you do not override Object.hashcode() it will be indeterministic if 2 different objects produce default hashcodes which may or may not determine the same bucket. If by any chance they "point" to the same bucket, you will still be able to find the key if the equals() method says they are equal. If by any chance they "point" to 2 different buckets, you will not find the key even if equals() method says they are equal.
hashcode() must be overriden to be consistent with your overridden equals() method. Only in this case it is guaranteed the proper, expected and consistent working of HashMap.
Read the javadoc of Object.hashcode() for the contract that you must not violate. The main point is that if equals() returns true for another object, the hashcode() method must return the same value for both of these objects.
Can you explain why it didn't fail in the second case?
In a nutshell, it is not guaranteed to fail. The two objects in the second example could end up having the same hash code (purely by coincidence or, more likely, due to compiler optimizations or due to how the default hashCode() works in your JVM). This would lead to the behaviour you describe.
For what it's worth, I cannot reproduce this behaviour with my compiler/JVM.
In your case by coincidence JVM was able to find the same hashCode for both object. When I ran your code, in my JVM it gave null for both the case. So your problem is because of JVM not the code.
It is better to override hashCode() each and every time when you override equils() method.
I haven't read Effective Java, I read SCJP by Kathy Sierra. So if you need more details then you can read this book. It's nice.
Your last code snipped does not compile because you haven't declared phoneBook.
Both main methods should work exactly the same. There is a 1 in 16 chance that it will print Jenny because a newly crated HashMap has a default size of 16. In detail that means that only the lower 4 bits of the hashCode will be checked. If they equal the equal method is used.

Why hashCode() returns the same value for a object in all consecutive executions?

I am trying some code around object equality in java. As I have read somewhere
hashCode() is a number which is generated by applying the hash function. Hash Function can be different for each object but can also be same. At the object level, it returns the memory address of the object.
Now, I have sample program, which I run 10 times, consecutively. Every time i run the program I get the same value as hash code.
If hashCode() function returns the memory location for the object, how come the java(JVM) store the object at same memory address in the consecutive runs?
Can you please give me some insight and your view over this issue?
The Program I am running to test this behavior is below :
public class EqualityIndex {
private int index;
public EqualityIndex(int initialIndex) {
this.index = initialIndex;
}
public static void main(String[] args) {
EqualityIndex ei = new EqualityIndex(2);
System.out.println(ei.hashCode());
}
}
Every time I run this program the hash code value returned is 4072869.
how come the java(JVM) store the object at same memory address in the consecutive runs?
Why wouldn't it? Non-kernel programs never work with absolute memory addresses, they use virtual memory where each process gets its own address space. So it's no surprise at all that a deterministic program would place stuff at the same location in each run.
Well, the objects may very well end up in the same location in the virtual memory. Nothing contradictory with that. Bottom line is that you shouldn't care anyway. If the hashCode is implemented to return something related to the internal storage address (there is no guarantee at all!) there is nothing useful you could do with that information anyway.
The hashCode() function should return the same value for the same object in the same execution. It is not necessary to return same value for different execution. See the following comments which was extracted from Object class.
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during
an execution of a Java application, the hashCode method
must consistently return the same integer, provided no information
used in equals comparisons on the object is modified.
This integer need not remain consistent from one execution of an
application to another execution of the same application.
I have executed your code few times.
Here is my result
1935465023
1425840452
1935465023
1935465023
1935465023
1925529038
1935465023
1935465023
The same number is repeated multiple times. But that doesn't mean they are same always. Lets assume, a particular implementation of JVM is looking for first free slot in the memory. then if you don't run any other applications, the chances for allocating the object in the same slot is high.
Which kinds of object you are created? Is it a specific kind of object like String (e.g). If it is a specific kind, the class can already override the hashCode() method and return it to you. In that case, what you receive is not memory address any more.
and also what values you receive, are you sure if it is memory address?
So, please post more code for more details
As I have read somewhere: " ... At the object level, it returns the memory address of the object."
That statement is incorrect, or at best an oversimplification.
The statement is referring to the default implementation of Object.hashCode() which returns the same value as System.identityHashcode(Object).
The Javadoc for Object.hashCode() says this:
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language.)
and this:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified.
In fact, the identity hashcode value is typically based on the Object's machine address when the method is first called for the object. The value is stored in a hidden field in the Object header. This allows the same hashcode to be returned on subsequent calls ... even if the GC has relocated the Object in the meantime.
Note that if the identity hashcode value did change over the lifetime of an Object, it would violate the contract for hashcode(), and would be useless as a hashcode.
hashCode can and may be overridden by the class you are using, the JavaDoc for Object.hashCode() states that it is '... typically implemented by converting the internal address of the object into an integer...' so the actual implementation may be system dependant. Note also the third point in the JavaDoc that two objects that are not equal are not required to return different hashCodes.
for (int i=0; i < 10; i++) {
System.out.println("i: " + i + ", hashCode: " + new Object().hashCode());
}
prints:
i: 0, hashCode: 1476323068
i: 1, hashCode: 535746438
i: 2, hashCode: 2038935242
i: 3, hashCode: 988057115
i: 4, hashCode: 1932373201
i: 5, hashCode: 1001195626
i: 6, hashCode: 1560511937
i: 7, hashCode: 306344348
i: 8, hashCode: 1211154977
i: 9, hashCode: 2031692173
If two objects are equal, they have to have the same hash codes. So, as long as you do not create plain Object instances, everytime you create object that has the same value, you'll get the same hash code.

Java - Common Gotchas [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
In the same spirit of other platforms, it seemed logical to follow up with this question: What are common non-obvious mistakes in Java? Things that seem like they ought to work, but don't.
I won't give guidelines as to how to structure answers, or what's "too easy" to be considered a gotcha, since that's what the voting is for.
See also:
Perl - Common gotchas
.NET - Common gotchas
"a,b,c,d,,,".split(",").length
returns 4, not 7 as you might (and I certainly did) expect. split ignores all trailing empty Strings returned. That means:
",,,a,b,c,d".split(",").length
returns 7! To get what I would think of as the "least astonishing" behaviour, you need to do something quite astonishing:
"a,b,c,d,,,".split(",",-1).length
to get 7.
Comparing equality of objects using == instead of .equals() -- which behaves completely differently for primitives.
This gotcha ensures newcomers are befuddled when "foo" == "foo" but new String("foo") != new String("foo").
I think a very sneaky one is the String.substring method. This re-uses the same underlying char[] array as the original string with a different offset and length.
This can lead to very hard-to-see memory problems. For example, you may be parsing extremely large files (XML perhaps) for a few small bits. If you have converted the whole file to a String (rather than used a Reader to "walk" over the file) and use substring to grab the bits you want, you are still carrying around the full file-sized char[] array behind the scenes. I have seen this happen a number of times and it can be very difficult to spot.
In fact this is a perfect example of why interface can never be fully separated from implementation. And it was a perfect introduction (for me) a number of years ago as to why you should be suspicious of the quality of 3rd party code.
Overriding equals() but not hashCode()
It can have really unexpected results when using maps, sets or lists.
SimpleDateFormat is not thread safe.
There are two that annoy me quite a bit.
Date/Calendar
First, the Java Date and Calendar classes are seriously messed up. I know there are proposals to fix them, I just hope they succeed.
Calendar.get(Calendar.DAY_OF_MONTH) is 1-based
Calendar.get(Calendar.MONTH) is 0-based
Auto-boxing preventing thinking
The other one is Integer vs int (this goes for any primitive version of an object). This is specifically an annoyance caused by not thinking of Integer as different from int (since you can treat them the same much of the time due to auto-boxing).
int x = 5;
int y = 5;
Integer z = new Integer(5);
Integer t = new Integer(5);
System.out.println(5 == x); // Prints true
System.out.println(x == y); // Prints true
System.out.println(x == z); // Prints true (auto-boxing can be so nice)
System.out.println(5 == z); // Prints true
System.out.println(z == t); // Prints SOMETHING
Since z and t are objects, even they though hold the same value, they are (most likely) different objects. What you really meant is:
System.out.println(z.equals(t)); // Prints true
This one can be a pain to track down. You go debugging something, everything looks fine, and you finally end up finding that your problem is that 5 != 5 when both are objects.
Being able to say
List<Integer> stuff = new ArrayList<Integer>();
stuff.add(5);
is so nice. It made Java so much more usable to not have to put all those "new Integer(5)"s and "((Integer) list.get(3)).intValue()" lines all over the place. But those benefits come with this gotcha.
Try reading Java Puzzlers which is full of scary stuff, even if much of it is not stuff you bump into every day. But it will destroy much of your confidence in the language.
List<Integer> list = new java.util.ArrayList<Integer>();
list.add(1);
list.remove(1); // throws...
The old APIs were not designed with boxing in mind, so overload with primitives and objects.
This one I just came across:
double[] aList = new double[400];
List l = Arrays.asList(aList);
//do intense stuff with l
Anyone see the problem?
What happens is, Arrays.asList() expects an array of object types (Double[], for example). It'd be nice if it just threw an error for the previous ocde. However, asList() can also take arguments like so:
Arrays.asList(1, 9, 4, 4, 20);
So what the code does is create a List with one element - a double[].
I should've figured when it took 0ms to sort a 750000 element array...
this one has trumped me a few times and I've heard quite a few experienced java devs wasting a lot of time.
ClassNotFoundException --- you know that the class is in the classpath BUT you are NOT sure why the class is NOT getting loaded.
Actually, this class has a static block. There was an exception in the static block and someone ate the exception. they should NOT. They should be throwing ExceptionInInitializerError. So, always look for static blocks to trip you. It also helps to move any code in static blocks to go into static methods so that debugging the method is much more easier with a debugger.
Floats
I don't know many times I've seen
floata == floatb
where the "correct" test should be
Math.abs(floata - floatb) < 0.001
I really wish BigDecimal with a literal syntax was the default decimal type...
Not really specific to Java, since many (but not all) languages implement it this way, but the % operator isn't a true modulo operator, as it works with negative numbers. This makes it a remainder operator, and can lead to some surprises if you aren't aware of it.
The following code would appear to print either "even" or "odd" but it doesn't.
public static void main(String[] args)
{
String a = null;
int n = "number".hashCode();
switch( n % 2 ) {
case 0:
a = "even";
break;
case 1:
a = "odd";
break;
}
System.out.println( a );
}
The problem is that the hash code for "number" is negative, so the n % 2 operation in the switch is also negative. Since there's no case in the switch to deal with the negative result, the variable a never gets set. The program prints out null.
Make sure you know how the % operator works with negative numbers, no matter what language you're working in.
Manipulating Swing components from outside the event dispatch thread can lead to bugs that are extremely hard to find. This is a thing even we (as seasoned programmers with 3 respective 6 years of java experience) forget frequently! Sometimes these bugs sneak in after having written code right and refactoring carelessly afterwards...
See this tutorial why you must.
Immutable strings, which means that certain methods don't change the original object but instead return a modified object copy. When starting with Java I used to forget this all the time and wondered why the replace method didn't seem to work on my string object.
String text = "foobar";
text.replace("foo", "super");
System.out.print(text); // still prints "foobar" instead of "superbar"
I think i big gotcha that would always stump me when i was a young programmer, was the concurrent modification exception when removing from an array that you were iterating:
List list = new ArrayList();
Iterator it = list.iterator();
while(it.hasNext()){
//some code that does some stuff
list.remove(0); //BOOM!
}
if you have a method that has the same name as the constructor BUT has a return type. Although this method looks like a constructor(to a noob), it is NOT.
passing arguments to the main method -- it takes some time for noobs to get used to.
passing . as the argument to classpath for executing a program in the current directory.
Realizing that the name of an Array of Strings is not obvious
hashCode and equals : a lot of java developers with more than 5 years experience don't quite get it.
Set vs List
Till JDK 6, Java did not have NavigableSets to let you easily iterate through a Set and Map.
Integer division
1/2 == 0 not 0.5
Using the ? generics wildcard.
People see it and think they have to, e.g. use a List<?> when they want a List they can add anything to, without stopping to think that a List<Object> already does that. Then they wonder why the compiler won't let them use add(), because a List<?> really means "a list of some specific type I don't know", so the only thing you can do with that List is get Object instances from it.
(un)Boxing and Long/long confusion. Contrary to pre-Java 5 experience, you can get a NullPointerException on the 2nd line below.
Long msec = getSleepMsec();
Thread.sleep(msec);
If getSleepTime() returns a null, unboxing throws.
The default hash is non-deterministic, so if used for objects in a HashMap, the ordering of entries in that map can change from run to run.
As a simple demonstration, the following program can give different results depending on how it is run:
public static void main(String[] args) {
System.out.println(new Object().hashCode());
}
How much memory is allocated to the heap, or whether you're running it within a debugger, can both alter the result.
When you create a duplicate or slice of a ByteBuffer, it does not inherit the value of the order property from the parent buffer, so code like this will not do what you expect:
ByteBuffer buffer1 = ByteBuffer.allocate(8);
buffer1.order(ByteOrder.LITTLE_ENDIAN);
buffer1.putInt(2, 1234);
ByteBuffer buffer2 = buffer1.duplicate();
System.out.println(buffer2.getInt(2));
// Output is "-771489792", not "1234" as expected
Among the common pitfalls, well known but still biting occasionally programmers, there is the classical if (a = b) which is found in all C-like languages.
In Java, it can work only if a and b are boolean, of course. But I see too often newbies testing like if (a == true) (while if (a) is shorter, more readable and safer...) and occasionally writing by mistake if (a = true), wondering why the test doesn't work.
For those not getting it: the last statement first assign true to a, then do the test, which always succeed!
-
One that bites lot of newbies, and even some distracted more experienced programmers (found it in our code), the if (str == "foo"). Note that I always wondered why Sun overrode the + sign for strings but not the == one, at least for simple cases (case sensitive).
For newbies: == compares references, not the content of the strings. You can have two strings of same content, stored in different objects (different references), so == will be false.
Simple example:
final String F = "Foo";
String a = F;
String b = F;
assert a == b; // Works! They refer to the same object
String c = "F" + F.substring(1); // Still "Foo"
assert c.equals(a); // Works
assert c == a; // Fails
-
And I also saw if (a == b & c == d) or something like that. It works (curiously) but we lost the logical operator shortcut (don't try to write: if (r != null & r.isSomething())!).
For newbies: when evaluating a && b, Java doesn't evaluate b if a is false. In a & b, Java evaluates both parts then do the operation; but the second part can fail.
[EDIT] Good suggestion from J Coombs, I updated my answer.
The non-unified type system contradicts the object orientation idea. Even though everything doesn't have to be heap-allocated objects, the programmer should still be allowed to treat primitive types by calling methods on them.
The generic type system implementation with type-erasure is horrible, and throws most students off when they learn about generics for the first in Java: Why do we still have to typecast if the type parameter is already supplied? Yes, they ensured backward-compatibility, but at a rather silly cost.
Going first, here's one I caught today. It had to do with Long/long confusion.
public void foo(Object obj) {
if (grass.isGreen()) {
Long id = grass.getId();
foo(id);
}
}
private void foo(long id) {
Lawn lawn = bar.getLawn(id);
if (lawn == null) {
throw new IllegalStateException("grass should be associated with a lawn");
}
}
Obviously, the names have been changed to protect the innocent :)
Another one I'd like to point out is the (too prevalent) drive to make APIs generic. Using well-designed generic code is fine. Designing your own is complicated. Very complicated!
Just look at the sorting/filtering functionality in the new Swing JTable. It's a complete nightmare. It's obvious that you are likely to want to chain filters in real life but I have found it impossible to do so without just using the raw typed version of the classes provided.
System.out.println(Calendar.getInstance(TimeZone.getTimeZone("Asia/Hong_Kong")).getTime());
System.out.println(Calendar.getInstance(TimeZone.getTimeZone("America/Jamaica")).getTime());
The output is the same.
I had some fun debugging a TreeSet once, as I was not aware of this information from the API:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all key comparisons using its compareTo (or compare) method, so two keys that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
http://download.oracle.com/javase/1.4.2/docs/api/java/util/TreeSet.html
Objects with correct equals/hashcode implementations were being added and never seen again as the compareTo implementation was inconsistent with equals.
IMHO
1. Using vector.add(Collection) instead of vector.addall(Collection). The first adds the collection object to vector and second one adds the contents of collection.
2. Though not related to programming exactly, the use of xml parsers that come from multiple sources like xerces, jdom. Relying on different parsers and having their jars in the classpath is a nightmare.

Categories

Resources