What is the best practice for writing a hash function in Java?

I'm wondering what the best practice is for writing the hashCode() method in Java.
A good description can be found here. Is it that good?

Here's a quote from Effective Java 2nd Edition, Item 9: "Always override hashCode when you override equals":
While the recipe in this item yields reasonably good hash functions, it does not yield state-of-the-art hash functions, nor do Java platform libraries provide such hash functions as of release 1.6. Writing such hash functions is a research topic, best left to mathematicians and computer scientists. [... Nonetheless,] the techniques described in this item should be adequate for most applications.
Josh Bloch's recipe
Store some constant nonzero value, say 17, in an int variable called result
Compute an int hashcode c for each field f that defines equals:
If the field is a boolean, compute (f ? 1 : 0)
If the field is a byte, char, short, or int, compute (int) f
If the field is a long, compute (int) (f ^ (f >>> 32))
If the field is a float, compute Float.floatToIntBits(f)
If the field is a double, compute Double.doubleToLongBits(f), then hash the resulting long as above
If the field is an object reference and this class's equals method compares the field by recursively invoking equals, recursively invoke hashCode on the field. If the value of the field is null, return 0
If the field is an array, treat it as if each element is a separate field. If every element in an array field is significant, you can use one of the Arrays.hashCode methods added in release 1.5
Combine the hashcode c into result as follows: result = 31 * result + c;
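Applied by hand, the recipe might look like the sketch below (the class and its fields are made up for illustration; an equals comparing the same fields is assumed):
final class Employee {
    private final boolean active;
    private final long id;
    private final double salary;
    private final String name;   // may be null

    Employee(boolean active, long id, double salary, String name) {
        this.active = active;
        this.id = id;
        this.salary = salary;
        this.name = name;
    }

    @Override
    public int hashCode() {
        int result = 17;
        result = 31 * result + (active ? 1 : 0);                      // boolean field
        result = 31 * result + (int) (id ^ (id >>> 32));              // long field
        long bits = Double.doubleToLongBits(salary);
        result = 31 * result + (int) (bits ^ (bits >>> 32));          // double field
        result = 31 * result + (name == null ? 0 : name.hashCode());  // reference field, 0 if null
        return result;
    }
}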
Now, of course that recipe is rather complicated, but luckily, you don't have to reimplement it every time, thanks to java.util.Arrays.hashCode(Object[]).
@Override public int hashCode() {
    return Arrays.hashCode(new Object[] {
        myInt,    // auto-boxed
        myDouble, // auto-boxed
        myString,
    });
}
As of Java 7 there is a convenient varargs variant in java.util.Objects.hash(Object...).
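A minimal sketch of the Objects.hash variant, reusing the illustrative field names from above:
import java.util.Objects;

class Example {
    private int myInt;
    private double myDouble;
    private String myString;

    @Override
    public int hashCode() {
        // Objects.hash boxes its arguments into an array and delegates to Arrays.hashCode
        return Objects.hash(myInt, myDouble, myString);
    }
}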

A great reference for an implementation of hashCode() is described in the book Effective Java. After you understand the theory behind generating a good hash function, you may check HashCodeBuilder from Apache commons lang, which implements what's described in the book. From the docs:
This class enables a good hashCode method to be built for any class. It follows the rules laid out in the book Effective Java by Joshua Bloch. Writing a good hashCode method is actually quite difficult. This class aims to simplify the process.
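A minimal usage sketch, assuming commons-lang3 is on the classpath (the class and fields are illustrative):
import org.apache.commons.lang3.builder.HashCodeBuilder;

class Person {
    private String name;
    private int age;

    @Override
    public int hashCode() {
        // the two constructor arguments are an odd initial value and an odd multiplier
        return new HashCodeBuilder(17, 31)
                .append(name)
                .append(age)
                .toHashCode();
    }
}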

It's good, as @leonbloy says, to understand it well. Even then, however, one "best" practice is to simply let your IDE write the function for you. It won't be optimal under some circumstances - and in some very rare circumstances it won't even be good - but for most situations, it's easy, repeatable, error-free, and as good (as a hash code) as it needs to be. Sure, read the docs and understand it well - but don't complicate it unnecessarily.

Related

Why does the instance method `hashCode` on `java.lang.Integer` make an extra jump to the static class method to simply return its own integer value?

I was just exploring different kinds of implementations to the hashCode() method. I opened up the java.lang.Integer class and found this implementation for hashCode():
public int hashCode() {
    return Integer.hashCode(value);
}

public static int hashCode(int value) {
    return value;
}
My question is, why can't the implementation be as simple as:
public int hashCode() {
    return this.value;
}
What is the need to create an additional static method to pass around the value and return the same? Am I overlooking any important detail here?
That code does look odd when viewed on its own.
But notice that the static method java.lang.Integer.hashCode:
was added later, in Java 8
is public
The source code in Java 14 shows no comments to explain why this static method was added. Because the method is public, I presume this new static method plays a part in some new feature in Java 8, perhaps related to streams, called elsewhere in the OpenJDK codebase.
As noted in the Javadoc, the source code of the existing Integer::hashCode instance method was rewritten to call the static hashCode simply for consistency. This way there is only one place where the hash code is actually being generated. Having only one place is wise for review and maintenance of the codebase.
Making hashCode static is certainly unusual. The purpose of the hashCode method is to identify one object of that class to another for use in collections such as HashSet or HashMap. Given that we are comparing instances by the method, it makes sense for hashCode to be an instance method rather than static.
An optimizing compiler such as HotSpot or OpenJ9 is likely to inline the hashCode method calls, making the instance-method versus static-method arrangement in the source code moot.
@Basil Bourque's answer covers just about everything. But he leaves open the question of why the public static int hashCode(int) was added.
The change was made in this changeset in November 2012
http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/be1fb42ef696/src/share/classes/java/lang/Integer.java
The title and summary for the changeset say this:
7088913: Add compatible static hashCode(primitive) to primitive wrapper classes
Summary: Adds static utility methods to each primitive wrapper class to allow calculation of a hashCode value from an unboxed primitive.
Note that the changeset does not document the motivation for the change.
I infer that one purpose of the enhancement is to avoid the application programmer having to know how the primitive wrapper classes' hash codes are computed. Prior to Java 8, to compute the wrapper-compatible hash code for a primitive int, the programmer would have had to write either
int value = ...
int hash = ((Integer) value).hashCode();  // Facially inefficient (depending on
                                          // the JIT compiler's ability to get
                                          // rid of the box/unbox sequence)
or
int value = ...
int hash = value;  // Hardwires knowledge of how
                   // Integer.hashCode() is computed.
While the "knowledge" is trivial for int / Integer, consider the case of double / Double where the hash code computation is:
long bits = doubleToLongBits(value);
return (int)(bits ^ (bits >>> 32));
It seems likely that this changeset was also motivated by the Streams project; e.g. so that Integer::hashCode can be used in a stream of integers.
However, the changeset that added sum, min and max for use in stream reductions happened a couple of months after this one. So we cannot definitively make the connection ... based on this evidence.
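To make the stream connection concrete, here is a small illustration of mine (not from the changeset): with the static overload present, Integer::hashCode can be applied directly to a primitive IntStream with no boxing:
import java.util.stream.IntStream;

public class HashDemo {
    public static void main(String[] args) {
        // Integer::hashCode here binds to the static hashCode(int) added in Java 8,
        // so the primitive values are hashed without boxing.
        int[] hashes = IntStream.of(3, 14, 15).map(Integer::hashCode).toArray();
        for (int h : hashes) {
            System.out.println(h);   // prints 3, 14, 15 - the values themselves
        }
    }
}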

Java custom numeric interface - square root

The question is about the strategy for approaching the problem of defining a square-root algorithm in a generic numeric interface. I am aware of the existence of algorithms solving the problem under different conditions. I'm interested in algorithms that:
Solve the problem using only selected functions;
Don't care whether the objects manipulated are integers, floating points or something else, provided those objects can be added, multiplied and compared;
Return an exact solution if the input is a perfect square.
Because of the subtlety of the distinction, and for the sake of clarity, I will define the problem in a very verbose way. Beware the wall of text!
Suppose to have a Java Interface Constant<C extends Constant<C>> with the following abstract methods, that we will call base functions:
C add(C a);
C subtract(C a);
C multiply(C a);
C[] divideAndRemainder(C b);
C additiveInverse();
C multiplicativeInverse();
C additiveIdentity();
C multiplicativeIdentity();
int compareTo(C arg1);
It is not known whether C represents an integer or a floating-point number, nor should this be relevant in the following discussion.
Using only those methods, it is possible to create static or default implementations of some mathematical algorithms on numbers: for example, divideAndRemainder(C b) and compareTo(C arg1) allow one to create algorithms for the greatest common divisor, the Bézout identity, and so on.
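For instance, here is a minimal sketch of mine (not part of the original question) of a default gcd built only from those base functions, assuming nonnegative integer-like values and that divideAndRemainder returns {quotient, remainder}:
public default C gcd(C b) {
    @SuppressWarnings("unchecked")
    C a = (C) this;                        // C is the self type of the interface
    C zero = additiveIdentity();
    while (b.compareTo(zero) != 0) {       // classic Euclidean algorithm
        C r = a.divideAndRemainder(b)[1];  // remainder of a / b
        a = b;
        b = r;
    }
    return a;
}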
Now suppose our Interface has a default method for the exponentiation:
public default C pow(int n) {
    if (n < 0) return this.multiplicativeInverse().pow(-n);
    if (n == 0) return multiplicativeIdentity();
    // exponentiation by squaring
    @SuppressWarnings("unchecked")
    C base = (C) this;
    C output = multiplicativeIdentity();
    int m = n;
    while (m > 0) {
        if (m % 2 == 1) output = output.multiply(base);
        base = base.multiply(base);
        m = m / 2;
    }
    return output;
}
The goal is to define two default methods called C root(int n) and C maximumErrorAllowed() such that:
x.equals(y.pow(n)) implies x.root(n).equals(y);
C root(int n); is actually implemented using only base functions and methods created from the base functions;
The interface can still be applied to any kind of numbers, including but not limiting at both integers and floating points.
this.root(n).pow(n).compareTo(maximumErrorAllowed()) == -1 for all this such that this.root(n) != null, i.e. any eventual approximation has an error smaller than C maximumErrorAllowed();
Is that possible? If yes, how and what would be an estimation of the computational complexity?
I spent some time working on a custom number interface for Java; it's amazingly hard--one of the most disappointing experiences I've had with Java.
The problem is that you have to start over from scratch--you can't really re-use anything in Java, so if you want to have implementations for int, float, long, BigInteger, rational, Complex and Vector you have to implement all the methods yourself for every single class, and then don't expect the Math package to be of much help.
It got particularly nasty implementing the "Composed" classes like "Complex" which is made from two of the "Generic" floating point types, or "Rational" which composes two generic integer types.
And math operators are right out--this can be especially frustrating.
The way I got it to work reasonably well was to implement the classes in Java and then write some of the higher-level stuff in Groovy. If you name the operations correctly, Groovy can just pick them up, like if your class implements ".plus()" then groovy will let you do instance1+instance2.
IIRC because of being dynamic, Groovy often handled cross-class pieces nicely, like if you said Complex + Integer you could supply a conversion from Integer to complex and groovy would promote Integer to Complex to do the operation and return a complex.
Groovy is pretty interchangeable with Java: you can usually just rename a Java class to ".groovy" and compile it and it will work, so it was a pretty good compromise.
This was a long time ago though, now you might get some traction with Java 8's default methods in your "Number" interface--that could make implementing some of the classes easier but might not help--I'd have to try it again to find out and I'm not sure I want to re-open that can o' worms.
Is that possible? If yes, how?
In theory, yes. There are approximation algorithms for root(), for example the n-th root algorithm. You will run into problems with precision, however, which you might want to solve on a case-by-case basis (i. e. use a look-up table for integers). As such, I'd recommend against a default implementation in an interface.
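For what it's worth, here is a rough sketch of mine (not from the answer) of how a Newton/Heron iteration for the square root could be written against the base functions alone. It assumes integer-like semantics (divideAndRemainder truncates) and a strictly positive this, and it returns the floor of the square root, which is exact for perfect squares:
public default C sqrtFloor() {
    C one = multiplicativeIdentity();
    C two = one.add(one);
    @SuppressWarnings("unchecked")
    C x = (C) this;                              // initial guess: the value itself
    C prev;
    do {
        prev = x;
        C q = this.divideAndRemainder(x)[0];     // this / x
        x = x.add(q).divideAndRemainder(two)[0]; // x = (x + this/x) / 2
    } while (x.compareTo(prev) < 0);             // stop once the iterate no longer shrinks
    return prev;
}
For integer-like inputs the loop runs on the order of O(log n) iterations, each costing one division and one addition, which gives a feel for the complexity asked about below.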
What would be an estimation of the computational complexity?
This, too, is implementation-specific: it varies based on your type of number and is dependent on your precision. For integers, you can create an implementation with a look-up table, and the complexity would be O(1).
If you want a better answer for the complexity of the operation itself, you might want to check out Computational complexity of calculating the nth root of a real number.

Explanation of IntelliJ's default hashCode() implementation

I took a look at the IntelliJ default hashCode() implementation and was wondering why they implemented it the way they did. I'm quite new to the hashing concept and found some contradictory statements that need clarification:
public int hashCode() {
    // creationDate is of type Date
    int result = this.creationDate != null ? this.creationDate.hashCode() : 0;
    // id is of type Long (wrapper class)
    result = 31 * result + (this.id != null ? this.id.hashCode() : 0);
    // code is of type String
    result = 31 * result + (this.code != null ? this.code.hashCode() : 0);
    // revision is of type int
    result = 31 * result + this.revision;
    return result;
}
Imo, the best source on this topic seemed to be this JavaWorld article, because I found their arguments most convincing. So I was wondering:
Among other arguments, the above source states that multiplication is one of the slower operations. So, wouldn't it be better to skip the multiplication by a prime number whenever I call the hashCode() method of a reference type? Because most of the time this already includes such a multiplication.
JavaWorld states that bitwise XOR ^ also improves the computation, for reasons that are not mentioned :( What exactly might be an advantage in comparison to regular addition?
Wouldn't it be better to return different values when the respective class field is null? It would make the result more distinguishable, wouldn't it? Are there any huge disadvantages to use non-zero values?
Their example code looks more appealing to my eye, tbh:
public int hashCode() {
    return (name == null ? 17 : name.hashCode()) ^
           (birth == null ? 31 : birth.hashCode());
}
But I'm not sure if that's objectively true. I'm also a little bit suspicious of IntelliJ because their default code for equals(Object) compares by instanceof instead of comparing the instance classes directly. And I agree with that Java world article that this doesn't seem to fulfill the contract correctly.
As for hashCode(), I would consider it more important to minimize collisions (two different objects having same hashCode()) than the speed of the hashCode() computation. Yes, the hashCode() should be fast (constant-time if possible), but for huge data structures using hashCode() (maps, sets etc.) the collisions are more important factor.
If your hashCode() function performs in constant time (independent on data and input size) and produces a good hashing function (few collisions), asymptotically the operations (get, contains, put) on map will perform in constant time.
If your hashCode() function produces a lot of collisions, the performance will suffer. In the extreme case, you can always return 0 from hashCode() - the function itself will be super-fast, but the map operations will perform in linear time (i.e. growing with map size).
Multiplying the hashCode() before adding another field's sub-hashCode should usually provide for fewer collisions - this is a heuristic based on the observation that fields often contain similar data / small numbers.
Consider an example of class Person:
class Person {
    int age;
    int heightCm;
    int weightKg;
}
If you just added the numbers together to compute the hashCode, the result would be somewhere between 60 and 500 for all persons. If you multiply them the way IDEA does, you will get hashCodes between 2000 and more than 100000 - a much bigger space and therefore a lower chance of collisions.
Using XOR is not a very good idea, for example if you have class Rectangle with fields height and width, all squares would have the same hashCode - 0.
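A quick sketch of that Rectangle pitfall (the class is made up for illustration):
class Rectangle {
    final int height;
    final int width;

    Rectangle(int height, int width) {
        this.height = height;
        this.width = width;
    }

    @Override
    public int hashCode() {
        return height ^ width;   // any square (height == width) hashes to 0
    }
}
// new Rectangle(5, 5).hashCode() == 0
// new Rectangle(7, 7).hashCode() == 0  -> every square lands in the same bucket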
As for equals() using instanceof vs. getClass().equals(), I've never seen a conclusive debate on this. Both have their advantages and disadvantages, and both ways can cause troubles if you're not careful:
If you use instanceof, any subclass that overrides your equals() will likely break the symmetry requirement (see the sketch after this list)
If you use getClass().equals(), this will not work well with some frameworks like Hibernate that produce their own subclasses of your classes to store their own technical information
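The first pitfall (instanceof breaking symmetry) is the classic Point/ColorPoint example from Effective Java; a condensed sketch of mine, not taken from the answer itself:
class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return p.x == x && p.y == y;
    }
    @Override public int hashCode() { return 31 * x + y; }
}

class ColorPoint extends Point {
    final int color;
    ColorPoint(int x, int y, int color) { super(x, y); this.color = color; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof ColorPoint)) return false;
        return super.equals(o) && ((ColorPoint) o).color == color;
    }
}
// new Point(1, 2).equals(new ColorPoint(1, 2, 0))  -> true
// new ColorPoint(1, 2, 0).equals(new Point(1, 2))  -> false: symmetry is broken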

A more fundamental reason Java does not include operator overloading (at least for assignment)?

Referring to a ~2-year old discussion of the fact that there is no operator overloading in Java ( Why doesn't Java offer operator overloading? ), and coming from many intense C++ years myself to Java, I wonder whether there is a more fundamental reason that operator overloading is not part of the Java language, at least in the case of assignment, than the highest-rated answer in that link states near the bottom of the answer (namely, that it was James Gosling's personal choice).
Specifically, consider assignment.
// C++
#include <iostream>

class MyClass
{
public:
    int x;
    MyClass(const int _x) : x(_x) {}
    MyClass & operator=(const MyClass & rhs) { x = rhs.x; return *this; }
};

int main()
{
    MyClass myObj1(1), myObj2(2);
    MyClass & myRef = myObj1;
    myRef = myObj2;
    std::cout << "myObj1.x = " << myObj1.x << std::endl;
    std::cout << "myObj2.x = " << myObj2.x << std::endl;
    return 0;
}
The output is:
myObj1.x = 2
myObj2.x = 2
In Java, however, the line myRef = myObj2 (assuming the declaration of myRef in the previous line was MyClass myRef = myObj1, as Java requires, since all such variables are automatically Java-style 'references') behaves very differently - it would not cause myObj1.x to change and the output would be
myObj1.x = 1
myObj2.x = 2
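For comparison, a sketch of the equivalent Java program (my own rendering of the example above):
class MyClass {
    int x;
    MyClass(int x) { this.x = x; }
}

public class Main {
    public static void main(String[] args) {
        MyClass myObj1 = new MyClass(1), myObj2 = new MyClass(2);
        MyClass myRef = myObj1;   // myRef is a reference, not an alias to myObj1's storage
        myRef = myObj2;           // rebinds the reference; myObj1 is untouched
        System.out.println("myObj1.x = " + myObj1.x);   // myObj1.x = 1
        System.out.println("myObj2.x = " + myObj2.x);   // myObj2.x = 2
    }
}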
This difference between C++ and Java leads me to think that the absence of operator overloading in Java, at least in the case of assignment, is not a 'matter of personal choice' on the part of James Gosling, but rather a fundamental necessity given Java's syntax that treats all object variables as references (i.e. MyClass myRef = myObj1 defines myRef to be a Java-style reference). I say this because if assignment in Java causes the left-hand side reference to refer to a different object, rather than allowing the possibility that the object itself change its value, then it would seem that there is no possibility of providing an overloaded assignment operator.
In other words - it's not simply a 'choice', and there's not even the possibility of 'holding your breath' with the hope that it will ever be introduced, as the aforementioned high-rated answer also states (near the end). Quoting: "The reasons for not adding them now could be a mix of internal politics, allergy to the feature, distrust of developers (you know, the saboteur ones), compatibility with the previous JVMs, time to write a correct specification, etc.. So don't hold your breath waiting for this feature.". <-- So this isn't correct, at least for the assignment operator: the reason there's no operator overloading (at least for assignment) is instead fundamental to the nature of Java.
Is this a correct assessment on my part?
ADDENDUM
Assuming the assignment operator is a special case, then my follow-up question is: Are there any other operators, or more generally any other language features, that would by necessity be affected in a similar way as the assignment operator? I would like to know how 'deep' the difference goes between Java and C++ regarding variables-as-values/references. i.e., in C++, variable tokens represent values (and note, even if the variable token was declared initially as a reference, it's still treated as a value essentially wherever it's used), whereas in Java, variable tokens represent honest-to-goodness references wherever the token is later used.
There is a big misconception when talking about similarities and differences between Java and C++ that arises in your question. C++ references and Java references are not the same. In Java a reference is a resettable proxy to the real object, while in C++ a reference is an alias to the object. To put it in C++ terms, a Java reference is a garbage-collected pointer, not a reference. Now, going back to your example, to write equivalent code in C++ and Java you would have to use pointers:
int main() {
    type a(1), b(2);
    type *pa = &a, *pb = &b;
    pa = pb;
    // a is still 1, b is still 2, pa == pb == &b
}
Now the examples are the same: the assignment operator is being applied to the pointers to the objects, and in that particular case you cannot overload the operator in C++ either. It is important to note that operator overloading can be easily abused, and that is a good reason to avoid it in the first place. Now if you add the two different types of entities: objects and references, things become more messy to think about.
If you were allowed to overload operator= for a particular object in Java, then you would not be able to have multiple references to the same object, and the language would be crippled:
Type a = new Type(1);
Type b = new Type(2);
a = b; // dispatched to Type.operator=( Type )??
a.foo();
a = new Type(3); // do you want to copy Type(3) into a, or work with a new object?
That in turn would make the type unusable in the language: containers store references, and they reassign them (even the first time, just when an object is created); functions don't really use pass-by-reference semantics, but rather pass the references by value (which is a completely different issue; again, the difference is void foo( type* ) versus void foo( type& )): the proxy entity is copied, and you cannot modify the reference passed in by the caller.
The problem is that the language is trying really hard to hide the fact that a and the object that a refers to are not the same thing (same happens in C#), and that in turn means that you cannot explicitly state that one operation is to be applied to the reference/referent, that is resolved by the language. The outcome of that design is that any operation that can be applied to references can never be applied to the objects themselves.
As for the rest of the operators, the decision is most probably arbitrary; because the language hides the reference/object difference, it could have been designed such that a+b was translated into type* operator+( type*, type* ) by the compiler. Since you cannot use arithmetic on the references themselves, there would be no problem, as the compiler would recognize that a+b is an operation that must be applied to the objects (it does not make sense with references). But then it could be considered a little awkward that you can overload +, but you cannot overload =, ==, !=...
That is the path that C# took, where assignment cannot be overloaded for reference types. Interestingly, in C# there are value types, and the sets of operators that can be overloaded for reference and value types are different. Not having coded C# in large projects, I cannot really tell whether that is an actual source of confusion or whether people are just used to it (but if you search SO, you will find that a few people do ask why X cannot be overloaded in C# for reference types, where X is one of the operations that can be applied to the reference itself).
That doesn't explain why they couldn't have allowed overloading of other operators like + or -. Considering James Gosling designed the Java language, and he said it was his personal choice, which he explains in more detail at the link provided in the question you linked, I think that's your answer:
There are some things that I kind of feel torn about, like operator overloading. I left out operator overloading as a fairly personal choice because I had seen too many people abuse it in C++. I've spent a lot of time in the past five to six years surveying people about operator overloading and it's really fascinating, because you get the community broken into three pieces: Probably about 20 to 30 percent of the population think of operator overloading as the spawn of the devil; somebody has done something with operator overloading that has just really ticked them off, because they've used like + for list insertion and it makes life really, really confusing. A lot of that problem stems from the fact that there are only about half a dozen operators you can sensibly overload, and yet there are thousands or millions of operators that people would like to define -- so you have to pick, and often the choices conflict with your sense of intuition. Then there's a community of about 10 percent that have actually used operator overloading appropriately and who really care about it, and for whom it's actually really important; this is almost exclusively people who do numerical work, where the notation is very important to appealing to people's intuition, because they come into it with an intuition about what the + means, and the ability to say "a + b" where a and b are complex numbers or matrices or something really does make sense. You get kind of shaky when you get to things like multiply because there are actually multiple kinds of multiplication operators -- there's vector product, and dot product, which are fundamentally very different. And yet there's only one operator, so what do you do? And there's no operator for square-root. Those two camps are the poles, and then there's this mush in the middle of 60-odd percent who really couldn't care much either way. The camp of people that think that operator overloading is a bad idea has been, simply from my informal statistical sampling, significantly larger and certainly more vocal than the numerical guys. So, given the way that things have gone today where some features in the language are voted on by the community -- it's not just like some little standards committee, it really is large-scale -- it would be pretty hard to get operator overloading in. And yet it leaves this one community of fairly important folks kind of totally shut out. It's a flavor of the tragedy of the commons problem.
UPDATE: Re: your addendum, the other assignment operators +=, -=, etc. would also be affected. You also can't write a swap function like void swap(int *a, int *b), and other stuff.
Is this a correct assessment on my part?
The lack of operator in general is a "personal choice". C#, which is a very similar language, does allow operator overloading. But you still can't overload assignment. What would that even do in a reference-semantics language?
Are there any other operators, or more generally any other language features, that would by necessity be affected in a similar way as the assignment operator? I would like to know how 'deep' the difference goes between Java and C++ regarding variables-as-values/references.
The most obvious is copying. In a reference-semantics language, clone() isn't that common, and isn't needed at all for immutable types like String. But in C++, where the default assignment semantics are based around copying, copy constructors are very common. And automatically generated if you don't define one.
A more subtle difference is that it's a lot harder for a reference-semantics language to support RAII than a value-semantics language, because object lifetime is harder to track. Raymond Chen has a good explanation.
The reason why operator overloading is abused in the C++ language is that it's too complex a feature. Here are some aspects of it that make it complex:
expressions are a tree
operator overloading is the interface/documentation for those expressions
interfaces are basically invisible feature in c++
free functions/static functions/friend functions are a big mess in C++
function prototypes are already complex feature
choice of the syntax for operator overloading is less than ideal
there is no other comparable api in c++ language
user-defined types/function names are handled differently than built-in types/function names in function prototypes
it uses advanced math, like the operator<<(ostream&, ostream & (*fptr)(ostream &));
even the simplest examples of it uses polymorphism
It's the only c++ feature that has 2d array in it
this-pointer is invisible and whether your operators are member functions or outside the class is important choice for programmers
Because of this complexity, only a very small number of programmers actually understand how it works. I'm probably missing many important aspects, but the list above is a good indication that it is a very complex feature.
Update: some explanation about #4: the argument is pretty much as follows:
class A { friend void f(); }; class B { friend void f(); }
void f() { /* use both A and B members inside this function */ }
With static functions, you can do this:
class A { static void f(); }; void f() { /* use only class A here */ }
And with free functions:
class A { }; void f() { /* you have no special access to any classes */ }
Update #2: For #10, the example I was thinking of looks like this in the stdlib:
ostream &operator<<(ostream &o, std::string s) { ... } // inside stdlib
int main() { std::cout << "Hello World" << std::endl; }
Now the polymorphism in this example happens because you can choose between std::cout, std::ofstream, and std::stringstream. This is possible because operator<<'s first parameter takes a reference to ostream. This is normal runtime polymorphism in this example.
Update #3: About the prototypes still. The real interaction between operator overloading and prototypes is that the overloaded operators become part of the class' interface. This brings us to the 2d array thing, because inside the compiler the class interface is a 2d data structure which holds quite complex data, including booleans, types, and function names. Rule #4 is needed so that you can choose when your operators are inside this 2d data structure and when they're outside of it. Rule #8 deals with the booleans stored in the 2d data structure. Rule #7 is because the class' interface is used to represent elements of an expression tree.

Java, Object.hashCode() result constant across all JVMs/Systems?

Is the output of Object.hashCode() required to be the same on all JVM implementations for the same Object?
For example if "test".hashCode() returns 1 on 1.4, could it potentially return 2 running on 1.6. Or what if the operating systems were different, or there was a different processor architecture between instances?
No. The output of hashCode is liable to change between JVM implementations and even between different executions of a program on the same JVM.
However, in the specific example you gave, the value of "test".hashCode() will actually be consistent because the implementation of hashCode for String objects is part of the API of String (see the Javadocs for java.lang.String and this other SO post).
From the API
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)
No, the result of hashCode() is only constant during a single execution. You should not expect the result of the function to be the same between executions, let alone between JRE versions or platforms.
First of all, the result of hashCode depends heavily on the object type and its implementation. Every class, including its subclasses, can define its own behavior. You can rely on it following the general contract as outlined in the Javadoc as well as in other answers, but the value is not required to stay the same after a VM restart, especially if it depends on the hashCode implementations of third-party classes.
When referring to the concrete implementation of the String class, you should not depend on the return value. If your program is executed in a different VM, it could potentially change.
If you refer solely to the Sun VM, it could be argued that Sun will not break - even badly programmed - existing code. So "test".hashCode() will always return exactly 3556498 for any version of the Sun VM.
If you want to deliberately shoot yourself in the foot, go ahead and depend on this. People who will need to fix your code running on the "2015 Nintendo Java VM for Hairdryer" will cry out your name at night.
As noted, for many implementations the default behavior of hashCode() is to return the address of the object. Obviously this can be different each time the program is run. This is also consistent with the default behavior of equals(): two objects are equal only if they are the same object (where x and y are both non-null, x.equals(y) if and only if x == y).
For any classes where hashCode() and equals() are overridden, generally they are calculated in a deterministic way based on the values of some or all of the members. Thus, in practice it is likely that if an object in one run of the program can be said to be equal to an object in another run of the program, and the source code is the same (including such things as the source code for String.hashCode() if that is called by the hashCode() override), the hash codes will be the same.
It is not guaranteed, although it is hard to think of a reasonable real-world example.
The only truth: the hashcode is the same within a single run of the application. Another run may give different hashcodes.
When you ask for an object's hashcode, the JVM creates it using one of several RNG algorithms and puts it in the object's header for future use.
Just look at the get_next_hash function in OpenJDK.
The RNG algorithm is configurable with JVM arg -XX:hashCode=x,
where x is a digit:
0 – Park-Miller RNG (default)
1 – f (address, the global)
2 – constant 1
3 – sequential counter
4 – object's address in heap
5 – Xorshift (the fastest)
When the hashcode equals the address in the heap, this is sometimes awkward, because the GC can move objects to other heap cells, etc.
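A tiny demo of mine illustrating the per-run stability (and the flag described above):
public class IdentityHashDemo {
    public static void main(String[] args) {
        Object o = new Object();
        // The default Object.hashCode() is stable within this run...
        System.out.println(o.hashCode());
        System.out.println(o.hashCode());   // same value again
        // ...but a new run (or another JVM) may print something entirely different.
    }
}
// Per the table above, running with e.g.  java -XX:hashCode=2 IdentityHashDemo
// should make every identity hash the constant 1.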
