What I would like is to create a Hashtable that takes a pair of integers as a key and maps it to a pair of (Integer, String). So, given 2 specific integers, I would like to be able to retrieve both the Integer and the String they map to. What I have done so far:
class Tuple {
public Tuple (int x, int y) {
this.x = x;
this.y = y;
}
public int k;
@Override
public int hashCode() {
int hash = 17;
hash = 5 * hash + this.x;
hash = 5 * hash + this.y;
return hash;
}
@Override
public boolean equals(Object o) {
if (o == this) {
return true;
}
if (!(o instanceof Tuple)) {
return false;
}
Tuple c = (Tuple) o;
// x and y are ints, so they can be compared directly
return x == c.x
&& y == c.y;
}
private int x;
private int y;
}
I created a Tuple class for the key and I initialized the map like this:
HashMap<Tuple, String> seen = new HashMap<Tuple,String>();
where the String value (call it String1) has the form Integer + "#" + String.
So for a specific key (i, j) I can retrieve this String and split it into the Integer and the String. For many very large integers, though, String1 takes up a lot of memory, since I think every additional character in the string needs 2 more bytes. So storing 100000 would need 12 bytes instead of the 4 bytes of an int. I tried creating a class similar to Tuple with y being a String, but I can't retrieve both values by having only the (i, j) key. Any ideas?
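One possible direction, as a minimal sketch rather than a definitive answer: keep the pair of ints as the key and store the (int, String) pair in a small value class, so the number stays a 4-byte int and nothing has to be parsed back out of a String. The names PairLookupSketch, IntPair and IntAndString are made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class PairLookupSketch {
    /** Immutable key made of two ints (plays the role of Tuple). */
    static final class IntPair {
        final int x;
        final int y;

        IntPair(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof IntPair)) return false;
            IntPair other = (IntPair) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    /** Value that holds the int and the String side by side. */
    static final class IntAndString {
        final int number;
        final String text;

        IntAndString(int number, String text) {
            this.number = number;
            this.text = text;
        }
    }

    public static void main(String[] args) {
        Map<IntPair, IntAndString> seen = new HashMap<>();
        seen.put(new IntPair(3, 7), new IntAndString(100000, "hello"));

        // A fresh but equal key finds the stored value
        IntAndString hit = seen.get(new IntPair(3, 7));
        System.out.println(hit.number + " / " + hit.text);
    }
}
```

Both parts of the value are then available from a single lookup with the (i, j) key.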
Related
If I was using a Set similar to this:
Set<node> s=new TreeSet<node>();
class node {
private int x;
private int y;
}
Would this be acceptable, and since it's a TreeSet, would it also sort it?
It's not going to be able to sort it without you implementing Comparable<Node>, and it won't really be appropriate for set operations until you override equals() and hashCode(). (You don't have to override equals and hashCode for TreeSet to work, but it would make sense to do so.)
Something like this:
final class Node implements Comparable<Node> {
private final int x;
private final int y;
Node(int x, int y) {
this.x = x;
this.y = y;
}
@Override public boolean equals(Object other) {
if (!(other instanceof Node)) {
return false;
}
Node otherNode = (Node) other;
return x == otherNode.x && y == otherNode.y;
}
@Override public int hashCode() {
return x * 31 + y * 17; // For example...
}
@Override public int compareTo(Node other) {
// As of Java 7, this can be replaced with
// return x != other.x ? Integer.compare(x, other.x)
// : Integer.compare(y, other.y);
if (x < other.x || (x == other.x && y < other.y)) {
return -1;
}
return x == other.x && y == other.y ? 0 : 1;
}
}
(Note that by convention the class name would be Node, not node.)
Node needs to implement Comparable, or you need to pass a custom Comparator that can compare two Node objects. Also, any hash-based collection relies on the object suitably overriding the equals() and hashCode() methods.
You have to override equals and hashCode and implement the Comparable interface.
There is nothing wrong with the code as far as acceptance is concerned. But for sorting, the Node class MUST implement the Comparable interface.
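For the custom Comparator route mentioned above, here is a small sketch. It assumes Java 8 or later for Comparator.comparingInt; the class ComparatorTreeSetSketch and the constant BY_X_THEN_Y are names invented for this example.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

final class ComparatorTreeSetSketch {
    /** Node does not implement Comparable here; ordering comes from the Comparator. */
    static final class Node {
        final int x;
        final int y;

        Node(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public String toString() {
            return "(" + x + "," + y + ")";
        }
    }

    // Orders by x first, then by y
    static final Comparator<Node> BY_X_THEN_Y =
            Comparator.<Node>comparingInt(n -> n.x).thenComparingInt(n -> n.y);

    static Set<Node> sortedSet() {
        Set<Node> s = new TreeSet<>(BY_X_THEN_Y);
        s.add(new Node(2, 1));
        s.add(new Node(1, 5));
        s.add(new Node(1, 2));
        s.add(new Node(1, 5)); // compares as equal to an existing element, so it is not added
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sortedSet()); // [(1,2), (1,5), (2,1)]
    }
}
```

Note that a TreeSet decides "same element" through the comparator, not through equals(), which is why the duplicate is dropped even though Node does not override equals() in this sketch.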
I have a class with two float variables and a hashCode method (equals is omitted from this snippet):
public class TestPoint2D {
private float x;
private float z;
public TestPoint2D(float x, float z) {
this.x = x;
this.z = z;
}
@Override
public int hashCode() {
int result = (x != +0.0f ? Float.floatToIntBits(x) : 0);
result = 31 * result + (z != +0.0f ? Float.floatToIntBits(z) : 0);
return result;
}
}
The following test
@Test
public void tempTest() {
TestPoint2D p1 = new TestPoint2D(3, -1);
TestPoint2D p2 = new TestPoint2D(-3, 1);
System.out.println(p1.hashCode());
System.out.println(p2.hashCode());
}
prints the same value for both points:
-2025848832
In this case I can't use my TestPoint2D within a HashSet / HashMap.
Can anyone suggest how to implement hashCode in this case, or a workaround for this?
P.S.
Added one more test:
@Test
public void hashCodeTest() {
for (float a = 5; a < 100000; a += 1.5f) {
float b = a + 1000 / a; // negative value depends on a
TestPoint2D p1 = new TestPoint2D(a, -b);
TestPoint2D p2 = new TestPoint2D(-a, b);
Assert.assertEquals(p1.hashCode(), p2.hashCode());
}
}
It passes, which proves that
TestPoint2D(a, -b).hashCode() == TestPoint2D(-a, b).hashCode()
I would use Objects.hash():
public int hashCode() {
return Objects.hash(x, z);
}
From the Javadoc:
public static int hash(Object... values)
Generates a hash code for a sequence of input values. The hash code is generated as if all the input values were placed into an array, and that array were hashed by calling Arrays.hashCode(Object[]).
This method is useful for implementing Object.hashCode() on objects containing multiple fields. For example, if an object that has three fields, x, y, and z, one could write:
@Override public int hashCode() {
    return Objects.hash(x, y, z);
}
These auto-generated hashcode functions are not very good.
The problem is that small integers cause very "sparse" and similar bit patterns.
To understand the problem, look at the actual computation.
System.out.format("%x\n", Float.floatToIntBits(1));
System.out.format("%x\n", Float.floatToIntBits(-1));
System.out.format("%x\n", Float.floatToIntBits(3));
System.out.format("%x\n", Float.floatToIntBits(-3));
gives:
3f800000
bf800000
40400000
c0400000
As you can see, the sign is the most significant bit in IEEE floats. Multiplication by 31 does not change them substantially:
b0800000
30800000
c7c00000
47c00000
The problem is all the 0s at the end. They are preserved by integer multiplication with any prime (because they are base-2 0s, not base-10!).
So IMHO, the best strategy is to employ bit shifts, e.g.:
final int h1 = Float.floatToIntBits(x);
final int h2 = Float.floatToIntBits(z);
return h1 ^ ((h2 >>> 16) | (h2 << 16));
But you may want to look at "Which hashing algorithm is best for uniqueness and speed?" and test for your particular case of integers-as-float.
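To see the effect on the pair from the question, here is a small sketch that compares the multiply-by-31 scheme with the rotation above. The class and method names are invented for the illustration. It also checks Objects.hash, which (via Arrays.hashCode) uses the same multiply-by-31 accumulation and therefore collides on this symmetrical pair too.

```java
import java.util.Objects;

final class SymmetricHashSketch {
    // Multiply-by-31 scheme in the style of the IDE-generated hashCode
    static int multiplyHash(float x, float z) {
        return 31 * Float.floatToIntBits(x) + Float.floatToIntBits(z);
    }

    // Rotation scheme: Integer.rotateLeft(h2, 16) equals (h2 >>> 16) | (h2 << 16)
    static int rotateHash(float x, float z) {
        int h1 = Float.floatToIntBits(x);
        int h2 = Float.floatToIntBits(z);
        return h1 ^ Integer.rotateLeft(h2, 16);
    }

    public static void main(String[] args) {
        // true: the symmetrical points collide
        System.out.println(multiplyHash(3, -1) == multiplyHash(-3, 1));
        // true: Objects.hash accumulates with 31 as well
        System.out.println(Objects.hash(3f, -1f) == Objects.hash(-3f, 1f));
        // false: the rotation moves the sign bit of z away from the sign bit of x
        System.out.println(rotateHash(3, -1) == rotateHash(-3, 1));
    }
}
```

This only checks one pair; whether the rotation distributes well for a whole data set still has to be tested on that data.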
According to the Java specification, 2 objects can have the same hashCode, and this doesn't mean they are equal...
The probability is small, but it exists...
On the other hand, it is always good practice to override both equals and hashCode...
As I understand the problem, you expect a lot of symmetrical pairs of points among your keys, so you need a hashCode method that does not tend to give them the same code.
I did some tests, and deliberately giving extra significance to the sign of x tends to map symmetrical points away from each other. See this test program:
public class Test {
private float x;
private float y;
public static void main(String[] args) {
int collisions = 0;
for (int ix = 0; ix < 100; ix++) {
for (int iz = 0; iz < 100; iz++) {
Test t1 = new Test(ix, -iz);
Test t2 = new Test(-ix, iz);
if (t1.hashCode() == t2.hashCode()) {
collisions++;
}
}
}
System.out.println(collisions);
}
public Test(float x, float y) {
super();
this.x = x;
this.y = y;
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = (x >= 0) ? 1 : -1;
result = prime * result + Float.floatToIntBits(x);
result = prime * result + Float.floatToIntBits(y);
return result;
}
// Equals omitted for compactness
}
Without the result = (x >= 0) ? 1 : -1; line it is the hashCode() generated by Eclipse, and counts 9802 symmetrical point collisions. With that line, it counts one symmetrical point collision.
So, I am writing a Befunge interpreter in Java. I have almost all of it down, except I can't figure out a good solution to the problem of Funge Space. Currently I'm using the style stated in the Befunge 93 specs, which is an 80x25 array to hold the code.
In Funge, though, I'm supposed to have an "infinite" array of code (or 4,294,967,296 x 4,294,967,296, which is -2,147,483,648 to 2,147,483,647 in both dimensions), but obviously it's never a good idea to have that much space allocated. Besides, it doesn't seem like a good idea to create a new array and copy every character into it every time the program steps out of bounds. Is there a solution to this problem that I'm missing?
So basically, my problem is that I need to somehow expand the array every time I reach out of bounds, or use some sort of other data structure. Any suggestions?
Funge 98 specs
Also, by the way, I still have never figured out how to pronounce Befunge or Funge; I always just say it like "Bee-funj" and "funj".
Without having read the specs (no - I mean, just NO!): A 4,294,967,296 x 4,294,967,296 array is obviously a highly theoretical construct, and only a tiny fraction of these "array entries" can and will ever be used.
Apart from that: Regardless of whether you use an array or any other collection, you'll have a problem with indexing: Array indices can only be int values, but 4,294,967,296 is twice as large as Integer.MAX_VALUE (there are no unsigned ints in Java).
However, one way of representing such an "infinitely large" sparse 2D array would be a map that maps pairs of long values (the x and y coordinates) to the array entries. Roughly like this:
import java.util.HashMap;
import java.util.Map;
interface Space<T>
{
void set(long x, long y, T value);
T get(long x, long y);
}
class DefaultSpace<T> implements Space<T>
{
private final Map<LongPair, T> map = new HashMap<LongPair, T>();
@Override
public void set(long x, long y, T value)
{
LongPair key = new LongPair(x,y);
if (value == null)
{
map.remove(key);
}
else
{
map.put(key, value);
}
}
@Override
public T get(long x, long y)
{
return map.get(new LongPair(x,y));
}
}
class LongPair
{
private final long x;
private final long y;
LongPair(long x, long y)
{
this.x = x;
this.y = y;
}
@Override
public String toString()
{
return "("+x+","+y+")";
}
@Override
public int hashCode()
{
final int prime = 31;
int result = 1;
result = prime * result + (int) (x ^ (x >>> 32));
result = prime * result + (int) (y ^ (y >>> 32));
return result;
}
@Override
public boolean equals(Object obj)
{
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
LongPair other = (LongPair) obj;
if (x != other.x)
return false;
if (y != other.y)
return false;
return true;
}
}
I was trying to do research on hashmap and came up with the following analysis:
https://stackoverflow.com/questions/11596549/how-does-javas-hashmap-work-internally/18492835#18492835
Q1: Can you show me, using a simple map, how the hash code for a given key is calculated in detail, and how the position (bucket number) where the element should be placed is derived with the formula hash % (arrayLength-1)? Let's say I have this HashMap:
HashMap map=new HashMap();//HashMap key random order.
map.put("Amit","Java");
map.put("Saral","J2EE");
Q2: Sometimes the hash codes of 2 different objects are the same. In this case the 2 objects are saved in one bucket and are represented as a linked list. The entry point is the most recently added object. This object refers to the other object through its next field, and so on. The last entry refers to null. Can you show me this with a real example?
"Amit" will be distributed to the 10th bucket, because of the bit twiddling. If there were no bit twiddling it would go to the 7th bucket, because 2044535 & 15 = 7. How is this possible? Please explain the whole calculation in detail.
that how hashcode for the given key is calculated in detail by using
this formula
In case of String this is calculated by String#hashCode(); which is implemented as follows:
public int hashCode() {
int h = hash;
int len = count;
if (h == 0 && len > 0) {
int off = offset;
char val[] = value;
for (int i = 0; i < len; i++) {
h = 31*h + val[off++];
}
hash = h;
}
return h;
}
This basically follows the equation in the Javadoc:
hashcode = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
One interesting thing to note on this implementation is that String actually caches its hash code. It can do this, because String is immutable.
If I calculate the hash code of the String "Amit", it yields this integer:
System.out.println("Amit".hashCode());
> 2044535
Let's get through a simple put to a map, but first we have to determine how the map is built.
The most interesting fact about a Java HashMap is that it always has 2^n buckets. So if you create one with the default constructor, the number of buckets is 16, which is 2^4.
When doing a put operation on this map, it first gets the hash code of the key. Some fancy bit twiddling is then applied to this hash code to ensure that poor hash functions (especially those that do not differ in the lower bits) don't "overload" a single bucket.
The real function that is actually responsible for distributing your key to the buckets is the following:
h & (length-1); // length is the current number of buckets, h the hashcode of the key
This only works for power of two bucket sizes, because it uses & to map the key to a bucket instead of a modulo.
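To make the bucket computation concrete for "Amit", here is a sketch that runs the numbers. The mixing step in legacySpread is the supplemental hash as I recall it from the JDK 6 HashMap source, so treat the version as an assumption; newer JDKs mix the bits differently and may pick another bucket. The class and method names are invented for this example.

```java
final class BucketIndexSketch {
    // Supplemental hash ("bit twiddling") of the JDK 6 era HashMap (assumed version)
    static int legacySpread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Maps a hash to a bucket; only valid when length is a power of two
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int h = "Amit".hashCode();                             // 2044535
        System.out.println(h);
        System.out.println(indexFor(h, 16));                   // 7, without the mixing step
        System.out.println(indexFor(legacySpread(h), 16));     // 10, with the mixing step
    }
}
```

With 16 buckets, h & 15 simply keeps the lowest 4 bits of the hash, which is why the mixing step matters: it folds higher bits down into those 4 bits.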
"Amit" will be distributed to the 10th bucket, because of the bit twiddeling. If there were no bit twiddeling it would go to the 7th bucket, because 2044535 & 15 = 7.
Now that we have an index for it, we can find the bucket. If the bucket contains elements, we have to iterate over them and replace an equal entry if we find it.
If no equal item is found in the linked list, we just add the new entry at the beginning of the linked list.
The next important thing in HashMap is resizing: if the actual size of the map rises above a threshold (determined by the current number of buckets and the load factor, in our case 16*0.75=12), it resizes the backing array.
A resize always doubles the current number of buckets, so the count is guaranteed to remain a power of two and the function that finds the bucket keeps working.
Since the number of buckets change, we have to rehash all the current entries in our table.
This is quite costly, so if you know how many items there will be, you should initialize the HashMap with that count so it does not have to resize repeatedly.
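Because the resize threshold is the number of buckets times the load factor, passing the exact item count as the initial capacity may still leave room for one resize. A rough sketch of sizing for the default load factor of 0.75 follows; the helper name capacityFor is made up, and the map may round the requested capacity up to a power of two internally.

```java
import java.util.HashMap;
import java.util.Map;

final class PresizeSketch {
    static final double DEFAULT_LOAD_FACTOR = 0.75;

    // Smallest capacity whose threshold (capacity * 0.75) still covers the expected entries
    static int capacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / DEFAULT_LOAD_FACTOR);
    }

    public static void main(String[] args) {
        int expected = 1000;
        Map<String, String> map = new HashMap<>(capacityFor(expected));
        for (int i = 0; i < expected; i++) {
            map.put("key" + i, "value" + i);
        }
        System.out.println(capacityFor(expected) + " requested, " + map.size() + " stored");
    }
}
```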
Q1: Look at the hashCode() method implementation of the String class.
Q2: Create a simple class and implement its hashCode() method as return 1. That means every object of that class will have the same hash code and will therefore be saved in the same bucket in the HashMap.
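A minimal sketch of that Q2 experiment, with a hypothetical key class ConstantHashKey whose hashCode() always returns 1. Every key lands in the same bucket, and the map falls back on equals() to tell the entries apart.

```java
import java.util.HashMap;
import java.util.Map;

final class CollisionSketch {
    /** Hypothetical key whose hash code is constant, so all keys share one bucket. */
    static final class ConstantHashKey {
        final String name;

        ConstantHashKey(String name) {
            this.name = name;
        }

        @Override
        public int hashCode() {
            return 1;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof ConstantHashKey && name.equals(((ConstantHashKey) o).name);
        }
    }

    static Map<ConstantHashKey, String> build() {
        Map<ConstantHashKey, String> map = new HashMap<>();
        map.put(new ConstantHashKey("Amit"), "Java");
        map.put(new ConstantHashKey("Saral"), "J2EE");
        return map;
    }

    public static void main(String[] args) {
        Map<ConstantHashKey, String> map = build();
        System.out.println(map.size());                             // both entries are kept
        System.out.println(map.get(new ConstantHashKey("Saral")));  // found via equals()
    }
}
```

The map still behaves correctly; only the lookup cost grows, because every operation has to walk the entries of that one bucket.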
Understand that there are two basic requirements for a hash code:
When the hash code is recalculated for a given object (that has not been changed internally in a way that would alter its identity) it must produce the same value as the previous calculation. Similarly, two "identical" objects must produce the same hash codes.
When the hash code is calculated for two different objects (which are not considered "identical" from the standpoint of their internal content) there should be a high probability that the two hash codes would be different.
How these goals are accomplished is the subject of much interest to the math nerds who work on such things, but understanding the details is not at all important to understanding how hash tables work.
import java.util.Arrays;
public class Test2 {
public static void main(String[] args) {
Map<Integer, String> map = new Map<Integer, String>();
map.put(1, "A");
map.put(2, "B");
map.put(3, "C");
map.put(4, "D");
map.put(5, "E");
System.out.println("Iterate");
for (int i = 0; i < map.size(); i++) {
System.out.println(map.values()[i].getKey() + " : " + map.values()[i].getValue());
}
System.out.println("Get-> 3");
System.out.println(map.get(3));
System.out.println("Delete-> 3");
map.delete(3);
System.out.println("Iterate again");
for (int i = 0; i < map.size(); i++) {
System.out.println(map.values()[i].getKey() + " : " + map.values()[i].getValue());
}
}
}
class Map<K, V> {
private int size;
private Entry<K, V>[] entries = new Entry[16];
public void put(K key, V value) {
boolean flag = true;
for (int i = 0; i < size; i++) {
if (entries[i].getKey().equals(key)) {
entries[i].setValue(value);
flag = false;
break;
}
}
if (flag) {
this.ensureCapacity();
entries[size++] = new Entry<K, V>(key, value);
}
}
public V get(K key) {
V value = null;
for (int i = 0; i < size; i++) {
if (entries[i].getKey().equals(key)) {
value = entries[i].getValue();
break;
}
}
return value;
}
public boolean delete(K key) {
boolean flag = false;
Entry<K, V>[] entry = new Entry[size];
int j = 0;
int total = size;
for (int i = 0; i < total; i++) {
if (!entries[i].getKey().equals(key)) {
entry[j++] = entries[i];
} else {
flag = true;
size--;
}
}
entries = flag ? entry : entries;
return flag;
}
public int size() {
return size;
}
public Entry<K, V>[] values() {
return entries;
}
private void ensureCapacity() {
if (size == entries.length) {
entries = Arrays.copyOf(entries, size * 2);
}
}
@SuppressWarnings("hiding")
public class Entry<K, V> {
private K key;
private V value;
public K getKey() {
return key;
}
public V getValue() {
return value;
}
public void setValue(V value) {
this.value = value;
}
public Entry(K key, V value) {
super();
this.key = key;
this.value = value;
}
}
}
This has been asked several times, I know, but help me understand something.
You have a map you need sorted by Value
Map<String, Integer> m = new HashMap<String, Integer>();
m.put("a", 1);
m.put("b", 13);
m.put("c", 22);
m.put("d", 2);
You call a method to make it happen
public static List<String> sortByValue(final Map<String, Integer> unsortedMap) {
List<String> sortedKeys = new ArrayList<String>();
sortedKeys.addAll(unsortedMap.keySet());
Collections.sort(sortedKeys, new MapComparator(unsortedMap));
return sortedKeys;
}
You have a comparator class
class MapComparator implements Comparator<String> {
    private final Map<String, Integer> m;
    public MapComparator(Map<String, Integer> m) {
        this.m = m;
    }
    @Override
    public int compare(String a, String b) {
        int x = m.get(a);
        int y = m.get(b);
        if (x > y)
            return x;
        if (y > x)
            return y;
        return 0;
    }
}
This code is obviously flawed. Please help me understand why.
if (x > y)
return x;
if (y > x)
return y;
return 0;
You should be returning 1 if x > y and -1 if y > x. The Comparator contract specifies that you return a negative number if the first value is less than the second, a positive number if the first is greater than the second, and zero if they're equal.
(Mind you, as it stands, this Comparator implementation will break in very confusing ways if you ever happen to compare keys that aren't in the original map.)
Better yet, just return Integer.compare(x, y), which does all that for you. (Available from Java 7 onward.)
@Override
public int compare(String a, String b) {
Integer x = m.get(a);
Integer y = m.get(b);
return x.compareTo(y);
}
Since you have Integer objects as values, you can use their built-in compareTo method to compare them and return 1, 0 or -1.
Comparators don't return the greater or lesser value. They return a negative value to indicate less-than or a positive value to indicate greater than.
if (x > y)
return x;
if (y > x)
return y;
return 0;
should probably be
if (x > y)
return 1;
if (y > x)
return -1;
return 0;
Your comparator only ever indicates that the values are equal or that the left is greater than the right.
Consider the case where x is 1 and y is 2. Your comparator will return 2—a positive number—when it should have returned a negative number.
I recommend that you study the Comparator interface documentation again to see which part of the contract you're missing out on here.
public static List<String> sortByValue(final Map<String, Integer> unsortedMap) {
List<String> sortedKeys = new ArrayList<String>();
sortedKeys.addAll(unsortedMap.keySet());
Collections.sort(sortedKeys, new Comparator<String>(){
public int compare(String s1, String s2) {
return unsortedMap.get(s1).compareTo(unsortedMap.get(s2));
}});
return sortedKeys;
}
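On Java 8 or later (an assumption, since the answers above target older versions), the same result can be had without a hand-written comparator. A sketch using Map.Entry.comparingByValue(), with SortByValueSketch as an invented class name:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SortByValueSketch {
    // Returns the keys ordered by ascending value
    static List<String> sortByValue(Map<String, Integer> unsortedMap) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(unsortedMap.entrySet());
        entries.sort(Map.Entry.comparingByValue());

        List<String> sortedKeys = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            sortedKeys.add(e.getKey());
        }
        return sortedKeys;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 13);
        m.put("c", 22);
        m.put("d", 2);
        System.out.println(sortByValue(m)); // [a, d, b, c]
    }
}
```

Sorting the entries instead of the keys also avoids a map lookup on every comparison.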