I have this hash function that calculates a key based on the product of a word's ASCII values. It was working fine when I tested it with small words, but then I tried words from a whole text file, and some of them get negative values while others are positive. I understand that this is overflow, but how would I fix it?
EDIT:
Ok, so people are saying that negative hash values are valid. My problem is that I implemented the hash table using an array and I am getting an index out of bounds error due to the negative numbers. What would be the best way to fix this?
public int asciiProduct(String word) {
// sets the calculated value of the word to 1 (the neutral element for multiplication)
int wordProductValue = 1;
// for every letter in the word, it gets the ascii value of it
// and multiplies it to the wordProductValue
for (int i = 0; i < word.length(); i++) {
wordProductValue = wordProductValue * (int) word.charAt(i);
}
// the key of the word calculated by this function will be set
// to the modulus of the wordProductValue with the size of the array
int arrayIndex = wordProductValue % (hashArraySize-1);
return arrayIndex;
}
You can take the absolute value of the result of your integer multiplication, which overflows to a negative number when the product gets too big. Be aware that Math.abs(Integer.MIN_VALUE) is still negative, so this does not cover every case.
wordProductValue = Math.abs(wordProductValue * (int) word.charAt(i));
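Note, however, that in Java the % operator takes the sign of its left operand, so a negative product still yields a negative array index. Math.floorMod(wordProductValue, hashArraySize) always returns a value from 0 to hashArraySize - 1.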
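As a minimal sketch of the index calculation, assuming the table size is passed in as a parameter (the class name AsciiProductIndex and the parameter tableSize are illustrative, not from the question), the overflowed product can be mapped to a valid index like this:

```java
// Hypothetical helper class; names are illustrative only.
class AsciiProductIndex {
    static int index(String word, int tableSize) {
        int product = 1;
        for (int i = 0; i < word.length(); i++) {
            product *= word.charAt(i); // may wrap around to a negative int
        }
        // floorMod returns a value in [0, tableSize) for any int product,
        // whereas product % tableSize can be negative.
        return Math.floorMod(product, tableSize);
    }
}
```

Unlike Math.abs, Math.floorMod also handles Integer.MIN_VALUE, so no special case is needed.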
A negative hash code is perfectly valid. There is nothing wrong with it. No need to "fix".
But may I ask why are you doing this?
word.hashCode() should give you a much better hash value, than this ...
I'd like to know that if I try to get a random integer using the following method, should it return negative value?
int value = new Random().nextInt(bound);
No, new Random().nextInt(bound) only produces numbers from 0 (inclusive) to bound (exclusive), so the result is never negative. If you want a negative number (or zero), multiply the random number by -1.
int number = new Random().nextInt(bound) * -1;
Random().nextInt() on the other hand can return you a negative number.
If you need mix of positive and negative you could use something like this:
int number = new Random().nextInt(bound);
if (number % 2 == 0) {
number *= -1;
}
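Negating only the even values means that every even result is negative or zero and every odd result is positive. If a uniform mix is wanted, one possible sketch is to draw from a range twice as wide and shift it. The class name SignedRandom and the method nextSigned are hypothetical, and the code assumes 1 <= bound <= 2^30 so that the doubled bound does not overflow:

```java
import java.util.Random;

// Hypothetical helper; names are illustrative only.
class SignedRandom {
    private static final Random RANDOM = new Random();

    // Returns a value drawn uniformly from -(bound - 1) .. (bound - 1).
    // Assumes 1 <= bound <= 2^30 so that 2 * bound - 1 fits in an int.
    static int nextSigned(int bound) {
        return RANDOM.nextInt(2 * bound - 1) - (bound - 1);
    }
}
```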
If you use the Random class from the java.util package, you pick the method for the type of number you want and, for nextInt(bound), pass an upper bound. nextInt(bound) returns an int from 0 to bound - 1, and nextDouble() returns a double from 0.0 (inclusive) to 1.0 (exclusive), so these values are never negative. nextInt() and nextLong() without an argument, by contrast, cover the whole range of the type and can be negative. If you need negative values from a bounded call, you can generate a second random integer and negate the number depending on whether that second value is odd or even.
The other approach is to use Math.random() method. This method returns a number equal to or greater than 0 and less than 1. We can use typecasting for integer random values else by default we get double values.
P.S. Check the oracle documentation for better understanding of these classes and methods.
I am trying to create a method that makes some of a word's letters visible and replaces the others with *. This is for a simple word guessing game. I ask the user to choose whether they want to give an answer or request a letter. For example, if the answer is "ball" and the user decides to request a letter, ball should turn into "*a**".
This is the method I came up with:
public static void showALetter(String correctAnswer) {
int randomLetterIndex = (int) Math.random() % (correctAnswer.length());
for (int i = 0; i < correctAnswer.length(); i++) {
if (i == randomLetterIndex) {
System.out.print(correctAnswer.charAt(randomLetterIndex));
} else {
System.out.print("*");
}
}
}
It only shows the first letter of the correct answer at every single request. What should I do ?
Math.random() returns a double in the interval [0.0, 1.0), so casting it directly to int always gives 0. That is not what you want here, so use the java.util.Random class instead:
Random random = new Random();
int randomLetterIndex = random.nextInt(correctAnswer.length());
The random.nextInt(int limit) method returns a value from zero (inclusive) to limit (exclusive), which is what you need here.
If you're going to use random numbers over and over again, then create your Random instance as a static class member and have your methods refer to that, so that you only create the object once.
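A minimal sketch of that arrangement, assuming a non-empty answer string: the class name LetterHint and the helper mask are hypothetical, and the method returns the masked word instead of printing it so that it is easier to check.

```java
import java.util.Random;

// Hypothetical class; names are illustrative only.
class LetterHint {
    // Created once and shared by every call.
    private static final Random RANDOM = new Random();

    // Builds the masked word, showing only the letter at visibleIndex.
    static String mask(String answer, int visibleIndex) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < answer.length(); i++) {
            sb.append(i == visibleIndex ? answer.charAt(i) : '*');
        }
        return sb.toString();
    }

    // Picks a random position and reveals it. Assumes answer is not empty,
    // because nextInt(0) throws IllegalArgumentException.
    static String showALetter(String answer) {
        return mask(answer, RANDOM.nextInt(answer.length()));
    }
}
```

With this helper, mask("ball", 1) gives "*a**", matching the example in the question.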
Math.random() returns a number from zero (inclusive) to one (exclusive). Casting that to int truncates it to 0, so your randomLetterIndex will always be zero. Use this instead:
(int) (Math.random() * correctAnswer.length())
This will give a random number between 0 and correctAnswer.length() - 1.
Math.random() returns a double greater than or equal to 0 and less than 1, so (int) Math.random() will always return 0.
Use
(int)(Math.random() * correctAnswer.length())
The modulo is unnecessary here. The (int) cast truncates toward zero, so the result is always a valid index into the string and is never equal to or higher than correctAnswer.length().
Ok, I have a project that requires me to have a dynamic hash table that counts the frequency of words in a file. I must use Java; however, we are not allowed to use any built-in data types or built-in classes at all except standard arrays. Also, I am not allowed to use any hash functions off the internet that are known to be fast. I have to make my own hash functions. Lastly, my instructor also wants my table to start at size "1" and double in size every time a new key is added.
My first idea was to sum the ASCII values of the letters composing a word and use that to make a hash function, but different words with the same letters will equal the same value.
How can I get started? Is the ASCII idea on the right track?
A hash table isn't expected to have in general a one-to-one mapping between a value and a hash. A hash table is expected to have collisions. That is, the domain of the hash-function is expected to be larger than the range (i.e., the hash value). However, the general idea is that you come up with a hash function where the probability of collision is drastically small. If your hash-function is uniform, i.e., if you have it designed such that each possible hash-value has the same probability of being generated, then you can minimize collisions this way.
Getting a collision isn't the end of the world. That just means that you have to search the list of values for that hash. If your hashing function is good, overall your performance for lookup should still be O(1).
Generating hashing functions is a subject of its own, and there is no one answer. But a good place for you to start could be to work with the bitwise representations of the characters in the string, and perform some sort of convolution operations on them (rotate, shift, XOR) in series. You could perform these in some way based on some initial seed-value, and then use the output of the first step of hashing as a seed for the next step. This way you can end up magnifying the effects of your convolution.
For example, let's say you get the character A, which is 41 in hex, or 0100 0001 in binary. You could designate each bit to mean some operation (maybe bit 0 is a ROR when it is 0, and a ROL when it is 1; bit 1 is an OR when it is 0, and a XOR when it is 1, etc.). You could even decide how much convolution you want to do based on the value itself. For example, you could say that the lower nibble specifies how much right rotation you will do, and the upper nibble specifies how much left rotation you will do. Then once you have the final value, you will use that as the seed for the next character. These are just some ideas. Use your imagination and see what you get!
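One very simple instance of this idea, shown only as a sketch and not as a recommended hash: rotate the running value left by a fixed amount and XOR in the next character, so each step's output seeds the next. The class name RotateXorHash, the seed 17, and the rotation by 5 bits are arbitrary choices made for illustration.

```java
// Hypothetical example; the seed and rotation amount are arbitrary.
class RotateXorHash {
    static int hash(String word, int tableSize) {
        int h = 17; // initial seed value
        for (int i = 0; i < word.length(); i++) {
            h = (h << 5) | (h >>> 27); // rotate the 32-bit value left by 5 bits
            h ^= word.charAt(i);       // mix in the current character
        }
        // Clear the sign bit so the remainder is never negative.
        return (h & 0x7fffffff) % tableSize;
    }
}
```

Because each character is mixed into a rotated value, words with the same letters in a different order, such as "ab" and "ba", generally produce different hashes, which a plain sum of ASCII values cannot do.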
It does not matter how good your hash function is, you will always have collisions you need to resolve.
If you want to keep your approach of using the ASCII values of the letters, you shouldn't just add the values, because this would lead to a lot of collisions. You should weight each value with a power of 256 according to its position. For example, for the word "Help": 'H' * 256⁰ + 'e' * 256¹ + 'l' * 256² + 'p' * 256³. Or in code:
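static int hash(String word, int hashSize) {
    int res = 0;
    int shift = 0; // bit shift for the current character: 0, 8, 16, 24
    for (int i = 0; i < word.length(); i++) {
        res += word.charAt(i) << shift; // same as c * 256^count
        // wrap after four characters: 256^4 no longer fits in an int
        shift = (shift + 8) % 32;
    }
    // the sum can overflow to a negative value, so clear the sign bit first
    return (res & 0x7fffffff) % hashSize;
}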
Now you just have to write your own Hashtable:
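class WordCounterMap {
    Entry[] entrys = new Entry[1];

    // Linear probing: returns the slot that holds s, or the first free slot
    // on its probe path. The table always keeps at least one free slot.
    // hash(word, size) is the hash function above; it must be a static
    // method of this class.
    static int findSlot(Entry[] table, String s) {
        int slot = hash(s, table.length);
        while (table[slot] != null && !table[slot].word.equals(s)) {
            slot = (slot + 1) % table.length;
        }
        return slot;
    }

    void add(String s) {
        int slot = findSlot(entrys, s);
        if (entrys[slot] != null) {
            entrys[slot].count++; // known word: just count it
            return;
        }
        // New key: double the table and re-insert every entry, probing
        // again so that colliding entries do not overwrite each other.
        Entry[] temp = new Entry[entrys.length * 2];
        for (Entry e : entrys) {
            if (e != null) {
                temp[findSlot(temp, e.word)] = e;
            }
        }
        entrys = temp;
        entrys[findSlot(entrys, s)] = new Entry(s);
    }

    int getCount(String s) {
        Entry e = entrys[findSlot(entrys, s)];
        return e == null ? 0 : e.count;
    }
}

class Entry {
    int count = 1;
    String word;

    Entry(String s) {
        this.word = s;
    }
}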
What does it mean when the second argument is negative? I'm looking at a piece of code that searches for a key in an array. But what does a negative key mean?
for (int i = 0; i < N; i++) {
int j = Arrays.binarySearch(a, -a[i]);
}
It means it is looking for a number which is the negative of a number already in the array.
This could be a positive key. For example, if a[0] is -10, it will look for 10 in the same array.
As described in the documentation, key is the value (in the array) to be searched for. Negating the argument just searches for its negation within the array!
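To make this concrete, here is a small sketch that counts how many elements have their negation in the same array. The class name NegationSearch and the method countWithNegation are hypothetical. It sorts a copy first, since Arrays.binarySearch only gives meaningful results on a sorted array.

```java
import java.util.Arrays;

// Hypothetical example; names are illustrative only.
class NegationSearch {
    // Counts the indices i for which -a[i] also occurs in the array.
    static int countWithNegation(int[] input) {
        int[] a = input.clone();
        Arrays.sort(a); // binarySearch requires a sorted array
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            // binarySearch returns a non-negative index when the key is found
            if (Arrays.binarySearch(a, -a[i]) >= 0) {
                count++;
            }
        }
        return count;
    }
}
```

For an array such as {-10, 3, 10, 7}, the search for -(-10) = 10 and the search for -10 both succeed, while the searches for -3 and -7 do not.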
I have an array of ints ie. [1,2,3,4,5] . Each row corresponds to decimal value, so 5 is 1's, 4 is 10's, 3 is 100's which gives value of 12345 that I calculate and store as long.
This is the function :
public long valueOf(int[]x) {
int multiplier = 1;
value = 0;
for (int i=x.length-1; i >=0; i--) {
value += x[i]*multiplier;
multiplier *= 10;
}
return value;
}
Now I would like to check if value of other int[] does not exceed long before I will calculate its value with valueOf(). How to check it ?
Should I use table.length or maybe convert it to String and send to
public Long(String s) ?
Or maybe just add exception to throw in the valueOf() function ?
I hope you know that this is a horrible way to store large integers: just use BigInteger.
But if you really want to check for exceeding some value, just make sure the length of the array is less than or equal to 19. Then you could compare each cell individually with the value in Long.MAX_VALUE. Or you could just use BigInteger.
Short answer: Every number with at most 18 digits fits in a long. So if you know that there are no leading zeros, then just check x.length<=18. If you might have leading zeros, you'll have to loop through the array to count how many and adjust accordingly.
A flaw in this is that some 19-digit numbers are valid longs, namely those up to and including Long.MAX_VALUE, which is 9223372036854775807. So if you wanted to be truly precise, you'd have to say length>19 is bad, length<19 is good, and for length==19 you'd have to check digit by digit. Depending on what you're up to, rejecting a subset of numbers that would really work might be acceptable.
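The precise check can be sketched as follows, assuming the array holds one decimal digit (0 to 9) per element with the most significant digit first. The class name DigitCheck and the method fitsInLong are hypothetical.

```java
// Hypothetical helper; assumes each element is a single digit 0..9.
class DigitCheck {
    static boolean fitsInLong(int[] digits) {
        String max = Long.toString(Long.MAX_VALUE); // 19 digits
        int start = 0;
        while (start < digits.length && digits[start] == 0) {
            start++; // skip leading zeros
        }
        int len = digits.length - start;
        if (len != max.length()) {
            return len < max.length(); // shorter always fits, longer never does
        }
        // Same length: compare digit by digit against Long.MAX_VALUE.
        for (int i = 0; i < len; i++) {
            int limit = max.charAt(i) - '0';
            if (digits[start + i] != limit) {
                return digits[start + i] < limit;
            }
        }
        return true; // exactly Long.MAX_VALUE
    }
}
```

This only answers whether the value fits; the conversion itself can then be done without any overflow handling.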
As others have implied, the bigger question is: Why are you doing this? If this is some sort of data conversion where you're getting numbers as a string of digits from some external source and need to convert this to a long, cool. If you're trying to create a class to handle numbers bigger than will fit in a long, what you're doing is both inefficient and unnecessary. Inefficient because you could pack much more than one decimal digit into an int, and doing so would give all sorts of storage and performance improvements. Unnecessary because BigInteger already does this. Why not just use BigInteger?
Of course if it's a homework problem, that's a different story.
Are you guaranteed that every value of x will be nonnegative?
If so, you could do this:
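public long valueOf(int[] x) {
    long multiplier = 1; // long: an int multiplier overflows after ten digits
    long value = 0; // Note that you need the type here, which you did not have
    for (int i = x.length - 1; i >= 0; i--) {
        if (x[i] != 0) {
            // multiplier == 0 marks positions beyond 10^18, where no nonzero digit fits
            if (multiplier == 0 || x[i] > (Long.MAX_VALUE - value) / multiplier) {
                // Error-handling code here, however you
                // want to handle this case; throwing is one option.
                throw new ArithmeticException("value does not fit in a long");
            }
            value += x[i] * multiplier;
        }
        // stop growing the multiplier once the next power of ten would overflow
        multiplier = (multiplier == 0 || multiplier > Long.MAX_VALUE / 10) ? 0 : multiplier * 10;
    }
    return value;
}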
Of course, BigInteger would make this much simpler. But I don't know what your problem specs are.