Retrieving probe length for linear/quadratic hashing - java

My assignment deals with hashing and using Horner's polynomial to create a hash function. I have to compute the theoretical probe length using (1 + 1/(1-L)^2)/2 (unsuccessful) or (1 + 1/(1-L))/2 (successful) for linear probing, and then do the same with the corresponding equations for quadratic probing. I then have to compare the theoretical values with experimental values for load factors 0.1 through 0.9. I am using the find method and searching for 100 random ints to acquire the experimental data. The problem I am having is that I am not obtaining the correct probeLength value once the find either succeeds or fails.
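For reference, a minimal sketch of the theoretical calculation for one load factor (the variable names are mine):

double L = 0.5; // example load factor
double successful = (1 + 1 / (1 - L)) / 2;               // = 1.5 probes
double unsuccessful = (1 + 1 / ((1 - L) * (1 - L))) / 2; // = 2.5 probes
System.out.println("successful: " + successful + ", unsuccessful: " + unsuccessful);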
I create 10000 random ints to fill the table with, and then 100 random ints that I will search for.

// Make an ArrayList of 10000 random ints to insert
for (i = 0; i < 10000; i++)
{
    int x = (int) (java.lang.Math.random() * size);
    randomints.add(x);
}
// Make an ArrayList of 100 random ints to search for
for (p = 0; p < 100; p++)
{
    int x = (int) (java.lang.Math.random() * size);
    randomintsfind.add(x);
}
Later on I have a loop that does the finding and keeps track of how many times the find succeeds or fails. That part is working. It is also supposed to keep track of the probeLength for each find and sum them, so the totals can be divided by the number of successes or failures respectively to find the averages. That is where I am having a problem: the probeLength isn't being retrieved correctly, and I am not sure why.
This is the section of code that calls the find method and keeps track of those variables, as well as the creation and filling of the table.
HashTableLinear theHashTable = new HashTableLinear(primesize);
for (int j = 0; j < randomintscopy.length; j++) // insert data
{
    //aKey = (int)(java.lang.Math.random() * size);
    aDataItem = new DataItem(randomintscopy[j]);
    theHashTable.insert(aDataItem);
}
for (int f = 0; f < randomintsfindcopy.length; f++)
{
    aDataItem = theHashTable.find(randomintsfindcopy[f]);
    if (aDataItem != null)
    {
        linearsuccess += 1;
        experimentallinearsuccess += theHashTable.probeLength;
        theHashTable.probeLength = 0;
    }
    else
    {
        linearfailure += 1;
        experimentallinearfailure += theHashTable.probeLength;
        theHashTable.probeLength = 0;
    }
}
And then the find method in the HashTableLinear class
public DataItem find(int key) // find item with key
{
    int hashVal = hashFunc(key); // hash the key
    probeLength = 1;
    while (hashArray[hashVal] != null) // until empty cell,
    { // found the key?
        if (hashArray[hashVal].getKey() == key)
            return hashArray[hashVal]; // yes, return item
        ++hashVal; // go to next cell
        ++probeLength;
        //System.out.println("Find Test: " + probeLength);
        hashVal %= arraySize; // wraparound if necessary
    }
    return null; // can't find item
}
When I print the probeLength value inside the find method, it differs from the value read by the calling loops.

I realized that I was thinking too hard about this. I resolved it by making a getter and a setter, setting the value once the item is either found or not found, and then retrieving the value with the getter.
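For anyone following along, a minimal sketch of that fix, assuming a private probeLength field in HashTableLinear (the accessor names are mine):

// Inside HashTableLinear: expose the probe count through accessors
// instead of reading the field directly from the caller.
private int probeLength;

public int getProbeLength() {
    return probeLength;
}

public void setProbeLength(int probeLength) {
    this.probeLength = probeLength;
}

The calling loop then reads theHashTable.getProbeLength() after each find and resets it with theHashTable.setProbeLength(0).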

Related

Is there a more elegant way to search the station index?

I work on a genetic algorithm for a robotic assembly line balancing problem (assigning assembly operations and robots to stations to minimize the cycle time for a given number of stations). The solution is represented by an ArrayList (configuration) which holds all the operations in the sequence assigned to different stations. Furthermore, I have two more ArrayLists (robotAssignment, operationPartition) which indicate where a new station starts and which robot is assigned to a station. For example, a solution candidate looks like this (configuration, robotAssignment, operationPartition from top to bottom):
Initial cycle time: 50.0
|2|7|3|9|1|5|4|6|8|10|
|2|1|3|2|
|0|2|5|7|
From this solution representation we know that operations 3, 9, and 1 are assigned to the second station and robot 1 is used.
I need to keep track of the station an operation is assigned to. I tried a lot to store this in the Operation object itself, but I always ran into problems, and therefore I want to write a method that gives me the station index of an operation.
Here is what I have coded so far:
// Get the station of an operation
public int getStation(Operation operation) {
    int stationIndex = 0;
    int position = configuration.indexOf(operation);
    for (int i = 0; i < GA_RALBP.numberOfStations; i++) {
        if (i < GA_RALBP.numberOfStations - 1 && operationPartition.get(i) != null) {
            if (isBetween(position, (int) operationPartition.get(i), (int) operationPartition.get(i + 1))) {
                return stationIndex + 1;
            } else {
                stationIndex++;
            }
        }
        else if (i >= GA_RALBP.numberOfStations - 1 && operationPartition.get(i) != null) {
            if (isBetween(position, (int) operationPartition.get(i), configurationSize())) {
                return stationIndex + 1;
            }
        }
    }
    return -1;
}
// Check if value x is between values left and right, including left
public static boolean isBetween(int x, int left, int right) {
    return left <= x && x < right;
}
However, this does not seem to be (a) very elegant, and (b) if I have to do this for a large number of operations the runtime could become a problem. Does anyone have an idea how to solve this more efficiently?
Why not make the partitioning explicit (replacing your operationPartition) - something like:
Map<Integer, Integer> operationToStationMapping = new HashMap<>();
operationToStationMapping.put(2,0);
operationToStationMapping.put(7,0);
operationToStationMapping.put(3,2);
operationToStationMapping.put(9,2);
operationToStationMapping.put(1,2);
operationToStationMapping.put(5,5);
operationToStationMapping.put(6,7);
operationToStationMapping.put(8,-1);
operationToStationMapping.put(10,-1);
Then getStation() becomes:
int getStation(int operation) { return operationToStationMapping.get(operation); }
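If you want to keep the current solution representation and derive that map from it, a sketch like the following would work (it assumes, as in the question, that operationPartition holds the starting index of each station within configuration):

// Build operation -> station index (1-based) from the partition boundaries.
Map<Integer, Integer> operationToStationMapping = new HashMap<>();
for (int station = 0; station < operationPartition.size(); station++) {
    int start = (int) operationPartition.get(station);
    int end = (station + 1 < operationPartition.size())
            ? (int) operationPartition.get(station + 1)
            : configuration.size();
    for (int pos = start; pos < end; pos++) {
        operationToStationMapping.put((Integer) configuration.get(pos), station + 1);
    }
}

Rebuilding the map after each crossover or mutation is linear in the number of operations, and every subsequent getStation() lookup is O(1).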

Inserting an object in ascending order with an ArrayList in Java

So I've gone back and forth quite a few times trying multiple different methods, but I just can't seem to wrap my head around the appropriate algorithm for this method. I am creating a Polynomial class that uses an ArrayList of Term objects, where Term(int coeff, int expo) holds the coefficient and the exponent. In the test class I have to insert multiple different Term objects, but they need to be inserted in ascending order by their exponents (for example, 4x^1 + 2x^3 + x^4 + 5x^7).
This is the code I have up to the end of the insert() method which takes two parameters, the coeff and expo:
public class Polynomial
{
    private ArrayList<Term> polynomials;

    /**
     * Creates a new Polynomial object with no terms
     */
    public Polynomial()
    {
        polynomials = new ArrayList<>();
    }

    /**
     * Inserts a new term into its proper place in a Polynomial
     * @param coeff the coefficient of the new term
     * @param expo the exponent of the new term
     */
    public void insert(int coeff, int expo)
    {
        Term newTerm = new Term(coeff, expo);
        if (polynomials.isEmpty())
        {
            polynomials.add(newTerm);
            return;
        }
        int polySize = polynomials.size() - 1;
        for (int i = 0; i <= polySize; i++)
        {
            Term listTerm = polynomials.get(i);
            int listTermExpo = listTerm.getExpo();
            if (expo <= listTermExpo)
            {
                polynomials.add(i, newTerm);
                return;
            }
            else if (expo > listTermExpo)
            {
                polynomials.add(newTerm);
                return;
            }
        }
    }
The problem arises near the end of the code. Once I insert a Term whose exponent isn't <= the exponent of the Term at the current index, execution goes to the else-if statement and adds the new Term to the end of the list. That is wrong, since it needs to be added at the point where its exponent JUST becomes bigger than the one before it. Just because it's larger than one exponent doesn't mean it's the LARGEST exponent. I've tried doing the for statement backwards:
for (i = polySize; i >= 0; i--)
{
    etc.
}
But that didn't work either, since it raises the same issue just the other way around. If anyone could provide a solution or answer it would be much appreciated, since I am very confused. At this point I'm sure I'm just making it too complicated. I just want to know how to recognize that the exponent is larger but keep going through the loop until the new exponent is smaller than or equal to the exponent at the current index.
Also, I should mention, I am not allowed to use any other collection or class, so I must do this using a for, if, else, or do while statement.
Thanks in advance!
Remove this from the for loop:
else if (expo > listTermExpo)
{
    polynomials.add(newTerm);
    return;
}
Place this after the for loop:
polynomials.add(newTerm);
return;
Reasoning: You want to add the term to the end of the list only if it is not less than ANY term in it - not just the first term.
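Put together, the corrected method would look like this (same Term class as in the question):

public void insert(int coeff, int expo)
{
    Term newTerm = new Term(coeff, expo);
    for (int i = 0; i < polynomials.size(); i++)
    {
        // The first term with an exponent >= the new one marks the insert point.
        if (expo <= polynomials.get(i).getExpo())
        {
            polynomials.add(i, newTerm);
            return;
        }
    }
    polynomials.add(newTerm); // larger than every existing exponent
}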
Also, it is good formatting to have the ; immediately after the statement it is for with no space in between, and for ()s to not have any spaces immediately inside them. I've edited the code I copied from you to show what I mean.
This should have the exact behaviour you specified:
public void insert(int coeff, int expo) {
    Term newTerm = new Term(coeff, expo);
    int max = polynomials.size();
    int min = 0;
    int pivot;
    while (max > min) {
        pivot = (min + max) / 2;
        if (expo > polynomials.get(pivot).getExpo()) {
            min = pivot + 1;
        }
        else {
            max = pivot;
        }
    }
    polynomials.add(min, newTerm);
}
This algorithm will add new Terms right in front of the first term with the same exponent if any such Term is already in the list.
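A quick usage sketch (the resulting order is shown in the comment):

Polynomial p = new Polynomial();
p.insert(5, 7);
p.insert(4, 1);
p.insert(2, 3);
p.insert(1, 4);
// The list now holds exponents in the order 1, 3, 4, 7,
// i.e. 4x^1 + 2x^3 + x^4 + 5x^7 as in the question.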

Smallest java structure with relatively decent contains() solution

Alright, here's the lowdown: I'm writing a class in Java that finds the Nth Hardy taxicab number (a number that can be written as the sum of two cubes in two different ways). I have the discovery itself down, but I am in desperate need of some space saving. To that end, I need the smallest possible data structure where I can relatively easily use or create a method like contains(). I'm not particularly worried about speed, as my current solution can certainly compute it well within the time restrictions.
In short, the data structure needs:
To be able to relatively simply implement a contains() method
To use a low amount of memory
To be able to store very large number of entries
To be easily usable with the primitive long type
Any ideas? I started with a hash map (because I needed to check the values that led to the sum, to ensure accuracy), then moved to a hash set once I guaranteed reliable answers.
Any other general ideas on how to save some space would be greatly appreciated!
I don't think you'd need the code to answer the question, but here it is in case you're curious:
import java.util.HashSet;

public class Hardy {
    // private static HashMap<Long, Long> hm;

    /**
     * Find the nth Hardy number (start counting with 1, not 0) and the numbers
     * whose cubes demonstrate that it is a Hardy number.
     * @param n
     * @return the nth Hardy number
     */
    public static long nthHardyNumber(int n) {
        // long i, j, oldValue;
        int i, j;
        int counter = 0;
        long xyLimit = 2147483647; // xyLimit is the max value of a 32bit signed number
        long sum;
        // hm = new HashMap<Long, Long>();
        int hardyCalculations = (int) (n * 1.1);
        HashSet<Long> hs = new HashSet<Long>(hardyCalculations * hardyCalculations, (float) 0.95);
        long[] sums = new long[hardyCalculations];
        // long binaryStorage, mask = 0x00000000FFFFFFFF;
        for (i = 1; i < xyLimit; i++) {
            for (j = 1; j <= i; j++) {
                // binaryStorage = ((i << 32) + j);
                // long y = ((binaryStorage << 32) >> 32) & mask;
                // long x = (binaryStorage >> 32) & mask;
                sum = cube(i) + cube(j);
                if (hs.contains(sum) && !arrayContains(sums, sum)) {
                    // oldValue = hm.get(sum);
                    // long oldY = ((oldValue << 32) >> 32) & mask;
                    // long oldX = (oldValue >> 32) & mask;
                    // if (oldX != x && oldX != y){
                    sums[counter] = sum;
                    counter++;
                    if (counter == hardyCalculations) {
                        // Arrays.sort(sums);
                        bubbleSort(sums);
                        return sums[n - 1];
                    }
                } else {
                    hs.add(sum);
                }
            }
        }
        return 0;
    }

    private static void bubbleSort(long[] array) {
        long current, next;
        int i;
        boolean ordered = false;
        while (!ordered) {
            ordered = true;
            for (i = 0; i < array.length - 1; i++) {
                current = array[i];
                next = array[i + 1];
                if (current > next) {
                    ordered = false;
                    array[i] = next;
                    array[i + 1] = current;
                }
            }
        }
    }

    private static boolean arrayContains(long[] array, long n) {
        for (long l : array) {
            if (l == n) {
                return true;
            }
        }
        return false;
    }

    private static long cube(long n) {
        return n * n * n;
    }
}
Have you considered using a standard tree? In Java that would be a TreeSet. By sacrificing speed, a tree generally gains back space over a hash.
For that matter, sums might be a TreeMap, transforming the linear arrayContains to a logarithmic operation. Being naturally ordered, there would also be no need to re-sort it afterwards.
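A sketch of that second suggestion, keeping the question's variable names where possible:

import java.util.TreeSet;

// Replace the long[] sums plus bubbleSort with a sorted set:
// contains() becomes O(log n) and the elements stay ordered,
// so no final sort is needed.
TreeSet<Long> sums = new TreeSet<Long>();

// Inside the double loop, replacing the array bookkeeping:
if (hs.contains(sum) && !sums.contains(sum)) {
    sums.add(sum);
    if (sums.size() == hardyCalculations) {
        int k = 1;
        for (long s : sums) {     // iterates in ascending order
            if (k == n) return s; // the n-th smallest, 1-based
            k++;
        }
    }
} else {
    hs.add(sum);
}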
EDIT
The complaint against using a Java tree structure for sums is that Java's tree types don't support the k-select algorithm. On the assumption that Hardy numbers are rare, perhaps you don't need to sweat the complexity of this container (in which case your array is fine).
If you did need to improve time performance of this aspect, you could consider using a selection-enabled tree such as the one mentioned here. However that solution works by increasing the space requirement, not lowering it.
Alternatively, we can incrementally throw out Hardy numbers we know we don't need. Suppose during the running of the algorithm, sums already contains n Hardy numbers and we discover a new one. We insert it and do whatever we need to preserve collection order, so it now contains n+1 sorted elements.
Consider that last element. We already know about n smaller Hardy numbers, and so there is no possible way this last element is our answer. Why keep it? At this point we can shrink sums again down to size n and toss the largest element out. This is both a space savings, and time savings as we have fewer elements to maintain in sorted order.
The natural data structure for sums in that approach is a max heap. Java has no dedicated max-heap class, but java.util.PriorityQueue with a reversed comparator serves, and a few third-party implementations are floating around. You could also "make it work" with TreeMap::lastKey, which will be slower in the end, but still faster than the quadratic bubbleSort.
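A sketch of that bounded-heap idea (the class and method names are mine):

import java.util.Collections;
import java.util.PriorityQueue;

class BoundedMaxHeap {
    private final PriorityQueue<Long> heap; // max heap via reversed ordering
    private final int limit;

    BoundedMaxHeap(int limit) {
        this.limit = limit;
        this.heap = new PriorityQueue<Long>(limit, Collections.<Long>reverseOrder());
    }

    // Keep only the 'limit' smallest values seen so far.
    void offer(long value) {
        if (heap.size() < limit) {
            heap.add(value);
        } else if (value < heap.peek()) {
            heap.poll(); // discard the current largest
            heap.add(value);
        }
    }

    boolean isFull() {
        return heap.size() == limit;
    }

    // Once full, the root is the n-th smallest value seen overall.
    long largest() {
        return heap.peek();
    }
}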
If you have an extremely large number of elements, and you effectively want an index to allow fast tests for containment in the underlying dataset, then take a look at Bloom Filters. These are space-efficient indexes whose sole purpose is to enable fast tests for containment in a dataset.
Bloom Filters are probabilistic, which means if they return true for containment, then you actually need to check your underlying dataset to confirm that the element is really present.
If they return false, the element is guaranteed not to be contained in the underlying dataset, and in that case the test for containment would be very cheap.
So it depends on whether, most of the time, you expect a candidate to really be contained in the dataset or not.
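For illustration, a sketch using Guava's BloomFilter (this assumes Guava is on the classpath; the sizing parameters are examples, not recommendations):

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

// The expected number of insertions and the acceptable false-positive
// rate determine the filter's memory footprint.
BloomFilter<Long> seen = BloomFilter.create(Funnels.longFunnel(), 10000000, 0.01);

long sum = 1729L; // 12^3 + 1^3 = 10^3 + 9^3, for example
if (seen.mightContain(sum)) {
    // Possible duplicate: confirm against the exact backing store.
} else {
    seen.put(sum);
}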
This is the core function to check whether a given number is a sum of two cubes; it's in C, but one should get the idea:
#include <math.h>
#include <stdbool.h>

bool is_sum_of_cubes(int value)
{
    int m = pow(value, 1.0/3); /* integer cube root; may be off by one due to floating-point rounding */
    int i = m;
    int j = 1;
    while (j < m && i >= 0)
    {
        int element = i*i*i + j*j*j;
        if (value == element)
        {
            return true;
        }
        if (element < value)
        {
            ++j;
        }
        else
        {
            --i;
        }
    }
    return false;
}

Solve n-puzzle in Java

I'm trying to implement a program to solve the n-puzzle problem.
I have written a simple implementation in Java where the state of the problem is characterized by a matrix representing the tiles. I am also able to auto-generate the graph of all the states, given the starting state. On the graph I can then do a BFS to find the path to the goal state.
But the problem is that I run out of memory and I cannot even create the whole graph.
I tried with a 2x2 tiles and it works. Also with some 3x3 (it depends on the starting state and how many nodes are in the graph). But in general this way is not suitable.
So I tried generating the nodes at runtime, while searching. It works, but it is slow (sometimes after some minutes it still has not ended and I terminate the program).
Btw: I give as starting state only solvable configurations and I don't create duplicated states.
So, I cannot create the graph. This leads to my main problem: I have to implement the A* algorithm and I need the path cost (i.e. for each node the distance from the starting state), but I think I cannot calculate it at runtime. I need the whole graph, right? Because A* does not follow a BFS exploration of the graph, so I don't know how to estimate the distance for each node. Hence, I don't know how to perform an A* search.
Any suggestion?
EDIT
State:
private int[][] tiles;
private int pathDistance;
private int misplacedTiles;
private State parent;

public State(int[][] tiles) {
    this.tiles = tiles;
    pathDistance = 0;
    misplacedTiles = estimateHammingDistance();
    parent = null;
}

public ArrayList<State> findNext() {
    ArrayList<State> next = new ArrayList<State>();
    int[] coordZero = findCoordinates(0);
    int[][] copy;
    if (coordZero[1] + 1 < Solver.SIZE) {
        copy = copyTiles();
        int[] newCoord = {coordZero[0], coordZero[1] + 1};
        switchValues(copy, coordZero, newCoord);
        State newState = checkNewState(copy);
        if (newState != null)
            next.add(newState);
    }
    if (coordZero[1] - 1 >= 0) {
        copy = copyTiles();
        int[] newCoord = {coordZero[0], coordZero[1] - 1};
        switchValues(copy, coordZero, newCoord);
        State newState = checkNewState(copy);
        if (newState != null)
            next.add(newState);
    }
    if (coordZero[0] + 1 < Solver.SIZE) {
        copy = copyTiles();
        int[] newCoord = {coordZero[0] + 1, coordZero[1]};
        switchValues(copy, coordZero, newCoord);
        State newState = checkNewState(copy);
        if (newState != null)
            next.add(newState);
    }
    if (coordZero[0] - 1 >= 0) {
        copy = copyTiles();
        int[] newCoord = {coordZero[0] - 1, coordZero[1]};
        switchValues(copy, coordZero, newCoord);
        State newState = checkNewState(copy);
        if (newState != null)
            next.add(newState);
    }
    return next;
}

private State checkNewState(int[][] tiles) {
    State newState = new State(tiles);
    for (State s : Solver.states)
        if (s.equals(newState))
            return null;
    return newState;
}

@Override
public boolean equals(Object obj) {
    if (obj == null)
        return false;
    if (obj.getClass().equals(this.getClass())) {
        for (int r = 0; r < tiles.length; r++) {
            for (int c = 0; c < tiles[r].length; c++) {
                if (((State) obj).getTiles()[r][c] != tiles[r][c])
                    return false;
            }
        }
        return true;
    }
    return false;
}
Solver:
public static final HashSet<State> states = new HashSet<State>();
public static void main(String[] args) {
solve(new State(selectStartingBoard()));
}
public static State solve(State initialState) {
TreeSet<State> queue = new TreeSet<State>(new Comparator1());
queue.add(initialState);
states.add(initialState);
while(!queue.isEmpty()) {
State current = queue.pollFirst();
for(State s : current.findNext()) {
if(s.goalCheck()) {
s.setParent(current);
return s;
}
if(!states.contains(s)) {
s.setPathDistance(current.getPathDistance() + 1);
s.setParent(current);
states.add(s);
queue.add(s);
}
}
}
return null;
}
Basically here is what I do:
- Solver's solve has a SortedSet. Elements (States) are sorted according to Comparator1, which calculates f(n) = g(n) + h(n), where g(n) is the path cost and h(n) is a heuristic (the number of misplaced tiles).
- I give the starting configuration and look for all the successors.
- If a successor has not been already visited (i.e. if it is not in the global set States) I add it to the queue and to States, setting the current state as its parent and parent's path + 1 as its path cost.
- Dequeue and repeat.
I think it should work because:
- I keep all the visited states so I'm not looping.
- Also, there won't be any useless edge because I immediately store current node's successors. E.g.: if from A I can go to B and C, and from B I could also go to C, there won't be the edge B->C (since path cost is 1 for each edge and A->B is cheaper than A->B->C).
- Each time I choose to expand the path with the minimum f(n), according to A*.
But it does not work. Or at least, after a few minutes it still can't find a solution (and I think that is a lot of time in this case).
If I try to create a tree structure before executing A*, I run out of memory building it.
EDIT 2
Here are my heuristic functions:
private int estimateManhattanDistance() {
    int counter = 0;
    int[] expectedCoord = new int[2];
    int[] realCoord = new int[2];
    for (int value = 1; value < Solver.SIZE * Solver.SIZE; value++) {
        realCoord = findCoordinates(value);
        expectedCoord[0] = (value - 1) / Solver.SIZE;
        expectedCoord[1] = (value - 1) % Solver.SIZE;
        counter += Math.abs(expectedCoord[0] - realCoord[0]) + Math.abs(expectedCoord[1] - realCoord[1]);
    }
    return counter;
}

private int estimateMisplacedTiles() {
    int counter = 0;
    int expectedTileValue = 1;
    for (int i = 0; i < Solver.SIZE; i++)
        for (int j = 0; j < Solver.SIZE; j++) {
            if (tiles[i][j] != expectedTileValue)
                if (expectedTileValue != Solver.ZERO)
                    counter++;
            expectedTileValue++;
        }
    return counter;
}
If I use a simple greedy algorithm they both work (Manhattan distance is really quick, around 500 iterations to find a solution, while the number of misplaced tiles takes around 10k iterations). If I use A* (evaluating the path cost as well) it's really slow.
The comparator looks like this:
public int compare(State o1, State o2) {
    if (o1.getPathDistance() + o1.getManhattanDistance() >= o2.getPathDistance() + o2.getManhattanDistance())
        return 1;
    else
        return -1;
}
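One caveat about that comparator: TreeSet treats compare() == 0 as "duplicate element", and a comparator that never returns 0 yields 1 for both compare(a, b) and compare(b, a) on ties, which violates the Comparator contract and can corrupt the set's ordering. A sketch of a tie-breaking version (creationIndex is a field I'm assuming you add: a unique, monotonically increasing id per state):

public int compare(State o1, State o2) {
    int f1 = o1.getPathDistance() + o1.getManhattanDistance();
    int f2 = o2.getPathDistance() + o2.getManhattanDistance();
    if (f1 != f2) {
        return f1 < f2 ? -1 : 1;
    }
    // Same f(n): break the tie deterministically so only true
    // duplicates ever compare equal.
    if (o1.getCreationIndex() < o2.getCreationIndex()) return -1;
    if (o1.getCreationIndex() > o2.getCreationIndex()) return 1;
    return 0;
}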
EDIT 3
There was a little error. I fixed it and now A* works. Or at least, for the 3x3 it finds the optimal solution with only 700 iterations. For the 4x4 it's still too slow. I'll try IDA*, but one question: how long could A* take to find the solution? Minutes? Hours? I left it for 10 minutes and it didn't finish.
There is no need to generate all state-space nodes to solve a problem using BFS, A*, or any tree search; you just add the states you can explore from the current state to the fringe, and that's why there is a successor function.
If BFS consumes much memory, that is normal, though I don't know exactly for what n it becomes a problem. Use DFS instead.
For A* you know how many moves you made to reach the current state, and you can estimate the moves needed to solve the problem simply by relaxing it. For example, you can pretend that any two tiles can be swapped, and then count the moves needed to solve the problem. Your heuristic just needs to be admissible, i.e. your estimate must never exceed the actual number of moves needed to solve the problem.
Add a path cost to your state class, and every time you go from a parent state P to another state C, do c.cost = P.cost + 1. This computes the path cost for every node automatically.
This is also a very good and simple implementation in C# of an 8-puzzle solver with A*; take a look at it and you will learn many things:
http://geekbrothers.org/index.php/categories/computer/12-solve-8-puzzle-with-a

how to Compute the average probe length for success and failure - Linear probe (Hash Tables) [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 3 years ago.
I'm doing an assignment for my Data Structures class. We were asked to study linear probing with load factors of .1, .2, .3, ..., and .9. The formula for testing is:
The average probe length using linear probing is roughly
Success --> (1 + 1/(1-L))/2
or
Failure --> (1 + 1/(1-L)^2)/2.
We are required to find the theoretical values using the formulas above, which I did (just plug the load factor into the formula); then we have to calculate the empirical values, which I am not quite sure how to do. Here is the rest of the requirements:
For each load factor, 10,000 randomly generated positive ints between 1 and 50000 (inclusive) will be inserted into a table of the "right" size, where "right" is strictly based upon the load factor you are testing. Repeats are allowed. Be sure that your formula for randomly generated ints is correct. There is a class called Random in java.util. USE it! After a table of the right (based upon L) size is loaded with 10,000 ints, do 100 searches of newly generated random ints from the range of 1 to 50000. Compute the average probe length for each of the two formulas and indicate the denominators used in each calculation. So, for example, each test for a .5 load would have a table of size approximately 20,000 (adjusted to be prime) and similarly each test for a .9 load would have a table of approximate size 10,000/.9 (again adjusted to be prime).
The program should run displaying the various load factors tested, the average probe length for each search (the two denominators used to compute the averages will add to 100), and the theoretical answers using the formula above.
How do I calculate the empirical success?
Here is my code so far:
import java.util.Random;

/**
 * @author Johnny
 */
class DataItem
{
    private int iData;

    public DataItem(int it)
    { iData = it; }

    public int getKey()
    {
        return iData;
    }
}
class HashTable
{
    private DataItem[] hashArray;
    private int arraySize;

    public HashTable(int size)
    {
        arraySize = size;
        hashArray = new DataItem[arraySize];
    }

    public void displayTable()
    {
        int sp = 0;
        System.out.print("Table: ");
        for (int j = 0; j < arraySize; j++)
        {
            if (sp > 50) { System.out.println(""); sp = 0; }
            if (hashArray[j] != null)
            { System.out.print(hashArray[j].getKey() + " "); sp++; }
            else
            { System.out.print("** "); sp++; }
        }
        System.out.println("");
    }

    public int hashFunc(int key)
    {
        return key % arraySize;
    }

    public void insert(DataItem item)
    {
        int key = item.getKey();
        int hashVal = hashFunc(key);
        while (hashArray[hashVal] != null &&
               hashArray[hashVal].getKey() != -1)
        {
            ++hashVal;
            hashVal %= arraySize;
        }
        hashArray[hashVal] = item;
    }

    public int hashFunc1(int key)
    {
        return key % arraySize;
    }

    public int hashFunc2(int key)
    {
        // non-zero, less than array size, different from hF1
        // array size must be relatively prime to 5, 4, 3, and 2
        return 5 - key % 5;
    }

    // Note: with a step size from hashFunc2, this find performs double
    // hashing rather than linear probing (insert above probes linearly).
    public DataItem find(int key) // find item with key
    // (assumes table not full)
    {
        int hashVal = hashFunc1(key); // hash the key
        int stepSize = hashFunc2(key); // get step size
        while (hashArray[hashVal] != null) // until empty cell,
        { // is correct hashVal?
            if (hashArray[hashVal].getKey() == key)
                return hashArray[hashVal]; // yes, return item
            hashVal += stepSize; // add the step
            hashVal %= arraySize; // for wraparound
        }
        return null; // can't find item
    }
}
public class n00645805
{
    /**
     * @param args the command line arguments
     */
    public static void main(String[] args)
    {
        double b = 1;
        double L;
        double[] tf = new double[9];
        double[] ts = new double[9];
        double d = 0.1;
        DataItem aDataItem;
        int aKey;
        HashTable h1Table = new HashTable(100003); // L=.1
        HashTable h2Table = new HashTable(50051);  // L=.2
        HashTable h3Table = new HashTable(33343);  // L=.3
        HashTable h4Table = new HashTable(25013);  // L=.4
        HashTable h5Table = new HashTable(20011);  // L=.5
        HashTable h6Table = new HashTable(16673);  // L=.6
        HashTable h7Table = new HashTable(14243);  // L=.7
        HashTable h8Table = new HashTable(12503);  // L=.8
        HashTable h9Table = new HashTable(11113);  // L=.9
        fillht(h1Table);
        fillht(h2Table);
        fillht(h3Table);
        fillht(h4Table);
        fillht(h5Table);
        fillht(h6Table);
        fillht(h7Table);
        fillht(h8Table);
        fillht(h9Table);
        pm(h1Table);
        pm(h2Table);
        pm(h3Table);
        pm(h4Table);
        pm(h5Table);
        pm(h6Table);
        pm(h7Table);
        pm(h8Table);
        pm(h9Table);
        for (int j = 1; j < 10; j++)
        {
            //System.out.println(j);
            L = Math.round((b - d) * 100.0) / 100.0;
            System.out.println(L);
            System.out.println("ts " + (1 + (1 / (1 - L))) / 2);
            System.out.println("tf " + (1 + (1 / ((1 - L) * (1 - L)))) / 2);
            ts[j - 1] = (1 + (1 / (1 - L))) / 2;             // theoretical success
            tf[j - 1] = (1 + (1 / ((1 - L) * (1 - L)))) / 2; // theoretical failure
            d = d + .1;
        }
        display(ts, tf);
    }
    public static void fillht(HashTable a)
    {
        Random r = new Random();
        for (int j = 0; j < 10000; j++)
        {
            int aKey = 1 + r.nextInt(50000); // uniform in [1, 50000]
            DataItem y = new DataItem(aKey);
            a.insert(y);
        }
    }

    public static void pm(HashTable a)
    {
        DataItem X;
        int numsuc = 0;
        int numfail = 0;
        int aKey;
        Random r = new Random();
        for (int j = 0; j < 100; j++)
        {
            aKey = 1 + r.nextInt(50000);
            X = a.find(aKey);
            if (X != null)
            {
                //System.out.println("Found " + aKey);
                numsuc++;
            }
            else
            {
                //System.out.println("Could not find " + aKey);
                numfail++;
            }
        }
        System.out.println("# of succ is " + numsuc + " # of failures is " + numfail);
    }

    public static void display(double[] s, double[] f)
    {
    }
}
You should take into account that Java's Hashtable uses a closed addressing (separate chaining, no probing) implementation, so you have separate buckets in which many items can be placed. HashMap uses closed addressing (chaining) as well. This is not what you are looking for in your benchmarks.
So forget about the JDK classes. Since you want to calculate empirical values, you should write your own version of a hash table that uses open addressing with linear probing, and take care of counting the probe length whenever you try to get a value from it.
For example, you can write your hash map and then have:

class YourHashMap
{
    int empiricalGet(K key)
    {
        // search for the key, recording the probe length of this get operation
        return probeLength;
    }
}
Then you can easily benchmark it by searching for as many keys as you want and calculating the average probe length.
Otherwise you can give the hash map the ability to store the total probe length and the number of gets requested, and retrieve them after the benchmark run to compute the average.
This kind of exercise is meant to show that the empirical values agree with the theoretical ones. So also take into account that you may need many benchmark runs, averaging the results and making sure the variance is not too high.
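Concretely, the bookkeeping could look like this sketch (the class and field names are mine; the probing logic mirrors the linear-probe find from the first section of this page):

class ProbeCountingTable
{
    private final DataItem[] hashArray;
    private final int arraySize;
    private int lastProbeLength; // probes used by the most recent find

    ProbeCountingTable(int size)
    {
        arraySize = size;
        hashArray = new DataItem[size];
    }

    public void insert(DataItem item)
    {
        int hashVal = item.getKey() % arraySize;
        while (hashArray[hashVal] != null)
            hashVal = (hashVal + 1) % arraySize; // linear probe
        hashArray[hashVal] = item;
    }

    public DataItem find(int key)
    {
        int hashVal = key % arraySize;
        lastProbeLength = 1;
        while (hashArray[hashVal] != null)
        {
            if (hashArray[hashVal].getKey() == key)
                return hashArray[hashVal];
            hashVal = (hashVal + 1) % arraySize; // linear probe
            lastProbeLength++;
        }
        return null;
    }

    public int getLastProbeLength()
    {
        return lastProbeLength;
    }
}

// In pm(): add getLastProbeLength() to a success total or a failure
// total after each find, then divide each total by its own count
// (the two counts add to 100) to get the empirical averages.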
