As you can see in the screenshot, new_mean's capacity is 0 even though I created it with an initial capacity of 2, so I'm getting an index out of bounds exception.
Does anyone know what I'm doing wrong?
Update: Here's the code
private static Vector<Double> get_new_mean(
        Tuple<Set<Vector<Double>>, Vector<Double>> cluster,
        Vector<Double> v, boolean is_being_added) {
    Vector<Double> previous_mean = cluster.y;
    int n = previous_mean.size(), set_size = cluster.x.size();
    Vector<Double> new_mean = new Vector<Double>(n);
    if (is_being_added) {
        for (int i = 0; i < n; ++i) {
            double temp = set_size * previous_mean.get(i);
            double updated_mean = (temp + v.get(i)) / (set_size + 1);
            new_mean.set(i, updated_mean);
        }
    } else {
        if (set_size > 1) {
            for (int i = 0; i < n; ++i) {
                double temp = set_size * previous_mean.get(i);
                double updated_mean = (temp - v.get(i)) / (set_size - 1);
                new_mean.set(i, updated_mean);
            }
        } else {
            new_mean = null;
        }
    }
    return new_mean;
}
Capacity is the total number of elements you could store.
Size is the number of elements you have actually stored.
In your code, there is nothing stored in the Vector, so you get an IndexOutOfBoundsException when you try to access element 0.
Use set(int, object) to change an EXISTING element. Use add(int, object) to add a NEW element.
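To make the capacity/size distinction concrete, here is a minimal sketch. The class and method names (VectorCapacityDemo, zeros, setOnEmptyThrows) are made up for illustration; only the java.util.Vector calls are real API.

```java
import java.util.Vector;

class VectorCapacityDemo {
    /** Builds a Vector holding n zeros. Vector(int) only reserves capacity. */
    static Vector<Double> zeros(int n) {
        Vector<Double> v = new Vector<>(n); // capacity n, size still 0
        for (int i = 0; i < n; i++) {
            v.add(0.0);                     // each add() raises size by one
        }
        return v;
    }

    /** Returns true if set(0, ...) on a freshly constructed Vector throws. */
    static boolean setOnEmptyThrows() {
        Vector<Double> v = new Vector<>(2);
        try {
            v.set(0, 1.0);                  // no element 0 exists yet
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Vector<Double> empty = new Vector<>(2);
        System.out.println("capacity=" + empty.capacity() + " size=" + empty.size());
        System.out.println("set on empty throws: " + setOnEmptyThrows());

        Vector<Double> filled = zeros(2);
        filled.set(1, 3.5);                 // fine now: element 1 exists
        System.out.println(filled);
    }
}
```

In the question's method, either pre-filling new_mean this way or replacing set(i, updated_mean) with add(updated_mean) should avoid the exception.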
This is explained in the javadoc for Vectors. elementCount should be 0 (it's empty) and capacityIncrement is 0 by default, and is only relevant if you're going to go over the limit you specified (2).
You need to fill your Vector with null values to make its size equal to the capacity. Capacity is an optimization hint for the collection; it does not change how the collection is used. The collection grows automatically as you add elements, and its capacity increases with it. Initializing with a higher capacity therefore requires fewer expansions and fewer memory allocations.
My assignment deals with hashing and using Horner's polynomial to create a hash function. I have to compute the theoretical probe length using (1 + 1/(1-L)**2)/2 (unsuccessful) or (1 + 1/(1-L))/2 (successful) for linear probing, and then do the same with the corresponding equations for quadratic probing. I then have to compare the theoretical values with experimental values for load factors 0.1 through 0.9. I am using the find method and searching for 100 random ints to acquire the experimental data. The problem is that I am not obtaining the correct probeLength value once the find either succeeds or fails.
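For reference, the two linear-probing formulas quoted above can be tabulated directly. This is only a sketch: the class and method names are hypothetical, and the quadratic-probing formulas are left out because the post does not state them.

```java
class LinearProbeTheory {
    /** Expected probes for a successful search: (1 + 1/(1-L)) / 2. */
    static double successful(double load) {
        return (1.0 + 1.0 / (1.0 - load)) / 2.0;
    }

    /** Expected probes for an unsuccessful search: (1 + 1/(1-L)^2) / 2. */
    static double unsuccessful(double load) {
        double d = 1.0 - load;
        return (1.0 + 1.0 / (d * d)) / 2.0;
    }

    public static void main(String[] args) {
        // Load factors 0.1 through 0.9, as in the assignment.
        for (int k = 1; k <= 9; k++) {
            double load = k / 10.0;
            System.out.printf("L=%.1f  successful=%.3f  unsuccessful=%.3f%n",
                    load, successful(load), unsuccessful(load));
        }
    }
}
```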
I create 10000 random ints to fill with and then 100 random ints that I will search for.
for(i = 0; i<10000; i++)
int x = (int)(java.lang.Math.random() * size);
//Make arraylist of 10000 random ints to fill
for(p = 0; p<100; p++)
int x = (int)(java.lang.Math.random() * size);
Later on I have a loop that does the finding and keeps track of how many times the find succeeds or fails. That part of it is working. It is also supposed to keep track of the probeLength for each find and then add them all together so that it can be divided by the number of successes or failures respectively to find out what the average is. That is where I am having a problem. The probeLength isn't being retrieved correctly and I am not sure why.
This is the section of code that calls the find method and keeps track of those variables as well as the creation and filling.
HashTableLinear theHashTable = new HashTableLinear(primesize);
for(int j=0; j<randomintscopy.length; j++) // insert data
{
    //aKey = (int)(java.lang.Math.random() * size);
    aDataItem = new DataItem(randomintscopy[j]);
}
for(int f = 0; f < randomintsfindcopy.length;f++)
{
    aDataItem = theHashTable.find(randomintsfindcopy[f]);
    if(aDataItem != null)
    {
        linearsuccess += 1;
        experimentallinearsuccess += theHashTable.probeLength;
        theHashTable.probeLength = 0;
    }
    else
    {
        linearfailure += 1;
        experimentallinearfailure += theHashTable.probeLength;
        theHashTable.probeLength = 0;
    }
}
And then the find method in the HashTableLinear class
public DataItem find(int key) // find item with key
{
    int hashVal = hashFunc(key); // hash the key
    probeLength = 1;
    while(hashArray[hashVal] != null) // until empty cell,
    { // found the key?
        if(hashArray[hashVal].getKey() == key)
            return hashArray[hashVal]; // yes, return item
        ++hashVal; // go to next cell
        //System.out.println("Find Test: " + probeLength);
        hashVal %= arraySize; // wraparound if necessary
    }
    return null; // can't find item
}
When I print the probeLength value inside the find method, it differs from the value read in the loop that calls find.
I realized that I was thinking too hard about this. I resolved it by making a getter and a setter, setting the value once the item is either found or not found, and then retrieving the value with the getter.
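A minimal sketch of that getter-based approach follows. It is not the original HashTableLinear class: it stores plain int keys, the names (ProbeCountingTable, getLastProbeLength) are made up, and it assumes the table always keeps at least one empty cell so the probe loops terminate.

```java
class ProbeCountingTable {
    private final Integer[] cells;   // null marks an empty cell
    private int lastProbeLength;     // cells examined by the most recent find()

    ProbeCountingTable(int size) {
        cells = new Integer[size];
    }

    private int hash(int key) {
        return Math.floorMod(key, cells.length);
    }

    /** Linear-probing insert; assumes the table is never completely full. */
    void insert(int key) {
        int idx = hash(key);
        while (cells[idx] != null) {
            idx = (idx + 1) % cells.length;
        }
        cells[idx] = key;
    }

    /** Returns true if key is present and records how many cells were examined. */
    boolean find(int key) {
        int idx = hash(key);
        lastProbeLength = 1;                 // the home cell is always examined
        while (cells[idx] != null) {
            if (cells[idx].intValue() == key) {
                return true;
            }
            idx = (idx + 1) % cells.length;  // step to next cell with wraparound
            lastProbeLength++;               // count the cell about to be examined
        }
        return false;
    }

    int getLastProbeLength() {
        return lastProbeLength;
    }

    public static void main(String[] args) {
        ProbeCountingTable t = new ProbeCountingTable(11);
        t.insert(1);
        t.insert(12);
        t.insert(23);                        // all three collide at index 1
        System.out.println(t.find(23) + " after " + t.getLastProbeLength() + " probes");
        System.out.println(t.find(34) + " after " + t.getLastProbeLength() + " probes");
    }
}
```

The caller reads getLastProbeLength() right after each find and adds it to the success or failure total, so nothing outside the table has to reset the counter.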
I'm attempting to resize my hash table; however, I keep getting a NullPointerException.
I know that if the load factor is greater than 0.75 the table size has to double, and if it's less than 0.50 the table size is halved. So far I have this:
public boolean add(Object x)
int h = x.hashCode();
if (h < 0) { h = -h; }
h = h % buckets.length;
Node current = buckets[h];
while (current != null)
if ( { return false; }
// Already in the set
current =;
Node newNode = new Node(); = x; = buckets[h];
buckets[h] = newNode;
double factor1 = currentSize * load1; //load1 = 0.75
double factor2 = currentSize * load2; //load2 = 0.50
if (currentSize > factor1) { resize(buckets.length*2); }
if (currentSize < factor2) { resize(buckets.length/2); }
return true;
Example. Size = 3. Max Size = 5
if we take the Max Size and multiply by 0.75 we get 3.75.
this is the factor that says if we pass it the Max Size must double
so if we add an extra element into the table the size is 4 and is > 3.75 thus the new Max Size is 10.
However, once we increase the size, the bucket index computed from each hash code changes, so we call resize(int newSize):
private void resize(int newLength)
HashSet newTable = new HashSet(newLength);
for (int i = 0; i < buckets.length; i++) {
Here is my constructor if the buckets[i] confuses anyone.
public HashSet(int bucketsLength)
buckets = new Node[bucketsLength];
currentSize = 0;
I feel that the logic is correct, unless my resize method is not retrieving the elements.
If that is all your code for resize(), then you are failing to assign newTable to a class attribute, i.e. your old table. Right now you fill it with data and then don't do anything with it, since it is defined inside resize and therefore not available outside of it.
So you end up thinking you have a larger table now, but in fact you are still using the old one ;-)
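Here is a sketch of what a resize that actually replaces the old table could look like. This is not the asker's class: the Node fields (data, next), the helper names and the grow-only policy are assumptions, and the load check compares against buckets.length as in the worked example from the question.

```java
class ResizingHashSet {
    private static class Node {
        Object data;   // hypothetical field names
        Node next;
    }

    private static final double MAX_LOAD = 0.75;
    private Node[] buckets;
    private int currentSize;

    ResizingHashSet(int bucketsLength) {
        buckets = new Node[Math.max(1, bucketsLength)];
    }

    // floorMod also handles negative hash codes, including Integer.MIN_VALUE.
    private static int indexFor(Object x, int length) {
        return Math.floorMod(x.hashCode(), length);
    }

    private static void insertInto(Node[] table, Object x) {
        int h = indexFor(x, table.length);
        Node n = new Node();
        n.data = x;
        n.next = table[h];
        table[h] = n;
    }

    boolean contains(Object x) {
        for (Node cur = buckets[indexFor(x, buckets.length)]; cur != null; cur = cur.next) {
            if (cur.data.equals(x)) {
                return true;
            }
        }
        return false;
    }

    boolean add(Object x) {
        if (contains(x)) {
            return false;                      // already in the set
        }
        insertInto(buckets, x);
        currentSize++;
        if (currentSize > buckets.length * MAX_LOAD) {
            resize(buckets.length * 2);
        }
        return true;
    }

    private void resize(int newLength) {
        Node[] newBuckets = new Node[newLength];
        for (Node head : buckets) {
            for (Node cur = head; cur != null; cur = cur.next) {
                insertInto(newBuckets, cur.data); // rehash against the new length
            }
        }
        buckets = newBuckets;                  // without this line the old table stays in use
    }

    int size() { return currentSize; }
    int bucketCount() { return buckets.length; }
}
```

Shrinking on removal would follow the same pattern with a smaller newLength; it is omitted here to keep the sketch short.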
When I read a solution to the knapsack problem, I couldn't understand why the item count n is passed as an argument. It seems we can reach the base case just by checking the remaining limit. For example, for the 15 kg backpack problem, the solution looks like:
Value(n, W){ // W = limit, n = # items still to choose from
if (n == 0) return 0;
if (arr[n][W] != unknown) return arr[n][W]; // <- add memoize
if (s[n] > W) result = Value(n-1,W);
else result = max{v[n] + Value(n-1, W-w[n]), Value(n-1, W)};
arr[n][W] = result; // <- add memoize
return result;
My non-memoize method looks like the below, which is easier to understand, at least for me, and also could be improved with memoization.
static int n =5;
static int [] w = new int[]{12,2,1,4,1}; //weight
static int [] v = new int[]{4,2,1,10,2}; //value
public static int knapSack(int wt){
int maxValue = 0,vtemp = 0, wtemp =0;
if (wt ==0) return 0;
for (int i=0; i<n; i++){
if (w[i] > wt) continue;
int tmp = v[i] + knapSack(wt - w[i]);
if (tmp > maxValue){
maxValue = tmp;
vtemp = v[i];
wtemp = w[i];
System.out.println("wt="+wt + ",vtemp="+vtemp+",wtemp="+wtemp+",ret max="+maxValue);
return maxValue;
So my question is:
why do we need n for argument?
The statement if (s[n] > W) result = Value(n-1,W); makes it even harder for me to understand why.
I see the same big O for memoized version of my approach. Any other difference?
You're actually solving a different problem. The first piece of code (with n) solves the 0-1 knapsack problem, where you can choose to take at most one of any particular item (i.e. there is no "copying" of items). In that case, you need n to keep track of which items you've already used up.
In the second piece of code, you're solving the unbounded knapsack problem, in which you can take every item an unlimited number of times.
They're both forms of the NP-complete knapsack problem, but they have different solutions.
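To see the difference on the question's own data, here is a sketch that memoizes both variants. The class and method names are made up, and it assumes all weights are positive integers.

```java
import java.util.Arrays;

class KnapsackVariants {
    /** 0-1 knapsack: n = number of items still available, cap = remaining capacity. */
    private static int zeroOne(int[] w, int[] v, int n, int cap, int[][] memo) {
        if (n == 0) {
            return 0;
        }
        if (memo[n][cap] != -1) {
            return memo[n][cap];
        }
        int best = zeroOne(w, v, n - 1, cap, memo);            // skip item n
        if (w[n - 1] <= cap) {                                  // take item n once
            best = Math.max(best, v[n - 1] + zeroOne(w, v, n - 1, cap - w[n - 1], memo));
        }
        memo[n][cap] = best;
        return best;
    }

    static int zeroOne(int[] w, int[] v, int cap) {
        int[][] memo = new int[w.length + 1][cap + 1];
        for (int[] row : memo) {
            Arrays.fill(row, -1);
        }
        return zeroOne(w, v, w.length, cap, memo);
    }

    /** Unbounded knapsack: every item may be reused, so only cap is needed as state. */
    private static int unbounded(int[] w, int[] v, int cap, int[] memo) {
        if (memo[cap] != -1) {
            return memo[cap];
        }
        int best = 0;
        for (int i = 0; i < w.length; i++) {
            if (w[i] <= cap) {
                best = Math.max(best, v[i] + unbounded(w, v, cap - w[i], memo));
            }
        }
        memo[cap] = best;
        return best;
    }

    static int unbounded(int[] w, int[] v, int cap) {
        int[] memo = new int[cap + 1];
        Arrays.fill(memo, -1);
        return unbounded(w, v, cap, memo);
    }

    public static void main(String[] args) {
        int[] w = {12, 2, 1, 4, 1};   // weights from the question
        int[] v = {4, 2, 1, 10, 2};   // values from the question
        System.out.println("0-1:       " + zeroOne(w, v, 15));    // 15
        System.out.println("unbounded: " + unbounded(w, v, 15));  // 36
    }
}
```

With a limit of 15, the 0-1 version takes each of the four light items once for a value of 15, while the unbounded version reuses the (weight 4, value 10) and (weight 1, value 2) items three times each for 36. The memo table is n by W in the first case and just W in the second, which is the extra state that n buys.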
I'm trying to initialize my array with a size of 1 and have it double every time it fills up. This is what I have right now:
int max = 1;
PhoneRecord[] records = new PhoneRecord[max];
int numRecords = 0;
int size = Integer.parseInt(length.records[numRecords]);
if (size >= max) {
size = 2*size;
but it's clearly full of fail. any suggestions or guidance would be great, thanks.
OK, you should use an ArrayList, but several other folks already told you that.
If you still want to use an array, here's how you'd resize it:
int max = 1;
PhoneRecord[] records = new PhoneRecord[max];
int numRecords = 0;

void addRecord(PhoneRecord rec) {
    records[numRecords++] = rec;
    if (numRecords == max) {
        /* out of space, double the array size */
        max *= 2;
        records = Arrays.copyOf(records, max); // needs import java.util.Arrays
    }
}
Why not use an ArrayList ? It'll exhibit very similar characteristics automatically.
From the private grow() method:
int newCapacity = oldCapacity + (oldCapacity >> 1);
You can't override the growth behaviour, but unless you really require a doubling due to your app characteristics, I'm sure it'll be sufficient.
size = 2*size only doubles the size variable; it does not resize the array. To actually double the array:
records = Arrays.copyOf(records, records.length*2);
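Put together as something that compiles on its own, the doubling approach might look like the sketch below. PhoneRecord here is a stand-in with made-up fields, since the real class isn't shown, and GrowableRecords is a hypothetical wrapper.

```java
import java.util.Arrays;

class GrowableRecords {
    /** Placeholder for the asker's class; the fields are assumptions. */
    static class PhoneRecord {
        final String name;
        final String number;

        PhoneRecord(String name, String number) {
            this.name = name;
            this.number = number;
        }
    }

    private PhoneRecord[] records = new PhoneRecord[1];
    private int numRecords = 0;

    void addRecord(PhoneRecord rec) {
        if (numRecords == records.length) {
            // Out of space: copy into an array twice as long.
            records = Arrays.copyOf(records, records.length * 2);
        }
        records[numRecords++] = rec;
    }

    PhoneRecord get(int index) {
        if (index < 0 || index >= numRecords) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return records[index];
    }

    int size() { return numRecords; }
    int capacity() { return records.length; }

    public static void main(String[] args) {
        GrowableRecords book = new GrowableRecords();
        for (int i = 0; i < 5; i++) {
            book.addRecord(new PhoneRecord("name" + i, "555-000" + i));
        }
        System.out.println(book.size() + " records, capacity " + book.capacity());
    }
}
```

This variant grows just before an insert that would overflow rather than right after the array fills, so it never allocates space it does not need yet.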
Alright, here's the lowdown: I'm writing a class in Java that finds the Nth Hardy's Taxi number (a number that can be summed up by two different sets of two cubed numbers). I have the discovery itself down, but I am in desperate need of some space saving. To that end, I need the smallest possible data structure where I can relatively easily use or create a method like contains(). I'm not particularly worried about speed, as my current solution can certainly get it to compute well within the time restrictions.
In short, the data structure needs:
To be able to relatively simply implement a contains() method
To use a low amount of memory
To be able to store very large number of entries
To be easily usable with the primitive long type
Any ideas? I started with a hash map (because I needed to test the values that led to the sum to ensure accuracy), then moved to a hash set once I had reliable answers.
Any other general ideas on how to save some space would be greatly appreciated!
I don't think you'd need the code to answer the question, but here it is in case you're curious:
public class Hardy {
// private static HashMap<Long, Long> hm;
/**
 * Find the nth Hardy number (start counting with 1, not 0) and the numbers
 * whose cubes demonstrate that it is a Hardy number.
 * @param n
 * @return the nth Hardy number
 */
public static long nthHardyNumber(int n) {
// long i, j, oldValue;
int i, j;
int counter = 0;
long xyLimit = 2147483647; // xyLimit is the max value of a 32bit signed number
long sum;
// hm = new HashMap<Long, Long>();
int hardyCalculations = (int) (n * 1.1);
HashSet<Long> hs = new HashSet<Long>(hardyCalculations * hardyCalculations, (float) 0.95);
long[] sums = new long[hardyCalculations];
// long binaryStorage, mask = 0x00000000FFFFFFFF;
for (i = 1; i < xyLimit; i++){
for (j = 1; j <= i; j++){
// binaryStorage = ((i << 32) + j);
// long y = ((binaryStorage << 32) >> 32) & mask;
// long x = (binaryStorage >> 32) & mask;
sum = cube(i) + cube(j);
if (hs.contains(sum) && !arrayContains(sums, sum)){
// oldValue = hm.get(sum);
// long oldY = ((oldValue << 32) >> 32) & mask;
// long oldX = (oldValue >> 32) & mask;
// if (oldX != x && oldX != y){
sums[counter] = sum;
if (counter == hardyCalculations){
// Arrays.sort(sums);
return sums[n - 1];
} else {
return 0;
private static void bubbleSort(long[] array){
long current, next;
int i;
boolean ordered = false;
while (!ordered) {
ordered = true;
for (i = 0; i < array.length - 1; i++){
current = array[i];
next = array[i + 1];
if (current > next) {
ordered = false;
array[i] = next;
array[i+1] = current;
private static boolean arrayContains(long[] array, long n){
for (long l : array){
if (l == n){
return true;
return false;
private static long cube(long n){
return n*n*n;
Have you considered using a standard tree? In java that would be a TreeSet. By sacrificing speed, a tree generally gains back space over a hash.
For that matter, sums might be a TreeMap, transforming the linear arrayContains to a logarithmic operation. Being naturally ordered, there would also be no need to re-sort it afterwards.
The complaint against using a java tree structure for sums is that java's tree types don't support the k-select algorithm. On the assumption that Hardy numbers are rare, perhaps you don't need to sweat the complexity of this container (in which case your array is fine.)
If you did need to improve time performance of this aspect, you could consider using a selection-enabled tree such as the one mentioned here. However that solution works by increasing the space requirement, not lowering it.
Alternatively, we can incrementally throw out Hardy numbers we know we don't need. Suppose that during the running of the algorithm, sums already contains n Hardy numbers and we discover a new one. We insert it and do whatever we need to preserve collection order, so sums now contains n+1 sorted elements.
Consider that last element. We already know about n smaller Hardy numbers, and so there is no possible way this last element is our answer. Why keep it? At this point we can shrink sums again down to size n and toss the largest element out. This is both a space savings, and time savings as we have fewer elements to maintain in sorted order.
The natural data structure for sums in that approach is a max heap. Java's java.util.PriorityQueue is a min-heap by default, but constructing it with Comparator.reverseOrder() makes it behave as a max heap. You could also "make it work" with TreeMap::lastKey, which will be slower in the end, but still faster than the quadratic bubbleSort.
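A sketch of the "keep only the n smallest" idea using a reversed PriorityQueue follows. The class name and methods are hypothetical, and it assumes the caller filters out duplicate sums before offering them (as the arrayContains check does in the question).

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class SmallestN {
    private final int n;
    // Reversed ordering puts the largest retained value at the head.
    private final PriorityQueue<Long> heap = new PriorityQueue<>(Comparator.reverseOrder());

    SmallestN(int n) {
        this.n = n;
    }

    /** Adds a value, then evicts the largest if more than n are held. */
    void offer(long value) {
        heap.add(value);
        if (heap.size() > n) {
            heap.poll();
        }
    }

    boolean isFull() {
        return heap.size() == n;
    }

    /** Largest retained value, i.e. the n-th smallest seen so far; call only when non-empty. */
    long largestKept() {
        return heap.peek();
    }

    public static void main(String[] args) {
        SmallestN s = new SmallestN(3);
        for (long x : new long[] {5, 1, 9, 3, 7, 2}) {
            s.offer(x);
        }
        System.out.println(s.largestKept()); // 3
    }
}
```

Memory for sums stays bounded by n + 1 entries. Knowing when it is safe to stop searching is a separate question that this sketch does not address.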
If you have an extremely large number of elements, and you effectively want an index to allow fast tests for containment in the underlying dataset, then take a look at Bloom Filters. These are space-efficient indexes whose sole purpose is to enable fast tests for containment in a dataset.
Bloom Filters are probabilistic, which means if they return true for containment, then you actually need to check your underlying dataset to confirm that the element is really present.
If they return false, the element is guaranteed not to be contained in the underlying dataset, and in that case the test for containment would be very cheap.
So it depends on the whether most of the time you expect a candidate to really be contained in the dataset or not.
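For illustration, a very small Bloom filter over long values could be sketched as follows. Everything here is an assumption made for the example: the class name, the bit-mixing constants (those commonly used in the SplitMix64 finalizer) and the double-hashing scheme. A real deployment would size numBits and numHashes from the expected element count and the acceptable false-positive rate.

```java
import java.util.BitSet;

class LongBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    LongBloomFilter(int numBits, int numHashes) {
        this.numBits = numBits;
        this.numHashes = numHashes;
        this.bits = new BitSet(numBits);
    }

    /** Scrambles the input so that nearby values map to unrelated bit positions. */
    private static long mix(long x) {
        x = (x ^ (x >>> 30)) * 0xbf58476d1ce4e5b9L;
        x = (x ^ (x >>> 27)) * 0x94d049bb133111ebL;
        return x ^ (x >>> 31);
    }

    /** i-th bit position for a value, derived from two halves of one 64-bit hash. */
    private int index(long value, int i) {
        long h = mix(value);
        int h1 = (int) h;
        int h2 = (int) (h >>> 32) | 1;   // force odd so the step is never zero
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(long value) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(value, i));
        }
    }

    /** false means definitely absent; true means possibly present. */
    boolean mightContain(long value) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(value, i))) {
                return false;
            }
        }
        return true;
    }
}
```

Because a true result can be a false positive, using this in place of the HashSet would still require confirming each apparent hit some other way, for example by recomputing the cube pairs for that sum.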
This is the core function to check whether a given number is an HR-number. It's in C, but one should get the idea:
bool is_sum_of_cubes(int value)
int m = pow(value, 1.0/3);
int i = m;
int j = 1;
while(j < m && i >= 0)
int element = i*i*i + j*j*j;
if( value == element )
return true;
if(element < value)
return false;
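The same two-pointer idea in Java, as a sketch rather than a translation of the snippet above: it counts the distinct pairs so the caller can check for at least two. The names are made up, and it assumes value is positive and small enough that the cubes fit in a long.

```java
class CubeSums {
    /** Number of pairs (j, i) with 1 <= j <= i and i^3 + j^3 == value. */
    static int countCubePairs(long value) {
        // Start one above the rounded-down cube root to absorb floating-point error.
        long i = (long) Math.cbrt((double) value) + 1;
        long j = 1;
        int count = 0;
        while (j <= i) {
            long sum = i * i * i + j * j * j;
            if (sum == value) {
                count++;
                j++;
                i--;
            } else if (sum < value) {
                j++;        // need a larger sum
            } else {
                i--;        // need a smaller sum
            }
        }
        return count;
    }

    static boolean isHardyNumber(long value) {
        return countCubePairs(value) >= 2;
    }

    public static void main(String[] args) {
        System.out.println(isHardyNumber(1729)); // true: 1^3 + 12^3 and 9^3 + 10^3
        System.out.println(isHardyNumber(1730)); // false
    }
}
```

This needs no set at all: each candidate is tested in roughly cube-root time with constant memory, trading speed for the space savings the question asks about.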