Averaging multiple results of a method? - java

So, I have a method that is looping through and returning these numbers automatically :
6527.6 6755.6 7009.9 7384.7 7775.9 8170.7 8382.6 8598.8 8867.6 9208.2 9531.8 9821.7 10041.6 10007.2 9847.0 10036.3 10263.5 10449.7 10699.7
I would like to average the first number to the second number, the second to the third number, and so on. What ways should I go about doing this? Adding all these doubles to an array? Or is there a way to get a running total in order to do this?
The problem I'm having is that I can only seem to get all or none of these doubles, not specific ones.
So the output would be the results of something like (6527.6+6755.6)/2, (6755.6+7009.9)/2, and so on. Just printing them, nothing else.
Edit : Code from the parser here : http://pastebin.com/V6yvntcP

What you are describing is known as moving average.
Some formal explanation is here:
http://en.wikipedia.org/wiki/Moving_average
[...] simple moving average (SMA) is the unweighted mean of the previous n data [...]
You want to compute moving average for n = 2.
Below is simple code that does this:
public static void main(String[] args) {
List<Double> list = Arrays.asList(6527.6, 6755.6, 7009.9, 7384.7, 7775.9, 8170.7);
for (int i = 1; i < list.size(); i++) {
double avg = (list.get(i) + list.get(i - 1)) / 2;
System.out.println("avg(" + (i - 1) + "," + i + ") = " + avg);
}
}
And second approach without List or array:
Double prev = null;
// inside loop:
double curr = getMethodResult(...);
if (prev != null) {
double avg = (curr + prev) / 2;
}
prev = curr;
// end of loop
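A minimal runnable sketch of the second approach, assuming the values arrive one at a time. The array in main is only a stand-in for the results of your method, and the names PairAverages and pairAverages are made up for this example:

```java
import java.util.ArrayList;
import java.util.List;

class PairAverages {
    // Averages each value with the one before it (a moving average with n = 2).
    static List<Double> pairAverages(double[] values) {
        List<Double> result = new ArrayList<>();
        Double prev = null; // there is no previous value before the first element
        for (double curr : values) {
            if (prev != null) {
                result.add((prev + curr) / 2);
            }
            prev = curr; // the current value becomes the previous one for the next step
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the numbers the method returns in the loop.
        double[] values = {6527.6, 6755.6, 7009.9, 7384.7};
        for (double avg : pairAverages(values)) {
            System.out.println(avg);
        }
    }
}
```

For k input values this yields k - 1 averages, so a single value produces no output at all.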

Create an array of doubles and traverse it, keeping the previous value in a temporary variable. Check it out:
double[] array = { 6527.6, 6755.6, 7009.9, 7384.7, 7775.9, 8170.7 };
double avg = 0;
double sum = 0.0;
double temp = array[0]; // previous value
for (int i = 0; i < array.length - 1; i++) {
sum = temp + array[i + 1];
avg = sum / 2;
System.out.println("(" + temp + "+" + array[i + 1] + ")/2" + "= "
+ avg);
temp = array[i + 1]; // the next pair starts from the current value, not from the average
}

Related

returning infinity in my for loop?

Right now when I run my code it returns infinity in the std dev and variance. How do I fix this?
When I take the //VARIANCE and //STD DEV out of the for loop it gives me a value but it's not the right one. So I'm thinking it's because when you take it out of the for loop "i" isn't working correctly? I know "i" is the problem because it's supposed to be "for the number of elements in the number list, take each element and subtract the average and square it." How do I achieve it?
@Override
public String calculate() throws Exception {
//String firstConfidenceInterval = String.valueOf(SRC_Calc_Type.CI(PH, CV, N));
double total = 0.0;
double total1 = 0.0;
int i;
String delims = ",";
String[] tokens = nums.split(delims);
for(i = 0; i < tokens.length; i++) {
total += (Double.parseDouble(tokens[i]));// SUM
}
double average = total / i; //average
total1 += (Math.pow((i - average), 2) / i); //VARIANCE
double std_dev = Math.pow(total1, 0.5); //STDDEV
return String.valueOf("Sum: " + total + //Works
"\nMean: " + average + //Works
"\nStandard Deviation: " + std_dev + //works
"\nVariance: " + total1); //works
//"\nNumbers Sorted: " + "( " + " )"
}
You need a second loop to calculate the variance:
double variance=0;
for(String token : tokens) {
double value = Double.parseDouble(token);
variance += Math.pow(value-average,2);
}
variance = variance/tokens.length;
You are misunderstanding how to calculate the variance and the standard deviation. You don't subtract the average from the number of elements.
You need to calculate the differences of each individual sample from the mean, square them all, then add all of the squares.
double variance = 0.0;
for (i = 0; i < tokens.length; i++)
{
// It would be better to parse the tokens once and
// place them into an array; they are referenced twice.
double diff = Double.parseDouble(tokens[i]) - average;
variance += diff * diff;
}
// or (tokens.length - 1) for "sample" variance
variance /= tokens.length;
Then you can take the square root for the standard deviation.
double std_dev = Math.sqrt(variance);
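Putting the pieces together, here is a small self-contained sketch that parses the tokens once and then computes the mean, the population variance, and the standard deviation. The class name Stats and the sample input are assumptions for illustration only:

```java
class Stats {
    // Parses a comma-separated list once, so the tokens are not parsed twice.
    static double[] parse(String nums) {
        String[] tokens = nums.split(",");
        double[] values = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Double.parseDouble(tokens[i].trim());
        }
        return values;
    }

    static double mean(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total / values.length;
    }

    // Population variance: the mean of the squared deviations from the mean.
    // Divide by (values.length - 1) instead for the sample variance.
    static double variance(double[] values) {
        double avg = mean(values);
        double sumOfSquares = 0.0;
        for (double v : values) {
            double diff = v - avg;
            sumOfSquares += diff * diff;
        }
        return sumOfSquares / values.length;
    }

    public static void main(String[] args) {
        double[] values = parse("2,4,4,4,5,5,7,9");
        double var = variance(values);
        System.out.println("Mean: " + mean(values));
        System.out.println("Variance: " + var);
        System.out.println("Standard Deviation: " + Math.sqrt(var));
    }
}
```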
Replace i with tokens.length everywhere outside the loop. In the future, to avoid errors like this, always initialize variables when you declare them, and always declare them in the narrowest scope: for (int i = 0; ... in this case. Note that your formula for variance is also wrong. This is not how you compute variance.

How to assign values to corresponding data in an ArrayList

I have 2 ArrayLists. The first contains 993 float values. I have added every 12 instances of these 993 float values together and stored them in another ArrayList (993/12), giving the second ArrayList a size of 83.
Here is a snippet of the second ArrayList:
919.2, 927.9, 809.39996, 633.8, 626.8, 871.30005, 774.30005, 898.8, 789.6, 936.3, 765.4, 882.1, 681.1, 661.3, 847.9, 683.9, 985.7, 771.1, 736.6, 713.2001, 774.49994, ...
The first of these values i.e 919.2 corresponds to the year 1930.
The second, 927.9 corresponds to the year 1931.
The third, 809.39996 corresponds to the year 1932 and so on... meaning the last 83rd value will correspond to 2012.
The ultimate aim I have is to look at these values in this second ArrayList and find the largest, printing its value AND the year that corresponds with it.
The issue is the program currently has no way of knowing these corresponding year values.
Allowed assumption: the first corresponding year value is 1930.
Currently I am able to successfully print the largest value in the ArrayList which is half the problem.
To achieve this I simply sorted the ArrayList:
System.out.println(depthAdd.get(depthAdd.size() -1));
What I'm lacking is the corresponding year. How can I do this?
Here is some code for you:
public void depthValues() {
ArrayList<Float> rValue = new ArrayList<>();
...
ArrayList<Float> depthAdd = new ArrayList<>();
Iterator<Float> it = rValue.iterator();
final int MAX = 12;
while(it.hasNext()) {
float sum = 0f;
int counter = 1;
while (counter <= MAX && it.hasNext()) {
sum += it.next();
counter++;
}
depthAdd.add(sum);
}
Collections.sort(depthAdd);
System.out.println("Wettest year: //year needs to go here "
+ depthAdd.get(depthAdd.size() -1));
return;
}
You could copy the original List (before sorting) and then iterate again to determine the matching position(s). Another option is to create a custom class to contain the year and the value, and create a custom Comparator on the amount of rainfall. The first might be implemented like this:
List<Float> depthCopy = new ArrayList<>(depthAdd);
Collections.sort(depthAdd);
Float max = depthAdd.get(depthAdd.size() - 1);
for (int i = 0; i < depthCopy.size(); i++) {
if (depthCopy.get(i).equals(max)) {
System.out.println("Wettest year: " + (1930 + i) + " "
+ depthAdd.get(depthAdd.size() - 1));
}
}
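The second option (a custom class plus a Comparator) could look roughly like the sketch below. The class name YearlyRainfall and its helper methods are invented for this example, and it assumes the first total belongs to 1930 as stated in the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class YearlyRainfall {
    final int year;
    final float total;

    YearlyRainfall(int year, float total) {
        this.year = year;
        this.total = total;
    }

    // Pairs each yearly total with its year; the first total belongs to firstYear.
    static List<YearlyRainfall> withYears(List<Float> totals, int firstYear) {
        List<YearlyRainfall> result = new ArrayList<>();
        for (int i = 0; i < totals.size(); i++) {
            result.add(new YearlyRainfall(firstYear + i, totals.get(i)));
        }
        return result;
    }

    // Finds the entry with the largest total without sorting the list.
    static YearlyRainfall wettest(List<YearlyRainfall> years) {
        return Collections.max(years,
                Comparator.comparingDouble((YearlyRainfall y) -> y.total));
    }

    public static void main(String[] args) {
        List<Float> depthAdd = Arrays.asList(919.2f, 927.9f, 809.39996f, 633.8f);
        YearlyRainfall max = wettest(withYears(depthAdd, 1930));
        System.out.println("Wettest year: " + max.year + " " + max.total);
    }
}
```

Because the year travels with the value, the list can also be sorted later without losing the association.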
Just keep track of the index and maximum value in separate variables:
float maxValue = Float.NEGATIVE_INFINITY; // locals must be initialized before use
int maxIndex = -1;
while (it.hasNext()) {
float sum = 0f;
int counter = 1;
while (counter <= MAX && it.hasNext()) {
sum += it.next();
counter++;
}
if (sum > maxValue) {
maxValue = sum;
maxIndex = depthAdd.size(); // The current index
}
depthAdd.add(sum);
}
Then print the value:
System.out.println("The max was " + maxValue + " in " + (1930 + maxIndex));
If you need/want to store the value pairs you could use something like Map<Integer, Float> and directly reference the key value pair after sorting. Otherwise, either of the previous answers should work.
Float max = depthAdd.get(0);
int index = 0;
for (int i = 1; i < depthAdd.size(); i++)
if (depthAdd.get(i) > max) {
max = depthAdd.get(i);
index = i;
}
int year = 1930 + index;
Here max will be the biggest value and year its corresponding year.
It sounds like you want a key,value pair. That's a job for a Map, not a List.
Map<Integer,Float> depthAdd = new HashMap<Integer,Float>();
...
int year = 1930;
while(it.hasNext()) {
float sum = 0f;
int counter = 1;
while (counter <= MAX && it.hasNext()) {
sum += it.next();
counter++;
}
depthAdd.put(year, sum);
year++;
}
...
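Once the totals are in a Map keyed by year, the largest entry can be looked up directly. This is only a sketch: the class name WettestYearFromMap is made up, the sample values are the first three from the question, and Collections.max throws NoSuchElementException on an empty map:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class WettestYearFromMap {
    // Returns the (year, total) entry with the largest total.
    static Map.Entry<Integer, Float> wettest(Map<Integer, Float> depthByYear) {
        return Collections.max(depthByYear.entrySet(), Map.Entry.comparingByValue());
    }

    public static void main(String[] args) {
        Map<Integer, Float> depthAdd = new HashMap<>();
        depthAdd.put(1930, 919.2f);
        depthAdd.put(1931, 927.9f);
        depthAdd.put(1932, 809.39996f);

        Map.Entry<Integer, Float> max = wettest(depthAdd);
        System.out.println("Wettest year: " + max.getKey() + " " + max.getValue());
    }
}
```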

Algorithm to find all possible values of A^5 + B^5 + C^5?

I'm trying to write an algorithm that will find all the possible values of A^5 + B^5 + C^5 when the user inputs a number 'N'.
For example, if N=100 I want to make an array that contains all the possible values, where each slot in the array contains a number that was found by plugging in numbers between 1 and 100 for A^5 + B^5 + C^5. So one of the positions in the array contains 3 from (1^5 + 1^5 + 1^5). Another position in the array contains the number 355447518 (from 19^5 + 43^5 + 46^5). So there will be 100^3 elements in my array.
public long[] possibleValues(int n)
{
long[] solutionSet = new long[(int) Math.pow(n, 3)];
for(int i=1;i<=n;i++)
{
solutionSet[i] = ((long) Math.pow(i, 5) + (long) Math.pow(i, 5) + (long) Math.pow(i, 5));
//testing purposes
System.out.println(i +"^5 " + "+" + i+"^5 " + "+" + i+"^5" + "=" + solutionSet[i]);
}
return solutionSet;
}
That's what I have so far, but my problem is that it doesn't do all the permutations of N. What is the best way to get all possible permutations of N? Am I making this more complicated than necessary? How would I arrange all possible (A, B, C)'s?
Use nested for loops:
int index = 0;
for (int i=1;i<=n;i++){
for (int j=1;j<=n;j++){
for (int k=1;k<=n;k++){
solutionSet[index++] = ((long) Math.pow(i, 5) + (long) Math.pow(j, 5) + (long) Math.pow(k, 5));
}
}
}
You can calculate all powers quicker by using an array containing all fifth powers up to N.
You're using i for all 3 terms, thus you're essentially calculating permutations of
A^5 + A^5 + A^5 = 3A^5.
You need a 3-dimensional array and 3 for loops.
public long[][][] possibleValues(int n)
{
long[][][] solutionSet = new long[n+1][n+1][n+1];
for(int i=1;i<=n;i++)
for(int j=1;j<=n;j++)
for(int k=1;k<=n;k++)
{
solutionSet[i][j][k] = ((long) Math.pow(i, 5) + (long) Math.pow(j, 5) + (long) Math.pow(k, 5));
//testing purposes
System.out.println(i +"^5 " + "+" + j+"^5 " + "+" + k+"^5" + "=" + solutionSet[i][j][k]);
}
return solutionSet;
}
If you indeed only want a 1-dimensional array, you'll do something similar to the above, just have a separate variable for the index:
Since you probably don't want excessive repetition of values, you can probably start j from i and k from j.
public long[] possibleValues(int n)
{
long[] solutionSet = new long[n*n*n];
int c = 0;
for(int i = 1; i <= n; i++)
for(int j = i; j <= n; j++)
for(int k = j; k <= n; k++)
{
solutionSet[c] = ((long) Math.pow(i, 5) + (long) Math.pow(j, 5) + (long) Math.pow(k, 5));
//testing purposes
System.out.println(i +"^5 " + "+" + j+"^5 " + "+" + k+"^5" + "=" + solutionSet[c]);
c++;
}
return solutionSet;
}
Some significant optimizations can still be done:
Math.pow isn't particularly efficient, as Peter mentioned.
For the first version, you can derive values from previous values in certain circumstances.
The really brute-force way to do it would require three nested loops:
for(int a = 1; a <= n; ++a)
{
for(int b = 1; b <= n; ++b)
{
for(int c = 1; c <= n; ++c)
{
// Add this combination to your array, and print it out.
// It may be more convenient to use ArrayList instead of long[].
}
}
}
Note that this takes O(n^3) time, so n doesn't have to be very large before it will take forever to compute (and also use up all of your memory).
Use three loops, one each for A, B, and C. This is pseudocode and does not adhere to Java syntax:
for(int A:100){
for(int B:100){
for(int C:100) {
calculate A^5 + B^5 + C^5
}
}
}
I agree with the other answers about nested for loops. For better performance it may pay off to store the answers in a hash table so that you don't recalculate the same value. For instance, you calculate 15^5 and then store that answer in an array like ans['155'] = 759375. When you need 15^5 again, you can check with an if statement, if(ans[num.tostring+'5']), and use the stored value instead of calculating 15^5 again.
Starting from @Dukeling's previous answer:
I use a powers array to compute the powers just n times (not n*n*n):
public static void test(int n){
long[] powers = new long[n+1];
for (int i=0; i<powers.length; i++)
powers[i] = (long) Math.pow(i, 5);
long[][][] solutionSet = new long[n+1][n+1][n+1];
for(int i=1;i<=n;i++)
for(int j=1;j<=n;j++)
for(int k=1;k<=n;k++)
{
solutionSet[i][j][k] = powers[i] + powers[j] + powers[k];
//testing purposes
System.out.println(i +"^5 " + "+" + j+"^5 " + "+" + k+"^5" + "=" + solutionSet[i][j][k]);
}
}
I believe you are looking for a combination and not a permutation. It also seems that you want A, B, and C to be all possible values from 1 to N. In that case, you'll want to make your nested for loop as such to only calculate the combinations:
for (int a = 1; a <= n; a++) {
for (int b = 1; b <= a; b++) {
for (int c = 1; c <= b; c++) {
double sum = pow5(a) + pow5(b) + pow5(c);
}
}
}
You'll also want to use a lookup table which could be loaded from a file. The more values in your lookup table, the faster your algorithm will perform. In my opinion, the best method will reduce the number of operations required. That means not calculating every value at runtime. Alternatively, you could also optimize for memory usage and just use a simple algorithm. Additionally, you'll want to measure the performance of the algorithm. Here is an example.
// for all number > 0 and <= 25
public static final double[] powersOf5 = {1.0, 32.0, 243.0, 1024.0, 3125.0,
7776.0, 16807.0, 32768.0, 59049.0, 100000.0, 161051.0, 248832.0, 371293.0,
537824.0, 759375.0, 1048576.0, 1419857.0, 1889568.0, 2476099.0, 3200000.0,
4084101.0, 5153632.0, 6436343.0, 7962624.0, 9765625.0};
// calc pow(i, 5) and use a lookup table for small values i
public static double pow5(int i) {
if (i > 0 && i <= 25) {
return powersOf5[i-1];
} else {
return Math.pow(i, 5);
}
}
public static void main(String[] args) {
long start = System.currentTimeMillis();
for (int i = 0; i < 100; i++) {
System.out.println(pow5(i));
}
long end = System.currentTimeMillis();
System.out.println("Execution time: " + (end - start) + " ms");
}
Think about the problem first:
1. Order does not matter: 1^5 + 2^5 + 3^5 = 3^5 + 2^5 + 1^5, so it is enough to consider i <= j <= k:
for (i = 1; i <= N; i++)
for (j = i; j <= N; j++)
for (k = j; k <= N; k++)
2. Different triples can give the same sum (A^5 + B^5 + C^5 = D^5 + E^5 + F^5).
If we use an array, it may contain many duplicate values.
We can use a Set to save memory, if time is not the most important concern.
3. A^5 does not fit in a long when A is too big.
So, is N guaranteed to be small? Otherwise there may be an overflow bug.
4. Multiplication costs a lot of time.
For example, if N = 100, how many times do we calculate 5^5
to get all the results?
5^5+1^5+1^5
5^5+1^5+2^5
5^5+1^5+3^5
...
What if an array stores the answers?
define array[i] = i^5
Then it saves us time.
That is the kind of thinking an algorithm needs.
Now let's talk more about Math.pow().
Yes, it is a helpful method, but it is a general implementation. We only want A^5, not A^N; the exponent is fixed.
Why not implement the method yourself?
First, we try to implement a method like this:
public Long powOf5(Long A){
return A*A*A*A*A;
}
Then, we find we can optimize it.
public Long powOf5(Long A){
Long A2 = A*A;
return A2*A2*A;
}
This version multiplies 3 times; the previous one multiplies 4 times.
I expect this method to be faster than Math.pow(), but measure it to be sure.
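Points 1 to 4 can be combined into one sketch: a table of fifth powers, loops restricted to a <= b <= c, a Set for the distinct sums, and an explicit overflow check. The class and method names are made up here, and using Math.multiplyExact and Math.addExact is my own choice to make the overflow from point 3 fail loudly instead of silently wrapping:

```java
import java.util.Set;
import java.util.TreeSet;

class FifthPowerSums {
    // Returns the distinct values of a^5 + b^5 + c^5 for 1 <= a <= b <= c <= n.
    static Set<Long> distinctSums(int n) {
        // Precompute i^5 once per i instead of once per (a, b, c) triple.
        long[] powers = new long[n + 1];
        for (int i = 1; i <= n; i++) {
            long square = (long) i * i;
            // multiplyExact throws ArithmeticException if the result overflows a long.
            powers[i] = Math.multiplyExact(Math.multiplyExact(square, square), (long) i);
        }

        Set<Long> sums = new TreeSet<>(); // keeps each sum once, in ascending order
        for (int a = 1; a <= n; a++) {
            for (int b = a; b <= n; b++) {
                for (int c = b; c <= n; c++) {
                    sums.add(Math.addExact(Math.addExact(powers[a], powers[b]), powers[c]));
                }
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        Set<Long> sums = distinctSums(100);
        System.out.println("Distinct sums for N=100: " + sums.size());
        System.out.println("Contains 19^5 + 43^5 + 46^5: " + sums.contains(355447518L));
    }
}
```

The loops still run in O(n^3) time; the ordering a <= b <= c only removes the reordered duplicates.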

How to shrink an array to a specified length in Java, keeping elements uniformly distributed?

I have a source array, and I want to generate a new array from it by removing a specified number of elements. I want the elements in the new array to cover as much of the source array as possible (the new elements are uniformly distributed over the source array), keeping the first and last elements the same (if any).
I tried this :
public static void printArr(float[] arr)
{
for (int i = 0; i < arr.length; i++)
System.out.println("arr[" + i + "]=" + arr[i]);
}
public static float[] removeElements(float[] inputArr , int numberOfElementToDelete)
{
float [] new_arr = new float[inputArr.length - numberOfElementToDelete];
int f = (inputArr.length ) / numberOfElementToDelete;
System.out.println("f=" + f);
if(f == 1)
{
f = 2;
System.out.println("f=" + f);
}
int j = 1 ;
for (int i = 1; i < inputArr.length ; i++)
{
if( (i + 1) % f != 0)
{
System.out.println("i=" + i + " j= " + j);
if(j < new_arr.length)
{
new_arr[j] = inputArr[i];
j++;
}
}
}
new_arr[0] = inputArr[0];
new_arr[new_arr.length - 1] = inputArr[inputArr.length - 1];
return new_arr;
}
public static void main(String[] args)
{
float [] a = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};
a = removeElements(a, 6);
printArr(a);
}
I tested removeElements(a, 5), removeElements(a, 4) and removeElements(a, 3), but removeElements(a, 6); gave:
arr[0]=1.0
arr[1]=3.0
arr[2]=5.0
arr[3]=7.0
arr[4]=9.0
arr[5]=11.0
arr[6]=13.0
arr[7]=15.0
arr[8]=0.0
arr[9]=16.0
The problem is arr[8]=0.0; it must take a value.
How can I solve this? Is there any code that can remove a specified number of elements (and keep the elements distributed over the source array without generating zeros in some elements)?
EDIT :
examples :
removeElements(a, 1) ==> remove one element from the middle (7) {1,2,3,4,5,6,7,9,10,11,12,13,14,15,16}
removeElements(a, 2) ==> remove two elements at indexes (4,19) or (5,10) or (4,10) (no problem)
removeElements(a, 3) ==> remove three elements at indexes (4,9,14) or (4,10, 15) or(no problem also)
removeElements(a, 4) ==> remove four elements at indexes (3,7,11 , 15) or ( 3 ,7,11,14) for example ..
What I want is that if I plot the values of the source array (on a chart in Excel, for example) and plot the values from the new array, I get the same line (or close to it).
I think the main problem in your code is that you are binding the selection to
(inputArr.length ) / numberOfElementToDelete
This way you are not considering the first and the last elements that you don't want to remove.
An example:
If you have an array of 16 elements and you want to delete 6 elements, the final array will have 10 elements. Since the first and the last are fixed, you have to select 8 elements out of the remaining 14. This means selecting a fraction of 8/14 (about 0.57) of the elements, not counting the first and the last.
So you can initialize a counter to zero, scan the array starting from the second element, and add the fraction to the counter at each step. Whenever the counter reaches a new integer (for example, at the third element the counter reaches 1.14), you pick that element and put it in the new array.
So, you can do something like this (pseudocode):
int newLength = originalLength - toDelete;
int toChoose = newLength - 2;
double fraction = (double) toChoose / (originalLength - 2);
double counter = 0;
int threshold = 1;
int newArrayIndex = 1;
for(int i = 1; i < originalLength-1; i++){
counter += fraction;
if(integerValueOf(counter) == threshold){
newArray[newArrayIndex] = originalArray[i];
threshold++;
newArrayIndex++;
}
}
newArray[0] = originalArray[0];
newArray[newArray.length-1] = originalArray[originalArray.length-1];
You should check for the particular cases like originalArray of length 1 or removal of all the elements but I think it should work.
EDIT
Here is a Java implementation (written on the fly so I didn't check for nulls etc.)
public class Test {
public static void main(String[] args){
int[] testArray = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16};
int[] newArray = remove(testArray, 6);
for(int i = 0; i < newArray.length; i++){
System.out.print(newArray[i]+" ");
}
}
public static int[] remove(int[] originalArray, int toDelete){
if(toDelete == originalArray.length){
//avoid the removal of all the elements, save at least first and last
toDelete = originalArray.length-2;
}
int originalLength = originalArray.length;
int newLength = originalLength - toDelete;
int toChoose = newLength - 2;
int[] newArray = new int[newLength];
double fraction = ((double)toChoose) / ((double)originalLength -2);
double counter = 0;
int threshold = 1;
int newArrayIndex = 1;
for(int i = 1; i < originalLength-1; i++){
counter += fraction;
if(((int)counter) == threshold ||
//condition added to cope with x.99999999999999999... cases
(i == originalLength-2 && newArrayIndex == newLength-2)){
newArray[newArrayIndex] = originalArray[i];
threshold++;
newArrayIndex++;
}
}
newArray[0] = originalArray[0];
newArray[newArray.length-1] = originalArray[originalArray.length-1];
return newArray;
}
}
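As an alternative to the running counter, the kept positions can be computed directly by mapping each output index onto the input range and rounding. This is a different technique from the one above, offered only as a sketch; the names EvenShrink and shrink are invented for it:

```java
class EvenShrink {
    // Keeps (input.length - toDelete) elements at evenly spaced positions,
    // always including the first and the last element.
    static float[] shrink(float[] input, int toDelete) {
        int newLength = input.length - toDelete;
        if (toDelete < 0 || newLength < 2) {
            throw new IllegalArgumentException("must keep at least the first and last element");
        }
        float[] result = new float[newLength];
        for (int i = 0; i < newLength; i++) {
            // Map output position i onto the input index range [0, input.length - 1].
            int source = (int) Math.round((double) i * (input.length - 1) / (newLength - 1));
            result[i] = input[source];
        }
        return result;
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
        for (float v : shrink(a, 6)) {
            System.out.print(v + " ");
        }
        // prints 1.0 3.0 4.0 6.0 8.0 9.0 11.0 13.0 14.0 16.0
    }
}
```

Because the step between mapped positions is at least 1, no source index is picked twice, so no slot in the result is left at its default 0.0.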
Why can't you just initialize i = 0?
for (int i = 0; i < inputArr.length; i++) {
if ((i + 1) % f != 0) {
Following is the output:
arr[0]=1.0
arr[1]=1.0
arr[2]=3.0
arr[3]=5.0
arr[4]=7.0
arr[5]=9.0
arr[6]=11.0
arr[7]=13.0
arr[8]=15.0
arr[9]=16.0
If I understand it right, this is reservoir sampling, i.e. creating a small array from a large array by choosing elements randomly.

What is the algorithm OR mathematics for Fisher's Exact Test?

I need a Fisher's Exact Test for a matrix n x m. I've been searching for hours and I've only found one example code, but it's written in Fortran. I've been working off of Wolfram and I'm close to finishing, but I'm missing the very last bit.
/**
* Performs Fisher's Exact Test on a matrix m x n
* @param matrix Any matrix m x n.
* @return The Fisher's Exact value of the matrix
* @throws IllegalArgumentException If the rows are not of equal length
* @author Ryan Amos
*/
public static double getFisherExact(int[][] matrix){
System.out.println("Working with matrix: ");
printMatrix(matrix);
for (int[] array : matrix) {
if(array.length != matrix[0].length)
throw new IllegalArgumentException();
}
boolean chiSq = matrix.length != 2 || matrix[0].length != 2;
int[] rows = new int[matrix.length];
int[] columns = new int[matrix[0].length];
int n;
//compute R and C values
for (int i = 0; i < matrix.length; i++) {
for (int j = 0; j < matrix[i].length; j++) {
rows[i] += matrix[i][j];
columns[j] += matrix[i][j];
}
System.out.println("rows[" + i + "] = " + rows[i]);
}
for (int i = 0; i < columns.length; i++) {
System.out.println("columns[" + i + "] = " + columns[i]);
}
//compute n
n = 0;
for (int i = 0; i < columns.length; i++) {
n += columns[i];
}
int[][][] perms = findAllPermutations(rows, columns);
double sum = 0;
//int count = 0;
double cutoff = chiSq ? getChiSquaredValue(matrix, rows, columns, n) : getConditionalProbability(matrix, rows, columns, n);
System.out.println("P cutoff = " + cutoff + "\n");
for (int[][] is : perms) {
System.out.println("Matrix: ");
printMatrix(is);
double val = chiSq ? getChiSquaredValue(is, rows, columns, n) : getConditionalProbability(is, rows, columns, n);
System.out.print("Value: " + val);
if(val <= cutoff){
//count++;
System.out.print(" is below " + cutoff);
// sum += (chiSq) ? getConditionalProbability(is, rows, columns, n) : val;
// sum += val;
double p = getConditionalProbability(is, rows, columns, n);
System.out.print("\np = " + p + "\nsum = " + sum + " + p = ");
sum += p;
System.out.print(sum);
} else {
System.out.println(" is above " + cutoff + "\np = " + getConditionalProbability(is, rows, columns, n));
}
System.out.print("\n\n");
}
return sum;
//return count / (double)perms.length;
}
All of the other methods have been tested and debugged. The issue is that I'm not exactly sure where to go from finding all of the possible matrices (all matrices with the same row and column sums). I'm not sure how to take those matrices and turn them into a p value. I read something about chi-squared, so I found a chi-squared algorithm.
So my question is:
From what I have (all the permutations of the matrix), how do I calculate the p value?
All of my attempts are either in the last for loops or commented out of the last for loop.
Here is the entire code: http://pastie.org/private/f8lga9oj6f8vrxiw348q
edit:
Looking at Wolfram, it seems that the n x m problem can be solved with:
public static BigDecimal getHypergeometricDistribution(//
int a[][], int scale, int roundingMode//
) throws OutOfMemoryError, NullPointerException {
ArrayList<Integer> R = new ArrayList<Integer>();
ArrayList<Integer> C = new ArrayList<Integer>();
ArrayList<Integer> E = new ArrayList<Integer>();
int n = 0;
for (int i = 0; i < a.length; i++) {
for (int j = 0; j < a[i].length; j++) {
if (a[i][j] < 0)
return null;
n += a[i][j];
add(C, j, a[i][j]);
add(R, i, a[i][j]);
E.add(a[i][j]);
}
}
BigDecimal term1 = //
new BigDecimal(multiplyFactorials(C).multiply(multiplyFactorials(R)));
BigDecimal term2 = //
new BigDecimal(getFactorial(n).multiply(multiplyFactorials(E)));
return term1.divide(term2, scale, roundingMode);
}
For getBinomialCoefficient, getFactorial and comments, check out my gist.
Factorials grow very quickly, for example:
A long can store the first 20 factorial values.
A double can store the first 170 factorial values.
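These limits are easy to check with BigInteger, which is also why the method above works with arbitrary-precision types. The following is a standalone sketch and not the getFactorial from the gist:

```java
import java.math.BigInteger;

class FactorialLimits {
    // Exact factorial using arbitrary-precision integers.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        BigInteger longMax = BigInteger.valueOf(Long.MAX_VALUE);
        System.out.println("20! fits in long: " + (factorial(20).compareTo(longMax) <= 0));
        System.out.println("21! fits in long: " + (factorial(21).compareTo(longMax) <= 0));
        // doubleValue() returns Infinity when the magnitude is too large for a double.
        System.out.println("170! as double: " + factorial(170).doubleValue());
        System.out.println("171! as double: " + factorial(171).doubleValue());
    }
}
```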
Wolfram example case:
int[][] a = { { 5, 0 }, { 1, 4 } };
System.out.println(hdMM.getHypergeometricDistribution(a, 60, 6));
would result in:
0.023809523809523809523809523809523809523809523809523809523810
edit 2:
My method is fast but not memory efficient. If the sum of the input matrix elements exceeds 10000, this can be a problem. The reason is the memoization of factorials.
An almost equivalent function in Mathematica, without this problem:
FeT1::usage = "Fisher's exact Test, 1 tailed. For more information:
http://mathworld.wolfram.com/FishersExactTest.html";
FeT1[a_List, nr_Integer: 6] := Module[{},
SumRow[array_] := Total[Transpose[array]];
SumTotal[array_] := Total[Total[array]];
SumColumn[array_] := Total[array];
TF[list_] := Times @@ (list!);
N[(TF[SumColumn[a]]*TF[SumRow[a]])/(SumTotal[a]!* TF[Flatten[a]]), nr]
];
and example usage:
a = {{5, 0}, {1, 4}};
FeT1[a, 59]
would yield
0.023809523809523809523809523809523809523809523809523809523810
Mathematica also has statistical packages available where Fisher's Exact Test is implemented. IMHO writing this in Java can be 20% faster, but effort needed is about 200% and development time is 400%.
Here's the probability equation (in LaTeX format):
The conditional probability of getting the actual matrix given the particular row and column sums is given by
\begin{equation}
\begin{split}
P &=\frac{\prod_{i=1}^r n_{i.}! \prod_{j=1}^c n_{.j}!}{n_{..}! \prod_{i=1}^r \prod_{j=1}^c n_{ij}!}\\
&=\frac{(n_{1.}!n_{2.}! \cdots n_{r.}!)(n_{.1}!n_{.2}! \cdots n_{.c}!)}{n_{..}!\prod_i \prod_j n_{ij}!}
\end{split}
\end{equation}
which is a multivariate generalization of the hypergeometric probability function.
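For reference, the formula can be evaluated exactly in Java with BigInteger factorials. This is a minimal sketch assuming a rectangular table of non-negative counts; the names TableProbability and probability are made up for illustration:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

class TableProbability {
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    // (product of row-sum factorials * product of column-sum factorials)
    // divided by (n! * product of the factorials of every cell).
    static BigDecimal probability(int[][] table, int scale) {
        int[] rows = new int[table.length];
        int[] cols = new int[table[0].length];
        int n = 0;
        BigInteger denominator = BigInteger.ONE;
        for (int i = 0; i < table.length; i++) {
            for (int j = 0; j < table[i].length; j++) {
                rows[i] += table[i][j];
                cols[j] += table[i][j];
                n += table[i][j];
                denominator = denominator.multiply(factorial(table[i][j]));
            }
        }
        denominator = denominator.multiply(factorial(n));

        BigInteger numerator = BigInteger.ONE;
        for (int r : rows) {
            numerator = numerator.multiply(factorial(r));
        }
        for (int c : cols) {
            numerator = numerator.multiply(factorial(c));
        }
        return new BigDecimal(numerator)
                .divide(new BigDecimal(denominator), scale, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        int[][] a = { { 5, 0 }, { 1, 4 } };
        System.out.println(probability(a, 12)); // 1/42, prints 0.023809523810
    }
}
```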
If you use 100,000 iterations, and have smaller tables, say, up to 5x5, you will pretty much be close to convergence of a true exact test.
I have found the answer to my question. After talking with a statistician this morning, he asked me to sum up all of the values and see what came of it. I found that the sum of the values (as expected) was above 1. However, I also found that I could use the sum to scale the p-value to 0
sum of the conditional probability values of the matrices with less than or equal X^2 p-values
DIVIDED BY
sum of all conditional probability values of all matrices
I checked my answer with the R fisher exact test
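For the 2 x 2 case the whole procedure fits in a few lines, because every table with the same row and column sums is determined by its top-left cell. The sketch below sums the probabilities of all tables that are no more likely than the observed one and divides by the sum over all tables. It ranks tables by their probability rather than by X^2, which is the usual two-sided definition and matches what the question's code does for 2 x 2 input; the class name, method names and the 1e-7 tolerance are my own assumptions:

```java
class Fisher2x2 {
    // log of the binomial coefficient C(n, k), computed as a sum of logs.
    static double logChoose(int n, int k) {
        double sum = 0.0;
        for (int i = 1; i <= k; i++) {
            sum += Math.log(n - k + i) - Math.log(i);
        }
        return sum;
    }

    // Probability of the table whose top-left cell is a, given row sums r1, r2
    // and first column sum c1 (hypergeometric distribution).
    static double tableProbability(int a, int r1, int r2, int c1) {
        return Math.exp(logChoose(r1, a) + logChoose(r2, c1 - a) - logChoose(r1 + r2, c1));
    }

    static double twoSidedP(int[][] t) {
        int r1 = t[0][0] + t[0][1];
        int r2 = t[1][0] + t[1][1];
        int c1 = t[0][0] + t[1][0];
        double observed = tableProbability(t[0][0], r1, r2, c1);

        double extreme = 0.0; // tables at most as likely as the observed one
        double total = 0.0;   // all tables with the same margins
        for (int a = Math.max(0, c1 - r2); a <= Math.min(r1, c1); a++) {
            double p = tableProbability(a, r1, r2, c1);
            total += p;
            // Small relative tolerance so ties are not lost to rounding error.
            if (p <= observed * (1 + 1e-7)) {
                extreme += p;
            }
        }
        return extreme / total;
    }

    public static void main(String[] args) {
        int[][] a = { { 5, 0 }, { 1, 4 } };
        System.out.println("p = " + twoSidedP(a));
    }
}
```

When every table is enumerated exactly once the total is 1, so the division only matters as a safeguard against rounding.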
