Apache commons-math3 (version 3.6.1) classes such as OLSMultipleLinearRegression and SimpleRegression provide a method that calculates R-squared (calculateRSquared() and getRSquare(), respectively), but I am not able to find any such method for PolynomialCurveFitter.
Right now I am computing it myself as shown below. Is there a method in commons-math that does this?
private PolynomialFunction getPolynomialFitter(List<List<Double>> pointlist) {
    // Fit a degree-2 polynomial to the observed (x, y) points
    final PolynomialCurveFitter fitter = PolynomialCurveFitter.create(2);
    final WeightedObservedPoints obs = new WeightedObservedPoints();
    for (List<Double> point : pointlist) {
        obs.add(point.get(0), point.get(1));
    }
    double[] fit = fitter.fit(obs.toList());
    System.out.printf("\nCoefficients %f, %f, %f", fit[0], fit[1], fit[2]);
    final PolynomialFunction fitted = new PolynomialFunction(fit);
    return fitted;
}
private double getRSquare(PolynomialFunction fitter, List<List<Double>> pointList) {
    final double[] coefficients = fitter.getCoefficients();
    double[] predictedValues = new double[pointList.size()];
    double residualSumOfSquares = 0;
    final DescriptiveStatistics descriptiveStatistics = new DescriptiveStatistics();
    for (int i = 0; i < pointList.size(); i++) {
        predictedValues[i] = predict(coefficients, pointList.get(i).get(0));
        double actualVal = pointList.get(i).get(1);
        residualSumOfSquares += Math.pow(predictedValues[i] - actualVal, 2);
        descriptiveStatistics.addValue(actualVal);
    }
    final double avgActualValues = descriptiveStatistics.getMean();
    double totalSumOfSquares = 0;
    for (int i = 0; i < pointList.size(); i++) {
        // total sum of squares is taken about the mean of the actual values
        totalSumOfSquares += Math.pow(pointList.get(i).get(1) - avgActualValues, 2);
    }
    return 1.0 - (residualSumOfSquares / totalSumOfSquares);
}
final PolynomialFunction polynomial = getPolynomialFitter(trainData);
System.out.printf("\nPolynomialCurveFitter R-Square %f", getRSquare(polynomial, trainData));
This has been answered on the apache-commons mailing list. Cross-posting the answer, which refers to the getPolynomialFitter and getRSquare methods shown above:
"PolynomialCurveFitter" is one of the syntactic sugar/wrapper
classes around the least-squares optimizers. No state is maintained in the (immutable) instance.
Using PolynomialCurveFitter as in getPolynomialFitter above is indeed one of the intended use-cases.
The "predict" method is not shown here, but note that the argument which you called "fitter" in getRSquare is actually a polynomial function:
http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math4/analysis/polynomials/PolynomialFunction.html
Hence:
predictedValues[i] = fitter.value(pointList.get(i).get(0));
But otherwise, yes, the caller is responsible for choosing how to assess the quality of the model.
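For example, getRSquare could then be written directly against the PolynomialFunction (a minimal, untested sketch, with the total sum of squares taken about the mean of the observed values):

private double getRSquare(PolynomialFunction fitted, List<List<Double>> pointList) {
    double residualSumOfSquares = 0;
    final DescriptiveStatistics stats = new DescriptiveStatistics();
    for (List<Double> p : pointList) {
        double predicted = fitted.value(p.get(0));   // no separate predict() helper needed
        residualSumOfSquares += Math.pow(predicted - p.get(1), 2);
        stats.addValue(p.get(1));
    }
    double totalSumOfSquares = 0;
    for (List<Double> p : pointList) {
        totalSumOfSquares += Math.pow(p.get(1) - stats.getMean(), 2);
    }
    return 1.0 - residualSumOfSquares / totalSumOfSquares;
}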
You could directly use the least-squares suite of classes; the "Evaluation" object would then allow you to retrieve various measures of the fit:
http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math4/fitting/leastsquares/LeastSquaresProblem.Evaluation.html
However, they might still not be what you are looking for...
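For illustration, here is a rough, untested sketch of that approach for the same quadratic model, built with the 3.6.1 least-squares classes (LeastSquaresBuilder, LevenbergMarquardtOptimizer); the Optimum returned by the optimizer implements LeastSquaresProblem.Evaluation, so measures such as the RMS residual can be read off it:

import org.apache.commons.math3.fitting.leastsquares.*;
import org.apache.commons.math3.linear.*;
import org.apache.commons.math3.util.Pair;

static double fitQuadraticAndGetRms(double[] x, double[] y) {
    // Model f(c) = c0 + c1*x + c2*x^2 and its Jacobian with respect to the coefficients
    MultivariateJacobianFunction model = point -> {
        double c0 = point.getEntry(0), c1 = point.getEntry(1), c2 = point.getEntry(2);
        RealVector values = new ArrayRealVector(x.length);
        RealMatrix jacobian = new Array2DRowRealMatrix(x.length, 3);
        for (int i = 0; i < x.length; i++) {
            values.setEntry(i, c0 + c1 * x[i] + c2 * x[i] * x[i]);
            jacobian.setEntry(i, 0, 1.0);
            jacobian.setEntry(i, 1, x[i]);
            jacobian.setEntry(i, 2, x[i] * x[i]);
        }
        return new Pair<>(values, jacobian);
    };
    LeastSquaresProblem problem = new LeastSquaresBuilder()
            .start(new double[] {0, 0, 0})
            .model(model)
            .target(y)
            .maxEvaluations(1000)
            .maxIterations(1000)
            .build();
    LeastSquaresOptimizer.Optimum optimum = new LevenbergMarquardtOptimizer().optimize(problem);
    return optimum.getRMS();   // root-mean-square of the residuals at the fitted coefficients
}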
Related
I've recently started the AI-Class at Coursera and I've a question related to my implementation of the gradient descent algorithm.
Here's my current implementation (I actually just "translated" the mathematical expressions into Java code):
public class GradientDescent {

    private static final double TOLERANCE = 1E-11;

    private double theta0;
    private double theta1;

    public double getTheta0() {
        return theta0;
    }

    public double getTheta1() {
        return theta1;
    }

    public GradientDescent(double theta0, double theta1) {
        this.theta0 = theta0;
        this.theta1 = theta1;
    }

    public double getHypothesisResult(double x) {
        return theta0 + theta1 * x;
    }

    private double getResult(double[][] trainingData, boolean enableFactor) {
        double result = 0;
        for (int i = 0; i < trainingData.length; i++) {
            result = (getHypothesisResult(trainingData[i][0]) - trainingData[i][1]);
            if (enableFactor) result = result * trainingData[i][0];
        }
        return result;
    }

    public void train(double learningRate, double[][] trainingData) {
        int iteration = 0;
        double delta0, delta1;
        do {
            iteration++;
            System.out.println("SUBS: " + (learningRate * ((double) 1 / trainingData.length)) * getResult(trainingData, false));
            double temp0 = theta0 - learningRate * (((double) 1 / trainingData.length) * getResult(trainingData, false));
            double temp1 = theta1 - learningRate * (((double) 1 / trainingData.length) * getResult(trainingData, true));
            delta0 = theta0 - temp0;
            delta1 = theta1 - temp1;
            theta0 = temp0;
            theta1 = temp1;
        } while ((Math.abs(delta0) + Math.abs(delta1)) > TOLERANCE);
        System.out.println(iteration);
    }
}
The code works quite well, but only if I choose a very small alpha (here called learningRate). If it's higher than 0.00001, it diverges.
Do you have any suggestions on how to optimize the implementation, or an explanation for the "alpha issue" and a possible solution for it?
Update:
Here's the main including some sample inputs:
private static final double[][] TDATA = {{200, 20000}, {300, 41000}, {900, 141000}, {800, 41000}, {400, 51000}, {500, 61500}};

public static void main(String[] args) {
    GradientDescent gd = new GradientDescent(0, 0);
    gd.train(0.00001, TDATA);
    System.out.println("THETA0: " + gd.getTheta0() + " - THETA1: " + gd.getTheta1());
    System.out.println("PREDICTION: " + gd.getHypothesisResult(300));
}
The mathematical expression of gradient descent is as follows: theta_j := theta_j - alpha * (1/m) * sum_{i=1..m} (h_theta(x^(i)) - y^(i)) * x_j^(i), for j = 0, 1 with x_0^(i) = 1.
To solve this issue, it is necessary to normalize the data with the formula (Xi - mu) / s,
where Xi is the current training value, mu is the mean of the values in the current column, and s is the range of that column (its maximum value minus its minimum value). This brings the training data roughly into the range [-1, 1], which allows higher learning rates to be chosen and lets gradient descent converge faster.
Afterwards, however, it is necessary to denormalize the predicted result.
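A minimal sketch of that normalization, assuming a single feature column stored in a double array:

// Rescale each value with (x - mean) / (max - min), giving values roughly in [-1, 1]
static double[] normalizeColumn(double[] column) {
    double min = column[0], max = column[0], sum = 0;
    for (double v : column) {
        min = Math.min(min, v);
        max = Math.max(max, v);
        sum += v;
    }
    double mean = sum / column.length;
    double range = max - min;   // assumes max > min
    double[] scaled = new double[column.length];
    for (int i = 0; i < column.length; i++) {
        scaled[i] = (column[i] - mean) / range;
    }
    return scaled;
}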
private double getResult(double[][] trainingData, boolean enableFactor) {
    double result = 0;
    for (int i = 0; i < trainingData.length; i++) {
        result = (getHypothesisResult(trainingData[i][0]) - trainingData[i][1]);
        if (enableFactor) result = result * trainingData[i][0];
    }
    return result;
}
In this function the result variable is overwritten on each iteration, so the previous value is lost. Only the last item of the array ends up contributing to the calculation; the rest do not matter. It should accumulate the sum over all training examples, as sketched below.
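A corrected version that accumulates the sum over all training examples (a sketch keeping the original structure):

private double getResult(double[][] trainingData, boolean enableFactor) {
    double result = 0;
    for (int i = 0; i < trainingData.length; i++) {
        // per-example error term (h(x_i) - y_i), optionally multiplied by x_i
        double term = getHypothesisResult(trainingData[i][0]) - trainingData[i][1];
        if (enableFactor) {
            term *= trainingData[i][0];
        }
        result += term;   // accumulate instead of overwriting
    }
    return result;
}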
You should use java.math.BigDecimal for your arithmetic operations; double has rounding-off issues when performing arithmetic.
I am gathering 10 acceleration values from an on-board accelerometer on a mobile device. I am then attempting to normalize these values between the range of -1,1. I am unable to figure out why this isn't working correctly.
Here is the normalization code:
class NormUtil {
    private double dataHigh;
    private double dataLow;
    private double normalizedHigh;
    private double normalizedLow;

    public NormUtil(double dataHigh, double dataLow) {
        this(dataHigh, dataLow, 1, -1);
    }

    public NormUtil(double dataHigh, double dataLow, double normalizedHigh, double normalizedLow) {
        this.dataHigh = dataHigh;
        this.dataLow = dataLow;
        this.normalizedHigh = normalizedHigh;
        this.normalizedLow = normalizedLow;
    }

    public double normalize(double e) {
        return ((e - dataLow) / (dataHigh - dataLow))
                * (normalizedHigh - normalizedLow) + normalizedLow;
    }
}
On a button press, the highest/lowest acceleration values are found in this code:
A = enrolAcc.get(0);
B = enrolAcc.get(0);
for (Float i : enrolAcc) {
    if (i < A) A = i;   // A tracks the lowest value
    if (i > B) B = i;   // B tracks the highest value
}
Once the highest and lowest values are found, a NormUtil instance is created and used to normalize the acceleration values and store them in a new array:
NormUtil norm = new NormUtil(B, A, 1, -1);
for (int j = 0; j < enrolAcc.size(); j++) {
    double start = enrolAcc.get(j);
    double x = norm.normalize(start);
    nAcc[j] = x;
}
This nAcc array is then copied into a string array and joined into a single string to display in a text view. The issue is that the text view always shows the original, non-normalized acceleration values. Here is the code I use for that:
String normD[] = new String[10];
for (int i = 0; i < 10; i++) {
    normD[i] = String.valueOf(nAcc[i]);
}
StringBuilder strBuilder2 = new StringBuilder();
for (int i = 0; i < normD.length; i++) {
    strBuilder2.append(normD[i] + ",");
}
normData = strBuilder.toString();
textNorm.setText("Normalised: " + normData);
So my question is, where am I going wrong with adding the normalized values to the normalized array and is this normalization method correct for what I am trying to achieve? Thanks in advance.
I have all the components, I am just not quite sure how to interpret the result. This is my output:
Theta-->: 0.09604203456288299, 1.1864676227195392
How do I interpret that? What does it mean?
I essentially just modified the example from this description. But I'm not sure if it's really applicable to my problem. I'm trying to perform binary classification on a set of documents. The documents are rendered as bag-of-words style feature vectors of the form:
Example:
Document 1 = ["I", "am", "awesome"]
Document 2 = ["I", "am", "great", "great"]
Dictionary is:
["I", "am", "awesome", "great"]
So the documents as a vector would look like:
Document 1 = [1, 1, 1, 0]
Document 2 = [1, 1, 0, 2]
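For concreteness, such a count vector can be built against the dictionary with a small helper like this (a sketch; toCountVector is a hypothetical name, not part of the code below):

import java.util.Arrays;
import java.util.List;

// Counts how often each dictionary word occurs in the document
static double[] toCountVector(List<String> document, List<String> dictionary) {
    double[] vector = new double[dictionary.size()];
    for (String word : document) {
        int index = dictionary.indexOf(word);
        if (index >= 0) {
            vector[index] += 1.0;
        }
    }
    return vector;
}

// toCountVector(Arrays.asList("I", "am", "great", "great"),
//               Arrays.asList("I", "am", "awesome", "great"))  yields  [1, 1, 0, 2]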
This is my gradient descent code:
public static double [] gradientDescent(final double [] theta_in, final double alpha, final int num_iters, double[][] data )
{
final double m = data.length;
double [] theta = theta_in;
double theta0 = 0;
double theta1 = 0;
for (int i = 0; i < num_iters; i++)
{
final double sum0 = gradientDescentSumScalar0(theta, alpha, data );
final double sum1 = gradientDescentSumScalar1(theta, alpha, data);
theta0 = theta[0] - ( (alpha / m) * sum0 );
theta1 = theta[1] - ( (alpha / m) * sum1 );
theta = new double [] { theta0, theta1 };
}
return theta;
}
//data is the feature vector
//this theta is weight
protected static double [] matrixMultipleHthetaByX( final double [] theta, double[][] data )
{
final double [] vector = new double[ data.length ];
int i = 0;
for (final double [] d : data)
{
vector[i] = (1.0 * theta[0]) + (d[0] * theta[1]);
i++;
} // End of the for //
return vector;
}
protected static double gradientDescentSumScalar0(final double [] theta, final double alpha, double[][] data )
{
double sum = 0;
int i = 0;
final double [] hthetaByXArr = matrixMultipleHthetaByX(theta, data );
for (final double [] d : data)
{
final double X = 1.0;
final double y = d[1];
final double hthetaByX = hthetaByXArr[i];
sum = sum + ( (hthetaByX - y) * X );
i++;
} // End of the for //
return sum;
}
protected static double gradientDescentSumScalar1(final double [] theta, final double alpha, double[][] data )
{
double sum = 0;
int i = 0;
final double [] hthetaByXArr = matrixMultipleHthetaByX(theta, data );
for (final double [] d : data)
{
final double X = d[0];
final double y = d[1];
final double hthetaByX = hthetaByXArr[i];
sum = sum + ( (hthetaByX - y) * X );
i++;
} // End of the for //
return sum;
}
public static double [] batchGradientDescent( double [] weights, double[][] data )
{
/*
* From tex:
* \theta_j := \theta_j - \alpha\frac{1}{m} \sum_{i=1}^m ( h_\theta (x^{(i)})
*/
final double [] theta_in = weights;
double [] theta = gradientDescent(theta_in, alpha, iterations, data );
lastTheta = theta;
System.out.println("Theta-->: " + theta[0] + ", " + theta[1]);
return theta;
}
I call it like this:
final int globoDictSize = globoDict.size(); // number of features
double[] weights = new double[globoDictSize + 1];
for (int i = 0; i < weights.length; i++)
{
//weights[i] = Math.floor(Math.random() * 10000) / 10000;
//weights[i] = randomNumber(0,1);
weights[i] = 0.0;
}
int inputSize = trainingPerceptronInput.size();
double[] outputs = new double[inputSize];
final double[][] a = Prcptrn_InitOutpt.initializeOutput(trainingPerceptronInput, globoDictSize, outputs, LABEL);
for (int p = 0; p < inputSize; p++)
{
Gradient_Descent.batchGradientDescent( weights, a );
}
How can I verify that this code is doing what I want? Shouldn't it be outputting a predicted label or something? I've heard I can also apply an error function to it, such as hinge loss; that would come after the call to batch gradient descent as a separate component, wouldn't it?
Your code is complicated (I used to implement batch gradient descent in Octave, not in OO programming languages). But as far as I can see in your code (and it is common to use this notation), Theta is a parameter vector. After the gradient descent algorithm converges, it returns the optimal Theta vector. After that you can calculate the output for a new example with the formula:
theta_transposed * X,
where theta_transposed is the transpose of theta and X is the vector of input features (with a leading 1 for the intercept term).
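In Java that prediction is just a dot product with a leading 1 for the intercept (a sketch; for classification you would additionally squash and threshold this value, e.g. with a sigmoid, which the linked regression example does not do):

// h(x) = theta0 * 1 + theta1 * x1 + ... = theta_transposed * X with X_0 = 1
static double predict(double[] theta, double[] features) {
    double h = theta[0];   // intercept term
    for (int j = 0; j < features.length; j++) {
        h += theta[j + 1] * features[j];
    }
    return h;
}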
On a side note, the example you referred to is a regression task (it is about linear regression), while the task you describe is a classification problem: instead of predicting some value (a number such as weight, length, or something else), you need to assign a label to the input. It can be solved with lots of different algorithms, but definitely not with the linear regression described in the article you posted.
I also need to mention that it is absolutely not clear what kind of classification you are trying to perform. In your example you have a bag-of-words description (matrices of word counts), but where are the classification labels? Is it multi-output classification? Or just multi-class? Or binary?
I really suggest you take a course on machine learning, maybe on Coursera. This one is good:
https://www.coursera.org/course/ml
It also covers a full implementation of gradient descent.
I know about the Math.sin() and Math.cos() functions, but I'm wondering if there's a way I can create a faster function (or use an already-existing one), given that I don't care about pinpoint accuracy. I'm looking to execute a basic sin or cos calculation and have it perform essentially as fast as possible. Would simply iterating the series (the sigma) a few times be any faster than Math.sin()?
Since you don't care much about accuracy, store the values in a table that is precomputed, or computed only once. This is what I do when I want to avoid calls to Math, which can be expensive when done a lot.
Roughly
public class CosSineTable {
    double[] cos = new double[361];
    double[] sin = new double[361];
    private static CosSineTable table = new CosSineTable();

    private CosSineTable() {
        // precompute one entry per whole degree
        for (int i = 0; i <= 360; i++) {
            cos[i] = Math.cos(Math.toRadians(i));
            sin[i] = Math.sin(Math.toRadians(i));
        }
    }

    public double getSine(int angle) {
        int angleCircle = Math.floorMod(angle, 360);   // handles negative angles too
        return sin[angleCircle];
    }

    public double getCos(int angle) {
        int angleCircle = Math.floorMod(angle, 360);
        return cos[angleCircle];
    }

    public static CosSineTable getTable() {
        return table;
    }
}
I leave the optimization of the loop and methods to you.
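Usage is then a plain table lookup at whole-degree resolution, for example:

double s = CosSineTable.getTable().getSine(30);   // approximately 0.5
double c = CosSineTable.getTable().getCos(60);    // approximately 0.5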
A pre-calculated table's the way to go. Here's an implementation:
static final int precision = 100; // gradations per degree, adjust to suit
static final int modulus = 360 * precision;
static final float[] sin = new float[modulus]; // lookup table

static {
    // a static initializer fills the table
    // in this implementation, units are in degrees
    for (int i = 0; i < sin.length; i++) {
        sin[i] = (float) Math.sin((i * Math.PI) / (precision * 180));
    }
}

// Private function for table lookup
private static float sinLookup(int a) {
    return a >= 0 ? sin[a % (modulus)] : -sin[-a % (modulus)];
}

// These are your working functions:
public static float sin(float a) {
    return sinLookup((int) (a * precision + 0.5f));
}

public static float cos(float a) {
    return sinLookup((int) ((a + 90f) * precision + 0.5f));
}
On my laptop, these were about 6x faster than Math.sin.
I only used one table -- the cost of shifting a cosine into a sine wasn't really discernible.
I used floats, assuming that's what you'll likely use in your calculations, given your preference for performance over precision. It doesn't make much difference here, since the bottleneck is really just the array lookup.
Here are my benchmarks:
public static void main(String[] args) {
int reps = 1<<23;
int sets = 4;
Q.pl(" Trial sinTab cosTab sinLib");
for(int i = 0; i<sets; i++) {
Q.pf("%7d\t%7.2f\t%7.2f\t%7.2f\n", i, testSinTab(reps), testCosTab(reps), testSinLib(reps));
}
}
private static float[] sample(int n) {
Random rand = new Random();
float[] values = new float[n];
for (int i=0; i<n; i++) {
values[i] = 400*(rand.nextFloat()*2-1);
}
return values;
}
private static float testSinTab(int n) {
float[] sample = sample(n);
long time = -System.nanoTime();
for (int i=0; i<n; i++) {
sample[i] = sin(sample[i]);
}
time += System.nanoTime();
return (time/1e6f);
}
private static float testCosTab(int n) {
float[] sample = sample(n);
long time = -System.nanoTime();
for (int i=0; i<n; i++) {
sample[i] = cos(sample[i]);
}
time += System.nanoTime();
return time/1e6f;
}
private static float testSinLib(int n) {
float[] sample = sample(n);
long time = -System.nanoTime();
for (int i=0; i<n; i++) {
sample[i] = (float) Math.sin(sample[i]);
}
time += System.nanoTime();
return time/1e6f;
}
output:
Trial sinTab cosTab sinLib
0 102.51 111.19 596.57
1 93.72 92.20 578.22
2 100.06 107.20 600.68
3 103.65 102.67 629.86
You can try
http://sourceforge.net/projects/jafama/
It uses look-up tables, so it might actually be slower than Math, especially if the tables are often evicted from the CPU cache, but for thousands of successive calls it can be quite a bit faster. It also seems slower during class load (maybe the JIT doesn't kick in yet at that point), so you might want to avoid it in that particular use-case.
I know this question is old, but I think this is the fastest Java sine-table implementation, with a precision of 65536 elements.
public class MathHelper {
    private static double[] a = new double[65536];

    public static final double sin(float f) {
        // 10430.378 is approximately 65536 / (2*pi): it converts radians to a table index;
        // '\uffff' masks the index into the range 0..65535
        return a[(int) (f * 10430.378F) & '\uffff'];
    }

    public static final double cos(float f) {
        // 16384 = 65536 / 4, i.e. a quarter turn: cos(x) = sin(x + pi/2)
        return a[(int) (f * 10430.378F + 16384.0F) & '\uffff'];
    }

    static {
        for (int i = 0; i < 65536; ++i) {
            a[i] = Math.sin((double) i * 3.141592653589793D * 2.0D / 65536.0D);
        }
    }
}
Source: https://github.com/Bukkit/mc-dev/blob/master/net/minecraft/server/MathHelper.java
I created a backpropagation neural network using Matlab. I tried to implement an XOR gate in Matlab, then extract its weights and biases to build the same network in Java. The network consists of 2 input neurons, 2 hidden layers with 2 neurons each, and 1 output neuron. After training the network, I got the following weights and biases:
clear;
clc;
i = [0 0 1 1; 0 1 0 1];
o = [0 1 1 0];
net = newff(i,o,{2,2},{'tansig','logsig','purelin'});
net.IW{1,1} = [
-5.5187 -5.4490;
3.7332 2.7697
];
net.LW{2,1} = [
-2.8093 -3.0692;
-1.6685 6.7527
];
net.LW{3,2} = [
-4.9318 -0.9651
];
net.b{1,1} = [
2.1369;
2.6529
];
net.b{2,1} = [
-0.2274;
-4.9512
];
net.b{3,1} = [
1.4848
];
input = net.IW{1,1};
layer = net.LW{2,1};
output = net.LW{3,2};
biasinput = net.b{1,1};
biaslayer = net.b{2,1};
biasoutput= net.b{3,1};
a = sim(net,i);
a;
I simulated it using 1 and 1 as input and got the following result:
>> f = [1;1]
f =
1
1
>> sim(net,f)
ans =
-0.1639
Then I tried to write simple Java code to compute this neural network. My code:
public class Xor {
//Value of neuron
static double[] neuroninput = new double[2];
static double[] neuronhidden1 = new double[2];
static double[] neuronhidden2 = new double[2];
static double[] neuronoutput = new double[2];
//Weight variable init
//For first hidden layer
static double[] weighthidden11 = new double[2];
static double[] weighthidden12 = new double[2];
//for second hidden layer
static double[] weighthidden21 = new double[2];
static double[] weighthidden22 = new double[2];
//for output layer
static double[] weightoutput = new double[2];
//End of weight variable init
//Bias value input
static double[] biashidden1 = new double[2];
static double[] biashidden2 = new double[2];
static double[] biasoutput = new double[1];
public static void main(String[] args) {
neuroninput[0] = 1;
neuroninput[1] = 1;
weighthidden11[0] = -5.5187;
weighthidden11[1] = -5.4490;
weighthidden12[0] = 3.7332;
weighthidden12[1] = 2.7697;
weighthidden21[0] = -2.8093;
weighthidden21[1] = -3.0692;
weighthidden22[0] = -1.6685;
weighthidden22[1] = 6.7527;
weightoutput[0] = -4.9318;
weightoutput[1] = -0.9651;
biashidden1[0] = 2.1369;
biashidden1[1] = 2.6529;
biashidden2[0] = -0.2274;
biashidden2[1] = -4.9512;
biasoutput[0] = 1.4848;
//Counting each neuron (Feed forward)
neuronhidden1[0] = sigma(neuroninput,weighthidden11,biashidden1[0]);
neuronhidden1[0] = tansig(neuronhidden1[0]);
neuronhidden1[1] = sigma(neuroninput,weighthidden12,biashidden1[1]);
neuronhidden1[1] = tansig(neuronhidden1[1]);
neuronhidden2[0] = sigma(neuronhidden1,weighthidden21,biashidden2[0]);
neuronhidden2[0] = logsig(neuronhidden2[0]);
neuronhidden2[1] = sigma(neuronhidden1,weighthidden22,biashidden2[1]);
neuronhidden2[1] = logsig(neuronhidden2[1]);
neuronoutput[0] = sigma(neuronhidden2,weightoutput,biasoutput[0]);
neuronoutput[0] = purelin(neuronoutput[0]);
System.out.println(neuronoutput[0]);
}
static double tansig(double x) {
double value = 0;
value = (Math.exp(x) - Math.exp(-x)) / (Math.exp(x) + Math.exp(-x));
return value;
}
static double logsig(double x) {
double value = 0;
value = 1 / (1+Math.exp(-x));
return value;
}
static double purelin(double x) {
double value = x;
return value;
}
static double sigma(double[] val, double[] weight, double hidden) {
double value = 0;
for (int i = 0; i < val.length; i++) {
value += (val[i] * weight[i]);
//System.out.println(val[i]);
}
value += hidden;
return value;
}
}
But it produced the following result:
-1.3278721528152158
My question: is there an error or a mistake in how I exported the weight and bias values from Matlab to Java? Or maybe I made a mistake in my Java program? Thank you very much.
I think the problem is the normalization:
http://www.mathworks.com/matlabcentral/answers/14590
If you work with 0/1 inputs, you have to use the normalization function f(x) = 2*x - 1, which transforms the values to the [-1, 1] interval, and then g(x) = (x + 1) / 2 to transform the output back to [0, 1]. Pseudocode:
g( java_net( f(x), f(y) ) ) = matlab_net(x, y)
I tried this with another network and it worked for me.
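Applied to the Java program above, the pseudocode would look roughly like this (computeNetwork is a hypothetical stand-in for the feed-forward code in main):

// f maps a 0/1 input into [-1, 1]; g maps the network output back to [0, 1]
static double f(double x) { return 2 * x - 1; }
static double g(double x) { return (x + 1) / 2; }

// hypothetical wrapper around the existing feed-forward computation
static double predictXor(double x, double y) {
    return g(computeNetwork(f(x), f(y)));
}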
Your problem is most likely related to your Java version of the Matlab sim() command.
It is a complex Matlab command with many settings affecting the architecture of the network being simulated. To make debugging easier, try to implement the sim() command yourself in Matlab. Possibly reduce the number of layers until you have a match in Matlab between the sim() builtin and your own sim version. When that works, convert it to Java.
EDIT:
The reason for re-implementing the sim() function in Matlab is that if you can't implement it there, you won't be able to properly implement it in Java either. Feed-forward networks are quite easy to implement using Matlab vector notation.