Error decreases too slowly during neural network backpropagation training - java

I implemented neural network backpropagation in Java, but the result is unsatisfying: the error decreases too slowly. Below is an example of the training output:
epoch:1 current error:0.5051166876846451
epoch:2 current error:0.4982484527652138
epoch:3 current error:0.4965995467118879
epoch:4 current error:0.49585659139683363
epoch:5 current error:0.4953426236386938
epoch:6 current error:0.4948766985413233
epoch:7 current error:0.49441754405152294
epoch:8 current error:0.4939551661406868
epoch:9 current error:0.49348601614718984
epoch:10 current error:0.4930078119902486
epoch:11 current error:0.49251846766886453
Based on this I started to doubt my code and its algorithm. The activation function is the sigmoid. Below is the sample code for the training.
public void learning(int epoch,double learningRateTemp,double desiredErrorTemp,DataSet ds,double momentum){
int processEpoch=0;
double sumSquaredError=0;
DataSetRow dsr;
Connector conTemp;
double sumError=0;
double errorInformation=0;
double activationValue;
double partialDerivative;
do{
processEpoch++;
sumSquaredError=0;
System.out.println("epoch:"+processEpoch);
//data training set
for(int a=0;a<ds.countRows();a++){
dsr=ds.getSpecificRow(a);
sumError=0;
double[]input=dsr.getInput();
double[]output=dsr.getdesiredOutput();
double sumDeltaInput=0;
double weightTempValue=0;
//forward calculation
this.forwardCalculation(input);
//backpropagateofError
//for output unit
for(int k=0;k<NeuralLayers[totalLayer-1].getTotalNode();k++){
activationValue=NeuralLayers[totalLayer-1].getNeuron(k).getValue();
partialDerivative=(activationValue)*(1-activationValue);
Neuron Temp=NeuralLayers[totalLayer-1].getNeuron(k);
errorInformation=(output[k]-Temp.getValue())*partialDerivative;
Temp.SetErrorInformationTerm(errorInformation);
sumError+=Math.pow((output[k]-Temp.getValue()),2);
NeuralLayers[totalLayer-1].setNeuron(k, Temp);
}
//end of output unit
//for hidden Unit
for(int l=totalLayer-2;l>0;l--){
for(int j=1;j<NeuralLayers[l].getTotalNode();j++){
sumDeltaInput=0;
for(int k=0;k<NeuralLayers[l+1].getTotalNode();k++){
conTemp=NeuralLayers[l+1].getConnector(k, j);
if(conTemp.getStatusFrom()==false){
weightTempValue=conTemp.getWeight().getValue();
sumDeltaInput+=(NeuralLayers[l+1].getNeuron(k).GetErrorInformationTerm()*weightTempValue);
}
}
activationValue=NeuralLayers[l].getNeuron(j).getValue();
partialDerivative=(activationValue)*(1-activationValue);
errorInformation= sumDeltaInput*partialDerivative;
Neuron neuTemp=NeuralLayers[l].getNeuron(j);
neuTemp.SetErrorInformationTerm(errorInformation);
NeuralLayers[l].setNeuron(j, neuTemp);
}
}
updateWeight(learningRateTemp,momentum);
sumSquaredError+=sumError;
}
sumSquaredError/=(double)(ds.countRows()*NeuralLayers[totalLayer-1].getTotalNode());
sumSquaredError=Math.sqrt(sumSquaredError);
System.out.println("current error:"+sumSquaredError);
} while(processEpoch<epoch && sumSquaredError>desiredErrorTemp);
}
}
For the forward calculation:
private void forwardCalculation(double[] inputValue){
Connector Contemp;
double SumNodeWeight=0;
int start=1;
int count=0;
setNodeValue(inputValue,0);
do{
count++;
if("output".equals(NeuralLayers[count].statusLayer))
start=0;
else start=1;
//get sum of all input
for(int j=start;j<NeuralLayers[count].getTotalNode();j++){
for(int i=0;i<NeuralLayers[count].sizeConnector(j);i++){
Contemp=NeuralLayers[count].getConnector(j, i);
SumNodeWeight+=Contemp.getCombinedweightInput();
}
SumNodeWeight=(1/(1+Math.exp(-SumNodeWeight)));
NeuralLayers[count].setNeuronValue(j, SumNodeWeight);
SumNodeWeight=0;
}
}while(!"output".equals(NeuralLayers[count].statusLayer));
}
And to update the weights:
private void updateWeight(double learningRateTemp,double momentum){
double newWeight;
double errorInformation;
Connector conTemp;
for(int LayerPosition=totalLayer-1;LayerPosition>0;LayerPosition--){
for(int node=1;node<NeuralLayers[LayerPosition].getTotalNode();node++){
errorInformation=NeuralLayers[LayerPosition].getNeuron(node).GetErrorInformationTerm();
//for bias weight
newWeight=learningRateTemp*errorInformation;
conTemp=NeuralLayers[LayerPosition].getConnector(node, 0);
conTemp.updateWeight(newWeight,false,0);
NeuralLayers[LayerPosition].updateConnector(conTemp, node, 0);
/////////////////////
//for other node weight
for(int From=1;From<NeuralLayers[LayerPosition].sizeConnector(node);From++){
conTemp=NeuralLayers[LayerPosition].getConnector(node, From);
double weightCorrection=learningRateTemp*errorInformation*NeuralLayers[LayerPosition-1].getNeuron(From).getValue();
conTemp.updateWeight(weightCorrection,true,momentum);
NeuralLayers[LayerPosition].updateConnector(conTemp,node,From);
}
}
}
}
Am I on the right track? I have been searching for bugs for a few days and still have found nothing. Is my formula for calculating the error correct? Thank you very much!

Well, I'm no expert on this, nor on Java programming, but this might be affecting it: you declare the variable sumError as 0 at the beginning, then you add the error from the outputs to it, and then in the for loop over the hidden layers it appears again, added to the sumSquaredError variable. If you are going to calculate the training error, why is it inside the "hidden layer loop"?
for(int l=totalLayer-2;l>0;l--){
for(int j=1;j<NeuralLayers[l].getTotalNode();j++){
}
updateWeight(learningRateTemp,momentum);
sumSquaredError+=sumError;
}
Shouldn't it be outside?
I'll refer you to the pseudocode from someone who answered me before:
link
Hope this helps!
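To make the bookkeeping discussed here concrete, below is a minimal sketch of the error measure the question's loop computes: squared differences are summed per training row, added to an epoch total once per row, and normalized once per epoch. The class name EpochError and the array layout (desired[r][k], actual[r][k]) are assumptions for illustration and are not part of the asker's classes.

```java
class EpochError {
    /**
     * Root-mean-square error over all rows and output nodes.
     * desired[r][k] and actual[r][k] are the target and the network output
     * for training row r and output node k (hypothetical layout).
     */
    static double rmse(double[][] desired, double[][] actual) {
        double sumSquaredError = 0;
        int count = 0;
        for (int r = 0; r < desired.length; r++) {
            double rowError = 0; // reset for every training row
            for (int k = 0; k < desired[r].length; k++) {
                double diff = desired[r][k] - actual[r][k];
                rowError += diff * diff;
                count++;
            }
            // added exactly once per row, after all output nodes were visited
            sumSquaredError += rowError;
        }
        // normalize once per epoch, then take the square root
        return Math.sqrt(sumSquaredError / count);
    }

    public static void main(String[] args) {
        double[][] desired = {{1.0}, {0.0}};
        double[][] actual = {{0.0}, {0.0}};
        System.out.println(rmse(desired, actual));
    }
}
```

Computing the measure in a separate method like this makes it easy to check by hand on two or three rows before suspecting the backpropagation formulas themselves.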


Shortest path in Rat in a Maze with option to remove one wall

This is the problem:
You have maps of parts of the space station, each starting at a prison exit and ending at the door to an escape pod. The map is represented as a matrix of 0s and 1s, where 0s are passable space and 1s are impassable walls. The door out of the prison is at the top left (0,0) and the door into an escape pod is at the bottom right (w-1,h-1).
Write a function answer(map) that generates the length of the shortest path from the prison door to the escape pod, where you are allowed to remove one wall as part of your remodeling plans. The path length is the total number of nodes you pass through, counting both the entrance and exit nodes. The starting and ending positions are always passable (0). The map will always be solvable, though you may or may not need to remove a wall. The height and width of the map can be from 2 to 20. Moves can only be made in cardinal directions; no diagonal moves are allowed.
To summarize the problem: it is a simple rat-in-a-maze problem, with the rat starting at (0,0) in the matrix and needing to reach (w-1,h-1). The maze is a matrix of 0s and 1s, where 0 means path and 1 means wall. You may remove one wall (change it from 1 to 0). Find the shortest path.
I've solved the problem, but 3 of 5 test cases fail, and I don't know what those test cases are or why they fail. Any help would be greatly appreciated. Thanks in advance. Here is my code:
import java.util.*;
class Maze{//Each cell in matrix will be this object
Maze(int i,int j){
this.flag=false;
this.distance=0;
this.x=i;
this.y=j;
}
boolean flag;
int distance;
int x;
int y;
}
class Google4_v2{
public static boolean isPresent(int x,int y,int r,int c)
{
if((x>=0&&x<r)&&(y>=0&&y<c))
return true;
else
return false;
}
public static int solveMaze(int[][] m,int x,int y,int loop)
{
int r=m.length;
int c=m[0].length;
int result=r*c;
int min=r*c;
Maze[][] maze=new Maze[r][c];//Array of objects
for(int i=0;i<r;i++)
{
for(int j=0;j<c;j++)
{
maze[i][j]=new Maze(i,j);
}
}
Queue<Maze> q=new LinkedList<Maze>();
Maze start=maze[x][y];
Maze[][] spare=new Maze[r][c];
q.add(start);//Adding source to queue
int i=start.x,j=start.y;
while(!q.isEmpty())
{
Maze temp=q.remove();
i=temp.x;j=temp.y;
int d=temp.distance;//distance of a cell from source
if(i==r-1 &&j==c-1)
{
result=maze[i][j].distance+1;
break;
}
maze[i][j].flag=true;
if(isPresent(i+1,j,r,c)&&maze[i+1][j].flag!=true)//check down of current cell
{
if(m[i+1][j]==0)//if there is path, add it to queue
{
maze[i+1][j].distance+=1+d;
q.add(maze[i+1][j]);
}
if(m[i+1][j]==1 && maze[i+1][j].flag==false && loop==0)//if there is no path, see if breaking the wall gives a path.
{
int test=solveMaze(m,i+1,j,1);
if(test>0)
{
test+=d+1;
min=(test<min)?test:min;
}
maze[i+1][j].flag=true;
}
}
if(isPresent(i,j+1,r,c)&&maze[i][j+1].flag!=true)//check right of current cell
{
if(m[i][j+1]==0)
{
maze[i][j+1].distance+=1+d;
q.add(maze[i][j+1]);
}
if(m[i][j+1]==1 && maze[i][j+1].flag==false && loop==0)
{
int test=solveMaze(m,i,j+1,1);
if(test>0)
{
test+=d+1;
min=(test<min)?test:min;
}
maze[i][j+1].flag=true;
}
}
if(isPresent(i-1,j,r,c)&&maze[i-1][j].flag!=true)//check up of current cell
{
if(m[i-1][j]==0)
{
maze[i-1][j].distance+=1+d;
q.add(maze[i-1][j]);
}
if(m[i-1][j]==1 && maze[i-1][j].flag==false && loop==0)
{
int test=solveMaze(m,i-1,j,1);
if(test>0)
{
test+=d+1;
min=(test<min)?test:min;
}
maze[i-1][j].flag=true;
}
}
if(isPresent(i,j-1,r,c)&&maze[i][j-1].flag!=true)//check left of current cell
{
if(m[i][j-1]==0)
{
maze[i][j-1].distance+=1+d;
q.add(maze[i][j-1]);
}
if(m[i][j-1]==1 && maze[i][j-1].flag==false && loop==0)
{
int test=solveMaze(m,i,j-1,1);
if(test>0)
{
test+=d+1;
min=(test<min)?test:min;
}
maze[i][j-1].flag=true;
}
}
}
return ((result<min)?result:min);
}
public static int answer(int[][] m)
{
int count;
int r=m.length;
int c=m[0].length;
count=solveMaze(m,0,0,0);
return count;
}
public static void main(String[] args)
{
Scanner sc=new Scanner(System.in);
System.out.println("enter row size ");
int m=sc.nextInt();
System.out.println("enter column size ");
int n=sc.nextInt();
int[][] maze=new int[m][n];
System.out.println("Please enter values for maze");
for(int i=0;i<m;i++)
{
for(int j=0;j<n;j++)
{
maze[i][j]=sc.nextInt();
}
}
int d=answer(maze);
System.out.println("The maze can be solved in "+d+" steps");
}
}
Found the problem. maze[i][j].flag=true; needs to be set as soon as the cell is visited, inside the if(m[i+1][j]==0) condition. Otherwise, the distance for the same cell can be incremented by more than one neighboring cell.
Unfortunately it's quite hard to help you because your code is very difficult to read. Most variables have single-character names, which makes it impossible to know what they are supposed to represent. Debugging it would take more effort than most of us are willing to give :-)
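As an illustration of marking a cell the moment it is enqueued, here is a minimal sketch of a different technique from the recursive approach in the question: a single breadth-first search whose state also records whether a wall has already been removed. The names MazeWithOneWall and shortestPath are hypothetical, and the sketch assumes the map contains only 0s and 1s.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class MazeWithOneWall {
    // Row and column offsets for down, right, up, left.
    private static final int[] DR = {1, 0, -1, 0};
    private static final int[] DC = {0, 1, 0, -1};

    /** Returns the number of cells on the shortest path, or -1 if none exists. */
    static int shortestPath(int[][] m) {
        int rows = m.length;
        int cols = m[0].length;
        // visited[r][c][w]: cell (r, c) was reached after removing w walls (0 or 1)
        boolean[][][] visited = new boolean[rows][cols][2];
        Deque<int[]> queue = new ArrayDeque<>();
        queue.add(new int[] {0, 0, 0, 1}); // row, col, walls removed, path length
        visited[0][0][0] = true;
        while (!queue.isEmpty()) {
            int[] cur = queue.remove();
            if (cur[0] == rows - 1 && cur[1] == cols - 1) {
                return cur[3];
            }
            for (int d = 0; d < 4; d++) {
                int nr = cur[0] + DR[d];
                int nc = cur[1] + DC[d];
                if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) {
                    continue;
                }
                int removed = cur[2] + m[nr][nc]; // stepping onto a wall uses the removal
                if (removed > 1 || visited[nr][nc][removed]) {
                    continue;
                }
                visited[nr][nc][removed] = true; // mark when enqueued, not when dequeued
                queue.add(new int[] {nr, nc, removed, cur[3] + 1});
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[][] map = {{0, 1}, {1, 0}};
        System.out.println(shortestPath(map));
    }
}
```

Because every (cell, walls removed) state is marked before it enters the queue, no state is enqueued twice and no distance is accumulated from more than one neighbor.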
I suggest you go about debugging your code as follows:
Split your solveMaze method into a number of smaller methods that each perform much simpler functions. For example, you have very similar code repeated 4 times for each direction. Work to get that code in a single method which can be called 4 times. Move your code to create the array into a new method. Basically each method should do one simple thing. This approach makes it much easier to find problems when they arise.
Write unit tests to ensure each of those methods does exactly what you expect before attempting to calculate the answer for entire mazes.
Once all the methods are working correctly, generate some mazes starting from very simple cases to very complex cases.
When a case fails, use an interactive debugger to walk through your code and see where it is going wrong.
Good luck.
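As a rough illustration of the first two suggestions, the four near-identical direction blocks could collapse into one small helper that is easy to test in isolation. This is only a sketch; GridNeighbors, inBounds and neighbors are hypothetical names, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

class GridNeighbors {
    // Down, right, up, left, in the same order the question checks them.
    private static final int[][] DIRECTIONS = {{1, 0}, {0, 1}, {-1, 0}, {0, -1}};

    static boolean inBounds(int row, int col, int rows, int cols) {
        return row >= 0 && row < rows && col >= 0 && col < cols;
    }

    /** Returns the in-bounds neighbors of (row, col) as {row, col} pairs. */
    static List<int[]> neighbors(int row, int col, int rows, int cols) {
        List<int[]> result = new ArrayList<>();
        for (int[] direction : DIRECTIONS) {
            int nextRow = row + direction[0];
            int nextCol = col + direction[1];
            if (inBounds(nextRow, nextCol, rows, cols)) {
                result.add(new int[] {nextRow, nextCol});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A corner cell of a 2x2 grid has two neighbors.
        System.out.println(neighbors(0, 0, 2, 2).size());
    }
}
```

A helper like this can be verified on corner, edge and interior cells before any path-finding logic is involved.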

Displaying an ArrayList of objects containing one attribute whose value is greater than a certain value

I am trying to display an ArrayList of mountain-climbing recordings where the mountain height is greater than 5K.
I am invoking the object method getheight() from another class, but every time I try to compile I am told that double cannot be dereferenced. What am I doing wrong here? Is there a better way to display an ArrayList of objects containing an attribute whose value is greater than a certain number in Java? I have a feeling that I am close to, yet still far from, the target. Any tips?
public void Displayrecording()
{
double highestheights;
for(int i=0; i< 5; i++)
if(highestheights.getheight() > 5)
{
System.out.print(records.get(i));
}
}
A double is just a number. It does not have a height, or anything else. It's just a number.
So all you need is highestheights > 5.
If getheight is supposed to give you the altitude, it looks like you're looking for something like:
public void Displayrecording()
{
double highestheights;
for(int i = 0; i < 5; i++) {
highestheights = /*maybe some static class here*/getheight(/*maybe some parameter here*/);
if(highestheights > 5000)
{
System.out.print(records.get(i));
}
}
}
Generally your code is missing an array with mountain names or ids.
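For comparison, here is a minimal sketch of filtering a list of objects by one attribute. Everything in it is an assumption for illustration: the Climb class, its fields, the getter names and the sample heights are hypothetical, since the question does not show the record class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record type; the original question does not show its class.
class Climb {
    private final String mountain;
    private final double heightMeters;

    Climb(String mountain, double heightMeters) {
        this.mountain = mountain;
        this.heightMeters = heightMeters;
    }

    String getMountain() {
        return mountain;
    }

    double getHeight() {
        return heightMeters;
    }
}

class ClimbFilter {
    /** Returns the records whose height is strictly greater than the threshold. */
    static List<Climb> higherThan(List<Climb> records, double minHeightMeters) {
        List<Climb> result = new ArrayList<>();
        for (Climb climb : records) {
            // getHeight() is called on the object, not on a double
            if (climb.getHeight() > minHeightMeters) {
                result.add(climb);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Climb> records = new ArrayList<>();
        records.add(new Climb("Peak A", 6200.0)); // made-up sample data
        records.add(new Climb("Peak B", 4300.0));
        for (Climb climb : higherThan(records, 5000.0)) {
            System.out.println(climb.getMountain());
        }
    }
}
```

The key difference from the code in the question is that the getter is invoked on each element of the list, and the loop runs over the list's actual size instead of a fixed count of 5.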

K-means|| aka K-Means++ Scalable - Problems with implementation

EDIT: Code updated, Comments, performance info
I'm trying to code K-means|| in Java. (http://vldb.org/pvldb/vol5/p622_bahmanbahmani_vldb2012.pdf)
However, it doesn't work well. I wasn't surprised that the running time increases compared to standard K-means. What puzzles me much more is why the detection rate of my program trained with K-means|| is lower than with standard K-means. How could chosen cluster points be worse than ones picked by chance?
UPDATE: I found some errors while my internet was down; k-means|| now performs as well as standard k-means, but not a bit better.
I'm quite sure that my code is wrong, but after hours of searching I have no idea where I made a mistake (frankly, I'm quite new to this topic).
So I hope you see what I've done wrong. Here is my code for both seeding options:
public void training(int stop, int numberIt, double epsilon, boolean advanced){
double d=Double.MAX_VALUE,s=0;
int nearestprototype=0;
int [] myprototype=new int[trainingsSet.size()];
Random random=new Random();
//
long t1=System.currentTimeMillis();
if(!advanced){//standard random k-means seeding; random datapoints are choosen as prototypes
for(int i=0; i<k; i++){
int rand = random.nextInt(trainingsSet.size());
prototypes[i]=trainingsSet.getVectorAtIndex(rand);
}
}else{ //state-of-the-art k-means|| a.k.a k-means++ scalable seeding; explanation here: http://vldb.org/pvldb/vol5/p622_bahmanbahmani_vldb2012.pdf
prototypes[0]=trainingsSet.getVectorAtIndex(random.nextInt(trainingsSet.size())); //first protoype, chosen randomly
Vector<DataVector>kproto=new Vector<DataVector>(); //saves the prototypes
kproto.add(prototypes[0]);
for(int i=0;i<trainingsSet.size();i++){ //gets distance to all data points, sum it up
s+=trainingsSet.getVectorAtIndex(i).distance2(kproto.elementAt(0));
}
double it=Math.floor(Math.log(s)); // calculates how often the loop for step 4 and 5 is executed
for(int c=0; c<it; c++){
int[]psi=new int[trainingsSet.size()];//saves minimum distance to a protoype for every element
for(int i=0; i<trainingsSet.size();i++){
double min=Double.POSITIVE_INFINITY;
for(int j=0;j<kproto.size();j++){
double dist=trainingsSet.getVectorAtIndex(i).distance2(kproto.elementAt(j));
if(min>dist){
min=dist;
}
}
psi[i]=(int) min;
}
double phi_c=0;
for(int i=0; i<trainingsSet.size();i++)
phi_c+=psi[i]; //sums up squared distances
for(int f=0; f<trainingsSet.size();f++){
double p_x=5*psi[f]/phi_c; //oversampling factor 0.1*k (k is 50 in my case)
if(p_x>random.nextDouble()){
kproto.addElement(trainingsSet.getVectorAtIndex(f));//adds data point to the prototype set with a probability
//depending on its distance to the next prototype
}
}
}
int[]w=new int[kproto.size()]; //every prototype gets a value in w; the value is increased if the prototype has a minimum distance to a data point
for(int i=0; i<trainingsSet.size();i++){
double min=trainingsSet.getVectorAtIndex(i).distance2(kproto.elementAt(0));
if(min==0)
continue;
int index=0;
for(int j=1; j<kproto.size();j++){
double save=trainingsSet.getVectorAtIndex(i).distance2(kproto.elementAt(j));
if(min>save){
min=save;
index=j;
}
}
w[index]++;
}
int[]wtotal=new int[kproto.size()]; //wtotal sums the w values up
for(int i=0;i<kproto.size();i++){
for(int st=0; st<=i;st++){
wtotal[i]+=w[st];
}
}
int[]cselect=new int[k];//cselect saves the final prototypes
int stoppoint=0;
boolean repeat=false; //repeat lets the choosing process repeat if the prototype has already been selected
for(int kk=0;kk<k;kk++){
do{
repeat=false;
int stopper=random.nextInt(wtotal[kproto.size()-1]);//randomly choose a int and check in which interval it lies
for(int st=wtotal.length-1;st>=0;st--){
if(stopper>=wtotal[st]){
stoppoint=wtotal.length-st-1;
break;
}
}
for(int i=0; i<kk;i++){
if(cselect[i]==stoppoint)
repeat=true;
}
}while(repeat);
//are all prototypes overwritten?
prototypes[kk]=kproto.get(stoppoint);//the number of the interval is connected to a prototype; the prototype is added to the final set of prototypes "prototypes"
cselect[kk]=stoppoint;
}
}
long t2=System.currentTimeMillis();
System.out.println(advanced+" Init time: "+(t2-t1));
The performance shows that both options (standard, k-means||) reach the same level of correct clustering (around 85%). However, the running time for initialisation differs.
The seeding is almost immediate for standard k-means, whereas k-means|| needs 600-900 ms (for 1000 data points). The convergence afterwards with standard maximization/expectation needs the same time for both (around 1900-2500 ms). This is irritating because k-means|| should converge much faster.
I hope you can spot some error, or explain to me whether I expect something that k-means|| cannot deliver.
Thanks for your help!
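One step in the code above that is easy to check in isolation is the mapping from a random integer to a candidate index ("randomly choose a int and check in which interval it lies"). Below is a minimal sketch of one common way to do this with cumulative weights, so that index i is chosen with probability proportional to its weight. The names WeightedPick, cumulativeSums and pickIndex are hypothetical, and the sketch is not claimed to be the reclustering step described in the paper.

```java
import java.util.Random;

class WeightedPick {
    /** Running totals of the weights, e.g. {2, 0, 3} becomes {2, 2, 5}. */
    static int[] cumulativeSums(int[] weights) {
        int[] totals = new int[weights.length];
        int running = 0;
        for (int i = 0; i < weights.length; i++) {
            running += weights[i];
            totals[i] = running;
        }
        return totals;
    }

    /**
     * Returns the first index i with r < cumulative[i].
     * r must lie in [0, cumulative[cumulative.length - 1]).
     */
    static int pickIndex(int[] cumulative, int r) {
        for (int i = 0; i < cumulative.length; i++) {
            if (r < cumulative[i]) {
                return i;
            }
        }
        throw new IllegalArgumentException("r out of range: " + r);
    }

    public static void main(String[] args) {
        int[] cumulative = cumulativeSums(new int[] {2, 0, 3});
        Random random = new Random();
        int r = random.nextInt(cumulative[cumulative.length - 1]);
        System.out.println("picked index " + pickIndex(cumulative, r));
    }
}
```

With weights {2, 0, 3}, the values r = 0 and 1 map to index 0 and r = 2, 3 and 4 map to index 2, so an index with weight 0 is never selected. Running the same inputs through the interval lookup in the question is a quick way to see whether it selects indices with the intended probabilities.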

Java TSP 2-opt swap

This is my first year of programming at university, and we have just started using Java. I've already written some code for calculating the "shortest" path through all points, but it has one problem: sometimes the path crosses itself. I have been looking into the 2-opt swap, but have no clue how to implement it in my code. Any help would be awesome. Here is my code for calculating distances between points (cities):
public void calculate(){
Point current = null;
current = points.get(0);
Point nearestPoint = null;
ArrayList<Point> remainingPoints = new ArrayList<Point>(points);
remainingPoints.remove(current);
lines.clear();
while(!remainingPoints.isEmpty()){
double minimumDistance = -1;
for (int i = 0; i < remainingPoints.size(); i ++){
if (minimumDistance == - 1 || current.distance(remainingPoints.get(i)) < minimumDistance){
minimumDistance = current.distance(remainingPoints.get(i));
nearestPoint = remainingPoints.get(i);
}
}
lines.add(new Point[] { current, nearestPoint });
remainingPoints.remove(nearestPoint); // current was already removed; remove the point just visited
current = nearestPoint;
}
lines.add(new Point[] { points.get(0), current });
}
What does it do? Well, it is quite basic. It starts with the first point, then finds the nearest point. The pair is saved in a list called lines. This continues until no points are left. The lines list will then be sorted by distances so we can draw lines between them. My question is: how can I prevent overlapping? See the links below for a better description:
I don't want this
I want this
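For reference, a minimal sketch of a 2-opt pass is shown below. It assumes the tour is stored as an ordered array of {x, y} coordinates that closes back to the first point, which differs from the lines list of point pairs used in the question; the class and method names (TwoOpt, improve, tourLength) are hypothetical.

```java
class TwoOpt {
    static double dist(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }

    /** Length of the closed tour, including the edge from the last point back to the first. */
    static double tourLength(double[][] tour) {
        double total = 0;
        for (int i = 0; i < tour.length; i++) {
            total += dist(tour[i], tour[(i + 1) % tour.length]);
        }
        return total;
    }

    /** Reverses tour[from..to] in place. */
    static void reverse(double[][] tour, int from, int to) {
        while (from < to) {
            double[] tmp = tour[from];
            tour[from] = tour[to];
            tour[to] = tmp;
            from++;
            to--;
        }
    }

    /** Applies improving 2-opt moves until none is left. */
    static void improve(double[][] tour) {
        int n = tour.length;
        boolean improved = true;
        while (improved) {
            improved = false;
            for (int i = 0; i < n - 1; i++) {
                for (int j = i + 2; j < n; j++) {
                    if (i == 0 && j == n - 1) {
                        continue; // these two edges share the first point
                    }
                    double[] a = tour[i], b = tour[i + 1];
                    double[] c = tour[j], d = tour[(j + 1) % n];
                    // replace edges a-b and c-d with a-c and b-d if that is shorter
                    double delta = dist(a, c) + dist(b, d) - dist(a, b) - dist(c, d);
                    if (delta < -1e-9) {
                        reverse(tour, i + 1, j);
                        improved = true;
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        double[][] tour = {{0, 0}, {1, 1}, {1, 0}, {0, 1}}; // corners of a square, visited in a crossing order
        improve(tour);
        System.out.println(tourLength(tour));
    }
}
```

The idea is that two crossing edges can always be replaced by two non-crossing ones that are shorter in total, so repeating the swap until no move improves the tour leaves a path without crossings. The result is still a local optimum, not necessarily the shortest tour.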

Fast interpolation between a collection of points

I've built a model of the solar system in Java. In order to determine the position of a planet, it performs a lot of computations, which give a very exact value. However, I am often satisfied with an approximate position if that makes it faster. Because I'm using it in a simulation, speed is important, as the position of the planet will be requested millions of times.
Currently I try to cache the position of a planet throughout its orbit and then use those coordinates over and over. If a position in between two values is requested I perform a linear interpolation. This is how I store values:
for(int t=0; t<tp; t++) {
listCoordinates[t]=super.coordinates(ti+t);
}
interpolator = new PlanetOrbit(listCoordinates,tp);
PlanetOrbit has the interpolation code:
package cometsim;
import org.apache.commons.math3.util.FastMath;
public class PlanetOrbit {
final double[][] coordinates;
double tp;
public PlanetOrbit(double[][] coordinates, double tp) {
this.coordinates = coordinates;
this.tp = tp;
}
public double[] coordinates(double julian) {
double T = julian % FastMath.floor(tp);
if(coordinates.length == 1 || coordinates.length == 0) return coordinates[0];
if(FastMath.round(T) == T) return coordinates[(int) T];
int floor = (int) FastMath.floor(T);
if(floor>=coordinates.length) floor=coordinates.length-5;
double[] f = coordinates[floor];
double[] c = coordinates[floor+1];
double[] retval = f;
retval[0] += (T-FastMath.floor(T))*(c[0]-f[0]);
retval[1] += (T-FastMath.floor(T))*(c[1]-f[1]);
retval[2] += (T-FastMath.floor(T))*(c[2]-f[2]);
return retval;
}
}
You can think of FastMath as Math but faster. However, this code is not much of a speed improvement over calculating the exact value every time. Do you have any ideas for how to make it faster?
There are a few issues I can see; the main ones are as follows:
PlanetOrbit#coordinates seems to actually change the values in the variable coordinates. As this method is supposed to only interpolate, I expect that your orbit will actually corrupt slightly every time you run through it (because it is a linear interpolation, the orbit will actually degrade towards its centre).
You do the same thing several times; most clearly, T-FastMath.floor(T) occurs 3 separate times in the code.
Not a question of efficiency or accuracy, but the variable and method names are very opaque; use real words for variable names.
My proposed method would be as follows:
public double[] getInterpolatedCoordinates(double julian){ //julian calendar? This variable name needs to be something else, like day, or time, or whatever it actually means
int startIndex=(int)julian;
int endIndex=(startIndex+1>=coordinates.length?1:startIndex+1); //wrap around
double nonIntegerPortion=julian-startIndex;
double[] start = coordinates[startIndex];
double[] end = coordinates[endIndex];
double[] returnPosition= new double[3];
for(int i=0;i< start.length;i++){
returnPosition[i]=start[i]*(1-nonIntegerPortion)+end[i]*nonIntegerPortion;
}
return returnPosition;
}
This avoids corrupting the coordinates array and avoids repeating the same floor several times (1-nonIntegerPortion is still computed several times and could be hoisted out if need be, but I expect profiling will show it isn't significant). However, it does create a new double[] each time, which may be inefficient if you only need the array temporarily. This can be corrected using a store object (an object you used previously but no longer need, usually from the previous loop):
public double[] getInterpolatedCoordinates(double julian, double[] store){
int startIndex=(int)julian;
int endIndex=(startIndex+1>=coordinates.length?1:startIndex+1); //wrap around
double nonIntegerPortion=julian-startIndex;
double[] start = coordinates[startIndex];
double[] end = coordinates[endIndex];
double[] returnPosition= store;
for(int i=0;i< start.length;i++){
returnPosition[i]=start[i]*(1-nonIntegerPortion)+end[i]*nonIntegerPortion;
}
return returnPosition; //store is returned
}
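A short usage sketch of the store pattern follows. It is a simplified stand-in rather than the code above: the class name OrbitCache is hypothetical, it assumes non-negative times, and it wraps with a modulo on the assumption that the cached samples do not repeat the first point at the end.

```java
class OrbitCache {
    // samples[t] holds {x, y, z} at integer time t (hypothetical layout)
    private final double[][] samples;

    OrbitCache(double[][] samples) {
        this.samples = samples;
    }

    /** Writes the interpolated position for a non-negative time into store and returns it. */
    double[] interpolate(double time, double[] store) {
        double whole = Math.floor(time);
        int start = (int) whole % samples.length;
        int end = (start + 1) % samples.length; // wrap around to the first sample
        double fraction = time - whole;
        for (int i = 0; i < 3; i++) {
            // only store is written; the cached samples are never modified
            store[i] = samples[start][i] + fraction * (samples[end][i] - samples[start][i]);
        }
        return store;
    }

    public static void main(String[] args) {
        OrbitCache cache = new OrbitCache(new double[][] {{0, 0, 0}, {10, 20, 30}});
        double[] store = new double[3]; // allocated once, reused on every call
        for (int step = 0; step < 4; step++) {
            double[] position = cache.interpolate(step * 0.25, store);
            System.out.println(position[0] + ", " + position[1] + ", " + position[2]);
        }
    }
}
```

Since the caller owns store, the loop allocates nothing per request. The trade-off is that each call overwrites the previous result, so a position has to be copied if it is needed after the next call.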
