Recently I've been brushing up on my machine learning, and as such decided to implement a basic neural network in Java using the back propagation algorithm. I've gone over the maths and checked against various other tutorials, but am still having problems. Apologies for the size of this post.
I'll first let you know the problems I have been testing on, before going into more detail about the algorithm.
Test problem 1:
A single output neuron with linear activation, learning regression of the function x/2 + 2. This works pretty well, but doesn't really use back propagation yet.
The algorithm works and converges to near-zero error (no pic, since I can't post more than 2 links).
Test problem 2:
My next test was to learn the XOR problem. For this, I tried a simple network with 2 input nodes, 2 hidden nodes and 2 output nodes (input nodes only provide the input and are not trained).
The algorithm always gets stuck at an average error of 0.5.
No matter how many epochs I run the algorithm for, all errors seem to converge to this point, and the network performs poorly.
Implementation
To implement the algorithm, I represent nodes as objects, and also have objects to represent activation functions.
public class LogisticActivationFunction implements ActivationFunction {
    @Override
    public double apply(double in) {
        return 1.0 / (1.0 + Math.exp(-in));
    }

    @Override
    public double applyDerivative(double in) {
        double sig = apply(in);
        return sig * (1.0 - sig);
    }
}
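When checking the maths, one quick sanity test is to compare the analytic derivative against a central finite difference. The sketch below is a minimal, self-contained version of that check. The class name `DerivativeCheck`, the helper names, and the step size are assumptions made for illustration and are not part of the original project.

```java
// Hypothetical helper class, not part of the original code base.
class DerivativeCheck {
    static double logistic(double in) {
        return 1.0 / (1.0 + Math.exp(-in));
    }

    // Same formula as applyDerivative in LogisticActivationFunction.
    static double logisticDerivative(double in) {
        double sig = logistic(in);
        return sig * (1.0 - sig);
    }

    // Central finite difference: (f(x + h) - f(x - h)) / (2h).
    static double numericDerivative(double in, double h) {
        return (logistic(in + h) - logistic(in - h)) / (2.0 * h);
    }

    public static void main(String[] args) {
        for (double x : new double[] {-2.0, 0.0, 0.3, 4.0}) {
            double diff = Math.abs(logisticDerivative(x) - numericDerivative(x, 1e-5));
            System.out.println("x = " + x + ", |analytic - numeric| = " + diff);
        }
    }
}
```

If the two values agree to several decimal places, the activation function itself is probably not the source of the problem.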
First, the feed forward process is run like so:
public List<Double> evaluate(List<Double> inputs, boolean training) {
    // Set the weights in the first layer.
    setInputWeights(inputs);
    // Iterate through non-input layers one by one and evaluate.
    NodeLayer previousLayer = layers.get(0);
    for (int layerIndex = 1; layerIndex < layers.size(); layerIndex++) {
        NodeLayer layer = layers.get(layerIndex);
        for (int nodeIndex = 0; nodeIndex < layer.size(); nodeIndex++) {
            Node node = layer.get(nodeIndex);
            evaluateNode(node, previousLayer, training);
        }
        previousLayer = layer;
    }
    return getOutputWeights();
}

private void evaluateNode(Node node, NodeLayer previousLayer, boolean training) {
    double sum = node.getBias();
    // Create sum from all connected nodes.
    for (int link : node.links()) {
        if (training) {
            previousLayer.get(link).registerDownstreamNode(node.getId());
        }
        sum += node.getUpstreamLinkStrength(link) * previousLayer.get(link).getOutput();
    }
    // Apply the activation function.
    double activation = node.getActivation().apply(sum);
    node.setHiddenNode(sum, activation);
}
Next, error values are propagated backwards across the network:
protected void backPropogate(List<Double> correct) {
    //float error = norm(correct, getOutputWeights());
    // Final layer error.
    NodeLayer outputLayer = getOutputLayer();
    List<Double> output = getOutputWeights();
    for (int i = 0; i < outputLayer.size(); i++) {
        // Calculate error on the ith output.
        double error = correct.get(i) - output.get(i);
        System.out.println("error " + i + " = " + error + " = " + correct.get(i) + " - " + output.get(i));
        // Set the delta to the error in dimension i multiplied by the activation derivative of the input.
        Node node = outputLayer.get(i);
        node.setDelta(error * node.getActivation().applyDerivative(node.getInput()));
    }
    NodeLayer layer = outputLayer.getUpstream(this);
    while (layer != getInputLayer()) {
        for (Node node : layer) {
            double sum = 0;
            for (Node downstream : node.downstreamNodes(this, layer)) {
                sum += downstream.getDelta() * downstream.getUpstreamLinkStrength(node.getId());
            }
            node.setDelta(sum * node.getActivation().applyDerivative(node.getInput()));
        }
        layer = layer.getUpstream(this);
    }
}
Finally, weights are updated using gradient descent. Note, I'm using the negative error, so this works by adding the delta * learning rate * output.
private void updateParameters(double learningRate) {
    for (NodeLayer layer : this) {
        if (layer == getInputLayer()) {
            continue;
        }
        for (Node node : layer) {
            double oldBias = node.getBias();
            node.offsetBias(node.getDelta() * learningRate);
            for (Node upstream : node.upstreamNodes(this, layer)) {
                double oldW = node.getUpstreamLinkStrength(upstream);
                node.offsetWeight(upstream.getId(), learningRate * node.getDelta() * upstream.getOutput());
            }
        }
    }
}
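One way to test the delta and update logic in isolation is a gradient check on a tiny network. With the squared error E = 0.5 * (target - output)^2 and delta defined as (target - output) * f'(input), the gradient of E with respect to a weight is -delta * upstreamOutput, which is why adding learningRate * delta * output is a descent step. The sketch below compares that analytic gradient with a finite-difference estimate for one hidden weight. It uses plain arrays and arbitrary example weights rather than the Node and NodeLayer classes above; all names and values in it are assumptions for illustration.

```java
// Hypothetical 2-2-1 network with plain arrays; not the Node/NodeLayer classes above.
class GradientCheck {
    // Arbitrary example weights (assumed values).
    double[][] wH = {{0.5, -0.4}, {0.3, 0.8}}; // hidden weights: wH[hidden][input]
    double[] bH = {0.1, -0.2};                 // hidden biases
    double[] wO = {0.7, -0.6};                 // output weights
    double bO = 0.05;                          // output bias

    static double sig(double v) {
        return 1.0 / (1.0 + Math.exp(-v));
    }

    // Forward pass; returns {hiddenOut0, hiddenOut1, output}.
    double[] forward(double[] x) {
        double[] a = new double[3];
        for (int j = 0; j < 2; j++) {
            a[j] = sig(bH[j] + wH[j][0] * x[0] + wH[j][1] * x[1]);
        }
        a[2] = sig(bO + wO[0] * a[0] + wO[1] * a[1]);
        return a;
    }

    // Squared error E = 0.5 * (target - output)^2.
    double loss(double[] x, double target) {
        double y = forward(x)[2];
        return 0.5 * (target - y) * (target - y);
    }

    // dE/dwH[j][i] computed from the deltas, using the sign convention of the post.
    double analyticGradient(double[] x, double target, int j, int i) {
        double[] a = forward(x);
        double deltaOut = (target - a[2]) * a[2] * (1.0 - a[2]);
        double deltaHidden = deltaOut * wO[j] * a[j] * (1.0 - a[j]);
        return -deltaHidden * x[i];
    }

    // Central finite-difference estimate of the same gradient.
    double numericGradient(double[] x, double target, int j, int i, double eps) {
        double saved = wH[j][i];
        wH[j][i] = saved + eps;
        double up = loss(x, target);
        wH[j][i] = saved - eps;
        double down = loss(x, target);
        wH[j][i] = saved; // restore the original weight
        return (up - down) / (2.0 * eps);
    }

    public static void main(String[] args) {
        GradientCheck net = new GradientCheck();
        double[] x = {1.0, 0.0};
        System.out.println("analytic: " + net.analyticGradient(x, 1.0, 0, 0));
        System.out.println("numeric:  " + net.numericGradient(x, 1.0, 0, 0, 1e-5));
    }
}
```

Running the same comparison against the real network (perturb one weight, re-evaluate the error, compare with -delta * upstream output) should show whether the deltas reaching the hidden layer are correct.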
To tie these all together, I use the train method:
public void trainExample(List<Double> inputs, List<Double> correct, double learningRate) {
    System.out.println("training example... " + Data.toString(inputs) + " -> " + Data.toString(correct));
    evaluate(inputs, true);
    backPropogate(correct);
    updateParameters(learningRate);
}
And to do this for a training set I use the following logic:
public List<Double> train(NodeNetwork network, List<List<Double>> trainingInput, List<List<Double>> trainingLabels, double learningRate, int epochs, boolean verbose) {
    List<Double> errorLog = new ArrayList<>();
    for (int i = 0; i < epochs; i++) {
        for (int j = 0; j < trainingInput.size(); j++) {
            int example = random.nextInt(trainingInput.size());
            network.trainExample(trainingInput.get(example), trainingLabels.get(example), learningRate);
        }
        if (verbose) {
            double error = network.checkErrorSet(trainingInput, trainingLabels);
            errorLog.add(error);
            System.out.println(i + " " + error);
        }
    }
    return errorLog;
}
Does anyone have any ideas on how I might go about getting this to work? I've spent the last day doing various checks, and seem to be getting no closer to an answer.
Code is viewable on my github (sami016) which I cannot link due to URL restrictions.
I'd really appreciate if anyone could point me in the right direction. Thanks for your help!
I am trying to save the best solution for my algorithm that prints out a solution for a binary scales problem.
My code saves the best fitness (the lowest number), but it does not save the scales solution (the binary string) that produced that fitness.
public static ScalesSolution RMHC(ArrayList<Double> weights, int iter) {
    if(weights == null || weights.size() == 0 || iter < 1) {
        return null;
    }
    ScalesSolution sol = new ScalesSolution(weights.size());
    ScalesSolution newSol = new ScalesSolution(sol.GetSol());
    double oldSolFitness = sol.ScalesFitness(weights);
    double newSolFitness = 0;
    for(int i = 1; i <= iter; i++) {
        newSol.SmallChange();
        newSolFitness = newSol.ScalesFitness(weights);
        if(newSolFitness < oldSolFitness) {
            oldSolFitness = newSolFitness;
            sol = new ScalesSolution(newSol.GetSol());
        }
        else if(newSolFitness > oldSolFitness) {
            newSolFitness = oldSolFitness;
            sol = new ScalesSolution(newSol.GetSol());
        }
        oldSolFitness = newSolFitness;
        System.out.println(newSol.GetSol() + "; " + newSolFitness);
    }
    return sol;
}
In conclusion I want to save the Solution (binary) of the best Fitness found, instead of saving the last Solution found.
If any more info is needed please message me, thanks in advance!
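In the loop above, the `else if` branch copies `newSol` into `sol` even when the new fitness is worse, while the fitness variable is reset to the old best value. That matches the symptom: the best fitness survives but the stored solution does not. A minimal sketch of a hill climber that only ever replaces the stored solution on improvement is shown below. It uses a boolean array as a hypothetical stand-in for `ScalesSolution`, so `fitness`, `rmhc`, and the class name are assumptions rather than the original API.

```java
import java.util.Random;

// Hypothetical stand-in for ScalesSolution: one boolean per weight picks the pan.
class HillClimbSketch {
    // Absolute difference between the two pans; lower is better.
    static double fitness(boolean[] sol, double[] weights) {
        double left = 0, right = 0;
        for (int i = 0; i < sol.length; i++) {
            if (sol[i]) right += weights[i]; else left += weights[i];
        }
        return Math.abs(left - right);
    }

    static boolean[] rmhc(double[] weights, int iter, Random rng) {
        if (weights == null || weights.length == 0 || iter < 1) {
            return null;
        }
        boolean[] best = new boolean[weights.length];
        for (int i = 0; i < best.length; i++) {
            best[i] = rng.nextBoolean();
        }
        double bestFitness = fitness(best, weights);
        for (int i = 0; i < iter; i++) {
            // Always mutate a fresh copy of the best solution found so far.
            boolean[] candidate = best.clone();
            int flip = rng.nextInt(candidate.length);
            candidate[flip] = !candidate[flip];
            double candidateFitness = fitness(candidate, weights);
            // Keep the candidate only if it is strictly better; otherwise discard it.
            if (candidateFitness < bestFitness) {
                best = candidate;
                bestFitness = candidateFitness;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] weights = {1, 2, 3, 4, 10};
        boolean[] sol = rmhc(weights, 1000, new Random());
        System.out.println(java.util.Arrays.toString(sol) + "; " + fitness(sol, weights));
    }
}
```

In terms of the original code, the equivalent change would be to reset `newSol` to a copy of `sol` when the change is worse, instead of overwriting `sol`.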
Hello. To simulate queue blocking for an M/M/1 system I came up with the solution below. Unfortunately it is not object-oriented, and I also want to simulate an M/M/2 system. For instance, I initialized lambda to 19 and mu to 20 just to ease the calculation. Any solution, hint, or code example would be greatly appreciated.
public class Main {
    public static void main(String[] args) {
        final int MAX_ENTITY = 100000;
        final int SYSTEM_CAPACITY = 5;
        final int BUSY = 1;
        final int IDLE = 0;
        double lambda = 19, mu = 20;
        int blocked = 0;
        int queue_length = 0;
        int server_state = IDLE;
        int entity = 0;
        double next_av = getArivalRand(lambda);
        // The first service time is drawn with the service rate mu, not lambda.
        double next_dp = next_av + getDeparturedRand(mu);
        while (entity <= MAX_ENTITY) {
            //Arrival
            if (next_av <= next_dp) {
                entity++;
                if (server_state == IDLE) {
                    server_state = BUSY;
                } else if (queue_length < SYSTEM_CAPACITY - 1) {
                    queue_length++;
                } else {
                    blocked++;
                }
                next_av += getArivalRand(lambda);
            } // Departure
            else if (queue_length > 0) {
                queue_length--;
                next_dp = next_dp + getDeparturedRand(mu);
            } else {
                server_state = IDLE;
                next_dp = next_av + getDeparturedRand(mu);
            }
        }
        System.out.println("Blocked Entity:" + blocked + "\n");
    }

    public static double getArivalRand(double lambda) {
        return -1 / lambda * Math.log(1 - Math.random());
    }

    public static double getDeparturedRand(double mu) {
        return -1 / mu * Math.log(1 - Math.random());
    }
}
EDIT:
Check here if you are not familiar with queueing theory.
Your code needs serious refactoring in order to achieve M/M/2.
I created a gist file here which I think implements what you wanted.
In the gist file I created a Dispatcher class that balances two queues across two servers, and I simulated it with two seeds. It is a much more object-oriented approach.
Here is an example from the gist file that balances the load of the tasks:
if (server1.getQueueLength() < server2.getQueueLength())
    currentServer = server1;
else if (server1.getQueueLength() > server2.getQueueLength())
    currentServer = server2;
else if (currentServer == server1)
    currentServer = server2;
else
    currentServer = server1;
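For comparison, here is a rough sketch of the same event-driven idea generalized to c servers and a total system capacity K. Note that it uses a single shared queue, which is the textbook M/M/2 arrangement, rather than one queue per server as in the dispatcher snippet. The class and method names are assumptions for illustration. The `blockingProbability` method is the standard birth-death result for an M/M/c/K queue and is included only as a sanity check for the simulated value.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch of an M/M/c/K simulation with one shared queue.
class MmcSimulation {
    // Exponentially distributed sample with the given rate (inverse transform).
    static double sampleExponential(Random rng, double rate) {
        return -Math.log(1.0 - rng.nextDouble()) / rate;
    }

    // Returns the fraction of arrivals that were blocked. Assumes capacity >= servers.
    static double simulate(double lambda, double mu, int servers, int capacity,
                           int maxArrivals, Random rng) {
        double[] busyUntil = new double[servers];
        Arrays.fill(busyUntil, Double.POSITIVE_INFINITY); // infinity marks an idle server
        int waiting = 0, arrivals = 0, blocked = 0;
        double nextArrival = sampleExponential(rng, lambda);
        while (arrivals < maxArrivals) {
            int next = 0; // server with the earliest departure
            for (int s = 1; s < servers; s++) {
                if (busyUntil[s] < busyUntil[next]) next = s;
            }
            if (nextArrival <= busyUntil[next]) { // arrival event
                arrivals++;
                int idle = -1;
                for (int s = 0; s < servers; s++) {
                    if (Double.isInfinite(busyUntil[s])) { idle = s; break; }
                }
                if (idle >= 0) {
                    busyUntil[idle] = nextArrival + sampleExponential(rng, mu);
                } else if (waiting < capacity - servers) {
                    waiting++;
                } else {
                    blocked++;
                }
                nextArrival += sampleExponential(rng, lambda);
            } else { // departure event on server "next"
                double now = busyUntil[next];
                if (waiting > 0) {
                    waiting--;
                    busyUntil[next] = now + sampleExponential(rng, mu);
                } else {
                    busyUntil[next] = Double.POSITIVE_INFINITY;
                }
            }
        }
        return (double) blocked / arrivals;
    }

    // Analytic blocking probability of an M/M/c/K queue, for comparison.
    static double blockingProbability(double lambda, double mu, int servers, int capacity) {
        double a = lambda / mu;
        double term = 1.0, sum = 1.0; // unnormalized state probabilities
        for (int n = 1; n <= capacity; n++) {
            term *= a / Math.min(n, servers);
            sum += term;
        }
        return term / sum;
    }

    public static void main(String[] args) {
        System.out.println("simulated: " + simulate(19, 20, 2, 5, 100000, new Random()));
        System.out.println("analytic:  " + blockingProbability(19, 20, 2, 5));
    }
}
```

Setting `servers` to 1 gives the M/M/1 case from the question, so the same code can be used to cross-check the original simulation.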
I am working on a project for my university and I always get the same error message:
java.lang.NullPointerException
    at Assignment2.ColumnGen$SubProblem.createModel(ColumnGen.java:283)
The problem is in these lines:
double M = 0;
for (int i=0; i<all_customers.size(); i++) {
    for (int j=0; j<all_customers.size(); j++) {
        double val = all_customers.get(i).time_to_node(all_customers.get(j)) + all_customers.get(i).time_at_node();
        if (M<val) M=val;
    }
}
When I delete these lines everything works perfectly, but obviously I do not get the best result from my algorithm because this parameter is missing.
I know what a null pointer exception is, but I have tried everything and I am still missing something.
The other declarations for the things that you see in the code are:
public Map<Integer, Customer> all_customers = new HashMap<Integer, Customer>();

public double a() {
    return ready_time;
}

public double b() {
    return due_date;
}

public Node(int external_id, double x, double y, double t) {
    this.id = all_nodes.size();
    this.id_external = external_id;
    this.xcoord = x;
    this.ycoord = y;
    this.t_at_node = t;
    all_nodes.put(this.id, this);
}

public double time_to_node(Node node_to) {
    return Math.round(Math.sqrt(Math.pow(this.xcoord - node_to.xcoord, 2) + Math.pow(this.ycoord - node_to.ycoord, 2)));
}

public double time_at_node() {
    return t_at_node;
}
What am I doing wrong?
I think your exception comes from all_customers.get(i). Debug your code and make sure all the keys you request are in the map, or add a condition to check whether the map contains your key.
Your problem is that one of your Map.get() operations returns a null. Obviously, one of the keys is missing from your map. You are not showing us how you populate your map, so the problem is not in the code that you are showing us.
Replace the following line of spaghetti code:
double val = all_customers.get(i).time_to_node(all_customers.get(j)) + all_customers.get(i).time_at_node();
with the following block of code:
Customer ci = all_customers.get(i);
assert ci != null : "Not found:" + i;
Customer cj = all_customers.get(j);
assert cj != null : "Not found:" + j;
double val = ci.time_to_node(cj) + ci.time_at_node();
and run your program passing the -enableassertions argument to the VM. (-ea for short.) This should give you a very good hint as to what is going wrong.
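If it turns out that the keys in the map are not exactly 0 to size() - 1 (for example because ids are handed out by a counter shared with other nodes, which is only a guess here), one way to sidestep the lookup entirely is to iterate over the map's values. The sketch below uses a hypothetical, stripped-down `Customer` class with assumed field names, since the real class is not shown.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal stand-in for the Customer/Node classes in the question.
class MaxTimeSketch {
    static class Customer {
        final double x, y, serviceTime;

        Customer(double x, double y, double serviceTime) {
            this.x = x;
            this.y = y;
            this.serviceTime = serviceTime;
        }

        // Rounded Euclidean distance, mirroring time_to_node in the question.
        double timeToNode(Customer other) {
            return Math.round(Math.sqrt(Math.pow(x - other.x, 2) + Math.pow(y - other.y, 2)));
        }
    }

    // Largest travel time plus service time over all ordered pairs of customers.
    static double maxTime(Map<Integer, Customer> customers) {
        double m = 0;
        for (Customer ci : customers.values()) {
            for (Customer cj : customers.values()) {
                double val = ci.timeToNode(cj) + ci.serviceTime;
                if (m < val) m = val;
            }
        }
        return m;
    }

    public static void main(String[] args) {
        Map<Integer, Customer> customers = new HashMap<>();
        // Keys deliberately start at 1, so get(0) would return null here.
        customers.put(1, new Customer(0, 0, 1));
        customers.put(2, new Customer(3, 4, 2));
        System.out.println(maxTime(customers));
    }
}
```

Iterating over `values()` never asks the map for a key it does not contain, so the loop cannot hit a null entry unless a null value was stored explicitly.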
So I have a program written so far that reads in a csv file of cities and distances in the following format:
Alaska Mileage Chart,Anchorage,Anderson,Cantwell,
Anchorage,0,284,210,
Anderson,284,0,74,
Cantwell,210,74,0,
The algorithm works and outputs the cities in the order they should be visited, following the nearest neighbor algorithm and always starting with Anchorage as the city of origin.
Using this data, the example output for the algorithm is: 1,3,2. I have run this with a 27-element chart and had good results as well. I am using this small one for writing and debugging purposes.
Ideally, the output I am looking for is the name of each city and a cumulative mileage.
Right now I am working on getting the cities into an array that I can print out. Help with both parts would be appreciated, as would help that keeps the end goal in mind.
My thought was that ultimately I may want to create an array of {string, int}, so my output would look something like this:
Anchorage 0
Cantwell 210
Anderson 284
I am able to set the first element of the array to 1, but cannot get the 2nd and 3rd elements of the new output array to be correct.
This is the code I am having a problem with:
import java.util.Stack;

public class TSPNearestNeighbor {
    private int numberOfNodes;
    private Stack<Integer> stack;

    public TSPNearestNeighbor()
    {
        stack = new Stack<>();
    }

    public void tsp(int adjacencyMatrix[][])
    {
        numberOfNodes = adjacencyMatrix[1].length;
        // System.out.print(numberOfNodes);
        // System.out.print(Arrays.deepToString(adjacencyMatrix));
        int[] visited = new int[numberOfNodes];
        // System.out.print(Arrays.toString(visited));
        visited[1] = 1;
        // System.out.print(Arrays.toString(visited));
        stack.push(1);
        int element, dst = 0, i;
        int min = Integer.MAX_VALUE;
        boolean minFlag = false;
        System.out.print(1 + "\n");
        //System.arraycopy(arr_cities, 0, arr_final, 0, 1); // Copies Anchorage to Pos 1 always
        //System.out.print(Arrays.deepToString(arr_final)+ "\n");
        while (!stack.isEmpty())
        {
            element = stack.peek();
            i = 1;
            min = Integer.MAX_VALUE;
            while (i <= numberOfNodes-1)
            {
                if (adjacencyMatrix[element][i] > 1 && visited[i] == 0)
                {
                    if (min > adjacencyMatrix[element][i])
                    {
                        min = adjacencyMatrix[element][i];
                        dst = i;
                        minFlag = true;
                    }
                }
                i++;
            }
            if (minFlag)
            {
                visited[dst] = 1;
                stack.push(dst);
                System.out.print(dst + "\n");
                minFlag = false;
                continue;
            }
            stack.pop();
        }
    }
}
Given the existing structure you are using, you can output the cities in the path using:
public void printCities(Stack<Integer> path, int[][] distances, List<String> names) {
    int cumulativeDistance = 0;
    int previous = -1;
    for (int city: path) {
        if (previous != -1)
            cumulativeDistance += distances[previous][city];
        System.out.println(names.get(city) + " " + cumulativeDistance);
        previous = city;
    }
}
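Note that in the question's `tsp` method the stack is popped until it is empty, so the visit order is gone by the time the method returns. A minimal sketch that records the order in a list and then builds the name and cumulative-distance lines is shown below. It assumes a 0-based, purely numeric distance matrix and a separate array of names, which differs from the 1-based indexing in the question; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; assumes dist[i][j] is 0-based and names[i] matches row i.
class NearestNeighborSketch {
    // Visit order starting at city 0, always moving to the closest unvisited city.
    static List<Integer> tour(int[][] dist) {
        int n = dist.length;
        boolean[] visited = new boolean[n];
        List<Integer> order = new ArrayList<>();
        int current = 0;
        visited[0] = true;
        order.add(0);
        for (int step = 1; step < n; step++) {
            int best = -1;
            for (int j = 0; j < n; j++) {
                if (!visited[j] && (best == -1 || dist[current][j] < dist[current][best])) {
                    best = j;
                }
            }
            visited[best] = true;
            order.add(best);
            current = best;
        }
        return order;
    }

    // One "name cumulativeDistance" line per city in visit order.
    static List<String> describe(List<Integer> order, int[][] dist, String[] names) {
        List<String> lines = new ArrayList<>();
        int cumulative = 0;
        for (int k = 0; k < order.size(); k++) {
            if (k > 0) {
                cumulative += dist[order.get(k - 1)][order.get(k)];
            }
            lines.add(names[order.get(k)] + " " + cumulative);
        }
        return lines;
    }

    public static void main(String[] args) {
        int[][] dist = {{0, 284, 210}, {284, 0, 74}, {210, 74, 0}};
        String[] names = {"Anchorage", "Anderson", "Cantwell"};
        for (String line : describe(tour(dist), dist, names)) {
            System.out.println(line);
        }
    }
}
```

With the three-city sample from the question this prints Anchorage 0, Cantwell 210, Anderson 284, one per line.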
I'd like to answer your question slightly indirectly. You are making life hard for yourself by using arrays of objects. They make the code difficult to read and are hard to access. Things would become easier if you create a City class with appropriate methods to help you with the output.
For example:
class City {
    private final String name;
    private final Map<City,Integer> connections = new HashMap<>();

    public static void addConnection(City from, City to, int distance) {
        from.connections.put(to, distance);
        to.connections.put(from, distance);
    }

    public int getDistanceTo(City other) {
        if (connections.containsKey(other))
            return connections.get(other);
        else
            throw new IllegalArgumentException("Non connection error");
    }
}
I've left out constructor, getters, setters for clarity.
Now outputting your path becomes quite a bit simpler:
public void outputPath(List<City> cities) {
    int cumulativeDistance = 0;
    City previous = null;
    for (City current: cities) {
        if (previous != null)
            cumulativeDistance += previous.getDistanceTo(current);
        System.out.println(current.getName() + " " + cumulativeDistance);
        previous = current;
    }
}
I have implemented two algorithms in Java. When testing depth-first search with 12 nodes, it seems to take an incredible amount of time, whereas A* completes in seconds. It's running the search in the background as I type this and has been going for a few minutes.
I wouldn't normally mind, but I've got to test up to 500 nodes, which could take days at this rate. Is this something I should expect, or am I doing something wrong?
Thanks!
import java.util.*;

@SuppressWarnings({ "rawtypes", "unchecked" })
public class DepthFirstSearch {
    Routes distances;
    static Routes routes;
    int firstNode;
    String result = new String();
    ArrayList firstRoute, bestRoute;
    int nodes = 0;
    int routeCost = 0;
    int bestCost = Integer.MAX_VALUE;

    public DepthFirstSearch(Routes matrix, int firstNode) { //new instance
        distances = matrix;
        this.firstNode = firstNode;
    }

    public void run () { //run algorithm
        long startTime = System.nanoTime();
        firstRoute = new ArrayList();
        firstRoute.add(firstNode);
        bestRoute = new ArrayList();
        nodes++;
        System.out.println("Depth First Search\n");
        search(firstNode, firstRoute);
        // Stop the clock only after the search has finished.
        long endTime = System.nanoTime();
        System.out.println(result);
        System.out.println("Visited Nodes: "+nodes);
        System.out.println("\nBest solution: "+bestRoute.toString() + "\nCost: "+bestCost);
        System.out.println("\nElapsed Time: "+(endTime-startTime)+" ns\n");
    }

    /**
     * @param from node where we start the search.
     * @param chosenRoute followed route for arriving to node "from".
     */
    public void search (int from, ArrayList chosenRoute) {
        // we've found a new solution
        if (chosenRoute.size() == distances.getCitiesCount()) {
            chosenRoute.add(firstNode);
            nodes++;
            // update the route's cost
            routeCost += distances.getCost(from, firstNode);
            if (routeCost < bestCost) {
                bestCost = routeCost;
                bestRoute = (ArrayList)chosenRoute.clone();
            }
            result += chosenRoute.toString() + " - Cost: "+routeCost + "\n";
            // update the route's cost (back to the previous value)
            routeCost -= distances.getCost(from, firstNode);
        }
        else {
            for (int to=0; to<distances.getCitiesCount(); to++){
                if (!chosenRoute.contains(to)) {
                    ArrayList increasedRoute = (ArrayList)chosenRoute.clone();
                    increasedRoute.add(to);
                    nodes++;
                    // update the route's cost
                    routeCost += distances.getCost(from, to);
                    search(to, increasedRoute);
                    // update the route's cost (back to the previous value)
                    routeCost -= distances.getCost(from, to);
                }
            }
        }
    }
}
You are not updating chosenRoute correctly: you always add firstNode, with the same value, to your ArrayList. I think you should add the visited node instead.
I will try to check that later
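Separately from the point above, it may help to look at how much work the search in the question does. It enumerates every ordering of the remaining cities, so with 12 cities it builds 11! = 39,916,800 complete routes, and it appends each one to the `result` string. One common mitigation is to abandon a partial route as soon as its cost already reaches the best complete route found so far. The sketch below shows that pruning on a plain `int[][]` cost matrix. It is a minimal illustration under the assumption that all costs are non-negative, not a drop-in replacement for the `Routes` class, and the class and method names are made up.

```java
// Hypothetical sketch of depth-first search with cost-based pruning.
// Assumes non-negative costs and a tour that starts and ends at city 0.
class PrunedDfsSketch {
    static int bestCost;

    static int solve(int[][] cost) {
        bestCost = Integer.MAX_VALUE;
        boolean[] visited = new boolean[cost.length];
        visited[0] = true;
        search(cost, visited, 0, 1, 0);
        return bestCost;
    }

    static void search(int[][] cost, boolean[] visited, int from, int count, int costSoFar) {
        // Prune: this partial route cannot beat the best complete route.
        if (costSoFar >= bestCost) {
            return;
        }
        if (count == cost.length) {
            // All cities visited; close the tour by returning to city 0.
            bestCost = Math.min(bestCost, costSoFar + cost[from][0]);
            return;
        }
        for (int to = 1; to < cost.length; to++) {
            if (!visited[to]) {
                visited[to] = true;
                search(cost, visited, to, count + 1, costSoFar + cost[from][to]);
                visited[to] = false; // backtrack
            }
        }
    }

    public static void main(String[] args) {
        int[][] cost = {
            {0, 10, 15, 20},
            {10, 0, 35, 25},
            {15, 35, 0, 30},
            {20, 25, 30, 0}
        };
        System.out.println("Best cost: " + solve(cost));
    }
}
```

Pruning reduces the work in practice but does not change the factorial worst case, so an exhaustive search is still unlikely to be feasible anywhere near 500 nodes.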