I have a question regarding Genetic Programming. I am going to work on a genetic algorithm for a game called Battleships.
My question is: How would I decide upon a "decision" model for the AI to evolve? And how does that work?
I have read multiple papers and answers that talk about different models in general terms, but I could not find anything specific, which, unfortunately, seems to be exactly what I need to wrap my head around the problem.
I want it to evolve over multiple iterations and "learn" what works best, but I am not sure how to save these "decisions" (I know they go to a file, but how should they be encoded?) in a way that lets the AI react to its previous actions and use information from the current board state.
I have been contemplating a "Tree structure" for the AI to base decisions on, but I don't actually know how to get started.
If someone could point me in the right direction (a link, some pseudo-code, something like that), it would be very much appreciated. I have googled as much as possible and watched multiple YouTube videos on the subject, but I think I just need a little nudge in the right direction.
I may also simply not know what to search for, which would explain why I come up blank on what to implement and how.
ANSWER PART I: The basis for a genetic algorithm is having a group of actors, some of which reproduce. The fittest are chosen for reproduction, and the offspring are copies of the parents that are slightly mutated. It's a pretty simple concept, but to program it you have to have actions that can be randomly chosen and dynamically modified. For the battleship simulation I created a class called a Shooter because it 'shoots' at a position. The assumption here is that the first position has been hit, and the shooter is now trying to sink the battleship.
public class Shooter implements Comparable<Shooter> {
private static final int NUM_SHOTS = 100;
private List<Position> shots;
private int score;
// Make a new set of random shots.
public Shooter newShots() {
shots = new ArrayList<Position>(NUM_SHOTS);
for (int i = 0; i < NUM_SHOTS; ++i) {
shots.add(newShot());
}
return this;
}
// Test this shooter against a ship
public void testShooter(Ship ship) {
score = shots.size();
int hits = 0;
for (Position shot : shots) {
if (ship.madeHit(shot)) {
if (++hits >= ship.getSize())
return;
} else {
score = score - 1;
}
}
}
// get the score of the testShooter operation
public int getScore() {
return score;
}
// compare this shooter to other shooters.
@Override
public int compareTo(Shooter o) {
return score - o.score;
}
// getter
public List<Position> getShots() {
return shots;
}
// reproduce this shooter
public Shooter reproduce() {
Shooter offspring = new Shooter();
offspring.mutate(shots);
return offspring;
}
// mutate this shooter's offspring
private void mutate(List<Position> pShots) {
// copy parent's shots (okay for shallow)
shots = new ArrayList<Position>(pShots);
// 10% new mutations, in random locations
for (int i = 0; i < NUM_SHOTS / 10; i++) {
int loc = (int) (Math.random() * NUM_SHOTS);
shots.set(loc, newShot());
}
}
// make a new random shot, with X and Y each in [-3, 3]
private Position newShot() {
return new Position((int) (Math.random() * 7) - 3, (int) (Math.random() * 7) - 3);
}
}
The idea here is that a Shooter has up to 100 shots, randomly chosen between +-3 in the X and +- 3 in the Y. Yea, 100 shots is overkill, but hey, whatever. Pass a Ship to this Shooter.testShooter and it will score itself, 100 being the best score, 0 being the worst.
This Shooter actor has reproduce and mutate methods that will return an offspring that has 10% of its shots randomly mutated. The general idea is that the best Shooters have 'learned' to shoot their shots in a cross pattern ('+') as quickly as possible, since a ship is oriented in one of four ways (North, South, East, West).
The program that runs the simulation, ShooterSimulation, is pretty simple:
public class ShooterSimulation {
private int NUM_GENERATIONS = 1000;
private int NUM_SHOOTERS = 20;
private int NUM_SHOOTERS_NEXT_GENERATION = NUM_SHOOTERS / 10;
List<Shooter> shooters = new ArrayList<Shooter>(NUM_SHOOTERS);
Ship ship;
public static void main(String... args) {
new ShooterSimulation().run();
}
// do the work
private void run() {
firstGeneration();
ship = new Ship();
for (int gen = 0; gen < NUM_GENERATIONS; ++gen) {
ship.newOrientation();
testShooters();
Collections.sort(shooters);
printAverageScore(gen, shooters);
nextGeneration();
}
}
// make the first generation
private void firstGeneration() {
for (int i = 0; i < NUM_SHOOTERS; ++i) {
shooters.add(new Shooter().newShots());
}
}
// test all the shooters
private void testShooters() {
for (int mIdx = 0; mIdx < NUM_SHOOTERS; ++mIdx) {
shooters.get(mIdx).testShooter(ship);
}
}
// print the average score of all the shooters
private void printAverageScore(int gen, List<Shooter> shooters) {
int total = 0;
for (int i = 0, j = shooters.size(); i < j; ++i) {
total = total + shooters.get(i).getScore();
}
System.out.println(gen + " " + total / shooters.size());
}
// throw away a tenth of the old generation
// replace with offspring of the best fit
private void nextGeneration() {
for (int l = 0; l < NUM_SHOOTERS_NEXT_GENERATION; ++l) {
shooters.set(l, shooters.get(NUM_SHOOTERS - l - 1).reproduce());
}
}
}
The code reads as pseudo-code from the run method: make a firstGeneration then iterate for a number of generations. For each generation, set a newOrientation for the ship, then do testShooters, and sort the results of the test with Collections.sort. printAverageScore of the test, then build the nextGeneration. With the list of average scores you can, cough cough, do an 'analysis'.
A graph of the results looks like this:
As you can see it starts out with pretty low average scores, but learns pretty quickly. However, the orientation of the ship keeps changing, causing some noise in addition to the random component. Every now and again a mutation messes up the group a bit, but less and less as the group improves overall.
Challenges, and the reason for many papers to be sure, is to make more things mutable, especially in a constructive way. For example, the number of shots could be mutable. Or, replacing the list of shots with a tree that branches depending on whether the last shot was a hit or miss might improve things, but it's difficult to say. That's where the 'decision' logic considerations come in. Is it better to have a list of random shots or a tree that decides which branch to take depending on the prior shot? Higher level challenges include predicting what changes will make the group learn faster and be less susceptible to bad mutations.
Finally, consider that there could be multiple groups, one group a battleship hunter and one group a submarine hunter for example. Each group, though made of the same code, could 'evolve' different internal 'genetics' that allow them to specialize for their task.
Anyway, as always, start somewhere simple and learn as you go until you get good enough to go back to reading the papers.
PS> Need this too:
public class Position {
int x;
int y;
Position(int x, int y) {this.x = x; this.y = y;}
@Override
public boolean equals(Object m) {
return (m instanceof Position) && ((Position) m).x == x && ((Position) m).y == y;
}
@Override
public int hashCode() {
return 31 * x + y;
}
// needed by Branch.toString() below to print shots as (x,y)
@Override
public String toString() {
return "(" + x + "," + y + ")";
}
}
UPDATE: Added Ship class, fixed a few bugs:
public class Ship {
List<Position> positions;
// test if a hit was made
public boolean madeHit(Position shot) {
for (Position p: positions) {
if ( p.equals(shot)) return true;
}
return false;
}
// make a new orientation
public int newOrientation() {
positions = new ArrayList<Position>(3);
// make a random ship direction.
int shipInX=0, oShipInX=0 , shipInY=0, oShipInY=0;
int orient = (int) (Math.random() * 4);
if( orient == 0 ) {
oShipInX = 1;
shipInX = (int)(Math.random()*3)-3;
}
else if ( orient == 1 ) {
oShipInX = -1;
shipInX = (int)(Math.random()*3);
}
else if ( orient == 2 ) {
oShipInY = 1;
shipInY = (int)(Math.random()*3)-3;
}
else if ( orient == 3 ) {
oShipInY = -1;
shipInY = (int)(Math.random()*3);
}
// make the positions of the ship
for (int i = 0; i < 3; ++i) {
positions.add(new Position(shipInX, shipInY));
if (orient == 2 || orient == 3)
shipInY = shipInY + oShipInY;
else
shipInX = shipInX + oShipInX;
}
return orient;
}
public int getSize() {
return positions.size();
}
}
I would suggest another approach, based on the likelihood of where a ship can be. I will show an example on a smaller version of the game (the same idea applies to all other versions). In my example the area is 3x3 and there is only one 1x2 ship.
Now you take an empty area and put the ship in all possible positions, counting for each cell of the matrix how many placements cover it. If you do this for a 1x2 ship, you get the following:
1 2 1
1 2 1
1 2 1
The ship can also lie in the other direction (2x1), which gives you the following matrix:
1 1 1
2 2 2
1 1 1
Summing up, you get the matrix of likelihoods:
2 3 2
3 4 3
2 3 2
This means that the most probable location is the middle one (where we have 4). Here is where you should shoot.
Now let's assume you hit a part of the ship. If you recalculate the likelihood matrix, you get:
0 1 0
1 W 1
0 1 0
which tells you there are 4 possible positions for the next shot.
If, for example, you had missed on the previous step, you would get the following matrix:
2 2 2
2 M 2
2 2 2
This is the basic idea. The way you reposition the ships is based on the rules for how ships can be located and on the information you get after each move, which can be miss/hit, or miss/wounded/killed.
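The counting step above can be sketched in Java (a hypothetical helper, assuming a square board and a single straight ship; the class and method names are mine):

```java
public class Likelihood {
    // Slide a straight ship of length shipLen over every legal placement on a
    // size x size board and count, per cell, how many placements cover it.
    // For size = 3 and shipLen = 2 this reproduces the summed matrix above.
    public static int[][] likelihood(int size, int shipLen) {
        int[][] count = new int[size][size];
        for (int r = 0; r < size; r++)            // horizontal placements
            for (int c = 0; c + shipLen <= size; c++)
                for (int k = 0; k < shipLen; k++)
                    count[r][c + k]++;
        for (int r = 0; r + shipLen <= size; r++) // vertical placements
            for (int c = 0; c < size; c++)
                for (int k = 0; k < shipLen; k++)
                    count[r + k][c]++;
        return count;
    }
}
```

After each shot you would rebuild the matrix, skipping placements that contradict the observed misses and hits, and fire at the cell with the highest count.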
ANSWER PART III: As you can see, the Genetic Algorithm is generally not the hard part. Again, it's a simple piece of code that is really meant to exercise another piece of code, the actor. Here, the actor is implemented in a Shooter class. These actors are often modelled in the fashion of Turing Machines, in the sense that the actor has a defined set of outputs for a set of inputs. The GA helps you to determine the optimal configuration of the state table. In the prior answers to this question, the Shooter implemented a probability matrix like what was described by @SalvadorDali in his answer.
Testing the prior Shooter thoroughly, we find that the best it can do is something like:
BEST Ave=5, Min=3, Max=9
Best=Shooter:5:[(1,0), (0,0), (2,0), (-1,0), (-2,0), (0,2), (0,1), (0,-1), (0,-2), (0,1)]
This shows it takes 5 shots on average, 3 at a minimum, and 9 at a maximum to sink a battleship of size 3. The locations of the shots are shown as X/Y coordinate pairs. The question "Can this be done better?" depends on human ingenuity. A Genetic Algorithm can't write new actors for us. I wondered if a decision tree could do better than a probability matrix, so I implemented one to try it out:
public class Branch {
private static final int MAX_DEPTH = 10;
private static final int MUTATE_PERCENT = 20;
private Branch hit;
private Branch miss;
private Position shot;
public Branch() {
shot = new Position(
(int) (Math.random() * 7) - 3,
(int) (Math.random() * 7) - 3
);
}
public Branch(Position shot, Branch hit, Branch miss) {
this.shot = new Position(shot.x, shot.y);
this.hit = null; this.miss = null;
if ( hit != null ) this.hit = hit.clone();
if ( miss != null ) this.miss = miss.clone();
}
public Branch clone() {
return new Branch(shot, hit, miss);
}
public void buildTree(Counter c) {
if ( c.incI1() > MAX_DEPTH ) {
hit = null;
miss = null;
c.decI1();
return;
} else {
hit = new Branch();
hit.buildTree(c);
miss = new Branch();
miss.buildTree(c);
}
c.decI1();
}
public void shoot(Ship ship, Counter c) {
c.incI1();
if ( ship.madeHit(shot)) {
if ( c.incI2() == ship.getSize() ) return;
if ( hit != null ) hit.shoot(ship, c);
}
else {
if ( miss != null ) miss.shoot(ship, c);
}
}
public void mutate() {
if ( (int)(Math.random() * 100.0) < MUTATE_PERCENT) {
shot.x = (int) (Math.random() * 7) - 3;
shot.y = (int) (Math.random() * 7) - 3;
}
if ( hit != null ) hit.mutate();
if ( miss != null ) miss.mutate();
}
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append(shot.toString());
if ( hit != null ) sb.append("h:"+hit.toString());
if ( miss != null ) sb.append("m:"+miss.toString());
return sb.toString();
}
}
The Branch class is a node in a decision tree (ok, maybe poorly named). At every shot, the next branch chosen depends on whether the shot was awarded a hit or not.
The shooter is modified somewhat to use the new decisionTree.
public class Shooter implements Comparable<Shooter> {
private Branch decisionTree;
private int aveScore;
// Make a new random decision tree.
public Shooter newShots() {
decisionTree = new Branch();
Counter c = new Counter();
decisionTree.buildTree(c);
return this;
}
// Test this shooter against a ship
public int testShooter(Ship ship) {
Counter c = new Counter();
decisionTree.shoot(ship, c);
return c.i1;
}
// compare this shooter to other shooters, reverse order
@Override
public int compareTo(Shooter o) {
return o.aveScore - aveScore;
}
// mutate this shooter's offspring
public void mutate(Branch pDecisionTree) {
decisionTree = pDecisionTree.clone();
decisionTree.mutate();
}
// min, max, setters, getters
public int getAveScore() {
return aveScore;
}
public void setAveScore(int aveScore) {
this.aveScore = aveScore;
}
public Branch getDecisionTree() {
return decisionTree;
}
@Override
public String toString() {
StringBuilder ret = new StringBuilder("Shooter:"+aveScore+": [");
ret.append(decisionTree.toString());
return ret.append(']').toString();
}
}
The attentive reader will notice that while the methods themselves have changed, the set of methods a Shooter needs to implement is unchanged from the prior Shooters. This means the main GA simulation has not changed except for one line related to mutations, and that line could probably be improved:
Shooter child = shooters.get(l);
child.mutate( shooters.get(NUM_SHOOTERS - l - 1).getDecisionTree());
A graph of a typical simulation run now looks like this:
As you can see, the final best average score evolved using a Decision Tree is one shot less than the best average score evolved for a Probability Matrix. Also notice that this group of Shooters has taken around 800 generations to train to their optimum, about twice as long as the simpler probability matrix Shooters. The best decision tree Shooter gives this result:
BEST Ave=4, Min=3, Max=6
Best=Shooter:4: [(0,-1)h:(0,1)h:(0,0) ... ]
Here, not only does the average take one shot less, but the maximum number of shots is 1/3 lower than a probability matrix Shooter.
At this point it takes some really smart guys to determine whether this actor has achieved the theoretical optimum for the problem domain, i.e., is this the best you can do trying to sink a ship of size 3? Consider that the answer to that question would become more complex in the real battleship game, which has several different size ships. How would you build an actor that incorporates the knowledge of which of the boats have already been sunk into actions that are randomly chosen and dynamically modified? Here is where understanding Turing Machines, also known as CPUs, becomes important.
PS> You will need this class also:
public class Counter {
int i1;
int i2;
public Counter() {i1=0;i2=0;}
public int incI1() { return ++i1; }
public int incI2() { return ++i2; }
public int decI1() { return --i1; }
public int decI2() { return --i2; }
}
ANSWER PART II: A Genetic Algorithm is not an end unto itself; it is a means to accomplish an end. In the case of this example of battleship, the end is to make the best Shooter. I added a line to the prior version of the program to output the best shooter's shot pattern, and noticed something wrong:
Best shooter = Shooter:100:[(0,0), (0,0), (0,0), (0,-1), (0,-3), (0,-3), (0,-3), (0,0), (-2,-1) ...]
The first three shots in this pattern are at coordinates (0,0), which in this application are guaranteed hits, even though they are hitting the same spot. Hitting the same spot more than once is against the rules in battleship, so this "best" shooter is the best because it has learned to cheat!
So, clearly the program needs to be improved. To do that, I changed the Ship class to return false if a position has already been hit.
public class Ship {
// private class to keep track of hits
private class Hit extends Position {
boolean hit = false;
Hit(int x, int y) {super(x, y);}
}
List<Hit> positions;
// need to reset the hits for each shooter test.
public void resetHits() {
for (Hit p: positions) {
p.hit = false;
}
}
// test if a hit was made, false if shot in spot already hit
public boolean madeHit(Position shot) {
for (Hit p: positions) {
if ( p.equals(shot)) {
if ( p.hit == false) {
p.hit = true;
return true;
}
return false;
}
}
return false;
}
// make a new orientation
public int newOrientation() {
positions = new ArrayList<Hit>(3);
int shipInX=0, oShipInX=0 , shipInY=0, oShipInY=0;
// make a random ship orientation.
int orient = (int) (Math.random() * 4.0);
if( orient == 0 ) {
oShipInX = 1;
shipInX = 0-(int)(Math.random()*3.0);
}
else if ( orient == 1 ) {
oShipInX = -1;
shipInX = (int)(Math.random()*3.0);
}
else if ( orient == 2 ) {
oShipInY = 1;
shipInY = 0-(int)(Math.random()*3.0);
}
else if ( orient == 3 ) {
oShipInY = -1;
shipInY = (int)(Math.random()*3.0);
}
// make the positions of the ship
for (int i = 0; i < 3; ++i) {
positions.add(new Hit(shipInX, shipInY));
if (orient == 2 || orient == 3)
shipInY = shipInY + oShipInY;
else
shipInX = shipInX + oShipInX;
}
return orient;
}
public int getSize() {
return positions.size();
}
}
After I did this, my shooters stopped "cheating", but that got me thinking about the scoring in general. What the prior version of the application was doing was scoring based on how many shots missed, so a shooter could get a perfect score if none of its shots missed. However, that is unrealistic; what I really want is shooters that fire the fewest shots. I changed the shooter to keep track of the average number of shots taken:
public class Shooter implements Comparable<Shooter> {
private static final int NUM_SHOTS = 40;
private List<Position> shots;
private int aveScore;
// Make a new set of random shots.
public Shooter newShots() {
shots = new ArrayList<Position>(NUM_SHOTS);
for (int i = 0; i < NUM_SHOTS; ++i) {
shots.add(newShot());
}
return this;
}
// Test this shooter against a ship
public int testShooter(Ship ship) {
int score = 1;
int hits = 0;
for (Position shot : shots) {
if (ship.madeHit(shot)) {
if (++hits >= ship.getSize())
return score;
}
score++;
}
return score-1;
}
// compare this shooter to other shooters, reverse order
@Override
public int compareTo(Shooter o) {
return o.aveScore - aveScore;
}
... the rest is the same, or getters and setters.
}
I also realized that I had to test each shooter more than once in order to be able to get an average number of shots fired against battleships. For that, I subjected each shooter individually to a test multiple times.
// test all the shooters
private void testShooters() {
for (int i = 0, j = shooters.size(); i<j; ++i) {
Shooter current = shooters.get(i);
int totalScores = 0;
for (int play=0; play<NUM_PLAYS; ++play) {
ship.newOrientation();
ship.resetHits();
totalScores = totalScores + current.testShooter(ship);
}
current.setAveScore(totalScores/NUM_PLAYS);
}
}
Now, when I run the simulation, I get the average of the averages as output. The graph generally looks something like this:
Again, the shooters learn pretty quickly, but it takes a while for random changes to bring the averages down. Now my best Shooter makes a little more sense:
Best=Shooter:6:[(1,0), (0,0), (0,-1), (2,0), (-2,0), (0,1), (-1,0), (0,-2), ...
So, a Genetic Algorithm is helping me to set the configuration of my Shooter, but as another answer here pointed out, good results can be achieved just by thinking about it. Consider that if I have a neural network with 10 settings and 100 possible values for each setting, that's 100^10 = 10^20 possible configurations, and the theory for how those settings should be set may be a little more difficult than battleship shooter theory. In this case, a Genetic Algorithm can help determine optimal settings and test current theory.
Related
So I've built this program to build different staircases. Essentially the problem is: given an integer N, how many different ways can you build a staircase? N is guaranteed to be larger than 3 and smaller than 200. No step can be as large as the step before it (step sizes strictly decrease); otherwise it defeats the purpose of the staircase.
So given N = 3
You can build one staircase: 2 steps and then 1 step following that
Given N = 4
You can build one staircase: 3 steps and then 1 step following that
Given N = 5
You can build two staircases: 3 steps and then 2 steps OR 4 steps and then 1 step.
My method is below and it works, except that its runtime is far too slow. So I was thinking of adding memoization to the method, but to be honest I do not fully understand how to implement that. If I could get some help on how to do so, that'd be great.
public static void main(String [] args)
{
System.out.println(answer(200));
}
public static int answer(int n) {
return bricks(1,n) -1;
}
public static int bricks(int height, int bricksLeft)
{
if(bricksLeft == 0)
{
return 1;
}
else if(bricksLeft < height)
{
return 0;
}
else
{
return bricks(height +1, bricksLeft - height) + bricks(height +1, bricksLeft);
}
}
Overview
So what you have here is a recursive solution. That works well for this type of problem. In this particular recursive solution, your recursive step will be called with the same arguments many times.
One really common optimization pattern for recursive solutions where the same calculation is being made many times is Dynamic Programming. The idea is that instead of doing the same calculation many times, we just cache each calculation the first time we do it. Then every following time, if we need to calculate the exact same value, we can just read the result from the cache.
Solution
With that in mind, this solution should work. It uses exactly the same logic as your original version; it just caches all results for the recursive step in a HashMap so that it never needs to calculate the same thing twice. It also uses a Staircase object to track pairs of (bricks, height), because a HashMap key must be a single object rather than a pair of values.
Just change the variable bricks to whatever value you want to solve for.
public class Staircase {
private static HashMap<Staircase, Integer> cache;
public static void main(String[] args) {
cache = new HashMap<>();
int bricks = 6;
Staircase toBuild = new Staircase(1, bricks);
System.out.println(toBuild.waysToBuild() - 1);
}
public final int height;
public final int bricksLeft;
public Staircase(int height, int bricksLeft) {
this.height = height;
this.bricksLeft = bricksLeft;
}
public int waysToBuild() {
if (cache.containsKey(this)) {
return cache.get(this);
}
int toReturn;
if (bricksLeft == 0) {
toReturn = 1;
} else if (bricksLeft < height) {
toReturn = 0;
} else {
Staircase component1 = new Staircase(height + 1, bricksLeft - height);
Staircase component2 = new Staircase(height + 1, bricksLeft);
toReturn = component1.waysToBuild() + component2.waysToBuild();
}
cache.put(this, toReturn);
return toReturn;
}
@Override
public boolean equals(Object other) {
if (other instanceof Staircase) {
if (height != ((Staircase) other).height) {
return false;
}
if (bricksLeft != ((Staircase) other).bricksLeft) {
return false;
}
return true;
}
return false;
}
@Override
public int hashCode() {
int hash = 5;
hash = 73 * hash + this.height;
hash = 73 * hash + this.bricksLeft;
return hash;
}
}
Analysis
I tested it out and the performance is much faster than your previous version. It computes values up to 200 instantly.
Your original function was O(2^n). That is because we make 2 recursive calls for each value from 1 to n, so the total number of calls is doubled for each time n is incremented.
The Dynamic Programming solution does at most one computation per distinct (height, bricksLeft) pair, so its running time is polynomial in n rather than exponential.
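As an aside, the same count can be computed bottom-up: a staircase is a partition of n into distinct parts with at least two steps, so a classic 1-D partition DP also works. A sketch (class and method names are mine):

```java
public class StaircaseDP {
    // dp[j] = number of ways to write j as a sum of the distinct parts
    // considered so far. Iterating j downward ensures each part is used
    // at most once (distinct step heights).
    public static long staircases(int n) {
        long[] dp = new long[n + 1];
        dp[0] = 1;
        for (int part = 1; part <= n; part++)
            for (int j = n; j >= part; j--)
                dp[j] += dp[j - part];
        return dp[n] - 1; // exclude the trivial one-step "staircase" of height n
    }
}
```

This runs in O(n^2) time and O(n) space, with no recursion at all.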
Additional Reading
Here is some more reading about Dynamic Programming: https://en.wikipedia.org/wiki/Dynamic_programming
Use a small class to hold the pairs (height, bricks), say:
private static class Stairs {
private int height;
private int bricks;
Stairs(int height, int bricks) {
this.height = height; this.bricks = bricks;
}
// equals and hashCode are required so HashMap lookups can match keys
@Override
public boolean equals(Object o) {
return o instanceof Stairs && ((Stairs) o).height == height && ((Stairs) o).bricks == bricks;
}
@Override
public int hashCode() { return 31 * height + bricks; }
}
Then use a global HashMap<Stairs, Integer>, initialized in the main():
map = new HashMap<Stairs, Integer>();
In the bricks() function, check if the solution for a particular (height, bricks) pair is in the map. If yes, just return it from the map via a call to the get() method. Otherwise, do the computation:
Stairs stairsObj = new Stairs(height, bricks);
if(map.get(stairsObj) == null) {
// Put your compute code here
}
Before every return statement in the function, add two additional statements. Something like:
int result = <whatever you are returning right now>;
map.put(stairsObj, result);
return result;
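Assembled, the memoized function might look like the sketch below. I use a List&lt;Integer&gt; key here instead of the Stairs class purely for brevity, since java.util.List already defines equals() and hashCode(); with a custom key class you must override both yourself or the map lookups will never match.

```java
import java.util.*;

public class MemoBricks {
    private static final Map<List<Integer>, Integer> map = new HashMap<>();

    public static int bricks(int height, int bricksLeft) {
        List<Integer> key = Arrays.asList(height, bricksLeft);
        Integer cached = map.get(key);
        if (cached != null) return cached;          // already computed
        int result;
        if (bricksLeft == 0) result = 1;
        else if (bricksLeft < height) result = 0;
        else result = bricks(height + 1, bricksLeft - height)
                    + bricks(height + 1, bricksLeft);
        map.put(key, result);                        // cache before returning
        return result;
    }
}
```

With the cache in place, bricks(1, 200) returns instantly instead of taking exponential time.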
I work on a genetic algorithm for a robotic assembly line balancing problem (assigning assembly operations and robots to stations to minimize the cycle time for a given number of stations). The solution is represented by an ArrayList (configuration) which holds all the operations in sequence, assigned to different stations. Furthermore, I have two more ArrayLists (robotAssignment, operationPartition) which indicate where a new station starts and which robot is assigned to each station. For example, a solution candidate looks like this (configuration, robotAssignment, operationPartition from top to bottom):
Initial cycle time: 50.0
|2|7|3|9|1|5|4|6|8|10|
|2|1|3|2|
|0|2|5|7|
From this solution representation we know that operations 3, 9, and 1 are assigned to the second station and robot 1 is used.
I need to keep track of the station an operation is assigned to. I tried repeatedly to store this in the Operation object itself, but I always ran into problems, so instead I want to write a method that gives me the station index of an operation.
Here is what I have coded so far:
// Get the station of an operation
public int getStation(Operation operation) {
int stationIndex = 0;
int position = configuration.indexOf(operation);
for (int i = 0; i < GA_RALBP.numberOfStations ; i++ ) {
if (i < GA_RALBP.numberOfStations - 1 && operationPartition.get(i) != null) {
if (isBetween(position, (int) operationPartition.get(i), (int) operationPartition.get(i + 1))) {
return stationIndex + 1;
} else {
stationIndex++;
}
}
else if (i >= GA_RALBP.numberOfStations - 1 && operationPartition.get(i) != null) {
if (isBetween(position, (int) operationPartition.get(i), configurationSize())) {
return stationIndex + 1;
}
}
}
return -1;
}
// Check if value x is between values left and right, including left
public static boolean isBetween(int x, int left, int right) {
return left <= x && x < right;
}
However, this does not seem to be (a) very elegant, and (b) if I have to do this for a large number of operations the runtime could become a problem. Does anyone have an idea how to solve this more efficiently?
Why not make the partitioning explicit (replaces your operationPartition) - something like:
Map<Integer, Integer> operationToStationMapping = new HashMap<>();
operationToStationMapping.put(2,0);
operationToStationMapping.put(7,0);
operationToStationMapping.put(3,2);
operationToStationMapping.put(9,2);
operationToStationMapping.put(1,2);
operationToStationMapping.put(5,5);
operationToStationMapping.put(6,7);
operationToStationMapping.put(8,-1);
operationToStationMapping.put(10,-1);
Then getStation() becomes:
public int getStation(int operation) { return operationToStationMapping.get(operation); }
I've got an assignment to write a NIM game with a human player and an AI player. The game is played "misère" (whoever has to pick up the last stick loses). The AI is supposed to use the Minimax algorithm, but it's making moves that make it lose faster, and I can't figure out why. I've been at a dead end for days now.
The point of the Minimax algorithm is to not lose, and if it's in a losing position, to delay losing by as many moves as possible, right?
Consider the following:
NIMBoard board = new NIMBoard(34, 2);
34 = the binary-coded pile sizes (0x22: two 4-bit fields, each holding 2 sticks)
2 = the number of piles
So we start off with this scenario, the * character representing a stick:
Row 0: **
Row 1: **
In this particular board situation, the Minimax algorithm always comes up with the move "Remove 2 sticks from Row 1". This is clearly a bad move as it leaves 2 sticks in row 0, where the human player can then pick 1 stick from row 0 and win the game.
The AI player should choose to pick one stick from either pile. That leaves this for the human player:
Row 0: *
Row 1: **
So no matter which move the human player makes now, when the computer makes the next move after that, the human player will always lose. Clearly a better strategy, but why isn't the algorithm suggesting this move?
public class Minimax
{
public Move nextMove;
public int evaluateComputerMove(NIMBoard board, int depth)
{
int maxValue = -2;
int calculated;
if(board.isFinal())
{
return -1;
}
for(Move n : this.generateSuccessors(board))
{
NIMBoard newBoard = new NIMBoard(board.getPos(), board.getNumPiles());
newBoard.parseMove(n);
calculated = this.evaluateHumanMove(newBoard, depth + 1);
if(calculated > maxValue)
{
maxValue = calculated;
if(depth == 0)
{
System.out.println("Setting next move");
this.nextMove = n;
}
}
}
if(maxValue == -2)
{
return 0;
}
return maxValue;
}
public int evaluateHumanMove(NIMBoard board, int depth)
{
int minValue = 2;
int calculated;
if(board.isFinal())
{
return 1;
}
for(Move n : this.generateSuccessors(board))
{
NIMBoard newBoard = new NIMBoard(board.getPos(), board.getNumPiles());
newBoard.parseMove(n);
calculated = this.evaluateComputerMove(newBoard, depth + 1);
// minValue = Integer.min(this.evaluateComputerMove(newBoard, depth + 1), minValue);
if(calculated < minValue)
{
minValue = calculated;
}
}
if(minValue == 2)
{
return 0;
}
return minValue;
}
public ArrayList<Move> generateSuccessors(NIMBoard start)
{
ArrayList<Move> successors = new ArrayList<Move>();
for(int i = start.getNumPiles() - 1; i >= 0; i--)
{
for(long j = start.getCountForPile(i); j > 0; j--)
{
Move newMove = new Move(i, j);
successors.add(newMove);
}
}
return successors;
}
}
public class NIMBoard
{
/**
* We use 4 bits to store the number of sticks which gives us these
* maximums:
* - 16 piles
* - 15 sticks per pile
*/
private static final int PILE_BIT_SIZE = 4;
private long pos;
private int numPiles;
private long pileMask;
/**
* Instantiate a new NIM board
* @param pos Number of sticks in each pile
* @param numPiles Number of piles
*/
public NIMBoard(long pos, int numPiles)
{
super();
this.pos = pos;
this.numPiles = numPiles;
this.pileMask = (long) Math.pow(2, NIMBoard.PILE_BIT_SIZE) - 1;
}
/**
* Is this an endgame board?
* @return true if there's only one stick left
*/
public boolean isFinal()
{
return this.onePileHasOnlyOneStick();
}
/**
* Figure out if the board has a pile with only one stick in it
* @return true if yes
*/
public boolean onePileHasOnlyOneStick()
{
int count = 0;
for(int i = 0; i < this.numPiles; i++)
{
count += this.getCountForPile(i);
}
if(count > 1)
{
return false;
}
return true;
}
public int getNumPiles()
{
return this.numPiles;
}
public long getPos()
{
return this.pos;
}
public long getCountInPile(int pile)
{
return this.pos & (this.pileMask << (pile * NIMBoard.PILE_BIT_SIZE));
}
public long getCountForPile(int pile)
{
return this.getCountInPile(pile) >> (pile * NIMBoard.PILE_BIT_SIZE);
}
public void parseMove(Move move)
{
this.pos = this.pos - (move.getCount() << (move.getPile() * NIMBoard.PILE_BIT_SIZE));
}
@Override
public String toString()
{
String tmp = "";
for(int i = 0; i < this.numPiles; i++)
{
tmp += "Row " + i + "\t";
for(int j = 0; j < this.getCountForPile(i); j++)
{
tmp += "*";
}
tmp += System.lineSeparator();
}
return tmp.trim();
}
}
The move that you suppose is a better move for the AI is not actually a better move. In that board situation, the human player would take two sticks from Row 1, and the computer is still stuck taking the last stick. That doesn't guarantee your program is working correctly, but I think you should try some different test cases. For example, see what the AI does if you give it the situation where you supposed the human player would lose.
You are not supposed to have a different function for the human player. Assume both players use the best strategy; since you are the one implementing that strategy, the same code should serve both players.
The idea of the algorithm is to assign to the current state a state id equal to the minimum state id that does not appear among the ids of the states you could move to. If you can make a move and reach states with ids 0, 1, and 3, then the current state should have state id 2.
Any losing state should have id 0.
If your current state has state id 0, you lose no matter what move you make. Otherwise you can find a move that takes the board into a state with id 0, which means the other player will lose.
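This minimum-excludant ("mex") rule is exactly the Grundy number from combinatorial game theory. A minimal Java sketch of the idea for normal-play NIM (the class and method names here are illustrative, not from the code above):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class Grundy {
    private final Map<String, Integer> cache = new HashMap<>();

    // State id ("Grundy number") = smallest non-negative integer that does
    // not appear among the ids of the states reachable in one move (the
    // "mex"). A state with id 0 is a loss for the player to move.
    public int grundy(int[] piles) {
        String key = Arrays.toString(piles);
        Integer cached = cache.get(key);
        if (cached != null) return cached;

        Set<Integer> childIds = new HashSet<>();
        for (int p = 0; p < piles.length; p++) {
            for (int take = 1; take <= piles[p]; take++) {
                int[] next = piles.clone();
                next[p] -= take;            // remove 'take' sticks from pile p
                childIds.add(grundy(next));
            }
        }
        int mex = 0;
        while (childIds.contains(mex)) mex++;
        cache.put(key, mex);
        return mex;
    }

    public static void main(String[] args) {
        Grundy g = new Grundy();
        // For normal-play NIM the Grundy number equals the XOR of the pile
        // sizes, so {1, 2, 3} (1 ^ 2 ^ 3 = 0) is a losing position.
        System.out.println(g.grundy(new int[]{1, 2, 3})); // prints 0
    }
}
```

Note that this covers normal play (whoever takes the last stick wins); the misère variant discussed in the question needs a small adjustment near the endgame.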
I'm trying to implement a program to solve the n-puzzle problem.
I have written a simple implementation in Java that has a state of the problem characterized by a matrix representing the tiles. I am also able to auto-generate the graph of all the states giving the starting state. On the graph, then, I can do a BFS to find the path to the goal state.
But the problem is that I run out of memory and I cannot even create the whole graph.
I tried with a 2x2 tiles and it works. Also with some 3x3 (it depends on the starting state and how many nodes are in the graph). But in general this way is not suitable.
So I tried generating the nodes at runtime, while searching. It works, but it is slow (sometimes it still has not finished after several minutes and I terminate the program).
Btw: I give as starting state only solvable configurations and I don't create duplicated states.
So, I cannot create the graph. This leads to my main problem: I have to implement the A* algorithm, and I need the path cost (i.e., for each node, the distance from the starting state), but I think I cannot calculate it at runtime. I need the whole graph, right? Since A* does not follow a BFS exploration of the graph, I don't know how to estimate the distance for each node, and hence I don't know how to perform an A* search.
Any suggestion?
EDIT
State:
private int[][] tiles;
private int pathDistance;
private int misplacedTiles;
private State parent;
public State(int[][] tiles) {
this.tiles = tiles;
pathDistance = 0;
misplacedTiles = estimateHammingDistance();
parent = null;
}
public ArrayList<State> findNext() {
ArrayList<State> next = new ArrayList<State>();
int[] coordZero = findCoordinates(0);
int[][] copy;
if(coordZero[1] + 1 < Solver.SIZE) {
copy = copyTiles();
int[] newCoord = {coordZero[0], coordZero[1] + 1};
switchValues(copy, coordZero, newCoord);
State newState = checkNewState(copy);
if(newState != null)
next.add(newState);
}
if(coordZero[1] - 1 >= 0) {
copy = copyTiles();
int[] newCoord = {coordZero[0], coordZero[1] - 1};
switchValues(copy, coordZero, newCoord);
State newState = checkNewState(copy);
if(newState != null)
next.add(newState);
}
if(coordZero[0] + 1 < Solver.SIZE) {
copy = copyTiles();
int[] newCoord = {coordZero[0] + 1, coordZero[1]};
switchValues(copy, coordZero, newCoord);
State newState = checkNewState(copy);
if(newState != null)
next.add(newState);
}
if(coordZero[0] - 1 >= 0) {
copy = copyTiles();
int[] newCoord = {coordZero[0] - 1, coordZero[1]};
switchValues(copy, coordZero, newCoord);
State newState = checkNewState(copy);
if(newState != null)
next.add(newState);
}
return next;
}
private State checkNewState(int[][] tiles) {
State newState = new State(tiles);
for(State s : Solver.states)
if(s.equals(newState))
return null;
return newState;
}
@Override
public boolean equals(Object obj) {
if(this == null || obj == null)
return false;
if (obj.getClass().equals(this.getClass())) {
for(int r = 0; r < tiles.length; r++) {
for(int c = 0; c < tiles[r].length; c++) {
if (((State)obj).getTiles()[r][c] != tiles[r][c])
return false;
}
}
return true;
}
return false;
}
Solver:
public static final HashSet<State> states = new HashSet<State>();
public static void main(String[] args) {
solve(new State(selectStartingBoard()));
}
public static State solve(State initialState) {
TreeSet<State> queue = new TreeSet<State>(new Comparator1());
queue.add(initialState);
states.add(initialState);
while(!queue.isEmpty()) {
State current = queue.pollFirst();
for(State s : current.findNext()) {
if(s.goalCheck()) {
s.setParent(current);
return s;
}
if(!states.contains(s)) {
s.setPathDistance(current.getPathDistance() + 1);
s.setParent(current);
states.add(s);
queue.add(s);
}
}
}
return null;
}
Basically here is what I do:
- Solver's solve has a SortedSet. Elements (States) are sorted according to Comparator1, which calculates f(n) = g(n) + h(n), where g(n) is the path cost and h(n) is a heuristic (the number of misplaced tiles).
- I give the starting configuration and look for all the successors.
- If a successor has not been already visited (i.e. if it is not in the global set States) I add it to the queue and to States, setting the current state as its parent and parent's path + 1 as its path cost.
- Dequeue and repeat.
I think it should work because:
- I keep all the visited states so I'm not looping.
- Also, there won't be any useless edges, because I immediately store the current node's successors. E.g., if from A I can go to B and C, and from B I could also go to C, there won't be an edge B->C (since the path cost is 1 for each edge, and A->C is cheaper than A->B->C).
- Each time I choose to expand the path with the minimum f(n), according to A*.
But it does not work. Or at least, after a few minutes it still can't find a solution (and I think that is a lot of time in this case).
If I try to create a tree structure before executing A*, I run out of memory building it.
EDIT 2
Here are my heuristic functions:
private int estimateManhattanDistance() {
int counter = 0;
int[] expectedCoord = new int[2];
int[] realCoord = new int[2];
for(int value = 1; value < Solver.SIZE * Solver.SIZE; value++) {
realCoord = findCoordinates(value);
expectedCoord[0] = (value - 1) / Solver.SIZE;
expectedCoord[1] = (value - 1) % Solver.SIZE;
counter += Math.abs(expectedCoord[0] - realCoord[0]) + Math.abs(expectedCoord[1] - realCoord[1]);
}
return counter;
}
private int estimateMisplacedTiles() {
int counter = 0;
int expectedTileValue = 1;
for(int i = 0; i < Solver.SIZE; i++)
for(int j = 0; j < Solver.SIZE; j++) {
if(tiles[i][j] != expectedTileValue)
if(expectedTileValue != Solver.ZERO)
counter++;
expectedTileValue++;
}
return counter;
}
If I use a simple greedy algorithm, they both work: with the Manhattan distance it is really quick (around 500 iterations to find a solution), while with the number of misplaced tiles it takes around 10k iterations. If I use A* (also evaluating the path cost) it is really slow.
Comparators are like that:
public int compare(State o1, State o2) {
if(o1.getPathDistance() + o1.getManhattanDistance() >= o2.getPathDistance() + o2.getManhattanDistance())
return 1;
else
return -1;
}
EDIT 3
There was a little error. I fixed it and now A* works. Or at least, for the 3x3 it finds the optimal solution in only 700 iterations. For the 4x4 it is still too slow. I'll try IDA*, but one question: how long could A* take to find the solution? Minutes? Hours? I left it for 10 minutes and it didn't finish.
There is no need to generate all the state-space nodes to solve a problem using BFS, A*, or any tree search; you just add the states you can reach from the current state to the fringe. That is what the successor function is for.
It is normal for BFS to consume a lot of memory, though I don't know exactly for what n it becomes a problem. Use DFS instead.
For A*, you know how many moves you made to reach the current state, and you can estimate the moves needed to solve the problem simply by relaxing it. For example, you can pretend that any two tiles can swap, and then count the moves needed to solve that relaxed problem. Your heuristic just needs to be admissible, i.e., your estimate must never exceed the actual number of moves needed to solve the problem.
Add a path cost to your state class, and every time you go from a parent state P to a child state C, set c.cost = P.cost + 1. This computes the path cost for every node automatically.
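Put together, A* over an implicitly generated graph needs only a priority queue, a successor function, and the per-node cost computed as above. Here is a generic sketch in Java (the names and the toy problem are illustrative, not from the question's code):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.function.Function;
import java.util.function.Predicate;

public class AStar {
    // Generic A* over an implicitly generated graph: a child's g-cost is the
    // parent's g-cost + 1, so no precomputed graph is needed -- only a
    // successor function and an admissible heuristic.
    public static <S> int solve(S start,
                                Function<S, List<S>> successors,
                                Function<S, Integer> heuristic,
                                Predicate<S> isGoal) {
        Map<S, Integer> bestG = new HashMap<>();
        // Queue entries carry the f-value they were inserted with; when a
        // node's cost improves, a fresh entry is pushed and the stale one is
        // simply outranked.
        PriorityQueue<Object[]> open =
            new PriorityQueue<>(Comparator.comparingInt((Object[] e) -> (Integer) e[1]));
        bestG.put(start, 0);
        open.add(new Object[]{start, heuristic.apply(start)});
        while (!open.isEmpty()) {
            @SuppressWarnings("unchecked")
            S current = (S) open.poll()[0];
            int g = bestG.get(current);
            if (isGoal.test(current)) return g;
            for (S next : successors.apply(current)) {
                int tentative = g + 1;              // c.cost = P.cost + 1
                if (tentative < bestG.getOrDefault(next, Integer.MAX_VALUE)) {
                    bestG.put(next, tentative);
                    open.add(new Object[]{next, tentative + heuristic.apply(next)});
                }
            }
        }
        return -1; // goal unreachable
    }

    public static void main(String[] args) {
        // Toy problem: walk from 0 to 10 in steps of +1 or +3.
        // The heuristic ceil((10 - s) / 3) is admissible: each move gains at most 3.
        int moves = solve(0,
            s -> Arrays.asList(s + 1, s + 3),
            s -> Math.max(0, (10 - s + 2) / 3),
            s -> s == 10);
        System.out.println(moves); // prints 4 (3 + 3 + 3 + 1)
    }
}
```

The same `solve` would take the puzzle's `findNext` as the successor function and the Manhattan distance as the heuristic; the key point is that states are generated lazily, never as a whole graph up front.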
There is also a very good and simple C# implementation of an 8-puzzle solver with A*; take a look at it and you will learn many things:
http://geekbrothers.org/index.php/categories/computer/12-solve-8-puzzle-with-a
Given the adjacency matrix of a graph, I need to obtain the chromatic number (minimum number of colours needed to paint every node of a graph so that adjacent nodes get different colours).
Preferably it should be a java algorithm, and I don't care about performance.
Thanks.
Edit:
I recently introduced a fix so the answer is more accurate: it now rechecks a node's position against its previous positions.
Now a new question comes up: which node's 'color number' is it better to raise, the node I am standing on, or the node I am visiting (the one I am asking whether I am adjacent to)?
public class Modelacion {
public static void main(String args[]) throws IOException{
// given the matrix ... (I have hidden its initialization here)
int[][] matriz = new int[40][40];
int color[] = new int[40];
for (int i = 0 ; i<40;i++)
color[i]=1;
Cromatico c = new Cromatico(matriz, color, 40);
}
}
import java.io.IOException;
public class Cromatico {
Cromatico(int[][]matriz, int[] color, int fila) throws IOException{
for (int i = 0; i<fila;i++){
for (int j = 0 ; j<fila;j++){
if (matriz[i][j] == 1 && color[i] == color [j]){
if (j<i)
color [i] ++;
else
color [j] ++;
}
}
}
int numeroCromatico = 1;
for (int k = 0; k<fila;k++){
System.out.print(".");
numeroCromatico = Math.max(numeroCromatico, color[k]);
}
System.out.println();
System.out.println("el numero cromatico del grafo es: " + numeroCromatico);
}
}
Finding the chromatic number of a graph is NP-Complete (see Graph Coloring). It is NP-Complete even to determine if a given graph is 3-colorable (and also to find a coloring).
The wiki page linked to in the previous paragraph has some algorithms descriptions which you can probably use.
btw, since it is NP-Complete and you don't really care about performance, why don't you try using brute force?
Guess a chromatic number k and try all possible vertex colourings (at most k^n of them). If the graph is not k-colorable, make the new guess min{n, 2k}; if it is k-colorable, make the new guess max{k/2, 1}. Repeat, following the pattern used by binary search, to find the optimal k.
Good luck!
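The guess-doubling search described above can also be collapsed into a plain binary search over k, since "G is k-colorable" is monotone in k. A self-contained sketch with a brute-force colorability check (all names are illustrative; the adjacency matrix uses `true` for an edge):

```java
public class ChromaticSearch {
    // Binary search for the chromatic number: 1 <= chi <= n always holds,
    // and k-colorability is monotone in k, so we can bisect on k.
    public static int chromaticNumber(boolean[][] adj) {
        int n = adj.length;
        if (n == 0) return 0;
        int lo = 1, hi = n;
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (canColor(adj, mid, new int[n], 0)) hi = mid;
            else lo = mid + 1;
        }
        return lo;
    }

    // Brute force: try every color 1..k for vertex v, checking only the
    // already-colored neighbors (fine here, since performance is no concern).
    private static boolean canColor(boolean[][] adj, int k, int[] color, int v) {
        if (v == adj.length) return true;
        for (int c = 1; c <= k; c++) {
            boolean ok = true;
            for (int u = 0; u < v; u++)
                if (adj[v][u] && color[u] == c) ok = false;
            if (ok) {
                color[v] = c;
                if (canColor(adj, k, color, v + 1)) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        boolean[][] triangle = {
            {false, true,  true },
            {true,  false, true },
            {true,  true,  false}
        };
        System.out.println(chromaticNumber(triangle)); // prints 3
    }
}
```

Binary search does at most ceil(log2 n) colorability tests, but each test is still exponential in the worst case, which matches the "don't care about performance" constraint.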
And to answer your edit.
Neither option of incrementing the color will work. Also, your algorithm runs in O(n^2) time; that by itself makes it highly likely the algorithm is wrong, even without looking for counterexamples, because this problem is NP-Complete!
Super slow, but it should work:
int chromaticNumber(Graph g) {
for (int ncolors = 1; true; ncolors++) {
if (canColor(g, ncolors)) return ncolors;
}
}
boolean canColor(Graph g, int ncolors) {
return canColorRemaining(g, ncolors, 0);
}
// recursive routine - the first colors_so_far nodes have been colored,
// check if there is a coloring for the rest.
boolean canColorRemaining(Graph g, int ncolors, int colors_so_far) {
if (colors_so_far == g.nodes()) return true;
for (int c = 0; c < ncolors; c++) {
boolean ok = true;
for (int v : g.adjacent(colors_so_far)) {
if (v < colors_so_far && g.getColor(v) == c) ok = false;
}
if (ok) {
g.setColor(colors_so_far, c);
if (canColorRemaining(g, ncolors, colors_so_far + 1)) return true;
}
}
return false;
}