I'm making my own 5-in-a-row game on a board bigger than 5x5; the size isn't decided yet, but let's say 10x10. I'm designing this with a minimax algorithm and alpha-beta pruning. I've decided that a win has the utility (empty places + chosen place) * 5, so if the computer finds a win that leaves 2 empty places, the value is (2 + 1) * 5 = 15. A loss uses the same calculation with -5 instead of 5, and a draw evaluates to 0. My question is this: how do I determine the utility of an unfinished position when the depth limit is reached? Simply making it 0 isn't good enough; there must be some kind of "guess" or ranking.
An idea I had was to count all the X's in every row, every column and every possible diagonal and do empty places+chosen place times biggest found sequence of X's. The issue there is that the calculations take time and are a pain to write and then you'd also have to take into account the edges of the sequences found: are they bordered by O's? Should we count X's with empty gaps?
It seems awfully complicated, so I wondered if anyone has any good advice for me regarding the unfinished scenarios. Thank you!
I have no idea what you are talking about in your first paragraph; a win should simply be scored as a win (minus the depth, so that the shortest win is always found). If I understand you correctly, you are looking for an evaluation (heuristic) function that takes a board state as input and outputs a static score for that state. Gomoku (5 in a row) is a popular game with many engines; Google "evaluation function gomoku" and you will find lots of information and papers on the subject.
For a game like chess, the evaluation is done by, for example, counting each side's material balance, piece positions, king safety and so on. For X-in-a-row, the usual approach is to give every possible scenario a score and then add them all up. The scenarios for Gomoku are usually as follows:
FiveInRow (Win)
OpenFour
ClosedFour
OpenThree
ClosedThree
OpenTwo
ClosedTwo
Open means there are no blocking enemy pieces next to the friendly pieces:
OpenThree: -O--XXX---
ClosedThree: --OXXX---
How to sum them all up, and what score to give each scenario, is up to you and that is where the fun and experimenting begins!
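As a rough illustration of the scenario list above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the board is a list of lists holding "X", "O" or "." for empty, the weights in SCORES are made up, and "closed" is taken to mean that exactly one end of the run is blocked (by an enemy piece or the board edge). Runs with gaps are not handled.

```python
# Hypothetical weights; tuning them is the experimenting part.
SCORES = {
    (5, "win"): 100000,
    (4, "open"): 10000, (4, "closed"): 1000,
    (3, "open"): 500, (3, "closed"): 50,
    (2, "open"): 10, (2, "closed"): 1,
}
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def cell(board, r, c):
    """Return the piece at (r, c), or None when off the board."""
    if 0 <= r < len(board) and 0 <= c < len(board[0]):
        return board[r][c]
    return None


def evaluate(board, me):
    """Sum the scenario scores of every run of `me` stones on the board."""
    total = 0
    for r in range(len(board)):
        for c in range(len(board[0])):
            if board[r][c] != me:
                continue
            for dr, dc in DIRECTIONS:
                # Only count a run once, starting from its first stone.
                if cell(board, r - dr, c - dc) == me:
                    continue
                length = 0
                while cell(board, r + length * dr, c + length * dc) == me:
                    length += 1
                if length >= 5:
                    total += SCORES[(5, "win")]
                    continue
                before = cell(board, r - dr, c - dc)
                after = cell(board, r + length * dr, c + length * dc)
                open_ends = (before == ".") + (after == ".")
                if length >= 2 and open_ends == 2:
                    total += SCORES[(length, "open")]
                elif length >= 2 and open_ends == 1:
                    total += SCORES[(length, "closed")]
                # Runs blocked on both ends cannot grow, so they add nothing.
    return total


def static_score(board):
    """Positive values favour X, negative values favour O."""
    return evaluate(board, "X") - evaluate(board, "O")
```

static_score(board) is what you would return from minimax when the depth limit is reached and nobody has won yet.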
Related
I'm trying to implement a connect4 game with a different concept using Minimax algorithm in Java.
I completely understand the minimax algorithm. However, when it comes to the implementation, I can't figure out what the terminal values should be in this case.
The videos and notes that I've referred to always showed a value at each terminal node, so my question is: how do I get those terminal values for the Connect 4 game?
Do I put some kind of probability of making a set of 4 at those terminal nodes, or something else? Please help.
Thank you
Since Connect 4 can be played perfectly by today's computers, every terminal node can be assigned a value of +1, 0 or -1 depending on the result (win, draw, loss).
If your program is unable to search the whole tree, you will have to write a so-called evaluation heuristic, which returns a number indicating whether a position is good or bad. (So yes, in your words: the winning probability of a position.)
You can achieve that for example by counting the amounts of 2 in a row and 3 in a row.
Better heuristics will result in better play of the engine.
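A minimal Python sketch of such a heuristic, under some assumptions of mine: a 6x7 board stored as a list of rows, 0 for an empty cell, and made-up weights. Instead of strictly adjacent pieces, it counts every group of four cells that holds 2 or 3 pieces of one colour and none of the other, i.e. groups that can still become a four.

```python
ROWS, COLS = 6, 7
WEIGHTS = {2: 1, 3: 10}  # hypothetical weights for 2 and 3 pieces in a group


def windows(board):
    """Yield every horizontal, vertical and diagonal group of 4 cells."""
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                end_r, end_c = r + 3 * dr, c + 3 * dc
                if 0 <= end_r < ROWS and 0 <= end_c < COLS:
                    yield [board[r + i * dr][c + i * dc] for i in range(4)]


def evaluate(board, me, opponent):
    """Score a non-terminal position from the point of view of `me`."""
    score = 0
    for w in windows(board):
        mine, theirs = w.count(me), w.count(opponent)
        if theirs == 0:
            score += WEIGHTS.get(mine, 0)
        elif mine == 0:
            score -= WEIGHTS.get(theirs, 0)
        # Groups containing both colours can never be completed: score 0.
    return score
```

Terminal nodes (win, draw, loss) should still get their fixed values; this function is only meant for positions where the search had to stop early.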
Can you explain to me how to build the tree?
I more or less understand how the nodes are chosen, but a clearer explanation would really help me implement this algorithm. I already have a board representing the game state, but I don't understand how to generate the tree.
Can someone point me to a well-commented implementation of the algorithm (I need to use it for AI)? Or to a better explanation or examples of it?
I didn't find a lot of resources on the net; this algorithm is rather new...
The best way to generate the tree is a series of random playouts. The trick is being able to balance between exploration and exploitation (this is where UCT comes in). There are some good code samples and plenty of research paper references here: https://web.archive.org/web/20160308043415/http://mcts.ai:80/index.html
When I implemented the algorithm, I used random playouts until I hit an end point or termination state. I had a static evaluation function that would calculate the payoff at this point, then the score from this point is propagated back up the tree. Each player or "team" assumes that the other team will play the best move for themselves, and the worst move possible for their opponent.
I would also recommend checking out the papers by Chaslot and his PhD thesis, as well as some of the research that references his work (basically all the MCTS work since then).
For example: Player 1's first move could simulate 10 moves into the future alternating between player 1 moves and player 2 moves. Each time you must assume that the opposing player will try to minimize your score whilst maximizing their own score. There is an entire field based on this known as Game Theory. Once you simulate to the end of 10 games, you iterate from the start point again (because there is no point only simulating one set of decisions). Each of these branches of the tree must be scored where the score is propagated up the tree and the score represents the best possible payoff for the player doing the simulating assuming that the other player is also choosing the best moves for themselves.
MCTS consists of four strategic steps, repeated as long as there is time left. The steps are as follows.
In the selection step the tree is traversed from the root node until we reach a node E, where we select a position that is not added to the tree yet.
Next, during the play-out step moves are played in self-play until the end of the game is reached. The result R of this “simulated” game is +1 in case of a win for Black (the first player in LOA), 0 in case of a draw, and −1 in case of a win for White.
Subsequently, in the expansion step children of E are added to the tree.
Finally, R is propagated back along the path from E to the root node in the backpropagation step. When time is up, the move played by the program is the child of the root with the highest value.
(This example is taken from this paper, PDF: www.ru.is/faculty/yngvi/pdf/WinandsBS08.pdf)
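To make the four steps concrete, here is a small Python sketch of UCT on a toy game of my own choosing (not the one from the paper): a pile of stones, each player takes 1 to 3, and whoever takes the last stone wins. The Node class, the exploration constant 1.4 and the iteration count are all assumptions for the example. Note that this sketch expands a node before the play-out, which is the more common ordering, and it counts wins instead of using +1/0/-1 results.

```python
import math
import random


class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones    # stones left in the pile
        self.player = player    # player to move at this node (+1 or -1)
        self.parent = parent
        self.move = move        # number of stones taken to get here
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0         # wins for the player who moved INTO this node


def uct_child(node, c=1.4):
    """Pick the child balancing win rate (exploitation) and exploration."""
    return max(node.children, key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))


def playout(stones, player):
    """Random self-play; returns the player who takes the last stone."""
    while True:
        stones -= random.randint(1, min(3, stones))
        if stones == 0:
            return player
        player = -player


def mcts(stones, iterations=2000):
    root = Node(stones, player=1)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one new child to the tree.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, -node.player, node, move)
            node.children.append(child)
            node = child
        # 3. Play-out: finish the game with random moves.
        if node.stones == 0:
            winner = -node.player  # the previous mover took the last stone
        else:
            winner = playout(node.stones, node.player)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            if winner == -node.player:
                node.wins += 1
            node = node.parent
    # Play the root child with the highest value (average result).
    return max(root.children, key=lambda ch: ch.wins / ch.visits).move
```

For a real game you would replace the stones/move fields with your own board state and legal-move generator; the four steps themselves stay the same.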
Here are some implementations:
A list of libraries and games using some MCTS implementations
http://senseis.xmp.net/?MonteCarloTreeSearch
and a game independent open source UCT MCTS library called Fuego
http://fuego.sourceforge.net/fuego-doc-1.1/smartgame-doc/group__sguctgroup.html
From http://mcts.ai/code/index.html:
Below are links to some basic MCTS implementations in various
programming languages. The listings are shown with timing, testing
and debugging code removed for readability.
Java
Python
I wrote this one, if you're interested: https://github.com/avianey/mcts4j
Honestly, I only knew of such a game recently and I wonder how one can create a solving algorithm using the recursive search method?
There are 15 holes in total on the triangular board, which makes 14 pegs and a total of 13 moves.
I don't know where to start with this in C++ or Java. I studied C++ for a year, so I'm familiar with concepts such as stacks and linked lists.
I just don't know how to start the code. The program firstly asks the user where they want to start (How is this done?)
Then, once it has solved it, more than one peg may be left, and the program will ask the user whether to look for a better solution (and so on, until the board is down to just one peg).
I really can't think of how to make the moves possible. (How do I write code that "shows" one peg jumping over another into an empty hole?)
I'd love some coding assistance here. It would really be appreciated.
Try treating the board as a linked list of holes or positions. Each data field in a node would represent a hole. Each node would have a vector of links to other holes, depending on its position relative to the edge of the board.
To move a peg, iterate over the possible links.
This is just one method, there are probably better ones out there.
Take a look at my answer here: Timeout on a php Peg Puzzle solver. The 15-peg puzzle was the first program that I ever wrote (over 10 years ago), just after learning C++.
The posted solution is a re-write that I did several years later.
Since I was the answerer I can tell you that a triplet is a term that I made up for a move. There are three pegs involved in a move (a triplet). Each peg is represented by a bit. In order for a move to be legal two consecutive pegs (bits) need to be set and the other (the target location) needs to be clear.
The entire solution involves a depth first search using a 16-bit integer to represent the board. Simple bit manipulation can be used to update the board state. For example if you xor the current state with a move (triplet) it will make the move against the board. The same operation applied a second time against the board is an undo of the move.
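Here is a short Python sketch of that idea (not the original C++ solution). The hole numbering, the index helper and the (triplet, required) pairs are my own assumptions: holes are numbered 0 to 14 row by row from the top, and each move stores the mask of its three holes plus the mask of the two holes that must hold pegs.

```python
def index(r, c):
    """Bit position of the hole in row r (0..4), column c (0..r)."""
    return r * (r + 1) // 2 + c


def build_moves():
    """Return (triplet_mask, required_mask) for every jump on the board."""
    moves = []
    for r in range(5):
        for c in range(r + 1):
            for dr, dc in ((0, 1), (1, 0), (1, 1)):
                r2, c2 = r + 2 * dr, c + 2 * dc
                if r2 < 5 and 0 <= c2 <= r2:
                    a = 1 << index(r, c)
                    mid = 1 << index(r + dr, c + dc)
                    b = 1 << index(r2, c2)
                    triplet = a | mid | b
                    moves.append((triplet, a | mid))  # jump from a to b
                    moves.append((triplet, b | mid))  # jump from b to a
    return moves


MOVES = build_moves()


def solve(board, path):
    """Depth-first search; True once a single peg is left on the board."""
    if bin(board).count("1") == 1:
        return True
    for triplet, required in MOVES:
        # Legal when the two source bits are set and the target bit is clear.
        if board & triplet == required:
            path.append((triplet, required))
            # XOR flips all three bits at once: that is the whole move.
            if solve(board ^ triplet, path):
                return True
            path.pop()
    return False
```

For example, solve(0x7FFE, path) starts with the top hole empty and fills path with the 13 moves of a solution. Because board ^ triplet ^ triplet == board, you could also keep one mutable board and undo each move with a second XOR instead of passing a new value down.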
To be successful as a programmer, you need to develop the skill of examining a problem, figuring out how it can be solved, and coming up with a reasonable in-program representation that enables you to solve it.
For problems like this, I find it helpful to have the puzzle in front of me, so I can try it out by hand. Failing that, at least having a picture of the thing will help:
http://www.obsidiandesigns.com/pyramidsol.jpg
Here's another way to look at it, and possibly consider representing it in memory like this:
OXXXX
XXXX
XXX
XX
X
Each "X" is a peg. The "O" is a hole. One can change "OXX" to "XOO" by jumping the 3rd X over the middle one, into the hole. Also consider vertical and diagonal jumps.
Linked lists could make a lot of sense. You might actually want "2D" links, instead of the usual "1D" links. That is each "hole" instance could contain pointers to the two or three holes next to it.
Once you have a representation, you need a way to find valid moves. And to make valid moves.
For this "game," you may want to search the whole space of possible moves, so you'll probably want to either have each move produce a new board, or to have a way to "undo" moves.
A valid move is where one peg jumps over another peg, into an empty hole. The jumped peg is removed. You can't jump from "just anywhere" to anywhere, over any peg you wish -- the three holes have to be next to each other and in line.
The above should give you some hints that will help you get started.
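To make those hints a bit more tangible, here is one possible Python sketch of the row layout shown above. The names parse, valid_moves and apply_move are mine, and a dictionary keyed by (row, column) is used instead of linked nodes. With the rows left-aligned like this, the three lines of the triangle are rows, columns and the diagonals where row + column is constant.

```python
ROWS = ["OXXXX", "XXXX", "XXX", "XX", "X"]
# Along a row, along a column, and along the diagonal (row + col constant).
DIRECTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (1, -1), (-1, 1)]


def parse(rows):
    """Map (row, col) -> True for a peg, False for an empty hole."""
    return {(r, c): ch == "X"
            for r, line in enumerate(rows)
            for c, ch in enumerate(line)}


def valid_moves(board):
    """List (start, jumped, target) triples: peg, peg, empty hole in a line."""
    moves = []
    for (r, c), has_peg in board.items():
        if not has_peg:
            continue
        for dr, dc in DIRECTIONS:
            over = (r + dr, c + dc)
            target = (r + 2 * dr, c + 2 * dc)
            # .get() returns None for positions that are off the board.
            if board.get(over) is True and board.get(target) is False:
                moves.append(((r, c), over, target))
    return moves


def apply_move(board, move):
    """Return a new board, so the old one can be kept for backtracking."""
    start, over, target = move
    new_board = dict(board)
    new_board[start] = False
    new_board[over] = False   # the jumped peg is removed
    new_board[target] = True
    return new_board
```

On the starting layout there are exactly two valid moves, both landing in the top-left hole; applying the first one turns the top row from "OXXXX" into "XOOXX".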
I am creating a tic-tac-toe game with min/max so I can later extend it to alpha-beta pruning. During min/max I find whether a path will lead to +1 (X wins), -1 (O wins) or 0 (draw). However, for a board configuration such as this:
0|x|0
-|x|-
-|-|-
on O's turn (shown as 0) it picks the bottom left, since that move leads to its own win. Should I check each board for a block? Then it would not run as fast, and I don't think that's how min/max should be implemented.
Can someone explain why min/max is not smart enough to detect that? I thought that it looked at the leaf nodes and returned +1/-1/0.
Edit: I've been mixing up "pure" minimax, with minimax + heuristic. I've edited my answer to resolve this.
Maybe it would help to define minimax. From an article by a UC Berkeley student:
minimax(player, board)
    if (game over in current board position)
        return winner
    children = all legal moves for player from this board
    if (max's turn)
        return maximal score of calling minimax on all the children
    else (min's turn)
        return minimal score of calling minimax on all the children
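The pseudocode above can be sketched in Python for tic-tac-toe roughly as follows. The board encoding (a list of nine cells holding "X", "O" or "-") and the helper names are my own assumptions; scores are from X's point of view, so X is max and O is min.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if someone has three in a line, else None."""
    for a, b, c in LINES:
        if board[a] != "-" and board[a] == board[b] == board[c]:
            return board[a]
    return None


def minimax(board, player):
    """Score for X: +1 if X wins, -1 if O wins, 0 for a draw."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if "-" not in board:
        return 0
    other = "O" if player == "X" else "X"
    scores = []
    for i, value in enumerate(board):
        if value == "-":
            board[i] = player          # try the move
            scores.append(minimax(board, other))
            board[i] = "-"             # undo it
    return max(scores) if player == "X" else min(scores)


def best_move(board, player):
    """Index of the move with the best minimax score for `player`."""
    other = "O" if player == "X" else "X"
    pick = max if player == "X" else min

    def score(i):
        board[i] = player
        result = minimax(board, other)
        board[i] = "-"
        return result

    return pick((i for i, v in enumerate(board) if v == "-"), key=score)
```

On the board from the question, best_move(list("OXO-X----"), "O") returns 7, the bottom-middle square: every other O move lets X complete the middle column, so those moves score +1 and the min player avoids them without any explicit "check for a block".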
With minimax, you are trying to minimize your losses, not maximize your gains. So, "your" turn is min's turn. With this definition, if you could ever lose by selecting a square, then it will be marked -1. If you could ever tie, but will never lose, it will be marked 0. Only if it is a guaranteed win will it be marked 1.
"Should I check each table for a block"
If you are defining your score and algorithm correctly (matching the right players to the right logic), you need not "check for a block". Any game sub-tree where the player didn't block should implicitly be evaluated -1, because at some point (probably very quickly) it will evaluate to a loss, and that loss will bubble up.
The real problem with this algorithm (and where you may be getting results that you aren't expecting) is when all sub-trees result in possible losses. At that point, you will need to use a heuristic to get any better information on which move you should take. You will need something better than simply {-1, 0, 1}, because some moves could allow you to win, but you'd block them out because you could also lose.
I am not quite sure about your problem. As has been pointed out before, min/max has problems when more than one path leads to a win or all paths lead to a loss. In such a case it is mathematically correct to pick any of the winning paths, or any path at all when every path loses. However, when playing against an imperfect adversary, it is often more sensible to pick the shortest winning path and the longest losing path (in the hope that the adversary does not play perfectly and makes a wrong choice).
This behavior is quite easy to implement in min/max using a decay for each recursion. I.e. whenever you return something from a recursive call multiply the result by 0.9 or something like this. This will lead to higher scores for longer negative paths and smaller scores for longer positive paths.
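A tiny Python sketch of the decay idea, on an abstract game tree rather than a real game. The representation is an assumption for the example: a node is either a number (the terminal score) or a list of child nodes, and DECAY is the 0.9 factor suggested above.

```python
DECAY = 0.9


def minimax(node, maximizing):
    """A node is either a terminal score or a list of child nodes."""
    if not isinstance(node, list):
        return node
    # Every level the result travels up shrinks it towards 0.
    values = [DECAY * minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_child(children):
    """Index of the best child for the maximizing player at the root."""
    values = [DECAY * minimax(child, False) for child in children]
    return values.index(max(values))
```

With two winning options, best_child([1, [[1]]]) picks index 0: the immediate win scores 0.9, the win two plies later only about 0.729. With two losing options, best_child([-1, [[-1]]]) picks index 1, the longer loss, because about -0.729 is higher than -0.9.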
This does however lead to problems, once you start using a heuristic to break out.
I am trying to write a simple AI for a "Get four" game.
The basic game principles are done, so I can throw in coins of different color, and they stack on each other and fill a 2D Array and so on and so forth.
So far, this is what the method looks like:
public int insert(int x, int color) // 0 = empty, 1 = player1, 2 = player2
x is the horizontal coordinate; the y coordinate is determined by how many stones are already in that column of the array. I think the idea is obvious.
Now the problem is I have to rate specific game situations, so find how many new pairs, triplets and possible 4 in a row I can get in a specific situation to then give each situation a specific value. With these values I can setup a "Game tree" to then decide which move would be best next (later on implementing Alpha-Beta-Pruning).
My current problem is that I can't think of an efficient way to implement a rating of the current game situation in a Java method.
Any ideas would be greatly appreciated!
I'm guessing that this is a homework assignment, and that you mean you want to write the evaluation function and don't know what tricks to use?
The game is called "Connect 4" in English, so you can google for "connect 4 evaluation function".
You can find plenty of people discussing heuristics.
Please don't copy actual source code; it's an important exercise :)
The search space for Connect 4 isn't impossibly large. For a simple implementation, albeit one that'll take a while to run (perhaps tens of minutes) do a minimax search until someone wins, or the game ends. Assign +1 or -1 for a win for one player or the other, and 0 for a draw.
Bollocks. The search space is huge. You need to use a predefined table if you want to do that.