I am creating a tic-tac-toe game with minimax so I can later expand it to alpha-beta pruning. During my minimax search I find whether a path will lead to +1 (X wins), -1 (O wins), or 0 (draw). However, for a board configuration such as this:
O|X|O
-|X|-
-|-|-
during O's turn it picks the bottom left, since that move leads to its win. Should I check each table for a block? Then it would not run as fast, and I don't think that's how minimax should be implemented.
Can someone explain why minimax is not smart enough to detect that? I thought it looked at the leaf nodes and returned +1/-1/0.
Edit: I've been mixing up "pure" minimax, with minimax + heuristic. I've edited my answer to resolve this.
Maybe it would help to define minimax. From an article by a UC Berkeley student:
minimax(player,board)
if(game over in current board position)
return winner
children = all legal moves for player from this board
if(max's turn)
return maximal score of calling minimax on all the children
else (min's turn)
return minimal score of calling minimax on all the children
With minimax, you are trying to minimize your losses, not maximize your gains. So, "your" turn is min's turn. With this definition, if you could ever lose by selecting a square, then it will be marked -1. If you could ever tie, but will never lose, it will be marked 0. Only if it is a guaranteed win will it be marked 1.
Should I check each table for a block
If you are defining your score and algorithm correctly (matching the right players to the right logic), you need not "check for a block". Any game sub-tree where the player didn't block should implicitly be evaluated -1, because at some point (probably very quickly) it will evaluate to a loss, and that loss will bubble up.
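For concreteness, here is a minimal pure-minimax sketch for tic-tac-toe (the board is a 9-character string; the helper names are my own, not from the question's code). The blocking move falls out of the search with no special-case code:

```python
# Pure minimax for tic-tac-toe. X maximizes (+1), O minimizes (-1).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for a, b, c in LINES:
        if board[a] != '-' and board[a] == board[b] == board[c]:
            return 1 if board[a] == 'X' else -1
    return 0

def minimax(board, player):
    w = winner(board)
    if w != 0 or '-' not in board:
        return w                           # terminal: win, loss, or draw
    nxt = 'O' if player == 'X' else 'X'
    scores = [minimax(board[:i] + player + board[i+1:], nxt)
              for i, c in enumerate(board) if c == '-']
    return max(scores) if player == 'X' else min(scores)

def best_move(board, player):
    nxt = 'O' if player == 'X' else 'X'
    pick = max if player == 'X' else min
    moves = [(minimax(board[:i] + player + board[i+1:], nxt), i)
             for i, c in enumerate(board) if c == '-']
    return pick(moves)[1]

# X (squares 0 and 1) threatens square 2. Blocking emerges implicitly:
# every non-blocking subtree bubbles up +1 (a forced X win), while the
# block at square 2 scores 0 (a draw), so min picks the block.
print(best_move('XX--O----', 'O'))         # prints 2, the blocking square
```

Note that no "check for a block" ever appears; the loss in every non-blocking subtree bubbles up exactly as described above.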
The real problem with this algorithm (and where you may be getting results that you aren't expecting) is when all sub-trees result in possible losses. At that point, you will need to use a heuristic to get any better information on which move you should take. You will need something better than simply {-1, 0, 1}, because some moves could allow you to win, but you'd block them out because you could also lose.
I am not quite sure about your problem. As has been pointed out before, min/max has problems when more than one path leads to a win, or when all paths lead to a loss. In such a case it is mathematically correct to pick any of the winning paths, or any path at all for the loss. However, when playing against an imperfect adversary it is often more sensible to pick the shortest winning path and the longest losing path (in the hope that the adversary does not play perfectly and makes a wrong choice).
This behavior is quite easy to implement in min/max using a decay for each recursion. I.e. whenever you return something from a recursive call multiply the result by 0.9 or something like this. This will lead to higher scores for longer negative paths and smaller scores for longer positive paths.
This does however lead to problems, once you start using a heuristic to break out.
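The decay idea can be illustrated on a toy game tree (abstract nodes, not tied to any particular game): every recursive return is scaled by 0.9, so a quicker win outscores the same win found deeper in the tree.

```python
DECAY = 0.9   # per-ply decay factor; 0.9 is just an example value

def minimax_decay(node, maximizing):
    """node = (terminal_value, children); leaves carry -1, 0, or +1."""
    value, children = node
    if not children:
        return value
    scores = [minimax_decay(child, not maximizing) for child in children]
    best = max(scores) if maximizing else min(scores)
    return DECAY * best            # every extra ply shrinks the magnitude

quick_win = (+1, [])                      # win right now: score 1
slow_win = (0, [(0, [(+1, [])])])         # same win, two plies deeper: 0.81

assert minimax_decay(quick_win, False) > minimax_decay(slow_win, False)
```

The same scaling makes a loss at depth 3 score closer to 0 than an immediate loss, so the longest losing path is preferred as well.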
Related
I'm making my own 5-in-a-row game where the board is bigger than 5x5; the size isn't decided, but let's say 10x10. I'm designing this with a minimax algorithm and alpha-beta pruning. I've decided that a winning situation has the utility function (empty places + chosen place) * 5, so if the computer finds a win where there will be 2 empty places left, the value will be (2 + 1) * 5 = 15. The same calculation applies for a loss, but times -5 instead of 5. A draw evaluates to 0. Now, the issue that I'm writing this for is this: how do I determine the utility of an unfinished scenario where the depth limit is reached? Simply making it 0 isn't good enough; there must be some kind of "guess" or ranking.
An idea I had was to count all the X's in every row, every column, and every possible diagonal, and do (empty places + chosen place) times the biggest found sequence of X's. The issue there is that the calculations take time and are a pain to write, and then you'd also have to take into account the edges of the sequences found: are they bordered by O's? Should we count X's with empty gaps?
It seems awfully complicated, and I wondered therefore if anyone has any good advice for me regarding the unfinished scenarios. Thank you!
I have no idea what you are talking about in your first paragraph; a win should be scored as a win (minus depth, to always find the shortest win). If I understand you correctly, you are looking to create an evaluation/heuristic function that takes a board state as input and outputs a static score for that state? Gomoku (5 in a row) is a popular game with many engines; Google "evaluation function gomoku" and you will find lots of information and papers on the subject.
For a game like chess the evaluation is done e.g. by counting each sides material balance, piece positions, king safety etc. For X-in-a-row the usual approach is to give all possible scenarios a score and then add them up together. The scenarios for Gomoku are usually as follows:
FiveInRow (Win)
OpenFour
ClosedFour
OpenThree
ClosedThree
OpenTwo
ClosedTwo
Open means there are no blocking enemy pieces next to the friendly pieces:
OpenThree: -O--XXX---
ClosedThree: --OXXX---
How to sum them all up, and what score to give each scenario, is up to you and that is where the fun and experimenting begins!
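As a sketch of that idea, here is one hypothetical way to score a single row, column, or diagonal; the pattern names follow the list above, and all the score values are placeholders you would tune by experiment:

```python
# Hypothetical per-pattern scores; tuning them is "where the fun begins".
SCORES = {
    'open_four': 10000, 'closed_four': 1000,
    'open_three': 500, 'closed_three': 50,
    'open_two': 10, 'closed_two': 5,
}

def score_line(line, me='X'):
    """Score one row/column/diagonal, given as a string like '-O--XXX---'."""
    total, i, n = 0, 0, len(line)
    while i < n:
        if line[i] != me:
            i += 1
            continue
        j = i                              # find the run of friendly pieces
        while j < n and line[j] == me:
            j += 1
        length = j - i
        # An end is "open" when the neighboring square exists and is empty.
        open_ends = (i > 0 and line[i - 1] == '-') + (j < n and line[j] == '-')
        if length >= 5:
            total += 1000000               # five in a row: a win
        elif length >= 2:
            kind = {2: 'two', 3: 'three', 4: 'four'}[length]
            if open_ends == 2:
                total += SCORES['open_' + kind]
            elif open_ends == 1:
                total += SCORES['closed_' + kind]
        i = j
    return total
```

With these placeholder values, the two examples above score 500 (`'-O--XXX---'`, open three) and 50 (`'--OXXX---'`, closed three); the full evaluation would sum this over every line for both players and take the difference.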
I am implementing minimax for a Stratego game (where the computer has perfect knowledge of all the pieces). However, I find that the computer will often not attack a piece that it can easily destroy. From what I understand, the minimax scores come from the leaf nodes of a move tree (where each level is a turn and each score for a leaf node is calculated using an evaluation function for the board in that position). So if I have a depth of 3 levels, the computer can choose to attack on move 1 or attack on move 3. According to the minimax algorithm, both have the same score associated with them (the resulting board position has the same score). So how do I influence the minimax algorithm to prefer immediate rewards over eventual rewards? I.e. I would like the score to decay over time, but with the way minimax works I don't see how this is possible: minimax always uses the leaf nodes to determine the intermediate nodes.
As mentioned by others in the comments, minimax should be able to notice if there is a danger in delaying to capture a piece automatically, and changing the evaluation function to force it to prefer earlier captures is likely to be detrimental to playing performance.
Still, if you really want to do it, I think the only way would be to start storing extra information in your game states (not only the board). You'll want to store timestamps in memory for every game state which allow you to still tell in hindsight exactly at what time (in which turn) a piece was previously captured. Using that information you could implement a decay factor in the evaluation function used in leaf nodes of the search tree.
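A sketch of what that could look like (the names and the 0.95 factor are my own assumptions): the game state records when each capture happened, and the leaf evaluation discounts captures that occur later along the search path.

```python
DECAY = 0.95   # per-turn discount; purely an assumption to tune

def decayed_capture_value(captures, root_turn):
    """captures: (piece_value, turn_captured) pairs stored in the game
    state. A capture made right after the root keeps almost all of its
    value; the same capture delayed deeper into the search is worth less."""
    return sum(value * DECAY ** (turn - root_turn)
               for value, turn in captures)

# Capturing a 10-point piece on the next turn beats delaying it two turns,
# so the leaf evaluations of "attack now" and "attack later" now differ.
early = decayed_capture_value([(10, 1)], root_turn=0)
late = decayed_capture_value([(10, 3)], root_turn=0)
assert early > late
```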
A different solution may be to simply make sure that you search to an even depth level; 2 or 4 instead of 3. That way, your algorithm will always evaluate game states where the opponent had the last move, instead of your computer player. All evaluations will become more pessimistic, and this may encourage your agent to prefer earlier rewards in some situations.
This effect where odd search depths typically result in different evaluations from even search depths is referred to as the odd-even effect. You may be interested in looking into that more (though it's typically discussed for different reasons than what your question is about).
I am creating a strategy game in Java, for which I am now writing a map editor. Before the game starts, the player makes a map with a number of islands and a number of resources on each island. After saving the map, the number of players is chosen. Each player has a base, and the bases must be located at the farthest distance from each other.
So, suppose I load a map with 5 islands and have 2 players when the game starts – each player must have one island. These islands must be at the largest distance from each other, so it should be like this: player 1's island, neutral island, neutral island, neutral island, player 2's island.
I have no idea what my algorithm should be for this.
This problem appears to be equivalent to this question: https://cs.stackexchange.com/questions/22767/choosing-a-subset-to-maximize-the-minimum-distance-between-points . Solving this efficiently and exactly may be an open problem in theoretical CS! Since this is for a game, I am not sure how much effort you want to put into solving it in an exactly optimal way.
It should be pretty easy, fast, and close to correct to generate a random guess and then repeatedly perturb it, measure the badness of the perturbed guess, and, if the perturbed guess is better than the current one, make it the current guess.
As for what you consider the badness of a possible guess, my suggestion is "average (the distance to the closest player-inhabited island) across all player-inhabited islands".
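A sketch of that perturb-and-keep loop (the island coordinates and the swap perturbation are my own assumptions; the score below is the suggested average nearest-base distance, which we maximize):

```python
import math
import random

def spread_score(islands, chosen):
    """Average distance from each chosen island to its nearest chosen peer."""
    return sum(min(math.dist(islands[i], islands[j])
                   for j in chosen if j != i)
               for i in chosen) / len(chosen)

def place_bases(islands, k, iterations=10000, seed=0):
    """islands: list of (x, y) coordinates; returns k base island indices."""
    rng = random.Random(seed)
    current = rng.sample(range(len(islands)), k)
    best = spread_score(islands, current)
    for _ in range(iterations):
        candidate = current[:]
        # Perturb: swap one chosen island for one that is not chosen.
        candidate[rng.randrange(k)] = rng.choice(
            [i for i in range(len(islands)) if i not in current])
        score = spread_score(islands, candidate)
        if score > best:             # keep the perturbation only if better
            current, best = candidate, score
    return current
```

For five islands in a row and two players, this reliably converges on the two endpoint islands, matching the example in the question.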
Assuming that your number of islands and your number of players are both quite small, I think a simple exhaustive search would be the easiest and fastest way to implement it:
make a matrix that holds the distance from each island to every other island (only the upper triangle is necessary, since distances are symmetric)
systematically iterate over all combinations of player placements, sum up the distances from each player to all other players, and keep the combination with the maximum
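The two steps above might look like this (islands as (x, y) tuples; for small counts the distances can even be recomputed on the fly instead of precomputing a matrix):

```python
from itertools import combinations
import math

def best_placement(islands, k):
    """Try every combination of k base islands and keep the one whose
    summed pairwise distances are largest."""
    def total_distance(combo):
        return sum(math.dist(islands[a], islands[b])
                   for a, b in combinations(combo, 2))
    return max(combinations(range(len(islands)), k), key=total_distance)

# Five islands in a line: the two bases end up at the opposite ends.
print(best_placement([(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)], 2))  # prints (0, 4)
```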
Can you explain to me how to build the tree?
I understand how the nodes are chosen, but a nicer explanation would really help me implement this algorithm. I already have a board representing the game state, but I don't understand how to generate the tree.
Can someone point me to a well-commented implementation of the algorithm (I need to use it for AI)? Or to a better explanation/examples of it?
I didn't find a lot of resources on the net; this algorithm is rather new...
The best way to generate the tree is a series of random playouts. The trick is being able to balance between exploration and exploitation (this is where the UCT comes in). There are some good code samples and plenty of research paper references here : https://web.archive.org/web/20160308043415/http://mcts.ai:80/index.html
When I implemented the algorithm, I used random playouts until I hit an end point or termination state. I had a static evaluation function that would calculate the payoff at this point, then the score from this point is propagated back up the tree. Each player or "team" assumes that the other team will play the best move for themselves, and the worst move possible for their opponent.
I would also recommend checking out the papers by Chaslot and his phd thesis as well as some of the research that references his work (basically all the MCTS work since then).
For example: player 1's first move could simulate 10 moves into the future, alternating between player 1's moves and player 2's moves. Each time, you must assume that the opposing player will try to minimize your score while maximizing their own. There is an entire field based on this, known as game theory. Once you have simulated to the end of the 10 moves, you iterate from the start point again (because there is no point simulating only one set of decisions). Each of these branches of the tree must be scored, where the score is propagated up the tree, and the score represents the best possible payoff for the player doing the simulating, assuming that the other player also chooses the best moves for themselves.
MCTS consists of four strategic steps, repeated as long as there is time left. The steps are as follows.
In the selection step the tree is traversed from the root node until we reach a node E, where we select a position that is not yet added to the tree.
Next, during the play-out step moves are played in self-play until the end of the game is reached. The result R of this “simulated” game is +1 in case of a win for Black (the first player in LOA), 0 in case of a draw, and −1 in case of a win for White.
Subsequently, in the expansion step children of E are added to the tree.
Finally, R is propagated back along the path from E to the root node in the backpropagation step. When time is up, the move played by the program is the child of the root with the highest value.
(This example is taken from this paper (PDF): www.ru.is/faculty/yngvi/pdf/WinandsBS08.pdf)
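The four steps can be sketched as follows. The game-state interface (legal_moves, play, is_terminal, result) is assumed, and for brevity the reward is kept from a single player's perspective; a real two-player implementation must also flip the sign of R at alternating tree levels.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children = []
        self.untried = [] if state.is_terminal() else list(state.legal_moves())
        self.visits = 0
        self.value = 0.0

    def uct_child(self, c=1.4):
        # UCT: balance exploitation (mean value) against exploration.
        return max(self.children, key=lambda ch:
                   ch.value / ch.visits
                   + c * math.sqrt(math.log(self.visits) / ch.visits))

def mcts(root_state, iterations=1000):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = node.uct_child()
        # 2. Expansion: add one not-yet-tried child position.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.state.play(move), parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Play-out: random self-play until the end of the game.
        state = node.state
        while not state.is_terminal():
            state = state.play(random.choice(state.legal_moves()))
        r = state.result()
        # 4. Backpropagation: update statistics back to the root.
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    # When time is up, play the most-visited child of the root.
    return max(root.children, key=lambda ch: ch.visits).move
```

The tree is never built up front: it grows by one node per iteration, which is exactly the part that is hard to see from high-level descriptions.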
Here are some implementations:
A list of libraries and games using some MCTS implementations
http://senseis.xmp.net/?MonteCarloTreeSearch
and a game independent open source UCT MCTS library called Fuego
http://fuego.sourceforge.net/fuego-doc-1.1/smartgame-doc/group__sguctgroup.html
From http://mcts.ai/code/index.html:
Below are links to some basic MCTS implementations in various
programming languages. The listings are shown with timing, testing
and debugging code removed for readability.
Java
Python
I wrote this one, if you're interested: https://github.com/avianey/mcts4j
Honestly, I only learned of this game recently, and I wonder how one can create a solving algorithm for it using recursive search?
There are 15 holes in total in the triangular board, so 14 pegs and a total of 13 moves.
I don't know where to start with this in C++ or Java. I studied C++ for a year before, so I'm familiar with concepts such as stacks, linked lists, etc.
I just don't know how to start the code. The program first asks the user where they want to start (how is this done?).
Then, once it solves the puzzle, a certain number of pegs (more than just one) will be left, and the program will ask the user for a better solution (and so on, until the board is reduced to just one peg).
I certainly cannot think of how to make the moves possible (how do I write code that "shows" one peg jumping over another into a hole?).
I'd love some coding assistance here. It would really be appreciated.
Try treating the board as a linked list of holes or positions. Each data field in a node would represent a hole. Each node would have a vector of links to other holes, depending on its position relative to the edge of the board.
To move a peg, iterate over the possible links.
This is just one method, there are probably better ones out there.
Take a look at my answer here: Timeout on a php Peg Puzzle solver. The 15-peg puzzle was the first program that I ever wrote (over 10 years ago), after just learning C++.
The posted solution is a re-write that I did several years later.
Since I was the answerer I can tell you that a triplet is a term that I made up for a move. There are three pegs involved in a move (a triplet). Each peg is represented by a bit. In order for a move to be legal two consecutive pegs (bits) need to be set and the other (the target location) needs to be clear.
The entire solution involves a depth first search using a 16-bit integer to represent the board. Simple bit manipulation can be used to update the board state. For example if you xor the current state with a move (triplet) it will make the move against the board. The same operation applied a second time against the board is an undo of the move.
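To make the bit trick concrete, here is a small sketch (the hole numbering and the example triplet are my own: the top hole is 0, rows are numbered 0, 1-2, 3-5, and so on, so holes 0-1-3 run down the left edge; the full solver needs every such triplet listed):

```python
# The board is 15 bits of a 16-bit integer: bit i set = hole i has a peg.
# A "triplet" is the mask of the three collinear holes involved in a move.

def is_legal(board, triplet, jumper_and_jumped):
    # Legal iff exactly the two moving-side bits are set within the
    # triplet, which forces the remaining (target) bit to be clear.
    return board & triplet == jumper_and_jumped

def apply_move(board, triplet):
    # XOR clears the jumper and the jumped peg and sets the target bit;
    # applying the same XOR again undoes the move.
    return board ^ triplet

triplet = (1 << 0) | (1 << 1) | (1 << 3)  # 0 jumps over 1 into 3
board = 0b111111111111111 & ~(1 << 3)     # full board, hole 3 empty
assert is_legal(board, triplet, (1 << 0) | (1 << 1))
after = apply_move(board, triplet)        # peg from 0 lands in 3; 1 removed
assert apply_move(after, triplet) == board
```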
To be successful as a programmer, you need to develop the skill of examining a problem, figuring out how it can be solved, and come up with a reasonable in-program representation that enables you to solve it.
For problems like this, I find it helpful to have the puzzle in front of me, so I can try it out by hand. Failing that, at least having a picture of the thing will help:
http://www.obsidiandesigns.com/pyramidsol.jpg
Here's another way to look at it, and possibly consider representing it in memory like this:
OXXXX
XXXX
XXX
XX
X
Each "X" is a peg. The "O" is a hole. One can change "OXX" to "XOO" by jumping the X in the third position over the middle X into the hole. Also consider vertical and diagonal jumps.
Linked lists could make a lot of sense. You might actually want "2D" links instead of the usual "1D" links. That is, each "hole" instance could contain pointers to the two or three holes next to it.
Once you have a representation, you need a way to find valid moves. And to make valid moves.
For this "game," you may want to search the whole space of possible moves, so you'll probably want to either have each move produce a new board, or to have a way to "undo" moves.
A valid move is where one peg jumps over another peg, into an empty hole. The jumped peg is removed. You can't jump from "just anywhere" to anywhere, over any peg you wish -- the three holes have to be next to each other and in line.
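Putting those pieces together, a depth-first search with explicit undo might look like this (the MOVES table of (jumper, jumped, target) index triples is assumed, and for brevity it is shown on a toy 4-hole line rather than the real triangle):

```python
def solve(board, moves_made, MOVES):
    """board: list of booleans (True = peg). Returns the list of jumps
    that leaves a single peg, or None if this position is a dead end."""
    if sum(board) == 1:
        return list(moves_made)
    for a, b, c in MOVES:
        if board[a] and board[b] and not board[c]:
            board[a] = board[b] = False          # jumper leaves, jumped dies
            board[c] = True                      # jumper lands in the hole
            moves_made.append((a, b, c))
            solution = solve(board, moves_made, MOVES)
            if solution is not None:
                return solution
            moves_made.pop()                     # backtrack: undo the jump
            board[a] = board[b] = True
            board[c] = False
    return None

# Toy 4-hole line with hole 2 empty; the real board needs the full
# triangular MOVES table of in-line triples instead.
MOVES = [(0, 1, 2), (1, 2, 3), (2, 1, 0), (3, 2, 1)]
print(solve([True, True, False, True], [], MOVES))  # prints [(0, 1, 2), (3, 2, 1)]
```

Mutating the board and undoing on backtrack avoids copying a new board per move; the alternative mentioned above (each move produces a new board) is equally valid and simpler to get right.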
The above should give you some hints that will help you get started.