I'm trying to figure out how to increase the speed of this algorithm. It works perfectly for two-player games (CPU vs. human), but the problem is that when I assign more than three piles (each containing a number of stones, so each player can pick up more than one per turn), the computer player takes forever to compute its moves:
public Object[] minimax(int depth, int player) {
    if (hasPlayer1Won(player)) {
        return new Object[]{get_default_input(1), 1};
    } else if (hasPlayer2Won(player)) {
        return new Object[]{get_default_input(1), -1};
    }

    List<T> movesAvailable = getNextStates();
    if (movesAvailable.isEmpty()) {
        return new Object[]{get_default_input(0), 0}; // no moves left: draw
    }

    int min = Integer.MAX_VALUE;
    int max = Integer.MIN_VALUE;
    T computersMove = movesAvailable.get(0);
    int i = 0;

    for (T move : movesAvailable) {
        makeAMove(move, player);
        Object[] result = minimax(depth + 1, player == G.PLAYER1 ? G.PLAYER2 : G.PLAYER1);
        int currentScore = (int) result[1];

        if (player == G.PLAYER1) { // maximizing player
            max = Math.max(currentScore, max);
            if (currentScore >= 0 && depth == 0) {
                computersMove = move;
            }
            if (currentScore == 1) { // guaranteed win found, stop searching
                resetMove(move);
                break;
            }
            if (i == movesAvailable.size() - 1 && max < 0 && depth == 0) {
                computersMove = move; // every move loses, keep the last one
            }
        } else { // minimizing player
            min = Math.min(currentScore, min);
            if (min == -1) {
                resetMove(move);
                break;
            }
        }
        i++;
        resetMove(move);
    }
    return new Object[]{computersMove, player == G.PLAYER1 ? max : min};
}
I have successfully tested the following methods for improving minimax (I used them to play Tic-Tac-Toe and Domineering):
Alpha-beta pruning - I used a variant of this kind of pruning in conjunction with lazy evaluation: instead of generating the whole tree, I generated an optimal move on each layer and kept lazy holders for the other state-action pairs (using a supplier and calling it only when a move different from the one I held was actually made).
Heuristic pruning - see the chapter on heuristics in that book. I only generated the first d levels of the tree and, instead of computing a deterministic outcome, applied the heuristic function described there to the current state to estimate the outcome. Whenever move (d+1) was made, I generated another level using the same approach.
Here, d is the depth you choose (the safest way to pick it is by testing).
Parallel computing - also have a look at this; you may find it harder to implement, but it pays off.
The first two options saved me a lot of computation time, enough that I was able to play Domineering optimally up to a 5x5 board and heuristically up to 10x10 (it can go higher depending on how well you want it to play).
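For reference, here is a minimal sketch of plain alpha-beta pruning with a depth cutoff (the heuristic-pruning idea above), not the lazy-evaluation variant described in the first point. It reuses the method names from the question (getNextStates, makeAMove, resetMove, G.PLAYER1/G.PLAYER2); heuristicValue() is a hypothetical stand-in for whichever evaluation function you pick, so treat this as an illustration of the technique rather than a drop-in replacement:

    // Minimal alpha-beta sketch with a depth cutoff (heuristic pruning).
    // heuristicValue() is a hypothetical evaluation of the current state.
    private int alphaBeta(int depth, int maxDepth, int alpha, int beta, int player) {
        if (hasPlayer1Won(player)) return 1;
        if (hasPlayer2Won(player)) return -1;

        List<T> moves = getNextStates();
        if (moves.isEmpty()) return 0;     // no moves left: draw
        if (depth == maxDepth) {
            return heuristicValue();       // cutoff: estimate instead of searching deeper
        }

        if (player == G.PLAYER1) {         // maximizing player
            int best = Integer.MIN_VALUE;
            for (T move : moves) {
                makeAMove(move, player);
                best = Math.max(best, alphaBeta(depth + 1, maxDepth, alpha, beta, G.PLAYER2));
                resetMove(move);
                alpha = Math.max(alpha, best);
                if (alpha >= beta) break;  // beta cutoff: opponent avoids this branch
            }
            return best;
        } else {                           // minimizing player
            int best = Integer.MAX_VALUE;
            for (T move : moves) {
                makeAMove(move, player);
                best = Math.min(best, alphaBeta(depth + 1, maxDepth, alpha, beta, G.PLAYER1));
                resetMove(move);
                beta = Math.min(beta, best);
                if (alpha >= beta) break;  // alpha cutoff
            }
            return best;
        }
    }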
My chess algorithm is based on negamax. The relevant part is:
private double deepEvaluateBoard(Board board, int currentDepth, double alpha, double beta, Move initialMove) {
    if (board.isCheckmate() || board.isDraw() || currentDepth <= 0) {
        this.moveHistorys.put(initialMove, board.getMoveHistory()); // this is not working
        return evaluateBoard(board); // evaluateBoard evaluates from the perspective of the color whose turn it is
    } else {
        double totalPositionValue = -1e40;
        List<Move> allPossibleMoves = board.getAllPossibleMoves();
        for (Move move : allPossibleMoves) {
            board.makeMove(move);
            totalPositionValue = max(-deepEvaluateBoard(board, currentDepth - 1, -beta, -alpha, initialMove), totalPositionValue);
            board.unMakeMove(1);
            alpha = max(alpha, totalPositionValue);
            if (alpha >= beta) {
                break;
            }
        }
        return totalPositionValue;
    }
}
It would greatly help debugging if I were able to access the move sequence that the negamax algorithm bases its evaluation on (i.e., where in the decision tree the evaluated value is found).
Currently I am trying to save the move history of the board into a hashmap that is a field of the enclosing class. However, it is not working for some reason, as the produced move sequences are not optimal.
Since developing an intuition for negamax is not easy, I have ended up banging my head against the wall on this one for quite some time now. I would much appreciate it if someone could point me in the right direction!
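One common way to expose the line behind a negamax score is to have each call return its principal variation alongside the value, instead of writing into a shared map at the leaves (the leaf that writes last is not necessarily on the line that actually propagates to the root). A minimal sketch, assuming the Board/Move API from the code above; SearchResult is a hypothetical helper class, and java.util.List/ArrayList imports are assumed:

    // Sketch: return the principal variation together with the score,
    // rather than recording move history at every leaf.
    class SearchResult {
        double score;
        List<Move> line = new ArrayList<>(); // best move sequence from this position
    }

    private SearchResult negamax(Board board, int depth, double alpha, double beta) {
        SearchResult result = new SearchResult();
        if (board.isCheckmate() || board.isDraw() || depth <= 0) {
            result.score = evaluateBoard(board);
            return result;
        }
        result.score = -1e40;
        for (Move move : board.getAllPossibleMoves()) {
            board.makeMove(move);
            SearchResult child = negamax(board, depth - 1, -beta, -alpha);
            board.unMakeMove(1);
            if (-child.score > result.score) {
                result.score = -child.score;
                result.line.clear();            // keep only the best line found so far
                result.line.add(move);
                result.line.addAll(child.line);
            }
            alpha = Math.max(alpha, result.score);
            if (alpha >= beta) break;
        }
        return result;
    }

The root call's result.line is then the move sequence the evaluation is based on, which you can print for debugging.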
I'm creating a chess engine in Java as practice. I know it's not recommended due to speed issues, but I'm doing it just for practice.
After implementing minimax with alpha-beta pruning, I thought of implementing a time limit for finding the score of a given move.
Here is the code:
private int minimax(MoveNode node, MoveNodeType nodeType, int alpha, int beta, Side side, int depth) throws Exception {
    // isInterestingLine(prevscores, node, side);
    if (depth <= 0) {
        count++;
        return node.evaluateBoard(side);
    }
    // Generate child nodes if we haven't.
    if (node.childNodes == null || node.childNodes.size() == 0) {
        node.createSingleChild();
    }
    if (nodeType == MoveNodeType.MAX) {
        int bestValue = -1000;
        for (int i = 0; i < node.childNodes.size(); i++) {
            if (node.childNodes.get(i) == null) continue;
            int value = minimax(node.childNodes.get(i), MoveNodeType.MIN, alpha, beta, side, depth - 1);
            bestValue = Math.max(bestValue, value);
            alpha = Math.max(alpha, bestValue);
            if (beta <= alpha) {
                break;
            }
            node.createSingleChild();
        }
        // reCalculateScore();
        return bestValue;
    } else {
        int bestValue = 1000;
        for (int i = 0; i < node.childNodes.size(); i++) {
            if (node.childNodes.get(i) == null) continue;
            int value = minimax(node.childNodes.get(i), MoveNodeType.MAX, alpha, beta, side, depth - 1);
            bestValue = Math.min(bestValue, value);
            beta = Math.min(beta, bestValue);
            if (beta <= alpha) {
                break;
            }
            node.createSingleChild();
        }
        // reCalculateScore();
        return bestValue;
    }
}
And the driver code:
void evaluateMove(Move mv, Board brd) throws Exception {
    System.out.println("Started Comparing! " + this.tree.getRootNode().getMove().toString());
    minmaxThread = new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                bestMoveScore = minimax(tree.getRootNode(), MoveNodeType.MIN, -1000, 1000, side, MAX_DEPTH);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    });
    minmaxThread.start();
}
This is how I implemented the time limit:
long time = System.currentTimeMillis();
moveEvaluator.evaluateMove(move, board.clone());
// Busy-wait until the time limit expires or the search thread finishes.
while ((System.currentTimeMillis() - time) < secToCalculate * 1000 && moveEvaluator.minmaxThread.isAlive()) {
}
System.out.println("Time completed! score = " + moveEvaluator.bestMoveScore + " move = " + move + " depth = " + moveEvaluator.searchDepth);
callback.callback(move, moveEvaluator.bestMoveScore);
Now, here is the problem:
You see, it only calculated Bb7, because with depth-first search the time runs out before another line is even calculated.
So I want a way to calculate like the following in a time-limit-based solution.
Here are a few solutions I thought of:
Implementing an isInteresting() function, which takes all the previous scores and checks whether the current line is interesting/winning; if and only if it is, the next child nodes are calculated.
e.g.
[0,0,0,0,0,0] can be interpreted as a drawn line.
[-2,-3,-5,-2,-1] can be interpreted as a losing line.
Searching to a small depth first and then eliminating all losing lines:
for (int i = min_depth; i <= max_depth; i++) {
    scores = [];
    for (Node childnode : NodesToCalculate) {
        scores.push(minimax(childnode, type, alpha, beta, side, i));
    }
    // decide which child node to calculate for next iterations.
}
But neither solution is perfect or efficient: in the first one we are just making a guess, and in the second one we are calculating some nodes more than once.
Is there a better way to do this?
The solution to this problem used by every chess engine is iterative deepening.
Instead of searching to a fixed depth (MAX_DEPTH in your example), you start by searching to a depth of one; then, when this search is done, you start again with a depth of two, and you continue increasing the depth like this until you are out of time. When you are out of time, you play the move from the last search that was completed.
It may seem like lots of time will be spent on lower-depth iterations that are later replaced by deeper searches, and that the time spent on them is completely lost, but in practice that's not true. Since searching to depth N takes so much longer than searching to depth N-1, the time spent on the lower-depth searches is always much less than the time spent on the last (deepest) search.
If your engine uses a transposition table, the data in the table from previous iterations will help the later ones. The alpha-beta algorithm's performance is really sensitive to the order in which moves are searched: the time saved by alpha-beta over plain minimax is greatest when the best move is searched first. If you did a search to depth N-1 before the search to depth N, the transposition table will probably contain a good guess of the best move for most positions, which can then be searched first.
In practice, in an engine using a transposition table and ordering moves at the root based on the previous iteration, it's faster to use iterative deepening than not to use it. For example, doing a depth 1 search, then a depth 2 search, then a depth 3 search, and so on up to, say, a depth 10 search, is faster than doing the depth 10 search right away. Plus, you get the option to stop the search whenever you want and still have a move to play.
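A minimal sketch of the deepening loop, reusing the minimax signature, tree, MoveNodeType, side and MAX_DEPTH from the question; the deadline check and the lastCompleted bookkeeping are my own additions for illustration, and exception handling plus mid-search clock checks are omitted:

    // Iterative deepening sketch: search depth 1, then 2, ... and always keep
    // the result of the last iteration that completed before the deadline.
    long deadline = System.currentTimeMillis() + secToCalculate * 1000L;
    int lastCompletedScore = 0;
    int lastCompletedDepth = 0;

    for (int depth = 1; depth <= MAX_DEPTH; depth++) {
        if (System.currentTimeMillis() >= deadline) {
            break; // out of time: fall back on the deepest completed search
        }
        int score = minimax(tree.getRootNode(), MoveNodeType.MIN, -1000, 1000, side, depth);
        lastCompletedScore = score;
        lastCompletedDepth = depth;
    }
    System.out.println("score = " + lastCompletedScore + " depth = " + lastCompletedDepth);

In a real engine you would also poll the clock inside minimax and abandon the current iteration once the deadline passes, since the final iteration can otherwise run well past the limit; each completed iteration also refills the transposition table, which is what makes the next iteration's move ordering good.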
I have looked everywhere for answers to fix my code, but after long hours spent trying to debug it, I find myself hopelessly stuck. The problem is that my minimax function will not return the correct values for the best possible move. I even attempted to fix it by storing the best first moves (when depth == 0), but if the solution is not obvious, the algorithm fails horribly. I also tried modifying the return values from the base cases in order to prioritize earlier wins, but this didn't solve the problem.
Currently I am testing the function on a Tic-Tac-Toe board, and the helper methods (e.g. getMoves() or getWinner()) are working properly. I know my style is not the most efficient, but I needed the code to be fairly explicit.
By adding a bunch of print statements, I realized that under some circumstances my bestFinalMoves ArrayList was not modified, so this may be related to the issue. Another related problem is that unless the algorithm finds a direct win (in the next move), instead of choosing a move that may lead to a future win or tie (by blocking a square the opponent needs), it just yields the space for the minimizing player to win.
For example, on the board:
aBoard = new int[][] {
    { 0, 1, 0},  // 1 is MAX (AI), -1 is MIN (Human)
    {-1, 0, 0},
    {-1, 0, 0}
};
the function yields the incorrect result of 2,0, where it obviously should be 0,0 (blocking the win for the minimizing player), and the bestFinalMoves ArrayList ends up empty.
private result miniMaxEnd2(Board tmpGame, int depth) {
    String winner = tmpGame.whoWon();
    ArrayList<Move> myMoves = tmpGame.getMoves();
    if (winner.equals("computer")) { // base cases
        return new result(1000);
    } else if (winner.equals("human")) {
        return new result(-1000);
    } else if (winner.equals("tie")) {
        return new result(0);
    }

    if (tmpGame.ComputerTurn) { // MAX
        bestScore = -99999;
        for (Move m : tmpGame.getMoves()) {
            Board newGame = new Board(tmpGame, !tmpGame.ComputerTurn, m);
            result aScore = miniMaxEnd2(newGame, depth + 1);
            if (aScore.score > bestScore) {
                bestScore = aScore.score;
                bestMove = m;
                if (depth == 0) {
                    bestFinalMoves.add(m);
                }
            }
        }
        return new result(bestScore, bestMove);
    } else { // MIN
        bestScore = 99999;
        for (Move m : tmpGame.getMoves()) {
            Board newGame = new Board(tmpGame, !tmpGame.ComputerTurn, m);
            result aScore = miniMaxEnd2(newGame, depth + 1);
            if (aScore.score < bestScore) {
                bestScore = aScore.score;
                bestMove = m;
            }
        }
        return new result(bestScore, bestMove);
    }
}
I know this was a long post, but I really appreciate your help. The full code can be accessed at https://github.com/serch037/UTC_Connect
The bestScore and bestMove variables must be declared as local variables inside the miniMaxEnd2 method for this logic to work properly.
Those variables' values are being overwritten by the recursive calls.
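A minimal sketch of the corrected method, based on the code from the question (the bestFinalMoves bookkeeping is dropped here, since the returned result already carries the best move; treat it as an illustration rather than a drop-in):

    private result miniMaxEnd2(Board tmpGame, int depth) {
        // bestScore and bestMove are now locals: each recursive call keeps its
        // own pair, so deeper calls can no longer overwrite this level's values.
        String winner = tmpGame.whoWon();
        if (winner.equals("computer")) {
            return new result(1000);
        } else if (winner.equals("human")) {
            return new result(-1000);
        } else if (winner.equals("tie")) {
            return new result(0);
        }

        Move bestMove = null;
        if (tmpGame.ComputerTurn) { // MAX
            int bestScore = -99999;
            for (Move m : tmpGame.getMoves()) {
                Board newGame = new Board(tmpGame, !tmpGame.ComputerTurn, m);
                result aScore = miniMaxEnd2(newGame, depth + 1);
                if (aScore.score > bestScore) {
                    bestScore = aScore.score;
                    bestMove = m;
                }
            }
            return new result(bestScore, bestMove);
        } else { // MIN
            int bestScore = 99999;
            for (Move m : tmpGame.getMoves()) {
                Board newGame = new Board(tmpGame, !tmpGame.ComputerTurn, m);
                result aScore = miniMaxEnd2(newGame, depth + 1);
                if (aScore.score < bestScore) {
                    bestScore = aScore.score;
                    bestMove = m;
                }
            }
            return new result(bestScore, bestMove);
        }
    }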
I have been tasked to write a function that will find the best move for the computer to make as part of a backtracking algorithm. My solution finds a winnable answer, but not the best answer. I am having trouble figuring out a way to keep a value assigned to the different options that won't get reset during the next recursive call. So if it goes through moves 1, 2, 3, 4, and 2 and 3 both lead to a winnable solution, it will take 3 and not 2, even if 2 would be the better choice. I can see why this happens in my code, but I can't seem to think through how to fix it. I tried with the wins and totalwins variables, but this doesn't seem to be working. So, once again: the function finds a winnable avenue, but won't always pick the best of the winnable moves. Any help would be much appreciated.
Move bestMove = null;
int totalwins = 0;

public Move findbest(Game g) throws GameException {
    int wins = 0;
    PlayerNumber side = g.SECOND_PLAYER;
    PlayerNumber opp = g.FIRST_PLAYER;
    Iterator<Move> moves = g.getMoves();
    while (moves.hasNext()) {
        Move m = moves.next();
        //System.out.println(m + " Totalwins " + totalwins);
        Game g1 = g.copy();
        g1.make(m);
        //System.out.println("Turn: " + g.whoseTurn());
        if (!g1.isGameOver()) {
            bestMove = findbest(g1);
        } else {
            if (g1.winner() == side) {
                bestMove = m;
                wins++;
            } else if (g1.winner() == opp) {
                wins--;
            }
            if (wins > totalwins) {
                totalwins += wins;
                bestMove = m;
            }
        }
        if (bestMove == null) { // safety so it won't return null if there is no winnable move
            bestMove = m;
        }
    }
    //System.out.println("Totalwins = " + totalwins);
    return bestMove;
}
As stated in the comments, you need to have some sort of rating system to determine which move really is the best.
Then, make a global variable Move bestMove and, instead of having findBest return the "best move", simply have it check to see if the current move is a winning one and, if so, also check to see if its rating is better than that of the current bestMove. If both of these conditions are true, then assign the current move to bestMove.
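A minimal sketch of that idea, assuming the Game/Move API from the question. rate() is a hypothetical function that scores a finished winning game (for example, quicker wins could rate higher), and threading the root-level move through the recursion is my own addition so the field always ends up holding a move you can actually play:

    // Hypothetical rating-based search: a field holds the best move found so
    // far, and each winning line is compared against it by rating.
    Move bestMove = null;
    int bestRating = Integer.MIN_VALUE;

    void findBest(Game g, Move rootMove) throws GameException {
        Iterator<Move> moves = g.getMoves();
        while (moves.hasNext()) {
            Move m = moves.next();
            Move first = (rootMove == null) ? m : rootMove; // root move of this line
            Game g1 = g.copy();
            g1.make(m);
            if (!g1.isGameOver()) {
                findBest(g1, first);            // keep exploring this line
            } else if (g1.winner() == g.SECOND_PLAYER) {
                int rating = rate(g1);          // hypothetical: score this winning line
                if (rating > bestRating) {      // better than the best win so far?
                    bestRating = rating;
                    bestMove = first;           // credit the root move that starts it
                }
            }
        }
    }
    // Usage: findBest(game, null); then play bestMove.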
I am trying to write a small AI algorithm in Java implementing the miniMax algorithm.
The game this is based on is a two-player game where both players make one move per turn, and each board position results in each player having a score. The "quality" of a position for player X is evaluated by subtracting the opponent's score from player X's score for that position. Each move is represented by an integer (i.e. move one is made by inputting 1, move two by inputting 2, etc.).
I understand that miniMax should be implemented using recursion. At the moment I have:
An evaluate() method, which takes as parameters an object representing the board state (i.e. a "BoardState" object) and a boolean called "max" (the signature would be evaluate(BoardState myBoard, boolean max)).
max is true when it is player X's turn. Given a board position, the method will evaluate all possible moves and return the one which is most beneficial for player X. If it is the opponent's turn, max will be false and the method will return the move which is LEAST beneficial for player X (i.e. most beneficial for player Y).
However, I am having difficulties writing the actual miniMax method. My general structure would be something like:
public int miniMax(GameState myGameState, int depth)
Whereby I submit the initial gameState and the "depth" I want it to look into.
I would then have something like:
int finalMove = 0;
while (currentDepth < depth) {
    GameState tmp = myGameState.copy();
    int finalMove = evaluate(tmp, true or false);
    iniMax(tmp.makeMove(finalMove));
}
return finalMove;
Does this sound like a plausible implementation? Any suggestions? :)
Thanks!
That won't work.
Details:
It will cause an infinite loop: currentDepth never gets incremented.
Your definition of evaluation seems different from the usual one. Normally an evaluation function returns the predicted value of a game state. Isn't your evaluate function doing exactly what the minimax function itself should do?
Are miniMax and iniMax different? Because if you meant recursion, then you need to pass depth - 1 when calling the next miniMax.
The idea of minimax is depth-first search: evaluate only the leaf nodes (nodes at maximum depth, or nodes that are a win or tie), then pick the max if the current player is the maximizing one and the min if the current player is the minimizing one.
This is how I implemented it:
function miniMax(node, depth)
    if (depth == 0) then -- leaf node
        local ret = evaluate(node.state)
        return ret
    else -- winning node
        local winner = whoWin(node.state)
        if (winner == 1) then -- P1
            return math.huge
        elseif (winner == -1) then -- P2
            return math.huge * -1
        end
    end
    local num_of_moves = getNumberOfMoves(node.state)
    local v_t = nil
    local best_move_index = nil
    if (getTurn(node.state) == 1) then -- maximizing player
        local v = -math.huge
        for i = 0, num_of_moves - 1 do
            local child_node = simulate(node.state, i) -- simulate move number i
            v_t = miniMax(child_node, depth - 1)
            if (v_t > v) then
                v = v_t
                best_move_index = i
            end
        end
        if (best_move_index == nil) then best_move_index = random(0, num_of_moves - 1) end
        return v, best_move_index
    else -- minimizing player
        local v = math.huge
        for i = 0, num_of_moves - 1 do
            local child_node = simulate(node.state, i)
            v_t = miniMax(child_node, depth - 1)
            if (v_t < v) then
                v = v_t
                best_move_index = i
            end
        end
        if (best_move_index == nil) then best_move_index = random(0, num_of_moves - 1) end
        return v, best_move_index
    end
end
Note:
return v, best_move_index means returning the two values v and best_move_index (the code above is in Lua, and Lua can return multiple values).
The evaluate function returns the same score for both players (i.e. if game state A is scored 23 from P1's point of view, it is also scored 23 from P2's point of view).
This algorithm only works if the two players move alternately (no player can make two moves in a row); you can work around this restriction by giving the opponent a single PASS move (skipping his/her turn) whenever the other player needs to move twice.
This minimax can be further optimized (sorted from the easiest):
alpha-beta pruning
iterative deepening
move ordering
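Alpha-beta pruning and iterative deepening are sketched earlier on this page. For the last item, move ordering simply means searching the most promising moves first so that alpha-beta cutoffs happen sooner. A minimal sketch in Java (for consistency with the rest of the page), reusing the generic game methods from the first question; quickScore(), heuristicValue(), and opponentOf() are hypothetical stand-ins:

    import java.util.Comparator;
    import java.util.List;

    // Move-ordering sketch (negamax form): sort moves by a cheap estimate so
    // the likely-best move is searched first and alpha-beta cuts off sooner.
    // heuristicValue is assumed to score from the side-to-move's perspective.
    private int orderedAlphaBeta(int depth, int alpha, int beta, int player) {
        List<T> moves = getNextStates();
        if (depth == 0 || moves.isEmpty()) {
            return heuristicValue();
        }
        // Search the cheaply-best moves first (descending by estimate).
        moves.sort(Comparator.comparingInt((T m) -> quickScore(m, player)).reversed());

        int best = Integer.MIN_VALUE;
        for (T move : moves) {
            makeAMove(move, player);
            int score = -orderedAlphaBeta(depth - 1, -beta, -alpha, opponentOf(player));
            resetMove(move);
            best = Math.max(best, score);
            alpha = Math.max(alpha, best);
            if (alpha >= beta) {
                break; // cutoff arrives earlier when good moves come first
            }
        }
        return best;
    }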
I made an implementation of minimax in Lua. I hope it helps give you an idea of how to tackle the algorithm from a Java perspective; the code should be quite similar, mind you. It is designed for a game of tic-tac-toe.
--caller is the player who is using the minimax function
--initial state is the board from which the player must make a move
local function minimax(caller, initial_state)
    local bestState = {}; --we use this to store the game state that the player will create

    --this recurse function is really the 'minimax' algorithm
    local function recurse(state, depth)
        --childPlayer is the person who will have their turn in the current state's children
        local ChildPlayer = getTurn(state)
        --parentPlayer is the person who is looking at their children
        local ParentPlayer = getPreviousTurn(state)
        --this represents the worst-case scenario for a player
        local alpha = -(ChildPlayer == caller and 1 or -1);
        --we check for terminal conditions (leaf nodes) and return the appropriate objective value
        if win(state) then
            --return +1 or -1 depending on who called the minimax
            return ParentPlayer == caller and 1 or -1;
        elseif tie(state) then
            --if it's a tie then the value is 0 (neither a win nor a loss)
            return 0;
        else
            --this will return a list of child states FROM the current state
            children = getChildrenStates(state, ChildPlayer)
            --enumerate over each child
            for _, child in ipairs(children) do
                --find out the child's objective value
                beta = recurse(child, depth + 1);
                if ChildPlayer == caller then
                    --we want to MAXIMIZE
                    if beta >= alpha then
                        alpha = beta
                        --if the depth is 0 then we can store the child state as the bestState
                        --(the caller of minimax will always want to choose the GREATEST value at the root node)
                        if depth == 0 then
                            bestState = child;
                        end
                    end
                --we want to MINIMIZE
                elseif beta <= alpha then
                    alpha = beta;
                end
            end
        end
        --return a non-terminal node's value (propagates values up the tree)
        return alpha;
    end

    --start the 'minimax' by calling recurse on the initial state
    recurse(initial_state, 0);
    --return the best move
    return bestState;
end