In the interest of space last time, I had to leave out an advanced topic on optimizing a "next best action" algorithm. Again, you can look at the full source we're discussing by just using the web browser's View Source on this page.

The optimization is known as

**alpha-beta pruning**. In the code snippet below, you see that we break the*j*-loop that is scoring the response moves of a given move based on some condition involving the variables*alpha*and*beta*. Why does it make sense to stop looking at the competitive response moves for a given move? To see why, I've added the function declaration so we can discuss where the alpha value comes from and what it means.```
KalahGame.scoreMove = function(board, player, move, alpha, lookahead) {
...
this.copyBoard(newBoard, board);
extraTurn = this.makeMove(newBoard, player, move);
moveScore = this.evalBoard(newBoard, player) - this.evalBoard(board, player);
...
player ^= 3;
beta = -10000;
for (j = 1; j <= 6; j++) {
if (newBoard[player][j] > 0) {
lookaheadScore = this.scoreMove(newBoard, player, j, beta, lookahead);
if (beta < lookaheadScore) {
beta = lookaheadScore;
}
if (moveScore - beta < alpha)
break;
}
}
moveScore -= beta;
...
return moveScore;
};
```

Understanding

*alpha-beta*pruning requires you to take a more global view of the recursion that is doing the evaluation. The*alpha*values passed into*scoreMove()*are the*beta*values from the calling level of the**Minimax algorithm**. It will help to keep at least the player's moves and the opponents responses in mind as we go through this.
Let's say that

*scoreMove()*has been called to score a player's*K*^{th}move. Beforehand, moves 1 to*K*-1 will have been fully explored by depth-first recursion, including the opponents responses, the player's counter-responses, and so on. The*alpha*value received by*scoreMove()*for move*K*reflects the best fully explored "net" score for the player on moves 1 to*K*-1. Within*scoreMove()*, we first compute the raw benefit of the new move*K*, storing the result in*moveScore*. Now comes the**alpha-beta pruning**trick. The*j*-loop successively explores each opponent response move for the player's move*K*, and clearly the*beta*value takes on the value of the highest scoring response move that the opponent can make. The final score for move*K*is the raw benefit to the player of move*K***minus**the benefit*beta*that the opponent can realize in response.**Thought-provoking question:**Do we really need to know the absolute best move that the opponent can make in response to the player's move

*K*? Or do we just need to find an opponent move that is good enough that, when subtracted from the raw benefit of move

*K*, proves that the player would be better off choosing the earlier move associated with the alpha value? Of course, the answer is that we only need a good enough opponent move, and this is why we break the

*j*-loop when we find that move. If we were to continue the

*j*-loop, all we do is unnecessary work that might (or might not) find an even better opponent response move that would make move

*K*look like an even worse decision for the player. But there is no need to do this extra work. Once the expression "

*(moveScore-beta < alpha)*" becomes true, we have proven that move

*K*is less beneficial than one of the moves 1 to

*K*-1.

From a practical standpoint, this optimization averages

**better than double the run-time performance**of the "what-if" logic. Who doesn't want double, right? Well, this "what-if" analysis is a combinatorial explosion of analysis; to put that in perspective, you get less than one extra move of lookahead due to this optimization. Yet despite this dash of cold water about how much deeper you could take the "what if" logic due to alpha-beta pruning, it remains true that, for a given level of explorative depth, everybody wants the result twice as fast or more, so alpha-beta pruning is very handy.

## Comments (0)

There are no comments to displayAdd a Comment