Game-theoretic optimal (GTO) strategy is a game theory topic that has become popular in the poker community in recent years. The term optimal may cause confusion, as it suggests that this is the most profitable strategy we can follow. However, it is usually much more profitable to play far away from optimal. For this reason, I’d rather call it balanced play: a GTO strategy is balanced regardless of the opponent’s strategy, whereas it is optimal, in terms of profitability, only if the opponent is also playing optimally. Specifically in poker, balanced play might be defined as the set of legitimate actions given a distribution of hands, the potential reward, and the potential risk. But wait: if most of the time (if not always) it is much more profitable to play non-optimal strategies, why does optimal play matter? Is it worth investing your time in it? Yes, it is. Let’s see why.
Successful poker players are good at finding their opponents’ weaknesses (also called leaks) and designing profitable strategies against them. So far so good, but what actually is a weakness? A weakness is an unbalanced strategy, and therefore an exploitable one. Optimal play is built precisely on balanced strategies, which are unexploitable. Thus, as you can guess, the key to identifying leaks is knowing the optimal play.
As a final remark, notice that balanced play only applies to games with hidden information. Balanced play protects you from the advantage that hidden information gives your opponent (his hole cards in poker). In games with perfect information, such as chess or go, you do not have to balance your play. Instead, you should always play the move with the highest value.
Case study: AKQ game
Consider the following toy game:
- Two players (P1 and P2)
- The deck is only composed of aces, kings and queens
- Limit betting
- The pot is P bets
- P1 acts first and is forced to check
- P2 can bet or check
- If P2 bets, P1 can call or fold
Before analyzing the optimal strategy for this game, we first introduce the concept of domination.
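A strategy $s$ dominates a strategy $s'$ if
$$EV(s, r) \ge EV(s', r) \ \text{for every opponent response } r, \quad \text{and} \quad EV(s, r) > EV(s', r) \ \text{for at least one } r.$$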
That is, a strategic action for a given player dominates another strategic action if its expectation isn’t worse against any possible opponent’s response, and it is better against at least one of these responses.
The most intuitive example of domination is value betting when holding the nuts and being last to act (or raising if facing a bet), but other spots also include dominated strategies as we will see later.
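The definition can be checked mechanically. Below is a minimal Python sketch; the function name `dominates` and the example numbers (a pot of 4 bets and a bet of 1) are my own illustrative assumptions. It compares betting and checking the nuts when last to act.

```python
def dominates(ev_a, ev_b):
    """Return True if action a dominates action b.

    ev_a[i] and ev_b[i] are the expectations of each action against the
    i-th possible opponent response.
    """
    never_worse = all(a >= b for a, b in zip(ev_a, ev_b))
    sometimes_better = any(a > b for a, b in zip(ev_a, ev_b))
    return never_worse and sometimes_better


# Hypothetical spot: P2 holds the nuts, the pot is 4 bets and the bet is 1.
# Opponent responses, in order: [P1 folds to a bet, P1 calls a bet].
POT = 4
ev_bet = [POT, POT + 1]   # wins the pot, plus one extra bet when called
ev_check = [POT, POT]     # wins the pot at showdown either way

print(dominates(ev_bet, ev_check))  # True
print(dominates(ev_check, ev_bet))  # False
```

Betting is never worse than checking and is strictly better whenever P1 calls, which is exactly the condition stated above.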
P1’s strategy
When being first to act, P1 does not have any decision to make as the rules of the game force him to check. When facing a bet, he must decide whether to call or fold. If P1 holds an ace, it is clear that he must always call as it dominates folding, and if he holds a queen it is also clear he must always fold since calling is dominated. The non-trivial decision arises when holding a king:
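P1 must make P2 indifferent between betting and checking his bluffs. When P2 holds a queen, P1 holds an ace or a king with equal probability. Checking the queen wins nothing, while betting it loses one bet whenever P1 calls and wins the pot whenever P1 folds. Writing $c_{P1}$ for P1’s call frequency with kings:
$$EV_{P2}(\text{check } Q) = 0$$
$$EV_{P2}(\text{bet } Q) = \tfrac{1}{2}(-1) + \tfrac{1}{2}\left[-c_{P1} + P\,(1 - c_{P1})\right]$$
Setting $EV_{P2}(\text{bet } Q) = EV_{P2}(\text{check } Q)$, we find P1’s optimal call frequency when holding kings:
$$c_{P1} = \frac{P - 1}{P + 1}$$
Hence, the balanced strategy for P1 is: call with all his aces, call with $\frac{P-1}{P+1}$ of his kings, and fold all his queens.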
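As a sanity check on P1’s side, here is a minimal Python sketch that verifies the indifference condition with exact fractions. It assumes a three-card deck (one ace, one king, one queen) and a bet size of 1. The call frequency $(P-1)/(P+1)$ is what that condition yields under those assumptions. The function names are my own and are not taken from the simulator mentioned later.

```python
from fractions import Fraction


def p1_king_call_frequency(pot):
    """Call frequency with kings that makes P2's queen bluffs break even."""
    pot = Fraction(pot)
    return (pot - 1) / (pot + 1)


def p2_bluff_ev(pot, call_freq):
    """EV of betting a queen, relative to checking it (which wins nothing).

    P1 holds an ace or a king with equal probability. Aces always call;
    kings call with probability call_freq. The bet size is 1.
    """
    vs_ace = Fraction(-1)
    vs_king = call_freq * (-1) + (1 - call_freq) * pot
    return (vs_ace + vs_king) / 2


c = p1_king_call_frequency(4)
print(c)                  # 3/5
print(p2_bluff_ev(4, c))  # 0
```

With a pot of 4 bets, calling with 3/5 of the kings leaves the bluff with exactly zero expectation, so P2 gains nothing by bluffing more or less often.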
P2’s strategy
P2 always faces a check from P1, and has to decide whether to bet or check with each possible hand. Betting aces dominates checking aces (you are betting the nuts while last to act), and checking kings dominates betting kings: when P2 holds a king, P1 can only hold an ace, which always calls, or a queen, which always folds, so a king can neither extract value nor bluff. That settles the strategy with aces and kings. The strategy with queens is as follows:
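P2 must make P1 indifferent between calling and folding his bluff catchers. When P1 holds a king, P2 holds an ace or a queen with equal probability; P2 always bets the ace and bets the queen with frequency $b_{P2}$. Folding wins nothing, while calling loses one bet against an ace and wins the pot plus one bet against a queen:
$$EV_{P1}(\text{fold } K) = 0$$
$$EV_{P1}(\text{call } K) = \frac{-1 + b_{P2}\,(P + 1)}{1 + b_{P2}}$$
Setting $EV_{P1}(\text{call } K) = EV_{P1}(\text{fold } K)$, we find P2’s optimal bet frequency when holding queens:
$$b_{P2} = \frac{1}{P + 1}$$
Summing up, the balanced strategy for P2 is: bet all his aces, check all his kings, and bet $\frac{1}{P+1}$ of his queens.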
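The same kind of check works from P2’s side. This is a minimal sketch, again assuming a three-card deck and a bet size of 1. The bluff frequency $1/(P+1)$ is what the indifference condition yields under those assumptions, and the function names are illustrative.

```python
from fractions import Fraction


def p2_queen_bluff_frequency(pot):
    """Bluff frequency with queens that makes P1's king calls break even."""
    return Fraction(1) / (Fraction(pot) + 1)


def p1_call_ev(pot, bluff_freq):
    """EV of calling with a king, relative to folding (which wins nothing).

    When P1 holds a king, P2 holds an ace or a queen with equal probability.
    P2 always bets aces and bets queens with probability bluff_freq, so the
    result is conditioned on P2 having bet. The bet size is 1.
    """
    weight_ace = Fraction(1)
    weight_queen = Fraction(bluff_freq)
    total = weight_ace + weight_queen
    return (weight_ace * (-1) + weight_queen * (pot + 1)) / total


b = p2_queen_bluff_frequency(4)
print(b)                              # 1/5
print(p1_call_ev(4, b))               # 0
print(p1_call_ev(4, Fraction(1, 2)))  # 1
```

The last line shows what happens when P2 bluffs half of his queens instead: calling with a king now earns one bet on average, so P1 should call every time.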
Moving to unbalance
Now we know how both players should play in order to be unexploitable. But what happens when a player decides to play unbalanced?
I wrote a simulator for this game that plots how the expectation varies across several P2 strategies and P1 responses. The graph below represents the expectation for several bluffing and calling frequencies in a spot where the pot is 4 bets.
You can see three noticeable results. First, when P2 plays balanced, the EV remains constant for any P1 call frequency, which is consistent with P2’s goal of making P1 indifferent between calling and folding. Second, once P2 plays unbalanced, P1 can improve (and also worsen!) his expectation by unilaterally changing his strategy: when P2 moves away from equilibrium by underbluffing, the more P1 decreases his call frequency, the more he increases his expectation; when P2 moves away from equilibrium by overbluffing, the more P1 increases his call frequency, the more he increases his expectation. Third, the spot is structurally asymmetric, because P1 is not allowed to value bet or bluff and because P2 has the positional advantage.
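The first two results can be reproduced with a few lines of Python. This is a minimal sketch and not the simulator itself: it assumes a three-card deck, a bet size of 1, and measures P2’s expected winnings from the pot (P1’s winnings are the pot minus this value). The function name and the sampled frequencies are my own choices. The 0.2 is the balanced bluff frequency $1/(P+1)$ that the indifference condition gives for a pot of 4.

```python
def p2_ev(pot, bluff_freq, call_freq):
    """P2's expected winnings, averaged over the six equally likely deals.

    Assumes P2 bets all aces, checks all kings and bets queens with
    probability bluff_freq; P1 calls with all aces, folds all queens and
    calls with kings with probability call_freq. The bet size is 1.
    """
    b, c = bluff_freq, call_freq
    deals = [
        0,                               # P1 ace, P2 king: check, P2 loses
        -b,                              # P1 ace, P2 queen: bluff is called
        pot * (1 - c) + (pot + 1) * c,   # P1 king, P2 ace: value bet
        b * (pot * (1 - c) - c),         # P1 king, P2 queen: bluff
        pot,                             # P1 queen, P2 ace: bet, P1 folds
        pot,                             # P1 queen, P2 king: check, P2 wins
    ]
    return sum(deals) / len(deals)


POT = 4
call_freqs = [0.0, 0.25, 0.5, 0.75, 1.0]
for bluff in (0.0, 0.2, 0.4):  # underbluff, balanced, overbluff
    row = [round(p2_ev(POT, bluff, c), 3) for c in call_freqs]
    print(f"bluff={bluff:.1f}: {row}")
```

The balanced row is flat at 2.1 regardless of the call frequency. In the underbluffing row P2’s expectation grows as P1 calls more, so P1 does better by folding. In the overbluffing row it shrinks as P1 calls more, so P1 does better by calling.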
No-limit betting
Fixed-limit was quite popular in the early 2000s, but it was largely displaced by the no-limit variant, where players can size their bets up to their whole remaining stack. This makes for a more entertaining game, but it also increases the complexity by introducing another decision to make. Let’s analyze how balanced play in the AKQ game presented above is affected by this variant. To keep the calculations clear, we normalize the pot to 1 and let $\alpha$ be the bet size as a fraction of the pot.
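P1’s strategy
The strategy when holding aces and queens is exactly the same as in the limit version discussed above. With kings, it is as follows:
P1 must make P2 indifferent between betting and checking his bluffs. With the pot normalized to 1 and a bet of size $\alpha$ (as a fraction of the pot):
$$EV_{P2}(\text{check } Q) = 0$$
$$EV_{P2}(\text{bet } Q) = \tfrac{1}{2}(-\alpha) + \tfrac{1}{2}\left[-\alpha\,c_{P1} + 1\cdot(1 - c_{P1})\right]$$
Setting $EV_{P2}(\text{bet } Q) = EV_{P2}(\text{check } Q)$, we find P1’s optimal call frequency when holding kings:
$$c_{P1} = \frac{1 - \alpha}{1 + \alpha}$$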
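P2’s strategy
The strategy when holding aces and kings is exactly the same as in the limit version discussed above. When holding a queen, it is as follows:
P2 must make P1 indifferent between calling and folding his bluff catchers. With the pot normalized to 1 and a bet of size $\alpha$ (as a fraction of the pot):
$$EV_{P1}(\text{fold } K) = 0$$
$$EV_{P1}(\text{call } K) = \frac{-\alpha + b_{P2}\,(1 + \alpha)}{1 + b_{P2}}$$
Setting $EV_{P1}(\text{call } K) = EV_{P1}(\text{fold } K)$, we find P2’s optimal bet frequency when holding queens:
$$b_{P2} = \frac{\alpha}{1 + \alpha}$$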
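To get a feel for how both frequencies move with the bet size, here is a minimal Python sketch. It assumes the frequencies that follow from the two indifference conditions, $(1-\alpha)/(1+\alpha)$ for P1’s calls and $\alpha/(1+\alpha)$ for P2’s bluffs, and it only covers bets up to the size of the pot. The function names are my own.

```python
def call_frequency(alpha):
    """P1's call frequency with kings for a bet of alpha times the pot."""
    if not 0 < alpha <= 1:
        raise ValueError("this sketch only covers bets of up to one pot")
    return (1 - alpha) / (1 + alpha)


def bluff_frequency(alpha):
    """P2's bet frequency with queens for a bet of alpha times the pot."""
    if not 0 < alpha <= 1:
        raise ValueError("this sketch only covers bets of up to one pot")
    return alpha / (1 + alpha)


for alpha in (0.25, 0.5, 1.0):
    c = round(call_frequency(alpha), 3)
    b = round(bluff_frequency(alpha), 3)
    print(f"alpha={alpha}: call={c}, bluff={b}")
```

A bet of a quarter of the pot corresponds to the limit spot with a pot of 4 bets, and gives the same frequencies (0.6 and 0.2). With a pot-sized bet, P1 folds all his kings.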
Optimal bet size
We have defined the optimal frequencies for both P1 and P2. However, P2 still has another decision to make: the size of his bets (value bets and bluffs). Let’s work out P2’s expected value by analyzing what happens in each possible deal, according to the frequencies obtained above.
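{card matchup, P1’s card first}: ∑(reward · occurrence), from P2’s point of view
{AK}: $0$
{AQ}: $-\alpha \cdot b_{P2}$
{KA}: $1 \cdot (1 - c_{P1}) + (\alpha + 1) \cdot c_{P1}$
{KQ}: $-\alpha \cdot b_{P2}\,c_{P1} + 1 \cdot b_{P2}\,(1 - c_{P1})$
{QA}: $1$
{QK}: $1$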
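where $c_{P1} = \frac{1 - \alpha}{1 + \alpha}$ and $b_{P2} = \frac{\alpha}{1 + \alpha}$. Since the six matchups are equally likely, adding them up and dividing by six gives P2’s expected value:
$$EV_{P2} = \frac{1}{6}\left[3 + \alpha\,c_{P1} + b_{P2}\left(1 - \alpha - c_{P1}(1 + \alpha)\right)\right] = \frac{1}{6}\left[3 + \frac{\alpha\,(1 - \alpha)}{1 + \alpha}\right]$$
The term multiplied by $b_{P2}$ vanishes because P1’s call frequency makes the bluffs break even. We can find the maximum by setting the derivative to zero:
$$\frac{dEV_{P2}}{d\alpha} = \frac{1}{6} \cdot \frac{1 - 2\alpha - \alpha^2}{(1 + \alpha)^2} = 0$$
Solving for $\alpha > 0$, we find that
$$\alpha = \sqrt{2} - 1 \approx 0.414$$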
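Hence, the balanced strategy for P1 is: call with all his aces, call with $\sqrt{2} - 1 \approx 41.4\%$ of his kings, and fold all his queens. P2 should bet all his aces with a sizing of $\sqrt{2} - 1 \approx 0.414$ of the pot, check all his kings, and bet $1 - \frac{\sqrt{2}}{2} \approx 29.3\%$ of his queens with the same sizing of $\sqrt{2} - 1$ of the pot.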
This last graph shows P2’s expectation as a function of the size of his bets, and how both the bluffing and bluff-catching frequencies are also affected by P2’s bet size. Notice that the bigger P2’s bets, the more he should bluff, whereas P1 should decrease his bluff-catching frequency, to the point of always folding if P2’s bets are pot sized. I also wrote a simulator for this game that is available for download here, where you can play with the strategy parameters.
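The location of the maximum can also be checked numerically. This is a minimal sketch, independent of the simulator mentioned above. It assumes the balanced frequencies $(1-\alpha)/(1+\alpha)$ and $\alpha/(1+\alpha)$, a pot of 1, and bets of up to one pot. The function name and the grid resolution are my own choices.

```python
import math


def p2_ev_no_limit(alpha):
    """P2's expected share of a pot of 1 when both players play balanced.

    Averages the six equally likely deals for a bet of alpha times the
    pot, with 0 < alpha <= 1.
    """
    c = (1 - alpha) / (1 + alpha)  # P1's call frequency with kings
    b = alpha / (1 + alpha)        # P2's bluff frequency with queens
    deals = [
        0,                                  # P1 ace, P2 king
        -alpha * b,                         # P1 ace, P2 queen
        1 * (1 - c) + (alpha + 1) * c,      # P1 king, P2 ace
        -alpha * b * c + 1 * b * (1 - c),   # P1 king, P2 queen
        1,                                  # P1 queen, P2 ace
        1,                                  # P1 queen, P2 king
    ]
    return sum(deals) / 6


# Coarse grid search over bet sizes from 0.001 to 1.000 of the pot.
grid = [i / 1000 for i in range(1, 1001)]
best = max(grid, key=p2_ev_no_limit)

print(best)              # grid point with the highest expectation
print(math.sqrt(2) - 1)  # closed-form candidate, about 0.4142
```

The grid search lands next to $\sqrt{2} - 1$, which agrees with setting the derivative to zero. A finer grid or a proper optimizer would narrow the gap further.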
The next article in this series will get a little closer to real poker strategies by introducing more strategic options in the AKQ game, such as allowing P1 both to value bet and bluff, or allowing both players to raise when facing a bet.