Game-theoretic optimal strategy (I)

Game-theoretic optimal (GTO) strategy is a topic in game theory that has become popular in the poker community in recent years. The term optimal can be misleading, since it suggests the most profitable strategy we can follow. In practice, it is usually much more profitable to play far away from optimal. For this reason, I’d rather call it balanced play: a GTO strategy is balanced regardless of the opponent’s strategy, whereas it is optimal, in terms of profitability, only if the opponent is also playing optimally. Specifically in poker, balanced play might be defined as the set of legitimate actions given a distribution of hands, the potential reward, and the potential risk. But wait: if most of the time (if not always) it is much more profitable to play non-optimal strategies, why does optimal play matter? Is it worth investing your time in it? Yes, it is. Let’s see why.

Successful poker players are good at finding their opponents’ weaknesses (also called leaks) and designing profitable strategies against them. So far so good, but what actually is a weakness? A weakness is an unbalanced strategy, and therefore an exploitable one. Optimal play is built precisely on balanced strategies, and is therefore unexploitable. Thus, as you can guess, the key to identifying leaks is knowing what optimal play looks like.

As a final remark, notice that balanced play only applies to games with hidden information. Balanced play protects you from an opponent exploiting what your actions reveal about your hidden information (your hole cards in poker). In games of perfect information, such as chess or go, you do not have to balance your play. Instead, you should always play the move with the highest value.

Case study: AKQ game

Consider the following toy game:

Two players (P₁ and P₂)
The deck contains only three cards (an ace, a king and a queen), and each player is dealt one
Limit betting
The pot is P bets
P₁ acts first and is forced to check
P₂ can bet or check
If P₂ bets, P₁ can call or fold

Before analyzing the optimal strategy for this game, we first introduce the concept of domination.

A strategy $s_{D}$ dominates a strategy $s_{d}$ if

$(\forall S_{P_{2}}, \mathrm{E}_{s_{D}} \geqslant \mathrm{E}_{s_{d}}) \wedge (\exists S_{P_{2}}, \mathrm{E}_{s_{D}} > \mathrm{E}_{s_{d}})$

That is, a strategic action for a given player dominates another strategic action if its expectation isn’t worse against any possible opponent’s response, and it is better against at least one of these responses.
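The definition can be checked mechanically on a small payoff table. The following is a minimal sketch, not part of the original analysis: the function `dominates` and the example numbers (P₂ holding an ace in a pot of 4 bets, against a P₁ who either always folds or always calls) are assumptions chosen for illustration.

```python
def dominates(ev_a, ev_b):
    """Return True if option a dominates option b.

    ev_a and ev_b list the expectation of each option against the same
    sequence of opponent responses.
    """
    never_worse = all(a >= b for a, b in zip(ev_a, ev_b))
    sometimes_better = any(a > b for a, b in zip(ev_a, ev_b))
    return never_worse and sometimes_better


# Hypothetical example: P2 holds an ace and the pot is 4 bets.
# Opponent responses considered: P1 always folds to a bet, P1 always calls.
POT = 4
ev_bet = [POT, POT + 1]  # wins the pot, or the pot plus one called bet
ev_check = [POT, POT]    # wins the pot at showdown either way

print(dominates(ev_bet, ev_check))  # True
print(dominates(ev_check, ev_bet))  # False
```

Betting is never worse than checking and is strictly better when P₁ calls, which is exactly the two-part condition in the formula.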

The most intuitive example of domination is value betting when holding the nuts and being last to act (or raising if facing a bet), but other spots also include dominated strategies as we will see later.

P₁’s strategy
When first to act, P₁ has no decision to make, as the rules of the game force him to check. When facing a bet, he must decide whether to call or fold. If P₁ holds an ace, he must always call, since calling dominates folding; if he holds a queen, he must always fold, since calling is dominated. The non-trivial decision arises when he holds a king:

P₁ must make P₂ indifferent between betting and checking his bluffs (queens). When P₂ holds a queen, P₁ holds an ace or a king with equal probability; the ace always calls, and the king calls with frequency $c_{P_{1}}$:

${P_{2}}_{check}^Q=0$

${P_{2}}_{bet}^Q=\frac{1}{2}(-1) + \frac{1}{2}(c_{{P_{1}}}(-1)+(1-c_{{P_{1}}})P)$

Setting ${P_{2}}_{check}^Q = {P_{2}}_{bet}^Q$ and solving, we find P₁’s optimal call frequency when holding a king

$c_{P_{1}}=\frac{P-1}{P+1}$

Hence, the balanced strategy for P₁ is built as: calling all his aces, calling $\frac{P-1}{P+1}$ of his kings, and folding all his queens.
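As a quick sanity check of the call frequency, the sketch below evaluates P₂’s bluffing expectation with exact fractions. The function names `p1_call_frequency` and `p2_bluff_ev` and the pot size of 4 are assumptions made for this example.

```python
from fractions import Fraction


def p1_call_frequency(pot):
    """Fraction of kings P1 calls with: c = (P - 1) / (P + 1)."""
    pot = Fraction(pot)
    return (pot - 1) / (pot + 1)


def p2_bluff_ev(pot, call_freq):
    """EV of P2 betting a queen; P1 holds an ace or a king equally often."""
    pot = Fraction(pot)
    vs_ace = Fraction(-1)                               # the ace always calls
    vs_king = call_freq * (-1) + (1 - call_freq) * pot  # the king calls or folds
    return Fraction(1, 2) * vs_ace + Fraction(1, 2) * vs_king


c = p1_call_frequency(4)
print(c)                  # 3/5
print(p2_bluff_ev(4, c))  # 0
```

With a pot of 4 bets, calling with 3/5 of the kings makes the bluff worth exactly the same as checking (zero). Calling less often makes bluffing profitable, and calling more often makes it a losing play.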

P₂’s strategy
P₂ always faces a check from P₁, and has to decide whether to bet or check with each possible hand. Betting aces dominates checking aces (you are betting the nuts when last to act), and checking kings dominates betting kings (a king can neither extract value, since queens fold, nor bluff, since aces call). That settles the strategy with aces and kings; the strategy with queens is as follows:

P₂ must make P₁ indifferent between calling and folding his bluff catchers (kings). When P₁ holds a king, P₂ holds an ace or a queen with equal probability; the ace always bets, and the queen bets with frequency $b_{P_{2}}$:

${P_{1}}_{fold}^K=0$

${P_{1}}_{call}^K=\frac{1}{2}(-1) + \frac{1}{2}(b_{{P_{2}}}(P+1))$

Setting ${P_{1}}_{fold}^K = {P_{1}}_{call}^K$ and solving, we find P₂’s optimal bet frequency when holding a queen

$b_{P_{2}}=\frac{1}{P+1}$

Summing up, the balanced strategy for P₂ is built as: betting all his aces, checking all his kings, and betting $\frac{1}{P+1}$ of his queens.
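The same kind of check works for the bluffing frequency. This is a minimal sketch under the same assumptions as the equations above (pot of 4 bets, expectation measured relative to folding); the function names are hypothetical.

```python
from fractions import Fraction


def p2_bluff_frequency(pot):
    """Fraction of queens P2 bets: b = 1 / (P + 1)."""
    pot = Fraction(pot)
    return 1 / (pot + 1)


def p1_call_ev(pot, bluff_freq):
    """EV of P1 calling with a king, relative to folding (EV 0).

    P2 holds an ace or a queen equally often; the ace always bets and
    the queen bets with probability bluff_freq.
    """
    pot = Fraction(pot)
    vs_ace = Fraction(-1)               # the call loses one bet
    vs_queen = bluff_freq * (pot + 1)   # the call wins the pot plus the bluff
    return Fraction(1, 2) * vs_ace + Fraction(1, 2) * vs_queen


b = p2_bluff_frequency(4)
print(b)                 # 1/5
print(p1_call_ev(4, b))  # 0
```

If P₂ never bluffs, calling with a king loses money; if he bluffs every queen, calling becomes clearly profitable. Only at 1/5 is P₁ indifferent.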

Moving to unbalance
Now we know how both players should play in order to be unexploitable. But what happens when a player decides to play unbalanced?
I wrote a simulator for this game that plots how the expectation varies across several P₂ strategies and P₁ responses. The graph below represents the expectation for several bluffing frequencies and calling frequencies in a spot where the pot is 4.

You can see three noticeable results. First, when P₂ plays balanced, the EV remains constant for any P₁ call frequency, which is consistent with P₂’s goal of making P₁ indifferent between calling and folding. Second, once P₂ plays unbalanced, P₁ can improve (and also worsen!) his expectation by unilaterally changing his strategy: when P₂ moves away from equilibrium by underbluffing, the more P₁ decreases his call frequency the more he increases his expectation, and when P₂ moves away from equilibrium by overbluffing, the more P₁ increases his call frequency the more he increases his expectation. Third, the spot is structurally asymmetric $(\frac {\mathrm{E_{P_{1}}}}{\mathrm{E_{P_{2}}}}\neq 1)$, because P₁ is not allowed to value bet or bluff and because of P₂’s positional advantage.
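These observations can be reproduced with a few lines of code. The sketch below is not the simulator mentioned above; it is a minimal stand-in that assumes the six deals are equally likely, that aces always bet and call, that P₂ checks kings and P₁ folds queens, and that P₂’s expectation counts the pot won minus any bet lost. The function name `p2_expectation` and the sampled frequencies are choices made for this example.

```python
from fractions import Fraction


def p2_expectation(pot, bluff_freq, call_freq):
    """P2's EV in the limit AKQ game, averaged over the six possible deals."""
    p, b, c = Fraction(pot), Fraction(bluff_freq), Fraction(call_freq)
    matchups = [
        Fraction(0),                 # P1 ace, P2 king: check, lose the showdown
        -b,                          # P1 ace, P2 queen: the bluff is always called
        (1 - c) * p + c * (p + 1),   # P1 king, P2 ace: value bet, folded to or called
        b * (-c + (1 - c) * p),      # P1 king, P2 queen: bluff, called or folded to
        p,                           # P1 queen, P2 ace: bet, P1 folds
        p,                           # P1 queen, P2 king: check, win the showdown
    ]
    return sum(matchups) / 6


POT = 4
BALANCED_BLUFF = Fraction(1, POT + 1)
CALL_FREQS = [Fraction(k, 4) for k in range(5)]

for label, b in (
    ("underbluff", Fraction(0)),
    ("balanced", BALANCED_BLUFF),
    ("overbluff", Fraction(1, 2)),
):
    row = [p2_expectation(POT, b, c) for c in CALL_FREQS]
    print(label, [float(ev) for ev in row])
```

P₁’s expectation is the pot minus P₂’s, so anything that lowers P₂’s number raises P₁’s. The balanced row is flat, the underbluffing row rewards P₁ for calling less, and the overbluffing row rewards him for calling more.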

No-limit betting

Fixed-limit was quite popular in the early 2000s, but it was largely displaced by the no-limit variant, where players can size their bets up to their whole remaining stack. This makes for a more entertaining game, but it also increases the complexity by introducing another decision: the bet size. Let’s analyze how balanced play in the AKQ game presented above is affected by this variant. To keep the calculations clear, we normalize the pot to 1 and let $\alpha$ be the bet size expressed as a fraction of the pot.

P₁’s strategy
The strategy when holding aces and queens is exactly the same as in the limit version reasoned above. With kings it is as follows:

P₁ must make P₂ indifferent between betting and checking his bluffs:

${P_{2}}_{check}^Q=0$

${P_{2}}_{bet}^Q=\frac{1}{2}(-\alpha) + \frac{1}{2}(c_{{P_{1}}}(-\alpha)+(1-c_{{P_{1}}})1)$

Setting ${P_{2}}_{check}^Q = {P_{2}}_{bet}^Q$ and solving, we find P₁’s optimal call frequency when holding a king

$c_{P_{1}}=\frac{1-\alpha}{1+\alpha}$

P₂’s strategy
The strategy when holding aces and kings is exactly the same as in the limit version reasoned above. With queens it is as follows:

P₂ must make P₁ indifferent between calling and folding his bluff catchers:

${P_{1}}_{fold}^K=0$

${P_{1}}_{call}^K=\frac{1}{2}(-\alpha) + \frac{1}{2}(b_{{P_{2}}}(\alpha+1))$

Setting ${P_{1}}_{fold}^K = {P_{1}}_{call}^K$ and solving, we find P₂’s optimal bet frequency when holding a queen

$b_{P_{2}}=\frac{\alpha}{\alpha+1}$
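To see how the two frequencies move with the bet size, the following sketch tabulates them for a few values of $\alpha$. The function names and the sampled bet sizes are assumptions made for illustration.

```python
from fractions import Fraction


def call_frequency(alpha):
    """Fraction of kings P1 calls with when facing a bet of alpha pots."""
    return (1 - alpha) / (1 + alpha)


def bluff_frequency(alpha):
    """Fraction of queens P2 bets when sizing his bets at alpha pots."""
    return alpha / (alpha + 1)


for alpha in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)):
    print(alpha, call_frequency(alpha), bluff_frequency(alpha))
# 1/4 3/5 1/5
# 1/2 1/3 1/3
# 3/4 1/7 3/7
# 1 0 1/2
```

The first row doubles as a consistency check: a bet of 1/4 of the pot is the same spot as the limit game with a pot of 4 bets, and substituting $\alpha=\frac{1}{P}$ into both formulas recovers the limit frequencies $\frac{P-1}{P+1}$ and $\frac{1}{P+1}$.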

Optimal bet size
We have defined the optimal frequencies for both P₁ and P₂. However, P₂ still has another decision to make: the size of his bets (value bets and bluffs). Let’s find P₂’s expected value by analyzing what happens in each possible deal, given the frequencies obtained above. Each matchup lists P₁’s card first and P₂’s card second.

{card matchup}: ∑(reward·occurrence)

{AK}: 0
{AQ}: -α·b_P₂
{KA}: 1·(1 - c_P₁) + (α + 1)·c_P₁
{KQ}: -α·b_P₂·c_P₁ + 1·b_P₂·(1 - c_P₁)
{QA}: 1
{QK}: 1

Since the six matchups are equally likely, their sum is proportional to P₂’s expected value:

$\mathrm{E}_{P_{2}}=\sum \left \{ card\: matchup \right\}=$

$=0-\alpha b_{P_{2}}+(1-c_{P_{1}})+(\alpha+1)c_{P_{1}}-\alpha b_{P_{2}}c_{P_{1}}+b_{P_{2}}(1-c_{P_{1}})+1+1$

Substituting $c_{P_{1}}=\frac{1-\alpha}{1+\alpha}$ and $b_{P_{2}}=\frac{\alpha}{\alpha+1}$, we can find its maximum by setting its derivative with respect to $\alpha$ to zero

$\frac{\mathrm{d}\mathrm{E}_{P_{2}}}{\mathrm{d}\alpha }=\frac{(-2\alpha+2)(\alpha+1)-(-\alpha^{2}+2\alpha-1)}{(\alpha+1)^{2}}+\frac{(3\alpha^{2}-1)(\alpha+1)^{2}-(2\alpha+2)(\alpha^{3}-\alpha)}{(\alpha+1)^{4}}-1=\frac{2}{(\alpha+1)^{2}}-1=0$

Solving, we find that $\alpha=\sqrt{2}-1\approx 0.414$

Hence, the balanced strategy for P₁ is built as: calling all his aces, calling $\frac{1-(\sqrt{2}-1)}{1+(\sqrt{2}-1)}\approx 41.4\%$ of his kings, and folding all his queens. P₂ should bet all his aces with a sizing of $\sqrt{2}-1\approx 0.414$ of the pot, check all his kings, and bet $\frac{(\sqrt{2}-1)}{(\sqrt{2}-1)+1}\approx 29.3\%$ of his queens with the same sizing of $\sqrt{2}-1\approx 0.414$ of the pot.
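The optimal sizing can also be confirmed numerically, without taking any derivative. The sketch below evaluates the matchup sum on a grid of bet sizes between 0 and 1 pot, with both players using the frequencies derived above; the grid resolution and the function name `p2_expectation` are arbitrary choices made for this example.

```python
import math


def p2_expectation(alpha):
    """Sum of P2's rewards over the six matchups for a bet of alpha pots."""
    c = (1 - alpha) / (1 + alpha)  # P1's call frequency with kings
    b = alpha / (alpha + 1)        # P2's bluff frequency with queens
    return (
        0                                  # {AK}
        - alpha * b                        # {AQ}
        + (1 - c) + (alpha + 1) * c        # {KA}
        - alpha * b * c + b * (1 - c)      # {KQ}
        + 1                                # {QA}
        + 1                                # {QK}
    )


grid = [i / 10000 for i in range(10001)]  # bet sizes from 0 to 1 pot
best = max(grid, key=p2_expectation)

print(round(best, 3))              # 0.414
print(round(math.sqrt(2) - 1, 3))  # 0.414
```

The grid search lands on the same bet size as the closed-form solution, which is a useful guard against algebra slips in the derivative.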

This last graph shows P₂’s expectation as a function of his bet size, and how both the bluffing and bluff-catching frequencies are affected by P₂’s bet size. Notice that the bigger P₂’s bets, the more he should bluff, whereas P₁ should decrease his bluff-catching frequency, to the point of always folding his kings if P₂’s bets are pot sized. I also wrote a simulator for this game that is available for download here, where you can play with the strategy parameters.

The next article in this series will get a little closer to real poker strategies by introducing more strategic options in the AKQ game, such as allowing P₁ both to value bet and bluff, or allowing both players to raise when facing a bet.

By Javier Olivito
