Game-theoretic optimal strategy (II)

In the previous article of the series Game-theoretic optimal strategy, the concept of optimal play was introduced and illustrated solving a toy game. In this article, we are tackling a more complex game called heads-up push/fold, which is defined as follows:

Two players (SB and BB)
SB posts $0.5bb$ and BB $1.0bb$
SB acts first and is forced to go allin or fold
BB can call or fold when facing an allin

This game has practical use in short-stack blind-war spots, which often occur in tournaments. The solution of this game cannot be considered the solution of the real spot, though. Forcing the small blind to play push or fold leaves out of the analysis other lines such as limping or raising, and their postflop play involved, which might lead to balanced lines with higher expected value. However, the push/fold approach is a good approximation, especially with very short stacks.

Maximally exploitative strategy

A key concept we must handle before solving the game is maximally exploitative strategy (MES). Suppose you know exactly how your opponent plays; this means you know the frequency of each action he will take with every single hand. With such information, you can compute the equity of your legal actions and select the most valued one. Let us illustrate this process with a toy example where both players play with an effective stack of $10bb$ and are dealt only with [Ac][Ks], [9d][7d], [6h][5s].

The player in the SB knows that the player in the BB calls with [Ac][Ks], [9d][7d] and folds with [6h][5s]. We know that the reward when he folds is $-0.5bb$ , as he loses the small blind posted to play the hand; the reward when he pushes and the BB folds is $1.0bb$ , as he wins the big blind the opponent posted to play the hand; the reward when he pushes and gets called is the expected value (EV) of his hand against the range of hands he gets called. What is the MES with [Ac][Ks]?

We know by enumerating the outcome in all the combinations of boards that [Ac][Ks] wins 59.13%, ties 0.19% and loses 40.49% when facing [9d][7d], yielding an EV of $0.5913\cdot 10bb - 0.4049 \cdot (-10bb) = 1.864bb$ . Hence:

$EV_{push}=\frac{1}{2}1.864bb+\frac{1}{2}1.0bb = 1.432bb$

$EV_{fold}=-0.5bb$

Pushing [Ac][Ks] is more profitable than folding, so our MES pushes that hand all the time. Repeating this process for the remaining two hands, we get $EV_{push}= -0.182bb$ for [9d][7d], and $EV_{push}= -2.783bb$ for [6h][5s]. Therefore, our MES against that opponent’s strategy consists in pushing [Ac][Ks] and [9d][7d], and folding [6h][5s]. Notice that MESs are always pure strategies (i.e., always performing the same action in a certain spot).

Fictitious play

How do we find the optimal play? There are several methods that find Nash equilibria. The one chosen for this work is called Fictitious play (George W. Brown, 1951). The learning process of this method begins with any arbitrary strategy for each player. Then, it performs an iterative process where each step consists in finding the pure maximally exploitative strategy against the opponent’s strategy and mixing it with its current strategy. Several mixing functions can be used, but only the harmonic series $\sum \frac{1}{n}$ has been proven to converge.

sb = init_strategy(random)
bb = init_strategy(random)
n = 2
do
    sb_max = maximally_exploitative(bb)
    sb_max = maximally_exploitative(sb)
    weigth = 1 / n
    sb = (1 - weight) * sb + weight * sb_max
    bb = (1 - weight) * bb + weight * bb_max
    n = n + 1
until(convergence_criterion)

Each iteration yields a -potentially- mixed strategy. The process finishes once a pre-established convergence criterion is met. The charts below depict the optimal play for both the SB and the BB when playing stacks up to $15.0bb$ . This kind of chart shows all starting hands grouped into offsuited and suited. Hands over the major diagonal are suited whereas hands below are offsuited. The chart for the SB specifies the largest stack each hand must be pushed with; the chart for the BB the largest stack each hand must be called with.

For instance, suppose we are in the SB dealt with [Jc][6h] and the effective stack is $8.3bb$ . What should be our move? Our hand is offsuited, so we should look at the row and column under the major diagonal that matches our hand: the cell placed at column J and row 6 rules that we should fold as this hand must be pushed only when the effective stack is lower than $6.3bb$ .

Similarly, if we are in the BB and dealt with the same hand, we should call as long as the effective stack is under $5.5bb$ .

The actual optimal play is not as simple as a unique threshold for each hand; things are a bit more complex. However, the solution presented in these charts makes it easily readable for humans and therefore of practical use.

A key property of algorithms that finds out Nash equilibria is convergence speed. The more complex the game the more computations are required. This is a game easy to solve, but convergence speed becomes a major concern on more complex games, such as real poker, where the remaining preflop lines and their postflop game involved are considered. By way of illustration, I have plotted the convergence dynamics of fictitious play while building the optimal strategy for the SB and the BB playing with an effective stack of $15bb$ . A few thousand iterations are enough for most hands to converge.

Software repositories

I have released three applications:

Equity builder: The core of this app is a hand evaluator. It generates the preflop equity of any hand versus any hand.
Heads-up push fold solver: Finds the Nash equilibrium of the heads-up push/fold game by applying fictitious play. It leverages the precomputed equity written by Equity builder to speed up the computations.
Plotter: Plots the push and call charts and the convergence dynamics.

Equity builder and Heads-up push fold solver were written in C for performance purposes, and Plotter was written in Python. They are available for download at my GitHub optimal play repository

By Javier Olivito

Javier Olivito

Connect

Game-theoretic optimal strategy (II)

Maximally exploitative strategy

Fictitious play

Software repositories

Categories

Tags

Leave a Reply Cancel reply

Recent Posts

Recent Comments

Archives

Categories

Meta

Maximally exploitative strategy

Fictitious play

Software repositories

Share

Categories

Tags

Leave a Reply Cancel reply

Recent Posts

Recent Comments

Archives

Categories

Meta