Section 5.4 Summary of Chapter 5
In this chapter, we begin by introducing the notion of zero sum games. Suppose two players, Rowan and Colleen had \(n\) and \(m\) strategies to choose from, and given an \(i, j\) choice of strategies, the net payout to Rowan was \(a_{ij}\) (with a negative value meaning a payout to Colleen). This may be recorded in what is called a payoff matrix.
Colleen | |||||
\(a_{11}\) | \(a_{12}\) | \(\cdots\) | \(\cdots\) | \(a_{1m}\) | |
\(a_{21}\) | \(a_{22}\) | \(\cdots\) | \(\cdots\) | \(a_{2m}\) | |
Rowan | \(\vdots\) | \(\vdots\) | \(\ddots\) | \(\ddots\) | \(\vdots\) |
\(\vdots\) | \(\vdots\) | \(\ddots\) | \(\ddots\) | \(\vdots\) | |
\(a_{n1}\) | \(a_{n2}\) | \(\cdots\) | \(\cdots\) | \(a_{nm}\) |
We note that some strategies may be simply bad choices for either player. For example, if there were two rows \(i, i'\) where \(a_{ij} \geq a_{i'j}\text{,}\) for each \(j\text{,}\) then there is no reason for Rowan to pick \(i'\) over \(i\) and we may delete the \(i'\) row. Similarly if for columns \(j, j'\text{,}\) \(a_{ij} \leq a_{ij'}\text{,}\) we may delete the \(j'\) column. This process of deleting rows and columns is called domination.
Once domination is complete, we can find the resulting optimal strategies by considering the primal-dial optimization problem:
Where the \(p_i, q_j\) represent probabilities for Rowan and Colleen and \(\ec{u}, \ec{v}\) represent the values of Rowan and Colleens strategies respectively.
\(\ec{v}\) | \(q_1\) | \(q_2\) | \(\cdots\) | \(q_m\) | \(-1\) | ||
\(\ec{u}\) | \(0\) | \(-1\) | \(-1\) | \(\cdots\) | \(-1\) | \(-1\) | \(=-0\) |
\(p_1\) | \(-1\) | \(a_{11}\) | \(a_{12}\) | \(\cdots\) | \(a_{1m}\) | \(0\) | \(=-t_1\) |
\(p_2\) | \(-1\) | \(a_{21}\) | \(a_{22}\) | \(\cdots\) | \(a_{2m}\) | \(0\) | \(=-t_2\) |
\(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\ddots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |
\(p_n\) | \(-1\) | \(a_{n1}\) | \(a_{n2}\) | \(\cdots\) | \(a_{nm}\) | \(0\) | \(=-t_n\) |
\(-1\) | \(-1\) | \(0\) | \(0\) | \(\cdots\) | \(0\) | \(0\) | \(=f\) |
\(=0\) | \(=s_1\) | \(=s_2\) | \(\cdots\) | \(=s_n\) | \(=g\) |
We note that if there is a saddle point, an \(a_{ij}\) that is the smallest value in row \(i\) but the greatest in column \(j\text{.}\) In this case the optimal strategies for Rowan and Colleen are \(i\) ans \(j\) respectively.
Otherwise, if there are no saddle points, then both players need to employ a random mix of strategies, and solving the above primal-dual problem find \(\p, \q\text{,}\) the probability distribution of the valid strategies for both players.
The theorem that shows this approach is valid is the von Neumann minimax Theorem Theorem 5.2.5. We first note that technically the above problem attempts to maximize \(u\) (minimize \(v\)) constrained by \(u\leq \displaystyle \min_{1\leq i\leq n} A_i \q\text{,}\) (minimize \(v\geq \displaystyle \min_{1\leq j\leq m} \p A^j\)), i.e. it maximizes or minimizes across pure strategies, but we wish to max or min across mixed strategies. These can be shown to be equivalent, a pure strategy is a type of mixed strategy, and the other inequality can be shown through some algebra.
Then if we assume that each \(a_{ij} > 0\text{,}\) then by letting \(\tilde{p}_i=\frac{p_i}{u}, \tilde{q}_j=\frac{q_j}{v}\text{,}\) the above system is equivalent to solving:
\(\tilde{q}_1\) | \(\tilde{q}_2\) | \(\cdots\) | \(\tilde{q}_m\) | \(-1\) | ||
\(\tilde{p}_1\) | \(a_{11}\) | \(a_{12}\) | \(\cdots\) | \(a_{1m}\) | \(1\) | \(=-t_1\) |
\(\tilde{p}_2\) | \(a_{21}\) | \(a_{22}\) | \(\cdots\) | \(a_{2m}\) | \(1\) | \(=-t_2\) |
\(\vdots\) | \(\vdots\) | \(\vdots\) | \(\ddots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |
\(\tilde{p}_n\) | \(a_{n1}\) | \(a_{n2}\) | \(\cdots\) | \(a_{nm}\) | \(1\) | \(=-t_n\) |
\(-1\) | \(1\) | \(1\) | \(\cdots\) | \(1\) | \(0\) | \(=f\) |
\(=s_1\) | \(=s_2\) | \(\cdots\) | \(=s_n\) | \(=g\) |
We note that since each \(a_{ij} > 0\text{,}\) then the primal region is bounded, and so by the Extreme Value Theorem, the primal problem achieves a maximum, and so by the Strong Duality Theorem, the dual achieves optimality as well.
Finally, we note that when playing games of chance, the random element prevents us from knowing the actual payouts of different strategies. But we can decide to maximize/minimize expected payouts, and then may proceed as before.