Home > theory > Bimatrix games

## Bimatrix games

This week in Israel is the Workshop on Innovations in Algorithmic Game Theory, which has been a collection of amazing talks and I thought of blogging about something new I learned here. First, Paul Goldberg gave an amazing talk about the Lemke-Howson algorithm and Homotopy Methods. Later, during the poster session Jugal Garg presented the impressive work on an exact algorithm for finding Nash equilibria in rank 1 bimatrix games. Both have in common the use of homotopy, which I found I quite cool idea – and has been present in Game Theory for a while, but I didn’t know about. They also have to do with improving the understanding we have on the Lemke-Howson Algorithm – a (potentially exponential time) algorithm to find Nash equilibrium in 2-person bi-matrix games, but that seems to work pretty well in practice. As always, I though that blogging about it would be a good idea in order to understand those concepts well.

Bimatrix Game and How to compute equilibrium

Bimatrix game is the simplest class of games studied. Basically it is a $2$ player game, with $n$ strategies for player $1$ and $m$ strategies for player 2 which is represented by a pair of $n \times m$ matrices $(A,B)$. Let $x \in \mathbb{R}_+^n, \Vert x \Vert_1 = 1$ represent a probability distribution that player $1$ assigns to his strategies and $y \in \mathbb{R}_+^m, \Vert y \Vert_1 = 1$ be the same for player $2$. This way, the players experience utilities: $2$

The best understood class of those games is the one where $A+B = 0$, called zero-sum games. For this class, computing a Nash equilibrium is very easy and it is given by the famous min-max theorem: player $1$ finds $x$ maximizing $\min_j (x^t A e_j)$ where $e_j$ is the $j$-th unit vector. Similarly player $2$ finds $y$ maximizing $\min_i (e_i^t B y)$. Then the pair of strategies obtained is a Nash equilibrium – and this verifying that is not hard.

When $A+B \neq 0$, the problems gets a lot more complicated. Proving that equilibrium exist can be done using various fixed point theorems, as Brouwer or Kakutani. There is a very simple exponential time algorithm for finding it and the key observation is the following, if $(x,y)$ is a Nash equilibrium then: $\forall i: x_i = 0 \text{ or } \forall i': e_i^t A y \geq e_{i'}^t A y$ $\forall j: y_j = 0 \text{ or } \forall j': x^t B e_j \geq x^t B e_{j'}$

which means that each strategy for player $i$ is either in the support or is a best response. Proving that is trivial (if some strategy is in the support and is not a best response, then reducing the probability we play it is an improving deviation). Therefore if we just guess the support of $x$ and the support of $y$ we just need to find some strategies with this support satisfying the inequalities above. This can be done using a simple LP. This is clearly not very efficient, since it involves solving $2^{n+m}$ LPs. A still exponential, but a lot more practical method, is:

Lemke-Howson Algorithm

A good overview of the L-H algorithm can be found in those lecture notes or in a more detailed version in third chapter of the AGT book (a pdf  can be found in Tim’s website). Here I’ll present a quick overview. The main idea is to define the best-response polytopes $P$ and $Q$: $Q$ $Q$

The intuition is that a point $(x,v) \in P$ represents the fact that the payoff $v$ of player $2$ when player $1$ plays $x$ is at least $v$. We could re-write as $v \geq \max_j x^t B e_j$.

Each of the polytopes has $m+n$ inequalities and $1$ equality. Given $x$ we define $L(x)$ as the indices of the tight inequalities in $P$ and similarly we define $L(y)$ as the indices of the tight inequalities in $Q$. The theorem in the previous section can be rephrased as: $(x,y)$ is Nash equilibrium iff $L(x) \cup L(y) = \{1, 2, \hdots, n+m\}$

So we need to look at points in the polytope below that are fully-labeled. And the way that this is done in L-H is quite ingenious – it is a similar idea that is used by the Simplex Method – walking through the vertices of the polytope looking for the desired vertex. In order to do it, let’s define another polytope: $L(x) \cup L(y) = \{1, 2, \hdots, n+m\}$ $L(x) \cup L(y) = \{1, 2, \hdots, n+m\}$

Note that there is a clear bijection between $P' \setminus 0 \leftrightarrow P$ and $Q' \setminus 0 \leftrightarrow Q$. This is a projective transformation given by $x \mapsto (\frac{x}{\Vert x \Vert_1},\Vert x \Vert_1)$. Notice that the labels are preserved by this transformation and the vertex and edges of the polytope are mapped almost 1-1 (except the vertex $0$). Notice a couple of details:

1. A vertex in $x \in P'$ corresponds to a point with $n$ labels. A vertex in $y \in Q''$ corresponds to a point with $m$ labels.

2. The point $(0,0)$ corresponds to a fully-labeled point of $P' \times Q'$, unfortunately this is the only fully-labeled point that doesn’t correspond to a Nash equilibrium.

3. By taking an edge of the polytope (which is define by $n-1$ labels in $P'$ and $m-1$ labels in $Q'$. We can move between two nodes that are almost fully labeled.

The idea is to consider the set of nodes that are almost fully-labeled: fix some label $k$ and consider all the nodes such that $L(x) \cup L(y) \supseteq \{1 .. k-1, k+1 .. n+m\}$. Those points are either $(0,0)$, a Nash equilibrium or they have one duplicated label, since $\vert L(x) \vert + \vert L(y) \vert = n+m$. An edge is composed of $n+m-1$ labels (no duplicated labels). So, one almost fully-labeled point that is not a Nash or $(0,0)$ is only connected to two other vertices via an edge (which correspond to dropping one of the duplicated labels and following the corresponding edge).

This gives us a topology of the space of Nash equilibria. This tells us a couple of facts: (1) we can find a Nash equilibrium by starting in $(0,0)$ and following the L-H path. In the end of the path, we must find a Nash equilibrium. (ii) If the polytope is not degenerate, then there is an odd number of Nash equilibria and they are connected by L-H edges in the following way: where the blue dots are the Nash equilibria, the white dots are the almost fully-labeled points and the edges are the L-H edges. The number of Nash equilibria are odd by a simple parity argument.

The classic way in which simplex-like methods walk through the vertices of a polytope is by pivoting. The same way we can implement Lemke-Howson. For an explanation on the implementation, please look at the chapter 3 of the AGT book.

One can ask if we could modify the L-H path following to go through the path faster. Recently, Goldberg, Papadimitriou and Savani proved that finding the Nash equilibrium that L-H outputs is PSPACE-complete. So, in principle, finding this specific equilibrium, seems harder than finding any Nash, which is PPAD-complete.

My current plan is on some other blog posts on Bimatrix games. I wanted to discuss two things: first the homotopy methods and homotopy fixed-point-theorems (which is the heart of the two papers mentioned above) and about new techniques for solving zero-matrix where the strategy space is exponential.

Categories: theory Tags: