In game theory, a game is said to be a potential game if the incentive of all players to change their strategy can be expressed using a single global function called the potential function. The concept originated in a 1996 paper by Dov Monderer and Lloyd Shapley.[1]
The properties of several types of potential games have since been studied. Games can be either ordinal or cardinal potential games. In cardinal games, the difference in individual payoffs for each player from individually changing one's strategy, other things equal, has to have the same value as the difference in values for the potential function. In ordinal games, only the signs of the differences have to be the same.
The potential function is a useful tool to analyze equilibrium properties of games, since the incentives of all players are mapped into one function, and the set of pure Nash equilibria can be found by locating the local optima of the potential function. Convergence and finite-time convergence of an iterated game towards a Nash equilibrium can also be understood by studying the potential function.
Potential games can be studied as repeated games with state, so that every round played has a direct consequence on the game's state in the next round.[2] This approach has applications in distributed control such as distributed resource allocation, where players without a central correlation mechanism can cooperate to achieve a globally optimal resource distribution.
Let N be the number of players, A_i the action set of player i, A = A_1 × ⋯ × A_N the set of action profiles, and u_i : A → ℝ the payoff function of player i, for 1 ≤ i ≤ N. Given a game G = (N, A = A_1 × ⋯ × A_N, u : A → ℝ^N), we say that G is:

- an exact potential game if there is a function Φ : A → ℝ such that for all i, all a_{-i} ∈ A_{-i}, and all a'_i, a''_i ∈ A_i:

  Φ(a'_i, a_{-i}) − Φ(a''_i, a_{-i}) = u_i(a'_i, a_{-i}) − u_i(a''_i, a_{-i}).

  That is: when player i switches from action a'_i to action a''_i while the other players keep playing a_{-i}, the change in the potential Φ equals the change in player i's own payoff.

- a weighted potential game if there is a function Φ : A → ℝ and a vector w ∈ ℝ^N_{++} of positive weights such that for all i and all a_{-i} ∈ A_{-i}:

  Φ(a'_i, a_{-i}) − Φ(a''_i, a_{-i}) = w_i (u_i(a'_i, a_{-i}) − u_i(a''_i, a_{-i})).

- an ordinal potential game if there is a function Φ : A → ℝ such that for all i and all a_{-i} ∈ A_{-i}:

  u_i(a'_i, a_{-i}) − u_i(a''_i, a_{-i}) > 0 ⇔ Φ(a'_i, a_{-i}) − Φ(a''_i, a_{-i}) > 0.

- a generalized ordinal potential game if there is a function Φ : A → ℝ such that for all i and all a_{-i} ∈ A_{-i}:

  u_i(a'_i, a_{-i}) − u_i(a''_i, a_{-i}) > 0 ⇒ Φ(a'_i, a_{-i}) − Φ(a''_i, a_{-i}) > 0.

- a best-response potential game if there is a function Φ : A → ℝ such that for all i ∈ N and all a_{-i} ∈ A_{-i}:

  b_i(a_{-i}) = argmax_{a_i ∈ A_i} Φ(a_i, a_{-i}),

  where b_i(a_{-i}) denotes the best action for player i given the other players' actions a_{-i}.

Note that while there are N payoff functions, one for each player, there is only one potential function.
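The exact-potential condition can be checked mechanically by enumerating every unilateral deviation. The sketch below does this for the prisoner's dilemma with a hand-built potential; the payoff numbers and the potential values are illustrative, not part of the original text:

```python
import itertools

def is_exact_potential(payoffs, phi, action_sets):
    """Check the exact-potential condition: for every player i, every
    profile a_{-i} of the others, and every pair of own actions,
    Phi(a'_i, a_{-i}) - Phi(a''_i, a_{-i}) must equal
    u_i(a'_i, a_{-i}) - u_i(a''_i, a_{-i}).
    `payoffs[i]` and `phi` map an action profile (a tuple) to a number."""
    n = len(action_sets)
    for i in range(n):
        others = [action_sets[j] for j in range(n) if j != i]
        for a_minus_i in itertools.product(*others):
            for a1, a2 in itertools.combinations(action_sets[i], 2):
                p1 = a_minus_i[:i] + (a1,) + a_minus_i[i:]
                p2 = a_minus_i[:i] + (a2,) + a_minus_i[i:]
                if phi[p1] - phi[p2] != payoffs[i][p1] - payoffs[i][p2]:
                    return False
    return True

# Illustrative check: the prisoner's dilemma with an assumed potential.
C, D = 'C', 'D'
u1 = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}
u2 = {(C, C): 3, (C, D): 5, (D, C): 0, (D, D): 1}
phi = {(C, C): 0, (C, D): 2, (D, C): 2, (D, D): 3}
print(is_exact_potential([u1, u2], phi, [(C, D), (C, D)]))  # True
```

Every unilateral payoff change matches the change in `phi`, so this game is an exact potential game under the definition above.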
In a 2-player, 2-action game with externalities, individual players' payoffs are given by the function u_i(a_i, a_j) = b_i a_i + w a_i a_j, where a_i is player i's action, a_j is the opponent's action, and w is a positive externality from choosing the same action. The action choices are +1 and −1, as seen in the payoff matrix in Figure 1.
This game has a potential function Φ(a_1, a_2) = b_1 a_1 + b_2 a_2 + w a_1 a_2.
If player 1 moves from −1 to +1, the payoff difference is Δu_1 = u_1(+1, a_2) − u_1(−1, a_2) = 2 b_1 + 2 w a_2.
The change in potential is ΔΦ = Φ(+1, a_2) − Φ(−1, a_2) = (b_1 + b_2 a_2 + w a_2) − (−b_1 + b_2 a_2 − w a_2) = 2 b_1 + 2 w a_2 = Δu_1.
The solution for player 2 is equivalent. Using numerical values b_1 = 2, b_2 = −1, w = 3, this example transforms into a simple battle of the sexes, as shown in Figure 2. The game has two pure Nash equilibria, (+1, +1) and (−1, −1). These are also the local maxima of the potential function (Figure 3). The only stochastically stable equilibrium is (+1, +1), the global maximum of the potential function.
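The equilibrium structure of this example can be reproduced numerically. The sketch below assumes the externality payoff form u_i = b_i a_i + w a_i a_j with the illustrative values b_1 = 2, b_2 = −1, w = 3, tabulates the potential, and picks out its local maxima as the pure Nash equilibria:

```python
import itertools

# Illustrative parameters (assumed): b_1 = 2, b_2 = -1, w = 3
b1, b2, w = 2, -1, 3
actions = (-1, 1)
phi = {(a1, a2): b1 * a1 + b2 * a2 + w * a1 * a2
       for a1, a2 in itertools.product(actions, repeat=2)}

def is_local_max(a1, a2):
    """A profile is a local maximum of phi if no single player can raise
    phi by deviating unilaterally -- i.e. it is a pure Nash equilibrium."""
    return (all(phi[(a1, a2)] >= phi[(d, a2)] for d in actions) and
            all(phi[(a1, a2)] >= phi[(a1, d)] for d in actions))

equilibria = [p for p in phi if is_local_max(*p)]
print(sorted(equilibria))     # [(-1, -1), (1, 1)]: the two pure Nash equilibria
print(max(phi, key=phi.get))  # (1, 1): the global maximum of the potential
```

The global maximizer of the potential is the stochastically stable equilibrium mentioned above.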
A 2-player, 2-action game cannot be a potential game unless

[u_1(+1, −1) + u_1(−1, +1)] − [u_1(+1, +1) + u_1(−1, −1)] = [u_2(+1, −1) + u_2(−1, +1)] − [u_2(+1, +1) + u_2(−1, −1)].
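This condition is easy to test for a given 2×2 game; a minimal sketch (the payoff dictionaries are assumed example inputs, with matching pennies as a game that fails the test):

```python
def is_2x2_potential_game(u1, u2):
    """Check the 2x2 potential-game condition: both players' 'cross'
    payoff sums must exceed their 'diagonal' sums by the same amount.
    u1, u2 map profiles (a1, a2) with actions in {+1, -1} to payoffs."""
    lhs = (u1[(1, -1)] + u1[(-1, 1)]) - (u1[(1, 1)] + u1[(-1, -1)])
    rhs = (u2[(1, -1)] + u2[(-1, 1)]) - (u2[(1, 1)] + u2[(-1, -1)])
    return lhs == rhs

# Matching pennies: zero-sum, no pure Nash equilibrium, so it cannot
# pass the potential-game condition.
u1 = {(1, 1): 1, (-1, -1): 1, (1, -1): -1, (-1, 1): -1}
u2 = {p: -v for p, v in u1.items()}
print(is_2x2_potential_game(u1, u2))  # False
```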
Exact potential games are equivalent to congestion games: Rosenthal[3] proved that every congestion game has an exact potential; Monderer and Shapley proved the converse: every game with an exact potential function is a congestion game.
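Rosenthal's direction can be illustrated directly: in a congestion game, the function Φ(a) = Σ_r Σ_{k=1}^{n_r(a)} c_r(k), summing each resource's delays up to its current load n_r(a), is an exact potential for the players' congestion costs. A small sketch with made-up delay functions for a hypothetical two-route game:

```python
import itertools

# Hypothetical congestion game: 2 players each pick one of two routes;
# delays[r][k] is the delay on route r when k players use it (assumed values).
delays = {'top': [0, 1, 4], 'bottom': [0, 2, 3]}
strategies = ['top', 'bottom']

def counts(profile):
    return {r: sum(1 for a in profile if a == r) for r in delays}

def cost(i, profile):
    """Player i's congestion cost: the delay of its route at current load."""
    n = counts(profile)
    return delays[profile[i]][n[profile[i]]]

def rosenthal(profile):
    """Rosenthal's potential: for each resource, sum the delays
    c_r(1) + ... + c_r(n_r) up to the current number of users n_r."""
    n = counts(profile)
    return sum(sum(delays[r][1:n[r] + 1]) for r in delays)

# Verify the exact-potential condition for every unilateral deviation.
for profile in itertools.product(strategies, repeat=2):
    for i in range(2):
        for dev in strategies:
            new = list(profile); new[i] = dev; new = tuple(new)
            assert (rosenthal(new) - rosenthal(profile)
                    == cost(i, new) - cost(i, profile))
print("Rosenthal's function is an exact potential here")
```

Here the potential tracks costs rather than utilities, so the signs agree on both sides of the condition.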
An improvement path (also called Nash dynamics) is a sequence of strategy-vectors, in which each vector is attained from the previous vector by a single player switching his strategy to a strategy that strictly increases his utility. If a game has a generalized-ordinal-potential function Φ, then Φ is strictly increasing along every improvement path, so no improvement path can visit the same strategy-vector twice. If, in addition, the game is finite, then every improvement path is finite; this is called the finite improvement property (FIP). The last strategy-vector of a maximal improvement path is a pure-strategy Nash equilibrium, so FIP implies the existence of a pure-strategy Nash equilibrium, and that such an equilibrium can be computed by a distributed process in which each agent only has to improve his own strategy.
A best-response path is a special case of an improvement path, in which each vector is attained from the previous vector by a single player switching his strategy to a best-response strategy. The property that every best-response path is finite is called the finite best-response property (FBRP). FBRP is weaker than FIP, and it still implies the existence of a pure-strategy Nash equilibrium. It also implies that a Nash equilibrium can be computed by a distributed process, but the computational burden on the agents is higher than with FIP, since they have to compute a best response.
An even weaker property is weak-acyclicity (WA).[5] It means that, for any initial strategy-vector, there exists a finite best-response path starting at that vector. Weak-acyclicity is not sufficient for the existence of a potential function (since some improvement paths may be cyclic), but it is sufficient for the existence of a pure-strategy Nash equilibrium. It implies that a Nash equilibrium can be computed almost-surely by a stochastic distributed process, in which at each point a player is chosen at random, and this player chooses a best response at random.
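These convergence notions can be watched in simulation. The sketch below runs best-response dynamics on a 2×2 potential game, with the externality payoff form u_i = b_i a_i + w a_i a_j and the illustrative values b = (2, −1), w = 3 assumed; because the game has a potential, the path is finite and ends at a pure Nash equilibrium:

```python
# Illustrative 2x2 potential game (assumed payoffs): u_i = b_i*a_i + w*a_i*a_j
b, w = (2, -1), 3
actions = (-1, 1)

def payoff(i, profile):
    return b[i] * profile[i] + w * profile[0] * profile[1]

def best_response_dynamics(profile):
    """Repeatedly let a player switch to a best response until no player
    can strictly improve; returns the best-response path traversed."""
    path = [profile]
    improved = True
    while improved:
        improved = False
        for i in range(2):
            best = max(actions, key=lambda a:
                       payoff(i, profile[:i] + (a,) + profile[i + 1:]))
            new = profile[:i] + (best,) + profile[i + 1:]
            if payoff(i, new) > payoff(i, profile):
                profile = new
                path.append(profile)
                improved = True
    return path

print(best_response_dynamics((-1, 1)))  # [(-1, 1), (1, 1)]
```

The terminal profile of the returned path is a pure Nash equilibrium; which equilibrium is reached depends on the starting point.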