Price of anarchy explained

The Price of Anarchy (PoA)^[1] is a concept in economics and game theory that measures how the efficiency of a system degrades due to the selfish behavior of its agents. It is a general notion that can be extended to diverse systems and notions of efficiency. For example, consider the transportation system of a city and many agents trying to go from some initial location to a destination. Here, efficiency means the average time for an agent to reach the destination. In the 'centralized' solution, a central authority can tell each agent which path to take in order to minimize the average travel time. In the 'decentralized' version, each agent chooses its own path. The Price of Anarchy measures the ratio between the average travel times in the two cases.

Usually the system is modeled as a game and the efficiency is some function of the outcomes (e.g. maximum delay in a network, congestion in a transportation system, social welfare in an auction, etc.). Different concepts of equilibrium can be used to model the selfish behavior of the agents, among which the most common is the Nash equilibrium. Different flavors of Nash equilibrium lead to variations of the notion of Price of Anarchy, such as the Pure Price of Anarchy (for deterministic equilibria), the Mixed Price of Anarchy (for randomized equilibria), and the Bayes–Nash Price of Anarchy (for games with incomplete information). Solution concepts other than Nash equilibrium lead to variations such as the Price of Sinking.^[2]

The term Price of Anarchy was first used by Elias Koutsoupias and Christos Papadimitriou, but the idea of measuring the inefficiency of equilibria is older.^[3] The concept in its current form was designed to be the analogue of the 'approximation ratio' in an approximation algorithm or the 'competitive ratio' in an online algorithm. This is in the context of the current trend of analyzing games through an algorithmic lens (algorithmic game theory).

Mathematical definition

Consider a game $G=(N,S,u)$, defined by a set of players $N$, strategy sets $S_i$ for each player and utilities $u_i:S\to\mathbb{R}$ (where $S=S_1\times\cdots\times S_n$ is also called the set of outcomes). We can define a measure of efficiency of each outcome, which we call the welfare function $\operatorname{Welf}:S\to\mathbb{R}$. Natural candidates include the sum of the players' utilities (utilitarian objective)

$$\operatorname{Welf}(s)=\sum_{i\in N}u_i(s),$$

the minimum utility (fairness or egalitarian objective)

$$\operatorname{Welf}(s)=\min_{i\in N}u_i(s),$$

or any function that is meaningful for the particular game being analyzed and is desirable to maximize.

We can define a subset $\operatorname{Equil}\subseteq S$ to be the set of strategies in equilibrium (for example, the set of Nash equilibria). The Price of Anarchy is then defined as the ratio between the optimal 'centralized' solution and the 'worst equilibrium':

$$\operatorname{PoA}=\frac{\max_{s\in S}\operatorname{Welf}(s)}{\min_{s\in\operatorname{Equil}}\operatorname{Welf}(s)}$$

If, instead of a 'welfare' which we want to 'maximize', the function measuring efficiency is a 'cost function' $\operatorname{Cost}:S\to\mathbb{R}$ which we want to 'minimize' (e.g. delay in a network), we use (following the convention in approximation algorithms):

$$\operatorname{PoA}=\frac{\max_{s\in\operatorname{Equil}}\operatorname{Cost}(s)}{\min_{s\in S}\operatorname{Cost}(s)}$$

A related notion is that of the Price of Stability (PoS), which measures the ratio between the optimal 'centralized' solution and the 'best equilibrium':

$$\operatorname{PoS}=\frac{\max_{s\in S}\operatorname{Welf}(s)}{\max_{s\in\operatorname{Equil}}\operatorname{Welf}(s)}$$

or in the case of cost functions:

$$\operatorname{PoS}=\frac{\min_{s\in\operatorname{Equil}}\operatorname{Cost}(s)}{\min_{s\in S}\operatorname{Cost}(s)}$$

We know that $1\leq\operatorname{PoS}\leq\operatorname{PoA}$ by definition. It is expected that the loss in efficiency due to game-theoretical constraints is somewhere between PoS and PoA.
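The four ratios above can be written down directly once the value of every outcome and the equilibrium set are known. The following is a minimal sketch, not a reference implementation: the function names and the three-outcome game are hypothetical, and all values are assumed to be strictly positive so that the ratios are well defined.

```python
def poa_pos_welfare(welfare, equilibria):
    """Return (PoA, PoS) when larger values are better.

    welfare: dict mapping each outcome to its (positive) welfare.
    equilibria: non-empty collection of outcomes that are equilibria.
    """
    best = max(welfare.values())
    eq_values = [welfare[s] for s in equilibria]
    return best / min(eq_values), best / max(eq_values)


def poa_pos_cost(cost, equilibria):
    """Return (PoA, PoS) when smaller values are better (positive costs)."""
    best = min(cost.values())
    eq_values = [cost[s] for s in equilibria]
    return max(eq_values) / best, min(eq_values) / best


# Hypothetical game with three outcomes; "x" and "y" are assumed to be equilibria.
welfare = {"x": 4.0, "y": 8.0, "z": 10.0}
poa, pos = poa_pos_welfare(welfare, ["x", "y"])
print(poa, pos)  # 2.5 1.25
```

In both conventions the optimum is taken over all outcomes, while the other extreme is taken over the equilibria only, which is why the ratios are never below 1.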

Examples

Prisoner's dilemma

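Consider the 2x2 game called prisoner's dilemma, given by the following cost matrix (rows are the first player's actions, columns the second player's; each cell lists the costs of the first and second player):

	Cooperate	Defect
Cooperate	1, 1	7, 0
Defect	0, 7	5, 5

and let the cost function be

$$C(s_1,s_2)=u_1(s_1,s_2)+u_2(s_1,s_2).$$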

Now, the worst (and only) Nash equilibrium is the outcome in which both players defect, and the resulting cost is $C_{equil}=5+5=10$. However, the lowest social cost (that is, the highest social welfare) occurs when both cooperate, in which case the cost is $C_{min}=1+1=2$. Thus the PoA of this game is $C_{equil}/C_{min}=10/2=5$.

Since the game has a unique Nash equilibrium, the PoS is equal to the PoA and it is 5 too.
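The computation above can be checked by brute force. The sketch below enumerates the four pure profiles of the cost matrix, keeps those from which no player can strictly lower their own cost by deviating alone, and evaluates both ratios. The helper names are illustrative only.

```python
from itertools import product

ACTIONS = ("Cooperate", "Defect")

# COST[(row_action, col_action)] = (row player's cost, column player's cost)
COST = {
    ("Cooperate", "Cooperate"): (1, 1),
    ("Cooperate", "Defect"): (7, 0),
    ("Defect", "Cooperate"): (0, 7),
    ("Defect", "Defect"): (5, 5),
}


def is_pure_nash(profile):
    """True if no player can strictly lower their own cost by deviating alone."""
    for player in (0, 1):
        for alternative in ACTIONS:
            deviation = list(profile)
            deviation[player] = alternative
            if COST[tuple(deviation)][player] < COST[profile][player]:
                return False
    return True


def social_cost(profile):
    """C(s_1, s_2): the sum of both players' costs."""
    return sum(COST[profile])


profiles = list(product(ACTIONS, repeat=2))
equilibria = [p for p in profiles if is_pure_nash(p)]
optimum = min(social_cost(p) for p in profiles)
poa = max(social_cost(p) for p in equilibria) / optimum
pos = min(social_cost(p) for p in equilibria) / optimum
print(equilibria, poa, pos)  # [('Defect', 'Defect')] 5.0 5.0
```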

Job scheduling

A more natural example is the one of job scheduling. There are $N$ players and each of them has a job to run. They can choose one of $M$ machines to run the job. The Price of Anarchy compares the situation where the selection of machines is guided/directed centrally to the situation where each player chooses the machine that will make its job run fastest.

Each machine has a speed $s_1,\ldots,s_M>0$. Each job has a weight $w_1,\ldots,w_N>0$. A player picks a machine to run his or her job on. So, the strategies of each player are $A_i=\{1,2,\ldots,M\}$. Define the load on machine $j$ to be:

$$L_j(a)=\frac{\sum_{i:a_i=j}w_i}{s_j}.$$

The cost for player $i$ is $c_i(a)=L_{a_i}(a)$, i.e., the load of the machine they chose. We consider the egalitarian cost function $\operatorname{MS}(a)=\max_jL_j(a)$, here called the makespan.

We consider two concepts of equilibrium: pure Nash and mixed Nash. It should be clear that mixed PoA ≥ pure PoA, because any pure Nash equilibrium is also a mixed Nash equilibrium (this inequality can be strict: e.g. when $N=2$, $w_1=w_2=1$, $M=2$, and $s_1=s_2=1$, the mixed strategies $\sigma_1=\sigma_2=(1/2,1/2)$ achieve an average makespan of 1.5, while any pure-strategy PoA in this setting is $\leq4/3$). First we need to argue that there exist pure Nash equilibria.
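The two-job, two-machine instance mentioned above is small enough to enumerate. The sketch below computes loads and makespans as defined earlier, finds the pure Nash equilibria by checking unilateral moves, and evaluates the expected makespan when both players mix uniformly. It does not verify that the uniform profile is itself an equilibrium, and machines are indexed from 0 purely for convenience.

```python
from itertools import product


def loads(profile, weights, speeds):
    """L_j(a): total weight assigned to machine j divided by its speed."""
    total = [0.0] * len(speeds)
    for job, machine in enumerate(profile):
        total[machine] += weights[job]
    return [t / s for t, s in zip(total, speeds)]


def makespan(profile, weights, speeds):
    """MS(a): the largest machine load."""
    return max(loads(profile, weights, speeds))


def is_pure_nash(profile, weights, speeds):
    """True if no job can strictly reduce its own load by switching machine."""
    current = loads(profile, weights, speeds)
    for job, machine in enumerate(profile):
        for other in range(len(speeds)):
            moved = profile[:job] + (other,) + profile[job + 1:]
            if loads(moved, weights, speeds)[other] < current[machine]:
                return False
    return True


weights, speeds = [1.0, 1.0], [1.0, 1.0]
profiles = list(product(range(len(speeds)), repeat=len(weights)))

optimum = min(makespan(a, weights, speeds) for a in profiles)
pure_worst = max(makespan(a, weights, speeds)
                 for a in profiles if is_pure_nash(a, weights, speeds))

# Both players mix uniformly, so each of the 4 pure profiles has probability 1/4.
expected_mixed = sum(makespan(a, weights, speeds) for a in profiles) / len(profiles)

print(pure_worst / optimum, expected_mixed / optimum)  # 1.0 1.5
```

In this instance the only pure equilibria place the two jobs on different machines, so the pure ratio is 1, while the uniform mixed profile puts both jobs on the same machine half of the time.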

Claim. For each job scheduling game, there exists at least one pure-strategy Nash equilibrium.

Proof. We would like to take a socially optimal action profile $a^*$. This would mean simply an action profile whose makespan is minimum. However, this will not be enough. There may be several such action profiles leading to a variety of different load distributions (all having the same maximum load). Among these, we further restrict ourselves to one that has a minimum second-largest load. Again, this results in a set of possible load distributions, and we repeat until the $M$-th largest (i.e., smallest) load, where there can only be one distribution of loads (unique up to permutation). This is also called the lexicographically smallest sorted load vector.

We claim that this is a pure-strategy Nash equilibrium. Reasoning by contradiction, suppose that some player $i$ could strictly improve by moving from machine $j$ to machine $k$. This means that the increased load of machine $k$ after the move is still smaller than the load of machine $j$ before the move. As the load of machine $j$ must decrease as a result of the move and no other machine is affected, this means that the new configuration is guaranteed to have reduced the $j$-th largest (or higher ranked) load in the distribution. This, however, violates the assumed lexicographic minimality of $a^*$. Q.E.D.
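The construction in the proof can be tried on a small instance. The sketch below picks, by exhaustive search, a profile whose sorted load vector is lexicographically smallest and then checks that no job can improve by moving. The weights and speeds are hypothetical values chosen only for illustration, and exhaustive search is only practical for tiny instances.

```python
from itertools import product


def loads(profile, weights, speeds):
    """L_j(a): total weight assigned to machine j divided by its speed."""
    total = [0.0] * len(speeds)
    for job, machine in enumerate(profile):
        total[machine] += weights[job]
    return [t / s for t, s in zip(total, speeds)]


def sorted_loads(profile, weights, speeds):
    """Load vector sorted from largest to smallest, as used in the proof."""
    return sorted(loads(profile, weights, speeds), reverse=True)


def is_pure_nash(profile, weights, speeds):
    """True if no job can strictly reduce its own load by switching machine."""
    current = loads(profile, weights, speeds)
    for job, machine in enumerate(profile):
        for other in range(len(speeds)):
            moved = profile[:job] + (other,) + profile[job + 1:]
            if loads(moved, weights, speeds)[other] < current[machine]:
                return False
    return True


# Hypothetical instance: 4 jobs and 3 machines (machines indexed from 0).
weights = [3.0, 2.0, 2.0, 1.0]
speeds = [2.0, 1.0, 1.0]

profiles = product(range(len(speeds)), repeat=len(weights))
# Python compares lists lexicographically, so min() returns a profile whose
# sorted load vector is lexicographically smallest.
a_star = min(profiles, key=lambda a: sorted_loads(a, weights, speeds))

print(a_star, sorted_loads(a_star, weights, speeds))
print(is_pure_nash(a_star, weights, speeds))
```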

Claim. For each job scheduling game, the pure PoA is at most $M$.

Proof. It is easy to upper-bound the welfare obtained at any mixed-strategy Nash equilibrium $\sigma$ by

$$w(\sigma)\leq\frac{\sum_iw_i}{\max_js_j}.$$

Consider, for clarity of exposition, any pure-strategy action profile $a$: clearly

$$w(a)\geq\frac{\sum_iw_i}{\sum_js_j}\geq\frac{\sum_iw_i}{M\cdot\max_js_j}.$$

Since the above holds for the social optimum as well, comparing the bounds on $w(\sigma)$ and $w(a)$ proves the claim. Q.E.D.

Selfish Routing

Braess's paradox

See main article: Braess's paradox.

Consider a road network as shown in the adjacent diagram on which 4000 drivers wish to travel from point Start to End. The travel time in minutes on the Start–A road is the number of travelers (T) divided by 100, and on Start–B is a constant 45 minutes (likewise with the roads across from them). If the dashed road does not exist (so the traffic network has 4 roads in total), the time needed to drive the Start–A–End route with $a$ drivers would be $\tfrac{a}{100}+45$. The time needed to drive the Start–B–End route with $b$ drivers would be $\tfrac{b}{100}+45$. As there are 4000 drivers, the fact that $a+b=4000$ can be used to derive the fact that $a=b=2000$ when the system is at equilibrium. Therefore, each route takes $\tfrac{2000}{100}+45=65$ minutes. If either route took less time, it would not be a Nash equilibrium: a rational driver would switch from the longer route to the shorter route.

Now suppose the dashed line A–B is a road with an extremely short travel time of approximately 0 minutes. Suppose that the road is opened and one driver tries Start–A–B–End. To his surprise he finds that his time is $\tfrac{2000}{100}+\tfrac{2001}{100}=40.01$ minutes, a saving of almost 25 minutes. Soon, more of the 4000 drivers are trying this new route. The time taken rises from 40.01 and keeps climbing. When the number of drivers trying the new route reaches 2500, with 1500 still in the Start–B–End route, their time will be $\tfrac{2500}{100}+\tfrac{4000}{100}=65$ minutes, which is no improvement over the original route. Meanwhile, those 1500 drivers have been slowed to $45+\tfrac{4000}{100}=85$ minutes, a 20-minute increase. They are obliged to switch to the new route via A too, so it now takes $\tfrac{4000}{100}+\tfrac{4000}{100}=80$ minutes. Nobody has any incentive to travel A–End or Start–B because any driver trying them will take 85 minutes. Thus, the opening of the cross route triggers an irreversible change to it by everyone, costing everyone 80 minutes instead of the original 65. If every driver were to agree not to use the A–B path, or if that route were closed, every driver would benefit by a 15-minute reduction in travel time.
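The travel times quoted above follow from the two road types alone, which the short sketch below reproduces. It assumes the A–B road takes exactly 0 minutes, and the function and route names are illustrative.

```python
DRIVERS = 4000
CONSTANT_ROAD = 45  # minutes on Start-B and on A-End
SHORTCUT = 0        # minutes on A-B, assumed to be exactly 0


def variable_road(travelers):
    """Minutes on Start-A or B-End when `travelers` drivers use that road."""
    return travelers / 100


def route_times(via_a_end, via_b_end, via_shortcut):
    """Time of each route, given how many drivers use each of the three routes."""
    on_start_a = via_a_end + via_shortcut  # drivers on the Start-A road
    on_b_end = via_b_end + via_shortcut    # drivers on the B-End road
    return {
        "Start-A-End": variable_road(on_start_a) + CONSTANT_ROAD,
        "Start-B-End": CONSTANT_ROAD + variable_road(on_b_end),
        "Start-A-B-End": variable_road(on_start_a) + SHORTCUT + variable_road(on_b_end),
    }


before = route_times(2000, 2000, 0)     # equilibrium without the A-B road
partial = route_times(0, 1500, 2500)    # 2500 drivers already on the new route
after = route_times(0, 0, DRIVERS)      # equilibrium once the A-B road is open

print(before["Start-A-End"], before["Start-B-End"])      # 65.0 65.0
print(partial["Start-A-B-End"], partial["Start-B-End"])  # 65.0 85.0
print(after["Start-A-B-End"], after["Start-A-End"])      # 80.0 85.0
```

Comparing the two equilibria, every driver's travel time grows from 65 to 80 minutes, a ratio of 80/65 ≈ 1.23.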

Generalized routing problem

The routing problem introduced in Braess's paradox can be generalized to many different flows traversing the same graph at the same time.

Definition (Generalized flow). Let $G=(V,E)$ and $L$ be as defined above, and suppose that we want to route the quantities $R=\{r_1,r_2,\ldots,r_k \mid r_i>0\}$ through each distinct pair of nodes in $\Gamma=\{(s_1,t_1),(s_2,t_2),\ldots,(s_k,t_k)\}\subseteq(V\times V)$. A flow $f_{\Gamma,R}$ is defined as an assignment $p\mapsto\mathbb{R}$ of a real, nonnegative number to each path $p$ going from $s_i$ to $t_i$ (for $(s_i,t_i)\in\Gamma$), with the constraint that

$$\sum_{p:\,s_i\to t_i}f_p=r_i\quad\forall(s_i,t_i)\in\Gamma.$$

The flow traversing a specific edge $e$ of $G$ is defined as

$$f_{e,\Gamma,R}=\sum_{p:\,e\in p}f_p.$$

For succinctness, we write $f_e$ when $\Gamma,R$ are clear from context.

Definition (Nash-equilibrium flow). A flow $f_{\Gamma,R}$ is a Nash-equilibrium flow iff $\forall(s_i,t_i)\in\Gamma$ and $\forall p,q$ from $s_i$ to $t_i$

$$f_p>0\Rightarrow\sum_{e\in p}l_e(f_e)\leq\sum_{e\in q}l_e(f_e).$$

This definition is closely related to what we said about the support of mixed-strategy Nash equilibria in normal-form games.

Definition (Conditional welfare of a flow). Let $f_{\Gamma,R}$ and $f^*_{\Gamma,R}$ be two flows in $G$ associated with the same sets $\Gamma$ and $R$. In what follows, we will drop the subscript to make the notation clearer. Fix the latencies induced by $f$ on the graph: the conditional welfare of $f^*$ with respect to $f$ is defined as

$$w^f(f^*)=\sum_{e\in E}f^*_e\cdot l_e(f_e).$$

Fact 1. Given a Nash-equilibrium flow $f$ and any other flow $f^*$, $w(f)=w^f(f)\leq w^f(f^*)$.

Proof (By contradiction). Assume that $w^f(f^*)<w^f(f)$. By definition,

$$\sum_{i=1}^{k}\sum_{p:\,s_i\to t_i}f^*_p\cdot\sum_{e\in p}l_e(f_e)<\sum_{i=1}^{k}\sum_{p:\,s_i\to t_i}f_p\cdot\sum_{e\in p}l_e(f_e).$$

Since $f$ and $f^*$ are associated with the same sets $\Gamma,R$, we know that

$$\sum_{p:\,s_i\to t_i}f_p=\sum_{p:\,s_i\to t_i}f^*_p=r_i\quad\forall i.$$

Therefore, there must be a pair $(s_i,t_i)$ and two paths $p,q$ from $s_i$ to $t_i$ such that $f^*_p>f_p$, $f^*_q<f_q$, and

$$\sum_{e\in p}l_e(f_e)<\sum_{e\in q}l_e(f_e).$$

In other words, the flow $f^*$ can achieve a lower welfare than $f$ only if there are two paths from $s_i$ to $t_i$ having different costs, and if $f^*$ reroutes some flow of $f$ from the higher-cost path to the lower-cost path. This situation is clearly incompatible with the assumption that $f$ is a Nash-equilibrium flow. Q.E.D.

Note that Fact 1 does not assume any particular structure on the set $L$.

Fact 2. Given any two real numbers $x$ and $y$, $x\cdot y\leq x^2+y^2/4$.

Proof. This is another way to express the true inequality $(x-y/2)^2\geq0$. Q.E.D.

Theorem. The pure PoA of any generalized routing problem $(G,L)$ with linear latencies is $\leq4/3$.

Proof. Note that this theorem is equivalent to saying that for each Nash-equilibrium flow $f$, $w(f)\leq(4/3)\cdot\min_{f^*}\{w(f^*)\}$, where $f^*$ is any other flow. Writing the linear latencies as $l_e(f_e)=a_e\cdot f_e+b_e$, by definition

$$w^f(f^*)=\sum_ef^*_e(a_e\cdot f_e+b_e)=\sum_e(a_ef_ef^*_e)+\sum_ef^*_eb_e.$$

By using Fact 2 (with $x=f^*_e$ and $y=f_e$), we have that

$$w^f(f^*)\leq\sum_e\left(a_e\cdot\left((f^*_e)^2+(f_e)^2/4\right)\right)+\sum_ef^*_e\cdot b_e=\left(\sum_ea_e(f^*_e)^2+f^*_eb_e\right)+\sum_ea_e(f_e)^2/4\leq w(f^*)+\frac{w(f)}{4},$$

since

$$(1/4)\cdot w(f)=(1/4)\cdot\sum_ef_e(a_ef_e+b_e)=(1/4)\cdot\sum_ea_e(f_e)^2+\underbrace{(1/4)\cdot\sum_ef_eb_e}_{\geq0}.$$

We can conclude that $w^f(f^*)\leq w(f^*)+w(f)/4$, and prove the thesis using Fact 1: $w(f)\leq w^f(f^*)\leq w(f^*)+w(f)/4$, hence $w(f)\leq(4/3)\cdot w(f^*)$. Q.E.D.
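A quick numerical check suggests the 4/3 bound can be attained. The sketch below uses a two-link network that is not part of the text above (commonly known as Pigou's example): one unit of flow chooses between a link with latency $l(x)=x$ and a link with constant latency 1. The grid search and the function names are assumptions made for illustration.

```python
def total_cost(x):
    """w(f) when a fraction x of the unit flow uses the link with latency
    l(x) = x and the remaining 1 - x uses the link with constant latency 1."""
    return x * x + (1 - x) * 1


def is_equilibrium(x, tol=1e-12):
    """Nash-equilibrium flow: no used link is strictly slower than the other."""
    latency_variable, latency_constant = x, 1.0
    if x > 0 and latency_variable > latency_constant + tol:
        return False
    if x < 1 and latency_constant > latency_variable + tol:
        return False
    return True


grid = [i / 1000 for i in range(1001)]
equilibria = [x for x in grid if is_equilibrium(x)]
optimum = min(total_cost(x) for x in grid)
worst_equilibrium = max(total_cost(x) for x in equilibria)

print(equilibria, optimum, worst_equilibrium / optimum)
```

On this grid the only equilibrium sends all flow through the variable link at cost 1, while splitting the flow evenly costs 0.75, so the ratio is 4/3.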

Note that in the proof we have made extensive use of the assumption that the functions in $L$ are linear. Actually, a more general fact holds.

Theorem. Given a generalized routing problem with graph $G$ and polynomial latency functions of degree $d$ with nonnegative coefficients, the pure PoA is $\leq d+1$.

Note that the PoA can grow with $d$. Consider the example shown in the following figure, where we assume unit flow: the Nash-equilibrium flows have social welfare 1; however, a much better welfare is achieved, for instance, when $x=1-1/\sqrt{d+1}$, in which case

$$w=\left(1-\frac{1}{\sqrt{d+1}}\right)^d\cdot\left(1-\frac{1}{\sqrt{d+1}}\right)+1\cdot\frac{1}{\sqrt{d+1}}=\left(\left(1-\frac{1}{\sqrt{d+1}}\right)^{\sqrt{d+1}}\right)^{\sqrt{d+1}}+\frac{1}{\sqrt{d+1}}\leq e^{-\sqrt{d+1}}+\frac{1}{\sqrt{d+1}}.$$

This quantity tends to zero when $d$ tends to infinity.
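The decay of this quantity can be tabulated numerically. The sketch below assumes, reading off the formula for $w$, that a flow $x$ uses a link with latency $x^d$ and the remaining $1-x$ uses a link with constant latency 1; the figure itself is not reproduced here, so this network is an assumption.

```python
import math


def welfare(x, d):
    """Total cost when flow x uses a link with latency x**d and the remaining
    1 - x uses a link with constant latency 1 (assumed network)."""
    return x ** d * x + 1 * (1 - x)


for d in (1, 3, 15, 99, 999):
    root = math.sqrt(d + 1)
    x = 1 - 1 / root                      # the split used in the text
    bound = math.exp(-root) + 1 / root    # the upper bound derived in the text
    w = welfare(x, d)
    # Columns: degree, cost of the split, upper bound, equilibrium cost / split cost
    print(d, round(w, 4), round(bound, 4), round(welfare(1.0, d) / w, 2))
```

The last column is the ratio between the equilibrium cost (all flow on the first link, cost 1) and the cost of the split, which grows as $d$ increases.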

Further results

PoA upper bounds can be obtained if the game is shown to satisfy a so-called smoothness inequality. More precisely, a cost-minimization game is (λ,μ)-smooth (with λ ≥ 0 and μ < 1) if the inequality

$$\sum_{i=1}^{n}C_i\left(a_i^*,a_{-i}\right)\leq\lambda C\left(a^*\right)+\mu C(a)$$

holds for any pair of outcomes a and a*. In this case, the PoA is upper bounded by λ/(1 − μ).^[4]
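A sketch of why smoothness yields this bound, under assumptions that are not spelled out above: the social cost is the sum of the players' costs, $a$ is a pure Nash equilibrium, and $a^*$ is an outcome of minimum cost.

```latex
\begin{align*}
C(a) &= \sum_{i=1}^{n} C_i(a_i, a_{-i})
      && \text{(assumed: social cost is the sum of the players' costs)} \\
     &\le \sum_{i=1}^{n} C_i(a_i^*, a_{-i})
      && \text{(equilibrium: no player gains by deviating to } a_i^* \text{)} \\
     &\le \lambda\, C(a^*) + \mu\, C(a)
      && \text{(} (\lambda,\mu) \text{-smoothness)}
\end{align*}
% Rearranging, using \mu < 1:
\[
  (1-\mu)\, C(a) \le \lambda\, C(a^*)
  \quad\Longrightarrow\quad
  \frac{C(a)}{C(a^*)} \le \frac{\lambda}{1-\mu}.
\]
```

Since the chain holds for every pure Nash equilibrium $a$, it also bounds the worst one, which is the PoA.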

For cost-sharing games with concave cost functions, the optimal cost-sharing rule that optimizes the price of anarchy, followed by the price of stability, is precisely the Shapley value cost-sharing rule.^[5] (A symmetrical statement is similarly valid for utility-sharing games with convex utility functions.) In mechanism design, this means that the Shapley value solution concept is optimal for these sets of games.

Moreover, for these (finite) games it was proven that every equilibrium which achieves the PoA bound is fragile, in the sense that the agents demonstrate a state of indifference between their equilibrium action and the action they would pursue in a system-optimal outcome.^[6]

References

Tim Roughgarden and Eva Tardos, "Introduction to the Inefficiency of Equilibria". Chapter 17 in .
Tim Roughgarden. Selfish Routing and the Price of Anarchy. 2005. ISBN 0-262-18243-2.

Notes and References

[1] Elias Koutsoupias, Christos Papadimitriou. Worst-case Equilibria. Computer Science Review, 3(2):65–69, May 2009. doi:10.1016/j.cosrev.2009.04.003. Archived at https://web.archive.org/web/20160313023635/http://www.cs.berkeley.edu/~christos/nash.ps on 2016-03-13. Retrieved 2010-09-12.
[2] M. Goemans, V. Mirrokni, A. Vetta. Sink equilibria and convergence. FOCS 05.
[3] P. Dubey. Inefficiency of Nash equilibria. Math. Operat. Res., 11(1):1–8, 1986.
[4] Tim Roughgarden. Intrinsic Robustness of the Price of Anarchy. Journal of the ACM, 62(5):1–42, 2015-11-02. doi:10.1145/2806883. ISSN 0004-5411.
[5] Matthew Phillips, Jason R. Marden. Design Tradeoffs in Concave Cost-Sharing Games. IEEE Transactions on Automatic Control, 63(7):2242–2247, July 2018. doi:10.1109/tac.2017.2765299. S2CID 45923961. ISSN 0018-9286.
[6] Joshua H. Seaton, Philip N. Brown. On the Intrinsic Fragility of the Price of Anarchy. IEEE Control Systems Letters, 7:3573–3578, 2023. doi:10.1109/LCSYS.2023.3335315. ISSN 2475-1456.
