Shapley value explained

The Shapley value is a solution concept in cooperative game theory. It was named in honor of Lloyd Shapley, who introduced it in 1951 and won the Nobel Memorial Prize in Economic Sciences for it in 2012.^[1] ^[2] To each cooperative game it assigns a unique distribution (among the players) of a total surplus generated by the coalition of all players. The Shapley value is characterized by a collection of desirable properties. Hart (1989) provides a survey of the subject.^[3] ^[4]

Formal definition

Formally, a coalitional game is defined as follows. There is a set N of n players and a function v that maps subsets of players to the real numbers:

v\colon 2^N\to\mathbb{R}, \qquad v(\emptyset)=0

where ∅ denotes the empty set. The function v is called a characteristic function.

The function v has the following meaning: if S is a coalition of players, then v(S), called the worth of coalition S, describes the total expected sum of payoffs the members of S can obtain by cooperation.

The Shapley value is one way to distribute the total gains to the players, assuming that they all collaborate. It is a "fair" distribution in the sense that it is the only distribution with certain desirable properties listed below. According to the Shapley value,^[5] the amount that player i is given in a coalitional game

(v,N) is

\varphi_i(v)=\sum_{S\subseteq N\setminus\{i\}}\frac{|S|!\;(n-|S|-1)!}{n!}(v(S\cup\{i\})-v(S))

=\frac{1}{n}\sum_{S\subseteq N\setminus\{i\}}\binom{n-1}{|S|}^{-1}(v(S\cup\{i\})-v(S))

where n is the total number of players and the sum extends over all subsets S of N not containing player i, including the empty set. Also note that

\binom{n}{k}

is the binomial coefficient. The formula can be interpreted as follows: imagine the coalition being formed one actor at a time, with each actor demanding their contribution

v(S\cup\{i\})-v(S)

as a fair compensation, and then for each actor take the average of this contribution over the possible different permutations in which the coalition can be formed.
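
The subset-weighted formula can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `shapley_value`, the player labels, and the example game (a coalition is worth the square of its size) are all hypothetical choices made for the sketch. Exact fractions are used so that the result is not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def shapley_value(players, v, i):
    """Shapley value of player i via the subset-weighted formula."""
    n = len(players)
    others = [p for p in players if p != i]
    total = Fraction(0)
    for size in range(n):  # |S| ranges over 0 .. n-1
        # Weight |S|! (n - |S| - 1)! / n! shared by all subsets of this size.
        weight = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
        for subset in combinations(others, size):
            s = frozenset(subset)
            total += weight * (v(s | {i}) - v(s))
    return total


# Hypothetical 3-player game: a coalition is worth the square of its size.
players = ["a", "b", "c"]
v = lambda s: len(s) ** 2
values = {p: shapley_value(players, v, p) for p in players}
print(values)  # every player receives 3, and the values sum to v(N) = 9
```

The loop visits all 2^(n-1) subsets that exclude player i, so this direct approach is only practical for small games.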

An alternative equivalent formula for the Shapley value is:

\varphi_i(v)=\frac{1}{n!}\sum_R\left[v(P_i^R\cup\left\{i\right\})-v(P_i^R)\right]

where the sum ranges over all n! orders R of the players and P_i^R is the set of players in N which precede i in the order R. Finally, it can also be expressed as

\varphi_i(v)=\frac{1}{n}\sum_{S\subseteq N\setminus\{i\}}\binom{n-1}{|S|}^{-1}(v(S\cup\{i\})-v(S))

which can be interpreted as

\varphi_i(v)=\frac{1}{\text{number of players}}\sum_{\text{coalitions excluding }i}\frac{\text{marginal contribution of }i\text{ to coalition}}{\text{number of coalitions excluding }i\text{ of this size}}
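
The equivalence of the order-based and binomial forms can be checked numerically on a small game. The sketch below is illustrative only; the function names, the player weights, and the game itself (worth equal to the squared sum of member weights) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb, factorial


def shapley_by_orders(players, v, i):
    """Average marginal contribution of i over all n! orders R."""
    total = Fraction(0)
    for order in permutations(players):
        preceding = frozenset(order[: order.index(i)])  # the set P_i^R
        total += v(preceding | {i}) - v(preceding)
    return total / factorial(len(players))


def shapley_by_binomial(players, v, i):
    """(1/n) * sum over S of the marginal contribution divided by C(n-1, |S|)."""
    n = len(players)
    others = [p for p in players if p != i]
    total = Fraction(0)
    for size in range(n):
        for subset in combinations(others, size):
            s = frozenset(subset)
            total += Fraction(v(s | {i}) - v(s), comb(n - 1, size))
    return total / n


# Hypothetical game: worth is the squared sum of the members' weights.
weights = {"a": 1, "b": 2, "c": 3}
players = list(weights)
v = lambda s: sum(weights[p] for p in s) ** 2

by_orders = {p: shapley_by_orders(players, v, p) for p in players}
by_binomial = {p: shapley_by_binomial(players, v, p) for p in players}
print(by_orders)  # a: 6, b: 12, c: 18
```

Both functions return the same values here, as the two formulas are algebraically equivalent.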

In terms of synergy

From the characteristic function v one can compute the synergy that each group of players provides. The synergy is the unique function

w\colon 2^N\to\mathbb{R}

such that

v(S)=\sum_{R\subseteq S}w(R)

for any subset S ⊆ N of players. In other words, the 'total value' of the coalition S comes from summing up the synergies of each possible subset of S.

Given a characteristic function v, the synergy function w is calculated via

w(S)=\sum_{R\subseteq S}(-1)^{|S|-|R|}v(R)

using the inclusion–exclusion principle. In other words, the synergy of coalition S is the part of the value v(S) which is not already accounted for by its subsets.

The Shapley values are given in terms of the synergy function by^[6] ^[7]

\varphi_i(v)=\sum_{i\in S\subseteq N}\frac{w(S)}{|S|}

where the sum is over all subsets S of N that include player i.

This can be interpreted as

\varphi_i(v)=\sum_{\text{coalitions including }i}\frac{\text{synergy of the coalition}}{\text{number of members in the coalition}}

In other words, the synergy of each coalition is divided equally between all members.
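
A minimal Python sketch of the synergy decomposition follows. The helper names (`subsets`, `synergy`, `shapley_from_synergy`) and the example game (worth equal to the squared coalition size) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def synergy(players, v):
    """w(S) = sum over R subset of S of (-1)^(|S|-|R|) v(R)."""
    return {
        s: sum((-1) ** (len(s) - len(r)) * v(r) for r in subsets(s))
        for s in subsets(players)
    }


def shapley_from_synergy(players, w):
    """Each coalition's synergy is split equally among its members."""
    return {
        i: sum(Fraction(w[s], len(s)) for s in w if i in s)
        for i in players
    }


# Hypothetical 3-player game: a coalition is worth the square of its size.
players = ["a", "b", "c"]
v = lambda s: len(s) ** 2
w = synergy(players, v)
phi = shapley_from_synergy(players, w)
print(phi)  # each player receives 3
```

In this game every single player has synergy 1, every pair has synergy 2, and the grand coalition has synergy 0, so each player receives 1 + 2/2 + 2/2 = 3.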

Examples

Business example

Consider a simplified description of a business. An owner, o, provides crucial capital in the sense that, without him/her, no gains can be obtained. There are m workers w₁,...,w_m, each of whom contributes an amount p to the total profit. Let

N=\{o,w_1,\ldots,w_m\}.

The value function for this coalitional game is

v(S)=\begin{cases}(|S|-1)p&\text{if }o\in S,\\0&\text{otherwise}.\end{cases}

Computing the Shapley value for this coalition game leads to a value of mp/2 for the owner and p/2 for each one of the m workers.

This can be understood from the perspective of synergy. The synergy function w is

w(S)=\begin{cases}p,&\text{if }S=\{o,w_i\}\\0,&\text{otherwise}\end{cases}

so the only coalitions that generate synergy are one-to-one between the owner and any individual worker.

Using the above formula for the Shapley value in terms of w we compute

\varphi_{w_i}=\frac{w(\{o,w_i\})}{2}=\frac{p}{2}

and

\varphi_o=\sum_{i=1}^m\frac{w(\{o,w_i\})}{2}=\frac{mp}{2}

The result can also be understood from the perspective of averaging over all orders. A given worker joins the coalition after the owner (and therefore contributes p) in half of the orders and thus makes an average contribution of p/2 upon joining. When the owner joins, on average half the workers have already joined, so the owner's average contribution upon joining is mp/2.
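
The business example can be checked by brute force over all orders. The sketch below assumes concrete values m = 4 and p = 10, which are hypothetical and picked only to keep the enumeration small; the helper name `shapley` is likewise an illustrative choice.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def shapley(players, v):
    """Shapley values of all players, averaging over every order."""
    phi = {q: Fraction(0) for q in players}
    for order in permutations(players):
        seen = frozenset()
        for q in order:
            phi[q] += Fraction(v(seen | {q}) - v(seen), factorial(len(players)))
            seen = seen | {q}
    return phi


m, p = 4, 10  # assumed number of workers and profit per worker
players = ["o"] + [f"w{k}" for k in range(1, m + 1)]


def v(s):
    # Without the owner nothing is earned; otherwise each worker adds p.
    return (len(s) - 1) * p if "o" in s else 0


phi = shapley(players, v)
print(phi["o"], phi["w1"])  # 20 and 5, i.e. mp/2 and p/2
```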

Glove game

The glove game is a coalitional game where the players have left- and right-hand gloves and the goal is to form pairs. Let

N=\{1,2,3\},

where players 1 and 2 have right-hand gloves and player 3 has a left-hand glove.

The value function for this coalitional game is

v(S)=\begin{cases}1&\text{if }S\in\left\{\{1,3\},\{2,3\},\{1,2,3\}\right\};\\0&\text{otherwise}.\end{cases}

The formula for calculating the Shapley value is

\varphi_i(v)=\frac{1}{|N|!}\sum_R\left[v(P_i^R\cup\left\{i\right\})-v(P_i^R)\right],

where R is an ordering of the players and P_i^R is the set of players in N which precede i in the order R.

The following table displays the marginal contributions of Player 1.

\begin{array}{|c|r|}
\hline
\text{Order }R & MC_1 \\
\hline
{1,2,3} & v(\{1\})-v(\varnothing)=0-0=0 \\
{1,3,2} & v(\{1\})-v(\varnothing)=0-0=0 \\
{2,1,3} & v(\{1,2\})-v(\{2\})=0-0=0 \\
{2,3,1} & v(\{1,2,3\})-v(\{2,3\})=1-1=0 \\
{3,1,2} & v(\{1,3\})-v(\{3\})=1-0=1 \\
{3,2,1} & v(\{1,3,2\})-v(\{3,2\})=1-1=0 \\
\hline
\end{array}

Observe

\varphi_1(v)=\left(\frac{1}{6}\right)(1)=\frac{1}{6}.

By a symmetry argument it can be shown that

\varphi_2(v)=\varphi_1(v)=\frac{1}{6}.

Due to the efficiency axiom, the sum of all the Shapley values is equal to 1, which means that

\varphi_3(v)=\frac{4}{6}=\frac{2}{3}.
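
The glove game is small enough to enumerate all six orders directly. The following sketch recomputes the marginal contributions of every player in every order and the resulting Shapley values; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def v(coalition):
    """Glove game: worth 1 if player 3's left glove meets a right glove."""
    s = set(coalition)
    return 1 if 3 in s and (1 in s or 2 in s) else 0


players = (1, 2, 3)
orders = list(permutations(players))

# Marginal contribution of each player p in each order R:
# worth of (predecessors plus p) minus worth of the predecessors alone.
marginal = {
    order: {
        p: v(order[: order.index(p) + 1]) - v(order[: order.index(p)])
        for p in players
    }
    for order in orders
}

phi = {
    p: Fraction(sum(marginal[r][p] for r in orders), len(orders))
    for p in players
}
print(phi)  # player 1: 1/6, player 2: 1/6, player 3: 2/3
```

Player 3, who holds the only left-hand glove, contributes 1 in four of the six orders, which gives the value 4/6 = 2/3 without appealing to the efficiency axiom.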

Properties

The Shapley value has many desirable properties. Notably, it is the only payment rule satisfying the four properties of Efficiency, Symmetry, Linearity and Null player.^[8] Further characterizations of the Shapley value exist in the literature.

Efficiency

The sum of the Shapley values of all agents equals the value of the grand coalition, so that all the gain is distributed among the agents:

\sum_{i\in N}\varphi_i(v)=v(N)

Proof:

\sum_{i\in N}\varphi_i(v)=\frac{1}{|N|!}\sum_R\sum_{i\in N}\left[v(P_i^R\cup\left\{i\right\})-v(P_i^R)\right]=\frac{1}{|N|!}\sum_Rv(N)=\frac{1}{|N|!}|N|!\cdot v(N)=v(N)

since

\sum_{i\in N}\left[v(P_i^R\cup\left\{i\right\})-v(P_i^R)\right]

is a telescoping sum and there are |N|! different orderings R.
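
As a concrete instance of the telescoping step, take the glove game from the Examples section and the order R = (3, 1, 2). The choice of this particular order is only illustrative; every order collapses to v(N) in the same way.

```latex
\left[v(\{3\})-v(\varnothing)\right]
+\left[v(\{3,1\})-v(\{3\})\right]
+\left[v(\{3,1,2\})-v(\{3,1\})\right]
= v(\{1,2,3\})-v(\varnothing) = 1-0 = 1 = v(N)
```

Each intermediate worth appears once with a plus sign and once with a minus sign, so only the worth of the grand coalition and of the empty set remain.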

Symmetry

If i and j are two actors who are equivalent in the sense that

v(S\cup\{i\})=v(S\cup\{j\})

for every subset S of N which contains neither i nor j, then

\varphi_i(v)=\varphi_j(v)

This property is also called equal treatment of equals.

Linearity

If two coalition games described by gain functions v and w are combined, then the distributed gains should correspond to the gains derived from v and the gains derived from w:

\varphi_i(v+w)=\varphi_i(v)+\varphi_i(w)

for every i in N. Also, for any real number a,

\varphi_i(av)=a\varphi_i(v)

for every i in N.
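
Linearity can be spot-checked numerically. The sketch below combines the glove game from the Examples section with a second, hypothetical game (worth equal to the squared coalition size) and an arbitrarily chosen scalar; the helper name `shapley` is an illustrative choice. A check on one pair of games does not prove the property, it only illustrates it.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def shapley(players, v):
    """Shapley values of all players, averaging over every order."""
    phi = {p: Fraction(0) for p in players}
    for order in permutations(players):
        seen = frozenset()
        for p in order:
            phi[p] += Fraction(v(seen | {p}) - v(seen), factorial(len(players)))
            seen = seen | {p}
    return phi


players = (1, 2, 3)
v = lambda s: 1 if 3 in s and (1 in s or 2 in s) else 0  # glove game
w = lambda s: len(s) ** 2                                 # hypothetical second game
a = Fraction(5, 2)                                        # arbitrary scalar

phi_v, phi_w = shapley(players, v), shapley(players, w)
phi_sum = shapley(players, lambda s: v(s) + w(s))
phi_scaled = shapley(players, lambda s: a * v(s))

additive = all(phi_sum[i] == phi_v[i] + phi_w[i] for i in players)
homogeneous = all(phi_scaled[i] == a * phi_v[i] for i in players)
print(additive, homogeneous)
```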

Null player

The Shapley value φ_i(v) of a null player i in a game v is zero. A player i is null in v if

v(S\cup\{i\})=v(S)

for all coalitions S that do not contain i.
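
A small sketch of the null player property follows. The four-player game is hypothetical: player "d" is constructed to add nothing to any coalition, while the worth otherwise equals the squared number of remaining members. The helper name `shapley` is an illustrative choice.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def shapley(players, v):
    """Shapley values of all players, averaging over every order."""
    phi = {p: Fraction(0) for p in players}
    for order in permutations(players):
        seen = frozenset()
        for p in order:
            phi[p] += Fraction(v(seen | {p}) - v(seen), factorial(len(players)))
            seen = seen | {p}
    return phi


players = ("a", "b", "c", "d")
# Player "d" is ignored by the worth function, so v(S ∪ {d}) = v(S) for all S.
v = lambda s: len(s - {"d"}) ** 2

phi = shapley(players, v)
print(phi)  # "d" receives 0; "a", "b" and "c" receive 3 each
```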

Stand-alone test

If v is a subadditive set function, i.e.,

v(S\sqcup T)\leq v(S)+v(T)

, then for each agent i:

\varphi_i(v)\leq v(\{i\})

Similarly, if v is a superadditive set function, i.e.,

v(S\sqcup T)\geq v(S)+v(T)

, then for each agent i:

\varphi_i(v)\geq v(\{i\})

So, if the cooperation has positive externalities, all agents (weakly) gain, and if it has negative externalities, all agents (weakly) lose.

Anonymity

If i and j are two agents, and w is a gain function that is identical to v except that the roles of i and j have been exchanged, then

\varphi_i(v)=\varphi_j(w).

This means that the labeling of the agents doesn't play a role in the assignment of their gains.

Marginalism

The Shapley value can be defined as a function which uses only the marginal contributions of player i as the arguments.

Aumann–Shapley value

In their 1974 book, Lloyd Shapley and Robert Aumann extended the concept of the Shapley value to infinite games (defined with respect to a non-atomic measure), creating the diagonal formula.^[9] This was later extended by Jean-François Mertens and Abraham Neyman.

As seen above, the value of an n-person game associates to each player the expectation of his contribution to the worth of the coalition of players before him in a random ordering of all the players. When there are many players and each individual plays only a minor role, the set of all players preceding a given one is heuristically thought of as a good sample of the players, so that the value of a given infinitesimal player ds is "his" contribution to the worth of a "perfect" sample of the population of all players.

Symbolically, let v be the coalitional worth function associating a worth to each coalition c, a measurable subset of a measurable set I that can be thought of as

I=[0,1]

without loss of generality. Then

(Sv)(ds)=\int_0^1(v(tI+ds)-v(tI))\,dt,

where

(Sv)(ds)

denotes the Shapley value of the infinitesimal player ds in the game, tI is a perfect sample of the all-player set I containing a proportion t of all the players, and

tI+ds

is the coalition obtained after ds joins tI. This is the heuristic form of the diagonal formula.

Assuming some regularity of the worth function, for example assuming v can be represented as a differentiable function f of a non-atomic measure μ on I,

v(c)=f(\mu(c)),

where μ has density function φ,

\mu(c)=\int 1_c(u)\varphi(u)\,du

(with 1_c the characteristic function of c). Under such conditions

\mu(tI)=t\mu(I),

as can be shown by approximating the density by a step function and keeping the proportion t for each level of the density function, and

v(tI+ds)=f(t\mu(I))+f'(t\mu(I))\mu(ds).

The diagonal formula has then the form developed by Aumann and Shapley (1974)

(Sv)(ds)=\int_0^1f'_{t\mu(I)}(\mu(ds))\,dt

Above, μ can be vector valued (as long as the function is defined and differentiable on the range of μ, the above formula makes sense).

In the argument above, if the measure contains atoms, then

\mu(tI)=t\mu(I)

is no longer true. This is why the diagonal formula mostly applies to non-atomic games.

Two approaches were deployed to extend this diagonal formula when the function f is no longer differentiable. Mertens goes back to the original formula and takes the derivative after the integral, thereby benefiting from the smoothing effect. Neyman took a different approach. Going back to an elementary application of Mertens's approach from Mertens (1980):^[10]

(Sv)(ds)=\lim_{\varepsilon\to0,\,\varepsilon>0}\frac{1}{\varepsilon}\int_0^{1-\varepsilon}(f(t+\varepsilon\mu(ds))-f(t))\,dt

This works for example for majority games, while the original diagonal formula cannot be used directly. Mertens further extends this by identifying symmetries that the Shapley value should be invariant under, and averaging over such symmetries to create a further smoothing effect, commuting averages with the derivative operation as above.^[11] A survey of non-atomic values is found in Neyman (2002).^[12]
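
As a small worked illustration of both formulas, consider a scalar measure normalized so that μ(I) = 1. The two choices of f below are hypothetical examples picked for this sketch and are not taken from the cited works.

```latex
% Smooth case: f(x) = x^2, so f'(x) = 2x, and the diagonal formula gives
(Sv)(ds) = \int_0^1 f'(t)\,\mu(ds)\,dt = \mu(ds)\int_0^1 2t\,dt = \mu(ds).

% Majority game: f(x) = 1 if x \ge 1/2 and f(x) = 0 otherwise,
% which is not differentiable at x = 1/2.
% For small \varepsilon > 0, f(t+\varepsilon\mu(ds)) - f(t) = 1 exactly when
% 1/2 - \varepsilon\mu(ds) \le t < 1/2, an interval of length \varepsilon\mu(ds),
% and the difference is 0 elsewhere, so
(Sv)(ds) = \lim_{\varepsilon\to0,\,\varepsilon>0}
           \frac{1}{\varepsilon}\,\varepsilon\,\mu(ds) = \mu(ds).
```

In both cases each infinitesimal player receives a share proportional to its measure, and the shares integrate to v(I) = f(1) = 1.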

Generalization to coalitions

The Shapley value only assigns values to the individual agents. It has been generalized^[13] to apply to a group of agents C as,

\varphi_C(v)=\sum_{T\subseteq N\setminus C}\frac{(n-|T|-|C|)!\;|T|!}{(n-|C|+1)!}\sum_{S\subseteq C}(-1)^{|C|-|S|}v(S\cup T).

In terms of the synergy function w above, this reads^[6] ^[7]

\varphi_C(v)=\sum_{C\subseteq T\subseteq N}\frac{w(T)}{|T|-|C|+1}

where the sum goes over all subsets T of N that contain C.

This formula suggests the interpretation that the Shapley value of a coalition is to be thought of as the standard Shapley value of a single player, if the coalition C is treated like a single player.
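
The two expressions for the coalition value can be compared numerically. The sketch below evaluates both on the glove game from the Examples section; the function names and the choice of coalition C = {1, 3} are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def subsets(items):
    """All subsets of items, as frozensets."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def glove(s):
    return 1 if 3 in s and (1 in s or 2 in s) else 0


def coalition_value_direct(players, v, c):
    """Double sum over T (outside C) and S (inside C)."""
    n, k = len(players), len(c)
    total = Fraction(0)
    for t in subsets(players - c):
        weight = Fraction(factorial(n - len(t) - k) * factorial(len(t)),
                          factorial(n - k + 1))
        inner = sum((-1) ** (k - len(s)) * v(s | t) for s in subsets(c))
        total += weight * inner
    return total


def coalition_value_synergy(players, v, c):
    """Sum of w(T) / (|T| - |C| + 1) over all T that contain C."""
    total = Fraction(0)
    for t in subsets(players):
        if c <= t:
            w_t = sum((-1) ** (len(t) - len(r)) * v(r) for r in subsets(t))
            total += Fraction(w_t, len(t) - len(c) + 1)
    return total


players = frozenset({1, 2, 3})
pair = frozenset({1, 3})
direct = coalition_value_direct(players, glove, pair)
via_synergy = coalition_value_synergy(players, glove, pair)
print(direct, via_synergy)  # both 1/2
```

For a single-player coalition such as C = {3}, both functions reduce to the ordinary Shapley value, 2/3 in this game.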

Value of a player to another player

The Shapley value φ_i(v) was decomposed by Hausken and Mohr^[14] into a matrix of values

\varphi_{ij}(v)=\sum_{S\subseteq N}\frac{(|S|-1)!\;(n-|S|)!}{n!}(v(S)-v(S\setminus\{i\})-v(S\setminus\{j\})+v(S\setminus\{i,j\}))\sum_{t=|S|}^n\frac{1}{t}

Each value φ_ij(v) represents the value of player i to player j. This matrix satisfies

\varphi_i(v)=\sum_{j\in N}\varphi_{ij}(v)

i.e. the value of player i to the whole game is the sum of their value to all individual players.

In terms of the synergy w defined above, this reads

\varphi_{ij}(v)=\sum_{\{i,j\}\subseteq S\subseteq N}\frac{w(S)}{|S|^2}

where the sum goes over all subsets S of N that contain i and j.

This can be interpreted as a sum over all subsets that contain players i and j, where for each subset S you take the synergy w(S) of that subset, divide it by the number of players in the subset, |S| (interpret that as the surplus value player i gains from this coalition), and further divide this by |S| to get the part of player i's value that is attributed to player j.

In other words, the synergy value of each coalition is evenly divided among all |S|² pairs (i,j) of players in that coalition, where i generates surplus for j.
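
The synergy form of the matrix is straightforward to evaluate. The sketch below builds the matrix for the glove game from the Examples section and checks that each row sums to the corresponding Shapley value; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """All subsets of items, as frozensets."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def glove(s):
    return 1 if 3 in s and (1 in s or 2 in s) else 0


players = (1, 2, 3)

# Synergy w(S) of every coalition, by inclusion-exclusion.
w = {
    s: sum((-1) ** (len(s) - len(r)) * glove(r) for r in subsets(s))
    for s in subsets(players)
}

# phi_ij: each coalition containing both i and j contributes w(S) / |S|^2.
phi_matrix = {
    (i, j): sum(Fraction(w[s], len(s) ** 2) for s in w if i in s and j in s)
    for i in players
    for j in players
}
row_sums = {i: sum(phi_matrix[i, j] for j in players) for i in players}
print(row_sums)  # 1/6, 1/6 and 2/3, the Shapley values of the glove game
```

Note that entries can be negative: here the value of player 1 to player 2 is -1/9, reflecting that the two right-hand glove holders compete for the single left-hand glove.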

Shapley value regression

Shapley value regression is a statistical method used to measure the contribution of individual predictors in a regression model. In this context, the "players" are the individual predictors or variables in the model, and the "gain" is the total explained variance or predictive power of the model. This method ensures a fair distribution of the total gain among the predictors, attributing each predictor a value representing its contribution to the model's performance. Lipovetsky (2006) discussed the use of Shapley value in regression analysis, providing a comprehensive overview of its theoretical underpinnings and practical applications.^[15]

Shapley value contributions are recognized for their balance of stability and discriminating power, which make them suitable for accurately measuring the importance of service attributes in market research.^[16] Several studies have applied Shapley value regression to key drivers analysis in marketing research. Pokryshevskaya and Antipov (2012) utilized this method to analyze online customers' repeat purchase intentions, demonstrating its effectiveness in understanding consumer behavior.^[17] Similarly, Antipov and Pokryshevskaya (2014) applied Shapley value regression to explain differences in recommendation rates for hotels in South Cyprus, highlighting its utility in the hospitality industry.^[18] Further validation of the benefits of Shapley value in key-driver analysis is provided by Vriens, Vidden, and Bosch (2021), who underscored its advantages in applied marketing analytics.^[19]

In machine learning

The Shapley value provides a principled way to explain the predictions of nonlinear models common in the field of machine learning. By interpreting a model trained on a set of features as a value function on a coalition of players, Shapley values provide a natural way to compute which features contribute to a prediction^[20] or contribute to the uncertainty of a prediction.^[21] This unifies several other methods including Local Interpretable Model-Agnostic Explanations (LIME),^[22] DeepLIFT,^[23] and Layer-Wise Relevance Propagation.^[24] ^[25]
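
The idea of treating features as players can be sketched on a toy model. Everything in the example below is an assumption made for illustration: the model, the instance, the all-zero baseline, and in particular the value function, which replaces absent features by their baseline values. This is one simple convention and is not claimed to be the method used by any of the cited works or libraries.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def model(x):
    """Hypothetical nonlinear model: one additive term and one interaction."""
    return 2 * x[0] + x[1] * x[2]


def feature_attributions(f, instance, baseline):
    """Exact Shapley attribution of f(instance) - f(baseline) to the features."""
    n = len(instance)

    def v(coalition):
        # Features in the coalition take the instance's value,
        # all other features take the baseline's value (assumed convention).
        mixed = [instance[k] if k in coalition else baseline[k] for k in range(n)]
        return f(mixed) - f(baseline)

    phi = [Fraction(0)] * n
    for order in permutations(range(n)):
        seen = frozenset()
        for k in order:
            phi[k] += Fraction(v(seen | {k}) - v(seen), factorial(n))
            seen = seen | {k}
    return phi


instance, baseline = [1, 2, 3], [0, 0, 0]
phi = feature_attributions(model, instance, baseline)
print(phi)  # attributions 2, 3 and 3
```

The additive term is credited entirely to the first feature, while the interaction term of size 6 is split equally between the two features that produce it. By efficiency, the attributions sum to the difference between the prediction and the baseline output. Exact enumeration grows factorially with the number of features, so practical methods rely on approximations.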

Notes and References

[1] Shapley, Lloyd S. (August 21, 1951). "Notes on the n-Person Game -- II: The Value of an n-Person Game". Santa Monica, Calif.: RAND Corporation. RM-670.
[2] Roth, Alvin E., ed. (1988). The Shapley Value: Essays in Honor of Lloyd S. Shapley. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511528446. ISBN 0-521-36177-X.
[3] Hart, Sergiu (1989). "Shapley Value". In Eatwell, J.; Milgate, M.; Newman, P. (eds.). The New Palgrave: Game Theory. Norton. pp. 210–216. doi:10.1007/978-1-349-20181-5_25. ISBN 978-0-333-49537-7.
[4] Hart, Sergiu (May 12, 2016). "A Bibliography of Cooperative Games: Value Theory".
[5] For a proof of unique existence, see Ichiishi, Tatsuro (1983). Game Theory for Economic Analysis. New York: Academic Press. pp. 118–120. ISBN 0-12-370180-5.
[6] Grabisch, Michel (October 1997). "Alternative Representations of Discrete Fuzzy Measures for Decision Making". International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems. 5 (5): 587–607. doi:10.1142/S0218488597000440. ISSN 0218-4885.
[7] Grabisch, Michel (1 December 1997). "k-order additive discrete fuzzy measures and their representation". Fuzzy Sets and Systems. 92 (2): 167–189. doi:10.1016/S0165-0114(97)00168-1. ISSN 0165-0114.
[8] Shapley, Lloyd S. (1953). "A Value for n-person Games". In Kuhn, H. W.; Tucker, A. W. (eds.). Contributions to the Theory of Games. Annals of Mathematical Studies. Vol. 28. Princeton University Press. pp. 307–317. doi:10.1515/9781400881970-018. ISBN 9781400881970.
[9] Aumann, Robert J.; Shapley, Lloyd S. (1974). Values of Non-Atomic Games. Princeton: Princeton Univ. Press. ISBN 0-691-08103-4.
[10] Mertens, Jean-François (1980). "Values and Derivatives". 5 (4): 523–552. doi:10.1287/moor.5.4.523. JSTOR 3689325.
[11] Mertens, Jean-François (1988). "The Shapley Value in the Non Differentiable Case". International Journal of Game Theory. 17 (1): 1–65. doi:10.1007/BF01240834. S2CID 118017018.
[12] Neyman, A. (2002). "Value of Games with Infinitely Many Players". In Aumann, R. J.; Hart, S. (eds.). Handbook of Game Theory with Economic Applications. Vol. 3 (1st ed.). Elsevier. http://ratio.huji.ac.il/dp/neyman/values.pdf
[13] Grabisch, Michel; Roubens, Marc (1999). "An axiomatic approach to the concept of interaction among players in cooperative games". International Journal of Game Theory. 28 (4): 547–565. doi:10.1007/s001820050125. S2CID 18033890.
[14] Hausken, Kjell; Mohr, Matthias (2001). "The Value of a Player in n-Person Games". Social Choice and Welfare. 18 (3): 465–83. doi:10.1007/s003550000070. JSTOR 41060209. S2CID 27089088.
[15] Lipovetsky, S. (2006). "Shapley value regression: A method for explaining the contributions of individual predictors to a regression model". Linear Algebra and its Applications. 417: 48–54. doi:10.1016/j.laa.2006.04.027.
[16] Pokryshevskaya, E.; Antipov, E. (2014). "A comparison of methods used to measure the importance of service attributes". International Journal of Market Research. 56 (3): 283–296. doi:10.2501/IJMR-2014-023.
[17] Pokryshevskaya, E. B.; Antipov, E. A. (2012). "The strategic analysis of online customers' repeat purchase intentions". Journal of Targeting, Measurement and Analysis for Marketing. 20: 203–211. doi:10.1057/jt.2012.13.
[18] Antipov, E. A.; Pokryshevskaya, E. B. (2014). "Explaining differences in recommendation rates: the case of South Cyprus hotels". Economics Bulletin. 34 (4): 2368–2376.
[19] Vriens, M.; Vidden, C.; Bosch, N. (2021). "The benefits of Shapley-value in key-driver analysis". Applied Marketing Analytics. 6 (3): 269–278.
[20] Lundberg, Scott M.; Lee, Su-In (2017). "A Unified Approach to Interpreting Model Predictions". Advances in Neural Information Processing Systems. 30: 4765–4774. arXiv:1705.07874. Retrieved 2021-01-30.
[21] Watson, David; O'Hara, Joshua; Tax, Niek; Mudd, Richard; Guy, Ido (2023). "Explaining Predictive Uncertainty with Information Theoretic Shapley". Advances in Neural Information Processing Systems. 37. arXiv:2306.05724. Retrieved 2023-12-19.
[22] Ribeiro, Marco Tulio; Singh, Sameer; Guestrin, Carlos (2016-08-13). ""Why Should I Trust You?"". Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM. pp. 1135–1144. doi:10.1145/2939672.2939778. ISBN 978-1-4503-4232-2.
[23] Shrikumar, Avanti; Greenside, Peyton; Kundaje, Anshul (2017-07-17). "Learning Important Features Through Propagating Activation Differences". PMLR: 3145–3153. ISSN 2640-3498. Retrieved 2021-01-30.
[24] Bach, Sebastian; Binder, Alexander; Montavon, Grégoire; Klauschen, Frederick; Müller, Klaus-Robert; Samek, Wojciech (2015-07-10). Suarez, Oscar Deniz (ed.). "On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation". PLOS ONE. Public Library of Science (PLoS). 10 (7): e0130140. Bibcode:2015PLoSO..1030140B. doi:10.1371/journal.pone.0130140. ISSN 1932-6203. PMC 4498753. PMID 26161953.
[25] Antipov, E. A.; Pokryshevskaya, E. B. (2020). "Interpretable machine learning for demand modeling with high-dimensional data using Gradient Boosting Machines and Shapley values". Journal of Revenue and Pricing Management. 19: 355–364.
