Coupling (probability) explained

In probability theory, coupling is a proof technique that allows one to compare two unrelated random variables (distributions) X and Y by creating a random vector W whose marginal distributions correspond to X and Y respectively. The choice of W is generally not unique, and the whole idea of "coupling" is about making such a choice so that X and Y can be related in a particularly desirable way.

Definition

Using the standard formalism of probability theory, let X₁ and X₂ be two random variables defined on probability spaces (Ω₁, F₁, P₁) and (Ω₂, F₂, P₂). Then a coupling of X₁ and X₂ is a new probability space (Ω, F, P) over which there are two random variables Y₁ and Y₂ such that Y₁ has the same distribution as X₁ while Y₂ has the same distribution as X₂.

An interesting case is when Y₁ and Y₂ are not independent.
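To make the definition concrete, here is a minimal Python sketch (not taken from the source) for two Bernoulli variables. The parameters p and q, the function names, and the choice of exact fractions are all illustrative assumptions. It builds two different joint distributions for a pair (Y₁, Y₂), one independent and one dependent, and checks that both have the required marginals, so both are couplings.

```python
from fractions import Fraction


def independent_coupling(p, q):
    """Joint pmf of (Y1, Y2): Y1 ~ Bernoulli(p), Y2 ~ Bernoulli(q), independent."""
    return {
        (a, b): (p if a else 1 - p) * (q if b else 1 - q)
        for a in (0, 1)
        for b in (0, 1)
    }


def monotone_coupling(p, q):
    """Joint pmf with the same marginals, but with Y1 <= Y2 always (needs p <= q)."""
    if not p <= q:
        raise ValueError("this construction assumes p <= q")
    return {(1, 1): p, (0, 1): q - p, (0, 0): 1 - q, (1, 0): 0 * p}


def marginals(joint):
    """Return (Pr(Y1 = 1), Pr(Y2 = 1)) for a joint pmf on {0, 1} x {0, 1}."""
    m1 = sum(w for (a, _), w in joint.items() if a == 1)
    m2 = sum(w for (_, b), w in joint.items() if b == 1)
    return m1, m2


# Illustrative parameter values (assumed, not from the source).
p, q = Fraction(3, 10), Fraction(7, 10)
for name, joint in [("independent", independent_coupling(p, q)),
                    ("monotone", monotone_coupling(p, q))]:
    print(name, marginals(joint), "Pr(Y1 > Y2) =", joint[(1, 0)])
```

Both joint distributions have the same marginals, yet only the second one guarantees Y₁ ≤ Y₂, which illustrates why the choice of coupling matters.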

Examples

Random walk

Assume two particles A and B perform a simple random walk in two dimensions, but they start from different points. The simplest way to couple them is simply to force them to walk together. On every step, if A walks up, so does B; if A moves to the left, so does B; and so on. Thus, the difference between the two particles' positions stays fixed. As far as A is concerned, it is doing a perfect random walk, while B is the copycat. B holds the opposite view, i.e. that it is, in effect, the original and that A is the copy. And in a sense they are both right. In other words, any theorem or result that holds for a regular random walk will also hold for both A and B.

Consider now a more elaborate example. Assume that A starts from the point (0,0) and B from (10,10). First couple them so that they walk together in the vertical direction, i.e. if A goes up, so does B, etc., but are mirror images in the horizontal direction, i.e. if A goes left, B goes right and vice versa. We continue this coupling until A and B have the same horizontal coordinate, or in other words are both on the vertical line x = 5. If that never happens, we continue this process forever (the probability of that is zero, though). After this event, we change the coupling rule: we let them walk together in the horizontal direction, but as mirror images in the vertical direction. We continue this rule until they meet in the vertical direction too (if they do), and from that point on, we just let them walk together.

This is a coupling in the sense that neither particle, taken on its own, can "feel" anything we did: neither the fact that the other particle follows it in one way or the other, nor the fact that we changed the coupling rule, nor when we changed it. Each particle performs a simple random walk. And yet, our coupling rule forces them to meet almost surely and to continue together permanently from that point on. This allows one to prove many interesting results of the form "in the long run, it does not matter where you started".
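The staged rule described above can be sketched in Python as follows. This is an illustrative simulation, not part of the source: the function name `coupled_walk`, the step cap `max_steps`, and the seed are assumptions, and the sketch assumes both coordinate differences between the start points are even so that mirrored steps can close the gap exactly. Because the meeting time can be very long, the simulation stops after `max_steps` steps whether or not the particles have met.

```python
import random

# The four unit steps of a simple random walk in two dimensions.
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def coupled_walk(start_a, start_b, max_steps, seed=0):
    """Simulate the staged mirror coupling; return both paths and the meeting time.

    meet_time is the first time the two particles coincide, or None if they
    have not met within max_steps steps.
    """
    rng = random.Random(seed)
    a, b = start_a, start_b
    path_a, path_b = [a], [b]
    meet_time = 0 if a == b else None
    for t in range(1, max_steps + 1):
        dx, dy = rng.choice(STEPS)      # A takes an ordinary random-walk step
        if a[0] != b[0]:
            bdx, bdy = -dx, dy          # stage 1: mirror horizontally, copy vertically
        elif a[1] != b[1]:
            bdx, bdy = dx, -dy          # stage 2: copy horizontally, mirror vertically
        else:
            bdx, bdy = dx, dy           # stage 3: walk together
        a = (a[0] + dx, a[1] + dy)
        b = (b[0] + bdx, b[1] + bdy)
        path_a.append(a)
        path_b.append(b)
        if meet_time is None and a == b:
            meet_time = t
    return path_a, path_b, meet_time


path_a, path_b, meet_time = coupled_walk((0, 0), (10, 10), max_steps=100_000)
print("meeting time:", meet_time)  # None if they did not meet within max_steps
```

In every stage, B's step is obtained from A's step by a one-to-one relabelling of the four directions, and the stage in force depends only on the past, which is why B, viewed alone, still performs a simple random walk.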

Biased coins

Assume two biased coins, the first with probability p of turning up heads and the second with probability q > p of turning up heads. Intuitively, if both coins are tossed the same number of times, we should expect the first coin to turn up fewer heads than the second one. More specifically, for any fixed k, the probability that the first coin produces at least k heads should be no greater than the probability that the second coin produces at least k heads. However, proving such a fact can be difficult with a standard counting argument.^[1] Coupling easily circumvents this problem.

Let X₁, X₂, ..., X_n be indicator variables for heads in a sequence of flips of the first coin. For the second coin, define a new sequence Y₁, Y₂, ..., Y_n such that

if X_i = 1, then Y_i = 1;
if X_i = 0, then Y_i = 1 with probability (q − p)/(1 − p), and Y_i = 0 otherwise.

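Then the sequence of Y_i has exactly the probability distribution of tosses made with the second coin, since Pr(Y_i = 1) = p + (1 − p)(q − p)/(1 − p) = q, provided the extra randomness used for each Y_i is independent across tosses. However, because Y_i depends on X_i and Y_i ≥ X_i always, a toss-by-toss comparison of the two coins is now possible. That is, for any k ≤ n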

\Pr(X_1 + \cdots + X_n \geq k) \leq \Pr(Y_1 + \cdots + Y_n \geq k).
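The construction can be checked numerically with a short simulation. The following Python sketch is illustrative only: the function name `coupled_flips`, the parameter values, and the seed are assumptions rather than anything given in the source. It draws the first coin's flips, derives the second coin's flips by the rule above, and reports the empirical head frequencies together with whether the second coin shows heads whenever the first one does.

```python
import random


def coupled_flips(p, q, n, seed=0):
    """Return (xs, ys): n coupled flips with Pr(x = 1) = p, Pr(y = 1) = q, x <= y."""
    if not 0 <= p < q <= 1:
        raise ValueError("requires 0 <= p < q <= 1")
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = 1 if rng.random() < p else 0
        if x == 1:
            y = 1  # heads on the first coin forces heads on the second
        else:
            # tails on the first coin: upgrade to heads with prob (q - p) / (1 - p)
            y = 1 if rng.random() < (q - p) / (1 - p) else 0
        xs.append(x)
        ys.append(y)
    return xs, ys


# Illustrative parameter values (assumed, not from the source).
xs, ys = coupled_flips(p=0.3, q=0.7, n=100_000)
print("fraction of heads, coin 1:", sum(xs) / len(xs))
print("fraction of heads, coin 2:", sum(ys) / len(ys))
print("y >= x on every toss:", all(y >= x for x, y in zip(xs, ys)))
```

Since every coupled pair satisfies y ≥ x, the total number of heads for the second coin is never smaller than for the first, and the inequality between the two tail probabilities follows.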

Convergence of Markov Chains to a stationary distribution

Initialize one process X_n outside the stationary distribution and initialize another process Y_n from the stationary distribution. Couple these two processes together as a pair (X_n, Y_n), letting them evolve independently as time runs. Under certain conditions, the two processes will eventually meet, and from that point on they can be run as the same process. Because Y_n has the stationary distribution at every time n, the distribution of X_n can differ from it only on the event that the two have not yet met; if the probability of that event tends to zero, the process started outside the stationary distribution converges to the stationary distribution.
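As a rough illustration of this argument, the following Python sketch runs two copies of a small Markov chain, independently until they first coincide and together afterwards. The 3-state transition matrix `P`, the function names, and the seed are hypothetical choices made for the example; every entry of `P` is positive, which is one simple condition under which the two copies meet with probability one.

```python
import random

# Hypothetical 3-state transition matrix (each row sums to 1, all entries > 0).
P = [
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
]


def stationary(P, iters=1000):
    """Approximate the stationary distribution by iterating pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def step(rng, P, state):
    """Sample the next state from row `state` of P."""
    return rng.choices(range(len(P)), weights=P[state])[0]


def coupled_chains(P, x0, steps, seed=0):
    """Run X from x0 and Y from the stationary law; independent until they meet."""
    rng = random.Random(seed)
    x = x0
    y = rng.choices(range(len(P)), weights=stationary(P))[0]
    xs, ys = [x], [y]
    meet_time = 0 if x == y else None
    for t in range(1, steps + 1):
        if x == y:
            x = y = step(rng, P, x)                  # after meeting, move together
        else:
            x, y = step(rng, P, x), step(rng, P, y)  # before meeting, move independently
        xs.append(x)
        ys.append(y)
        if meet_time is None and x == y:
            meet_time = t
    return xs, ys, meet_time


xs, ys, meet_time = coupled_chains(P, x0=0, steps=50)
print("meeting time:", meet_time)  # None if the chains did not meet within `steps`
```

Each chain, viewed alone, still follows the transition matrix `P`, so the pair is a coupling; estimating how often the chains remain apart at time n over many runs gives an empirical handle on how far X_n can be from stationarity.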

References

Lindvall, T. (1992). Lectures on the Coupling Method. New York: Wiley. ISBN 0-471-54025-0.
Thorisson, H. (2000). Coupling, Stationarity, and Regeneration. New York: Springer.

Notes and References

[1] Dubhashi, Devdatt; Panconesi, Alessandro (June 15, 2009). Concentration of Measure for the Analysis of Randomized Algorithms (1st ed.). Cambridge University Press. p. 91. ISBN 978-0-521-88427-3.