In mathematics, a random walk, sometimes known as a drunkard's walk, is a stochastic process that describes a path that consists of a succession of random steps on some mathematical space.
An elementary example of a random walk is the random walk on the integer number line $\mathbb{Z}$, which starts at 0 and at each step moves +1 or −1 with equal probability.
Realizations of random walks can be obtained by Monte Carlo simulation.[2]
A popular random walk model is that of a random walk on a regular lattice, where at each step the location jumps to another site according to some probability distribution. In a simple random walk, the location can only jump to neighboring sites of the lattice, forming a lattice path. In a simple symmetric random walk on a locally finite lattice, the probabilities of the location jumping to each one of its immediate neighbors are the same. The best-studied example is the random walk on the d-dimensional integer lattice (sometimes called the hypercubic lattice) $\mathbb{Z}^d$.
If the state space is limited to finite dimensions, the random walk model is called a simple bordered symmetric random walk, and the transition probabilities depend on the location of the state because on margin and corner states the movement is limited.[4]
An elementary example of a random walk is the random walk on the integer number line, $\mathbb{Z}$, which starts at 0 and at each step moves +1 or −1 with equal probability.
This walk can be illustrated as follows. A marker is placed at zero on the number line, and a fair coin is flipped. If it lands on heads, the marker is moved one unit to the right. If it lands on tails, the marker is moved one unit to the left. After five flips, the marker could be on −5, −3, −1, 1, 3, or 5. With five flips, three heads and two tails, in any order, it will land on 1. There are 10 ways of landing on 1 (by flipping three heads and two tails), 10 ways of landing on −1 (by flipping three tails and two heads), 5 ways of landing on 3 (by flipping four heads and one tail), 5 ways of landing on −3 (by flipping four tails and one head), 1 way of landing on 5 (by flipping five heads), and 1 way of landing on −5 (by flipping five tails). See the figure below for an illustration of the possible outcomes of 5 flips.
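To define this walk formally, take independent random variables $Z_1, Z_2, \dots$, where each variable is either 1 or −1, with a 50% probability for either value, and set $S_0 = 0$ and $S_n = \sum_{j=1}^{n} Z_j$. The series $\{S_n\}$ is called the simple random walk on $\mathbb{Z}$. This series (the sum of the sequence of −1s and 1s) gives the net distance walked, if each part of the walk is of length one. The expectation $E(S_n)$ of $S_n$ is zero. This follows from the linearity of expectation:

$$E(S_n) = \sum_{j=1}^{n} E(Z_j) = 0.$$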
A similar calculation, using the independence of the random variables and the fact that $E(Z_n^2) = 1$, shows that

$$E(S_n^2) = \sum_{i=1}^{n} E(Z_i^2) + 2 \sum_{1 \le i < j \le n} E(Z_i Z_j) = n.$$
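This hints that $E(|S_n|)$, the expected translation distance after $n$ steps, should be of the order of $\sqrt{n}$. In fact, $\lim_{n\to\infty} E(|S_n|)/\sqrt{n} = \sqrt{2/\pi}$.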
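The square-root growth of the expected translation distance can be checked exactly for moderate $n$. The following is a minimal sketch that sums over the binomial law of the number of +1 steps; the function name `walk_moments` is an illustrative choice, and the comparison constant $\sqrt{2/\pi}$ is the standard large-$n$ limit of $E(|S_n|)/\sqrt{n}$.

```python
from math import comb, pi, sqrt


def walk_moments(n):
    """Return (E[S_n^2], E[|S_n|]) for the simple symmetric walk on the integers.

    The sums run over the number of +1 steps, which is Binomial(n, 1/2).
    """
    total = 2 ** n
    second = 0
    absolute = 0
    for heads in range(n + 1):
        k = 2 * heads - n            # position after `heads` steps of +1
        weight = comb(n, heads)      # number of walks ending at k
        second += weight * k * k
        absolute += weight * abs(k)
    return second / total, absolute / total


n = 1000
second_moment, mean_abs = walk_moments(n)
ratio = mean_abs / sqrt(n)
print(second_moment, ratio, sqrt(2 / pi))
```

The second moment equals $n$, while the ratio $E(|S_n|)/\sqrt{n}$ is already close to $\sqrt{2/\pi}$ at $n = 1000$.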
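As for how many times a random walk will cross a boundary line if permitted to continue walking forever, a simple random walk on $\mathbb{Z}$ will cross every point an infinite number of times. This result has many names: the level-crossing phenomenon, recurrence, or the gambler's ruin.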
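If a and b are positive integers, then the expected number of steps until a one-dimensional simple random walk starting at 0 first hits b or −a is ab. The probability that this walk will hit b before −a is $a/(a+b)$, which can be derived from the fact that the simple random walk is a martingale. These expectations and hitting probabilities can be computed in $O(a+b)$ time for a general one-dimensional random walk Markov chain.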
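One way to obtain both quantities in a number of operations linear in $a+b$ is to write the first-step equations on the interior states and solve the resulting tridiagonal system. The sketch below uses the Thomas algorithm for this; that choice, the function name `solve_exit_problem`, and the parameter `p` (probability of a +1 step) are illustrative assumptions, not a method taken from the text.

```python
from fractions import Fraction


def solve_exit_problem(a, b, p):
    """Walk on the integers started at 0, stepping +1 w.p. p and -1 w.p. 1 - p.

    Returns (probability of hitting b before -a, expected number of steps
    until the walk first hits b or -a). Runs in O(a + b) operations.
    """
    q = 1 - p
    m = a + b - 1                      # interior states -a+1, ..., b-1

    def thomas(rhs):
        # Solve x[j] - q*x[j-1] - p*x[j+1] = rhs[j] with zero boundary terms.
        c = [0] * m
        d = [0] * m
        c[0] = -p
        d[0] = rhs[0]
        for j in range(1, m):
            denom = 1 + q * c[j - 1]
            c[j] = -p / denom
            d[j] = (rhs[j] + q * d[j - 1]) / denom
        x = [0] * m
        x[m - 1] = d[m - 1]
        for j in range(m - 2, -1, -1):
            x[j] = d[j] - c[j] * x[j + 1]
        return x

    hit_rhs = [0] * m
    hit_rhs[m - 1] = p                 # boundary value h(b) = 1 moved to the right-hand side
    time_rhs = [1] * m                 # each step costs one unit of time
    start = a - 1                      # index of state 0 among the interior states
    return thomas(hit_rhs)[start], thomas(time_rhs)[start]


hit_probability, expected_steps = solve_exit_problem(3, 5, Fraction(1, 2))
print(hit_probability, expected_steps)
```

For the symmetric case with $a = 3$ and $b = 5$ this gives $3/8$ and $15$, in agreement with $a/(a+b)$ and $ab$.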
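Some of the results mentioned above can be derived from properties of Pascal's triangle. The number of different walks of $n$ steps where each step is +1 or −1 is $2^n$. For the simple random walk, each of these walks is equally likely. In order for $S_n$ to be equal to a number $k$ it is necessary and sufficient that the number of +1 steps in the walk exceeds the number of −1 steps by $k$. It follows that +1 must appear $(n+k)/2$ times among the $n$ steps of a walk, hence the number of walks which satisfy $S_n = k$ equals the number of ways of choosing $(n+k)/2$ elements from an $n$-element set, denoted $\binom{n}{(n+k)/2}$. For this to have meaning, $n + k$ must be an even number, which implies that $n$ and $k$ are either both even or both odd. Therefore, the probability that $S_n = k$ is equal to $2^{-n}\binom{n}{(n+k)/2}$. By representing entries of Pascal's triangle in terms of factorials and using Stirling's formula, one can obtain good estimates for these probabilities for large values of $n$.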
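The counting argument translates directly into code. This is a small sketch (the helper name `prob_position` is illustrative) that evaluates the probability of ending at $k$ after $n$ steps as an exact fraction and reproduces the five-flip counts quoted earlier.

```python
from fractions import Fraction
from math import comb


def prob_position(n, k):
    """Exact probability that the simple symmetric walk is at k after n steps."""
    if abs(k) > n or (n + k) % 2:
        return Fraction(0)             # unreachable: too far away or wrong parity
    return Fraction(comb(n, (n + k) // 2), 2 ** n)


# Number of 5-step walks ending at each reachable position.
counts = {k: comb(5, (5 + k) // 2) for k in range(-5, 6, 2)}
print(counts)
print(prob_position(5, 1))
```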
If space is confined to
Z
This relation with Pascal's triangle is demonstrated for small values of n. At zero turns, the only possibility will be to remain at zero. However, at one turn, there is one chance of landing on −1 or one chance of landing on 1. At two turns, a marker at 1 could move to 2 or back to zero. A marker at −1, could move to −2 or back to zero. Therefore, there is one chance of landing on −2, two chances of landing on zero, and one chance of landing on 2.
| k | −5 | −4 | −3 | −2 | −1 | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $P[S_0=k]$ | | | | | | 1 | | | | | |
| $2P[S_1=k]$ | | | | | 1 | | 1 | | | | |
| $2^2P[S_2=k]$ | | | | 1 | | 2 | | 1 | | | |
| $2^3P[S_3=k]$ | | | 1 | | 3 | | 3 | | 1 | | |
| $2^4P[S_4=k]$ | | 1 | | 4 | | 6 | | 4 | | 1 | |
| $2^5P[S_5=k]$ | 1 | | 5 | | 10 | | 10 | | 5 | | 1 |
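The central limit theorem and the law of the iterated logarithm describe important aspects of the behavior of simple random walks on $\mathbb{Z}$. In particular, the former entails that as $n$ increases, the probabilities (proportional to the numbers in each row) approach a normal distribution.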
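To be precise, knowing that $P(S_n = k) = 2^{-n}\binom{n}{(n+k)/2}$, and using Stirling's formula, one has

$$\log P(S_n = k) = -\frac{n}{2}\left[\left(1+\frac{k}{n}\right)\log\left(1+\frac{k}{n}\right) + \left(1-\frac{k}{n}\right)\log\left(1-\frac{k}{n}\right)\right] - \frac{1}{2}\log\left(1-\frac{k^2}{n^2}\right) + \log\sqrt{\frac{2}{\pi n}} + o(1).$$

Fixing the scaling $k = \lfloor \sqrt{n}\,x \rfloor$ (taken with the same parity as $n$), for $x$ fixed, and using the expansion $\log(1+k/n) = k/n - k^2/(2n^2) + \dots$ when $k/n$ vanishes, it follows that

$$P\left(\frac{S_n}{\sqrt{n}} = \frac{k}{\sqrt{n}}\right) = \frac{2}{\sqrt{n}} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\,(1 + o(1)).$$

Taking the limit (and observing that $2/\sqrt{n}$ corresponds to the spacing of the scaling grid, since $S_n$ always has the parity of $n$) one finds the Gaussian density $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$. Indeed, for an absolutely continuous random variable $X$ with density $f_X$ it holds that $P(X \in [x, x+dx)) = f_X(x)\,dx$, with $dx$ corresponding to an infinitesimal spacing.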
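The quality of the Gaussian approximation can be checked numerically. The sketch below compares the exact binomial probability with the local approximation $\sqrt{2/(\pi n)}\,e^{-k^2/(2n)}$, which is the form Stirling's formula yields for $k$ of the same parity as $n$ and $k$ small relative to $n$; the function names are illustrative.

```python
from math import comb, exp, pi, sqrt


def exact_prob(n, k):
    """P(S_n = k) for the simple symmetric walk; n + k must be even."""
    return comb(n, (n + k) // 2) / 2 ** n


def local_gaussian(n, k):
    """Gaussian approximation to P(S_n = k) on the grid of reachable sites."""
    return sqrt(2 / (pi * n)) * exp(-k * k / (2 * n))


n, k = 1000, 20
exact = exact_prob(n, k)
approx = local_gaussian(n, k)
relative_error = abs(exact - approx) / exact
print(exact, approx, relative_error)
```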
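A one-dimensional random walk can also be looked at as a Markov chain whose state space is given by the integers $i = 0, \pm 1, \pm 2, \dots$. For some number $p$ satisfying $0 < p < 1$, the transition probabilities (the probability $P_{i,j}$ of moving from state $i$ to state $j$) are given by $P_{i,i+1} = p = 1 - P_{i,i-1}$.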
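Viewed as a Markov chain, the walk's distribution after several steps is obtained by repeatedly applying the transition rule to a probability vector. The following is a minimal sketch, assuming the chain moves from $i$ to $i+1$ with probability $p$ and to $i-1$ otherwise; the function name `evolve` and the sparse dictionary representation are illustrative choices.

```python
from fractions import Fraction


def evolve(p, steps):
    """Distribution of a walk started at 0 after `steps` transitions.

    Each transition moves i -> i + 1 with probability p and i -> i - 1
    with probability 1 - p. Returns a dict mapping state -> probability.
    """
    dist = {0: Fraction(1)}
    for _ in range(steps):
        nxt = {}
        for i, mass in dist.items():
            nxt[i + 1] = nxt.get(i + 1, 0) + p * mass
            nxt[i - 1] = nxt.get(i - 1, 0) + (1 - p) * mass
        dist = nxt
    return dist


p = Fraction(3, 4)
dist = evolve(p, 10)
mean = sum(i * mass for i, mass in dist.items())
print(mean)
```

With $p = 3/4$ the mean position after 10 steps is $10(2p - 1) = 5$, reflecting the drift of the biased chain.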
See main article: Heterogeneous random walk in one dimension. The heterogeneous random walk draws in each time step a random number that determines the local jumping probabilities and then a random number that determines the actual jump direction. The main question is the probability of staying in each of the various sites after $t$ jumps, and the limit of this probability when $t$ is very large.
In higher dimensions, the set of randomly walked points has interesting geometric properties. In fact, one gets a discrete fractal, that is, a set which exhibits stochastic self-similarity on large scales. On small scales, one can observe "jaggedness" resulting from the grid on which the walk is performed. The trajectory of a random walk is the collection of points visited, considered as a set with disregard to when the walk arrived at the point. In one dimension, the trajectory is simply all points between the minimum height and the maximum height the walk achieved (both are, on average, on the order of
$\sqrt{n}$).
To visualize the two-dimensional case, one can imagine a person walking randomly around a city. The city is effectively infinite and arranged in a square grid of sidewalks. At every intersection, the person randomly chooses one of the four possible routes (including the one originally travelled from). Formally, this is a random walk on the set of all points in the plane with integer coordinates.
Whether the person ever gets back to the original starting point of the walk is the 2-dimensional equivalent of the level-crossing problem discussed above. In 1921 George Pólya proved that the person almost surely would in a 2-dimensional random walk, but for 3 dimensions or higher, the probability of returning to the origin decreases as the number of dimensions increases. In 3 dimensions, the probability decreases to roughly 34%.[8] The mathematician Shizuo Kakutani was known to refer to this result with the following quote: "A drunk man will find his way home, but a drunk bird may get lost forever".[9]
The probability of recurrence is in general

$$p = 1 - \left( \frac{1}{(2\pi)^d} \int_{[-\pi,\pi]^d} \frac{\prod_{i=1}^{d} d\theta_i}{1 - \frac{1}{d} \sum_{i=1}^{d} \cos\theta_i} \right)^{-1}.$$
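The figure of roughly 34% in three dimensions can be reproduced by evaluating the integral numerically. The sketch below assumes the integral is normalized by $(2\pi)^d$ over $[-\pi,\pi]^d$, which by symmetry equals a normalization by $\pi^d$ over $[0,\pi]^d$, and uses a plain midpoint rule; the grid size `m` and the function name are illustrative, and the midpoint rule is only a crude treatment of the integrable singularity at the origin.

```python
from math import cos, pi


def return_probability_3d(m=60):
    """Midpoint-rule estimate of the return probability of the simple walk on Z^3."""
    h = pi / m
    # Cosines at the midpoints of m equal cells of [0, pi]; midpoints avoid theta = 0.
    c = [cos((i + 0.5) * h) for i in range(m)]
    total = 0.0
    for cx in c:
        for cy in c:
            s = cx + cy
            for cz in c:
                total += 1.0 / (1.0 - (s + cz) / 3.0)
    # (1 / pi^3) * h^3 * total simplifies to total / m^3.
    expected_visits = total / m ** 3
    return 1.0 - 1.0 / expected_visits


p3 = return_probability_3d()
print(p3)
```

The estimate lands near 0.34; a finer grid or a quadrature rule adapted to the singularity would tighten it.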
Another variation of this question which was also asked by Pólya is: "if two people leave the same starting point, then will they ever meet again?"[12] It can be shown that the difference between their locations (two independent random walks) is also a simple random walk, so they almost surely meet again in a 2-dimensional walk, but for 3 dimensions and higher the probability decreases with the number of dimensions. Paul Erdős and Samuel James Taylor also showed in 1960 that for dimensions less than or equal to 4, two independent random walks starting from any two given points have infinitely many intersections almost surely, but for dimensions 5 and higher, they almost surely intersect only finitely often.[13]
The asymptotic distribution of the distance from the origin for a two-dimensional random walk, as the number of steps increases, is a Rayleigh distribution. The probability density is a function of the radius from the origin, with the step length constant for each step:

$$P(r) = \frac{2r}{N}\, e^{-r^2/N}.$$

Here, the step length is assumed to be 1, $N$ is the total number of steps and $r$ is the radius from the origin.[14]
A Wiener process is a stochastic process with similar behavior to Brownian motion, the physical phenomenon of a minute particle diffusing in a fluid. (Sometimes the Wiener process is called "Brownian motion", although this is strictly speaking a confusion of a model with the phenomenon being modeled.)
A Wiener process is the scaling limit of random walk in dimension 1. This means that if there is a random walk with very small steps, there is an approximation to a Wiener process (and, less accurately, to Brownian motion). To be more precise, if the step size is ε, one needs to take a walk of length $L/\varepsilon^2$ to approximate a Wiener length of L. As the step size tends to 0 (and the number of steps increases proportionally), random walk converges to a Wiener process in an appropriate sense. Formally, if B is the space of all paths of length L with the maximum topology, and if M is the space of measure over B with the norm topology, then the convergence is in the space M. Similarly, a Wiener process in several dimensions is the scaling limit of random walk in the same number of dimensions.
A random walk is a discrete fractal (a function with integer dimensions; 1, 2, ...), but a Wiener process trajectory is a true fractal, and there is a connection between the two. For example, take a random walk until it hits a circle of radius r times the step length. The average number of steps it performs is $r^2$. This fact is the discrete version of the fact that a Wiener process walk is a fractal of Hausdorff dimension 2.
In two dimensions, the average number of points the same random walk has on the boundary of its trajectory is $r^{4/3}$. This corresponds to the fact that the boundary of the trajectory of a Wiener process is a fractal of dimension 4/3, a fact predicted by Mandelbrot using simulations but proved only in 2000 by Lawler, Schramm and Werner.[15]
A Wiener process enjoys many symmetries a random walk does not. For example, a Wiener process walk is invariant to rotations, but the random walk is not, since the underlying grid is not (random walk is invariant to rotations by 90 degrees, but Wiener processes are invariant to rotations by, for example, 17 degrees too). This means that in many cases, problems on a random walk are easier to solve by translating them to a Wiener process, solving the problem there, and then translating back. On the other hand, some problems are easier to solve with random walks due to their discrete nature.
Random walk and Wiener process can be coupled, namely manifested on the same probability space in a dependent way that forces them to be quite close. The simplest such coupling is the Skorokhod embedding, but there exist more precise couplings, such as the Komlós–Major–Tusnády approximation theorem.
The convergence of a random walk toward the Wiener process is controlled by the central limit theorem, and by Donsker's theorem. For a particle in a known fixed position at t = 0, the central limit theorem tells us that after a large number of independent steps in the random walk, the walker's position is distributed according to a normal distribution of total variance:

$$\sigma^2 = \frac{t}{\delta t}\,\varepsilon^2,$$

where $t$ is the time elapsed since the start of the random walk, $\varepsilon$ is the size of a step of the random walk, and $\delta t$ is the time elapsed between two successive steps.
This corresponds to the Green's function of the diffusion equation that controls the Wiener process, which suggests that, after a large number of steps, the random walk converges toward a Wiener process.
In 3D, the variance corresponding to the Green's function of the diffusion equation is:

$$\sigma^2 = 6\,D\,t.$$

By equating this quantity with the variance associated with the position of the random walker, one obtains the equivalent diffusion coefficient to be considered for the asymptotic Wiener process toward which the random walk converges after a large number of steps:

$$D = \frac{\varepsilon^2}{6\,\delta t}$$

(valid only in 3D).
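The two expressions of the variance above correspond to the distribution associated with the vector $\vec{R}$ that links the two ends of the random walk, in 3D. The variance associated with each component $R_x$, $R_y$ or $R_z$ is only one third of this value (still in 3D).

For 2D:[16]

$$D = \frac{\varepsilon^2}{4\,\delta t}.$$

For 1D:[17]

$$D = \frac{\varepsilon^2}{2\,\delta t}.$$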
A random walk having a step size that varies according to a normal distribution is used as a model for real-world time series data such as financial markets.
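Here, the step size is the inverse cumulative normal distribution $\Phi^{-1}(z, \mu, \sigma)$, where $0 \le z \le 1$ is a uniformly distributed random number, and μ and σ are the mean and standard deviation of the normal distribution, respectively.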
If μ is nonzero, the random walk will vary about a linear trend. If $v_s$ is the starting value of the random walk, the expected value after $n$ steps will be $v_s + n\mu$.
For the special case where μ is equal to zero, after $n$ steps, the translation distance's probability distribution is given by $\mathcal{N}(0, n\sigma^2)$, where $\mathcal{N}$ is the notation for the normal distribution, $n$ is the number of steps, and σ is from the inverse cumulative normal distribution as given above.
Proof: The Gaussian random walk can be thought of as the sum of a sequence of independent and identically distributed random variables $X_i$, drawn from the inverse cumulative normal distribution with mean zero and the σ of the original inverse cumulative normal distribution:

$$Z = \sum_{i=1}^{n} X_i.$$

The sum of two independent normally distributed random variables, $Z = X + Y$, is distributed as $\mathcal{N}(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)$. In our case, $\mu_X = \mu_Y = 0$ and $\sigma_X^2 = \sigma_Y^2 = \sigma^2$ yield $\mathcal{N}(0, 2\sigma^2)$. By induction, for $n$ steps we have

$$Z \sim \mathcal{N}(0, n\sigma^2).$$
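For steps distributed according to any distribution with zero mean and finite variance $\sigma^2$ (not necessarily a normal distribution), the root mean square translation distance after $n$ steps is

$$\sqrt{\operatorname{Var}(S_n)} = \sqrt{E[S_n^2]} = \sigma\sqrt{n}.$$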
But for the Gaussian random walk, this is just the standard deviation of the translation distance's distribution after $n$ steps. Hence, if μ is equal to zero, and since the root mean square (RMS) translation distance is one standard deviation, there is a 68.27% probability that the translation distance after $n$ steps will fall between $\pm\sigma\sqrt{n}$. Likewise, there is a 50% probability that the translation distance after $n$ steps will fall between $\pm 0.6745\,\sigma\sqrt{n}$.
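These coverage figures can be illustrated by simulation. The sketch below draws each step by passing a uniform random number through the inverse cumulative normal distribution, using `statistics.NormalDist.inv_cdf` from the Python standard library; the seed, the number of trials, and the function name `gaussian_walk` are arbitrary illustrative choices, and the printed fractions are Monte Carlo estimates rather than exact values.

```python
import random
from math import sqrt
from statistics import NormalDist


def gaussian_walk(n, mu, sigma, rng):
    """Final position of an n-step walk whose steps are inverse-CDF normal draws."""
    dist = NormalDist(mu, sigma)
    position = 0.0
    for _ in range(n):
        z = rng.random()
        while z == 0.0:                # inv_cdf is only defined on the open interval (0, 1)
            z = rng.random()
        position += dist.inv_cdf(z)
    return position


rng = random.Random(12345)
n, sigma, trials = 100, 2.0, 2000
finals = [gaussian_walk(n, 0.0, sigma, rng) for _ in range(trials)]

scale = sigma * sqrt(n)
within_one_sd = sum(abs(x) <= scale for x in finals) / trials
within_half = sum(abs(x) <= 0.6745 * scale for x in finals) / trials
print(within_one_sd, within_half)
```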
The number of distinct sites visited by a single random walker, $S(t)$, has been studied extensively for square and cubic lattices and for fractals.
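The information rate of a Gaussian random walk with respect to the squared error distance, i.e. its quadratic rate distortion function, is given parametrically by[25]

$$R(D_\theta) = \frac{1}{2} \int_0^1 \max\{0, \log_2\left(S(\varphi)/\theta\right)\}\, d\varphi,$$

$$D_\theta = \int_0^1 \min\{S(\varphi), \theta\}\, d\varphi,$$

where $S(\varphi) = \left(2\sin(\pi\varphi/2)\right)^{-2}$. Therefore, it is impossible to encode $\{Z_n\}_{n=1}^{N}$ using a binary code of less than $N R(D_\theta)$ bits and recover it with expected mean squared error less than $D_\theta$. On the other hand, for any $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ large enough and a binary code of no more than $2^{N R(D_\theta)}$ distinct elements such that the expected mean squared error in recovering $\{Z_n\}_{n=1}^{N}$ from this code is at most $D_\theta - \varepsilon$.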
As mentioned the range of natural phenomena which have been subject to attempts at description by some flavour of random walks is considerable, particularly in physics[26] [27] and chemistry,[28] materials science,[29] [30] and biology.[31] [32] [33] The following are some specific applications of random walks:
A number of types of stochastic processes have been considered that are similar to the pure random walks but where the simple structure is allowed to be more generalized. The pure structure can be characterized by the steps being defined by independent and identically distributed random variables. Random walks can take place on a variety of spaces, such as graphs, the integers, the real line, the plane or higher-dimensional vector spaces, on curved surfaces or higher-dimensional Riemannian manifolds, and on groups. It is also possible to define random walks which take their steps at random times, and in that case, the position $X_t$ has to be defined for all times $t \in [0, +\infty)$. Specific cases or limits of random walks include the Lévy flight and diffusion models such as Brownian motion.
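A random walk of length k on a possibly infinite graph G with a root 0 is a stochastic process with random variables $X_1, X_2, \dots, X_k$ such that $X_1 = 0$ and $X_{i+1}$ is a vertex chosen uniformly at random from the neighbors of $X_i$. Then the number $p_{v,w,k}(G)$ is the probability that a random walk of length k starting at v ends at w. In particular, if G is a graph with root 0, $p_{0,0,2k}$ is the probability that a $2k$-step random walk returns to 0.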
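For a finite graph these probabilities can be computed exactly by propagating probability mass along the edges. This is a minimal sketch, assuming the graph is given as an adjacency-list dictionary and that "length k" means k steps; the function name `walk_probability` and the 4-cycle example are illustrative.

```python
from fractions import Fraction


def walk_probability(graph, v, w, k):
    """Probability that a k-step simple random walk started at v ends at w.

    `graph` maps each vertex to the list of its neighbors; at every step the
    walk moves to a uniformly chosen neighbor of its current vertex.
    """
    dist = {v: Fraction(1)}
    for _ in range(k):
        nxt = {}
        for x, mass in dist.items():
            share = mass / len(graph[x])
            for y in graph[x]:
                nxt[y] = nxt.get(y, 0) + share
        dist = nxt
    return dist.get(w, Fraction(0))


# Cycle on four vertices with root 0.
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(walk_probability(cycle4, 0, 0, 2))
```

On the 4-cycle the walk is back at the root after two steps with probability 1/2, and it can never be at the root after an odd number of steps.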
Building on the analogy from the earlier section on higher dimensions, assume now that our city is no longer a perfect square grid. When our person reaches a certain junction, he picks between the variously available roads with equal probability. Thus, if the junction has seven exits the person will go to each one with probability one-seventh. This is a random walk on a graph. Will our person reach his home? It turns out that under rather mild conditions, the answer is still yes, but depending on the graph, the answer to the variant question 'Will two persons meet again?' may not be that they meet infinitely often almost surely.[43]
An example of a case where the person will reach his home almost surely is when the lengths of all the blocks are between a and b (where a and b are any two finite positive numbers). Notice that we do not assume that the graph is planar, i.e. the city may contain tunnels and bridges. One way to prove this result is using the connection to electrical networks. Take a map of the city and place a one ohm resistor on every block. Now measure the "resistance between a point and infinity". In other words, choose some number R and take all the points in the electrical network with distance bigger than R from our point and wire them together. This is now a finite electrical network, and we may measure the resistance from our point to the wired points. Take R to infinity. The limit is called the resistance between a point and infinity. It turns out that the following is true (an elementary proof can be found in the book by Doyle and Snell):
Theorem: a graph is transient if and only if the resistance between a point and infinity is finite. It is not important which point is chosen if the graph is connected.
In other words, in a transient system, one only needs to overcome a finite resistance to get to infinity from any point. In a recurrent system, the resistance from any point to infinity is infinite.
This characterization of transience and recurrence is very useful, and specifically it allows us to analyze the case of a city drawn in the plane with the distances bounded.
A random walk on a graph is a very special case of a Markov chain. Unlike a general Markov chain, random walk on a graph enjoys a property called time symmetry or reversibility. Roughly speaking, this property, also called the principle of detailed balance, means that the probabilities to traverse a given path in one direction or the other have a very simple connection between them (if the graph is regular, they are just equal). This property has important consequences.
Starting in the 1980s, much research has gone into connecting properties of the graph to random walks. In addition to the electrical network connection described above, there are important connections to isoperimetric inequalities, functional inequalities such as Sobolev and Poincaré inequalities, and properties of solutions of Laplace's equation. A significant portion of this research was focused on Cayley graphs of finitely generated groups. In many cases these discrete results carry over to, or are derived from, manifolds and Lie groups.
In the context of random graphs, particularly that of the Erdős–Rényi model, analytical results for some properties of random walkers have been obtained. These include the distribution of first[44] and last hitting times[45] of the walker, where the first hitting time is given by the first time the walker steps into a previously visited site of the graph, and the last hitting time corresponds to the first time the walker cannot perform an additional move without revisiting a previously visited site.
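A good reference for random walk on graphs is the online book by Aldous and Fill. For groups see the book of Woess. If the transition kernel $p(x,y)$ is itself random (based on an environment $\omega$) then the random walk is called a "random walk in random environment". When the law of the random walk includes the randomness of $\omega$, the law is called the annealed law; on the other hand, if $\omega$ is seen as fixed, the law is called a quenched law.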
We can think about choosing every possible edge with the same probability as maximizing uncertainty (entropy) locally. We could also do it globally: in maximal entropy random walk (MERW) we want all paths to be equally probable, or in other words, for every two vertices, each path of given length is equally probable.[46] This random walk has much stronger localization properties.
There are a number of interesting models of random paths in which each step depends on the past in a complicated manner. All are more complex for solving analytically than the usual random walk; still, the behavior of any model of a random walker is obtainable using computers. Examples include:
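The self-avoiding walk of length n on $\mathbb{Z}^d$ is the random n-step path which starts at the origin, makes transitions only between adjacent sites in $\mathbb{Z}^d$, never revisits a site, and is chosen uniformly among all such paths.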
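Because a self-avoiding walk is chosen uniformly among all admissible paths, counting those paths is the natural first computation. The sketch below enumerates them on the square lattice by depth-first search; the function name is illustrative, and the exponential running time limits it to small n.

```python
def count_self_avoiding_walks(n):
    """Number of n-step self-avoiding paths on the square lattice from the origin."""

    def extend(x, y, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if nxt not in visited:     # a site may never be revisited
                visited.add(nxt)
                total += extend(nxt[0], nxt[1], visited, remaining - 1)
                visited.remove(nxt)    # backtrack before trying the next direction
        return total

    return extend(0, 0, {(0, 0)}, n)


saw_counts = [count_self_avoiding_walks(n) for n in range(1, 6)]
print(saw_counts)
```

Sampling a uniform self-avoiding walk of length n then amounts to picking one of these paths with probability one over the corresponding count.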
See main article: Biased random walk on a graph.
See main article: Maximal entropy random walk. A random walk chosen to maximize the entropy rate has much stronger localization properties.
Random walks where the direction of movement at one time is correlated with the direction of movement at the next time. They are used to model animal movements.[53] [54]