Mersenne Twister Explained

The Mersenne Twister is a general-purpose pseudorandom number generator (PRNG) developed in 1997 by Makoto Matsumoto and Takuji Nishimura.[1] [2] Its name derives from the choice of a Mersenne prime as its period length.

The Mersenne Twister was designed specifically to rectify most of the flaws found in older PRNGs.

The most commonly used version of the Mersenne Twister algorithm is based on the Mersenne prime $2^{19937}-1$. The standard implementation of that, MT19937, uses a 32-bit word length. There is another implementation (with five variants[3]) that uses a 64-bit word length, MT19937-64; it generates a different sequence.

k-distribution

A pseudorandom sequence $x_i$ of w-bit integers of period P is said to be k-distributed to v-bit accuracy if the following holds.

Let $\operatorname{trunc}_v(x)$ denote the number formed by the leading v bits of x, and consider P of the k v-bit vectors

$$(\operatorname{trunc}_v(x_i), \operatorname{trunc}_v(x_{i+1}), \ldots, \operatorname{trunc}_v(x_{i+k-1})) \quad (0 \le i < P).$$

Then each of the $2^{kv}$ possible combinations of bits occurs the same number of times in a period, except for the all-zero combination that occurs once less often.
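As a minimal sketch of the definition above, the following C code counts the k-tuples of leading v bits over one full period and checks the "equal counts, all-zero once less often" condition. The 4-bit shift register `lfsr4_next` and the helper `is_k_distributed` are hypothetical names introduced for illustration; the toy generator has period 15 and is only a stand-in, not the Mersenne Twister.

```c
#include <stdint.h>
#include <stdlib.h>

/* One step of a toy 4-bit linear feedback shift register (illustrative only). */
static uint32_t lfsr4_next(uint32_t s) {
    uint32_t bit = (s ^ (s >> 1)) & 1u;   /* feedback from the two lowest bits */
    return (s >> 1) | (bit << 3);
}

/* Returns 1 if seq, holding one full period of P w-bit values, is
   k-distributed to v-bit accuracy, and 0 otherwise. */
static int is_k_distributed(const uint32_t *seq, int P, int w, int k, int v) {
    size_t combos = (size_t)1 << (k * v);
    unsigned *count = calloc(combos, sizeof *count);
    if (!count) return 0;
    for (int i = 0; i < P; i++) {
        size_t key = 0;
        for (int j = 0; j < k; j++) {
            uint32_t top = seq[(i + j) % P] >> (w - v);   /* trunc_v */
            key = (key << v) | top;
        }
        count[key]++;
    }
    int ok = 1;
    for (size_t c = 1; c < combos; c++)       /* all non-zero combinations equal */
        if (count[c] != count[1]) ok = 0;
    if (count[0] + 1 != count[1]) ok = 0;     /* all-zero occurs once less often */
    free(count);
    return ok;
}
```

Under these assumptions, the 15 successive states starting from 1 are 1-distributed to 4-bit accuracy and 4-distributed to 1-bit accuracy, but not 2-distributed to 2-bit accuracy, because consecutive states share bits.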

Algorithmic detail

For a w-bit word length, the Mersenne Twister generates integers in the range $[0, 2^w-1]$.

The Mersenne Twister algorithm is based on a matrix linear recurrence over the finite binary field $\textbf{F}_2$. The algorithm is a twisted generalised feedback shift register[4] (twisted GFSR, or TGFSR) of rational normal form (TGFSR(R)), with state bit reflection and tempering. The basic idea is to define a series $x_i$ through a simple recurrence relation, and then output numbers of the form $x_i T$, where T is an invertible $\textbf{F}_2$-matrix called a tempering matrix.

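The general algorithm is characterized by the following quantities:

- w: word size (in number of bits)
- n: degree of recurrence
- m: middle word, an offset used in the recurrence relation defining the series x, $1 \le m < n$
- r: separation point of one word, or the number of bits of the lower bitmask, $0 \le r \le w-1$
- a: coefficients of the rational normal form twist matrix
- b, c: TGFSR(R) tempering bitmasks
- s, t: TGFSR(R) tempering bit shifts
- u, d, l: additional Mersenne Twister tempering bit shifts/masks

with the restriction that $2^{nw-r}-1$ is a Mersenne prime. This choice simplifies the primitivity test and k-distribution test that are needed in the parameter search.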

The series x is defined as a series of w-bit quantities with the recurrence relation:

$$x_{k+n} := x_{k+m} \oplus \left( ({x_k}^u \mid {x_{k+1}}^l) A \right) \qquad k = 0, 1, 2, \ldots$$

where $\mid$ denotes concatenation of bit vectors (with upper bits on the left), $\oplus$ the bitwise exclusive or (XOR), ${x_k}^u$ means the upper w − r bits of $x_k$, and ${x_{k+1}}^l$ means the lower r bits of $x_{k+1}$.

The subscripts may all be offset by −n:

$$x_k := x_{k-(n-m)} \oplus \left( ({x_{k-n}}^u \mid {x_{k-(n-1)}}^l) A \right) \qquad k = n, n+1, n+2, \ldots$$

where now the LHS, $x_k$, is the next generated value in the series in terms of values generated in the past, which are on the RHS.
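The recurrence can be sketched in C by regenerating all n words of the state in place, which is one common way to organise an implementation. The sketch assumes the MT19937 parameters given later in this article (n = 624, m = 397, r = 31, a = 9908B0DF in hexadecimal); `mt_regenerate` and the `MT_*` names are illustrative and not taken from any particular library.

```c
#include <stdint.h>

enum { MT_N = 624, MT_M = 397 };   /* n and m for MT19937 */
#define MT_UPPER 0x80000000u       /* mask for the upper w - r = 1 bit  */
#define MT_LOWER 0x7fffffffu       /* mask for the lower r = 31 bits    */
#define MT_A     0x9908b0dfu       /* bottom row of the twist matrix A  */

/* Replace x[0..n-1], holding x_{k-n}..x_{k-1}, with the next n values.
   Indices wrap modulo n, so slots that were already updated supply the
   "newer" terms of the recurrence. */
static void mt_regenerate(uint32_t x[MT_N]) {
    for (int k = 0; k < MT_N; k++) {
        /* concatenate the upper bits of x[k] with the lower bits of x[k+1] */
        uint32_t v = (x[k] & MT_UPPER) | (x[(k + 1) % MT_N] & MT_LOWER);
        /* multiply by A: shift right, XOR with a if the lowest bit was set */
        uint32_t vA = (v >> 1) ^ ((v & 1u) ? MT_A : 0u);
        /* XOR with the word m positions ahead, i.e. x_{k-(n-m)} */
        x[k] = x[(k + MT_M) % MT_N] ^ vA;
    }
}
```

Updating in place is safe here because each slot is read for the last time in the same step that overwrites it, or has already been replaced by the newer value the recurrence calls for.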

The twist transformation A is defined in rational normal form as

$$A = \begin{pmatrix} 0 & I_{w-1} \\ a_{w-1} & (a_{w-2}, \ldots, a_0) \end{pmatrix}$$

with $I_{w-1}$ as the $(w-1) \times (w-1)$ identity matrix. The rational normal form has the benefit that multiplication by A can be efficiently expressed as follows (remember that here matrix multiplication is being done in $\textbf{F}_2$, and therefore bitwise XOR takes the place of addition):

$$\boldsymbol{x}A = \begin{cases} \boldsymbol{x} \gg 1 & x_0 = 0 \\ (\boldsymbol{x} \gg 1) \oplus \boldsymbol{a} & x_0 = 1 \end{cases}$$

where $x_0$ is the lowest order bit of $x$.
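To make the shortcut concrete, the sketch below compares the shift-and-XOR form with an explicit row-vector-times-matrix product over $\textbf{F}_2$ for w = 32. It assumes the MT19937 value of a and the convention that the most significant bit of a word is the leftmost vector entry; `mul_A_fast` and `mul_A_matrix` are hypothetical helper names.

```c
#include <stdint.h>

#define TWIST_A 0x9908b0dfu   /* a for MT19937 */

/* Fast form: xA = (x >> 1), XORed with a when the lowest bit of x is 1. */
static uint32_t mul_A_fast(uint32_t x) {
    return (x >> 1) ^ ((x & 1u) ? TWIST_A : 0u);
}

/* Explicit product over F_2. Row i of A is stored as a 32-bit word whose
   most significant bit is column 0. */
static uint32_t mul_A_matrix(uint32_t x) {
    uint32_t rows[32];
    for (int i = 0; i < 31; i++)
        rows[i] = 1u << (30 - i);     /* (0, I_{w-1}) block: a single 1 in column i + 1 */
    rows[31] = TWIST_A;               /* bottom row: (a_{w-1}, ..., a_0) */

    uint32_t result = 0;
    for (int i = 0; i < 32; i++)
        if ((x >> (31 - i)) & 1u)     /* entry i of the row vector is bit w-1-i of x */
            result ^= rows[i];        /* addition in F_2 is XOR */
    return result;
}
```

Both functions should agree for every 32-bit input, which is the content of the case formula above.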

Like TGFSR(R), the Mersenne Twister is cascaded with a tempering transform to compensate for the reduced dimensionality of equidistribution (because of the choice of A being in the rational normal form). Note that this is equivalent to using the matrix $T^{-1}AT$ in place of A, for T an invertible matrix, and therefore the analysis of the characteristic polynomial mentioned below still holds.

As with A, we choose a tempering transform to be easily computable, and so do not actually construct T itself. This tempering is defined in the case of Mersenne Twister as

$$\begin{aligned} y &\equiv x \oplus ((x \gg u) \mathbin{\&} d) \\ y &\equiv y \oplus ((y \ll s) \mathbin{\&} b) \\ y &\equiv y \oplus ((y \ll t) \mathbin{\&} c) \\ z &\equiv y \oplus (y \gg l) \end{aligned}$$

where x is the next value from the series, y is a temporary intermediate value, and z is the value returned from the algorithm, with $\ll$ and $\gg$ as the bitwise left and right shifts, and $\&$ as the bitwise AND. The first and last transforms are added in order to improve lower-bit equidistribution. From the property of TGFSR, $s + t \ge \left\lfloor \frac{w}{2} \right\rfloor - 1$ is required to reach the upper bound of equidistribution for the upper bits.
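Because T is invertible, each tempering step can be undone. The following sketch applies the four steps with the MT19937 coefficients listed later in this article and inverts them by fixed-point iteration; `mt_temper`, `mt_untemper`, `undo_right` and `undo_left` are hypothetical names, and the inversion strategy is one possible approach rather than a method described in the source.

```c
#include <stdint.h>

/* Tempering with the MT19937 coefficients (u, d, s, b, t, c, l). */
static uint32_t mt_temper(uint32_t x) {
    uint32_t y = x ^ ((x >> 11) & 0xFFFFFFFFu);
    y ^= (y << 7) & 0x9D2C5680u;
    y ^= (y << 15) & 0xEFC60000u;
    return y ^ (y >> 18);
}

/* Invert y = x ^ ((x >> shift) & mask). The top `shift` bits of y already
   equal those of x; each pass recovers `shift` more bits. */
static uint32_t undo_right(uint32_t y, int shift, uint32_t mask) {
    uint32_t x = y;
    for (int done = shift; done < 32; done += shift)
        x = y ^ ((x >> shift) & mask);
    return x;
}

/* Invert y = x ^ ((x << shift) & mask), working up from the low bits. */
static uint32_t undo_left(uint32_t y, int shift, uint32_t mask) {
    uint32_t x = y;
    for (int done = shift; done < 32; done += shift)
        x = y ^ ((x << shift) & mask);
    return x;
}

/* Undo the four tempering steps in reverse order. */
static uint32_t mt_untemper(uint32_t z) {
    uint32_t y = undo_right(z, 18, 0xFFFFFFFFu);
    y = undo_left(y, 15, 0xEFC60000u);
    y = undo_left(y, 7, 0x9D2C5680u);
    return undo_right(y, 11, 0xFFFFFFFFu);
}
```

A round trip such as `mt_untemper(mt_temper(x)) == x` holding for all 32-bit x is a direct illustration that the tempering matrix is invertible.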

The coefficients for MT19937 are:

$$\begin{aligned} (w, n, m, r) &= (32, 624, 397, 31) \\ a &= \textrm{9908B0DF}_{16} \\ (u, d) &= (11, \textrm{FFFFFFFF}_{16}) \\ (s, b) &= (7, \textrm{9D2C5680}_{16}) \\ (t, c) &= (15, \textrm{EFC60000}_{16}) \\ l &= 18 \end{aligned}$$

Note that 32-bit implementations of the Mersenne Twister generally have $d = \textrm{FFFFFFFF}_{16}$. As a result, the d is occasionally omitted from the algorithm description, since the bitwise AND with d in that case has no effect.

The coefficients for MT19937-64 are:[5]

$$\begin{aligned} (w, n, m, r) &= (64, 312, 156, 31) \\ a &= \textrm{B5026F5AA96619E9}_{16} \\ (u, d) &= (29, \textrm{5555555555555555}_{16}) \\ (s, b) &= (17, \textrm{71D67FFFEDA60000}_{16}) \\ (t, c) &= (37, \textrm{FFF7EEE000000000}_{16}) \\ l &= 43 \end{aligned}$$
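As a quick check, using only the (w, n, m, r) values listed above, both coefficient sets give the same exponent nw − r in the restriction that $2^{nw-r}-1$ be a Mersenne prime:

```latex
\begin{aligned}
\text{MT19937:}    \quad nw - r &= 624 \cdot 32 - 31 = 19968 - 31 = 19937 \\
\text{MT19937-64:} \quad nw - r &= 312 \cdot 64 - 31 = 19968 - 31 = 19937
\end{aligned}
```

Both variants therefore share the period $2^{19937}-1$, even though they produce different sequences.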

Initialization

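The state needed for a Mersenne Twister implementation is an array of n values of w bits each. To initialize the array, a w-bit seed value is used to supply $x_0$ through $x_{n-1}$ by setting $x_0$ to the seed value and thereafter setting

$$x_i = f \times (x_{i-1} \oplus (x_{i-1} \gg (w-2))) + i$$

for i from 1 to n − 1, where f is a constant multiplier (1812433253 for MT19937, as in the C code below). The first value the algorithm then generates is based on $x_n$, not on $x_0$.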
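A minimal sketch of this seeding step for w = 32 follows. It assumes the MT19937 values n = 624 and f = 1812433253 and relies on unsigned 32-bit arithmetic wrapping modulo $2^{32}$; `mt_seed_array` and the `MT_*` macros are illustrative names.

```c
#include <stdint.h>

#define MT_N 624           /* n for MT19937 */
#define MT_F 1812433253u   /* f for MT19937 */

/* Fill x[0..n-1] from a 32-bit seed. Results are truncated to 32 bits
   because unsigned arithmetic wraps modulo 2^32. */
static void mt_seed_array(uint32_t x[MT_N], uint32_t seed) {
    x[0] = seed;
    for (uint32_t i = 1; i < MT_N; i++)
        x[i] = MT_F * (x[i - 1] ^ (x[i - 1] >> 30)) + i;   /* w - 2 = 30 */
}
```

For example, with the seed 5489 (an arbitrary choice for this illustration), 5489 >> 30 is 0, so $x_1$ is 1812433253 × 5489 + 1 reduced modulo $2^{32}$, which is 1301868182.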

C code

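```c
#include <stdint.h>

#define n 624
#define m 397
#define w 32
#define r 31
#define UMASK (0xffffffffUL << r)
#define LMASK (0xffffffffUL >> (w-r))
#define a 0x9908b0dfUL
#define u 11
#define s 7
#define t 15
#define l 18
#define b 0x9d2c5680UL
#define c 0xefc60000UL
#define f 1812433253UL

typedef struct {
    uint32_t state_array[n];   // the array for the state vector
    int state_index;           // index into the state array, 0 <= state_index <= n-1
} mt_state;

void initialize_state(mt_state* state, uint32_t seed)
{
    uint32_t* x = state->state_array;
    x[0] = seed;
    for (int i = 1; i < n; i++)
        x[i] = f * (x[i-1] ^ (x[i-1] >> (w-2))) + i;     // initialization recurrence
    state->state_index = 0;
}

uint32_t random_uint32(mt_state* state)
{
    uint32_t* x = state->state_array;
    int k = state->state_index;                          // slot holding the value from n steps ago
    uint32_t v = (x[k] & UMASK) | (x[(k+1) % n] & LMASK); // upper w-r bits | lower r bits
    uint32_t vA = (v >> 1) ^ ((v & 1UL) ? a : 0UL);      // multiply by the twist matrix A
    x[k] = x[(k+m) % n] ^ vA;                            // next value in the series
    state->state_index = (k+1) % n;                      // modulo n circular indexing
    uint32_t y = x[k] ^ (x[k] >> u);                     // tempering (d = 0xFFFFFFFF omitted)
    y ^= (y << s) & b;
    y ^= (y << t) & c;
    return y ^ (y >> l);
}
```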
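For comparison, here is a hedged sketch of the 64-bit variant built from the MT19937-64 coefficients listed earlier. The names `mt64_state`, `mt64_seed` and `mt64_next` are hypothetical. The initialization multiplier 6364136223846793005 is not given in this article; it is assumed here to be the value that C++'s `std::mt19937_64` uses.

```c
#include <stdint.h>

enum { N64 = 312, M64 = 156 };          /* n and m for MT19937-64 */
#define UPPER64 0xFFFFFFFF80000000ULL   /* upper w - r = 33 bits */
#define LOWER64 0x000000007FFFFFFFULL   /* lower r = 31 bits */
#define A64     0xB5026F5AA96619E9ULL   /* a */
#define F64     6364136223846793005ULL  /* f: assumption, not stated in this article */

typedef struct {
    uint64_t x[N64];   /* circular buffer holding the last n values */
    int index;         /* slot holding the oldest value */
} mt64_state;

static void mt64_seed(mt64_state *st, uint64_t seed) {
    st->x[0] = seed;
    for (int i = 1; i < N64; i++)   /* shift by w - 2 = 62 */
        st->x[i] = F64 * (st->x[i - 1] ^ (st->x[i - 1] >> 62)) + (uint64_t)i;
    st->index = 0;
}

static uint64_t mt64_next(mt64_state *st) {
    int k = st->index;
    uint64_t v = (st->x[k] & UPPER64) | (st->x[(k + 1) % N64] & LOWER64);
    uint64_t vA = (v >> 1) ^ ((v & 1ULL) ? A64 : 0ULL);     /* multiply by A */
    uint64_t y = st->x[k] = st->x[(k + M64) % N64] ^ vA;    /* next series value */
    st->index = (k + 1) % N64;
    y ^= (y >> 29) & 0x5555555555555555ULL;                 /* tempering: (u, d) */
    y ^= (y << 17) & 0x71D67FFFEDA60000ULL;                 /* (s, b) */
    y ^= (y << 37) & 0xFFF7EEE000000000ULL;                 /* (t, c) */
    y ^= y >> 43;                                           /* l */
    return y;
}
```

If the assumptions hold, seeding with 5489 and drawing 10000 values should end on 9981545732273789042, the value the C++ standard specifies for the 10000th output of a default-constructed `std::mt19937_64`.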

Comparison with classical GFSR

In order to achieve the $2^{nw-r}-1$ theoretical upper limit of the period in a TGFSR, $\phi_B(t)$ must be a primitive polynomial, $\phi_B(t)$ being the characteristic polynomial of

$$B = \begin{pmatrix} 0 & I_w & \cdots & 0 & 0 \\ \vdots & & & & \\ I_w & \vdots & \ddots & \vdots & \vdots \\ \vdots & & & & \\ 0 & 0 & \cdots & I_w & 0 \\ 0 & 0 & \cdots & 0 & I_{w-r} \\ S & 0 & \cdots & 0 & 0 \end{pmatrix} \begin{matrix} \\ \\ \leftarrow m\text{-th row} \\ \\ \\ \\ \end{matrix}$$

$$S = \begin{pmatrix} 0 & I_r \\ I_{w-r} & 0 \end{pmatrix} A$$

The twist transformation improves the classical GFSR with the following key properties:

- The period reaches the theoretical upper limit $2^{nw-r}-1$ (except if initialized with 0)

Variants

CryptMT is a stream cipher and cryptographically secure pseudorandom number generator which uses Mersenne Twister internally.[6] [7] It was developed by Matsumoto and Nishimura alongside Mariko Hagita and Mutsuo Saito. It has been submitted to the eSTREAM project of the eCRYPT network. Unlike Mersenne Twister or its other derivatives, CryptMT is patented.

MTGP is a variant of Mersenne Twister optimised for graphics processing units published by Mutsuo Saito and Makoto Matsumoto.[8] The basic linear recurrence operations are extended from MT and parameters are chosen to allow many threads to compute the recursion in parallel, while sharing their state space to reduce memory load. The paper claims improved equidistribution over MT and performance on an old (2008-era) GPU (Nvidia GTX260 with 192 cores) of 4.7 ms for $5 \times 10^7$ random 32-bit integers.

The SFMT (SIMD-oriented Fast Mersenne Twister) is a variant of Mersenne Twister, introduced in 2006,[9] designed to be fast when it runs on 128-bit SIMD.

Intel SSE2 and PowerPC AltiVec are supported by SFMT. It is also used for games with the Cell BE in the PlayStation 3.[11]

TinyMT is a variant of Mersenne Twister, proposed by Saito and Matsumoto in 2011.[12] TinyMT uses just 127 bits of state space, a significant decrease compared to the original's 2.5 KiB of state. However, it has a period of $2^{127}-1$, far shorter than the original, so it is only recommended by the authors in cases where memory is at a premium.

Characteristics

Advantages:

- A very long period of $2^{19937}-1$. Note that while a long period is not a guarantee of quality in a random number generator, short periods, such as the $2^{32}$ common in many older software packages, can be problematic.[14]
- k-distributed to 32-bit accuracy for every $1 \le k \le 623$ (for a definition of k-distributed, see the k-distribution section above)

Disadvantages:

- [...] $\textbf{F}_2$-algebra.

Applications

The Mersenne Twister is used as default PRNG by the following software:

It is also available in Apache Commons,[47] in the standard C++ library (since C++11),[48] [49] and in Mathematica.[50] Add-on implementations are provided in many program libraries, including the Boost C++ Libraries,[51] the CUDA Library,[52] and the NAG Numerical Library.[53]

The Mersenne Twister is one of two PRNGs in SPSS: the other generator is kept only for compatibility with older programs, and the Mersenne Twister is stated to be "more reliable".[54] The Mersenne Twister is similarly one of the PRNGs in SAS: the other generators are older and deprecated.[55] The Mersenne Twister is the default PRNG in Stata; the other one is KISS, kept for compatibility with older versions of Stata.[56]

Alternatives

An alternative generator, WELL ("Well Equidistributed Long-period Linear"), offers quicker recovery, equal randomness, and nearly equal speed.[57]

Marsaglia's xorshift generators and variants are the fastest in the class of LFSRs.[58]

64-bit MELGs ("64-bit Maximally Equidistributed $\textbf{F}_2$-Linear Generators with Mersenne Prime Period") are completely optimized in terms of the k-distribution properties.[59]

The ACORN family (published 1989) is another k-distributed PRNG, which shows similar computational speed to MT, and better statistical properties as it satisfies all the current (2019) TestU01 criteria; when used with appropriate choices of parameters, ACORN can have arbitrarily long period and precision.

The PCG family is a more modern long-period generator, with better cache locality, and less detectable bias using modern analysis methods.[60]

Notes and References

  1. Matsumoto. M.. Nishimura. T.. 1998. Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation. 8. 1. 3–30. 10.1.1.215.1141. 10.1145/272991.272995. 3332028.
  2. E.g. Marsland S. (2011) Machine Learning (CRC Press), §4.1.1. Also see the section "Applications".
  3. Web site: John Savard. The Mersenne Twister. A subsequent paper, published in the year 2000, gave five additional forms of the Mersenne Twister with period 2^19937-1. All five were designed to be implemented with 64-bit arithmetic instead of 32-bit arithmetic.
  4. Matsumoto. M.. Kurita. Y.. 1992. Twisted GFSR generators. ACM Transactions on Modeling and Computer Simulation. 2. 3. 179–194. 10.1145/146382.146383. 15246234.
  5. Web site: std::mersenne_twister_engine. 2015-07-20. Pseudo Random Number Generation.
  6. Web site: CryptMt and Fubuki. 2017-11-12. eCRYPT. 2012-07-01. https://web.archive.org/web/20120701135329/http://www.ecrypt.eu.org/stream/cryptmtfubuki.html. dead.
  7. Web site: Matsumoto. Makoto. Nishimura. Takuji. Hagita. Mariko. Saito. Mutsuo. 2005. Cryptographic Mersenne Twister and Fubuki Stream/Block Cipher.
  8. 1005.4973v3. cs.MS. Mutsuo Saito. Makoto Matsumoto. Variants of Mersenne Twister Suitable for Graphic Processors. 2010.
  9. Web site: SIMD-oriented Fast Mersenne Twister (SFMT). 4 October 2015. hiroshima-u.ac.jp.
  10. Web site: SFMT:Comparison of speed. 4 October 2015. hiroshima-u.ac.jp.
  11. Web site: PlayStation3 License. 4 October 2015. scei.co.jp.
  12. Web site: Tiny Mersenne Twister (TinyMT). 4 October 2015. hiroshima-u.ac.jp.
  13. P. L'Ecuyer and R. Simard, "TestU01: A C library for empirical testing of random number generators", ACM Transactions on Mathematical Software, 33, 4, Article 22 (August 2007).
  14. Note: $2^{19937}$ is approximately $4.3 \times 10^{6001}$; this is many orders of magnitude larger than the estimated number of particles in the observable universe, which is $10^{87}$.
  15. Route. Matthew. August 10, 2017. Radio-flaring Ultracool Dwarf Population Synthesis. The Astrophysical Journal. 845. 1. 66. 1707.02212. 2017ApJ...845...66R. 10.3847/1538-4357/aa7ede. 118895524. free.
  16. Web site: SIMD-oriented Fast Mersenne Twister (SFMT): twice faster than Mersenne Twister. 27 March 2017. Japan Society for the Promotion of Science.
  17. Web site: Makoto Matsumoto. Takuji Nishimura. Dynamic Creation of Pseudorandom Number Generators. 19 July 2015.
  18. Web site: Hiroshi Haramoto. Makoto Matsumoto. Takuji Nishimura. François Panneton. Pierre L'Ecuyer. Efficient Jump Ahead for F2-Linear Random Number Generators. 12 Nov 2015.
  19. Web site: mt19937ar: Mersenne Twister with improved initialization. 4 October 2015. hiroshima-u.ac.jp.
  20. Fog. Agner. 1 May 2015. Pseudo-Random Number Generators for Vector Processors and Multicore Processors. Journal of Modern Applied Statistical Methods. 14. 1. 308–334. 10.22237/jmasm/1430454120. free.
  21. Web site: Random link. 2020-06-04. Dyalog Language Reference Guide.
  22. Web site: RANDOMU (IDL Reference). 2013-08-23. Exelis VIS Docs Center.
  23. Web site: Random Number Generators. 2012-05-29. CRAN Task View: Probability Distributions.
  24. Web site: "Random" class documentation. 2012-05-29. Ruby 1.9.3 documentation.
  25. Web site: random. 2013-11-28. free pascal documentation.
  26. Web site: mt_rand — Generate a better random value. 2016-03-02. PHP Manual.
  27. Web site: NumPy 1.17.0 Release Notes — NumPy v1.21 Manual. 2021-06-29. numpy.org.
  28. Web site: 9.6 random — Generate pseudo-random numbers. 2012-05-29. Python v2.6.8 documentation.
  29. Web site: 8.6 random — Generate pseudo-random numbers. 2012-05-29. Python v3.2 documentation.
  30. Web site: random — Generate pseudo-random numbers — Python 3.8.3 documentation. 2020-06-23. Python 3.8.3 documentation.
  31. Web site: Design choices and extensions. 2014-02-03. CMUCL User's Manual.
  32. Web site: Random states. 2015-09-20. The ECL manual.
  33. Web site: Random Number Generation. SBCL User's Manual.
  34. Web site: Random Numbers · The Julia Language . 2022-06-21 . docs.julialang.org.
  35. Web site: Random Numbers: GLib Reference Manual.
  36. Web site: Random Number Algorithms. 2013-11-21. GNU MP.
  37. Web site: 16.3 Special Utility Matrices. GNU Octave. Built-in Function: rand.
  38. Web site: Random number environment variables. 2013-11-24. GNU Scientific Library.
  39. .
  40. Web site: GAUSS 14 Language Reference.
  41. "uniform". Gretl Function Reference.
  42. Web site: New random-number generator—64-bit Mersenne Twister.
  43. Web site: Probability Distributions — Sage Reference Manual v7.2: Probablity.
  44. Web site: grand - Random numbers. Scilab Help.
  45. Web site: random number generator. 2013-11-21. Maple Online Help.
  46. Web site: Random number generator algorithms. Documentation Center, MathWorks.
  47. Web site: Data Generation. Apache Commons Math User Guide.
  48. Web site: Random Number Generation in C++11. Standard C++ Foundation.
  49. Web site: std::mersenne_twister_engine. 2012-09-25. Pseudo Random Number Generation.
  50. http://reference.wolfram.com/language/tutorial/RandomNumberGeneration.html#569959585
  51. Web site: boost/random/mersenne_twister.hpp. 2012-05-29. Boost C++ Libraries.
  52. Web site: Host API Overview. 2016-08-02. CUDA Toolkit Documentation.
  53. Web site: G05 – Random Number Generators. 2012-05-29. NAG Library Chapter Introduction.
  54. Web site: Random Number Generators. 2013-11-21. IBM SPSS Statistics.
  55. Web site: Using Random-Number Functions. 2013-11-21. SAS Language Reference.
  56. Stata help: set rng -- Set which random-number generator (RNG) to use
  57. P. L'Ecuyer, "Uniform Random Number Generators", International Encyclopedia of Statistical Science, Lovric, Miodrag (Ed.), Springer-Verlag, 2010.
  58. Web site: xorshift*/xorshift+ generators and the PRNG shootout.
  59. Harase. S.. Kimoto. T.. 2018. Implementing 64-bit Maximally Equidistributed F2-Linear Generators with Mersenne Prime Period. ACM Transactions on Mathematical Software. 44. 3. 30:1–30:11. 1505.06582. 10.1145/3159444. 14923086.
  60. Web site: 27 July 2017. The PCG Paper.