Van Houtum distribution

In probability theory and statistics, the Van Houtum distribution is a discrete probability distribution named after Prof. Geert-Jan van Houtum.[1] It is characterized by the property that all values in a finite set of possible values are equally probable, except possibly the smallest and largest elements of the set. Because the Van Houtum distribution generalizes the discrete uniform distribution, being uniform except possibly at its boundaries, it is sometimes also referred to as quasi-uniform.

It is often the case that the only available information about some discrete random variable is its first two moments. The Van Houtum distribution can be used to fit a distribution with finite support to these moments.

A simple example of the Van Houtum distribution arises when throwing a loaded die that has been tampered with to land on a 6 twice as often as on a 1. The possible outcomes are 1, 2, 3, 4, 5 and 6. Each time the die is thrown, the probability of throwing each of 2, 3, 4 and 5 is 1/6; the probability of throwing a 1 is 1/9 and the probability of throwing a 6 is 2/9.

Probability mass function

A random variable U has a Van Houtum (a, b, p_a, p_b) distribution if its probability mass function is

\Pr(U=u)=\begin{cases}p_a&\text{if }u=a;\\[8pt] p_b&\text{if }u=b;\\[8pt] \dfrac{1-p_a-p_b}{b-a-1}&\text{if }a<u<b;\\[8pt] 0&\text{otherwise.}\end{cases}
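As an illustration, the mass function can be sketched in Python. The function name van_houtum_pmf and the use of exact fractions are choices made here for illustration and do not come from the cited sources; the parameters are those of the loaded die from the introduction.

```python
from fractions import Fraction


def van_houtum_pmf(u, a, b, p_a, p_b):
    """Pr(U = u) for integers a < b with boundary masses p_a and p_b."""
    if u == a:
        return p_a
    if u == b:
        return p_b
    if a < u < b:
        # The remaining mass is spread evenly over the b - a - 1 interior points.
        return (1 - p_a - p_b) / (b - a - 1)
    return 0


# Loaded die: a = 1, b = 6, p_a = 1/9, p_b = 2/9.
die = [van_houtum_pmf(u, 1, 6, Fraction(1, 9), Fraction(2, 9)) for u in range(1, 7)]
```

Each interior face receives (1 - 1/9 - 2/9)/4 = 1/6, and the six probabilities in die sum to one.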

Fitting procedure

Suppose a random variable X has mean \mu and squared coefficient of variation c^2. Let U be a Van Houtum distributed random variable. Then the first two moments of U match the first two moments of X if a, b, p_a and p_b are chosen such that:[2]

\begin{align} a&=\left\lceil\mu-\frac{1}{2}\left\lceil\sqrt{1+12c^2\mu^2}\right\rceil\right\rceil\\[8pt] b&=\left\lfloor\mu+\frac{1}{2}\left\lceil\sqrt{1+12c^2\mu^2}\right\rceil\right\rfloor\\[8pt] p_b&=\frac{(c^2+1)\mu^2-A-(a^2-A)(2\mu-a-b)/(a-b)}{a^2+b^2-2A}\\[8pt] p_a&=\frac{2\mu-a-b}{a-b}+p_b\\[12pt] \text{where }A&=\frac{2a^2+a+2ab-b+2b^2}{6}. \end{align}
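The fitting formulas can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation from the cited thesis: the name fit_van_houtum and the argument name scv (for c^2) are hypothetical, the square root is evaluated in floating point, and the sketch does not check whether the given moments can be fitted at all.

```python
from fractions import Fraction
from math import ceil, floor, sqrt


def fit_van_houtum(mu, scv):
    """Return (a, b, p_a, p_b) for mean mu and squared coefficient of variation scv."""
    mu, scv = Fraction(mu), Fraction(scv)
    variance = scv * mu**2  # c^2 * mu^2
    # Half of ceil(sqrt(1 + 12 c^2 mu^2)); the square root uses floating point.
    half_width = Fraction(ceil(sqrt(1 + 12 * variance)), 2)
    a = ceil(mu - half_width)
    b = floor(mu + half_width)
    # A is the mean of k^2 over the interior points a + 1, ..., b - 1.
    A = Fraction(2 * a * a + a + 2 * a * b - b + 2 * b * b, 6)
    d = (2 * mu - a - b) / (a - b)  # equals p_a - p_b
    p_b = ((scv + 1) * mu**2 - A - (a * a - A) * d) / (a * a + b * b - 2 * A)
    p_a = d + p_b
    return a, b, p_a, p_b


# Moments of the loaded die: mean 34/9, variance 230/81, so c^2 = 115/578.
params = fit_van_houtum(Fraction(34, 9), Fraction(115, 578))
```

For the loaded die from the introduction, whose mean is 34/9 and whose variance is 230/81, the sketch recovers a = 1, b = 6, p_a = 1/9 and p_b = 2/9.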

There does not exist a Van Houtum distribution for every combination of \mu and c^2. By using the fact that for any real mean \mu the discrete distribution on the integers that has minimal variance is concentrated on the integers \lfloor\mu\rfloor and \lceil\mu\rceil, it is easy to verify that a Van Houtum distribution (or indeed any discrete distribution on the integers) can only be fitted on the first two moments if[3]

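c^2\mu^2\geq(\mu-\lfloor\mu\rfloor)(1+\lfloor\mu\rfloor-\mu)^2+(\mu-\lfloor\mu\rfloor)^2(1+\lfloor\mu\rfloor-\mu).

The right-hand side is the variance of the two-point distribution that places probability \mu-\lfloor\mu\rfloor on \lfloor\mu\rfloor+1 and the remaining probability on \lfloor\mu\rfloor; it simplifies to (\mu-\lfloor\mu\rfloor)(1+\lfloor\mu\rfloor-\mu).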
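The condition can be checked numerically by building the minimal-variance two-point distribution directly. The sketch below is an illustration only: the names min_integer_variance and moments_are_feasible are hypothetical, and scv again stands for c^2.

```python
from fractions import Fraction
from math import floor


def min_integer_variance(mu):
    """Variance of the two-point distribution on floor(mu) and floor(mu) + 1 with mean mu."""
    mu = Fraction(mu)
    lo = floor(mu)
    p_hi = mu - lo  # mass on lo + 1, chosen so that the mean equals mu
    p_lo = 1 - p_hi  # mass on lo
    return p_lo * (lo - mu) ** 2 + p_hi * (lo + 1 - mu) ** 2


def moments_are_feasible(mu, scv):
    """Check the necessary condition c^2 mu^2 >= minimal variance on the integers."""
    mu, scv = Fraction(mu), Fraction(scv)
    return scv * mu**2 >= min_integer_variance(mu)
```

When \mu is an integer the mass on \lfloor\mu\rfloor+1 is zero, so the minimal variance is zero and the condition holds for every c^2 \geq 0.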

Notes and References

  1. A. Saura (2012), Van Houtumin jakauma (in Finnish). BSc Thesis, University of Helsinki, Finland
  2. J.J. Arts (2009), Efficient optimization of the Dual-Index policy using Markov Chain approximations. MSc Thesis, Eindhoven University of Technology, The Netherlands (Appendix B)
  3. I.J.B.F. Adan, M.J.A. van Eenige, and J.A.C. Resing. "Fitting discrete distributions on the first two moments". Probability in the Engineering and Informational Sciences, 9:623–632, 1996.