In the theory of finite population sampling, Bernoulli sampling is a sampling process where each element of the population is subjected to an independent Bernoulli trial which determines whether the element becomes part of the sample. An essential property of Bernoulli sampling is that all elements of the population have equal probability of being included in the sample.[1]
Bernoulli sampling is therefore a special case of Poisson sampling. In Poisson sampling each element of the population may have a different probability of being included in the sample. In Bernoulli sampling, the probability is equal for all the elements.
Because each element of the population is considered separately for the sample, the sample size is not fixed but rather follows a binomial distribution.
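For reference, this distribution can be written out explicitly. The symbols are chosen here for illustration: n is the population size, p is the common inclusion probability, and K is the random sample size. The standard binomial formulas then give:

```latex
\Pr(K = k) = \binom{n}{k}\, p^{k} (1-p)^{n-k}, \qquad k = 0, 1, \dots, n,
\qquad \operatorname{E}[K] = n p, \qquad \operatorname{Var}(K) = n p (1-p).
```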
The most basic Bernoulli method generates n random variates to extract a sample from a population of n items. Suppose you want to extract a given percentage pct of the population. The algorithm can be described as follows:[2]
for each item in the set
    generate a random non-negative integer R
    if (R mod 100) < pct then
        select item
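The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `bernoulli_sample_pct`, the optional `rng` parameter, and the choice to draw R from the range [0, 10^9) are assumptions made here. The upper bound is a multiple of 100, so R mod 100 is uniform over 0..99.

```python
import random


def bernoulli_sample_pct(items, pct, rng=None):
    """Select each item independently with probability pct/100.

    `pct` is assumed to be an integer percentage between 0 and 100.
    """
    rng = rng if rng is not None else random.Random()
    selected = []
    for item in items:
        # Random non-negative integer R; 10**9 is divisible by 100,
        # so R % 100 is exactly uniform over 0..99 (no modulo bias).
        r = rng.randrange(10**9)
        if r % 100 < pct:
            selected.append(item)
    return selected
```

Passing a seeded generator, for example `random.Random(7)`, makes a run reproducible. The selected items keep their original order because the population is scanned once from start to end.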
A percentage of 20%, say, is usually expressed as a probability p = 0.2. In that case, random variates are generated in the unit interval, and an item is selected when its variate is less than p. After running the algorithm, a sample of size k will have been selected. One would expect to have
k ≈ n ⋅ p
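The probability form of the algorithm can be sketched the same way. In the illustration below, the function name `bernoulli_sample` and the `rng` parameter are assumptions made here. Each item is kept when a variate drawn from the unit interval falls below p, and the resulting sample size k can then be compared with n ⋅ p.

```python
import random


def bernoulli_sample(items, p, rng=None):
    """Keep each item independently with probability p (0 <= p <= 1)."""
    rng = rng if rng is not None else random.Random()
    # rng.random() returns a float in [0.0, 1.0); an item is kept
    # when its variate is strictly less than p.
    return [item for item in items if rng.random() < p]


if __name__ == "__main__":
    n, p = 10_000, 0.2
    sample = bernoulli_sample(range(n), p, random.Random(42))
    k = len(sample)
    # k varies from run to run (or seed to seed) around n * p.
    print(f"k = {k}, n*p = {n * p:.0f}")
```

Because the variates lie in [0, 1), p = 0 selects nothing and p = 1 selects every item, so the two extreme cases give exact results.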
[Figure. Left panel: the distribution of k for four values of n at p = 0.2, with k ranging over [0, n]. Right panel: the lowest values of n for which k ends up within a set K of acceptable sample sizes with sufficient probability, for p from 0.0 to 1.00. For error = 0.005 the panel relates 100 ⋅ k/n to 100 ⋅ p, with values up to n = 38400.]
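The probability of ending up within a set K of sample sizes can be evaluated numerically from the binomial distribution. The sketch below rests on an assumption made here, not a definition taken from the text: K is taken to contain the sample sizes k with |k/n − p| ≤ error. The function names `log_binom_pmf` and `prob_within` are likewise hypothetical. Probabilities are computed in log space so that large n does not cause underflow.

```python
import math


def log_binom_pmf(k, n, p):
    """Natural log of the binomial probability of k successes in n trials."""
    if p == 0.0:
        return 0.0 if k == 0 else -math.inf
    if p == 1.0:
        return 0.0 if k == n else -math.inf
    log_coeff = (math.lgamma(n + 1) - math.lgamma(k + 1)
                 - math.lgamma(n - k + 1))
    return log_coeff + k * math.log(p) + (n - k) * math.log1p(-p)


def prob_within(n, p, error):
    """Probability that the sampled fraction k/n lies within `error` of p.

    Assumed acceptance rule: |k/n - p| <= error (proportion scale).
    """
    # Scan a slightly widened range of k and apply the rule directly.
    lo = max(0, math.floor(n * (p - error)) - 1)
    hi = min(n, math.ceil(n * (p + error)) + 1)
    total = 0.0
    for k in range(lo, hi + 1):
        if abs(k / n - p) <= error:
            total += math.exp(log_binom_pmf(k, n, p))
    return total
```

Scanning increasing n until `prob_within(n, p, error)` reaches a chosen threshold gives one way to estimate how large a population is needed for a given p and error. The result at the boundary values of k is sensitive to floating-point rounding.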