Krichevsky–Trofimov estimator explained

In information theory, given an unknown stationary source with alphabet A and a sample w from, the Krichevsky–Trofimov (KT) estimator produces an estimate pi(w) of the probability of each symbol i ∈ A. This estimator is optimal in the sense that it minimizes the worst-case regret asymptotically.

For a binary alphabet and a string w with m zeroes and n ones, the KT estimator pi(w) is defined as:[1]

\begin{align} p0(w)&=

m+1/2
m+n+1

,\\[5pt] p1(w)&=

n+1/2
m+n+1

. \end{align}

This corresponds to the posterior mean of a Beta-Bernoulli posterior distribution with prior

1/2

.For the general case the estimate is made using a Dirichlet-Categorical distribution.

See also

Notes and References

  1. Krichevsky . R. E. . Trofimov . V. K. . 1981 . The Performance of Universal Encoding . IEEE Trans. Inf. Theory . IT-27 . 2 . 199–207 . 10.1109/TIT.1981.1056331 .