In information theory, given an unknown stationary source with alphabet A and a sample w drawn from that source, the Krichevsky–Trofimov (KT) estimator produces an estimate p_i(w) of the probability of each symbol i ∈ A. This estimator is optimal in the sense that it asymptotically minimizes the worst-case regret.
For a binary alphabet and a string w with m zeros and n ones, the KT estimator p_i(w) is defined as:[1]
\begin{align}
p_0(w) &= \frac{m + 1/2}{m + n + 1},\\[5pt]
p_1(w) &= \frac{n + 1/2}{m + n + 1}.
\end{align}
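A minimal Python sketch of these two formulas follows; the function name kt_estimate and the use of the fractions module are illustrative choices, not part of the source.

from fractions import Fraction

def kt_estimate(w):
    """KT estimates (p0(w), p1(w)) for a binary string w with m zeros and n ones."""
    m, n = w.count("0"), w.count("1")
    # Clearing the half-integer numerators: (m + 1/2)/(m + n + 1) = (2m + 1)/(2(m + n + 1)).
    p0 = Fraction(2 * m + 1, 2 * (m + n + 1))
    p1 = Fraction(2 * n + 1, 2 * (m + n + 1))
    return p0, p1

# Example: w = "0010" has m = 3 zeros and n = 1 one, so
# p0 = (3 + 1/2)/5 = 7/10 and p1 = (1 + 1/2)/5 = 3/10.
print(kt_estimate("0010"))  # (Fraction(7, 10), Fraction(3, 10))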
This corresponds to the posterior mean of a Beta–Bernoulli model under the Jeffreys prior Beta(1/2, 1/2).
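Concretely, the Jeffreys prior on the Bernoulli parameter θ has density proportional to θ^(−1/2)(1 − θ)^(−1/2); after observing m zeros and n ones, the posterior is Beta(m + 1/2, n + 1/2), and its mean recovers the formula above:
\begin{align}
p_1(w) = \operatorname{E}[\theta \mid w] = \frac{n + 1/2}{\left(m + 1/2\right) + \left(n + 1/2\right)} = \frac{n + 1/2}{m + n + 1},
\end{align}
and symmetrically p_0(w) = (m + 1/2)/(m + n + 1).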