In statistics, identifiability is a property which a model must satisfy for precise inference to be possible. A model is identifiable if it is theoretically possible to learn the true values of this model's underlying parameters after obtaining an infinite number of observations from it. Mathematically, this is equivalent to saying that different values of the parameters must generate different probability distributions of the observable variables. Usually the model is identifiable only under certain technical restrictions, in which case the set of these requirements is called the identification conditions.
A model that fails to be identifiable is said to be non-identifiable or unidentifiable: two or more parametrizations are observationally equivalent. In some cases, even though a model is non-identifiable, it is still possible to learn the true values of a certain subset of the model parameters. In this case we say that the model is partially identifiable. In other cases it may be possible to learn the location of the true parameter up to a certain finite region of the parameter space, in which case the model is set identifiable.
Aside from strictly theoretical exploration of model properties, identifiability can also be assessed in a wider, practical scope, when a model is tested against experimental data sets using identifiability analysis.[1]
Let $\mathcal{P}=\{P_\theta : \theta\in\Theta\}$ be a statistical model with parameter space $\Theta$. We say that $\mathcal{P}$ is identifiable if the mapping $\theta\mapsto P_\theta$ is one-to-one:

\[ P_{\theta_1}=P_{\theta_2} \quad\Rightarrow\quad \theta_1=\theta_2 \qquad \text{for all } \theta_1,\theta_2\in\Theta. \]
This definition means that distinct values of θ should correspond to distinct probability distributions: if θ1 ≠ θ2, then also Pθ1 ≠ Pθ2. If the distributions are defined in terms of probability density functions (pdfs), then two pdfs should be considered distinct only if they differ on a set of non-zero measure (for example, the two indicator functions ƒ1(x) = 1{0 ≤ x < 1} and ƒ2(x) = 1{0 ≤ x ≤ 1} differ only at the single point x = 1, a set of measure zero, and thus cannot be considered distinct pdfs).
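To illustrate the one-to-one requirement, here is a minimal sketch in Python (assuming NumPy and SciPy; the redundant parametrization below is a made-up example, not taken from the text): a model in which the parameter enters only through a sum is not identifiable, because different parameter values produce the same distribution.

```python
import numpy as np
from scipy import stats

# Hypothetical illustration: the model P_theta = N(theta1 + theta2, 1) is NOT
# identifiable, because only the sum theta1 + theta2 affects the distribution.
theta_a = (1.0, 2.0)
theta_b = (0.5, 2.5)          # different parameters, same sum

x = np.linspace(-5.0, 10.0, 1001)
pdf_a = stats.norm.pdf(x, loc=sum(theta_a), scale=1.0)
pdf_b = stats.norm.pdf(x, loc=sum(theta_b), scale=1.0)

# The two pdfs agree everywhere: theta_a != theta_b but P_theta_a == P_theta_b,
# so the map theta -> P_theta is not one-to-one.
print(np.allclose(pdf_a, pdf_b))   # True -> non-identifiable parametrization

# By contrast, parametrizing directly by the mean gives distinct pdfs for
# distinct parameter values, as identifiability requires.
print(np.allclose(stats.norm.pdf(x, 1.0), stats.norm.pdf(x, 1.5)))  # False
```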
Identifiability of the model in the sense of invertibility of the map $\theta\mapsto P_\theta$ is equivalent to being able to learn the model's true parameter if the model can be observed indefinitely long. Indeed, if $\{X_t\}\subseteq S$ is the sequence of observations from the model, then by the strong law of large numbers,

\[ \frac{1}{T}\sum_{t=1}^{T} \mathbf{1}_{\{X_t\in A\}} \ \xrightarrow{\ \text{a.s.}\ }\ \Pr[X_t\in A] \]

for every measurable set $A\subseteq S$ (here $\mathbf{1}_{\{\cdots\}}$ is the indicator function). Thus, with an infinite number of observations we can recover the true probability distribution $P_0$ in the model, and since the identifiability condition requires that the map $\theta\mapsto P_\theta$ be invertible, we can also recover the true value of the parameter that generated $P_0$.
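A minimal Monte Carlo sketch of this argument (assuming a normal model and NumPy/SciPy, purely for illustration): the empirical frequency of an event A approaches Pθ(A) as the number of observations grows.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Assumed true parameter of an illustrative normal model P_theta = N(mu, sigma^2).
mu, sigma = 1.0, 2.0

# Event A = (-inf, 0]; its true probability under P_theta.
true_prob = stats.norm.cdf(0.0, loc=mu, scale=sigma)

# Empirical frequency (1/T) * sum_t 1{X_t in A} for increasing sample sizes.
for T in (100, 10_000, 1_000_000):
    sample = rng.normal(mu, sigma, size=T)
    empirical = np.mean(sample <= 0.0)
    print(T, empirical, true_prob)

# By the strong law of large numbers the empirical frequencies converge to
# P_theta(A), so an indefinitely long sample pins down the distribution P_0 itself.
```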
Let $\mathcal{P}$ be the normal location-scale family:

\[ \mathcal{P}=\Big\{\ f_\theta(x)=\tfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\ \Big|\ \theta=(\mu,\sigma):\ \mu\in\mathbb{R},\ \sigma>0\ \Big\}. \]

Then

\begin{align}
& f_{\theta_1}(x)=f_{\theta_2}(x) \\[6pt]
\Longleftrightarrow{}& \frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\!\left(-\frac{1}{2\sigma_1^2}(x-\mu_1)^2\right) = \frac{1}{\sqrt{2\pi}\,\sigma_2}\exp\!\left(-\frac{1}{2\sigma_2^2}(x-\mu_2)^2\right) \\[6pt]
\Longleftrightarrow{}& \frac{1}{2\sigma_1^2}(x-\mu_1)^2 + \ln\sigma_1 = \frac{1}{2\sigma_2^2}(x-\mu_2)^2 + \ln\sigma_2 \\[6pt]
\Longleftrightarrow{}& x^2\left(\frac{1}{2\sigma_1^2}-\frac{1}{2\sigma_2^2}\right) - 2x\left(\frac{\mu_1}{2\sigma_1^2}-\frac{\mu_2}{2\sigma_2^2}\right) + \left(\frac{\mu_1^2}{2\sigma_1^2}-\frac{\mu_2^2}{2\sigma_2^2}+\ln\sigma_1-\ln\sigma_2\right) = 0
\end{align}

This expression equals zero for almost all $x$ only when all of its coefficients are zero, which (given the restriction $\sigma>0$) is possible only when $\sigma_1=\sigma_2$ and $\mu_1=\mu_2$. Hence the model is identifiable: $f_{\theta_1}=f_{\theta_2}\Leftrightarrow\theta_1=\theta_2$.
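The same conclusion can be checked numerically: distinct pairs (μ, σ) with σ > 0 give normal pdfs that differ on a whole interval, i.e. on a set of non-zero measure. A rough sketch (assuming NumPy/SciPy; the parameter values are arbitrary):

```python
import numpy as np
from scipy import stats

x = np.linspace(-10.0, 10.0, 2001)

def normal_pdf(theta, x):
    """Density f_theta of the normal location-scale family."""
    mu, sigma = theta
    return stats.norm.pdf(x, loc=mu, scale=sigma)

theta1 = (0.0, 1.0)
theta2 = (0.0, 1.5)   # same mean, different scale

# Distinct (mu, sigma) yield pdfs that disagree on an interval, so the
# parametrization theta -> f_theta is one-to-one (the model is identifiable).
print(np.allclose(normal_pdf(theta1, x), normal_pdf(theta2, x)))  # False
```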
Let $\mathcal{P}$ be the standard linear regression model:

\[ y=\beta'x+\varepsilon, \qquad \operatorname{E}[\varepsilon\mid x]=0 \]

(where $'$ denotes matrix transpose). Then the parameter $\beta$ is identifiable if and only if the matrix $\operatorname{E}[xx']$ is non-singular; this is the identification condition for the model.
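As an illustrative sketch (hypothetical data, NumPy only): when the regressors are perfectly collinear, the sample analogue of E[xx'] is singular and β cannot be recovered uniquely, since two different β produce the same regression function.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Collinear design: the second regressor is exactly twice the first,
# so E[xx'] is singular and beta is not identified.
x1 = rng.normal(size=n)
X = np.column_stack([x1, 2.0 * x1])

beta_a = np.array([1.0, 1.0])
beta_b = np.array([3.0, 0.0])   # different beta, same mean function: x1 + 2*x1 = 3*x1

print(np.allclose(X @ beta_a, X @ beta_b))    # True -> observationally equivalent
print(np.linalg.matrix_rank(X.T @ X / n))     # 1 < 2: sample E[xx'] is not invertible
```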
Suppose $\mathcal{P}$ is the classical errors-in-variables linear model:

\[ \begin{cases} y=\beta x^*+\varepsilon, \\ x=x^*+\eta, \end{cases} \]

where $(\varepsilon,\eta,x^*)$ are jointly normal independent random variables with zero expected value and unknown variances, and only the variables $(x,y)$ are observed. Then this model is not identifiable: different combinations of $\beta$ and the variances can generate the same joint distribution of the observed pair $(x,y)$. If we abandon the normality assumption and instead require that $x^*$ not be normally distributed, retaining only the independence condition $\varepsilon\perp\eta\perp x^*$, then the model becomes identifiable.
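Under the normality assumption, non-identifiability can be seen from the fact that the observed pair (x, y) is jointly normal with zero mean, so the data carry no information beyond the covariance matrix of (x, y); any two parametrizations implying the same covariance matrix are observationally equivalent. A small sketch (the parameter values are assumptions chosen for illustration):

```python
import numpy as np

def implied_cov(beta, var_xstar, var_eta, var_eps):
    """Covariance matrix of the observed pair (x, y) implied by the model
    y = beta * x_star + eps,  x = x_star + eta, with independent zero-mean
    normal components."""
    var_x = var_xstar + var_eta
    var_y = beta**2 * var_xstar + var_eps
    cov_xy = beta * var_xstar
    return np.array([[var_x, cov_xy],
                     [cov_xy, var_y]])

# Two different parameter vectors ...
cov_a = implied_cov(beta=2.0, var_xstar=1.0, var_eta=1.0, var_eps=1.0)
cov_b = implied_cov(beta=4/3, var_xstar=1.5, var_eta=0.5, var_eps=7/3)

# ... imply the same joint normal distribution of (x, y):
print(np.allclose(cov_a, cov_b))   # True -> the parameters are not identifiable
```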