The sensitivity index or discriminability index or detectability index is a dimensionless statistic used in signal detection theory. A higher index indicates that the signal can be more readily detected.
The discriminability index is the separation between the means of two distributions (typically the signal and the noise distributions), in units of the standard deviation.
For two univariate distributions a and b with the same standard deviation, it is denoted by d' ('dee-prime'):

d' = \frac{\left\vert \mu_a - \mu_b \right\vert}{\sigma}.
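For example, two univariate normal distributions with \mu_a = 5, \mu_b = 3 and a common \sigma = 1 have d' = \left\vert 5 - 3 \right\vert / 1 = 2: the means sit two standard deviations apart.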
In higher dimensions, i.e. with two multivariate distributions with the same variance-covariance matrix \Sigma (whose symmetric square-root, the standard deviation matrix, is S), this generalizes to the Mahalanobis distance between the two distributions:

d' = \sqrt{(\boldsymbol{\mu}_a - \boldsymbol{\mu}_b)' \Sigma^{-1} (\boldsymbol{\mu}_a - \boldsymbol{\mu}_b)} = \lVert S^{-1}(\boldsymbol{\mu}_a - \boldsymbol{\mu}_b) \rVert = \lVert \boldsymbol{\mu}_a - \boldsymbol{\mu}_b \rVert / \sigma_{\boldsymbol{\mu}},

where \sigma_{\boldsymbol{\mu}} = 1/\lVert S^{-1} \boldsymbol{\mu} \rVert is the standard deviation along the unit vector \boldsymbol{\mu} through the means, i.e. d' equals the d' along the one-dimensional slice through the means.
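As an illustration, here is a minimal Python sketch (all numerical values are hypothetical, not from the source) showing that two of the equivalent expressions above agree:

```python
import numpy as np
from scipy.linalg import sqrtm

# Hypothetical example: two bivariate normals with a common covariance matrix.
mu_a = np.array([2.0, 1.0])
mu_b = np.array([0.0, 0.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

d = mu_a - mu_b

# Form 1: Mahalanobis distance via the inverse covariance matrix.
d1 = np.sqrt(d @ np.linalg.inv(Sigma) @ d)

# Form 2: norm of S^{-1}(mu_a - mu_b), where S is the sd matrix (matrix sqrt of Sigma).
S = np.real(sqrtm(Sigma))
d2 = np.linalg.norm(np.linalg.solve(S, d))

print(d1, d2)  # identical up to numerical error
```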
For two bivariate distributions with equal variance-covariance, this is given by:

{d'}^2 = \frac{1}{1-\rho^2}\left({d'}_x^2 + {d'}_y^2 - 2\rho\,{d'}_x {d'}_y\right),
where \rho is the correlation coefficient, and here d'_x = \frac{\mu_{bx}-\mu_{ax}}{\sigma_x} and d'_y = \frac{\mu_{by}-\mu_{ay}}{\sigma_y}, i.e. including the signs of the mean differences instead of their absolute values.
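As a consistency check (a short derivation, not part of the source text): inverting the bivariate covariance matrix gives

\Sigma^{-1} = \frac{1}{1-\rho^2} \begin{bmatrix} 1/\sigma_x^2 & -\rho/(\sigma_x\sigma_y) \\ -\rho/(\sigma_x\sigma_y) & 1/\sigma_y^2 \end{bmatrix},

and substituting this into the Mahalanobis form {d'}^2 = (\boldsymbol{\mu}_a-\boldsymbol{\mu}_b)'\Sigma^{-1}(\boldsymbol{\mu}_a-\boldsymbol{\mu}_b) reproduces the bivariate expression above term by term.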
d' is also commonly estimated as d' = Z(\text{hit rate}) - Z(\text{false alarm rate}), where the function Z(p), p \in [0,1], is the inverse of the cumulative distribution function of the standard normal.
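In practice this estimate is a single line of code. A minimal Python sketch, with made-up hit and false-alarm rates (scipy's norm.ppf plays the role of Z):

```python
from scipy.stats import norm

hit_rate = 0.85          # hypothetical observed rates
false_alarm_rate = 0.20

# d' = Z(hit rate) - Z(false alarm rate)
d_prime = norm.ppf(hit_rate) - norm.ppf(false_alarm_rate)
print(d_prime)  # ~1.88 for these rates
```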
When the two distributions have different standard deviations (or in general dimensions, different covariance matrices), there exist several contending indices, all of which reduce to d' when the variances (covariances) are equal.
This is the maximum (Bayes-optimal) discriminability index for two distributions, based on the amount of their overlap, i.e. the optimal (Bayes) error of classification e_b by the ideal observer, or its complement, the optimal accuracy a_b:

d'_b = -2Z\left(\text{Bayes error rate } e_b\right) = 2Z\left(\text{best accuracy rate } a_b\right),
where Z is the inverse cumulative normal distribution function defined above. Since d'_b depends only on the overlap of the two distributions, it is symmetric: d'_b(a,b) = d'_b(b,a). This distinguishes it from asymmetric divergence measures such as the Kullback-Leibler divergence, for which D_KL(a,b) and D_KL(b,a) generally differ. In general, d'_b has no closed form, but it can be computed numerically for normal distributions.
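For two univariate normals, the Bayes error is half the overlap area \int \min(p_a, p_b)\,dx (assuming equal priors), which can be evaluated by quadrature. A minimal Python sketch; the function name and example values are illustrative only:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def dprime_bayes(mu_a, sd_a, mu_b, sd_b):
    # Bayes error with equal priors: half the overlap area of the two densities.
    overlap, _ = quad(lambda x: np.minimum(norm.pdf(x, mu_a, sd_a),
                                           norm.pdf(x, mu_b, sd_b)),
                      -np.inf, np.inf)
    e_b = overlap / 2          # error rate of the ideal observer
    return -2 * norm.ppf(e_b)  # d'_b = -2 Z(e_b)

print(dprime_bayes(0, 1, 2, 1))  # equal sds: recovers the plain d' = 2
```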
In particular, for a yes/no task between two univariate normal distributions with means \mu_a, \mu_b and variances v_a > v_b, the best classification accuracies are:

p(A|a) = p\left({\chi'}^2_{1,\, v_a \lambda} > v_b c\right), \quad p(B|b) = p\left({\chi'}^2_{1,\, v_b \lambda} < v_a c\right),

where {\chi'}^2 denotes the noncentral chi-squared distribution, \lambda = \left(\frac{\mu_a-\mu_b}{v_a-v_b}\right)^2, and c = \lambda + \frac{\ln v_a - \ln v_b}{v_a - v_b}.
The Bayes discriminability is then

d'_b = 2Z\left(\frac{p(A|a)+p(B|b)}{2}\right).
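These noncentral chi-squared expressions translate directly into scipy (ncx2). A sketch assuming equal priors, with hypothetical parameter values; it can be cross-checked against the quadrature sketch above:

```python
import numpy as np
from scipy.stats import ncx2, norm

def dprime_bayes_yes_no(mu_a, v_a, mu_b, v_b):
    # Requires v_a > v_b, as in the formulas above.
    lam = ((mu_a - mu_b) / (v_a - v_b)) ** 2
    c = lam + (np.log(v_a) - np.log(v_b)) / (v_a - v_b)
    p_A_a = ncx2.sf(v_b * c, df=1, nc=v_a * lam)   # p(A|a)
    p_B_b = ncx2.cdf(v_a * c, df=1, nc=v_b * lam)  # p(B|b)
    a_b = (p_A_a + p_B_b) / 2                      # best accuracy, equal priors
    return 2 * norm.ppf(a_b)

print(dprime_bayes_yes_no(0.0, 4.0, 1.0, 1.0))  # ~1.02 for these values
```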
For a two-interval task between these distributions, the optimal accuracy is

a_b = p\left(\tilde{\chi}^2_{\boldsymbol{w},\boldsymbol{k},\boldsymbol{\lambda},0,0} > 0\right),

where \tilde{\chi}^2 denotes the generalized chi-squared distribution, and (writing the two distributions with signal and noise subscripts s and n)

\boldsymbol{w} = \begin{bmatrix} \sigma_s^2 & -\sigma_n^2 \end{bmatrix}, \quad \boldsymbol{k} = \begin{bmatrix} 1 & 1 \end{bmatrix}, \quad \boldsymbol{\lambda} = \left(\frac{\mu_s-\mu_n}{\sigma_s^2-\sigma_n^2}\right)^2 \begin{bmatrix} \sigma_s^2 & \sigma_n^2 \end{bmatrix}.
The discriminability is then d'_b = 2Z\left(a_b\right).
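scipy has no generalized chi-squared distribution, but the optimal two-interval accuracy can be cross-checked by Monte Carlo simulation of the ideal observer, which picks whichever interval ordering has the higher joint likelihood. A sketch with hypothetical values:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
mu_s, sigma_s, mu_n, sigma_n = 1.0, 2.0, 0.0, 1.0
n = 10**6

# Interval 1 holds the signal, interval 2 the noise (the other order is symmetric).
x1 = rng.normal(mu_s, sigma_s, n)
x2 = rng.normal(mu_n, sigma_n, n)

# Ideal observer: compare the joint likelihoods of the two possible orderings.
# Their log difference is exactly the quadratic form underlying the
# generalized chi-squared expression above.
ll_sn = norm.logpdf(x1, mu_s, sigma_s) + norm.logpdf(x2, mu_n, sigma_n)
ll_ns = norm.logpdf(x1, mu_n, sigma_n) + norm.logpdf(x2, mu_s, sigma_s)

a_b = np.mean(ll_sn > ll_ns)   # optimal two-interval accuracy
print(a_b, 2 * norm.ppf(a_b))  # a_b and d'_b
```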
A common approximate (i.e. sub-optimal) discriminability index that has a closed form is to take the average of the variances, i.e. the rms of the two standard deviations:

d'_a = \left\vert \mu_a - \mu_b \right\vert / \sigma_\mathrm{rms}.

This index is also denoted d_a, and equals \sqrt{2} times the z-score of the area under the receiver operating characteristic curve. It extends to general dimensions as the Mahalanobis distance using the pooled covariance, i.e. with

S_\mathrm{rms} = \left[\left(\Sigma_a+\Sigma_b\right)/2\right]^{1/2}

as the common sd matrix.
Another index is d'_e = \left\vert \mu_a - \mu_b \right\vert / \sigma_\mathrm{avg}, based on the average of the two standard deviations, which extends to general dimensions using the average sd matrix S_\mathrm{avg} = \left(S_a+S_b\right)/2.
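A sketch of both approximate indices for multivariate data (example values hypothetical; sqrtm computes the sd matrices):

```python
import numpy as np
from scipy.linalg import sqrtm

mu_a, mu_b = np.array([2.0, 1.0]), np.array([0.0, 0.0])
Sigma_a = np.array([[1.0, 0.3], [0.3, 1.0]])
Sigma_b = np.array([[2.0, 0.0], [0.0, 0.5]])

d = mu_a - mu_b

# d'_a: Mahalanobis distance using the pooled (rms) covariance.
S_rms = np.real(sqrtm((Sigma_a + Sigma_b) / 2))
d_a = np.linalg.norm(np.linalg.solve(S_rms, d))

# d'_e: same distance using the average of the two sd matrices.
S_avg = (np.real(sqrtm(Sigma_a)) + np.real(sqrtm(Sigma_b))) / 2
d_e = np.linalg.norm(np.linalg.solve(S_avg, d))

print(d_a, d_e)  # d_a <= d_e
```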
It has been shown that for two univariate normal distributions, d'_a \leq d'_e \leq d'_b, and for two multivariate normal distributions, d'_a \leq d'_e still holds.
Thus, d'_a and d'_e underestimate the maximum discriminability d'_b of univariate normal distributions: d'_a can underestimate d'_b by a maximum of approximately 30%. For multivariate normal distributions with unequal covariance matrices, however, d'_e can exceed d'_b, so d'_a and d'_e should be treated as approximations to d'_b rather than as bounds on it.
The approximate index d'_{gm}, which uses the geometric mean of the two standard deviations, is less than d'_b at small discriminability, but greater at large discriminability.
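Putting the univariate indices side by side (a numerical illustration with hypothetical values, reusing the quadrature idea above) makes the ordering concrete:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

mu_a, sd_a, mu_b, sd_b = 0.0, 2.0, 2.0, 0.5

d_a = abs(mu_a - mu_b) / np.sqrt((sd_a**2 + sd_b**2) / 2)  # rms sd
d_e = abs(mu_a - mu_b) / ((sd_a + sd_b) / 2)               # average sd
d_gm = abs(mu_a - mu_b) / np.sqrt(sd_a * sd_b)             # geometric-mean sd

# Bayes discriminability via the overlap area (equal priors).
overlap, _ = quad(lambda x: np.minimum(norm.pdf(x, mu_a, sd_a),
                                       norm.pdf(x, mu_b, sd_b)),
                  -np.inf, np.inf)
d_b = -2 * norm.ppf(overlap / 2)

print(d_a, d_e, d_b)  # expect d_a <= d_e <= d_b here
print(d_gm)           # d_gm < d_b at this modest discriminability
```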
In general, the contribution to the total discriminability by each dimension or feature may be measured using the amount by which the discriminability drops when that dimension is removed. If the total Bayes discriminability is
d', and the Bayes discriminability with dimension i removed is d'_{-i}, then the contribution of dimension i is \sqrt{d'^2 - {d'}_{-i}^2}.
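A sketch of this feature-contribution measure in the equal-covariance case, where the Bayes discriminability reduces to the Mahalanobis d' computed earlier (example values hypothetical):

```python
import numpy as np

def mahalanobis_dprime(mu_a, mu_b, Sigma):
    d = mu_a - mu_b
    return np.sqrt(d @ np.linalg.inv(Sigma) @ d)

mu_a, mu_b = np.array([2.0, 1.0, 0.5]), np.zeros(3)
Sigma = np.array([[1.0, 0.2, 0.0],
                  [0.2, 1.5, 0.3],
                  [0.0, 0.3, 2.0]])

d_total = mahalanobis_dprime(mu_a, mu_b, Sigma)
for i in range(len(mu_a)):
    keep = [j for j in range(len(mu_a)) if j != i]  # marginalize out dimension i
    d_minus_i = mahalanobis_dprime(mu_a[keep], mu_b[keep], Sigma[np.ix_(keep, keep)])
    print(i, np.sqrt(d_total**2 - d_minus_i**2))    # contribution of dimension i
```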
We may sometimes want to scale the discriminability of two data distributions by moving them closer or farther apart. One such case is when we are modeling a detection or classification task, and the model performance exceeds that of the subject or observed data. In that case, we can move the model variable distributions closer together so that it matches the observed performance, while also predicting which specific data points should start overlapping and be misclassified.
There are several ways of doing this. One is to compute the mean vector and covariance matrix of the two distributions, then effect a linear transformation to interpolate the mean and sd matrix (square root of the covariance matrix) of one of the distributions towards the other.
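A possible sketch of this first approach (the moment-interpolation scheme below is one reasonable reading of the text, with the interpolation fraction t and the function name as assumptions):

```python
import numpy as np
from scipy.linalg import sqrtm

def shrink_towards(X_a, mu_a, Sigma_a, mu_b, Sigma_b, t):
    """Map samples X_a of distribution a so that its mean and sd matrix
    move a fraction t of the way towards those of distribution b."""
    S_a = np.real(sqrtm(Sigma_a))
    S_t = (1 - t) * S_a + t * np.real(sqrtm(Sigma_b))  # interpolated sd matrix
    mu_t = (1 - t) * mu_a + t * mu_b                   # interpolated mean
    A = S_t @ np.linalg.inv(S_a)                       # linear map S_a -> S_t
    return mu_t + (X_a - mu_a) @ A.T
```

Choosing t so that the transformed distributions reproduce the observed performance then also pins down which specific points fall on the wrong side of the decision boundary.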
Another way is to compute the decision variables of the data points (the log likelihood ratio that a point belongs to one distribution vs the other) under a multinormal model, then move these decision variables closer together or farther apart.
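A matching sketch of this second approach (assumptions: equal priors, and a symmetric shift of the two groups of decision variables toward each other by a fraction t); points whose decision variable crosses zero are the predicted new misclassifications:

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

def shift_decision_variables(X_a, X_b, mu_a, Sigma_a, mu_b, Sigma_b, t):
    # Decision variable: log likelihood ratio of belonging to a vs b.
    llr = lambda X: mvn.logpdf(X, mu_a, Sigma_a) - mvn.logpdf(X, mu_b, Sigma_b)
    llr_a, llr_b = llr(X_a), llr(X_b)
    # Move the two groups of decision variables a fraction t closer together.
    shift = t * (llr_a.mean() - llr_b.mean()) / 2
    llr_a, llr_b = llr_a - shift, llr_b + shift
    # Predicted misclassifications: decision variables now on the wrong side of 0.
    return llr_a < 0, llr_b > 0
```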