Bhattacharyya distance
In statistics, the Bhattacharyya distance is a quantity which represents a notion of similarity between two probability distributions.[1] It is closely related to the Bhattacharyya coefficient, which is a measure of the amount of overlap between two statistical samples or populations.
It is not a metric, despite being named a "distance", since it does not obey the triangle inequality.
History
Both the Bhattacharyya distance and the Bhattacharyya coefficient are named after Anil Kumar Bhattacharyya, a statistician who worked in the 1930s at the Indian Statistical Institute.[2] He developed the measure through a series of papers.[3][4] He first devised a method to measure the distance between two non-normal distributions and illustrated it with the classical multinomial populations;[5] although submitted for publication in 1941, this work appeared almost five years later in Sankhyā. Bhattacharyya subsequently worked toward a distance measure for probability distributions that are absolutely continuous with respect to the Lebesgue measure. He published his progress in 1942 in the Proceedings of the Indian Science Congress, and the final work appeared in 1943 in the Bulletin of the Calcutta Mathematical Society.
Definition
For probability distributions $P$ and $Q$ on the same domain $\mathcal{X}$, the Bhattacharyya distance is defined as
$$D_B(P,Q) = -\ln\left(BC(P,Q)\right)$$
where
$$BC(P,Q) = \sum_{x\in\mathcal{X}} \sqrt{P(x)Q(x)}$$
is the Bhattacharyya coefficient for discrete probability distributions.
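As a minimal sketch of the discrete case, the coefficient is the sum over the shared domain of the square roots of the products of the two probabilities, and the distance is its negative logarithm. The function names and the example vectors below are illustrative choices, not part of any library.

```python
import math

def bhattacharyya_coefficient(p, q):
    """Sum of sqrt(p_i * q_i) over a shared, finite domain."""
    if len(p) != len(q):
        raise ValueError("p and q must be defined on the same domain")
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

def bhattacharyya_distance(p, q):
    """D_B = -ln(BC); infinite when the supports are disjoint (BC = 0)."""
    bc = bhattacharyya_coefficient(p, q)
    return math.inf if bc == 0.0 else -math.log(bc)

# Two distributions that overlap only on the first outcome.
p = [0.5, 0.5, 0.0]
q = [0.5, 0.0, 0.5]
print(bhattacharyya_coefficient(p, q))  # 0.5
print(bhattacharyya_distance(p, q))     # ln 2, about 0.693
```

Returning infinity for disjoint supports is a design choice of this sketch; it matches the convention that the distance ranges over the extended non-negative reals.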
For continuous probability distributions, with $P(dx) = p(x)\,dx$ and $Q(dx) = q(x)\,dx$, where $p(x)$ and $q(x)$ are the probability density functions, the Bhattacharyya coefficient is defined as
$$BC(P,Q) = \int_{\mathcal{X}} \sqrt{p(x)q(x)}\, dx.$$
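For densities without a convenient closed form, the integral of $\sqrt{p(x)q(x)}$ can be approximated numerically. The sketch below uses a plain midpoint rule on a truncated interval; the helper names, the truncation at 40 and the choice of two exponential densities are assumptions made for illustration. For exponential densities with rates $a$ and $b$ the integral works out to $2\sqrt{ab}/(a+b)$, which gives a value to compare against.

```python
import math

def bc_continuous(p, q, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of sqrt(p(x) q(x)) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += math.sqrt(p(x) * q(x))
    return total * h

def exp_pdf(rate):
    """Density of an exponential distribution on [0, infinity)."""
    return lambda x: rate * math.exp(-rate * x)

# Truncating at 40 is an assumption; the integrand is negligible beyond that point.
bc = bc_continuous(exp_pdf(1.0), exp_pdf(2.0), 0.0, 40.0)
exact = 2.0 * math.sqrt(1.0 * 2.0) / (1.0 + 2.0)
print(bc, exact)  # both about 0.9428
```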
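More generally, given two probability measures $P, Q$ on a measurable space $(\mathcal{X}, \mathcal{B})$, let $\lambda$ be a (sigma-finite) measure such that $P$ and $Q$ are absolutely continuous with respect to $\lambda$, i.e. such that $P(dx) = p(x)\lambda(dx)$ and $Q(dx) = q(x)\lambda(dx)$ for probability density functions $p, q$ with respect to $\lambda$, defined $\lambda$-almost everywhere. Such a measure, even such a probability measure, always exists, e.g. $\lambda = \tfrac{1}{2}(P + Q)$. Then define the Bhattacharyya measure on $(\mathcal{X}, \mathcal{B})$ by
$$bc(dx|P,Q) = \sqrt{p(x)q(x)}\,\lambda(dx) = \sqrt{\frac{P(dx)}{\lambda(dx)}(x)\,\frac{Q(dx)}{\lambda(dx)}(x)}\,\lambda(dx).$$
It does not depend on the measure $\lambda$. Given another choice $\lambda'$, with densities $p', q'$, pick a measure $\mu$ such that both $\lambda$ and $\lambda'$ are absolutely continuous with respect to $\mu$, i.e. $\lambda = l(x)\mu$ and $\lambda' = l'(x)\mu$. Then
$$P(dx) = p(x)\lambda(dx) = p'(x)\lambda'(dx) = p(x)l(x)\mu(dx) = p'(x)l'(x)\mu(dx),$$
and similarly for $Q$. We then have
$$bc(dx|P,Q) = \sqrt{p(x)q(x)}\,\lambda(dx) = \sqrt{p(x)q(x)}\,l(x)\mu(dx) = \sqrt{p(x)l(x)\,q(x)l(x)}\,\mu(dx) = \sqrt{p'(x)l'(x)\,q'(x)l'(x)}\,\mu(dx) = \sqrt{p'(x)q'(x)}\,\lambda'(dx).$$
We finally define the Bhattacharyya coefficient
$$BC(P,Q) = \int_{\mathcal{X}} bc(dx|P,Q) = \int_{\mathcal{X}} \sqrt{p(x)q(x)}\,\lambda(dx).$$
By the above, the quantity $BC(P,Q)$ does not depend on $\lambda$, and by the Cauchy inequality $0 \le BC(P,Q) \le 1$. In particular, if $P(dx) = p(x)Q(dx)$ is absolutely continuous with respect to $Q$ with Radon-Nikodym derivative $p(x) = \frac{P(dx)}{Q(dx)}(x)$, then
$$BC(P,Q) = \int_{\mathcal{X}} \sqrt{p(x)}\,Q(dx) = \int_{\mathcal{X}} \sqrt{\frac{P(dx)}{Q(dx)}}\,Q(dx) = E_Q\left[\sqrt{\frac{P(dx)}{Q(dx)}}\right].$$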
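The independence from the reference measure can be checked on a small finite example. The sketch below computes the coefficient three ways: with the counting measure, with the mixture $(P+Q)/2$ as reference measure, and as the expectation under $Q$ of the square root of the Radon-Nikodym derivative of $P$ with respect to $Q$. The two distributions are arbitrary illustrative values, chosen so that $Q$ is positive wherever $P$ is.

```python
import math

P = [0.5, 0.5, 0.0]
Q = [0.25, 0.25, 0.5]   # Q(x) > 0 wherever P(x) > 0, so P is absolutely continuous w.r.t. Q

# Reference measure 1: counting measure; the densities are the probabilities themselves.
bc_counting = sum(math.sqrt(p * q) for p, q in zip(P, Q))

# Reference measure 2: lam = (P + Q) / 2; the densities are P/lam and Q/lam.
lam = [(p + q) / 2 for p, q in zip(P, Q)]
bc_mixture = sum(math.sqrt((p / l) * (q / l)) * l for p, q, l in zip(P, Q, lam))

# Reference measure 3: Q itself, which gives E_Q[sqrt(dP/dQ)].
bc_expectation = sum(q * math.sqrt(p / q) for p, q in zip(P, Q))

print(bc_counting, bc_mixture, bc_expectation)  # all about 0.7071
```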
Gaussian case
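Let $p \sim \mathcal{N}(\mu_p, \sigma_p^2)$, $q \sim \mathcal{N}(\mu_q, \sigma_q^2)$, where $\mathcal{N}(\mu, \sigma^2)$ is the normal distribution with mean $\mu$ and variance $\sigma^2$; then
$$D_B(p,q) = \frac{1}{4}\frac{(\mu_p - \mu_q)^2}{\sigma_p^2 + \sigma_q^2} + \frac{1}{2}\ln\left(\frac{\sigma_p^2 + \sigma_q^2}{2\sigma_p\sigma_q}\right).$$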
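The closed form for two univariate normal distributions, a mean term of one quarter of the squared mean difference over the summed variances plus a variance term of half the log of the summed variances over twice the product of the standard deviations, translates directly into code. The function name and argument order below are hypothetical.

```python
import math

def bhattacharyya_gaussian_1d(mu_p, var_p, mu_q, var_q):
    """Bhattacharyya distance between N(mu_p, var_p) and N(mu_q, var_q)."""
    s = var_p + var_q
    mean_term = 0.25 * (mu_p - mu_q) ** 2 / s
    var_term = 0.5 * math.log(s / (2.0 * math.sqrt(var_p * var_q)))
    return mean_term + var_term

print(bhattacharyya_gaussian_1d(0.0, 1.0, 2.0, 1.0))  # 0.5: only the mean term contributes
print(bhattacharyya_gaussian_1d(0.0, 1.0, 0.0, 4.0))  # 0.5 * ln(1.25): only the variance term contributes
```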
And in general, given two multivariate normal distributions $p_i = \mathcal{N}(\boldsymbol\mu_i, \boldsymbol\Sigma_i)$,
$$D_B(p_1,p_2) = \frac{1}{8}(\boldsymbol\mu_1 - \boldsymbol\mu_2)^T \boldsymbol\Sigma^{-1} (\boldsymbol\mu_1 - \boldsymbol\mu_2) + \frac{1}{2}\ln\left(\frac{\det\boldsymbol\Sigma}{\sqrt{\det\boldsymbol\Sigma_1 \det\boldsymbol\Sigma_2}}\right),$$
where $\boldsymbol\Sigma = \frac{\boldsymbol\Sigma_1 + \boldsymbol\Sigma_2}{2}$.[6] Note that the first term is a squared Mahalanobis distance.
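A sketch of the multivariate formula, restricted to two dimensions so that the determinant and inverse can be written out by hand with the standard library only. The function names are hypothetical, and a general implementation would normally rely on a linear algebra library instead.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def bhattacharyya_gaussian_2d(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two bivariate normal distributions."""
    # Averaged covariance Sigma = (Sigma1 + Sigma2) / 2.
    s = [[(cov1[i][j] + cov2[i][j]) / 2.0 for j in range(2)] for i in range(2)]
    d = det2(s)
    # Explicit inverse of the 2x2 matrix Sigma.
    inv = [[s[1][1] / d, -s[0][1] / d], [-s[1][0] / d, s[0][0] / d]]
    dx, dy = mu1[0] - mu2[0], mu1[1] - mu2[1]
    # Quadratic form (mu1 - mu2)^T Sigma^{-1} (mu1 - mu2).
    quad = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return quad / 8.0 + 0.5 * math.log(d / math.sqrt(det2(cov1) * det2(cov2)))

identity = [[1.0, 0.0], [0.0, 1.0]]
print(bhattacharyya_gaussian_2d([0.0, 0.0], identity, [2.0, 0.0], identity))  # 0.5
```

With identity covariances and means two units apart the result agrees with the univariate formula for unit variances, which is a convenient sanity check.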
Properties
$0 \le BC \le 1$ and $0 \le D_B \le \infty$.
$D_B$ does not obey the triangle inequality, though the Hellinger distance $\sqrt{1 - BC(p,q)}$ does.
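A small numerical counterexample makes the failure of the triangle inequality concrete. The three two-outcome distributions below are illustrative choices; for them the direct Bhattacharyya distance exceeds the sum of the distances through the intermediate distribution, while the Hellinger distance, computed as the square root of one minus the coefficient, behaves as a metric should.

```python
import math

def bc(p, q):
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def d_b(p, q):
    return -math.log(bc(p, q))

def hellinger(p, q):
    # max(...) guards against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(0.0, 1.0 - bc(p, q)))

P = [0.9, 0.1]
M = [0.5, 0.5]   # an intermediate distribution
R = [0.1, 0.9]

# Bhattacharyya: about 0.511 directly versus about 0.223 via M, so the inequality fails.
print(d_b(P, R), d_b(P, M) + d_b(M, R))
# Hellinger: about 0.632 directly versus about 0.650 via M, so the inequality holds.
print(hellinger(P, R), hellinger(P, M) + hellinger(M, R))
```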
Bounds on Bayes Error
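The Bhattacharyya distance can be used to bound the Bayes error rate $L^*$ from above and below:
$$\frac{1}{2} - \frac{1}{2}\sqrt{1 - 4\rho^2} \le L^* \le \rho$$
where $\rho = \mathbb{E}\sqrt{\eta(X)(1-\eta(X))}$ and $\eta(X) = \mathbb{P}(Y=1|X)$ is the posterior probability.[7]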
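The bounds can be checked on a toy problem. In the sketch below, $X$ is assumed uniform on three values and the posterior probabilities are made-up numbers; the Bayes error is the expected value of the smaller of the posterior and its complement, and the quantity $\rho$ is the expected square root of their product.

```python
import math

# Hypothetical toy problem: X is uniform on three values, eta[i] = P(Y = 1 | X = x_i).
prob_x = [1 / 3, 1 / 3, 1 / 3]
eta = [0.1, 0.5, 0.8]

# Bayes error: predicting the more likely label at each x leaves error min(eta, 1 - eta).
bayes_error = sum(w * min(e, 1 - e) for w, e in zip(prob_x, eta))

# rho = E sqrt(eta(X) * (1 - eta(X)))
rho = sum(w * math.sqrt(e * (1 - e)) for w, e in zip(prob_x, eta))

lower = 0.5 - 0.5 * math.sqrt(1 - 4 * rho ** 2)
upper = rho
print(lower, bayes_error, upper)  # roughly 0.2, 0.267, 0.4
```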
Applications
The Bhattacharyya coefficient quantifies the "closeness" of two random statistical samples.
Given two sequences from distributions $P, Q$, bin them into $n$ buckets, and let the frequency of samples from $P$ in bucket $i$ be $p_i$, and similarly $q_i$ for $Q$; then the sample Bhattacharyya coefficient is
$$BC(\mathbf{p}, \mathbf{q}) = \sum_{i=1}^{n} \sqrt{p_i q_i},$$
which is an estimator of $BC(P,Q)$. The quality of estimation depends on the choice of buckets; too few buckets would overestimate $BC(P,Q)$, while too many would underestimate it.
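The histogram estimator and its sensitivity to the number of buckets can be sketched as follows. The function name, the equal-width bucketing on a fixed interval and the two tiny samples are all assumptions made for illustration; real data would call for many more samples.

```python
import math

def sample_bc(xs, ys, n_buckets, lo, hi):
    """Histogram estimate of BC from two samples, using equal-width buckets on [lo, hi)."""
    def frequencies(samples):
        counts = [0] * n_buckets
        for s in samples:
            i = int((s - lo) / (hi - lo) * n_buckets)
            counts[min(max(i, 0), n_buckets - 1)] += 1  # clamp out-of-range values to the end buckets
        return [c / len(samples) for c in counts]
    p, q = frequencies(xs), frequencies(ys)
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

xs = [0.1, 0.2, 0.6, 0.7]
ys = [0.15, 0.65, 0.8, 0.9]

print(sample_bc(xs, ys, 1, 0.0, 1.0))    # 1.0: a single bucket cannot tell the samples apart
print(sample_bc(xs, ys, 2, 0.0, 1.0))    # about 0.966
print(sample_bc(xs, ys, 100, 0.0, 1.0))  # 0.0: no bucket contains points from both samples
```

The two extremes show the effect described in the text: one bucket always yields a coefficient of 1, while buckets so fine that no two samples share one yield 0.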
A common task in classification is estimating the separability of classes. Up to a multiplicative factor, the squared Mahalanobis distance is a special case of the Bhattacharyya distance when the two classes are normally distributed with the same variances. When two classes have similar means but significantly different variances, the Mahalanobis distance would be close to zero, while the Bhattacharyya distance would not be.
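Both statements can be illustrated in one dimension using the closed form for two normal distributions. The helper names are hypothetical, and plugging the averaged variance into the Mahalanobis distance when the variances differ is an assumption of this sketch.

```python
import math

def bhattacharyya_1d(mu1, var1, mu2, var2):
    """Bhattacharyya distance between N(mu1, var1) and N(mu2, var2)."""
    s = var1 + var2
    return 0.25 * (mu1 - mu2) ** 2 / s + 0.5 * math.log(s / (2.0 * math.sqrt(var1 * var2)))

def mahalanobis_sq_1d(mu1, mu2, var):
    """Squared Mahalanobis distance between two means for a given variance."""
    return (mu1 - mu2) ** 2 / var

# Equal variances: D_B is one eighth of the squared Mahalanobis distance.
same_var = bhattacharyya_1d(0.0, 2.0, 3.0, 2.0)
maha = mahalanobis_sq_1d(0.0, 3.0, 2.0)
print(same_var, maha / 8.0)  # 0.5625 and 0.5625

# Equal means, very different variances: Mahalanobis sees no separation, D_B does.
db_diff_var = bhattacharyya_1d(0.0, 1.0, 0.0, 25.0)
maha_diff_var = mahalanobis_sq_1d(0.0, 0.0, (1.0 + 25.0) / 2.0)  # averaged variance (assumption)
print(db_diff_var, maha_diff_var)  # about 0.478 and 0.0
```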
The Bhattacharyya coefficient is used in the construction of polar codes.[8]
The Bhattacharyya distance is used in feature extraction and selection,[9] image processing,[10] speaker recognition,[11] phone clustering,[12] and in genetics.[13]
External links
- Some of the properties of Bhattacharyya Distance
- Nielsen, F.; Boltz, S. (2010). "The Burbea–Rao and Bhattacharyya centroids". IEEE Transactions on Information Theory. 57 (8): 5455–5466.[14]
- Kailath, T. (1967). "The Divergence and Bhattacharyya Distance Measures in Signal Selection". IEEE Transactions on Communication Technology. 15 (1): 52–60.[15]
- Djouadi, A.; Snorrason, O.; Garber, F. (1990). "The quality of Training-Sample estimates of the Bhattacharyya coefficient". IEEE Transactions on Pattern Analysis and Machine Intelligence. 12 (1): 92–97.[16]
Notes and References
- Dodge, Yadolah (2003). The Oxford Dictionary of Statistical Terms. Oxford University Press. ISBN 978-0-19-920613-1.
- Sen, Pranab Kumar (1996). "Anil Kumar Bhattacharyya (1915-1996): A Reverent Remembrance". Calcutta Statistical Association Bulletin. 46 (3–4): 151–158. doi:10.1177/0008068319960301. S2CID 164326977.
- Bhattacharyya, A. (1942). "On discrimination and divergence". Proceedings of the Indian Science Congress. Asiatic Society of Bengal.
- Bhattacharyya, A. (March 1943). "On a measure of divergence between two statistical populations defined by their probability distributions". Bulletin of the Calcutta Mathematical Society. 35: 99–109. MR 0010358.
- Bhattacharyya, A. (1946). "On a Measure of Divergence between Two Multinomial Populations". Sankhyā. 7 (4): 401–406. JSTOR 25047882.
- Kashyap, Ravi (2019). "The Perfect Marriage and Much More: Combining Dimension Reduction, Distance Measures and Covariance". Physica A: Statistical Mechanics and its Applications. 536: 120938. arXiv:1603.09060. doi:10.1016/j.physa.2019.04.174.
- Devroye, L.; Gyorfi, L.; Lugosi, G. (1997). "A Probabilistic Theory of Pattern Recognition". Discrete Appl Math. 73: 192–194.
- Arıkan, Erdal (July 2009). "Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels". IEEE Transactions on Information Theory. 55 (7): 3051–3073. arXiv:0807.3917. doi:10.1109/TIT.2009.2021379. S2CID 889822.
- Choi, Euisun; Lee, Chulhee (August 2003). "Feature extraction based on the Bhattacharyya distance". Pattern Recognition. 36 (8): 1703–1709.
- Goudail, François; Réfrégier, Philippe; Delyon, Guillaume (2004). "Bhattacharyya distance as a contrast parameter for statistical processing of noisy optical images". JOSA A. 21 (7): 1231–1240.
- You, Chang Huai. "An SVM Kernel With GMM-Supervector Based on the Bhattacharyya Distance for Speaker Recognition". IEEE Signal Processing Letters. 16 (1): 49–52.
- Mak, B. (1996). "Phone clustering using the Bhattacharyya distance". Proceedings of the Fourth International Conference on Spoken Language (ICSLP 96), 3–6 October 1996. 4: 2005–2008.
- Chattopadhyay, Aparna; Chattopadhyay, Asis Kumar; B-Rao, Chandrika (2004-06-01). "Bhattacharyya's distance measure as a precursor of genetic distance measures". Journal of Biosciences. 29 (2): 135–138. doi:10.1007/BF02703410. ISSN 0973-7138.
- Nielsen, Frank; Boltz, Sylvain (2011). "The Burbea-Rao and Bhattacharyya Centroids". IEEE Transactions on Information Theory. 57 (8): 5455–5466. arXiv:1004.5049. doi:10.1109/TIT.2011.2159046. ISSN 0018-9448. S2CID 14238708.
- Kailath, T. (1967). "The Divergence and Bhattacharyya Distance Measures in Signal Selection". IEEE Transactions on Communications. 15 (1): 52–60. doi:10.1109/TCOM.1967.1089532. ISSN 0096-2244.
- Djouadi, A.; Snorrason, O.; Garber, F.D. (1990). "The quality of training sample estimates of the Bhattacharyya coefficient". IEEE Transactions on Pattern Analysis and Machine Intelligence. 12 (1): 92–97. doi:10.1109/34.41388.