In statistics, the RV coefficient[1] is a multivariate generalization of the squared Pearson correlation coefficient (because the RV coefficient takes values between 0 and 1).[2] It measures the closeness of two sets of points that may each be represented in a matrix.
The major approaches within statistical multivariate data analysis can all be brought into a common framework in which the RV coefficient is maximised subject to relevant constraints. Specifically, these statistical methodologies include:
One application of the RV coefficient is in functional neuroimaging, where it can measure the similarity between two subjects' series of brain scans[3] or between different scans of the same subject.[4]
The definition of the RV-coefficient makes use of ideas[5] concerning the definition of scalar-valued quantities which are called the "variance" and "covariance" of vector-valued random variables. Note that standard usage is to have matrices for the variances and covariances of vector random variables. Given these innovative definitions, the RV-coefficient is then just the correlation coefficient defined in the usual way.
Suppose that X and Y are matrices of centered random vectors (column vectors) with covariance matrix given by
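\Sigma_{XY}=\operatorname{E}(XY^\top),
then the scalar-valued covariance (denoted by COVV) is defined by
\operatorname{COVV}(X,Y)=\operatorname{Tr}(\Sigma_{XY}\Sigma_{YX}).
The scalar-valued variance (denoted by VAV) is defined correspondingly:
\operatorname{VAV}(X)=\operatorname{Tr}(\Sigma_{XX}^2).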
Then the RV-coefficient is defined by
\operatorname{RV}(X,Y)=\frac{\operatorname{COVV}(X,Y)}{\sqrt{\operatorname{VAV}(X)\operatorname{VAV}(Y)}}.
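The definition can be illustrated with a sample estimate. The following is a minimal sketch, not a reference implementation. It assumes NumPy and that the rows of each data matrix are paired observations, so that the population covariances are replaced by sample covariances. The function name `rv_coefficient` is hypothetical.

```python
import numpy as np

def rv_coefficient(X, Y):
    """Sample RV coefficient (illustrative sketch).

    Assumption: X is n x p and Y is n x q, with row i of X paired with
    row i of Y. Population covariances are replaced by sample covariances.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]
    # Center each column, matching the "centered random vectors" assumption.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)
    covv = np.trace(Sxy @ Sxy.T)   # COVV(X, Y) = Tr(Sigma_XY Sigma_YX)
    vav_x = np.trace(Sxx @ Sxx)    # VAV(X) = Tr(Sigma_XX^2)
    vav_y = np.trace(Syy @ Syy)    # VAV(Y) = Tr(Sigma_YY^2)
    return covv / np.sqrt(vav_x * vav_y)

# Usage with synthetic data (values depend on the random draw).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
Y = rng.normal(size=(50, 4))
print(rv_coefficient(X, Y))
print(rv_coefficient(X, X))  # equals 1 up to floating-point error
```

Because the ratio only involves traces of products of covariance blocks, the value is unchanged when either data set is rescaled by a constant or rotated by an orthogonal matrix.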
Even though the coefficient takes values between 0 and 1 by construction, it seldom attains values close to 1, because the denominator is often too large relative to the maximal attainable value of the numerator.[6]
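Given known diagonal blocks $\Sigma_{XX}$ and $\Sigma_{YY}$ of dimensions $p \times p$ and $q \times q$ respectively, and assuming without loss of generality that $p \leq q$, the maximal attainable value of the numerator $\operatorname{Tr}(\Sigma_{XY}\Sigma_{YX})$ is $\operatorname{Tr}(\Lambda_X \Pi \Lambda_Y)$, where $\Lambda_X$ (resp. $\Lambda_Y$) denotes the diagonal matrix of the eigenvalues of $\Sigma_{XX}$ (resp. $\Sigma_{YY}$) sorted in decreasing order, and $\Pi$ is the $p \times q$ matrix $(I_p \; 0_{p \times (q-p)})$.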
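A small numerical illustration of this bound may help. The sketch below assumes NumPy and uses made-up diagonal blocks with p = 2 and q = 3. The helper name `max_numerator` is hypothetical. It compares the bound with the denominator of the ordinary RV coefficient.

```python
import numpy as np

def max_numerator(Sxx, Syy):
    """Bound Tr(Lambda_X Pi Lambda_Y), computed from sorted eigenvalues (sketch)."""
    # eigvalsh returns eigenvalues in ascending order; reverse to get decreasing.
    lam_x = np.linalg.eigvalsh(Sxx)[::-1]
    lam_y = np.linalg.eigvalsh(Syy)[::-1]
    m = min(lam_x.size, lam_y.size)
    return float(np.sum(lam_x[:m] * lam_y[:m]))

# Illustrative (made-up) diagonal blocks with p = 2 and q = 3.
Sxx = np.diag([3.0, 1.0])
Syy = np.diag([4.0, 2.0, 1.0])

# The same quantity written with explicit matrices: Pi = (I_p  0_{p x (q-p)}).
Pi = np.hstack([np.eye(2), np.zeros((2, 1))])
via_matrices = np.trace(Sxx @ Pi @ Syy)  # sum of the (j, j) entries, j = 1..p

bound = max_numerator(Sxx, Syy)          # 3*4 + 1*2 = 14
rv_denominator = np.sqrt(np.trace(Sxx @ Sxx) * np.trace(Syy @ Syy))  # sqrt(10 * 21)

print(bound, via_matrices, rv_denominator)
```

In this example the bound is 14, whereas the RV denominator is the square root of 210, roughly 14.49. No choice of cross-covariance can therefore bring the ordinary RV coefficient up to 1 for these blocks.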
In light of this, Mordant and Segers[7] proposed an adjusted version of the RV coefficient in which the denominator is the maximal value attainable by the numerator. It reads
\bar{\operatorname{RV}}(X,Y)=\frac{\operatorname{Tr}(\Sigma_{XY}\Sigma_{YX})}{\operatorname{Tr}(\Lambda_X\Pi\Lambda_Y)}=\frac{\operatorname{Tr}(\Sigma_{XY}\Sigma_{YX})}{\sum_{j=1}^{\min(p,q)}(\Lambda_X)_{j,j}(\Lambda_Y)_{j,j}}.
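The adjusted coefficient can be estimated from data in the same way as the ordinary one. The following is a hedged sketch only. It assumes NumPy, paired observations in the rows of each data matrix, and sample covariances in place of population covariances. The function name `adjusted_rv` is hypothetical and is not taken from the cited work.

```python
import numpy as np

def adjusted_rv(X, Y):
    """Sample version of the adjusted RV coefficient (illustrative sketch).

    Assumption: X is n x p and Y is n x q with paired rows; sample
    covariances stand in for the population covariances.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)
    numerator = np.trace(Sxy @ Sxy.T)        # Tr(Sigma_XY Sigma_YX)
    # Eigenvalues in decreasing order (eigvalsh returns them ascending).
    lam_x = np.linalg.eigvalsh(Sxx)[::-1]
    lam_y = np.linalg.eigvalsh(Syy)[::-1]
    m = min(lam_x.size, lam_y.size)
    denominator = np.sum(lam_x[:m] * lam_y[:m])  # sum_j (Lambda_X)_jj (Lambda_Y)_jj
    return numerator / denominator

# Usage with synthetic data (values depend on the random draw).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
Y = X[:, :2] + 0.5 * rng.normal(size=(50, 2))
print(adjusted_rv(X, Y))
```

By the Cauchy-Schwarz inequality the adjusted denominator is never larger than the denominator of the ordinary RV coefficient, so the adjusted value is at least as large as the ordinary one.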