In statistics, Tschuprow's T is a measure of association between two nominal variables, giving a value between 0 and 1 (inclusive). It is closely related to Cramér's V, coinciding with it for square contingency tables. It was published by Alexander Tschuprow (alternative spelling: Chuprov) in 1939.[1]
For an r × c contingency table with r rows and c columns, let
$\pi_{ij}$ be the proportion of the population in cell $(i,j)$, and let

$$\pi_{i+} = \sum_{j=1}^{c} \pi_{ij} \qquad \text{and} \qquad \pi_{+j} = \sum_{i=1}^{r} \pi_{ij}.$$
Then the mean square contingency is given as
$$\phi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{\left(\pi_{ij} - \pi_{i+}\pi_{+j}\right)^2}{\pi_{i+}\pi_{+j}},$$
and Tschuprow's T as
$$T = \sqrt{\frac{\phi^2}{\sqrt{(r-1)(c-1)}}}.$$
T equals zero if and only if independence holds in the table, i.e., if and only if $\pi_{ij} = \pi_{i+}\pi_{+j}$ for every cell $(i,j)$. T equals one if and only if there is perfect dependence in the table, i.e., if and only if for each $i$ there is exactly one $j$ with $\pi_{ij} > 0$ and vice versa; hence T can equal one only for square tables ($r = c$).
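As a small illustration (not part of the original article), the following Python sketch evaluates $\phi^2$ and T directly from a matrix of cell probabilities $\pi_{ij}$; the function name `tschuprow_t_population` and the example tables are hypothetical.

```python
import numpy as np

def tschuprow_t_population(pi):
    """Compute phi^2 and Tschuprow's T from a matrix of cell probabilities pi_ij."""
    pi = np.asarray(pi, dtype=float)
    r, c = pi.shape
    row = pi.sum(axis=1, keepdims=True)   # pi_{i+}
    col = pi.sum(axis=0, keepdims=True)   # pi_{+j}
    expected = row @ col                  # pi_{i+} * pi_{+j}, the independence table
    phi2 = ((pi - expected) ** 2 / expected).sum()
    t = np.sqrt(phi2 / np.sqrt((r - 1) * (c - 1)))
    return phi2, t

# Under exact independence the joint distribution factorizes, so phi^2 = 0 and T = 0.
independent = np.outer([0.4, 0.6], [0.3, 0.7])
print(tschuprow_t_population(independent))        # (0.0, 0.0)

# Under perfect dependence in a square table, T reaches its maximum of 1.
perfectly_dependent = np.array([[0.5, 0.0],
                                [0.0, 0.5]])
print(tschuprow_t_population(perfectly_dependent))  # (1.0, 1.0)
```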
If we have a multinomial sample of size n, the usual way to estimate T from the data is via the formula
$$\hat{T} = \sqrt{\frac{\sum_{i=1}^{r}\sum_{j=1}^{c} \left(p_{ij} - p_{i+}p_{+j}\right)^2 / \left(p_{i+}p_{+j}\right)}{\sqrt{(r-1)(c-1)}}},$$

where $p_{ij} = n_{ij}/n$ is the proportion of the sample in cell $(i,j)$, and $p_{i+}$ and $p_{+j}$ are the corresponding sample marginal proportions. Equivalently, in terms of Pearson's chi-square statistic $\chi^2$,

$$\hat{T} = \sqrt{\frac{\chi^2 / n}{\sqrt{(r-1)(c-1)}}}.$$
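The estimator above can be computed directly from a table of observed counts via the chi-square statistic. The following Python sketch (not from the article) uses `scipy.stats.chi2_contingency`; the function name `tschuprow_t` and the example counts are hypothetical. Recent SciPy releases also offer a ready-made version via `scipy.stats.contingency.association` with `method="tschuprow"`, if available in your installation.

```python
import numpy as np
from scipy.stats import chi2_contingency

def tschuprow_t(table):
    """Estimate Tschuprow's T from an r x c table of observed counts."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    r, c = table.shape
    # Pearson chi-square statistic (no Yates continuity correction).
    chi2, _, _, _ = chi2_contingency(table, correction=False)
    # T-hat = sqrt( (chi2 / n) / sqrt((r - 1) * (c - 1)) )
    return np.sqrt((chi2 / n) / np.sqrt((r - 1) * (c - 1)))

# Example: a hypothetical 3 x 3 table of counts.
counts = [[10, 5, 5],
          [5, 10, 5],
          [5, 5, 10]]
print(tschuprow_t(counts))
```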
Other measures of correlation for nominal data include Cramér's V, the phi coefficient, the uncertainty coefficient, and Goodman and Kruskal's lambda.