Nucleotide diversity explained

Nucleotide diversity is a concept in molecular genetics which is used to measure the degree of polymorphism within a population.[1]

One commonly used measure of nucleotide diversity was first introduced by Nei and Li in 1979. This measure is defined as the average number of nucleotide differences per site between two DNA sequences in all possible pairs in the sample population, and is denoted by $\pi$.

An estimator for $\pi$ is given by:

$$\hat{\pi} = \frac{n}{n-1} \sum_{ij} x_i x_j \pi_{ij} = \frac{n}{n-1} \sum_{i=2}^{n} \sum_{j=1}^{i-1} 2 x_i x_j \pi_{ij}$$

where $x_i$ and $x_j$ are the respective frequencies of the $i$th and $j$th sequences, $\pi_{ij}$ is the number of nucleotide differences per nucleotide site between the $i$th and $j$th sequences, and $n$ is the number of sequences in the sample. The term in front of the sums, $n/(n-1)$, makes the estimator unbiased, so that its expected value does not depend on how many sequences are sampled.[2]
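The estimator above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `nucleotide_diversity` and the toy `sample` are hypothetical, and the sketch assumes the sequences are already aligned, of equal length, and free of gaps or ambiguous bases.

```python
from collections import Counter
from itertools import combinations


def nucleotide_diversity(sequences):
    """Estimate pi for aligned, equal-length sequences (hypothetical helper).

    Assumes no gaps or ambiguity codes; every mismatch counts as one difference.
    """
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")

    # x_i: sample frequency of each distinct sequence
    counts = Counter(sequences)
    freqs = {seq: count / n for seq, count in counts.items()}

    total = 0.0
    # Each unordered pair of distinct sequences is visited once, hence the factor 2
    for a, b in combinations(freqs, 2):
        # pi_ij: proportion of sites at which the two sequences differ
        pi_ij = sum(ca != cb for ca, cb in zip(a, b)) / length
        total += 2 * freqs[a] * freqs[b] * pi_ij

    # n / (n - 1) is the correction term in front of the sums
    return n / (n - 1) * total


# Toy data (made up for illustration): four sequences of eight sites
sample = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]
pi_hat = nucleotide_diversity(sample)
print(pi_hat)
```

For this toy sample the result equals the mean number of pairwise differences per site: the six possible pairs differ at 7 sites in total, so the estimate is 7 / 6 / 8 = 7/48, or about 0.146.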

Nucleotide diversity is a measure of genetic variation. It is usually associated with other statistical measures of population diversity, and is similar to expected heterozygosity. This statistic may be used to monitor diversity within or between ecological populations, to examine the genetic variation in crops and related species,[3] or to determine evolutionary relationships.[4]
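The similarity to expected heterozygosity can be illustrated with a single variable site. As a sketch, assume a sample of $n$ sequences in which the site carries two alleles with sample frequencies $p$ and $q = 1 - p$ (the symbols $p$ and $q$ are introduced here for illustration). There are then only two distinct sequences, which differ at the one site considered, so $\pi_{12} = 1$ and the estimator reduces to:

```latex
\hat{\pi} = \frac{n}{n-1} \cdot 2 x_1 x_2 \pi_{12} = \frac{n}{n-1} \cdot 2pq
```

Here $2pq$ is the expected heterozygosity at a biallelic site, and $n/(n-1)$ is the same small-sample correction that appears in front of the sums in the estimator of nucleotide diversity.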

Nucleotide diversity can be calculated by examining the DNA sequences directly, or may be estimated from molecular marker data, such as Random Amplified Polymorphic DNA (RAPD) data[5] and Amplified Fragment Length Polymorphism (AFLP) data.[6]

Notes and References

  1. Nei, Masatoshi; Li, Wen-Hsiung (October 1, 1979). "Mathematical Model for Studying Genetic Variation in Terms of Restriction Endonucleases". PNAS. 76 (10): 5269–73. Bibcode:1979PNAS...76.5269N. doi:10.1073/pnas.76.10.5269. PMID 291943. PMC 413122.
  2. Nei, M.; Tajima, F. (January 1981). "DNA polymorphism detectable by restriction endonucleases". Genetics. 97 (1): 145–63. doi:10.1093/genetics/97.1.145. PMID 6266912. PMC 1214380.
  3. Kilian, B.; Ozkan, H.; Walther, A.; Kohl, J.; Dagan, T.; Salamini, F.; Martin, W. (December 2007). "Molecular diversity at 18 loci in 321 wild and 92 domesticate lines reveal no reduction of nucleotide diversity during Triticum monococcum (Einkorn) domestication: implications for the origin of agriculture". Molecular Biology and Evolution. 24 (12): 2657–68. doi:10.1093/molbev/msm192. hdl:11858/00-001M-0000-0012-37D5-9. PMID 17898361.
  4. Yu, N.; Jensen-Seaman, M. I.; Chemnick, L.; Ryder, O.; Li, W. H. (March 2004). "Nucleotide diversity in gorillas". Genetics. 166 (3): 1375–83. doi:10.1534/genetics.166.3.1375. PMID 15082556. PMC 1470796.
  5. Borowsky, Richard L. (January 2001). "Estimating nucleotide diversity from random amplified polymorphic DNA and amplified fragment length polymorphism data". Molecular Phylogenetics and Evolution. 18 (1): 143–8. doi:10.1006/mpev.2000.0865. PMID 11161751.
  6. Innan, Hideki; Terauchi, Ryohei; Kahl, Günter; Tajima, Fumio (March 1, 1999). "A method for estimating nucleotide diversity from AFLP data". Genetics. 151 (3): 1157–64. doi:10.1093/genetics/151.3.1157. PMID 10049931. PMC 1460529.