The Human Genome Diversity Project (HGDP) was started by Stanford University's Morrison Institute in the 1990s, in collaboration with scientists around the world.[1] It is the result of many years of work by Luigi Cavalli-Sforza, one of the most cited scientists in the world, who has published extensively on the use of genetics to understand human migration and evolution. The HGDP data sets have often been cited in papers on such topics as population genetics, anthropology, and heritable disease research.[2] [3]
The project has noted the need to record the genetic profiles of indigenous populations, as studying isolated populations is the best way to understand the genetic frequencies that hold clues to our distant past. Knowing the relationships between such populations makes it possible to infer the journey of humankind, from the humans who left Africa and populated the world to the humans of today. The HGDP-CEPH Human Genome Diversity Cell Line Panel is a resource of 1,063 cultured lymphoblastoid cell lines (LCLs) from 1,050 individuals in 52 world populations, banked at the Fondation Jean Dausset-CEPH in Paris.
The HGDP is not related to the Human Genome Project (HGP) and has attempted to maintain a distinct identity.[4] The whole genome sequencing and analysis of the HGDP was published in 2020, creating a comprehensive resource of genetic variation from underrepresented human populations and illuminating patterns of genetic variation, demographic history and introgression of modern humans with Neanderthals and Denisovans.[5] [6]
The HGDP includes 51 populations from around the world.[7] A description of the populations that were studied can be found in a 2005 review paper by Cavalli-Sforza:[8]
| Region | Subregion | Populations |
|---|---|---|
| Africa | | Bantu, Biaka, Mandenka, Mbuti pygmy, Mozabite, San, and Yoruba |
| Asia | Western Asia | Bedouin, Druze, and Palestinian |
| Asia | Central & South Asia | Balochi, Brahui, Burusho, Hazara, Kalash, Makrani, Pashtun, Sindhi, and Uyghur |
| Asia | Eastern Asia | Khmer, Dai, Daur, Han (North China), Han (South China), Hezhen, Japanese, Lahu, Miao, Mongola, Naxi, Oroqen, She, Tu, Tujia, Xibo, Yakut, and Yi |
| Native America | | Colombian, Karitiana, Maya, Pima, and Surui |
| Europe | | Adygei, Basque, French, North Italian, Orcadian, Russian, Sardinian, and Tuscan |
| Oceania | | Melanesian and Papuan |
One of the most important points in the HGDP debate has been the social and ethical implications for indigenous populations, specifically the methods and ethics of informed consent. Some questions include:
These questions are specifically addressed by the HGDP's "Model Ethical Protocol for Collecting DNA Samples".[9]
The scientific community has used the HGDP data to study human migration, mutation rates, relationships between different populations, genes involved in height, and selective pressure. The HGDP has been instrumental in assessing human diversity and in providing information about similarities and differences among human populations. It is the project with the largest scope among the various human diversity databases available.
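As an illustration of how relationships between populations can be quantified from allele frequencies, the following minimal sketch computes Wright's fixation index (F_ST) for a single biallelic locus. It is not taken from the HGDP's own analyses. The function names and the example frequencies are hypothetical, and the populations are assumed to be equally weighted.

```python
from typing import Sequence


def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs: Sequence[float]) -> float:
    """Wright's F_ST = (H_T - H_S) / H_T across equally weighted populations.

    freqs holds the frequency of one allele in each population (hypothetical inputs).
    """
    if not freqs:
        raise ValueError("need at least one population")
    p_bar = sum(freqs) / len(freqs)           # mean allele frequency
    h_t = heterozygosity(p_bar)               # heterozygosity of the pooled population
    if h_t == 0.0:
        return 0.0                            # locus is fixed everywhere: no differentiation
    h_s = sum(heterozygosity(p) for p in freqs) / len(freqs)  # mean within-population value
    return (h_t - h_s) / h_t


# Made-up frequencies for two populations, for illustration only.
example = fst([0.2, 0.8])
```

A value near 0 indicates that the populations share similar allele frequencies at the locus, while a value near 1 indicates that they are nearly fixed for different alleles. Real analyses combine many loci and correct for sample size.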
So far 148 papers have been published using the HGDP database. Authors using HGDP data work in the US, Russia, Brazil, Ireland, Portugal, France, and other countries.
More specifically, HGDP data has been used in studies of evolution and expansion of modern human populations.[10]
Diversity research is relevant in fields ranging from disease surveillance to anthropology. Genome-wide association studies (GWAS) try to associate a genetic mutation with a disease; it is becoming clear that these associations are population-dependent and that understanding human diversity will be a major step toward increasing the power to find genes associated with disease.
To gain a full assessment of human development, scientists must engage in diversity research. This research needs to be conducted as quickly as possible before small native populations such as those in South America become extinct.
Another benefit of genomic diversity mapping would be in disease research. Diversity research could help explain why certain ethnic populations are vulnerable to or resistant to certain diseases and how populations have adapted to vulnerabilities (see race in biomedicine).
The study of human populations has been at the forefront of genomic and clinical research since the Human Genome Project (HGP) was completed. Projects similar to HGDP are the 1000 Genomes Project and the HapMap Project. Each has its own specificities and each has been used by scientists to a large extent for overlapping purposes.
Some indigenous communities, NGOs, and human rights organizations have denounced the project since its outset, objecting to the HGDP's goals on the grounds of perceived scientific racism, colonialism, biocolonialism, and problems with informed consent.
The Action Group on Erosion, Technology and Concentration (ETC Group) has been a major critic of the HGDP, speculating that issues of racism and stigmatization could occur should the HGDP be completed. One major concern with the research project has been the potential, in certain countries, for racism resulting from use of HGDP data. Critics feel that when governments are armed with genetic data linked to certain racial groups, those governments might deny human rights based on this genetic data. For example, countries could define races purely in genetic terms and deny a person's rights based on lack of conformity to a certain race's genetic model.
Eight of the nine populations sampled in the Central & South Asia category are from Pakistan, even though India falls in the same regional group, has about seven times the population of Pakistan, and is far more diverse. However, Rosenberg et al. found that the sampled Pakistani populations are more genetically diverse than the 15 Indian populations with which they were explicitly compared.[11]
Use of HGDP genetic materials for nonmedical purposes not agreed to by indigenous donors, especially purposes that create possibilities for human rights violations, has been a matter of concern. For example, Kidd et al. described the use of DNA samples from indigenous populations to explore a forensic identification capability based on ethnic origins.[12]
Anthropologist Jonathan M. Marks stated: "As any anthropologist knows, ethnic groups are categories of human invention, not given by nature."[13] Some indigenous peoples have refused to take part in the HGDP due to concerns about misuse of the data: "In December [1993], a World Council of Indigenous Peoples in Guatemala repudiated the HGDP."
In 1995, the National Research Council (NRC) issued its recommendations on the HGDP. The NRC endorsed the concept of diversity research, also pointing out some concerns with the HGDP procedure. The NRC report suggested alternatives such as keeping sample sources anonymous (i.e., sampling genetic data without tying it to specific racial groups). While such approaches would eliminate the concerns discussed above (regarding racism, weapons development, etc.), they would also prevent researchers from achieving many of the benefits that were to be gained from the project.
Some members of the Human Genome Project (HGP) argued in favor of engaging in diversity research on data gleaned from the Human Genome Diversity Project, although most agreed that diversity research should be done by the HGP and not as a separate project.
A number of the principal collaborators in the HGDP have been involved in the privately funded Genographic Project, launched in April 2005 with similar aims.