Genotyping by sequencing explained

In genetic sequencing, genotyping by sequencing (GBS) is a method for discovering single nucleotide polymorphisms (SNPs) for use in genotyping studies such as genome-wide association studies (GWAS).[1] GBS uses restriction enzymes to reduce genome complexity and to genotype multiple DNA samples.[2] After digestion, PCR is performed to enlarge the fragment pool, and the GBS libraries are then sequenced using next-generation sequencing technologies, usually yielding single-end reads of about 100 bp.[3] The method is relatively inexpensive and has been used in plant breeding. Although GBS is similar in approach to restriction-site-associated DNA sequencing (RAD-seq), the two methods differ in some substantial ways.[4][5][6]

Methods

GBS is a robust, simple, and affordable procedure for SNP discovery and mapping. The approach uses restriction enzymes (REs) to reduce genome complexity in species with large, highly diverse genomes, enabling efficient, high-throughput, highly multiplexed sequencing. With an appropriate RE, repetitive regions of the genome can be avoided and lower-copy regions targeted, which reduces alignment problems in genetically highly diverse species. The method was first described by Elshire et al. (2011). In summary, high-molecular-weight DNA is extracted and digested with a specific RE, selected in advance according to how frequently it cuts[7] in the major repetitive fraction of the genome. ApeKI is the most commonly used RE. Barcoded adapters are then ligated to the sticky ends and PCR amplification is performed. The amplified library is sequenced with next-generation sequencing technology, yielding single-end reads of about 100 bp. Raw sequence data are filtered and aligned to a reference genome, usually with the Burrows–Wheeler Aligner (BWA) or Bowtie 2. SNPs are then identified from the aligned tags, and each discovered SNP is scored for coverage, depth, and genotypic statistics. Once a large-scale, species-wide SNP production run has been completed, known SNPs can be called quickly in newly sequenced samples.[8]
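The digestion and barcode-sorting steps can be illustrated with a minimal Python sketch. It is not taken from any GBS pipeline. It assumes the ApeKI recognition site G^CWGC (W = A or T, cut after the first G) and assumes that each read begins with a sample barcode followed directly by the CWGC remnant of the cut site. The function names `digest` and `demultiplex`, and the barcodes in the usage note, are hypothetical.

```python
import re

# Assumption: ApeKI recognises GCWGC (W = A or T) and cuts after the first G.
# A lookahead is used so that overlapping sites are all found.
APEKI_SITE = re.compile(r"(?=GC[AT]GC)")
CUT_OFFSET = 1  # cut position relative to the first base of the site

# Assumption: the part of the site left at the start of each insert.
SITE_REMNANT = re.compile(r"C[AT]GC")


def digest(sequence, site=APEKI_SITE, cut_offset=CUT_OFFSET):
    """Cut `sequence` at every recognition site and return the fragments."""
    cuts = [m.start() + cut_offset for m in site.finditer(sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def demultiplex(reads, barcodes):
    """Sort reads by leading barcode and strip the barcode.

    `barcodes` maps barcode sequence -> sample name. A read is kept only
    if the barcode is followed directly by the cut-site remnant; reads
    that match no barcode are dropped.
    """
    by_sample = {sample: [] for sample in barcodes.values()}
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode) and SITE_REMNANT.match(read, len(barcode)):
                by_sample[sample].append(read[len(barcode):])
                break
    return by_sample
```

For example, `digest("AAAGCAGCTTTTGCTGCCC")` splits the toy sequence at its two sites, and `demultiplex(reads, {"ACGT": "s1", "TTAGC": "s2"})` groups reads under the sample names `s1` and `s2`. Real pipelines additionally tolerate sequencing errors in barcodes and trim adapters, which this sketch omits.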

When first developed, the GBS approach was tested and validated in recombinant inbred lines (RILs) from a high-resolution maize mapping population (IBM) and in doubled haploid (DH) barley lines from the Oregon Wolfe Barley (OWB) mapping population. Up to 96 ApeKI-digested DNA samples were pooled and processed simultaneously during GBS library construction, and the resulting library was run on a Genome Analyzer II (Illumina, Inc.). Overall, 25,185 biallelic tags were mapped in maize and 24,186 sequence tags were mapped in barley. Validation of the barley GBS markers using a single DH line (OWB003) showed 99% agreement between the reference markers and the mapped GBS reads. Although barley lacked a complete genome sequence, GBS does not require a reference genome for sequence tag mapping: the reference is developed during the process of genotyping the samples. In the absence of a reference genome, tags can also be treated as dominant markers for alternative genetic analyses. Beyond multiplexed GBS skimming, imputation of missing SNPs has the potential to reduce GBS costs further. GBS is a versatile and cost-effective procedure that allows the genomes of any species to be mined without prior knowledge of their genome structure.
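Treating tags as dominant markers amounts to scoring each tag as present or absent in each sample. The sketch below is one simple way to build such a presence/absence table and is not the procedure used in the original study. The function name `tag_presence_matrix`, the default tag length of 64 bases, and the `min_count` threshold are all assumptions made for illustration.

```python
from collections import Counter


def tag_presence_matrix(reads_by_sample, tag_length=64, min_count=1):
    """Score each sequence tag as present (1) or absent (0) per sample.

    `reads_by_sample` maps sample name -> list of barcode-trimmed reads.
    Reads are truncated to `tag_length` bases (assumed value); shorter
    reads are ignored. A tag counts as present in a sample when it is
    seen at least `min_count` times.
    """
    counts = {
        sample: Counter(r[:tag_length] for r in reads if len(r) >= tag_length)
        for sample, reads in reads_by_sample.items()
    }
    # Every tag observed in at least one sample, in a stable order.
    all_tags = sorted(set().union(*counts.values()))
    return {
        tag: {
            sample: int(counts[sample][tag] >= min_count)
            for sample in reads_by_sample
        }
        for tag in all_tags
    }
```

Because a tag can be missing from a sample simply through low sequencing depth, absence calls from a table like this are only as reliable as the coverage behind them; raising `min_count` trades sensitivity for fewer spurious presence calls.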

Notes and References

  1. Elshire, Robert J.; Glaubitz, Jeffrey C.; Sun, Qi; Poland, Jesse A.; Kawamoto, Ken; Buckler, Edward S.; Mitchell, Sharon E. (2011-05-04). "A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species". PLOS ONE. 6 (5): e19379. Bibcode:2011PLoSO...619379E. doi:10.1371/journal.pone.0019379. ISSN 1932-6203. PMC 3087801. PMID 21573248.
  2. He, Jiangfeng; Zhao, Xiaoqing; Laroche, André; Lu, Zhen-Xiang; Liu, HongKui; Li, Ziqin (2014-01-01). "Genotyping-by-sequencing (GBS), an ultimate marker-assisted selection (MAS) tool to accelerate plant breeding". Frontiers in Plant Science. 5: 484. doi:10.3389/fpls.2014.00484. PMC 4179701. PMID 25324846.
  3. Liu, Hui; Bayer, Micha; Druka, Arnis; Russell, Joanne R.; Hackett, Christine A.; Poland, Jesse; Ramsay, Luke; Hedley, Pete E.; Waugh, Robbie (2014-01-01). "An evaluation of genotyping by sequencing (GBS) to map the Breviaristatum-e (ari-e) locus in cultivated barley". BMC Genomics. 15: 104. doi:10.1186/1471-2164-15-104. ISSN 1471-2164. PMC 3922333. PMID 24498911.
  4. Davey, John W.; Hohenlohe, Paul A.; Etter, Paul D.; Boone, Jason Q.; Catchen, Julian M.; Blaxter, Mark L. (2011-07-01). "Genome-wide genetic marker discovery and genotyping using next-generation sequencing". Nature Reviews Genetics. 12 (7): 499–510. doi:10.1038/nrg3012. ISSN 1471-0056. PMID 21681211. S2CID 15080731.
  5. Campbell, Erin O.; Brunet, Bryan M.T.; Dupuis, Julian R.; Sperling, Felix A.H. (2018). "Would an RRS by any other name sound as RAD?". Methods in Ecology and Evolution. 9 (9): 1920–1927. doi:10.1111/2041-210X.13038.
  6. Vaux, Felix; Dutoit, Ludovic; Fraser, Ceridwen I.; Waters, Jonathan M. (2022). "Genotyping-by-sequencing for biogeography". Journal of Biogeography. 50 (2): 262–281. doi:10.1111/jbi.14516.
  7. Heffelfinger, Christopher; Fragoso, Christopher A.; Moreno, Maria A.; Overton, John D.; Mottinger, John P.; Zhao, Hongyu; Tohme, Joe; Dellaporta, Stephen L. (2014). "Flexible and scalable genotyping-by-sequencing strategies for population studies". BMC Genomics. 15 (1): 979. doi:10.1186/1471-2164-15-979. PMC 4253001. PMID 25406744.
  8. "Tassel 5 GBS v2 Pipeline". Tassel 5 Source. 20 May 2016.