Carbohydrate Structure Database Explained
Carbohydrate Structure Database
Description: Natural carbohydrate structures with NMR, bibliographic and biological annotations
Scope: Carbohydrate structures and related data
Center: Zelinsky Institute of Organic Chemistry
Citation: Carbohydrate Structure Database [1]
Authors: Philip V. Toukach, Ksenia S. Egorova, Yuri A. Knirel, et al.
Released: 2005
URL: http://csdb.glycoscience.ru/
Download: Export feature in the web interface
Versioning: Yes
Frequency: Annual
Curation: Yes (manual and automatic)
Version: 1 (merged)
The Carbohydrate Structure Database (CSDB) is a free, curated database and service platform in glycoinformatics, launched in 2005[2] by a group of Russian scientists from the N.D. Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences. CSDB stores published structural, taxonomic, bibliographic and NMR-spectroscopic data on natural carbohydrates and carbohydrate-related molecules.
Overview
The main data stored in CSDB are carbohydrate structures of bacterial, fungal, and plant origin. Each structure is assigned to an organism and linked to the scientific publication(s) in which it was described. Apart from structural data, CSDB also stores NMR spectra, information on the methods used to elucidate a particular structure, and some other data.[3] CSDB also provides access to several carbohydrate-related research tools.
History and funding
Until 2015, the Bacterial Carbohydrate Structure Database (BCSDB) and the Plant&Fungal Carbohydrate Structure Database (PFCSDB) existed in parallel. In 2015, they were merged into a single Carbohydrate Structure Database (CSDB).[1] The development and maintenance of CSDB have been funded by the International Science and Technology Center (2005–2007), the Russian Federation President grant program (2005–2006), the Russian Foundation for Basic Research (2005–2007, 2012–2014, 2015–2017, 2018–2020), Deutsches Krebsforschungszentrum (short-term in 2006–2010), and the Russian Science Foundation (2018–2020).
Data sources and coverage
The main sources of CSDB data are:
The data are selected and added to CSDB manually by browsing original scientific publications. Data originating from other databases are subject to error-correction and approval procedures.[14] As of 2017, coverage of bacteria and archaea is about 80% of the carbohydrate structures published in the scientific literature.[1] The time lag between the publication of relevant data and their deposition into CSDB is about 18 months. Plants are covered up to 1997, and fungi up to 2012.[15] CSDB does not cover data from the animal kingdom (Animalia), except unicellular metazoa. A number of dedicated databases cover animal carbohydrates, e.g. UniCarbKB[16] or GLYCOSCIENCES.de.[17]
CSDB is reported as one of the biggest projects in glycoinformatics.[18] [19] [20] [21] [22] [23] [24] It is employed in structural studies of natural carbohydrates[25] [26] [27] and in glyco-profiling.[28] The content of CSDB has been used as a data source in other glycoinformatics projects.[29] [30] [31] [32]
Deposited objects
- Molecular structures of glycans, glycopolymers and glycoconjugates: primary structure, aglycon information, polymerization degree and class of molecule. Structural scope includes molecules composed of residues (monosaccharides, alditols, amino acids, fatty acids etc.) linked by glycosidic, ester, amidic, ketal, phospho- or sulpho-diester bonds, in which at least one residue is a monosaccharide or its derivative.
- Bibliography associated with structures: imprint data, keywords, abstracts, IDs in bibliographic databases
- Biological context of structures: associated taxon, strain, serogroup, host organism, disease information. The covered domains are: prokaryotes, plants, fungi and selected pathogenic unicellular metazoa. The database contains only glycans originating from these domains or obtained by chemical modification of such glycans.
- Assigned NMR spectra and experimental conditions.
- Glycosyltransferases associated with taxa: gene and enzyme identifiers, full structures, donors and substrates, methods used to prove enzymatic activity, trustworthiness level.
- References to other databases
- Other data collected from original publications
- Conformation maps of disaccharides derived from molecular dynamics simulations.
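The deposited objects listed above can be pictured as one record per structure. The following is a minimal sketch only: the class names, field names and residue-kind labels are hypothetical and do not reflect CSDB's actual schema. It illustrates the structural scope rule stated above, namely that at least one residue must be a monosaccharide or its derivative.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Residue:
    """One building block of a structure (hypothetical model, not CSDB's schema)."""
    name: str                      # free-text residue label
    kind: str                      # e.g. "monosaccharide", "alditol", "amino acid", "fatty acid"
    sugar_derivative: bool = False # True for derivatives of monosaccharides


@dataclass
class StructureRecord:
    """Illustrative grouping of the annotation types listed above."""
    residues: List[Residue]
    taxon: Optional[str] = None                            # biological context
    strain: Optional[str] = None
    publications: List[str] = field(default_factory=list)  # e.g. DOIs or PMIDs
    nmr_spectra: List[dict] = field(default_factory=list)  # assigned spectra and conditions

    def in_structural_scope(self) -> bool:
        # Scope rule from the text: at least one residue is a
        # monosaccharide or a monosaccharide derivative.
        return any(
            r.kind == "monosaccharide" or r.sugar_derivative
            for r in self.residues
        )


# A glycoconjugate with one sugar residue is in scope; a pure peptide is not.
glycopeptide = StructureRecord(
    residues=[Residue("Glc", "monosaccharide"), Residue("Ala", "amino acid")]
)
peptide = StructureRecord(
    residues=[Residue("Ala", "amino acid"), Residue("Gly", "amino acid")]
)
print(glycopeptide.in_structural_scope(), peptide.in_structural_scope())
```

Keeping the scope check as a method on the record keeps the rule next to the data it constrains; a real implementation would also need to model the linkages between residues, which this sketch omits.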
Interrelation with other databases
CSDB is cross-linked to other glycomics databases and external resources,[33] [34] such as MonosaccharideDB, GLYCOSCIENCES.de, NCBI PubMed, NCBI Taxonomy, the NLM Catalog, and the International Classification of Diseases 11. Besides its native notation, CSDB Linear,[35] structures are presented in multiple carbohydrate notations (SNFG,[36] SweetDB,[37] GlycoCT,[38] WURCS,[39] GLYCAM,[40] etc.). CSDB is exportable as a Resource Description Framework (RDF) feed according to the GlycoRDF ontology.[41] [42]
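Cross-linking of this kind usually amounts to storing an external identifier and turning it into a resolvable address. The sketch below is an illustration under assumptions: the function names are hypothetical, and how CSDB itself stores or resolves its links is not described in this article. The URL patterns are the public ones for PubMed, the DOI resolver and the NCBI Taxonomy browser.

```python
from urllib.parse import quote


def pubmed_url(pmid: str) -> str:
    """Public PubMed address for a PubMed identifier (PMID)."""
    return f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"


def doi_url(doi: str) -> str:
    """Address of the doi.org resolver; slashes in the DOI are kept as-is."""
    return "https://doi.org/" + quote(doi, safe="/")


def taxonomy_url(tax_id: int) -> str:
    """NCBI Taxonomy browser address for a numeric taxon identifier."""
    return f"https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id={tax_id}"


# Identifiers of reference [1] and the NCBI Taxonomy ID of Escherichia coli (562).
print(pubmed_url("26286194"))
print(doi_url("10.1093/nar/gkv840"))
print(taxonomy_url(562))
```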
Notes and References
1. Toukach Ph.V., Egorova K.S. (2016). Carbohydrate structure database merged from bacterial, archaeal, plant and fungal parts. Nucleic Acids Research 44 (D1): D1229–D1236. doi:10.1093/nar/gkv840. PMID 26286194. PMC 4702937.
2. Toukach F.V., Knirel Y.A. (2005). New database of bacterial carbohydrate structures. Glycoconjugate Journal 22 (4–6): 216–217.
3. Harvey D.J. (2015). Analysis of carbohydrates and glycoconjugates by matrix-assisted laser desorption/ionization mass spectrometry: An update for 2011-2012. Mass Spectrometry Reviews 36 (3): 255–422. doi:10.1002/mas.21471. PMID 26270629.
4. Kapaev R.R., Egorova K.S., Toukach Ph.V. (2014). Carbohydrate structure generalization scheme for database-driven simulation of experimental observables, such as NMR chemical shifts. Journal of Chemical Information and Modeling 54 (9): 2594–2611. doi:10.1021/ci500267u. PMID 25020143.
5. Kapaev R.R., Toukach Ph.V. (2015). Improved carbohydrate structure generalization scheme for 1H and 13C NMR simulations. Analytical Chemistry 87 (14): 7006–7010. doi:10.1021/acs.analchem.5b01413. PMID 26087011.
6. Kapaev R.R., Toukach Ph.V. (2016). Simulation of 2D NMR Spectra of Carbohydrates Using GODDESS Software. Journal of Chemical Information and Modeling 56 (6): 1100–1104. doi:10.1021/acs.jcim.6b00083. PMID 27227420.
7. Kapaev R.R., Toukach Ph.V. (2018). GRASS: semi-automated NMR-based structure elucidation of saccharides. Bioinformatics 34 (6): 957–963. doi:10.1093/bioinformatics/btx696 (free access). PMID 29092007.
8. Egorova K.S., Kondakova A.N., Toukach Ph.V. (2015). Carbohydrate structure database: tools for statistical analysis of bacterial, plant and fungal glycomes. Database 2015: bav073. doi:10.1093/database/bav073. PMID 26337239. PMC 4559136.
9. Herget S., Toukach Ph.V., Ranzinger R., Hull W.E., Knirel Y., von der Lieth C.-W. (2008). Statistical analysis of the Bacterial Carbohydrate Structure Data Base (BCSDB): Characteristics and diversity of bacterial carbohydrates in comparison with mammalian glycans. BMC Structural Biology 8: 35. doi:10.1186/1472-6807-8-35 (free access). PMID 18694500. PMC 2543016.
10. Chernyshov I.Y., Toukach Ph.V. (2018). REStLESS: Automated Translation of Glycan Sequences from Residue-Based Notation to SMILES and Atomic Coordinates. Bioinformatics 34 (15): 2679–2681. doi:10.1093/bioinformatics/bty168 (free access). PMID 29547883.
11. Toukach Ph.V., Egorova K.S. (2017). CSDB_GT: a new curated database on glycosyltransferases. Glycobiology 27 (4): 285–290. doi:10.1093/glycob/cww137 (free access). PMID 28011601.
12. Egorova K.S., Knirel Y.A., Toukach Ph.V. (2019). Expanding CSDB_GT glycosyltransferase database with Escherichia coli. Glycobiology 29 (4): 285–287. doi:10.1093/glycob/cwz006. PMID 30759212.
13. Doubet S., Albersheim P. (1992). CarbBank. Glycobiology 2 (6): 505–507. doi:10.1093/glycob/2.6.505 (free access). PMID 1472756.
14. Egorova K.S., Toukach Ph.V. (2012). Critical analysis of CCSD data quality. Journal of Chemical Information and Modeling 52 (11): 2812–2814. doi:10.1021/ci3002815. PMID 23025661.
15. Egorova K.S., Toukach Ph.V. (2013). Expansion of coverage of Carbohydrate Structure Database (CSDB). Carbohydrate Research 389: 112–114. doi:10.1016/j.carres.2013.10.009. PMID 24680503.
16. Campbell M.P., Packer N.H. (2016). UniCarbKB: New database features for integrating glycan structure abundance, compositional glycoproteomics data, and disease associations. Biochimica et Biophysica Acta (BBA) - General Subjects 1860 (8): 1669–1675. doi:10.1016/j.bbagen.2016.02.016. PMID 26940363.
17. Lütteke T., Bohne-Lang A., Loss A., Goetz T., Frank M., von der Lieth C.-W. (2006). GLYCOSCIENCES.de: an Internet portal to support glycomics and glycobiology research. Glycobiology 16 (5): 71R–81R. doi:10.1093/glycob/cwj049 (free access). PMID 16239495.
18. Rigden D.J., Fernández-Suárez X.M., Galperin M.Y. (2016). The 2016 database issue of Nucleic Acids Research and an updated molecular biology database collection. Nucleic Acids Research 44 (D1): D1–D6. doi:10.1093/nar/gkv1356. PMID 26740669. PMC 4702933.
19. Aoki-Kinoshita K.F. (2013). Using databases and web resources for glycomics research. Molecular & Cellular Proteomics 12 (4): 1036–1045. doi:10.1074/mcp.R112.026252 (free access). PMID 23325765. PMC 3617328.
20. Frank M., Schloissnig S. (2010). Bioinformatics and molecular modeling in glycobiology. Cellular and Molecular Life Sciences 67 (16): 2749–2772. doi:10.1007/s00018-010-0352-4. PMID 20364395. PMC 2912727.
21. Artemenko N.V., McDonald A.G., Davey G.P., Rudd P.M. (2012). Databases and Tools in Glycobiology. In: Therapeutic Proteins. Methods in Molecular Biology 899: 325–350. doi:10.1007/978-1-61779-921-1_21. PMID 22735963. ISBN 978-1-61779-920-4.
22. Lütteke T. (2012). The use of glycoinformatics in glycochemistry. Beilstein Journal of Organic Chemistry 8: 915–929. doi:10.3762/bjoc.8.104. PMID 23015842. PMC 3388882.
23. Zhulin I.B. (2015). Databases for Microbiologists. Journal of Bacteriology 197 (15): 2458–2467. doi:10.1128/JB.00330-15. PMID 26013493. PMC 4505447.
24. Yamada K., Kakehi K. (2011). Recent advances in the analysis of carbohydrates for biomedical use. Journal of Pharmaceutical and Biomedical Analysis 55 (4): 702–727. doi:10.1016/j.jpba.2011.02.003. PMID 21382683.
25. Fontana C., Zaccheus M., Weintraub A., Ansaruzzaman M., Widmalm G. (2016). Structural studies of a polysaccharide from Vibrio parahaemolyticus strain AN-16000. Carbohydrate Research 432: 41–49. doi:10.1016/j.carres.2016.06.004. PMID 27392309. S2CID 23129802.
26. Potekhina N.V., Shashkov A.S., Senchenkova S.N., Dorofeeva L.V., Evtushenko L.I. (2012). Structure of hexasaccharide 1-phosphate polymer from Arthrobacter uratoxydans VKM Ac-1979(T) cell wall. Biochemistry (Moscow) 77 (11): 1294–1302. doi:10.1134/S0006297912110089. PMID 23240567. S2CID 9699031.
27. Chapot-Chartier M.P., Vinogradov E., Sadovskaya I., Andre G., Mistou M.Y., Trieu-Cuot P., Furlan S., Bidnenko E., Courtin P., Péchoux C., Hols P., Dufrêne Y.F., Kulakauskas S. (2010). Cell surface of Lactococcus lactis is covered by a protective polysaccharide pellicle. Journal of Biological Chemistry 285 (14): 10464–10471. doi:10.1074/jbc.M109.082958 (free access). PMID 20106971. PMC 2856253.
28. Walsh I., Zhao S., Campbell M., Taron C.H., Rudd P.M. (2016). Quantitative profiling of glycans and glycopeptides: an informatics' perspective. Current Opinion in Structural Biology 40: 70–80. doi:10.1016/j.sbi.2016.07.022. PMID 27522273.
29. Ranzinger R., York W.S. (2015). GlycomeDB. In: Glycoinformatics. Methods in Molecular Biology 1273: 109–124. doi:10.1007/978-1-4939-2343-4_8. PMID 25753706. ISBN 978-1-4939-2342-7.
30. Ranzinger R., Herget S., von der Lieth C.-W., Frank M. (2011). GlycomeDB - a unified database for carbohydrate structures. Nucleic Acids Research 39 (Database issue): D373–376. doi:10.1093/nar/gkq1014. PMID 21045056. PMC 3013643.
31. Aoki-Kinoshita K.F., et al. (2016). GlyTouCan 1.0 - The international glycan structure repository. Nucleic Acids Research 44 (D1): D1237–1242. doi:10.1093/nar/gkv1041. PMID 26476458. PMC 4702779.
32. Campbell M.P., Ranzinger R., Lütteke T., Mariethoz J., Hayes C.A., Zhang J., Akune Y., Aoki-Kinoshita K.F., Damerell D., Carta G., York W.S., Haslam S.M., Narimatsu H., Rudd P.M., Karlsson N.G., Packer N.H., Lisacek F. (2014). Toolboxes for a standardised and systematic study of glycans. BMC Bioinformatics 15 (Suppl 1): S9. doi:10.1186/1471-2105-15-S1-S9 (free access). PMID 24564482. PMC 4016020.
33. Ranzinger R., Herget S., Wetter T., von der Lieth C.-W. (2008). GlycomeDB - integration of open-access carbohydrate structure databases. BMC Bioinformatics 9: 384. doi:10.1186/1471-2105-9-384 (free access). PMID 18803830. PMC 2567997.
34. Toukach Ph.V., Joshi H., Ranzinger R., Knirel Y., von der Lieth C.-W. (2007). Sharing of worldwide distributed carbohydrate-related digital resources: online connection of the Bacterial Carbohydrate Structure DataBase and GLYCOSCIENCES.de. Nucleic Acids Research 35 (Database issue): D280–D286. doi:10.1093/nar/gkl883. PMID 17202164. PMC 1899093.
35. Toukach Ph.V., Egorova K.S. (2020). New features of CSDB Linear, as compared to other carbohydrate notations. Journal of Chemical Information and Modeling 60 (3): 1276–1289. doi:10.1021/acs.jcim.9b00744. PMID 31790229. S2CID 226214957.
36. Varki A., et al. (2015). Symbol Nomenclature for Graphical Representations of Glycans. Glycobiology 25 (12): 1323–1324. doi:10.1093/glycob/cwv091. PMID 26543186. PMC 4643639.
37. Loss A., Bunsmann P., Bohne A., Loss A., Schwarzer E., Lang E., von der Lieth C.-W. (2002). SWEET-DB: an attempt to create annotated data collections for carbohydrates. Nucleic Acids Research 30 (1): 405–408. doi:10.1093/nar/30.1.405. PMID 11752350. PMC 99123.
38. Herget S., Ranzinger R., Maass K., von der Lieth C.-W. (2008). GlycoCT - a unifying sequence format for carbohydrates. Carbohydrate Research 343 (12): 2162–2171. doi:10.1016/j.carres.2008.03.011. PMID 18436199.
39. Tanaka K., Aoki-Kinoshita K.F., Kotera M., Sawaki H., Tsuchiya S., Fujita N., Shikanai T., Kato M., Kawano S., Yamada I., Narimatsu H. (2014). WURCS: the Web3 unique representation of carbohydrate structures. Journal of Chemical Information and Modeling 54 (6): 1558–1566. doi:10.1021/ci400571e (free access). PMID 24897372.
40. Kirschner K.N., Yongye A.B., Tschampel S.M., González-Outeiriño J., Daniels C.R., Foley B.L., Woods R.J. (2008). GLYCAM06: a generalizable biomolecular force field. Carbohydrates. Journal of Computational Chemistry 29 (4): 622–655. doi:10.1002/jcc.20820. PMID 17849372. PMC 4423547.
41. Ranzinger R., Aoki-Kinoshita K.F., Campbell M.P., Kawano S., Lütteke T., Okuda S., Shinmachi D., Shikanai T., Sawaki H., Toukach Ph.V., Matsubara M., Yamada I., Narimatsu H. (2015). GlycoRDF: An ontology to standardize Glycomics data in RDF. Bioinformatics 31 (6): 919–925. doi:10.1093/bioinformatics/btu732. PMID 25388145. PMC 4380026.
42. Aoki-Kinoshita K.F., Bolleman J., Campbell M.P., Kawano S., Kim J., Lütteke T., Matsubara M., Okuda S., Ranzinger R., Sawaki H., Shikanai T., Shinmachi D., Suzuki Y., Toukach Ph.V., Yamada I., Packer N.H., Narimatsu H. (2013). Introducing glycomics data into the Semantic Web. Journal of Biomedical Semantics 4 (1): 39. doi:10.1186/2041-1480-4-39 (free access). PMID 24280648. PMC 4177142.