Structural bioinformatics explained

Structural bioinformatics is the branch of bioinformatics concerned with the analysis and prediction of the three-dimensional structure of biological macromolecules such as proteins, RNA, and DNA. It deals with generalizations about macromolecular 3D structures, such as comparisons of overall folds and local motifs, principles of molecular folding, evolution, binding interactions, and structure/function relationships, working both from experimentally solved structures and from computational models. The term structural has the same meaning as in structural biology, and structural bioinformatics can be seen as a part of computational structural biology. The main objective of structural bioinformatics is the creation of new methods for analysing and manipulating biological macromolecular data in order to solve problems in biology and generate new knowledge.[1]

Introduction

Protein structure

See main article: Protein structure.

The structure of a protein is directly related to its function. The presence of certain chemical groups in specific locations allows proteins to act as enzymes, catalyzing chemical reactions.[2] In general, protein structures are classified into four levels: primary (the amino acid sequence), secondary (local conformation of the polypeptide chain), tertiary (three-dimensional structure of the protein fold), and quaternary (association of multiple polypeptide structures). Structural bioinformatics mainly addresses interactions among structures, taking their spatial coordinates into consideration. Thus, the primary structure is better analyzed in traditional branches of bioinformatics. However, the sequence imposes restrictions that allow the formation of conserved local conformations of the polypeptide chain, such as alpha-helices, beta-sheets, and loops (secondary structure[3]). In addition, weak interactions (such as hydrogen bonds) stabilize the protein fold. Interactions can be intrachain, i.e., occurring between parts of the same protein monomer (tertiary structure), or interchain, i.e., occurring between different structures (quaternary structure). Finally, the topological arrangement of interactions, whether strong or weak, and of entanglements is also studied in structural bioinformatics, using frameworks such as circuit topology.

Structure visualization

Protein structure visualization is an important issue for structural bioinformatics.[4] It allows users to observe static or dynamic representations of molecules and to detect interactions that may be used to make inferences about molecular mechanisms. The most common types of visualization are:

DNA structure

The classic DNA duplex structure was initially described by Watson and Crick (with contributions from Rosalind Franklin). Each nucleotide of the DNA molecule is composed of three components: a phosphate group, a pentose sugar, and a nitrogenous base (adenine, thymine, cytosine, or guanine). The DNA double helix structure is stabilized by hydrogen bonds formed between base pairs: adenine with thymine (A-T) and cytosine with guanine (C-G). Many structural bioinformatics studies have focused on understanding interactions between DNA and small molecules, which have been the target of several drug design studies.
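The pairing rule described above (A with T, C with G) can be illustrated with a minimal Python sketch. The function names `complementary_strand` and `is_watson_crick_pair` are hypothetical, chosen only for this illustration. The sketch works on sequence letters alone and does not model hydrogen-bond geometry or 3D coordinates.

```python
# Watson-Crick pairing as stated in the text: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Return the partner strand written 5'->3' (the reverse complement).

    The two strands of a duplex run antiparallel, so the input is
    reversed before each base is replaced by its pairing partner.
    """
    return "".join(PAIRS[base] for base in reversed(strand.upper()))


def is_watson_crick_pair(base_a: str, base_b: str) -> bool:
    """True if the two bases form an A-T or C-G pair."""
    return PAIRS.get(base_a.upper()) == base_b.upper()


partner = complementary_strand("ATGC")
```

For the input `ATGC`, the partner strand read 5'->3' is `GCAT`. An unknown letter raises a `KeyError` in `complementary_strand`, which is a deliberate simplification for this sketch.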

Interactions

Interactions are contacts established between parts of molecules at different levels. They stabilize protein structures and underlie a wide range of molecular activities. In biochemistry, interactions are characterized by the proximity of atom groups or molecular regions that exert an effect on one another, such as electrostatic forces, hydrogen bonding, and the hydrophobic effect. Proteins can take part in several types of interactions, such as protein-protein interactions (PPI), protein-peptide interactions,[5] protein-ligand interactions (PLI),[6] and protein-DNA interactions.

Calculating contacts

Calculating contacts is an important task in structural bioinformatics. It underpins the correct prediction of protein structure and folding, thermodynamic stability, protein-protein and protein-ligand interactions, and docking and molecular dynamics analyses, among other applications.[7]

Traditionally, computational methods have used a threshold distance between atoms (also called a cutoff) to detect possible interactions.[8] This detection is based on the Euclidean distance and the angles between atoms of specific types. However, most methods based on simple Euclidean distance cannot detect occluded contacts. Hence, cutoff-free methods, such as Delaunay triangulation, have gained prominence in recent years. In addition, combinations of criteria, for example physicochemical properties, distance, geometry, and angles, have been used to improve contact determination.
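The cutoff approach described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `find_contacts`, the atom labels, the coordinates, and the 4.5 Å cutoff are all assumptions made for the example. It uses distance only, so it ignores atom types and angles and, as noted above, cannot tell whether a third atom occludes a contact.

```python
import math
from itertools import combinations


def find_contacts(atoms, cutoff):
    """Return pairs of atoms whose Euclidean distance is <= cutoff.

    atoms  : list of (label, (x, y, z)) tuples, coordinates in angstroms
    cutoff : threshold distance in angstroms
    """
    contacts = []
    # Compare every unordered pair once (O(n^2), fine for a small example).
    for (label_a, xyz_a), (label_b, xyz_b) in combinations(atoms, 2):
        distance = math.dist(xyz_a, xyz_b)  # Euclidean distance, Python 3.8+
        if distance <= cutoff:
            contacts.append((label_a, label_b, distance))
    return contacts


# Hypothetical coordinates chosen so the distances are easy to check by hand.
atoms = [
    ("A", (0.0, 0.0, 0.0)),
    ("B", (3.0, 0.0, 0.0)),
    ("C", (0.0, 4.0, 0.0)),
    ("D", (10.0, 10.0, 10.0)),
]
contacts = find_contacts(atoms, cutoff=4.5)
```

With these coordinates, A-B (3 Å) and A-C (4 Å) are reported as contacts, while B-C (5 Å) and every pair involving D fall outside the cutoff. For real structures a spatial index would normally replace the all-pairs loop.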

Distance criteria for contact definition

Type                      Max distance criteria
Hydrogen bond             3.9 Å
Hydrophobic interaction   5 Å
Ionic interaction         6 Å
Aromatic stacking         6 Å
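As a rough sketch of how the maximum distances in the table above might be applied, the Python snippet below stores them in a lookup table and checks a measured distance against them. The dictionary keys and the function names `within_criterion` and `candidate_types` are hypothetical. The check covers only the distance criterion: a real method would also require compatible atom types and, for some interactions, suitable angles.

```python
# Maximum distances (in angstroms) taken from the table above.
MAX_DISTANCE = {
    "hydrogen_bond": 3.9,
    "hydrophobic": 5.0,
    "ionic": 6.0,
    "aromatic_stacking": 6.0,
}


def within_criterion(interaction_type: str, distance: float) -> bool:
    """True if the distance satisfies the max-distance criterion for the type."""
    return distance <= MAX_DISTANCE[interaction_type]


def candidate_types(distance: float) -> list:
    """List every interaction type whose distance criterion is met.

    Distance alone is not sufficient to assign an interaction; this only
    narrows down which types remain possible for a given atom pair.
    """
    return [name for name, limit in MAX_DISTANCE.items() if distance <= limit]


possible = candidate_types(4.5)
```

At 4.5 Å the hydrogen-bond criterion is no longer met, so only the hydrophobic, ionic, and aromatic stacking criteria remain candidates.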

Protein Data Bank (PDB)

See main article: Protein Data Bank.

Notes and References

  1. Gu, Jenny; Bourne, Philip E. (2011). Structural Bioinformatics (2nd ed.). Hoboken: John Wiley & Sons. ISBN 978-1-118-21056-7. OCLC 778339075.
  2. Gu, Jenny; Bourne, Philip E. (2009-03-16). Structural Bioinformatics. John Wiley & Sons. ISBN 978-0-470-18105-8.
  3. Kocincová L, Jarešová M, Byška J, Parulek J, Hauser H, Kozlíková B (February 2017). Comparative visualization of protein secondary structures. BMC Bioinformatics. 18 (Suppl 2): 23. doi:10.1186/s12859-016-1449-z. PMID 28251875. PMC 5333176.
  4. Shi M, Gao J, Zhang MQ (July 2017). Web3DMol: interactive protein structure visualization based on WebGL. Nucleic Acids Research. 45 (W1): W523–W527. doi:10.1093/nar/gkx383. PMID 28482028. PMC 5570197.
  5. Stanfield RL, Wilson IA (February 1995). Protein-peptide interactions. Current Opinion in Structural Biology. 5 (1): 103–13. doi:10.1016/0959-440X(95)80015-S. PMID 7773739.
  6. Klebe G (2015). Protein–Ligand Interactions as the Basis for Drug Action. In: Scapin G, Patel D, Arnold E (eds.). Drug Design. NATO Science for Peace and Security Series A: Chemistry and Biology. Dordrecht: Springer. pp. 83–92. doi:10.1007/978-3-642-17907-5_4. ISBN 978-3-642-17906-8.
  7. Martins PM, Mayrink VD, de Silveira S, da Silveira CH, de Lima LH, de Melo-Minardi RC (2018). How to compute protein residue contacts more accurately?. Proceedings of the 33rd Annual ACM Symposium on Applied Computing. Pau, France: ACM Press. pp. 60–67. doi:10.1145/3167132.3167136. ISBN 978-1-4503-5191-1. S2CID 49562347. http://dl.acm.org/citation.cfm?doid=3167132.3167136
  8. da Silveira CH, Pires DE, Minardi RC, Ribeiro C, Veloso CJ, Lopes JC, Meira W, Neshich G, Ramos CH, Habesch R, Santoro MM (February 2009). Protein cutoff scanning: A comparative analysis of cutoff dependent and cutoff free methods for prospecting contacts in proteins. Proteins. 74 (3): 727–43. doi:10.1002/prot.22187. PMID 18704933. S2CID 1208256.