The polyelectrolyte theory of the gene proposes that for a linear genetic biopolymer dissolved in water, such as DNA, to undergo Darwinian evolution anywhere in the universe, it must be a polyelectrolyte, a polymer containing repeating ionic charges.[1] These charges maintain the uniform physical properties needed for Darwinian evolution, regardless of the information encoded in the genetic biopolymer. DNA is such a molecule. Regardless of its nucleic acid sequence, the negative charges on its backbone dominate the physical interactions of the molecule to such a degree that it maintains uniform physical properties such as its aqueous solubility and double-helix structure.
The polyelectrolyte theory of the gene was proposed by Steven A. Benner and Daniel Hutter in 2002[2] and has largely remained a theoretical framework that astrobiologists use to think about how life might be detected beyond Earth. Benner[3] later linked this idea to Erwin Schrödinger's view of the gene as an "aperiodic crystal"[4] to form a robust, universally generalized concept of a genetic biopolymer: a biopolymer acting as a unit of inheritance in Darwinian evolution.
Benner and others who built on his work have proposed methods to concentrate and identify genetic biopolymers on other planets and moons within the solar system using electrophoresis, which uses an electric field to concentrate charged compounds.
Although few have tested the polyelectrolyte theory of the gene, lab experiments in 2019 challenged the universality of this idea. This work produced non-electrolyte polymers capable of limited Darwinian evolution, but only up to a length of 72 nucleotides.[5][6]
A polyelectrolyte is a polymer with repeating electrostatically charged units. In the context of the polyelectrolyte theory of the gene, this polyelectrolyte is a biopolymer (a polymer derived from a living system) with a repeated ionically charged unit, similar to DNA, the genetic biopolymer of modern biology. Although RNA does not act as a genetic archive in modern biology, except in some viruses such as coronaviruses[7] and HIV,[8] the RNA World hypothesis suggests that RNA may have preceded DNA as life’s first genetic biopolymer.[9] The nucleotide building blocks that make up DNA and RNA are connected by negatively charged phosphate groups. These phosphodiester linkages create the repeating negative charges on the molecule’s backbone that give DNA and RNA their polyelectrolyte nature.[10]
To participate in Darwinian evolution, which can be described as "descent with modification", a unit of inheritance must be capable of imperfect replication to occasionally produce a new modified unit of inheritance, which must still be capable of being replicated. This imperfect replication leads to the variation on which Darwinian evolution can act.[11]
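The idea of imperfect replication can be illustrated with a minimal Python sketch. The four-letter alphabet matches DNA, but the function names, the per-position error rate, and the example sequence are assumptions chosen for illustration; they are not values taken from the literature.

```python
import random

ALPHABET = "ACGT"  # the four DNA building blocks


def replicate(template, error_rate, rng):
    """Copy a sequence, miscopying each position with probability error_rate.

    error_rate is a hypothetical parameter, not a measured mutation rate.
    """
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Imperfect step: substitute one of the three other bases.
            copy.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            # Faithful step: the base is copied unchanged.
            copy.append(base)
    return "".join(copy)


def count_differences(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Follow one lineage for five generations ("descent with modification").
rng = random.Random(0)           # fixed seed so the run is repeatable
parent = "ATGCGTACGTTAGC"        # arbitrary example sequence
lineage = [parent]
for _ in range(5):
    lineage.append(replicate(lineage[-1], error_rate=0.05, rng=rng))

for generation, sequence in enumerate(lineage):
    print(generation, sequence, count_differences(parent, sequence))
```

Every copy, mutated or not, has the same length and alphabet as its parent, so it can serve as a template in the next round. That is the property a unit of inheritance needs.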
The polyelectrolyte theory of the gene attempts to understand modern biology’s unit of inheritance, DNA, at a generalizable level. In 2002, Steven A. Benner and Daniel Hutter identified the repeated charges in DNA's phosphodiester linkages as crucial to its function as a genetic biopolymer. They proposed with the polyelectrolyte theory of the gene that repeated ionic charges—positive or negative—are a general requirement for all water-dissolved genetic biopolymers to undergo Darwinian evolution anywhere in the cosmos.
This concept works in tandem with the view of the gene as an "aperiodic crystal" as proposed by Erwin Schrödinger in his 1944 book "What Is Life?". An aperiodic crystal, as Schrödinger describes it, has a discrete set of molecular building blocks in a non-repeating arrangement. DNA is an aperiodic crystal composed of discrete nucleobases (A, T, C, and G), which are arranged based on the information they encode, not in any repeated format. While this idea of an "aperiodic crystal" was not initially linked to the polyelectrolyte theory of the gene, Benner, in later work, connected the two.
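The two criteria in Schrödinger's description, a small set of building blocks and a non-repeating arrangement, can be sketched as a simple check on a sequence of symbols. The following Python example is only an illustration: the function names are hypothetical, and "periodic" is taken here in the narrow sense of a sequence that is an exact whole-number repetition of a shorter unit.

```python
def repeat_unit(seq):
    """Return the shortest unit whose whole-number repetition rebuilds seq.

    If no shorter unit exists, the sequence itself is returned.
    """
    n = len(seq)
    for p in range(1, n + 1):
        # Only unit lengths that divide the total length can tile it exactly.
        if n % p == 0 and seq[:p] * (n // p) == seq:
            return seq[:p]
    return seq  # reached only for the empty sequence


def describe(seq):
    """Summarize the building blocks and the periodicity of a sequence."""
    unit = repeat_unit(seq)
    return {
        "building_blocks": sorted(set(seq)),
        "repeat_unit": unit,
        "periodic": len(unit) < len(seq),
    }


print(describe("ATATATAT"))  # periodic: built from repeats of "AT"
print(describe("ATGCGTAC"))  # aperiodic: no shorter repeating unit
```

In this simplified picture, an ordinary periodic crystal corresponds to the first case and an information-carrying gene to the second.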
In biochemistry, the structure of a biomolecule dictates its function, and therefore changes in structure cause changes in function.[12] To work as a unit of inheritance, the genetic biopolymer must maintain shape and, therefore, physical and chemical consistency, regardless of the information the structure encodes. DNA is such a molecule. No matter what the nucleic acid sequence is, DNA maintains a consistent double helix structure and, therefore, the consistent physical properties that allow it to remain dissolved in water and be replicated by cellular machinery. The polyelectrolyte theory of the gene reasons that DNA can maintain its shape regardless of mutations because the negative charges on the phosphate backbone dominate the physical interactions of the molecule to such a degree that changes in the nucleic acid sequence, the encoded information, do not affect the overall physical behavior of the molecule.
For example, thymidine nucleotides (T) are very soluble in water, while guanosine nucleotides (G) are much less soluble; however, an oligonucleotide (a short polynucleotide sequence) composed only of thymine and one composed only of guanine have the same overall structure and physical properties.[13] If changes in the nucleic acid sequence, which encodes genetic information, changed the physical properties of DNA, they could break down the mechanism by which DNA replicates.
This physical uniformity is very rare in nature. Proteins, another class of biopolymer, illustrate the contrast. The nucleic acid sequence in DNA codes for the sequence of amino acids that make up proteins. A change to even a single amino acid in the primary sequence of a protein can completely change the physical properties of that protein. For example, the sickle-cell trait is caused by a single mutation of an adenine to a thymine in the hemoglobin gene, causing a switch from a glutamic acid to a valine.[14] This completely changes the three-dimensional structure of hemoglobin and thus the physical properties of the protein, which leads to the sickle-cell trait.
Proteins are sensitive to changes in amino acid sequence because the 20 different amino acid side chains form bonds and partial bonds with each other.[15] In addition, the protein backbone has a dipole moment—having partially positive and partially negative sides—which can further create interactions within the molecule. These side-chain and backbone interactions are sensitive to changes in the environment and amino acid sequence. It is unlikely that a protein could act as a genetic biomolecule because changes in amino acid sequence lead to changes in overall physical structure and properties.
Another non-electrolyte biopolymer would suffer the same challenges as a protein when acting as a genetic biomolecule. Changes in physical properties with changes in encoded information would mean that such a molecule would struggle to be replicated with certain sequences of encoded information, as those sequences would result in physical properties incompatible with replication. This problem means that the hypothetical protein gene would not be able to explore all possible genetic sequences, as certain sequences would cause the molecule to fail to be replicated based on the physical structure of its gene, not on the fitness of what the gene codes for.
Benner and Hutter initially described this property of DNA as being "capable of surviving modifications in constitution without loss of properties essential for replication", abbreviated as COSMIC-LOPER. The acronym gives scientists a shorthand for a complex idea: a genetic biopolymer whose physical properties stay uniform regardless of the information it encodes, so that any sequence can be replicated.
Although RNA is often described as a genetic biopolymer because of its theorized role as life’s first unit of inheritance (RNA World), it is not entirely COSMIC-LOPER. RNA, especially sequences high in guanine (G), is capable of folding and performing enzyme-type chemistry.[16] Folding in guanine-rich RNA sequences prevents the templating ability of RNA and thus its ability to be replicated in an RNA-world scenario, for the same reason it would be difficult for a protein-based gene to replicate.
The repeated negative charges increase the solubility of DNA and RNA in water. Because ionic charges are highly soluble in water, having them on the molecule's backbone increases the molecule's solubility.[17] If the backbone of a hypothetical genetic biopolymer were linked together in a non-ionic fashion, the solubility of the whole molecule would decrease.[18] Solubility is important because, in order to be replicated, DNA—or any other genetic biomolecule—must be soluble to interact with replicative machinery.
The repeated negative charges of the DNA backbone electrostatically repel each other, preventing interactions both within and between DNA strands. This repulsion promotes specific interactions along the Watson–Crick 'edge' of the nucleobases, promoting Watson–Crick base pairing specificity—A pairs with T and C pairs with G.
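The pairing rule stated above can be written as a small lookup table. This Python sketch is a deliberate simplification: it pairs bases position by position and ignores strand orientation and all structural detail, and the names `PAIRS` and `templated_copy` are hypothetical.

```python
# Watson-Crick pairing rule for DNA: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def templated_copy(template):
    """Return the strand that pairs with the template, position by position.

    Strand orientation is ignored here to keep the sketch minimal.
    """
    return "".join(PAIRS[base] for base in template)


template = "ATCGGATC"                   # arbitrary example sequence
partner = templated_copy(template)      # the strand that pairs with the template
restored = templated_copy(partner)      # templating the partner in turn

print(template, partner, restored)
```

Because each base has exactly one partner, templating the partner strand reproduces the original sequence, which is why pairing specificity matters for replication.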
The repeated negative charges on the backbone keep DNA and many RNA molecules from folding and allow them to act as templates. In water, molecules take on the conformation that is the most energetically favorable, with the lowest Gibbs free energy. This configuration maximizes favorable interactions (hydrogen bonding, positive-negative charge interactions, van der Waals interactions) and minimizes unfavorable interactions (e.g., hydrophilic-hydrophobic interactions and like-charge interactions). In the case of double-stranded DNA and RNA, the most energetically favorable form is the linear double helix configuration because it maximizes interactions between base pairs and between the negatively charged backbone and the surrounding water molecules while minimizing interactions between the negatively charged phosphodiester linkages of the backbone. If the double-stranded DNA or RNA molecule folded, it would exchange favorable water-backbone interactions for unfavorable backbone-backbone interactions. A biopolymer without an ionically charged backbone, such as a protein, would not produce unfavorable backbone-backbone interactions during folding and thus would readily fold and aggregate. This inherent tendency towards linearity improves DNA’s ability to act as a template for replication because folded and aggregated conformations are inaccessible to replication machinery.
Lab experiments conducted with non-electrolyte analogs of DNA and RNA initially inspired Benner and Hutter to publish on the polyelectrolyte theory of the gene. During the late '80s and '90s, scientists developed synthetic DNA-like molecules to bind to and silence unwanted mRNA gene products as a way to treat disease. As part of this exploratory research, researchers developed a variety of non-electrolyte RNA and DNA analogs that would be able to cross the cell membrane, which DNA and RNA are incapable of doing because of their charged backbones. One of these analogs substituted a sulfone (SO₂) for the natural phosphodiester (PO₂⁻) linkage. Initial experiments showed the sulfone analog to have very similar properties to DNA as a dimer (two nucleotides linked together). When longer sulfone analogs were synthesized, however, they folded, lost Watson–Crick base pair specificity, and showed dramatic changes in physical properties in response to small changes in nucleic acid sequence. This loss of the traits that make DNA a good genetic molecule was seen with all the nonionic linkers that had been tested as of 2002.
The non-electrolyte analog that came closest to maintaining the qualities of DNA was the polyamide-linked nucleic acid analog (PNA), which replaces the phosphodiester linkage of DNA with an uncharged N-(2-aminoethyl)glycine linkage. Benner and Hutter themselves questioned whether PNA might disprove their polyelectrolyte hypothesis. However, although PNA maintained the qualities of DNA up to a length of 20 nucleotides, beyond that length the molecules started to lose Watson–Crick base pair specificity, aggregated, and became sensitive to changes in nucleic acid sequence.
In 2019, a group led by Philipp Holliger in Cambridge, England, developed non-electrolyte DNA analogs, P-alkylphosphonate nucleic acids (phNA), that were able to undergo templated synthesis and directed evolution.[19] The phNA analogs substitute the charged oxygen on DNA’s phosphate backbone with an uncharged methyl or ethyl group. While other DNA analogs had been shown to undergo templated synthesis and directed evolution, this was the first time a non-electrolyte DNA analog had been shown to have these properties and the first time the polyelectrolyte theory of the gene had been experimentally challenged.[20] However, the template-directed synthesis of phNA was only performed up to a length of 72 nucleotides. This is around the length of the shortest naturally occurring gene, tRNA,[21] but is several orders of magnitude shorter than the genome of the smallest free-living organism.[22] The human genome, for reference, is 3.05×10⁹ base pairs long.[23]
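The scale gap between the phNA result and a natural genome can be made concrete with a short calculation. This Python sketch uses only the two lengths given above (72 nucleotides and 3.05×10⁹ base pairs); the variable names are illustrative, and nucleotides and base pairs are treated as comparable units of length.

```python
import math

PHNA_LENGTH_NT = 72        # longest templated phNA synthesis reported above
HUMAN_GENOME_BP = 3.05e9   # human genome length quoted above

# How many times longer the human genome is than the phNA product.
ratio = HUMAN_GENOME_BP / PHNA_LENGTH_NT

# The same gap expressed in orders of magnitude (powers of ten).
orders = math.log10(ratio)

print(f"length ratio: {ratio:.3g}")
print(f"orders of magnitude: {orders:.1f}")
```

The gap works out to between seven and eight orders of magnitude, so the phNA experiments probe only a very short stretch of the length range that natural genomes occupy.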
Since its inception, the polyelectrolyte theory of the gene has been put in the context of searching for life in the universe. This theory, combined with Schrödinger's view of a gene as an aperiodic crystal, provides a so-called "agnostic biosignature", a sign of life that does not presuppose any biochemistry.[24] In other words, a generalized view of life should hold anywhere in the universe.
Since the theorized genetic polyelectrolyte biomolecules could be charged either positively or, as in the case of DNA and RNA, negatively, they can be concentrated in water with an electric field using electrophoresis or electrodialysis. This hypothetical concentration device has been called an agnostic life-finding device. As in the electrophoretic separation of DNA, negatively charged molecules, like DNA or RNA, would be attracted to the positively charged anode, and positively charged genetic biomolecules would be attracted to the negatively charged cathode.
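The concentration principle can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not a description of any proposed instrument: it counts one charge per backbone linkage, ignores end groups, nucleobase charges, pH, and counterions, and the function names are hypothetical.

```python
def net_backbone_charge(n_units, charge_per_linkage):
    """Net backbone charge of a linear polymer of n_units building blocks.

    A linear chain of n units has n - 1 internal linkages. Terminal groups
    are ignored (a simplifying assumption).
    """
    return max(n_units - 1, 0) * charge_per_linkage


def migration_target(net_charge):
    """Electrode toward which a molecule drifts in an electric field."""
    if net_charge < 0:
        return "anode"             # positive electrode attracts negative charge
    if net_charge > 0:
        return "cathode"           # negative electrode attracts positive charge
    return "not concentrated"      # a neutral polymer feels no net pull


# Three hypothetical 72-unit polymers with different backbone linkers.
examples = {
    "DNA-like (phosphodiester, -1 per linkage)": -1,
    "uncharged linker (0 per linkage)": 0,
    "hypothetical cationic linker (+1 per linkage)": +1,
}
for name, charge_per_linkage in examples.items():
    q = net_backbone_charge(72, charge_per_linkage)
    print(f"{name}: net charge {q:+d}, {migration_target(q)}")
```

The sketch also shows the limit of the approach: a genetic polymer with an uncharged backbone would not be collected at either electrode.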
Once the polyelectrolyte biomolecule has been concentrated, Benner suggests the molecules be tested for size and shape uniformity. In addition, the molecules should be tested for the use of a limited number of building blocks arranged in a non-repeating fashion, an aperiodic crystal structure. Benner has suggested that this could be done using matrix-assisted laser desorption ionization (MALDI) paired with an orbitrap high-resolution mass spectrometer.[25] Another suggested approach has been to use nanopore sequencing technology, although questions of whether the solar radiation experienced during transit and on-site would affect the functionality of the device remain.[26] While space agencies have yet to use any of these proposed systems for life detection, they may be used in the future on Mars, Enceladus, and Europa.
Although the polyelectrolyte theory of the gene and the aperiodic crystal view of the gene have been described as agnostic biosignatures, both theories are Earth-life-centric. It is unknown what life on another world might be like; while it is often stated that life of any kind needs biomolecules and water, this may not be true.