DNA mismatch repair (MMR) is a system for recognizing and repairing erroneous insertion, deletion, and mis-incorporation of bases that can arise during DNA replication and recombination, as well as repairing some forms of DNA damage.[1] [2]
Mismatch repair is strand-specific. During DNA synthesis the newly synthesised (daughter) strand will commonly include errors. In order to begin repair, the mismatch repair machinery distinguishes the newly synthesised strand from the template (parental) strand. In gram-negative bacteria, transient hemimethylation distinguishes the strands (the parental strand is methylated and the daughter strand is not). However, in other prokaryotes and eukaryotes, the exact mechanism is not clear. It is suspected that, in eukaryotes, newly synthesized lagging-strand DNA transiently contains nicks (before being sealed by DNA ligase) and provides a signal that directs mismatch proofreading systems to the appropriate strand. This implies that these nicks must also be present in the leading strand, and evidence for this has recently been found.[3] Recent work[4] has shown that nicks are sites for RFC-dependent loading of the replication sliding clamp, proliferating cell nuclear antigen (PCNA), in an orientation-specific manner, such that one face of the donut-shaped protein is juxtaposed toward the 3'-OH end at the nick. Loaded PCNA then directs the action of the MutLα endonuclease[5] to the daughter strand in the presence of a mismatch and MutSα or MutSβ.
Any mutational event that disrupts the superhelical structure of DNA carries with it the potential to compromise the genetic stability of a cell. The fact that the damage detection and repair systems are as complex as the replication machinery itself highlights the importance evolution has attached to DNA fidelity.
Examples of mismatched bases include a G/T or A/C pairing (see DNA repair). Mismatches are commonly due to tautomerization of bases during DNA replication. The damage is repaired by recognition of the deformity caused by the mismatch, determining the template and non-template strand, and excising the wrongly incorporated base and replacing it with the correct nucleotide. The removal process involves more than just the mismatched nucleotide itself. A few or up to thousands of base pairs of the newly synthesized DNA strand can be removed.
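As a rough illustration of what "recognizing a mismatch" means at the sequence level, the following Python sketch compares a template strand with a daughter strand and reports every position where the pair is not a Watson-Crick pair. The function name `find_mismatches` and the position-aligned string representation are assumptions made for this example; real recognition depends on the structural deformity sensed by the repair proteins, not on string comparison.

```python
# Watson-Crick partners; any other pairing is treated as a mismatch here.
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(template, daughter):
    """Return (index, template_base, daughter_base) for each non-complementary pair.

    Both strands are given position-aligned: index i of one strand pairs with
    index i of the other. This is a simplification for illustration only.
    """
    if len(template) != len(daughter):
        raise ValueError("strands must be the same length")
    return [
        (i, t, d)
        for i, (t, d) in enumerate(zip(template, daughter))
        if WATSON_CRICK[t] != d
    ]


template = "ATGCGTAC"
daughter = "TACGTATG"  # index 4 pairs G with T instead of C
print(find_mismatches(template, daughter))  # [(4, 'G', 'T')]
```

The G/T pair at index 4 corresponds to one of the example mismatches named above; an A/C pair would be reported in the same way.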
DNA mismatch repair protein, C-terminal domain | |
---|---|
Symbol | DNA_mis_repair |
Pfam | PF01119 |
Pfam clan | CL0329 |
InterPro | IPR013507 |
PROSITE | PDOC00057 |
SCOP | 1bkn |
Mismatch repair is a highly conserved process from prokaryotes to eukaryotes. The first evidence for mismatch repair was obtained from S. pneumoniae (the hexA and hexB genes). Subsequent work on E. coli has identified a number of genes that, when mutationally inactivated, cause hypermutable strains. The gene products are, therefore, called the "Mut" proteins, and are the major active components of the mismatch repair system. Three of these proteins are essential in detecting the mismatch and directing repair machinery to it: MutS, MutH and MutL (MutS is a homologue of HexA and MutL of HexB).
MutS forms a dimer (MutS2) that recognises the mismatched base on the daughter strand and binds the mutated DNA. MutH binds at hemimethylated sites along the daughter DNA, but its action is latent, being activated only upon contact by a MutL dimer (MutL2), which binds the MutS-DNA complex and acts as a mediator between MutS2 and MutH, activating the latter. The DNA is looped out to search for the nearest d(GATC) methylation site to the mismatch, which could be up to 1 kb away. Upon activation by the MutL-bound MutS-DNA complex, MutH nicks the daughter strand near the hemimethylated site. MutL recruits UvrD helicase (DNA Helicase II) to separate the two strands with a specific 3' to 5' polarity. The entire MutSHL complex then slides along the DNA in the direction of the mismatch, liberating the strand to be excised as it goes. An exonuclease trails the complex and digests the ss-DNA tail. The exonuclease recruited depends on which side of the mismatch MutH incises the strand (5' or 3'). If the nick made by MutH is on the 5' side of the mismatch, either RecJ or ExoVII (both 5' to 3' exonucleases) is used. If, however, the nick is on the 3' side of the mismatch, ExoI (a 3' to 5' enzyme) is used.
Excision ends past the mismatch site; that is, both the site itself and its surrounding nucleotides are fully excised. The single-strand gap created by the exonuclease can then be repaired by DNA Polymerase III (assisted by single-strand-binding protein), which uses the other strand as a template, and finally sealed by DNA ligase. DNA methylase then rapidly methylates the daughter strand.
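The two decisions in the E. coli pathway described above (which d(GATC) site is nearest to the mismatch, and which exonuclease matches the side of the nick) can be sketched in Python. This is a minimal illustration under stated assumptions: the daughter strand is a string indexed 5' to 3', the nick is placed at the start of the chosen d(GATC) site, and the function names `nearest_gatc` and `choose_exonucleases` are hypothetical.

```python
def nearest_gatc(daughter, mismatch_index):
    """Return the start index of the d(GATC) site closest to the mismatch, or None."""
    sites = []
    start = daughter.find("GATC")
    while start != -1:
        sites.append(start)
        start = daughter.find("GATC", start + 1)
    if not sites:
        return None
    return min(sites, key=lambda s: abs(s - mismatch_index))


def choose_exonucleases(nick_index, mismatch_index):
    """Pick the exonuclease(s) according to the side of the mismatch that is nicked.

    Indices run 5' to 3' along the daughter strand (an assumption of this sketch).
    """
    if nick_index == mismatch_index:
        raise ValueError("nick and mismatch must be at different positions")
    if nick_index < mismatch_index:
        # Nick lies 5' of the mismatch: a 5'-to-3' exonuclease is used.
        return ["RecJ", "ExoVII"]
    # Nick lies 3' of the mismatch: a 3'-to-5' exonuclease is used.
    return ["ExoI"]


# Toy daughter strand with d(GATC) sites starting at indices 2 and 16.
daughter = "TT" + "GATC" + "T" * 10 + "GATC" + "TT"

for mismatch in (8, 13):
    nick = nearest_gatc(daughter, mismatch)
    print(mismatch, nick, choose_exonucleases(nick, mismatch))
```

With the mismatch at index 8 the nearer site is the one at index 2 (5' of the mismatch), so RecJ or ExoVII is selected; with the mismatch at index 13 the nearer site is at index 16 (3' of the mismatch), so ExoI is selected.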
When bound, the MutS2 dimer bends the DNA helix and shields approximately 20 base pairs. It has weak ATPase activity, and binding of ATP leads to the formation of tertiary structures on the surface of the molecule. The crystal structure of MutS reveals that it is exceptionally asymmetric, and, while its active conformation is a dimer, only one of the two halves interacts with the mismatch site.
In eukaryotes, MutS homologs form two major heterodimers: Msh2/Msh6 (MutSα) and Msh2/Msh3 (MutSβ). The MutSα pathway is involved primarily in base substitution and small-loop mismatch repair. The MutSβ pathway is also involved in small-loop repair, in addition to large-loop (~10 nucleotide loops) repair. However, MutSβ does not repair base substitutions.
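The division of labour between the two heterodimers can be written as a small lookup table. The Python sketch below encodes only what the paragraph above states; the lesion labels (`base_substitution`, `small_loop`, `large_loop`) and the function name `complexes_for` are hypothetical, and the caller is expected to classify the loop size, since no exact boundary between small and large loops is given here.

```python
# Subunit composition of the two eukaryotic MutS heterodimers.
SUBUNITS = {
    "MutSα": ("Msh2", "Msh6"),
    "MutSβ": ("Msh2", "Msh3"),
}

# Which complex(es) handle which lesion class, as described in the text.
RECOGNITION = {
    "base_substitution": ["MutSα"],
    "small_loop": ["MutSα", "MutSβ"],
    "large_loop": ["MutSβ"],  # loops of roughly 10 nucleotides
}


def complexes_for(lesion):
    """Return the MutS heterodimers that act on the given lesion class."""
    if lesion not in RECOGNITION:
        raise ValueError(f"unknown lesion class: {lesion}")
    return RECOGNITION[lesion]


for lesion in RECOGNITION:
    names = complexes_for(lesion)
    print(lesion, [(name, SUBUNITS[name]) for name in names])
```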
MutL also has weak ATPase activity (it uses ATP for purposes of movement). It forms a complex with MutS and MutH, increasing the MutS footprint on the DNA.
However, the processivity (the distance the enzyme can move along the DNA before dissociating) of UvrD is only ~40–50 bp. Because the distance between the nick created by MutH and the mismatch can average ~600 bp, if another UvrD is not loaded, the unwound section is free to re-anneal to its complementary strand, forcing the process to start over. However, when assisted by MutL, the rate of UvrD loading is greatly increased. While the processivity (and ATP utilisation) of the individual UvrD molecules remains the same, the total effect on the DNA is boosted considerably; the DNA has no chance to re-anneal, as each UvrD unwinds 40–50 bp of DNA, dissociates, and then is immediately replaced by another UvrD, repeating the process. This exposes large sections of DNA to exonuclease digestion, allowing for quick excision (and later replacement) of the incorrect DNA.
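The figures above imply that a dozen or more successive UvrD molecules are needed to cover a typical nick-to-mismatch distance. The short Python calculation below makes this explicit; it assumes, as a simplification, that each helicase unwinds its full processivity and that the next one continues exactly where the previous one dissociated. The function name `uvrd_loads_needed` is hypothetical.

```python
import math


def uvrd_loads_needed(nick_to_mismatch_bp, processivity_bp):
    """Minimum number of successive UvrD molecules needed to unwind the span."""
    return math.ceil(nick_to_mismatch_bp / processivity_bp)


# ~600 bp average distance, processivity of ~40-50 bp per UvrD molecule.
for processivity in (40, 50):
    print(processivity, uvrd_loads_needed(600, processivity))
# 40 15
# 50 12
```

Under these assumptions, roughly 12 to 15 consecutive loading events are required for an average span, which is why the MutL-stimulated loading rate matters more than the processivity of any single UvrD molecule.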
Eukaryotes have five MutL homologs designated as MLH1, MLH2, MLH3, PMS1, and PMS2. They form heterodimers that mimic MutL in E. coli. The human homologs of prokaryotic MutL form three complexes referred to as MutLα, MutLβ, and MutLγ. The MutLα complex is made of MLH1 and PMS2 subunits, the MutLβ heterodimer is made of MLH1 and PMS1, whereas MutLγ is made of MLH1 and MLH3. MutLα acts as an endonuclease that introduces strand breaks in the daughter strand upon activation by a mismatch and other required proteins, MutSα and PCNA. These strand interruptions serve as entry points for an exonuclease activity that removes mismatched DNA. The roles played by MutLβ and MutLγ in mismatch repair are less well understood.
MutH is a very weak endonuclease that is activated once bound to MutL (which itself is bound to MutS). It nicks unmethylated DNA and the unmethylated strand of hemimethylated DNA but does not nick fully methylated DNA. Experiments have shown that mismatch repair is random if neither strand is methylated. These behaviours led to the proposal that MutH determines which strand contains the mismatch. MutH has no eukaryotic homolog. Its endonuclease function is taken up by MutL homologs, which have some specialized 5'-3' exonuclease activity. The strand bias for removing mismatches from the newly synthesized daughter strand in eukaryotes may be provided by the free 3' ends of Okazaki fragments in the new strand created during replication.
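The methylation rule that governs MutH in E. coli can be stated as a small truth table. The Python sketch below encodes only the behaviour described above (MutH nicks unmethylated strands and leaves methylated ones intact); the function name `muth_nick_targets` and the boolean representation of methylation state are assumptions made for this example.

```python
def muth_nick_targets(parental_methylated, daughter_methylated):
    """Return the strands that activated MutH could nick at a d(GATC) site."""
    targets = []
    if not parental_methylated:
        targets.append("parental")
    if not daughter_methylated:
        targets.append("daughter")
    return targets


# Hemimethylated DNA just after replication: only the daughter strand is nicked.
print(muth_nick_targets(True, False))

# Fully methylated DNA: no strand is nicked.
print(muth_nick_targets(True, True))

# Unmethylated DNA: either strand can be nicked, so repair is not strand-directed.
print(muth_nick_targets(False, False))
```

The third case corresponds to the experimental observation that repair is random when neither strand is methylated.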
PCNA and the β-sliding clamp associate with MutSα/β and MutS, respectively. Although initial reports suggested that the PCNA-MutSα complex may enhance mismatch recognition,[6] it has been recently demonstrated[7] that there is no apparent change in affinity of MutSα for a mismatch in the presence or absence of PCNA. Furthermore, mutants of MutSα that are unable to interact with PCNA in vitro exhibit the capacity to carry out mismatch recognition and mismatch excision to near wild-type levels. Such mutants are defective in the repair reaction directed by a 5' strand break, suggesting, for the first time, a MutSα function in a post-excision step of the reaction.
Mutations in the human homologues of the Mut proteins affect genomic stability, which can result in microsatellite instability (MSI), implicated in some human cancers. Specifically, the hereditary nonpolyposis colorectal cancers (HNPCC or Lynch syndrome) are attributed to damaging germline variants in the genes encoding the MutS and MutL homologues MSH2 and MLH1, respectively, which are thus classified as tumour suppressor genes. One subtype of HNPCC, the Muir-Torre Syndrome (MTS), is associated with skin tumors. If both inherited copies (alleles) of an MMR gene bear damaging genetic variants, this results in a very rare and severe condition: the mismatch repair cancer syndrome (or constitutional mismatch repair deficiency, CMMR-D), manifesting as multiple occurrences of tumors at an early age, often colon and brain tumors.
Sporadic cancers with a DNA repair deficiency only rarely have a mutation in a DNA repair gene; instead, they tend to have epigenetic alterations such as promoter methylation that inhibit DNA repair gene expression.[8] About 13% of colorectal cancers are deficient in DNA mismatch repair, commonly due to loss of MLH1 (9.8%), or sometimes MSH2, MSH6 or PMS2 (all ≤1.5%).[9] For most MLH1-deficient sporadic colorectal cancers, the deficiency was due to MLH1 promoter methylation. Other cancer types have higher frequencies of MLH1 loss (see table below), which are again largely a result of methylation of the promoter of the MLH1 gene. A different epigenetic mechanism underlying MMR deficiencies might involve over-expression of a microRNA; for example, miR-155 levels inversely correlate with expression of MLH1 or MSH2 in colorectal cancer.[10]
Cancer type | Frequency of deficiency in cancer | Frequency of deficiency in adjacent field defect |
---|---|---|
Stomach | 32%[11] [12] | 24%-28% |
Stomach (foveolar type tumors) | 74%[13] | 71% |
Stomach in high-incidence Kashmir Valley | 73%[14] | 20% |
Esophageal | 73%[15] | 27% |
Head and neck squamous cell carcinoma (HNSCC) | 31%-33%[16] [17] | 20%-25% |
Non-small cell lung cancer (NSCLC) | 69%[18] | 72% |
Colorectal | 10% | |
A field defect (field cancerization) is an area of epithelium that has been preconditioned by epigenetic or genetic changes, predisposing it towards development of cancer. As pointed out by Rubin, "...there is evidence that more than 80% of the somatic mutations found in mutator phenotype human colorectal tumors occur before the onset of terminal clonal expansion."[19] [20] Similarly, Vogelstein et al.[21] point out that more than half of somatic mutations identified in tumors occurred in a pre-neoplastic phase (in a field defect), during growth of apparently normal cells.
MLH1 deficiencies were common in the field defects (histologically normal tissues) surrounding tumors; see the table above. Epigenetically silenced or mutated MLH1 would likely not confer a selective advantage upon a stem cell; however, it would cause increased mutation rates, and one or more of the mutated genes may provide the cell with a selective advantage. The deficient MLH1 gene could then be carried along as a selectively near-neutral passenger (hitch-hiker) gene when the mutated stem cell generates an expanded clone. The continued presence of a clone with an epigenetically repressed MLH1 would continue to generate further mutations, some of which could produce a tumor.
MMR deficiency and mutations in mismatch repair genes were initially observed to associate with immune checkpoint blockade efficacy in a study examining responders to anti-PD1.[22] The association between MSI positivity and positive response to anti-PD1 was subsequently validated in a prospective clinical trial, and the treatment was approved by the FDA.[23]
In humans, seven DNA mismatch repair (MMR) proteins (MLH1, MLH3, MSH2, MSH3, MSH6, PMS1 and PMS2) work coordinately in sequential steps to initiate repair of DNA mismatches.[24] In addition, there are Exo1-dependent and Exo1-independent MMR subpathways.[25]
Other gene products involved in mismatch repair (subsequent to initiation by MMR genes) in humans include DNA polymerase delta, PCNA, RPA, HMGB1, RFC and DNA ligase I, plus histone and chromatin modifying factors.[26] [27]
In certain circumstances, the MMR pathway may recruit an error-prone DNA polymerase eta (POLH). This happens in B-lymphocytes during somatic hypermutation, where POLH is used to introduce genetic variation into antibody genes.[28] However, this error-prone MMR pathway may also be triggered in other types of human cells upon exposure to genotoxins,[29] and indeed it is broadly active in various human cancers, causing mutations that bear a signature of POLH activity.[30]
Recognizing and repairing mismatches and indels is important for cells because failure to do so results in microsatellite instability (MSI) and an elevated spontaneous mutation rate (mutator phenotype). In comparison to other cancer types, MMR-deficient (MSI) cancer has a very high frequency of mutations, close to that of melanoma and lung cancer,[31] cancer types caused by extensive exposure to UV radiation and mutagenic chemicals.
In addition to a very high mutation burden, MMR deficiencies result in an unusual distribution of somatic mutations across the human genome: this suggests that MMR preferentially protects the gene-rich, early-replicating euchromatic regions.[32] In contrast, the gene-poor, late-replicating heterochromatic genome regions exhibit high mutation rates in many human tumors.[33]
The histone modification H3K36me3, an epigenetic mark of active chromatin, has the ability to recruit the MSH2-MSH6 (hMutSα) complex.[34] Consistent with this, regions of the human genome with high levels of H3K36me3 accumulate fewer mutations due to MMR activity.
Lack of MMR often occurs in coordination with loss of other DNA repair genes. For example, MMR genes MLH1 and MLH3 as well as 11 other DNA repair genes (such as MGMT and many NER pathway genes) were significantly down-regulated in lower grade as well as in higher grade astrocytomas, in contrast to normal brain tissue.[35] Moreover, MLH1 and MGMT expression was closely correlated in 135 specimens of gastric cancer and loss of MLH1 and MGMT appeared to be synchronously accelerated during tumor progression.[36]
Deficient expression of multiple DNA repair genes is often found in cancers, and may contribute to the thousands of mutations usually found in cancers (see Mutation frequencies in cancers).
A popular idea that has failed to gain significant experimental support is that mutation, as distinct from DNA damage, is the primary cause of aging. Mice defective in the mutL homolog Pms2 have about a 100-fold elevated mutation frequency in all tissues, but do not appear to age more rapidly.[37] These mice display mostly normal development and life, except for early onset carcinogenesis and male infertility.