RNA polymerase IV (RNAP IV) is an enzyme that synthesizes small interfering RNAs (siRNAs) in plants; these siRNAs silence gene expression.[1][2][3] RNAP IV belongs to the RNA polymerases, a family of enzymes that catalyze transcription by synthesizing RNA from DNA templates.[4] The genes of RNAP IV were discovered via phylogenetic studies of land plants and are thought to have arisen through a multistep evolutionary process within RNA polymerase II phylogenies.[5] Such an evolutionary pathway is supported by the fact that RNAP IV, which is specific to plants, is composed of 12 protein subunits that are either similar or identical to those of RNA polymerase II.[6] Via its synthesis of siRNA, RNAP IV is involved in regulating heterochromatin formation in a process known as RNA-directed DNA methylation (RdDM).
Phylogenetic studies of land plants led to the discovery of RNA polymerase IV. The largest (RPD1) and second-largest (RPD2) subunits of RNAP IV were analyzed with BLAST searches analogous to those used for RNAP II genes. Genes for RPD1 and RPD2 were found in all terrestrial plants, and the gene for the largest subunit was also found in the algal taxon Charales. Further analysis of the origin of the protein indicates that the largest-subunit gene was duplicated, and suggests that this duplication occurred in the lineage shared by Charales and land plants after it diverged from other algae. Specifically, the largest subunit of RNAP II gave rise to RPD1 through a duplication event, and the RPD2 gene arose through a later divergence. Evidence of these duplication events implies that the RNAP IV genes derive from RNAP II phylogenies in a multistep process. In other words, the divergence of the largest subunit was the first of multiple steps in the evolution of new RNAPs. Beyond the largest and second-largest subunits, RNAP IV also shares multiple subunits with RNAP II, a pattern likewise attributed to successive duplication events in particular lineages.[7]
Arabidopsis was originally described as expressing two forms of RNAP IV, referred to as RNAP IVa and RNAP IVb, which differ in their largest subunit and have non-redundant functions.[8] Efficient silencing of transposons requires both forms, while only RNAP IVa is required for basal silencing. This finding suggested that both forms are required for the mechanism of transposon methylation. Later experiments showed that what were once thought to be two forms of RNAP IV are actually two structurally and functionally distinct polymerases.[9] RNAP IVa retained the name RNAP IV, while RNAP IVb became known as RNAP V.
RNA polymerase IV is composed of 12 protein subunits that are either similar or identical to the 12 subunits of RNA polymerase II. Only four subunits distinguish the structure of RNAP IV from that of RNAP II and RNAP V. RNA polymerase V differs from RNAP II by six subunits, indicating that both RNAP IV and RNAP V evolved from RNAP II in plants. In Arabidopsis, two unique genes were found to encode subunits that distinguish RNAP IV from RNAP II.[10] The largest subunit is encoded by NRPD1 (formerly NRPD1a), while the second-largest subunit is encoded by NRPD2 and is shared with RNAP V. These subunits contain carboxyl-terminal domains (CTDs), which are necessary for the production of 20–30% of the siRNAs produced by RNA polymerase IV but are not required for DNA methylation.[11]
There is evidence that RNAP IV is responsible for producing heterochromatin, as dysfunction of either of its catalytic subunits (NRPD1 or NRPD2) disrupts heterochromatin formation. Heterochromatin is the silenced portion of DNA. It is formed when RNAP IV amplifies the production of small interfering RNAs (siRNAs) that direct the methylation of cytosine bases in DNA; this methylation silences segments of the genetic code, which can still be transcribed into mRNA but not translated into proteins.[12] RNAP IV is involved in setting the methylation patterns of the 5S rRNA genes during plant maturation, resulting in the development of adult features in plants.[13]
In the first step of heterochromatin formation, RNAP IV couples with an RNA-dependent RNA polymerase known as RDR2 to make a double-stranded precursor to siRNA.[14] Next, DICER-LIKE 3 (DCL3), an enzyme that slices double-stranded RNA substrates, cleaves the double-stranded precursor into siRNAs that are each 24 nucleotides long.[15] These siRNAs are then methylated at their 3’ ends by a protein known as HUA ENHANCER 1 (HEN1).[16] Finally, the methylated siRNAs form a complex with a protein known as ARGONAUTE 4 (AGO4), producing the silencing complex that can carry out the methylation required for heterochromatin production.[17] This process is referred to as RNA-directed DNA methylation (RdDM) or Pol IV-mediated silencing, as the siRNA-guided introduction of these methyl groups silences both transposons and repetitive DNA sequences.
SAWADEE HOMEODOMAIN HOMOLOG 1 (SHH1) is a protein that interacts with RNAP IV and is critical to its role in methylation. SHH1 can bind to chromatin only at specific “marked” segments: its “SAWADEE” domain is a chromatin-binding domain that probes for unmethylated K4 and methylated K9 modifications on the histone H3 tail. Its binding pockets attach to chromatin at these sites and allow RNAP IV to occupy the same loci. In this manner, SHH1 enables RNAP IV recruitment and stability at the most actively targeted genomic loci in RdDM, promoting the biogenesis of 24-nucleotide siRNAs described above. Because SHH1 binds to these repressive histone modifications, mutations that interfere with this binding are associated with reductions in DNA methylation and siRNA production.[18] Regulation of siRNA production by RNAP IV through this mechanism has major downstream effects, as the siRNAs produced in this manner defend the genome against the proliferation of invading viruses and endogenous transposable elements.[19]