ORF1ab explained

Replicase polyprotein
Organism:	SARS-CoV
Symbol:	rep
UniProt:	P0C6X7

Replicase polyprotein
Organism:	SARS-CoV-2
Symbol:	rep
UniProt:	P0DTD1

ORF1ab (also ORF1a/b) refers collectively to two open reading frames (ORFs), ORF1a and ORF1b, that are conserved in the genomes of nidoviruses, a group of viruses that includes coronaviruses. The genes express large polyproteins that undergo proteolysis to form several nonstructural proteins with various functions in the viral life cycle, including proteases and the components of the replicase-transcriptase complex (RTC).^[1] ^[2] ^[3] Together the two ORFs are sometimes referred to as the replicase gene.^[4] They are linked by a programmed ribosomal frameshift that allows the ribosome to shift into the -1 reading frame and continue translating past the stop codon at the end of ORF1a. The resulting polyproteins are known as pp1a (translated from ORF1a alone) and pp1ab (translated from ORF1a and ORF1b together).

Expression

Taxid:	86693
Size:	29,903 bases
Year:	2020
UCSC assembly:	wuhCor1

ORF1a is the first open reading frame at the 5' end of the genome. Together, ORF1a and ORF1b occupy about two thirds of the genome, with the remaining third at the 3' end encoding the structural proteins and accessory proteins. ORF1ab is translated from a 5'-capped RNA by cap-dependent translation. Nidoviruses have a complex system of discontinuous subgenomic RNA production to enable expression of genes in their relatively large RNA genomes (typically 27-32 kb for coronaviruses), but ORF1ab is translated directly from the genomic RNA.^[5] ORF1ab sequences have been observed in noncanonical subgenomic RNAs, though their functional significance is unclear.

A programmed ribosomal frameshift allows the ribosome to bypass the stop codon that terminates ORF1a and continue in the -1 reading frame, producing the longer polyprotein pp1ab. The frameshift occurs at a slippery sequence that is followed by a pseudoknot RNA secondary structure. Frameshift efficiency has been measured at 20-50% for murine coronavirus^[6] and 45-70% for SARS-CoV-2,^[7] yielding a stoichiometry of roughly 1.5 to 2 times as much pp1a as pp1ab protein expressed.
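The link between frameshift efficiency and polyprotein stoichiometry can be sketched numerically. The snippet below is an illustration only, under a deliberately simple assumption: every ribosome that translates ORF1a either terminates at the ORF1a stop codon (producing pp1a) or frameshifts and continues into ORF1b (producing pp1ab), with ribosome drop-off and protein turnover ignored. The function name `polyprotein_ratios` and its dictionary keys are hypothetical and do not come from any cited source.

```python
def polyprotein_ratios(frameshift_efficiency: float) -> dict:
    """Return simple-model ratios for a given -1 frameshift efficiency (0 < e <= 1).

    Assumption (not from the source): each ribosome yields exactly one product,
    pp1ab with probability e and pp1a with probability 1 - e.
    """
    e = frameshift_efficiency
    if not 0.0 < e <= 1.0:
        raise ValueError("frameshift efficiency must be in the interval (0, 1]")
    return {
        # Fraction of ribosomes that stop at the end of ORF1a.
        "pp1a_fraction": 1.0 - e,
        # Fraction of ribosomes that frameshift and translate ORF1b as well.
        "pp1ab_fraction": e,
        # All ribosomes translate ORF1a; only a fraction e translate ORF1b.
        "orf1a_to_orf1b": 1.0 / e,
        # Ratio of terminated (pp1a) to frameshifted (pp1ab) products.
        "pp1a_to_pp1ab": (1.0 - e) / e,
    }


# Efficiencies spanning the ranges quoted in the text.
for e in (0.20, 0.45, 0.50, 0.70):
    r = polyprotein_ratios(e)
    print(f"e={e:.2f}  ORF1a:ORF1b={r['orf1a_to_orf1b']:.2f}  "
          f"pp1a:pp1ab={r['pp1a_to_pp1ab']:.2f}")
```

In this model, an efficiency of 45-70% corresponds to an ORF1a-to-ORF1b translation ratio (1/e) of about 1.4 to 2.2, which is the quantity closest to the "1.5 to 2" figure quoted above. Measured ratios in infected cells can differ, since the model omits everything except the frameshift itself.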

Processing

Depending on the virus, the polyproteins pp1a and pp1ab contain about 13 to 17 nonstructural proteins. They undergo auto-proteolysis, releasing the nonstructural proteins through the action of internal cysteine protease domains.

In coronaviruses, there are a total of 16 nonstructural proteins; the pp1a protein contains nonstructural proteins nsp1-11 and the pp1ab protein contains nsp1-10 and nsp12-16. Proteolytic processing is performed by two proteases: the papain-like protease domain located in the multidomain protein nsp3 cleaves the junctions up to nsp4, and the 3CL protease (also known as the main protease, nsp5) performs the remaining cleavages, from nsp5 through the polyprotein C-terminus. Proteins nsp12-16, the C-terminal components of the pp1ab polyprotein, contain the core enzymatic activities necessary for viral replication. After proteolytic processing, several of the nonstructural proteins assemble into a large protein complex known as the replicase-transcriptase complex (RTC), which performs genome replication and transcription.
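The composition of the two coronavirus polyproteins described above can be written down as a small lookup. This is a minimal sketch that encodes only the nsp ranges stated in the text; the names `PP1A`, `PP1AB`, and `polyprotein_sources` are hypothetical, chosen for illustration.

```python
# nsp ranges as stated in the text: pp1a holds nsp1-11, pp1ab holds nsp1-10 and nsp12-16.
PP1A = [f"nsp{i}" for i in range(1, 12)]
PP1AB = [f"nsp{i}" for i in range(1, 11)] + [f"nsp{i}" for i in range(12, 17)]


def polyprotein_sources(nsp: str) -> list:
    """Return which polyprotein(s) a given nonstructural protein can be released from."""
    sources = []
    if nsp in PP1A:
        sources.append("pp1a")
    if nsp in PP1AB:
        sources.append("pp1ab")
    if not sources:
        raise ValueError(f"{nsp!r} is not one of nsp1 through nsp16")
    return sources


# nsp1-10 are shared; nsp11 is unique to pp1a; nsp12-16 are unique to pp1ab.
for name in ("nsp5", "nsp11", "nsp12"):
    print(name, polyprotein_sources(name))

# Together the two polyproteins account for all sixteen nonstructural proteins.
print(len(set(PP1A) | set(PP1AB)))
```

The lookup makes the consequence of the frameshift explicit: nsp11 is produced only when translation stops at the end of ORF1a, while nsp12-16 are produced only when the ribosome frameshifts into ORF1b.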

Components

Core replicase domains

A set of five conserved "core replicase" protein domains is present in all nidovirus lineages (arteriviruses, mesoniviruses, roniviruses, and coronaviruses): from ORF1a, the main protease flanked on either end by transmembrane domains; and from ORF1b, a nucleotidyltransferase domain known as NiRAN, RNA-dependent RNA polymerase (RdRp), a zinc-binding domain, and a helicase.^[3] ^[8] (This is sometimes considered seven domains, counting the transmembrane regions separately.) In addition, an endoribonuclease domain is found in all nidoviruses that infect vertebrate hosts. Arteriviruses, which have smaller genomes than the other nidovirus lineages, lack both methyltransferases and the proofreading exoribonuclease, a domain that is conserved in nidoviruses with larger genomes. This proofreading functionality is thought to be required for sufficient fidelity to replicate large RNA genomes, but may also play additional roles in some viruses.

Coronaviruses

In coronaviruses, pp1a and pp1ab together contain sixteen nonstructural proteins, which have the following functions:^[9] ^[10]

Nonstructural proteins derived from coronavirus pp1a and pp1ab proteins
Nonstructural protein	Function
nsp1	Cellular mRNA degradation, host cell translation inhibition, interferon inhibition; not present in Gammacoronavirus
nsp2	Unknown; binds prohibitin
nsp3	Multi-domain protein with one or two papain-like protease domains for polyprotein processing; interferon antagonist; multiple other roles
nsp4	Double-membrane vesicle formation
nsp5	3CL protease for polyprotein processing; interferon inhibition
nsp6	Double-membrane vesicle formation
nsp7	Cofactor and processivity factor for RdRp; forms complex with nsp8 and nsp12
nsp8	Cofactor and processivity factor for RdRp; forms complex with nsp7 and nsp12
nsp9	Single-stranded RNA binding
nsp10	Cofactor for nsp14 and nsp16
nsp11	Unknown
nsp12	RNA-dependent RNA polymerase (RdRp) and nucleotidyltransferase
nsp13	Helicase and RNA triphosphatase
nsp14	Proofreading exonuclease, RNA cap formation, guanosine N7-methyltransferase
nsp15	Endoribonuclease, immune evasion function
nsp16	Ribose 2'-O-methyltransferase, RNA cap formation

Evolution

The structure and organization of the genome, including ORF1a, ORF1b, and the frameshift separating them, are conserved among nidoviruses. Some "non-canonical" nidovirus structures have been described, mainly involving gene fusions. The largest known nidovirus, planarian secretory cell nidovirus (PSCNV), with a 41 kb genome, has a non-canonical genome structure in which ORF1a, ORF1b, and the downstream ORFs encoding structural proteins are fused and expressed as a single large ORF encoding a polyprotein of over 13,000 amino acids.^[11] In these non-canonical genomes, other frameshift locations or stop codon readthrough may be used to regulate the stoichiometry of viral proteins.

Nidoviruses vary widely in genome size, from arteriviruses with typically 12-15 kb genomes to coronaviruses at 27-32 kb. Their evolutionary history is of research interest for understanding how very large RNA genomes can be replicated despite the relatively low-fidelity replication mechanism of the viral RNA-dependent RNA polymerase (RdRp). The larger nidovirus genomes (above around 20 kb) encode a proofreading exoribonuclease (nsp14 in coronaviruses) thought to be required for replication fidelity.

Among coronaviruses, ORF1ab is more highly conserved than the 3' ORFs encoding structural proteins. Throughout the COVID-19 pandemic, the genome of SARS-CoV-2 viruses has been sequenced many times, resulting in identification of thousands of distinct variants. In an analysis published in the Bulletin of the World Health Organization in July 2020, ORF1ab was the most frequently mutated gene, followed by the S gene encoding the spike protein. The most commonly mutated protein within ORF1ab was papain-like protease (nsp3), and the single most commonly observed missense mutation was in RNA-dependent RNA polymerase.^[12] Some PCR tests for COVID-19 target the ORF1ab gene, among others.^[13]

Notes and References

1. Hartenian E, Nandakumar D, Lari A, Ly M, Tucker JM, Glaunsinger BA. The molecular virology of coronaviruses. The Journal of Biological Chemistry. 295 (37): 12910–12934. September 2020. PMID 32661197. PMC 7489918. doi:10.1074/jbc.REV120.013930.
2. V'kovski P, Kratzel A, Steiner S, Stalder H, Thiel V. Coronavirus biology and replication: implications for SARS-CoV-2. Nature Reviews. Microbiology. 19 (3): 155–170. March 2021. PMID 33116300. PMC 7592455. doi:10.1038/s41579-020-00468-6.
3. Posthuma CC, Te Velthuis AJ, Snijder EJ. Nidovirus RNA polymerases: Complex enzymes handling exceptional RNA genomes. Virus Research. 234: 58–73. April 2017. PMID 28174054. PMC 7114556. doi:10.1016/j.virusres.2017.01.023.
4. Gulyaeva AA, Gorbalenya AE. A nidovirus perspective on SARS-CoV-2. Biochemical and Biophysical Research Communications. 538: 24–34. January 2021. PMID 33413979. PMC 7664520. doi:10.1016/j.bbrc.2020.11.015.
5. Wang D, Jiang A, Feng J, Li G, Guo D, Sajid M, Wu K, Zhang Q, Ponty Y, Will S, Liu F, Yu X, Li S, Liu Q, Yang XL, Guo M, Li X, Chen M, Shi ZL, Lan K, Chen Y, Zhou Y. The SARS-CoV-2 subgenome landscape and its novel regulatory features. Molecular Cell. 81 (10): 2135–2147.e5. May 2021. PMID 33713597. PMC 7927579. doi:10.1016/j.molcel.2021.02.036.
6. Irigoyen N, Firth AE, Jones JD, Chung BY, Siddell SG, Brierley I. High-Resolution Analysis of Coronavirus Gene Expression by RNA Sequencing and Ribosome Profiling. PLOS Pathogens. 12 (2): e1005473. February 2016. PMID 26919232. PMC 4769073. doi:10.1371/journal.ppat.1005473.
7. Finkel Y, Mizrahi O, Nachshon A, Weingarten-Gabbay S, Morgenstern D, Yahalom-Ronen Y, Tamir H, Achdout H, Stein D, Israeli O, Beth-Din A, Melamed S, Weiss S, Israely T, Paran N, Schwartz M, Stern-Ginossar N. The coding capacity of SARS-CoV-2. Nature. 589 (7840): 125–130. January 2021. PMID 32906143. doi:10.1038/s41586-020-2739-1. S2CID 221624633. Bibcode 2021Natur.589..125F.
8. Ogando NS, Ferron F, Decroly E, Canard B, Posthuma CC, Snijder EJ. The Curious Case of the Nidovirus Exoribonuclease: Its Role in RNA Synthesis and Replication Fidelity. Frontiers in Microbiology. 10: 1813. 7 August 2019. PMID 31440227. PMC 6693484. doi:10.3389/fmicb.2019.01813.
9. Rohaim MA, El Naggar RF, Clayton E, Munir M. Structural and functional insights into non-structural proteins of coronaviruses. Microbial Pathogenesis. 150: 104641. January 2021. PMID 33242646. PMC 7682334. doi:10.1016/j.micpath.2020.104641.
10. Chen Y, Liu Q, Guo D. Emerging coronaviruses: Genome structure, replication, and pathogenesis. Journal of Medical Virology. 92 (4): 418–423. April 2020. PMID 31967327. PMC 7167049. doi:10.1002/jmv.25681.
11. Saberi A, Gulyaeva AA, Brubacher JL, Newmark PA, Gorbalenya AE. A planarian nidovirus expands the limits of RNA genome size. PLOS Pathogens. 14 (11): e1007314. November 2018. PMID 30383829. PMC 6211748. doi:10.1371/journal.ppat.1007314. S2CID 53872740.
12. Koyama T, Platt D, Parida L. Variant analysis of SARS-CoV-2 genomes. Bulletin of the World Health Organization. 98 (7): 495–504. July 2020. PMID 32742035. PMC 7375210. doi:10.2471/BLT.20.253591.
13. Richardson, Robin. Open Wide. The Marshall News Messenger. August 22, 2021. pp. A1, A2. Retrieved 21 November 2022.