Adenovirus genomes are linear, non-segmented double-stranded (ds) DNA molecules that are typically 26-46 kbp long and contain 23-46 protein-coding genes.[1] The example used for the following description is Human adenovirus E, a mastadenovirus with a 36 kbp genome containing 38 protein-coding genes.[2] While the precise number and identity of genes vary among adenoviruses, the basic principles of genome organization and the functions of most of the genes described in this article are shared among all adenoviruses.
The 38 genes in the Human adenovirus E genome are organized in 17 transcription units, each containing 1-8 coding sequences.[3] Alternative splicing of the pre-mRNA produced by each transcription unit enables multiple different mRNAs to be produced from a single transcription unit.
The E1A, E1B, E2A, E2B, E3, and E4 transcription units are successively transcribed early in the viral reproductive cycle. The proteins encoded by genes within these transcription units are mostly involved in regulation of viral transcription, in replication of viral DNA, and in suppression of the host response to infection.[4]
The L1-L5 transcription units are transcribed later in the viral reproductive cycle, and code mostly for proteins that make up components of the viral capsid or are involved in assembly of the capsid. The L1-L5 transcription units are all regulated by the same promoter region and share the same transcription start site. As a result, transcription of all five late transcription units begins at the same point in the viral reproductive cycle.[5]
Transcription of pre-mRNAs beginning at the late promoter is randomly terminated at one of five termination sites, producing a population of transcripts of five different lengths. The pre-mRNAs of any given length are then alternatively spliced to produce 1-4 different mRNAs coding for a corresponding number of proteins.
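The two-step process described above (one of five termination sites is used, then the resulting pre-mRNA is alternatively spliced) can be illustrated with a toy simulation. This is only a sketch under loud assumptions: the gene lists per late unit are taken from the table of Human adenovirus E genes in this article, while the uniform probabilities, the one-mRNA-per-protein simplification, and the names `LATE_UNITS` and `sample_late_mrna` are hypothetical and not drawn from the cited sources.

```python
import random
from collections import Counter

# Proteins listed under each late transcription unit in this article's table.
LATE_UNITS = {
    "L1": ["protein 13.6K", "encapsidation protein 52K",
           "capsid protein precursor pIIIa"],
    "L2": ["penton base (capsid protein III)", "core protein precursor pVII",
           "core protein V", "core protein precursor pX"],
    "L3": ["capsid protein precursor pVI", "hexon (capsid protein II)",
           "protease"],
    "L4": ["hexon assembly protein 100K", "protein 33K",
           "encapsidation protein 22K", "capsid protein precursor pVIII"],
    "L5": ["fiber (capsid protein IV)"],
}


def sample_late_mrna(rng):
    """Pick a termination site, then one splice product of that transcript.

    ASSUMPTION: both choices are uniform. Real usage frequencies of the
    termination sites and splice sites are not given in the article.
    """
    unit = rng.choice(list(LATE_UNITS))      # step 1: which termination site
    protein = rng.choice(LATE_UNITS[unit])   # step 2: which spliced mRNA
    return unit, protein


rng = random.Random(0)  # fixed seed so repeated runs give the same tally
counts = Counter(sample_late_mrna(rng) for _ in range(1000))
for (unit, protein), n in sorted(counts.items()):
    print(f"{unit}\t{protein}\t{n}")
```

In this model every transcript starts at the shared late promoter, so the only sources of diversity are the termination site and the splice choice, which mirrors the description above.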
The names, locations, and properties of the 38 protein-coding genes in the Human adenovirus E genome are given in the following table.[6][7]
Protein name | Protein identifier | Transcription unit | Start base | Stop base | Strand | Length (amino acids)
---|---|---|---|---|---|---
control protein E1A | YP_068018.1 | E1A | 576 | 1441 | + | 257
control protein E1B 19K | YP_068019.1 | E1B | 1600 | 2115 | + | 171
control protein E1B 55K | YP_068020.1 | E1B | 1905 | 3356 | + | 483
capsid protein IX | YP_068021.1 | IX | 3441 | 3869 | + | 142
encapsidation protein IVa2 | YP_068022.1 | IVa2 | 3930 | 5554 | - | 448
DNA polymerase | YP_068023.1 | E2B | 5033 | 13773 | - | 1193
protein 13.6K | YP_001661328.1 | L1 | 7814 | 9476 | + | 139
terminal protein precursor pTP | YP_068024.1 | E2B | 8404 | 13773 | - | 642
encapsidation protein 52K | YP_068025.1 | L1 | 10765 | 11937 | + | 390
capsid protein precursor pIIIa | YP_068026.1 | L1 | 11961 | 13736 | + | 591
penton base (capsid protein III) | YP_068027.1 | L2 | 13815 | 15422 | + | 535
core protein precursor pVII | YP_068028.1 | L2 | 15426 | 16007 | + | 193
core protein V | YP_068029.1 | L2 | 16055 | 17080 | + | 341
core protein precursor pX | YP_068030.1 | L2 | 17103 | 17336 | + | 77
capsid protein precursor pVI | YP_068031.1 | L3 | 17413 | 18141 | + | 242
hexon (capsid protein II) | YP_068032.1 | L3 | 18248 | 21058 | + | 936
protease | YP_068033.1 | L3 | 21082 | 21702 | + | 206
single-stranded DNA-binding protein | YP_068034.1 | E2A-L | 21774 | 23312 | - | 512
hexon assembly protein 100K | YP_068035.1 | L4 | 23341 | 25716 | + | 791
protein 33K | YP_068036.1 | L4 | 25439 | 26252 | + | 214
encapsidation protein 22K | YP_068037.1 | L4 | 25439 | 25978 | + | 179
capsid protein precursor pVIII | YP_068038.1 | L4 | 26321 | 27004 | + | 227
control protein E3 12.5K | YP_068039.1 | E3 | 27005 | 27325 | + | 106
membrane glycoprotein E3 CR1-alpha | YP_068040.1 | E3 | 27279 | 27911 | + | 210
membrane glycoprotein E3 gp19K | YP_068041.1 | E3 | 27893 | 28417 | + | 174
membrane glycoprotein E3 CR1-beta | YP_068042.1 | E3 | 28449 | 29111 | + | 220
membrane glycoprotein E3 CR1-delta | YP_068043.1 | E3 | 29440 | 30264 | + | 274
membrane protein E3 RID-alpha | YP_068044.1 | E3 | 30273 | 30548 | + | 91
membrane protein E3 RID-beta | YP_068045.1 | E3 | 30554 | 30994 | + | 146
control protein E3 14.7K | YP_068046.1 | E3 | 30987 | 31388 | + | 133
protein U | YP_068047.1 | U | 31481 | 31632 | - | 50
fiber (capsid protein IV) | YP_068048.1 | L5 | 31649 | 32926 | + | 425
control protein E4orf6/7 | YP_068049.1 | E4 | 33022 | 34169 | - | 141
control protein E4 34K | YP_068050.1 | E4 | 33270 | 34169 | - | 299
control protein E4orf4 | YP_068051.1 | E4 | 34072 | 34440 | - | 122
control protein E4orf3 | YP_068052.1 | E4 | 34449 | 34802 | - | 117
control protein E4orf2 | YP_068053.1 | E4 | 34799 | 35188 | - | 129
control protein E4orf1 | YP_068054.1 | E4 | 35236 | 35610 | - | 124
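The coordinates and protein lengths in the table can be cross-checked with simple arithmetic. The sketch below assumes that start and stop bases are 1-based and inclusive and that the stop position includes the stop codon; neither convention is stated in the table, so treat both as assumptions. Under them, a gene whose coding sequence is one uninterrupted open reading frame should span exactly 3 × (length + 1) bases. A longer span is consistent with the spliced transcripts described earlier, although this check alone does not demonstrate splicing. The names `GENES`, `span_nt`, and `is_contiguous_orf` are illustrative only.

```python
# (protein name, start base, stop base, length in amino acids),
# a few rows copied from the table above.
GENES = [
    ("control protein E1A", 576, 1441, 257),
    ("control protein E1B 19K", 1600, 2115, 171),
    ("capsid protein IX", 3441, 3869, 142),
    ("encapsidation protein IVa2", 3930, 5554, 448),
    ("hexon (capsid protein II)", 18248, 21058, 936),
    ("control protein E4orf6/7", 33022, 34169, 141),
]


def span_nt(start, stop):
    """Number of bases covered, assuming 1-based inclusive coordinates."""
    return stop - start + 1


def is_contiguous_orf(start, stop, length_aa):
    """True if the span equals the codons for the protein plus one stop codon."""
    return span_nt(start, stop) == 3 * (length_aa + 1)


for name, start, stop, length_aa in GENES:
    kind = ("contiguous ORF" if is_contiguous_orf(start, stop, length_aa)
            else "span longer than coding length")
    print(f"{name}: {span_nt(start, stop)} nt for {length_aa} aa -> {kind}")
```

For example, capsid protein IX covers 3869 − 3441 + 1 = 429 bases, which equals 3 × (142 + 1), whereas control protein E1A covers 866 bases but needs only 3 × (257 + 1) = 774 for its coding sequence.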
The functions of many adenovirus proteins are known:[5]