Glutamate-rich protein 2 is a protein that in humans is encoded by the ERICH2 gene. It is expressed most strongly in male tissues, particularly the testes, where it localizes to the nucleolar fibrillar center and to vesicles.[1] Its known protein interactions suggest that it may play a role in histone modification and proper histone function.
ERICH2 is located on human chromosome 2, at 2q31.1.[2] It contains 10 exons. The gene is 28,930 base pairs long and is flanked by the EIF2S2P4 and GAD1 genes. There are no known paralogs of the ERICH2 gene.
ERICH2 transcription produces three validated mRNA variants. The longest transcript, variant 1, is 1,388 base pairs long, 1,311 of which are coding. Variant 2 differs from variant 1 in its 5' UTR; it also has coding sequence differences and a distinct N-terminus. Variant 3 lacks several exons and has a distinct 3' UTR and C-terminal coding region. It is also the shortest of the three, at 1,063 base pairs.
The ERICH2 protein is 436 amino acids long and has a molecular weight of approximately 48 kDa,[3] with an isoelectric point of approximately 5. The protein is rich in the amino acid proline and low in tyrosine and glycine.
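The transcript and protein figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the 1,311 coding bases include the stop codon and uses a rule-of-thumb average residue mass of about 110 Da; both are assumptions for illustration, not values taken from the cited sources.

```python
# Rough consistency check of the lengths and mass quoted in the text.
CODING_BP = 1311        # coding bases in transcript variant 1 (from the text)
PROTEIN_AA = 436        # protein length in residues (from the text)
AVG_RESIDUE_DA = 110.0  # ASSUMPTION: rule-of-thumb average residue mass in daltons

codons = CODING_BP // 3              # number of codons in the coding sequence
residues_from_cds = codons - 1       # ASSUMPTION: the final codon is a stop codon
approx_mass_kda = PROTEIN_AA * AVG_RESIDUE_DA / 1000.0

print(f"residues implied by coding sequence: {residues_from_cds}")
print(f"approximate molecular weight: {approx_mass_kda:.1f} kDa")
```

Under these assumptions the coding length agrees with a 436-residue protein, and the estimated mass comes to roughly 48 kDa (about 48,000 Da). A composition-based calculation from the actual sequence would give a more precise figure.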
Two known motifs are found in the human ERICH2 protein. The KKNT motif functions in cAMP- and cGMP-dependent protein phosphorylation; this motif is found only in primates.[4] An FGRR motif, conserved in mammals, is defined as an amidation site.[5] Finally, the ERICH2 protein contains the PHA03247 domain, which is 32 amino acids long. This domain is not generally conserved among orthologs, and its function is unknown. It is also present in the proteins that make up the herpes virion.[6]
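Short motifs of this kind are commonly described with PROSITE-style patterns: `[RK](2)-x-[ST]` for the cAMP- and cGMP-dependent protein kinase phosphorylation site and `x-G-[RK]-[RK]` for the amidation site. The following is a minimal sketch of how such patterns can be scanned for with regular expressions. The function name `find_motifs` and the peptide `TOY_SEQUENCE` are hypothetical; the peptide is a made-up example that happens to contain KKNT and FGRR, not the ERICH2 sequence.

```python
import re

# PROSITE-style patterns written as regular expressions.
# The lookahead (?=(...)) lets overlapping matches be reported.
PATTERNS = {
    "camp_cgmp_phosphorylation_site": re.compile(r"(?=([RK]{2}.[ST]))"),
    "amidation_site": re.compile(r"(?=(.G[RK][RK]))"),
}


def find_motifs(sequence):
    """Return {pattern name: [(1-based start position, matched residues), ...]}."""
    hits = {}
    for name, pattern in PATTERNS.items():
        hits[name] = [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]
    return hits


# HYPOTHETICAL toy peptide for illustration only; not the ERICH2 sequence.
TOY_SEQUENCE = "MAEEKKNTPLSEEFGRRAPE"

hits = find_motifs(TOY_SEQUENCE)
for name, matches in hits.items():
    print(name, matches)
```

Pattern matches of this sort only flag candidate sites. Short motifs occur frequently by chance, so a hit is a prediction rather than evidence of modification.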
Secondary structure prediction shows one alpha helix and one beta strand. The alpha helix spans the entire conserved section, as seen in the cartoon of the ERICH2 protein. The beta strand is predicted 12 amino acids downstream of the amidation site and spans 4 amino acids.[7] Four nuclear localization signals were found in the protein: two pat4 signals and two pat7 signals, whose locations are shown in the cartoon.[8] The protein is predicted to reside in the nucleus with a likelihood of 78%.
ERICH2 is not ubiquitously expressed. Its expression is largely restricted to the choroid plexus of the developing fetus and the testes of adults.[9] Expression is also detected in lung and female tissues, but at much lower levels.[10] The protein localizes specifically to the nucleolar fibrillar center and to vesicles within cells.[11]
Many phosphorylation sites are predicted for the ERICH2 protein, all on serines and threonines and none on tyrosines.[12][13] An acetylation site is also predicted at the N-terminus of the protein, on the third amino acid.[14] Many transcription factors of the SOX/SRY sex/testis-determining and related HMG box family, as well as estrogen-related transcription factors, are predicted to bind and regulate transcription of ERICH2.[15]
ERICH2 interacts with proteins in the H2A family.[16] H2A proteins form part of the histone octamer. In particular, ERICH2 is known to interact with the H2AFY protein, which plays a key role in stable X chromosome inactivation and can replace a conventional H2A in certain nucleosomes, thereby repressing transcription.[17]
ERICH2 is also known to interact with the protein SDCB1, which functions in vesicle trafficking and in the regulation of growth and proliferation of certain cancer cells.[18]
The IWS1 protein also interacts with ERICH2. IWS1 functions as a transcription factor and plays a key role in defining the composition of the RNA polymerase II elongation complex.[19] This complex in turn plays a role in histone modification and proper splicing.
Two-hybrid assays and other protein interaction methods have shown an interaction with the PSORS1C2 protein, but the function of this protein remains unknown.
No paralogs for the ERICH2 protein are known. ERICH2 has 124 known orthologs spanning multiple taxa.
Genus and Species | Common Name | Date of Divergence (MYA)[20] | Sequence Length (aa) | Sequence Identity | Sequence Similarity |
---|---|---|---|---|---|
Homo sapiens | Human | 0 | 436 | -- | -- |
Rousettus aegyptiacus | Egyptian Fruit Bat | 94 | 430 | 58% | 63% |
Propithecus coquereli | Coquerel's Sifaka | 74 | 323 | 54% | 60% |
Mus musculus | Mouse | 90 | 463 | 47% | 56% |
Ursus maritimus | Polar Bear | 94 | 296 | 50% | 53% |
Alligator mississippiensis | American Alligator | 320 | 370 | 28% | 38% |
Thamnophis sirtalis | Common Garter Snake | 320 | 309 | 27% | 37% |
Callorhinchus milii | Australian Ghost Shark | 465 | 319 | 22% | 33% |
Danio rerio | Zebrafish | 432 | 310 | 24% | 30% |
Strongylocentrotus purpuratus | Purple Sea Urchin | 627 | 470 | 23% | 25% |
Crassostrea gigas | Pacific Oyster | 758 | 293 | 17% | 22% |
Bemisia tabaci | Silverleaf Whitefly | 758 | 213 | 13% | 14% |
Trichoplax adhaerens | Trichoplax | 930 | 164 | 12% | 15% |
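
The table suggests that sequence identity to the human protein falls as divergence time increases. As a rough illustration, the sketch below computes a Pearson correlation between the two columns using only the values listed in the table. The names `ORTHOLOGS` and `pearson` are introduced here for illustration, and the calculation is a simple descriptive check, not a phylogenetic analysis.

```python
from math import sqrt

# (species, divergence from human in MYA, sequence identity to human in %)
# Values copied from the ortholog table; the human row is omitted because
# it has no identity value.
ORTHOLOGS = [
    ("Rousettus aegyptiacus", 94, 58),
    ("Propithecus coquereli", 74, 54),
    ("Mus musculus", 90, 47),
    ("Ursus maritimus", 94, 50),
    ("Alligator mississippiensis", 320, 28),
    ("Thamnophis sirtalis", 320, 27),
    ("Callorhinchus milii", 465, 22),
    ("Danio rerio", 432, 24),
    ("Strongylocentrotus purpuratus", 627, 23),
    ("Crassostrea gigas", 758, 17),
    ("Bemisia tabaci", 758, 13),
    ("Trichoplax adhaerens", 930, 12),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


dates = [row[1] for row in ORTHOLOGS]
identities = [row[2] for row in ORTHOLOGS]
r = pearson(dates, identities)
print(f"Pearson r between divergence date and identity: {r:.2f}")
```

A strongly negative coefficient is consistent with more distantly related species having less similar ERICH2 sequences, although twelve hand-picked orthologs are too few to support firm conclusions about evolutionary rate.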