C2orf81 Explained

C2orf81 is a human gene that encodes the protein c2orf81, which is predicted to localize to the nucleus.

Gene

C2orf81's aliases are LOC388963 and hCG40743.[1] The gene spans from bases 74,414,176 to 74,421,591 on the minus (-) strand of chromosome 2, and contains 4 exons. The coding region is 2086 base pairs, and the protein sequence contains 615 amino acids.[2]
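As a quick consistency check on the figures above: a 615-residue protein needs 615 codons plus a stop codon, which is 1,848 bases of coding sequence. The reported 2086 bp is larger than that, which would be consistent with the figure also counting untranslated regions, although the source does not say so. A minimal sketch of the arithmetic follows; the function and variable names are illustrative and not taken from any cited tool.

```python
def cds_length_nt(n_residues: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode n_residues amino acids (3 per codon)."""
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


protein_length = 615    # amino acids, as stated in the text
reported_length = 2086  # base pairs, as stated in the text

# Coding bases implied by the protein length, including the stop codon.
expected = cds_length_nt(protein_length)

# Bases in the reported figure that the protein length does not account for.
extra = reported_length - expected

if __name__ == "__main__":
    print(f"expected coding length: {expected} nt")
    print(f"unaccounted bases:      {extra} nt")
```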

Expression

The protein encoded by c2orf81 is highly expressed in the testis, kidneys, and about 18 other human tissues.[3] Disease states in which it is expressed include glioma, neoplasm, and lymphoma.[4]

Transcription Variants

Only a few mutations have been documented in c2orf81. Three common missense mutations occur in the 3’ UTR and in the coding sequence, changing serine to leucine in the protein. Nonsense mutations have also been documented, occurring exclusively in the codon for proline.

mRNA

The mRNA sequence contains 2086 base pairs and has 4 isoforms.

Protein

Properties and Composition

C2orf81 has a molecular weight of 66.6 kDa and an isoelectric point of 5.32.[5] The human protein and most mammalian homologs are rich in proline, whereas non-mammalian vertebrate homologs have a higher glutamic acid content.[6] C2orf81 has 4 isoforms; the most common isoform contains 615 amino acids, and isoforms 2 through 4 contain 566, 520, and 588 amino acids, respectively. C2orf81 is the only member of superfamily cl25621.[7]
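The reported molecular weight can be sanity-checked against the common rule of thumb of roughly 110 Da per amino acid residue. This is only an approximation and is not the method used by the cited calculator, which works from the actual composition. The constant and function names below are assumptions made for illustration.

```python
# Rule-of-thumb average residue mass in daltons; an assumption, not a value from the source.
AVG_RESIDUE_MASS_DA = 110.0

# Isoform lengths in amino acids, as stated in the text.
ISOFORM_LENGTHS = {1: 615, 2: 566, 3: 520, 4: 588}


def approx_mass_kda(n_residues: int, avg_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Rough protein mass in kDa from the residue count alone."""
    return n_residues * avg_mass_da / 1000.0


# Estimated mass for each isoform, keyed by isoform number.
estimates = {iso: approx_mass_kda(n) for iso, n in ISOFORM_LENGTHS.items()}

if __name__ == "__main__":
    for iso, mass in estimates.items():
        print(f"isoform {iso}: {ISOFORM_LENGTHS[iso]} aa, about {mass:.1f} kDa")
```

For the 615-residue isoform the estimate is about 67.7 kDa, close to the reported 66.6 kDa.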

Domains

Domain of unknown function (DUF) 4639 is unique to the c2orf81 protein and is conserved in eukaryotes.[8] DUF 4639 spans from amino acid 17 to the end of the protein in human c2orf81.
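To put the domain span in proportion, the fraction of the protein it covers can be computed directly from the residue numbers. The sketch below assumes 1-based, inclusive coordinates and takes "the end of the protein" to be residue 615 of the most common isoform; the function name is illustrative.

```python
def domain_coverage(start: int, end: int, protein_length: int) -> float:
    """Fraction of a protein covered by a domain spanning residues start..end (1-based, inclusive)."""
    if not 1 <= start <= end <= protein_length:
        raise ValueError("domain coordinates must lie within the protein")
    return (end - start + 1) / protein_length


# DUF4639 in human c2orf81: residue 17 to the C-terminus (residue 615 assumed).
coverage = domain_coverage(17, 615, 615)

if __name__ == "__main__":
    print(f"DUF4639 covers {coverage:.1%} of the protein")
```

Under these assumptions the domain covers 599 of 615 residues, or about 97.4% of the protein.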

Subcellular Localization

C2orf81 is primarily predicted to be nuclear, but potentially also cytoplasmic and mitochondrial.[9]

Interacting proteins

The c2orf81 protein is predicted to interact strongly with enoyl-CoA hydratase and hydroxyacyl-CoA dehydrogenase, based on text mining and database searches.[10] Other predicted interacting proteins are acetyl-CoA carboxylases A and B, glycine dehydrogenase, and 3-oxoacid CoA transferase 2.

Structure

The c2orf81 protein is composed mainly of alpha helices. It contains fewer beta pleated sheets, turns, and coils.[11]

Function

Although the c2orf81 protein consists almost entirely of a domain of unknown function, the gene has been analyzed in a study of sites prone to DNA methylation. Another study found that c2orf81 overlaps with other genes.[12] Genes from its locus have been linked to Alström syndrome, cleft palate, neurodevelopmental delays, macrocephaly, and Perry syndrome.

Post-translational modifications

In human c2orf81, phosphorylation is predicted to occur only at serine residues, not at any threonines or tyrosines.[13] O-linked glycosylation is predicted to occur at 3 sites toward the C-terminus.[14] These sites are well conserved in all homologs. C2orf81 contains one potential SUMOylation site toward the end of the protein, with the sequence GKAE.[15]

Homology

Paralogs

C2orf81 was found to have one paralog, Homo sapiens BAC clone RP11-523H20.[16]

Homologs

The c2orf81 protein is highly conserved in primates and other mammals, but less so in non-mammalian vertebrates. Its most distant homolog is in the Asian swamp eel.[17] The table below lists homologs of c2orf81 with their dates of divergence and percent identity to the human c2orf81 protein sequence.

Species                     Date of divergence (mya)    Protein identity
Bonobo                      6.4                         99%
Gorilla                     8.61                        94%
Orangutan                   15.2                        95%
Macaque                     28.1                        92%
Lemur                       82                          72%
Mouse                       88                          52%
Minke whale                 94                          69%
Cow                         94                          66%
Pig                         94                          64%
Chinese softshell turtle    320                         69%
Ostrich                     320                         62%
American golden eagle       320                         42%
Asian swamp eel             432                         35%

Evolution

C2orf81 has evolved quickly over time.[18] The N-terminus of the protein has evolved more slowly than the rest of the protein.
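One rough way to examine the rate of change is to convert each percent identity in the homolog table into a Poisson-corrected distance, d = -ln(identity), and compare it with the divergence time. The sketch below is an illustration under that assumption, not the analysis behind the cited result. The values are transcribed from the table, and the function names are hypothetical.

```python
import math

# (species, divergence time in million years, fractional identity to human c2orf81),
# transcribed from the homolog table.
HOMOLOGS = [
    ("Bonobo", 6.4, 0.99),
    ("Gorilla", 8.61, 0.94),
    ("Orangutan", 15.2, 0.95),
    ("Macaque", 28.1, 0.92),
    ("Lemur", 82.0, 0.72),
    ("Mouse", 88.0, 0.52),
    ("Minke whale", 94.0, 0.69),
    ("Cow", 94.0, 0.66),
    ("Pig", 94.0, 0.64),
    ("Chinese softshell turtle", 320.0, 0.69),
    ("Ostrich", 320.0, 0.62),
    ("American golden eagle", 320.0, 0.42),
    ("Asian swamp eel", 432.0, 0.35),
]


def poisson_distance(identity: float) -> float:
    """Poisson-corrected distance in substitutions per site: d = -ln(identity)."""
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be in (0, 1]")
    return -math.log(identity)


def distances(homologs=HOMOLOGS):
    """Return (species, divergence time, corrected distance) for each homolog."""
    return [(name, mya, poisson_distance(ident)) for name, mya, ident in homologs]


if __name__ == "__main__":
    for name, mya, d in distances():
        print(f"{name:<26}{mya:>7.1f} mya  d = {d:.3f}")
```

Plotting d against divergence time and comparing the slope with that of well-characterized proteins would indicate how fast the protein is changing; no such comparison data are included here.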

References

  1. Gene Cards.
  2. NCBI Protein c2orf81.
  3. Seow, W. J., Kile, M. L., Baccarelli, A. A., Pan, W.-C., Byun, H.-M., Mostofa, G., Quamruzzaman, Q., Rahman, M., Lin, X. and Christiani, D. C. (2014), Epigenome-wide DNA methylation changes with development of arsenic-induced skin lesions in Bangladesh: A case–control follow-up study. Environ. Mol. Mutagen., 55: 449–456. doi:10.1002/em.21860
  4. Schuler Group. EST Profile - Hs.445377. www.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
  5. Kozlowski, Lukasz P. Calculation of Protein Isoelectric Point. isoelectric.org. Retrieved 2018-05-06.
  6. Composition/Molecular Weight Calculation [PIR - Protein Information Resource]. pir.georgetown.edu. Retrieved 2018-05-06. Archived 2018-01-31 at https://web.archive.org/web/20180131053638/http://pir.georgetown.edu/pirwww/search/comp_mw.shtml (original link dead).
  7. NIH/NLM/NCBI/IEB/CDD group. NCBI CDD Conserved Protein Domain DUF4639. www.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
  8. DUF4639. pfam.xfam.org. Retrieved 2018-05-06.
  9. PSORTII.
  10. STRING.
  11. Kumar, Prof. T. Ashok. CFSSP: Chou & Fasman Secondary Structure Prediction Server. www.biogem.org. Retrieved 2018-05-06.
  12. Figure 5: Genic alleles in the DAnc(YRI, Europe, UI) tail and overlapping genes. www.nature.com. Retrieved 2018-05-11.
  13. DISPHOS 1.3. www.dabi.temple.edu. Retrieved 2018-05-06. Archived 2018-05-13 at https://web.archive.org/web/20180513115637/http://www.dabi.temple.edu/disphos/pred/predict (original link dead).
  14. DictyOGlyc 1.1. www.cbs.dtu.dk. Retrieved 2018-05-06.
  15. SUMOplot™ Analysis Program, Abgent. www.abgent.com. Retrieved 2018-05-06.
  16. GeneCards Human Gene Database. C2orf81 Gene - GeneCards CB081 Protein CB081 Antibody. www.genecards.org. Retrieved 2018-05-06.
  17. BLAST: Basic Local Alignment Search Tool. blast.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
  18. TimeTree :: The Timescale of Life. www.timetree.org. Retrieved 2018-05-06.