C2orf81 is a human gene encoding protein c2orf81, which is predicted to have nuclear localization.
C2orf81's aliases are LOC388963 and hCG40743.[1] The gene spans bases 74,414,176 to 74,421,591 on the minus (-) strand of chromosome 2 and contains 4 exons. The mRNA is 2086 base pairs long, and the protein sequence contains 615 amino acids.[2]
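The arithmetic behind these figures can be checked with a minimal sketch. It assumes the coordinates are 1-based and inclusive (a common convention, not stated in the text) and that the coding sequence ends in a single stop codon; the function names are illustrative only.

```python
# Coordinates and protein length are taken from the text above.
GENE_START = 74_414_176
GENE_END = 74_421_591
PROTEIN_LENGTH_AA = 615


def genomic_span(start: int, end: int) -> int:
    """Length in base pairs of an inclusive, 1-based genomic interval (assumed convention)."""
    return end - start + 1


def min_cds_length(protein_length: int) -> int:
    """Nucleotides needed to encode a protein: 3 per residue plus one stop codon."""
    return 3 * protein_length + 3


span = genomic_span(GENE_START, GENE_END)
cds = min_cds_length(PROTEIN_LENGTH_AA)
print(f"Genomic span: {span} bp")
print(f"Minimum coding sequence for {PROTEIN_LENGTH_AA} aa: {cds} nt")
```

Under these assumptions the gene covers 7,416 bp of the chromosome, and a 615-residue protein requires 1,848 nucleotides of coding sequence.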
The protein encoded by c2orf81 is highly expressed in testis, kidney, and about 18 other human tissues.[3] Disease states in which it is expressed include gliomas, neoplasms, and lymphoma.[4]
Only a few mutations have been documented in c2orf81. Three common mutations occur in the 3' UTR and in the coding sequence; the missense changes replace serine with leucine in the protein. Nonsense mutations have also been documented, occurring exclusively in codons for proline.
The mRNA sequence contains 2086 base pairs and has 4 isoforms.
C2orf81 has a molecular weight of 66.6 kDa and an isoelectric point of 5.32.[5] The human protein and most mammalian homologs are rich in proline, whereas non-mammalian vertebrate homologs contain more glutamic acid residues.[6] C2orf81 has 4 isoforms; the most common contains 615 amino acids, and isoforms 2 through 4 contain 566, 520, and 588 amino acids, respectively. C2orf81 is the only member of superfamily cl25621.[7]
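Composition and mass figures of this kind can be roughly reproduced from a protein sequence. The sketch below is illustrative only: the toy peptide is hypothetical and is not the c2orf81 sequence, and the mass estimate uses the common rule of thumb of about 110 Da per residue rather than the method used by the cited tool.

```python
from collections import Counter


def residue_fractions(sequence: str) -> dict[str, float]:
    """Fraction of each amino acid (one-letter code) in a protein sequence."""
    seq = sequence.upper()
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def approx_mass_kda(length_aa: int, avg_residue_da: float = 110.0) -> float:
    """Rough mass estimate, assuming an average residue mass of ~110 Da."""
    return length_aa * avg_residue_da / 1000.0


# Hypothetical toy peptide used only to exercise the function.
toy_peptide = "MPPEPSPLAE"
fractions = residue_fractions(toy_peptide)
print(f"Proline fraction: {fractions['P']:.2f}")
print(f"Glutamic acid fraction: {fractions['E']:.2f}")

# Rough estimate for the 615-residue isoform described in the text.
print(f"Approximate mass: {approx_mass_kda(615):.1f} kDa")
```

The rule-of-thumb estimate for 615 residues comes out near 68 kDa, in the same range as the reported 66.6 kDa; an exact value depends on the actual residue composition.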
Domain of unknown function (DUF) 4639 is unique to the c2orf81 protein and is conserved in eukaryotes.[8] DUF 4639 spans from amino acid 17 to the end of the protein in human c2orf81.
C2orf81 is primarily predicted to be nuclear, but potentially also cytoplasmic and mitochondrial.[9]
C2orf81 protein is strongly predicted to interact with enoyl-CoA hydratase and hydroxyacyl-CoA dehydrogenase, based on text mining and database searches.[10] Other predicted interacting proteins are acetyl-CoA carboxylases A and B, glycine dehydrogenase, and 3-oxoacid CoA transferase 2.
The c2orf81 protein is composed mainly of alpha helices. It contains fewer beta-pleated sheets, turns, and coils.[11]
Although the c2orf81 protein consists almost entirely of a domain of unknown function, the gene has been analyzed in a study of sites prone to DNA methylation. Another study found that c2orf81 overlaps with other genes.[12] Genes at its locus have been linked to Alstrom syndrome, cleft palate, neurodevelopmental delays, macrocephaly, and Perry syndrome.
In human c2orf81, phosphorylation is predicted to occur only at serines, not at any threonines or tyrosines.[13] O-linked glycosylation is predicted to occur at 3 sites toward the C-terminus.[14] These sites are well conserved in all homologs. C2orf81 contains one potential SUMOylation site toward the end of the protein, with the sequence GKAE.[15]
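Locating a short motif such as GKAE, or listing candidate phosphorylation residues, amounts to a simple scan of the protein sequence. The sketch below is a minimal illustration on a hypothetical toy sequence (not the c2orf81 protein). It only lists positions; it does not reproduce the scoring performed by the prediction tools cited above.

```python
import re


def find_motif(sequence: str, motif: str = "GKAE") -> list[int]:
    """1-based start positions of every occurrence of a literal motif (overlaps allowed)."""
    pattern = f"(?={re.escape(motif)})"
    return [m.start() + 1 for m in re.finditer(pattern, sequence)]


def residue_positions(sequence: str, residues: str = "S") -> list[int]:
    """1-based positions of the given residues, e.g. "S" or "STY"."""
    return [i for i, aa in enumerate(sequence, start=1) if aa in residues]


# Hypothetical toy sequence used only to exercise the functions.
toy_sequence = "MSTAYSGKAEPL"
print(find_motif(toy_sequence))                # positions where GKAE starts
print(residue_positions(toy_sequence, "S"))    # serines
print(residue_positions(toy_sequence, "STY"))  # all S/T/Y candidates
```

A real phosphorylation or SUMOylation prediction also weighs the surrounding sequence context, so positions found this way are candidates only.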
C2orf81 was found to have one paralog, Homo sapiens BAC clone RP11-523H20.[16]
The c2orf81 protein is highly conserved in primates and other mammals, but less so in non-mammalian vertebrates. Its most distant homolog is in the Asian swamp eel.[17] The table below lists homologs of c2orf81 with their dates of divergence and percent identity to the c2orf81 protein sequence.
Species | Date of divergence (million years ago) | Percent identity
Bonobo | 6.4 | 99%
Gorilla | 8.61 | 94%
Orangutan | 15.2 | 95%
Macaque | 28.1 | 92%
Lemur | 82 | 72%
Mouse | 88 | 52%
Minke whale | 94 | 69%
Cow | 94 | 66%
Pig | 94 | 64%
Chinese softshell turtle | 320 | 69%
Ostrich | 320 | 62%
American golden eagle | 320 | 42%
Asian swamp eel | 432 | 35%
C2orf81 has evolved quickly over time.[18] The N-terminus of the protein has evolved less quickly than the rest of the protein.
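One common way to gauge evolutionary rate from a homolog table like the one above is to convert percent identity into a corrected divergence and compare it with divergence time. The sketch below uses the standard Poisson correction, d = -ln(identity fraction). It is an assumption that this matches the analysis behind the cited claim, and the divergence dates are assumed to be in millions of years.

```python
import math

# (species, divergence date, percent identity) copied from the table above.
HOMOLOGS = [
    ("Bonobo", 6.4, 99), ("Gorilla", 8.61, 94), ("Orangutan", 15.2, 95),
    ("Macaque", 28.1, 92), ("Lemur", 82, 72), ("Mouse", 88, 52),
    ("Minke whale", 94, 69), ("Cow", 94, 66), ("Pig", 94, 64),
    ("Chinese softshell turtle", 320, 69), ("Ostrich", 320, 62),
    ("American golden eagle", 320, 42), ("Asian swamp eel", 432, 35),
]


def corrected_divergence(percent_identity: float) -> float:
    """Poisson-corrected distance in substitutions per site: -ln(identity fraction)."""
    return -math.log(percent_identity / 100.0)


print(f"{'Species':26s}{'Date':>8s}{'Observed %':>12s}{'Corrected':>11s}")
for species, date, identity in HOMOLOGS:
    observed = 100 - identity  # percent of aligned positions that differ
    print(f"{species:26s}{date:8.2f}{observed:12d}{corrected_divergence(identity):11.3f}")
```

Plotting corrected divergence against date for c2orf81 alongside a slowly evolving and a rapidly evolving reference protein would show where it falls between them; such reference data are not given in the text and would need to be supplied.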