C16orf86 Explained

Uncharacterized protein C16orf86 is a protein that in humans is encoded by the C16orf86 gene.[1] It is composed mostly of alpha helices and is expressed mainly in the testes, and also in other tissues such as the kidney, colon, brain, fat, spleen, and liver. The function of C16orf86 is not well understood. It could be a transcription factor in the nucleus that regulates the G0/G1 phase of the cell cycle in tissues such as the kidney, brain, and skeletal muscle, as suggested by the DNA microarray data described below.

Function

The function of the C16orf86 protein is still not well understood. Based on the DNA microarray data and the post-translational modification data below, it could be a transcription factor in the nucleus that regulates the G0/G1 phase of the cell cycle in tissues such as the kidney, brain, and skeletal muscle.

Localization

Tissue

C16orf86 is highly expressed in the testes and is also expressed in the kidney, colon, brain, fat, spleen, and liver.

Microarray data for C16orf86 were found through NCBI UniGene and the linked GEO Profiles entries for C16orf86.[2] The data below show C16orf86 tissue expression patterns related to cell cycle regulation in kidney cells, colon cancer cells, and adipose tissue.

The first DNA microarray experiment was performed on MIF-deficient cells and control cells using cDNA.[3] [4] The results showed that the cytoplasmic protein MIF promotes cell proliferation and cell cycle progression in kidney cells such as HEK293.[5] When MIF is inhibited, p53 blocks G1/S phase progression. In addition, inhibition of E2F and AP1 together with activation of p53 contributes to cell cycle arrest at the G0/G1 phase in MIF-deficient cells. These transcription factors have binding sites in the C16orf86 promoter. E2F and AP1 are important for cell cycle progression; when MIF is inhibited they are blocked and p53 takes over. C16orf86 could therefore be important for cell cycle progression in the kidney, where it is expressed.

The second DNA microarray experiment used purified T98G glioblastoma cells that were cycling, arrested in G0, or released into S phase for 10 to 16 hours.[6] [7] The researchers tested how pRB, p107, and p130 repress E2F target genes and how the p130 complex interacts with DP, RB-like, and other E2F transcription factors to help modulate DREAM in cell cycle arrest.[8] The results showed that E2F4, along with p130 and other transcription factors, mediates the repression of cell cycle genes as cells pass from G1 to G0. On activation, during entry into S phase, E2F1/2/3 bind with other transcription factors to activate transcription of cell cycle genes. C16orf86 could be important for cell cycle progression in the brain because binding sites for E2F4 and E2F1/2/3 are located in its promoter sequence.

The third experiment used Infinium HumanMethylation450 BeadChip arrays with GWAS to determine the DNA methylation profiles of skeletal myoblasts at day 3, day 8, and day 15.[9] [10] These profiles were used to study myogenic cell differentiation.[11] The results showed that methylation patterns do affect myogenic cell differentiation. One of the transcription factors examined in this experiment, MYF6, has a binding site in the C16orf86 promoter. This transcription factor is expected to be down-regulated during muscle cell differentiation. This can be seen in the data: after the stimulus is first introduced, it never reaches its peak level. This could mean that C16orf86 is involved in muscle cell differentiation in skeletal myoblasts.

Subcellular

The C16orf86 protein is predicted to localize mainly to the nucleus, and also to the cytoplasm, mitochondria, and endoplasmic reticulum. These results were found using PSORT II, a protein tool available through Expasy. The human sequence was submitted to the tool, and the results were compared with those for its distant orthologs in the Weddell seal and red fox.[12] [13]

Gene

Location

C16orf86 (Chromosome 16 Open Reading Frame 86) is a gene found on the long arm of chromosome 16 at position 16q22.1. Its genomic sequence starts at base pair 67,667,030 and ends at base pair 67,668,590.[14] It lies on the positive strand and is read in the forward direction.
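
The length of this genomic span follows directly from the two coordinates. The sketch below assumes both coordinates are 1-based and inclusive, which is an assumption; the source does not state the convention. The function name is illustrative only.

```python
def span_length(start: int, end: int) -> int:
    """Length in base pairs of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text for C16orf86 on chromosome 16.
C16ORF86_START = 67_667_030
C16ORF86_END = 67_668_590

length_bp = span_length(C16ORF86_START, C16ORF86_END)
print(f"Genomic span: {length_bp} bp")
```

Under a 0-based, half-open convention the same coordinates would give a span one base shorter.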

C16orf86 is part of the ENKD1 region. This region contains 3 genes and encodes the ENKD1 protein along with its isoforms, ENKD1 isoform X1 and ENKD1 isoform X2.[15] Other genes located near C16orf86 are GFOD2 to the right and ACD and PARD6A to the left.

Exons and introns

C16orf86 has a total of 4 exons in its coding sequence. The first exon boundary falls between amino acids 34 and 35, between the bases G and T. The second exon boundary falls between amino acids 111 and 112, between the bases A and G. The third exon boundary falls between amino acids 184 and 186, between the bases C and G.

C16orf86 has a total of 3 introns separating these exons.

Length of coding gene

The C16orf86 protein is 317 amino acids long. Translation starts at amino acid 1, a methionine, and continues through amino acid 317, after which a stop codon ends the protein.[16]
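
The protein length also fixes the length of the coding sequence, since each amino acid is encoded by one three-base codon and a single stop codon follows the last residue. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def cds_length_nt(protein_length_aa: int, include_stop: bool = True) -> int:
    """Nucleotides in a coding sequence for a protein of the given length.

    Each amino acid uses one 3-base codon; the stop codon adds 3 more bases.
    """
    if protein_length_aa < 0:
        raise ValueError("protein length must be non-negative")
    codons = protein_length_aa + (1 if include_stop else 0)
    return codons * 3


# Protein length quoted in the text.
print(cds_length_nt(317))                      # with the stop codon
print(cds_length_nt(317, include_stop=False))  # coding codons only
```

This covers only the coding sequence; the 5'UTR and 3'UTR add further bases to the mature transcript.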

Isoforms

There are 2 isoforms of C16orf86: uncharacterized protein C16orf86 isoform X1 and uncharacterized protein C16orf86 isoform X2.

Uncharacterized protein C16orf86 isoform X1 is 332 amino acids long and has a total of 2 exons and 1 intron.[17] [18]

Uncharacterized protein C16orf86 isoform X2 is 326 amino acids long and has a total of 4 exons and 3 introns.[19] [20]

Gene regulation

Promoter

There are three different promoter sequences for C16orf86. They were found using the Genomatix tool Gene2Promoter.[21] Each promoter sequence was compared with the C16orf86 promoters of distant orthologs, together with the human C16orf86 sequence, in a Clustal Omega multiple sequence alignment.[22] Promoter GXP_107609 matched more closely in sequence than the GXP_7544221 and GXP_6033384 promoters did.

Transcription factor binding sites

Transcription factor binding sites in the C16orf86 promoter (GXP_107609) were found using the Genomatix tool Gene2Promoter and its option to analyze binding sites. Binding sites were chosen based on a high matrix score and a high number of occurrences within the promoter. The transcription factors found in the conserved regions of the promoter sequence (GXP_107609) were MYF3, MYF4, E2F, and CCCTC-binding factor. These transcription factors are all involved in cell cycle regulation.
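
The selection rule described above (a high matrix score combined with many occurrences) can be sketched as a simple filter. The thresholds, field names, and every score and count in the example list are invented for illustration; they are not values from the Genomatix output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BindingSite:
    factor: str
    matrix_score: float  # similarity to the binding matrix, 0.0 to 1.0
    occurrences: int     # number of hits within the promoter


def select_sites(sites, min_score=0.9, min_occurrences=2):
    """Keep sites that pass both thresholds, best score first."""
    kept = [
        s for s in sites
        if s.matrix_score >= min_score and s.occurrences >= min_occurrences
    ]
    return sorted(kept, key=lambda s: (-s.matrix_score, -s.occurrences))


# Hypothetical example records; the numbers are made up.
example_sites = [
    BindingSite("E2F", 0.95, 4),
    BindingSite("MYF3", 0.92, 2),
    BindingSite("CTCF", 0.91, 3),
    BindingSite("EXAMPLE_LOW_SCORE", 0.80, 5),
    BindingSite("EXAMPLE_SINGLE_HIT", 0.97, 1),
]

for site in select_sites(example_sites):
    print(site.factor, site.matrix_score, site.occurrences)
```

Requiring both conditions at once drops a factor that scores well but appears only once, as well as a frequent factor with a weak match.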

Transcript level regulation

5'UTR region

For C16orf86, a multiple sequence alignment (MSA) of the 5'UTRs of orangutan, gorilla, chimpanzee, macaque, and human was done with Clustal Omega. The results of the MSA were compared with figures of the predicted structure of the 5'UTR, which were created using the bioinformatics tool mfold.[23] The region that stood out in the 5'UTR within the MSA is bases 105 to 113. This region could form a stem-loop with a specific function or a role in protein interactions.

3'UTR region

For C16orf86, a multiple sequence alignment (MSA) of the 3'UTRs of orangutan, gorilla, chimpanzee, macaque, and human was done with Clustal Omega. The results of the MSA were compared with figures of the predicted structure of the 3'UTR, which were created using the bioinformatics tool mfold. The region that stood out in the 3'UTR within the MSA is bases 1294 to 1300. This region could form a stem-loop with a specific function or a role in protein interactions.

Structure

C16orf86 has a predicted molecular weight of 33.5 kilodaltons and an isoelectric point (pI) of 5.30.[24]
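
A molecular weight of this kind is commonly estimated by summing average residue masses and adding one water molecule for the free termini. The sketch below follows that general approach and is not a reproduction of the Expasy tool; the function names are illustrative, and the test sequence in the usage line is a toy dipeptide, not C16orf86.

```python
# Average masses (Da) of amino acid residues, i.e. after loss of water.
AVERAGE_RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.01528


def molecular_weight(sequence: str) -> float:
    """Estimate the average molecular weight (Da) of an unmodified protein."""
    seq = sequence.upper()
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


def mean_residue_mass(weight_da: float, length_aa: int) -> float:
    """Average mass per residue, a quick plausibility check."""
    return weight_da / length_aa


print(round(molecular_weight("GG"), 3))          # toy dipeptide
print(round(mean_residue_mass(33_500, 317), 1))  # values quoted in the text
```

The reported 33.5 kilodaltons over 317 residues works out to roughly 106 Da per residue, which is in the usual range for proteins.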

The C16orf86 protein sequence is rich in proline and glutamate, with a total of 39 prolines (P) and 39 glutamates (E).[25] In contrast, it is low in asparagine (N), threonine (T), isoleucine (I), and phenylalanine (F), with 3 asparagines, 9 threonines, 2 isoleucines, and 1 phenylalanine. The high glutamate content makes the protein acidic, which is consistent with its low pI.
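
These residue counts can be put in proportion to the 317-residue protein. The sketch below converts the counts quoted above into percentages and also shows a general composition function for any sequence; the function name and the short sequence in the usage line are illustrative, not taken from the source.

```python
from collections import Counter


def composition(sequence: str) -> dict:
    """Map each residue to (count, percent of the sequence)."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {aa: (n, 100.0 * n / total) for aa, n in counts.items()}


# Counts and protein length quoted in the text.
REPORTED_COUNTS = {"P": 39, "E": 39, "N": 3, "T": 9, "I": 2, "F": 1}
PROTEIN_LENGTH = 317

reported_percent = {
    aa: round(100.0 * n / PROTEIN_LENGTH, 2)
    for aa, n in REPORTED_COUNTS.items()
}

for aa, pct in reported_percent.items():
    print(f"{aa}: {REPORTED_COUNTS[aa]} residues, {pct}%")

# General use on an arbitrary (toy) sequence.
print(composition("PPEA"))
```

Proline and glutamate each account for about 12% of the protein, whereas a uniform distribution over 20 amino acids would give 5% each.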

C16orf86 contains a domain of unknown function (DUF4691) from amino acids 1 to 184 and a nuclear localization signal at amino acids 105–109.[26] [27] The domain figure was created using the Expasy PROSITE tool.[28]

The nuclear localization signal of the C16orf86 protein spans amino acids 105 to 109 and has the sequence PKRKP in the forward direction.[29] This pattern is conserved between humans and distant orthologs such as the red fox and Weddell seal.
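
Locating a short signal such as PKRKP in a protein sequence is a plain substring search with 1-based residue numbering. The sketch below shows the idea on a made-up placeholder sequence; it is not the C16orf86 sequence, and the function name is hypothetical.

```python
def find_motif(sequence: str, motif: str) -> list:
    """Return 1-based (start, end) positions of every motif occurrence.

    Overlapping occurrences are reported as separate hits.
    """
    if not motif:
        raise ValueError("motif must not be empty")
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        start = sequence.find(motif, start + 1)
    return hits


# Placeholder sequence for illustration only.
toy_sequence = "MAAAAPKRKPGGG"
print(find_motif(toy_sequence, "PKRKP"))
```

A five-residue motif starting at residue 105 ends at residue 109, which matches the span reported for the signal.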

Secondary

Overall, C16orf86 is predicted to contain more alpha helix than beta sheet. Phyre2 was used to predict the locations of alpha helices and beta strands. Alpha helices are predicted with high confidence at amino acids 187–199, 231–244, 265–270, and 294–307, and a beta strand is predicted with high confidence at amino acids 96–97.[30]
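
The balance between helix and strand can be quantified from the residue ranges listed above. The sketch below counts only the high-confidence segments named in the text, treats each range as inclusive, and uses the 317-residue protein length; lower-confidence predictions are not included, so the totals are a lower bound.

```python
PROTEIN_LENGTH = 317

# High-confidence segments quoted in the text (inclusive residue ranges).
HELICES = [(187, 199), (231, 244), (265, 270), (294, 307)]
STRANDS = [(96, 97)]


def residues_covered(segments) -> int:
    """Total residues in a list of non-overlapping inclusive ranges."""
    return sum(end - start + 1 for start, end in segments)


def percent_of_protein(segments, length=PROTEIN_LENGTH) -> float:
    """Share of the protein covered by the segments, in percent."""
    return 100.0 * residues_covered(segments) / length


print(residues_covered(HELICES), round(percent_of_protein(HELICES), 1))
print(residues_covered(STRANDS), round(percent_of_protein(STRANDS), 1))
```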

Tertiary

The PDB files for the tertiary structure of C16orf86 were taken from Phyre2 and I-TASSER.[31] They were loaded into the EzMol bioinformatics tool to render the tertiary structure.[32] The resulting figure has amino acids labeled at sites involved in phosphorylation, nuclear localization signaling, and nuclear export signaling.

Post-translational modifications

Post-translational modifications of C16orf86 were predicted using protein modification tools from Expasy. The most notable sites for this protein were its nuclear export signals (leucine-rich regions), nuclear localization signals, and phosphorylation sites. The nuclear localization and export signals allow the protein to be localized to the cell's nucleus. In addition, the protein sequence has phosphorylation sites for CDK5, GSK3, p38 MAPK, PKA, PKC, CDC2, ATM, CKII, and DNA-PK. These kinases all play a role in cell cycle regulation. There is also a conceptual translation for C16orf86 below with the rest of the post-translational modifications.

Evolution

The orthologs are sorted by increasing date of divergence and sequence similarity.

Genus | Species | Common name | Taxonomic group | Date of divergence (MYA) | Accession number | Sequence length (AA) | Sequence identity to human | Sequence similarity to human
Homo | sapiens | Humans | Primates | 0.00 | NP_001013002.2 | 317 | 100.00% | 100.00%
Pongo | abelii | Sumatran orangutan | Primates | 15.20 | XP_002826596.1 | 318 | 95.00% | 96.00%
Rhinopithecus | bieti | Black snub-nosed monkey | Primates | 28.10 | XP_017707751.1 | 314 | 92.00% | 94.00%
Otolemur | garnettii | Northern greater galago | Primates | 73.00 | XP_003799435.1 | 319 | 74.00% | 79.00%
Ochotona | princeps | American pika | Lagomorpha | 88.00 | XP_004584223.41760.00 | | | 65.00%
Cricetulus | griseus | Chinese hamster | Rodentia | 88.00 | XP_007647376.1 | 324 | 64.35% | 71.00%
Castor | canadensis | American beaver | Rodentia | 88.00 | XP_020026748.1 | 328 | 67.00% | 73.00%
Sorex | araneus | Common shrew | Soricomorpha | 94.00 | XP_004600963.1 | 320 | 63.87% | 70.00%
Rousettus | aegyptiacus | Egyptian fruit bat | Chiroptera | 94.00 | XP_016019485.1 | 339 | 64.81% | 71.00%
Leptonychotes | weddellii | Weddell seal | Carnivora | 94.00 | XP_006749032.1 | 324 | 67.68% | 72.00%
Vulpes | vulpes | Red fox | Carnivora | 94.00 | XP_025867300.1 | 325 | 70.46% | 70.00%
Ovis | aries | Sheep | Artiodactyla | 94.00 | XP_027833899.1 | 329 | 70.61% | 76.00%
Elephantulus | edwardii | Cape elephant shrew | Macroscelidea | 102.00 | XP_006878955.1 | 298 | 58.12% | 61.00%
Vombatus | ursinus | Common wombat | Marsupials | 160.00 | XP_027703451.1 | 281 | 52.00% | 61.00%
Aptenodytes | forsteri | Emperor penguin | Birds | 320.00 | XP_009289088.1 | 262 | 37.00% | 42.00%
Pogona | vitticeps | Central bearded dragon | Reptiles | 320.00 | XP_020667121.1 | 266 | 40.00% | 52.00%
Notechis | scutatus | Tiger snake | Reptiles | 320.00 | XP_026531742.1 | 266 | 42.00% | 50.00%
Python | bivittatus | Burmese python | Reptiles | 320.00 | XP_025026382.1 | 267 | 44.00% | 54.00%
Latimeria | chalumnae | West Indian Ocean coelacanth | Fish | 414.00 | XP_014342026.1 | 275 | 40.00% | 48.00%
Rhincodon | typus | Whale shark | Fish | 465.00 | XP_020387814.1 | 242 | 29.00% | 44.00%
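
Sequence identity values like those in the table are derived from a pairwise alignment by counting the columns in which both sequences carry the same residue. The sketch below uses one common convention, dividing by the number of aligned columns that are not gaps in both sequences. That denominator is an assumption: alignment tools differ on this point, and the source does not say which setting produced the table. The aligned strings in the usage line are toy examples.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise alignment.

    Columns where both rows are gaps are skipped; a residue aligned to a
    gap counts as a compared column but never as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment: 4 identical columns out of 6 compared.
print(round(percent_identity("MKT-AP", "MKSQAP"), 2))
```

Similarity is computed the same way, except that substitutions between chemically related residues also count toward the numerator, which is why it is usually higher than identity.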

Paralogs

Searches with NCBI BLAST and BLAT found no paralogous sequences similar to C16orf86, which indicates that C16orf86 does not have any paralogs. The searches returned only isoforms of the sequence, not distinct full-length sequences.

Orthologs

C16orf86 orthologs are found in dogs, chimpanzees, cows, rats, and mice.[33] [34]

Ortholog space: C16orf86 orthologs are found only in placental mammals. No orthologs were found in other mammal groups, birds, reptiles, fungi, archaea, protists, plants, or invertebrates. The most distant ortholog group among the placental mammals, Macroscelidea, diverged from humans 102 million years ago.[35]

Homologs

The most distant homologs, which match C16orf86 only over partial sequences, include marsupial mammals, reptiles, and fish. The most distant homolog of C16orf86 was found in the whale shark, which diverged from humans 465 million years ago.

Notes and References

  1. Web site: C16orf86 chromosome 16 open reading frame 86 [Homo sapiens (human)] - Gene - NCBI. www.ncbi.nlm.nih.gov. 2019-02-10.
  2. Web site: GEO Profile Links for UniGene (Select 2139102) - GEO Profiles - NCBI. www.ncbi.nlm.nih.gov. 2019-05-02.
  3. Web site: GDS3626 / ILMN_1697800. www.ncbi.nlm.nih.gov. 2019-05-05.
  4. Web site: 62756576 - GEO Profiles - NCBI. www.ncbi.nlm.nih.gov. 2019-05-05.
  5. Liu L, Ji C, Chen J, Li Y, Fu X, Xie Y, Gu S, Mao Y. A global genomic view of MIF knockdown-mediated cell cycle arrest. Cell Cycle. 7. 11. 1678–92. June 2008. 18469521. 10.4161/cc.7.11.6011.
  6. Web site: 81993008 - GEO Profiles - NCBI. www.ncbi.nlm.nih.gov. 2019-05-05.
  7. Web site: GDS3364 / 231153_at. www.ncbi.nlm.nih.gov. 2019-05-05.
  8. Smith MJ, Simco BA, Warren CO. Comparative effects of antimycin A on isolated mitochondria of channel catfish (Ictalurus punctatus) and rainbow trout (Salmo gairdneri). Comparative Biochemistry and Physiology C. 52. 2. 113–7. December 1975. 3364. 10.1016/0306-4492(75)90024-6.
  9. Web site: 129260808 - GEO Profiles - NCBI. www.ncbi.nlm.nih.gov. 2019-05-05.
  10. Web site: GDS5632 / 231153_at. www.ncbi.nlm.nih.gov. 2019-05-05.
  11. Pollow K, Lübbert H, Pollow B. On the mitochondrial 17beta-hydroxysteroid dehydrogenase from human endometrium and endometrial carcinoma: characterization and intramitochondrial distribution. Journal of Steroid Biochemistry. 7. 1. 45–50. January 1976. 5632. 10.1016/0022-4731(76)90163-1.
  12. Web site: PREDICTED: uncharacterized protein C16orf86 homolog [Leptonychotes wed - Protein - NCBI. www.ncbi.nlm.nih.gov. 2019-05-05.
  13. Web site: uncharacterized protein C16orf86 homolog [Vulpes vulpes] - Protein - NCBI. www.ncbi.nlm.nih.gov. 2019-05-05.
  14. Web site: User Sequence vs Genomic. genome.ucsc.edu. 2019-04-30.
  15. Web site: ENKD1 enkurin domain containing 1 [Homo sapiens (human)] - Gene - NCBI. www.ncbi.nlm.nih.gov. 2019-04-22.
  16. Web site: uncharacterized protein C16orf86 [Homo sapiens] - Protein - NCBI. www.ncbi.nlm.nih.gov. 2019-05-05.
  17. Web site: uncharacterized protein C16orf86 isoform X1 [Homo sapiens] - Protein - NCBI. www.ncbi.nlm.nih.gov. 2019-04-30.
  18. Web site: User Sequence vs Genomic. genome.ucsc.edu. 2019-04-30.
  19. Web site: uncharacterized protein C16orf86 isoform X2 [Homo sapiens] - Protein - NCBI. www.ncbi.nlm.nih.gov. 2019-04-30.
  20. Web site: User Sequence vs Genomic. genome.ucsc.edu. 2019-04-30.
  21. Web site: Genomatix: Login Page. www.genomatix.de. 2019-05-02.
  22. Web site: Clustal Omega < Multiple Sequence Alignment < EMBL-EBI. www.ebi.ac.uk. 2019-05-02.
  23. Web site: RNA Folding Form mfold.rit.albany.edu. unafold.rna.albany.edu. 2019-05-03.
  24. Web site: ExPASy - Compute pI/Mw tool. web.expasy.org. 2019-04-30.
  25. Web site: SAPS < Sequence Statistics < EMBL-EBI. www.ebi.ac.uk. 2019-05-05.
  26. Web site: uncharacterized protein C16orf86 [Homo sapiens] - Protein - NCBI. www.ncbi.nlm.nih.gov. 2019-04-22.
  27. Web site: ExPASy: SIB Bioinformatics Resource Portal - Categories. www.expasy.org. 2019-05-02.
  28. Web site: ExPASy - PROSITE. prosite.expasy.org. 2019-05-03.
  29. Web site: Welcome to psort.org!!. www.psort.org. 2019-05-05.
  30. Web site: Phyre 2 Results for Undefined. www.sbg.bio.ic.ac.uk. 2019-05-02. Archived 2019-05-02 at https://web.archive.org/web/20190502051243/http://www.sbg.bio.ic.ac.uk/phyre2/phyre2_output/267ba7624a54af7e/summary.html.
  31. Web site: I-TASSER results. zhanglab.ccmb.med.umich.edu. 2019-05-02. Archived 2019-05-02 at https://web.archive.org/web/20190502051237/https://zhanglab.ccmb.med.umich.edu/I-TASSER/output/S461078/.
  32. Web site: EzMol - Molecular display wizard. www.sbg.bio.ic.ac.uk. 2019-05-05.
  33. Web site: C16orf86 Gene - GeneCards | CP086 Protein | CP086 Antibody. www.genecards.org. 2019-02-10.
  34. Web site: HomoloGene - NCBI. www.ncbi.nlm.nih.gov. 2019-02-10.
  35. Web site: TimeTree :: The Timescale of Life. www.timetree.org. 2019-04-22.