C13orf46 Explained

Chromosome 13 Open Reading Frame 46 is a protein that in humans is encoded by the C13orf46 gene.[1] C13orf46 is ubiquitously expressed at low levels in human tissues, including the lungs, stomach, prostate, spleen, and thymus. The gene encodes eight alternatively spliced mRNA transcripts, which produce five different protein isoforms.

Gene

An alternative name for C13orf46 is LOC100507747.[2] C13orf46 spans 47,563 base pairs, contains 11 exons, and is on the minus strand of chromosome 13 at 13q34.[3] [4]

Gene neighbors

The neighboring genes around C13orf46 include LINC00454, LINC00452, SWINGN, RASA3, and LOC124903221.[5]

LINC00454 and LINC00452 (Long Intergenic Non-Protein Coding RNA 454 and 452) are both long non-coding RNAs (lncRNAs) that regulate epigenetic gene expression, chromatin remodeling, and levels of gene transcription and translation.[6] [7] Expression of both LINC00454 and LINC00452 is restricted to the testis. LINC00454 has been associated with factor X deficiency, while LINC00452 has been found to promote ovarian carcinogenesis.[8] [9]

SWINGN (SWI/SNF Complex Interacting GAS6 Enhancer Non-Coding RNA) is another lncRNA that neighbors C13orf46.[10] SWINGN regulates the activation of the GAS6 (Growth Arrest Specific 6) oncogene by interacting with matrix-associated, actin-dependent regulators of chromatin.[11]

The RASA3 (RAS p21 Protein Activator 3) gene encodes the Ras GTPase activating protein. This protein binds inositol 1,3,4,5-tetrakisphosphate to stimulate the activity of Ras p21 and negatively regulates the Ras signaling pathway.[12] RASA3 is most highly expressed in fat, lymph nodes, and the spleen. The encoded protein is localized to the cell membrane.

mRNA

Eight transcript variants have been identified for C13orf46. These variants are alternatively spliced from combinations of 11 different exons. Depending on which transcript variant is translated, C13orf46 encodes one of 5 possible protein isoforms. The most common protein product is isoform 1, which is 212 amino acids long.[13]

Table of Variants and Exons
Transcript Variant | mRNA Length (nt) | Protein Isoform | Protein Length (aa) | Molecular Weight (kDa) | Exon Lengths (bp), in listed order
C13orf46 transcript variant 1 | 3786 | C13orf46 protein isoform 1 | 212 | 23.4 | 269, 52, 166, 48, 48, 68, 3135
C13orf46 transcript variant X1 | 26461 | C13orf46 protein isoform X1 | 624 | 66.7 | 3269, 252, 1920, 21020
C13orf46 transcript variant X2 | 26389 | C13orf46 protein isoform X1 | 624 | 66.7 | 3269, 252, 20903, 1965
C13orf46 transcript variant X3 | 25642 | C13orf46 protein isoform X1 | 624 | 66.7 | 3269, 252, 19745, 2376
C13orf46 transcript variant X4 | 26573 | C13orf46 protein isoform X2 | 587 | 62.8 | 3269, 252, 23052
C13orf46 transcript variant X5 | 29437 | C13orf46 protein isoform X3 | 212 | 23.4 | 269, 52, 166, 48, 48, 68, 3337, 25449
C13orf46 transcript variant X6 | 961 | C13orf46 protein isoform X4 | 192 | 21.1 | 269, 52, 166, 48, 48, 68, 310
C13orf46 transcript variant X7 | 833 | C13orf46 protein isoform X5 | 175 | 18.9 | 269, 52, 166, 48, 246, 52
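
One way to sanity-check the variant table is to confirm that the exon lengths listed for each transcript variant add up to its reported mRNA length. The sketch below does this in Python; the dictionary layout and the function name `exon_sum_matches` are illustrative choices, and the numbers are the ones read from the table rows.

```python
# mRNA length (nt) and exon lengths (bp) for each transcript variant,
# as read from the table of variants and exons.
VARIANTS = {
    "1": (3786, [269, 52, 166, 48, 48, 68, 3135]),
    "X1": (26461, [3269, 252, 1920, 21020]),
    "X2": (26389, [3269, 252, 20903, 1965]),
    "X3": (25642, [3269, 252, 19745, 2376]),
    "X4": (26573, [3269, 252, 23052]),
    "X5": (29437, [269, 52, 166, 48, 48, 68, 3337, 25449]),
    "X6": (961, [269, 52, 166, 48, 48, 68, 310]),
    "X7": (833, [269, 52, 166, 48, 246, 52]),
}


def exon_sum_matches(variants):
    """Map each variant name to True if its exon lengths sum to its mRNA length."""
    return {
        name: sum(exons) == mrna_length
        for name, (mrna_length, exons) in variants.items()
    }


if __name__ == "__main__":
    for name, ok in exon_sum_matches(VARIANTS).items():
        status = "consistent" if ok else "MISMATCH"
        print(f"transcript variant {name}: {status}")
```

The check also makes the splicing pattern easier to see: variants 1, X5, X6, and X7 share the short leading exons (269, 52, 166, 48 bp), while X1 through X4 share a different pair of leading exons (3269 and 252 bp).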

Protein

The primary protein isoform of the C13orf46 gene consists of 212 amino acids.[14] The longest encoded isoform, known as C13orf46 protein isoform X1, is 624 amino acids long.[15] The other protein isoforms encoded by the C13orf46 gene resemble one of these two versions of the protein. Varying forms of the primary 212 amino acid protein are encoded by transcript variants 1, X5, X6, and X7. Variations of the longest C13orf46 protein isoform are encoded by transcript variants X1, X2, X3, and X4.

Protein isoform 1

Properties and composition

C13orf46 Isoform 1 has a theoretical isoelectric point of 4.84 and a predicted molecular weight of 23.4 kDa.[16] Glutamic acid (15.1%) and aspartic acid (7.5%) are relatively abundant in this isoform, while phenylalanine (0.9%) and threonine (0.5%) are relatively scarce.[17] C13orf46 Isoform 1 also has a glutamic acid rich region containing multiple glutamic acid and lysine doublets, some of which occur side by side. A total of 14 multiplets are found within the protein overall, 12 of which are charged. C13orf46 Isoform 1 is not predicted to contain any charge clusters, hydrophobic segments, or transmembrane segments.
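
As a rough plausibility check on the predicted molecular weights, protein length can be multiplied by an average residue mass. The sketch below assumes the common rule of thumb of about 110 Da per residue, which is an assumption and not a value from the cited tools; the reported weights were computed from the actual sequences, so small deviations are expected. The names `AVG_RESIDUE_MASS_DA` and `approx_mass_kda` are illustrative.

```python
# Assumed rule-of-thumb average mass of one amino acid residue, in daltons.
# This is an approximation, not a value taken from the article's sources.
AVG_RESIDUE_MASS_DA = 110.0

# (length in amino acids, reported molecular weight in kDa) per isoform,
# taken from the figures given for the C13orf46 isoforms.
ISOFORMS = {
    "1": (212, 23.4),
    "X1": (624, 66.7),
    "X2": (587, 62.8),
    "X4": (192, 21.1),
    "X5": (175, 18.9),
}


def approx_mass_kda(length_aa, avg_residue_da=AVG_RESIDUE_MASS_DA):
    """Estimate protein mass in kDa from its length using an average residue mass."""
    return length_aa * avg_residue_da / 1000.0


def relative_deviation(length_aa, reported_kda):
    """Signed relative difference between the estimate and the reported mass."""
    return (approx_mass_kda(length_aa) - reported_kda) / reported_kda


if __name__ == "__main__":
    for name, (length_aa, reported_kda) in ISOFORMS.items():
        estimate = approx_mass_kda(length_aa)
        deviation = 100.0 * relative_deviation(length_aa, reported_kda)
        print(f"isoform {name}: estimate {estimate:.1f} kDa, "
              f"reported {reported_kda} kDa ({deviation:+.1f}%)")
```

Under this assumption every estimate falls within a few percent of the reported value, which is about as close as a length-only approximation can be expected to get.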

Structure

C13orf46 Protein Isoform 1 is predicted to consist of 7 alpha helices and 2 beta strands, in addition to random coil regions.[18] [19]

Domains and motifs

C13orf46 Isoform 1 has two identified disordered regions, spanning amino acid residues 1 to 148 and 168 to 190.[20] In addition, C13orf46 Isoform 1 has a glutamic acid rich region spanning amino acid residues 109 to 191.

Regulation and post translational modifications

C13orf46 Isoform 1 is predicted to undergo several post-translational modifications such as phosphorylation,[21] [22] [23] O-GlcNAcylation, mucin type GalNAc O-glycosylation, palmitoylation,[24] [25] and sumoylation.[26] PKA, PKC, CKII, PKG, GSK3, cdc2, RSK, and ATM are kinases that are predicted to bind and phosphorylate the human C13orf46 Isoform 1. There is also one predicted phosphoprotein-binding phosphosite on the protein.[27]

Protein isoform X1

Properties

C13orf46 Isoform X1 has a theoretical isoelectric point of 9.33 and a predicted molecular weight of 66.7 kDa. Compared to other human proteins, it contains much higher relative amounts of serine (18.4%) and leucine (18.8%), which are present in roughly equal amounts, and it is also high in proline (14.4%). It contains lower than usual amounts of glutamic acid (1.3%), phenylalanine (0.3%), and lysine (0.5%), and is also low in valine (2.4%). Asparagine is not found within the C13orf46 Isoform X1 protein. Within this isoform, 100 amino acid multiplets are found, 5 of which are charged. No charge clusters, hydrophobic segments, or transmembrane domains are predicted within the protein.

Structure

C13orf46 Isoform X1 is predicted to consist of a combination of alpha helices, beta sheets, and free random coil regions.[28] There are 22 predicted alpha helices and 18 predicted beta sheets within the predicted structure of C13orf46 Isoform X1.

C13orf46 Isoform X1 contains a series of 26 repeats, which vary in sequence and length.[29] Of the 26 identified repeat sequences, 14 consist of 20 amino acids, 5 consist of 21 amino acids, 3 consist of 22 amino acids, and 4 consist of 23 amino acids.[30] Each repeat sequence begins with methionine, isoleucine, or leucine. The main sequence of the 26 repeats is MLLLSTGCSSSPPDAPPLHQ. An alignment of the 26 repeats indicates that the most conserved part of the repeat is a triplet of serine residues in the middle of the sequence.[31]
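
The repeat figures above can be tallied with a short script. This sketch uses only the main repeat sequence and the length counts given in the text; the helper names `composition` and `find_run` are illustrative, and the analysis covers the main repeat sequence only, not the 26 individual repeats, whose sequences are not listed here.

```python
from collections import Counter

# Main repeat sequence and repeat length distribution, as given in the text.
MAIN_REPEAT = "MLLLSTGCSSSPPDAPPLHQ"
REPEAT_LENGTH_COUNTS = {20: 14, 21: 5, 22: 3, 23: 4}  # length (aa) -> number of repeats


def composition(seq):
    """Return the percentage of each amino acid in seq, most common first."""
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.most_common()}


def find_run(seq, residue, run_length):
    """Return the 0-based start of the first run of `residue`, or -1 if absent."""
    return seq.find(residue * run_length)


total_repeats = sum(REPEAT_LENGTH_COUNTS.values())
total_repeat_residues = sum(length * n for length, n in REPEAT_LENGTH_COUNTS.items())

if __name__ == "__main__":
    print("repeats:", total_repeats)
    print("residues covered by repeats:", total_repeat_residues)
    print("serine triplet starts at position", find_run(MAIN_REPEAT, "S", 3) + 1)
    for aa, pct in composition(MAIN_REPEAT).items():
        print(f"{aa}: {pct:.0f}%")
```

In the main repeat sequence, leucine, serine, and proline each account for 4 of the 20 residues, which is in line with the serine, leucine, and proline enrichment reported for the isoform as a whole.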

Domains and motifs

C13orf46 Isoform X1 has a predicted dimerization domain between amino acid residues 69 and 87.[32]

Regulation and post translational modification

C13orf46 Isoform X1 is predicted to undergo several post-translational modifications such as phosphorylation, O-GlcNAcylation, mucin type GalNAc O-glycosylation, palmitoylation, and sumoylation. The human C13orf46 Isoform X1 protein also has 11 predicted PPBD-specific binding phosphosites. The most conserved phosphorylation sites occur on the third serine of 23 out of 26 repeats. PKC, PKG, PKA, p38MAPK, GSK3, DNAPK, CKI, cdk5, CKII, and cdc2 are kinases predicted to bind and phosphorylate the human C13orf46 Isoform X1 protein. The predicted phosphorylation sites are also predicted sites of O-glycosylation.

Protein interactions

C13orf46 protein isoform X1 has several predicted S-phase cyclin binding sites, in addition to MAPK and p38 interacting motifs.[33]

Expression

RNA sequencing shows that C13orf46 expression is highest in the lungs, prostate, pancreas, and stomach, where it reaches intermediate levels.[34] C13orf46 is expressed at lower levels in the bone marrow, spleen, thyroid, lymph node, gall bladder, and thymus.

Cellular localization

C13orf46 Isoform 1 is predicted to be mostly localized within the nucleus.[35] This protein isoform may also be localized on the cell membrane. C13orf46 Isoform X1 is predicted to be mostly localized within the nucleus or cytoplasm.

Homology

Orthologs

Orthologs to the human C13orf46 isoform 1 and isoform X1 proteins are found within primates, other mammals, birds, reptiles, fish, and invertebrates.[36]

Isoform 1

Orthologs to the human C13orf46 isoform 1 protein are known only in primates and other mammals, suggesting that the part of the C13orf46 gene encoding isoform 1 appeared around 99 million years ago.

Table of Orthologs to Human Protein C13orf46 Isoform 1
Genus and Species | Common Name | Taxonomic Group | Median Date of Divergence (mya) | Accession # | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%)
Homo sapiens | Human | Primates | 0 | NP_001352384.1 | 212 | 100.0% | 100.0%
Pan paniscus | Bonobo | Primates | 6.4 | XP_034792262.1 | 212 | 98.1% | 98.1%
Gorilla gorilla gorilla | Western Lowland Gorilla | Primates | 8.6 | XP_030857272.1 | 212 | 95.3% | 98.1%
Papio anubis | Olive Baboon | Primates | 28.9 | XP_021785522.1 | 212 | 88.7% | 92.5%
Cercocebus atys | Sooty Mangabey | Primates | 28.9 | XP_011913555.1 | 192 | 87.3% | 91.0%
Macaca mulatta | Rhesus Macaque | Primates | 28.9 | XP_014977020.1 | 192 | 79.2% | 82.5%
Ursus arctos | Brown Bear | Carnivora | 87 | XP_048071403.1 | 222 | 59.2% | 71.3%
Callorhinus ursinus | Northern Fur Seal | Carnivora | 87 | XP_025730354.1 | 184 | 57.5% | 65.6%
Lontra canadensis | Northern River Otter | Carnivora | 87 | XP_032736869.1 | 232 | 46.4% | 53.6%
Odobenus rosmarus divergens | Pacific Walrus | Carnivora | 87 | XP_004412327.1 | 310 | 36.4% | 41.7%
Loxodonta africana | African Bush Elephant | Proboscidea | 87 | XP_010591994.1 | 214 | 62.6% | 73.4%
Choloepus didactylus | Two-Toed Sloth | Pilosa | 87 | XP_037662557.1 | 214 | 60.7% | 73.8%
Orycteropus afer afer | Aardvark | Tubulidentata | 87 | XP_007940592.1 | 214 | 60.0% | 69.3%
Castor canadensis | North American Beaver | Rodentia | 87 | XP_020020073.1 | 217 | 59.6% | 72.0%
Pteropus giganteus | Indian Flying Fox | Chiroptera | 94 | XP_039734682.1 | 213 | 67.6% | 76.5%
Eptesicus fuscus | Big Brown Bat | Chiroptera | 94 | XP_028004567.1 | 214 | 65.9% | 75.2%
Trichechus manatus latirostris | Antillean Manatee | Sirenia | 94 | XP_023589319.1 | 214 | 63.1% | 74.3%
Balaenoptera musculus | Blue Whale | Cetacea | 94 | XP_036687016.1 | 207 | 57.3% | 69.0%
Urocitellus parryii | Arctic Ground Squirrel | Rodentia | 94 | XP_026237314.1 | 216 | 61.9% | 72.0%
Sciurus carolinensis | Eastern Gray Squirrel | Rodentia | 94 | XP_047409299.1 | 238 | 55.6% | 66.1%
Ictidomys tridecemlineatus | Thirteen-Lined Ground Squirrel | Rodentia | 94 | XP_013221671.2 | 276 | 49.3% | 57.6%
Chinchilla lanigera | Long-Tailed Chinchilla | Rodentia | 99 | XP_005373979.1 | 217 | 55.5% | 66.4%
Arvicola amphibius | European Water Vole | Rodentia | 99 | XP_038185081.1 | 237 | 53.1% | 65.1%
Mesocricetus auratus | Golden Hamster | Rodentia | 99 | XP_005082676.1 | 237 | 51.0% | 61.8%
Arvicanthis niloticus | African Grass Rat | Rodentia | 99 | XP_034376776.1 | 241 | 50.0% | 61.9%
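
The estimate of roughly 99 million years for the appearance of isoform 1 corresponds to the oldest median divergence date among its listed orthologs. A minimal Python sketch of that reasoning follows, using the species, taxonomic group, divergence date, and sequence identity values from the table of isoform 1 orthologs; the tuple layout and function names are illustrative.

```python
from collections import defaultdict

# (species, taxonomic group, median divergence in mya, sequence identity in %)
# for the listed orthologs of human C13orf46 isoform 1.
ORTHOLOGS = [
    ("Homo sapiens", "Primates", 0, 100.0),
    ("Pan paniscus", "Primates", 6.4, 98.1),
    ("Gorilla gorilla gorilla", "Primates", 8.6, 95.3),
    ("Papio anubis", "Primates", 28.9, 88.7),
    ("Cercocebus atys", "Primates", 28.9, 87.3),
    ("Macaca mulatta", "Primates", 28.9, 79.2),
    ("Ursus arctos", "Carnivora", 87, 59.2),
    ("Callorhinus ursinus", "Carnivora", 87, 57.5),
    ("Lontra canadensis", "Carnivora", 87, 46.4),
    ("Odobenus rosmarus divergens", "Carnivora", 87, 36.4),
    ("Loxodonta africana", "Proboscidea", 87, 62.6),
    ("Choloepus didactylus", "Pilosa", 87, 60.7),
    ("Orycteropus afer afer", "Tubulidentata", 87, 60.0),
    ("Castor canadensis", "Rodentia", 87, 59.6),
    ("Pteropus giganteus", "Chiroptera", 94, 67.6),
    ("Eptesicus fuscus", "Chiroptera", 94, 65.9),
    ("Trichechus manatus latirostris", "Sirenia", 94, 63.1),
    ("Balaenoptera musculus", "Cetacea", 94, 57.3),
    ("Urocitellus parryii", "Rodentia", 94, 61.9),
    ("Sciurus carolinensis", "Rodentia", 94, 55.6),
    ("Ictidomys tridecemlineatus", "Rodentia", 94, 49.3),
    ("Chinchilla lanigera", "Rodentia", 99, 55.5),
    ("Arvicola amphibius", "Rodentia", 99, 53.1),
    ("Mesocricetus auratus", "Rodentia", 99, 51.0),
    ("Arvicanthis niloticus", "Rodentia", 99, 50.0),
]


def oldest_divergence(rows):
    """Return the largest median divergence date (mya) among the orthologs."""
    return max(mya for _, _, mya, _ in rows)


def least_identical(rows):
    """Return the row with the lowest sequence identity to the human protein."""
    return min(rows, key=lambda row: row[3])


def mean_identity_by_group(rows):
    """Average sequence identity (%) for each taxonomic group."""
    grouped = defaultdict(list)
    for _, group, _, identity in rows:
        grouped[group].append(identity)
    return {group: sum(vals) / len(vals) for group, vals in grouped.items()}


if __name__ == "__main__":
    print("oldest median divergence (mya):", oldest_divergence(ORTHOLOGS))
    print("least identical ortholog:", least_identical(ORTHOLOGS)[0])
    for group, mean in sorted(mean_identity_by_group(ORTHOLOGS).items()):
        print(f"{group}: mean identity {mean:.1f}%")
```

Taking the maximum divergence date is only a lower bound on the age of the sequence: it reflects the most distant lineage in which an ortholog was found, not direct evidence of when the coding region arose.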

Isoform X1

Predicted orthologs to the human C13orf46 isoform X1 protein are found in primates, other mammals, birds, reptiles, amphibians, fish, invertebrates, and as far back as bacteria of the genus Legionella.

Table of Predicted Orthologs to Human Protein C13orf46 Isoform X1
Genus and Species | Common Name | Taxonomic Group | Median Date of Divergence (mya) | Accession # | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%)
Homo sapiens | Human | Primates | 0 | XP_047285937.1 | 624 | 100.0% | 100.0%
Pan troglodytes | Chimpanzee | Primates | 6.4 | XP_024209271.1 | 720 | 54.9% | 61.0%
Microtus ochrogaster | Prairie Vole | Rodentia | 87 | KAH0512811.1 | 936 | 10.3% | 15.8%
Phoca vitulina | European Harbour Seal | Carnivora | 94 | XP_032285971.1 | 510 | 18.8% | 30.1%
Orcinus orca | Killer Whale | Cetacea | 94 | XP_049556886.1 | 348 | 14.2% | 19.3%
Myotis davidii | Whiskered Bat | Chiroptera | 94 | ELK34143.1 | 530 | 26.1% | 33.6%
Phasianus colchicus | Ring-Necked Pheasant | Galliformes | 319 | XP_031464934.1 | 499 | 23.2% | 33.5%
Corvus hawaiiensis | Hawaiian Crow | Passeriformes | 319 | XP_048182949.1 | 316 | 17.3% | 23.4%
Hirundo rustica | Barn Swallow | Passeriformes | 319 | XP_039927228.1 | 1185 | 10.0% | 14.9%
Pelodiscus sinensis | Chinese Soft-Shelled Turtle | Testudines | 319 | XP_025042872.1 | 554 | 17.3% | 26.4%
Rana temporaria | Grass Frog | Anura | 353 | XP_040201915.1 | 1147 | 12.9% | 19.7%
Bufo bufo | Common Toad | Anura | 353 | XP_040296088.1 | 259 | 12.3% | 19.7%
Lithobates catesbeianus | American Bullfrog | Anura | 353 | PIO00716.1 | 245 | 12.0% | 18.5%
Larimichthys crocea | Large Yellow Croaker | Perciformes | 431 | KAE8277666.1 | 478 | 28.9% | 37.9%
Coregonus clupeaformis | Lake Whitefish | Salmoniformes | 431 | XP_041725148.2 | 609 | 27.0% | 26.0%
Austrofundulus limnaeus | Killifish | Cyprinodontiformes | 431 | XP_013856594.1 | 244 | 22.7% | 25.4%
Oncorhynchus tshawytscha | Chinook Blackmouth Salmon | Salmoniformes | 431 | XP_042158955.1 | 714 | 23.5% | 26.2%
Salmo salar | Atlantic Salmon | Salmoniformes | 431 | XP_045562793.1 | 324 | 19.9% | 20.9%
Prochilodus magdalenae | Columbian Freshwater Fish | Characiformes | 431 | KAI4891011.1 | 388 | 18.6% | 25.4%
Oncorhynchus mykiss | Rainbow Trout | Salmoniformes | 431 | XP_036845983.1 | 332 | 18.5% | 27.3%
Chiloscyllium punctatum | Brownbanded Bamboo Shark | Orectolobiformes | 464 | GCC17506.1 | 625 | 26.7% | 26.8%
Biomphalaria glabrata | Freshwater Snail | Basommatophora | 694 | KAI8768938.1 | 308 | 14.4% | 18.9%
Bulinus truncatus | Freshwater Snail | Basommatophora | 694 | KAH9489149.1 | 879 | 10.3% | 30.7%
Owenia fusiformis | Bristle Worm | Canalipalpata | 694 | CAH1787814.1 | 224 | 14.3% | 15.2%
Legionella fallonii | Legionella | Legionellales | 3036 | WP_045095679.1 | 695 | 15.2% | 30.7%

Paralogs

Human C13orf46 isoform X1 protein has one predicted paralog among mucins, specifically mucin-1. Mucins play a role in creating protective mucus barriers on epithelial tissues.[37] The MUC1 gene is located on chromosome 1 at 1q22, contains 11 exons, and has 22 different isoforms.[38] Mucins are highly O-glycosylated and contain tandem repeat domains rich in proline, serine, and threonine.[39] The repeat domains are flanked by cysteine rich regions. Mucin genes do not always share a common ancestry, are prone to convergent evolution, and are grouped by function rather than by common evolutionary history.[40]

Notes and References

  1. Web site: C13orf46 chromosome 13 open reading frame 46 [Homo sapiens (human)] - Gene - NCBI . 2022-12-15 . www.ncbi.nlm.nih.gov.
  2. Web site: GEO Profile Links for Gene (Select 100507747) - GEO Profiles - NCBI . 2022-12-17 . www.ncbi.nlm.nih.gov.
  3. Web site: 2022-04-06 . Homo sapiens chromosome 13, GRCh38.p14 Primary Assembly . en-US.
  4. Web site: GeneCards . C13orf46 Gene - CM046 Protein CM046 Antibody . 2022-12-14 .
  5. Web site: C13orf46 chromosome 13 open reading frame 46 [Homo sapiens (human)] - Gene - NCBI . 2022-12-14 . www.ncbi.nlm.nih.gov.
  6. Web site: LINC00452 long intergenic non-protein coding RNA 452 [Homo sapiens (human)] - Gene - NCBI . 2022-12-15 . www.ncbi.nlm.nih.gov.
  7. Zhang X, Wang W, Zhu W, Dong J, Cheng Y, Yin Z, Shen F . Mechanisms and Functions of Long Non-Coding RNAs at Multiple Regulatory Levels . International Journal of Molecular Sciences . 20 . 22 . 5573 . November 2019 . 31717266 . 6888083 . 10.3390/ijms20225573 . free .
  8. Web site: Alliance of Genome Resources . 2022-12-15 . www.alliancegenome.org.
  9. Yang J, Wang WG, Zhang KQ . LINC00452 promotes ovarian carcinogenesis through increasing ROCK1 by sponging miR-501-3p and suppressing ubiquitin-mediated degradation . Aging . 12 . 21 . 21129–21146 . November 2020 . 33168781 . 7695380 . 10.18632/aging.103758 .
  10. Web site: SWINGN SWI/SNF complex interacting GAS6 enhancer non-coding RNA [Homo sapiens (human)] - Gene - NCBI . 2022-12-15 . www.ncbi.nlm.nih.gov.
  11. Grossi E, Raimondi I, Goñi E, González J, Marchese FP, Chapaprieta V, Martín-Subero JI, Guo S, Huarte M . 6 . A lncRNA-SWI/SNF complex crosstalk controls transcriptional activation at specific promoter regions . Nature Communications . 11 . 1 . 936 . February 2020 . 32071317 . 7028943 . 10.1038/s41467-020-14623-3 . 2020NatCo..11..936G .
  12. Web site: RASA3 RAS p21 protein activator 3 [Homo sapiens (human)] - Gene - NCBI . 2022-12-15 . www.ncbi.nlm.nih.gov.
  13. Web site: UniProt . 2022-12-15 . www.uniprot.org.
  14. Web site: uncharacterized protein C13orf46 [Homo sapiens] - Protein - NCBI . 2022-12-15 . www.ncbi.nlm.nih.gov.
  15. Web site: uncharacterized protein C13orf46 isoform X1 [Homo sapiens] - Protein - NCBI . 2022-12-15 . www.ncbi.nlm.nih.gov.
  16. Web site: Expasy - Compute pI/Mw tool . 2022-12-15 . web.expasy.org.
  17. Web site: SAPS < Sequence Statistics < EMBL-EBI . 2022-12-15 . www.ebi.ac.uk.
  18. Web site: AlphaFold Protein Structure Database . 2022-12-15 . alphafold.ebi.ac.uk.
  19. Web site: I-TASSER results . 2022-12-15 . seq2fun.dcmb.med.umich.edu.
  20. Web site: uncharacterized protein C13orf46 [Homo sapiens] - Protein - NCBI . 2022-12-15 . www.ncbi.nlm.nih.gov.
  21. Web site: GPS 5.0 - Kinase-specific Phosphorylation Site Prediction . 2022-12-15 . gps.biocuckoo.cn.
  22. Web site: Services . 2022-12-15 . healthtech.dtu.dk . en.
  23. Web site: Motif Scan . 2022-12-15 . myhits.sib.swiss . en.
  24. Web site: GPS-Palm: A Graphic Presentation System for Palmitoylation Site Prediction . 2022-12-15 . gpspalm.biocuckoo.cn.
  25. Web site: GPS-Lipid - Prediction of Lipid Modifications (S-Palmitoylation, N-Myristoylation, S-Farnesylation, S-Geranylgeranylation) . 2022-12-15 . lipid.biocuckoo.org.
  26. Web site: GPS-SUMO: Prediction of SUMOylation Sites & SUMO-interaction Motifs . 2022-12-15 . sumosp.biocuckoo.org . 2013-05-10 . https://web.archive.org/web/20130510131129/http://sumosp.biocuckoo.org/ . dead .
  27. Web site: GPS-PBS - PPBDs–specific binding p-site prediction . 2022-12-15 . pbs.biocuckoo.cn.
  28. Web site: I-TASSER results . 2022-12-15 . seq2fun.dcmb.med.umich.edu.
  29. Web site: Dotlet JS . 2022-12-15 . dotlet.vital-it.ch.
  30. Web site: Clustal Omega < Multiple Sequence Alignment < EMBL-EBI . 2022-12-15 . www.ebi.ac.uk.
  31. Web site: Multiple Sequence Alignment - CLUSTALW . 2022-12-15 . www.genome.jp.
  32. Web site: Motif Scan . 2022-12-15 . myhits.sib.swiss . en.
  33. Web site: ELM - Search the ELM resource . 2022-12-17 . elm.eu.org . en.
  34. Web site: Tissue expression of C13orf46 - Summary - The Human Protein Atlas . 2022-12-16 . www.proteinatlas.org.
  35. Web site: PSORT II Prediction . 2022-12-15 . psort.hgc.jp.
  36. Web site: BLAST: Basic Local Alignment Search Tool . 2022-12-16 . blast.ncbi.nlm.nih.gov.
  37. Reznik N, Gallo AD, Rush KW, Javitt G, Fridmann-Sirkis Y, Ilani T, Nairner NA, Fishilevich S, Gokhman D, Chacón KN, Franz KJ, Fass D . 6 . Intestinal mucin is a chaperone of multivalent copper . English . Cell . 185 . 22 . 4206–4215.e11 . October 2022 . 36206754 . 10.1016/j.cell.2022.09.021 . 245671675 . free .
  38. Web site: MUC1 mucin 1, cell surface associated [Homo sapiens (human)] - Gene - NCBI . 2022-12-16 . www.ncbi.nlm.nih.gov.
  39. Pinzón Martín S, Seeberger PH, Varón Silva D . Mucins and Pathogenic Mucin-Like Molecules Are Immunomodulators During Infection and Targets for Diagnostics and Vaccines . Frontiers in Chemistry . 7 . 710 . 2019 . 31696111 . 6817596 . 10.3389/fchem.2019.00710 . 2019FrCh....7..710V . free .
  40. Pajic P, Shen S, Qu J, May AJ, Knox S, Ruhl S, Gokcumen O . A mechanism of gene evolution generating mucin function . Science Advances . 8 . 34 . eabm8757 . August 2022 . 36026444 . 9417175 . 10.1126/sciadv.abm8757 . 2022SciA....8M8757P .