C5orf46 is a protein-coding gene located on chromosome 5 in humans. It is also known as SSSP1, or skin and saliva secreted protein 1. There are two known isoforms in humans, with isoform 2 (analyzed throughout this page) being the longer of the two. The encoded protein is predicted to have one transmembrane domain, a molecular weight of 9,692 Da, and a basal isoelectric point of 4.67.[1]
Found on the minus strand of chromosome 5, the c5orf46 isoform X2 is 4679 nucleotides in length and has 4 exons.
C5orf46 orthologs are found only in Chordata, with the most distant ortholog found in the platypus (Ornithorhynchus anatinus), which diverged around 177 million years ago.[2] Highly conserved regions include the signal peptide sequence found towards the N-terminus of the protein. No paralogs have been found in humans.
Genus and species | Common name | Taxonomic order | Accession number | Sequence length (aa) | Sequence identity (%) | Sequence similarity (%)
Homo sapiens | Human | Primates | XP_005268503.2 | 102 | 100 | 100
Pan paniscus | Bonobo | Primates | XP_003829110.1 | 87 | 98 | 98
Octodon degus | Common degu | Rodentia | XP_004631400.1 | 73 | 68 | 82
Mus musculus | House mouse | Rodentia | NP_001028452.1 | 93 | 68 | 77
Urocitellus parryii | Arctic ground squirrel | Rodentia | XP_026240295.1 | 92 | 59 | 72
Leptonychotes weddellii | Weddell seal | Carnivora | XP_006732700.1 | 124 | 78 | 94
Acinonyx jubatus | Cheetah | Carnivora | XP_026897868.1 | 147 | 76 | 87
Zalophus californianus | California sea lion | Carnivora | XP_027462230.1 | 84 | 75 | 86
Sorex araneus | Common shrew | Eulipotyphla | XP_004618219.1 | 73 | 74 | 79
Vicugna pacos | Alpaca | Artiodactyla | XP_006204536.1 | 88 | 73 | 81
Delphinapterus leucas | Beluga whale | Artiodactyla | XP_030618631.1 | 88 | 73 | 81
Camelus bactrianus | Bactrian camel | Artiodactyla | XP_010954370.1 | 83 | 73 | 79
Orcinus orca | Killer whale | Artiodactyla | XP_004280428.1 | 88 | 71 | 83
Sus scrofa | Wild boar | Artiodactyla | XP_003354397.2 | 90 | 67 | 82
Manis javanica | Sunda pangolin | Pholidota | XP_017496222.1 | 155 | 62 | 79
Myotis davidii | Bat | Chiroptera | XP_015428095.1 | 104 | 41 | 50
Elephantulus edwardii | Cape elephant shrew | Macroscelidea | XP_006893786.1 | 76 | 78 | 80
Dasypus novemcinctus | Nine-banded armadillo | Cingulata | XP_004447160.1 | 88 | 72 | 82
Vombatus ursinus | Common wombat | Diprotodontia | XP_027697780.1 | 78 | 45 | 62
Phascolarctos cinereus | Koala | Diprotodontia | XP_020854530.1 | 78 | 44 | 62
Ornithorhynchus anatinus | Platypus | Monotremata | XP_028912384.1 | 80 | 43 | 61
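The ortholog table above can be read programmatically to compare conservation across species. The following is a minimal Python sketch, not part of the original analysis. It assumes that the three numeric columns are sequence length in amino acids, percent identity, and percent similarity to the human protein (an interpretation based on the human row reading 102, 100, 100). The `Ortholog` class and its field names are hypothetical, and only five rows from the table are included for brevity.

```python
from dataclasses import dataclass

# A subset of rows copied from the ortholog table; the column meanings
# (length, % identity, % similarity) are an assumption, not stated in the table.
ROWS = """\
Homo sapiens | Human | Primates | XP_005268503.2 | 102 | 100 | 100
Pan paniscus | Bonobo | Primates | XP_003829110.1 | 87 | 98 | 98
Mus musculus | House mouse | Rodentia | NP_001028452.1 | 93 | 68 | 77
Myotis davidii | Bat | Chiroptera | XP_015428095.1 | 104 | 41 | 50
Ornithorhynchus anatinus | Platypus | Monotremata | XP_028912384.1 | 80 | 43 | 61
"""


@dataclass
class Ortholog:
    species: str
    common_name: str
    order: str
    accession: str
    length_aa: int
    identity_pct: int
    similarity_pct: int


def parse_row(line: str) -> Ortholog:
    """Split one pipe-delimited row, ignoring empty trailing cells."""
    cells = [cell.strip() for cell in line.split("|")]
    cells = [cell for cell in cells if cell]
    species, common_name, order, accession, length, identity, similarity = cells
    return Ortholog(species, common_name, order, accession,
                    int(length), int(identity), int(similarity))


orthologs = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

# Rank orthologs from least to most similar to the human sequence by identity.
by_identity = sorted(orthologs, key=lambda o: o.identity_pct)
least_conserved = by_identity[0]

for o in by_identity:
    print(f"{o.species:<26} {o.identity_pct:>3}% identity, {o.length_aa} aa")
```

Sorting by identity rather than by taxonomic order makes it easy to see that sequence identity does not strictly track divergence time in this subset: the bat sequence has slightly lower identity than the platypus sequence.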
A Genomatix ElDorado promoter database search predicted one promoter for c5orf46. This promoter has the ID number of GXP_123762 and transcript ID number GXT_22785522. The promoter is located on the minus strand of chromosome 5, and was predicted to range from nucleotides 147906451 to 147908007, making it 1557 nucleotides in length.
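The stated promoter length follows from treating both endpoint coordinates as part of the promoter. As a quick check, here is a minimal Python sketch. The function name `inclusive_length` is illustrative, and the assumption that the coordinates are inclusive is inferred from the figures above rather than stated by the database.

```python
def inclusive_length(start: int, end: int) -> int:
    """Length of a genomic interval when both endpoints are counted."""
    if end < start:
        raise ValueError("end coordinate must not be smaller than start")
    return end - start + 1


# Coordinates of the predicted promoter as given in the text.
promoter_start = 147906451
promoter_end = 147908007

promoter_length = inclusive_length(promoter_start, promoter_end)
print(promoter_length)
```

Under this convention the result matches the 1557 nucleotides quoted above. A half-open convention would give 1556 instead.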
A total of 428 transcription factor binding sites were predicted to be located within the predicted promoter sequence. The predictions included the following transcription factors:[3]
C5orf46 is primarily expressed in salivary glands and skin tissue, though some expression in heart tissue, testis, and placenta is also observed.[4] Microarray data measuring c5orf46 expression in psoriasis patients showed significantly lower expression in samples from lesional psoriasis patients than in samples from non-lesional psoriasis patients and healthy controls.
The C5orf46 protein is 102 amino acids in length. It has a signal peptide sequence at its N-terminus, which is highly conserved in orthologs. The amino acid sequence includes a repeated DDKPD sequence and an aspartate- and lysine-rich region.
Secondary structure prediction tools, including the Chou and Fasman Secondary Structure Prediction server and the PRABI GOR IV prediction analysis, predicted two alpha-helical segments.[5][6]
Structural models generated by Phyre2 and SWISS-MODEL show two alpha-helical domains with a bend between them.[7][8]
C5orf46 has multiple predicted post-translational modification sites, and one modification identified through mass spectrometry. Mass spectrometry analysis of extracts from an NCI-H2228 lung cancer cell line has identified an acetylation site at K42.[9] C5orf46 has predicted phosphorylation sites at T14, S52, S84, and S86.[10] Predicted sumoylation sites are present at K41, K44, K48, K54, and K57.[11] There are two predicted O-GlcNAcylation sites, at S100 and S101.[12]
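The modification sites listed above can be collected into a small data structure and checked against the 102-amino-acid protein length. The following Python sketch is illustrative only. The dictionary layout and helper names are hypothetical, and the site lists are copied from the text rather than recomputed from the sequence.

```python
PROTEIN_LENGTH = 102  # amino acids, as stated for the human protein

# Sites as reported in the text: residue letter followed by position.
ptm_sites = {
    "acetylation (mass spectrometry)": ["K42"],
    "phosphorylation (predicted)": ["T14", "S52", "S84", "S86"],
    "sumoylation (predicted)": ["K41", "K44", "K48", "K54", "K57"],
    "O-GlcNAcylation (predicted)": ["S100", "S101"],
}


def parse_site(site: str) -> tuple[str, int]:
    """Split a site label such as 'K42' into residue 'K' and position 42."""
    return site[0], int(site[1:])


# Any site whose position falls outside 1..PROTEIN_LENGTH would be suspect.
out_of_range = [
    site
    for sites in ptm_sites.values()
    for site in sites
    if not 1 <= parse_site(site)[1] <= PROTEIN_LENGTH
]

# Modified lysines cluster in one short stretch of the protein.
lysine_positions = sorted(
    parse_site(site)[1]
    for sites in ptm_sites.values()
    for site in sites
    if parse_site(site)[0] == "K"
)
lysine_span = lysine_positions[-1] - lysine_positions[0] + 1

print(out_of_range)
print(lysine_positions, lysine_span)
```

All listed positions fall within the protein, and the six modified lysines lie within a 17-residue window (positions 41 to 57).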
An analysis of the c5orf46 amino acid sequence revealed that the protein is likely to be secreted.[13] Further sequence analyses have predicted that the protein has one transmembrane domain, with an intracellular N-terminal domain.[14] [15]
Affinity purification-mass spectrometry data suggest that C5orf46 interacts with phosphopantothenoylcysteine synthetase (PPCS) and transmembrane BAX inhibitor motif containing 6 (TMBIM6).[16]
C5orf46 has been shown to be a prognostic marker in renal and cervical cancer, with high expression being linked to unfavorable outcomes. These conclusions were based on Human Protein Pathology Atlas gene expression analyses and survival outcomes of 651 and 291 patients with renal and cervical cancer, respectively.[17] In these analyses, patients who were classified as having high expression of c5orf46 were shown to have a 50% lower survival rate after 10 years than patients with low expression.