C1orf159 is a protein that in humans is encoded by the C1orf159 gene, located on chromosome 1.[1] [2] The gene is an unfavorable prognostic marker for renal and liver cancer and a favorable prognostic marker for urothelial cancer.[3]
The Homo sapiens C1orf159 gene (UniProt ID: Q96HA4) is located on the short arm of chromosome 1 at locus 1p36.33. The gene is 34,247 base pairs in length and spans positions 1,081,818 to 1,116,089 of chromosome 1 on the reverse strand.[4]
The longest transcript variant of the human C1orf159 gene is an mRNA of 2,432 nucleotides with 12 exons.[5] A promoter region 762 nucleotides long was predicted using the UCSC Genome Browser.[6] It comprises 434 nucleotides upstream of the transcription start site, exon 1, and 298 nucleotides of intron 1.
Alternative splicing of the gene creates 5 protein isoforms. The longest isoform is 380 amino acids in length with a molecular mass of 40.382 kDa.
Isoform | UniProt ID | Length (amino acids)
1 | Q96HA4-1 | 380
2 | Q96HA4-2 | 185
3 | Q96HA4-3 | 189
4 | Q96HA4-4 | 198
5 | Q96HA4-5 | 254
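As a quick sanity check on the figures above, the following minimal Python sketch derives the mean mass per residue of the longest isoform and the length of each isoform relative to it. The function and variable names are illustrative assumptions, not part of any database or tool; only the lengths and the 40.382 kDa mass come from the text.

```python
# Isoform lengths (amino acids) as listed in the isoform table.
ISOFORM_LENGTHS = {
    "Q96HA4-1": 380,
    "Q96HA4-2": 185,
    "Q96HA4-3": 189,
    "Q96HA4-4": 198,
    "Q96HA4-5": 254,
}

# Stated molecular mass of the longest (380-residue) isoform.
LONGEST_MASS_KDA = 40.382


def mean_residue_mass_da(mass_kda, length_aa):
    """Average mass per amino acid residue, in daltons."""
    return mass_kda * 1000.0 / length_aa


def relative_lengths(lengths):
    """Length of each isoform as a fraction of the longest isoform."""
    longest = max(lengths.values())
    return {name: n / longest for name, n in lengths.items()}


if __name__ == "__main__":
    per_residue = mean_residue_mass_da(LONGEST_MASS_KDA, ISOFORM_LENGTHS["Q96HA4-1"])
    print(f"Mean residue mass: {per_residue:.2f} Da")
    for name, frac in relative_lengths(ISOFORM_LENGTHS).items():
        print(f"{name}: {frac:.1%} of the longest isoform")
```

The mean residue mass works out to roughly 106 Da, so the stated mass and length are consistent with each other.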
The C1orf159 protein is rich in proline and arginine and poor in lysine and glutamic acid. The isoelectric point (pI) of the human C1orf159 protein is 10.07,[8] which is more basic than the average pI of 7.36 for human proteins.[9]
The human C1orf159 protein contains the domain of unknown function DUF4501. Although the exact function of the domain is not clear, proteins containing it are thought to be single-pass membrane proteins with highly conserved cysteine residues.
The protein also contains a transmembrane domain at positions 144-169 and a signal peptide at positions 1-18.
AlphaFold predicts the structure of the human C1orf159 protein to be composed mainly of alpha-helices.[10]
The predicted post-translational modifications of the C1orf159 protein include N-linked glycosylation on asparagine at positions 104, 111, and 128.[11]
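The feature positions given above can be laid out along the sequence with a few lines of Python. This is a minimal sketch that assumes the stated positions refer to the 380-residue isoform; the segment names and helper functions are hypothetical, and the sketch makes no claim about which side of the membrane each region faces.

```python
PROTEIN_LENGTH = 380            # longest isoform (assumed reference)
SIGNAL_PEPTIDE = (1, 18)        # inclusive residue range from the text
TRANSMEMBRANE = (144, 169)      # inclusive residue range from the text
GLYCOSYLATION_SITES = [104, 111, 128]


def segment_length(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


def partition(length, signal, tm):
    """Split the sequence into four consecutive inclusive segments."""
    return {
        "signal_peptide": signal,
        "n_terminal_region": (signal[1] + 1, tm[0] - 1),
        "transmembrane": tm,
        "c_terminal_region": (tm[1] + 1, length),
    }


def locate(position, segments):
    """Return the name of the segment containing a residue position."""
    for name, (start, end) in segments.items():
        if start <= position <= end:
            return name
    return None


if __name__ == "__main__":
    segments = partition(PROTEIN_LENGTH, SIGNAL_PEPTIDE, TRANSMEMBRANE)
    for name, (start, end) in segments.items():
        print(f"{name}: {start}-{end} ({segment_length(start, end)} residues)")
    for site in GLYCOSYLATION_SITES:
        print(f"N-glycosylation site {site}: {locate(site, segments)}")
```

Under these assumptions, all three predicted glycosylation sites fall in the region between the signal peptide and the transmembrane domain.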
Orthologs of human C1orf159 are found in vertebrates, including mammals, birds, reptiles, amphibians, and fish.[12] The most distantly related group with an ortholog is the cartilaginous fish, which diverged from humans approximately 450 million years ago.[13] Orthologs are not found in jawless fish or invertebrates.
Organism | Taxonomic Group | Order | NCBI Protein Accession Number | Protein Sequence Similarity (% Relative to Human Protein)
Human | Mammals | Primates | NP_001317235.1 | 100.0
Chimpanzee | Mammals | Primates | XP_024204744.1 | 98.4
Bonobo | Mammals | Primates | XP_008975653.2 | 88.9
House Mouse | Mammals | Rodentia | NP_796179.1 | 40.9
Cattle | Mammals | Artiodactyla | NP_001026925.1 | 36.6
Sunda Flying Lemur | Mammals | Dermoptera | XP_008567908.1 | 39.4
Chinese Tree Shrew | Mammals | Scandentia | XP_027622332.1 | 35.8
Cougar | Mammals | Carnivora | XP_025768111.1 | 41.7
Chicken | Birds | Galliformes | XP_024998437.2 | 32.8
Rock Pigeon | Birds | Columbiformes | XP_013226562.2 | 35.7
Hooded Crow | Birds | Passeriformes | XP_039420032.1 | 29.9
Golden-collared Manakin | Birds | Passeriformes | XP_017934783.1 | 36.5
Gharial | Reptiles | Crocodilia | XP_019367354.1 | 36.8
Leatherback Sea Turtle | Reptiles | Testudines | XP_027584571.1 | 35.9
Chinese Softshell Turtle | Reptiles | Testudines | XP_006127168.1 | 35.2
Western Clawed Frog | Amphibians | Anura | NP_001039047.1 | 34.6
Two-lined Caecilian | Amphibians | Gymnophiona | XP_029433955.1 | 33.9
Asiatic Toad | Amphibians | Anura | XP_044137731.1 | 31.6
Zebrafish | Fish | Cypriniformes | NP_001313355.1 | 26.4
Sterlet | Fish | Acipenseriformes | XP_034760226.1 | 32.8
Reedfish | Fish | Polypteriformes | XP_028663678.1 | 32.9
Small-spotted Catshark | Fish | Carcharhiniformes | XP_038629468.1 | 28.0
Whale Shark | Fish | Orectolobiformes | XP_020381962.1 | 32.9
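To summarize the ortholog table, the similarity values can be averaged per taxonomic group. The sketch below is illustrative only: the data are copied from the table, the Western Clawed Frog is grouped with the amphibians because Anura is an amphibian order, and the function names are assumptions rather than part of any existing tool.

```python
from collections import defaultdict

# (organism, taxonomic group, % similarity to the human protein), from the table.
ORTHOLOGS = [
    ("Human", "Mammals", 100.0),
    ("Chimpanzee", "Mammals", 98.4),
    ("Bonobo", "Mammals", 88.9),
    ("House Mouse", "Mammals", 40.9),
    ("Cattle", "Mammals", 36.6),
    ("Sunda Flying Lemur", "Mammals", 39.4),
    ("Chinese Tree Shrew", "Mammals", 35.8),
    ("Cougar", "Mammals", 41.7),
    ("Chicken", "Birds", 32.8),
    ("Rock Pigeon", "Birds", 35.7),
    ("Hooded Crow", "Birds", 29.9),
    ("Golden-collared Manakin", "Birds", 36.5),
    ("Gharial", "Reptiles", 36.8),
    ("Leatherback Sea Turtle", "Reptiles", 35.9),
    ("Chinese Softshell Turtle", "Reptiles", 35.2),
    ("Western Clawed Frog", "Amphibians", 34.6),  # Anura is an amphibian order
    ("Two-lined Caecilian", "Amphibians", 33.9),
    ("Asiatic Toad", "Amphibians", 31.6),
    ("Zebrafish", "Fish", 26.4),
    ("Sterlet", "Fish", 32.8),
    ("Reedfish", "Fish", 32.9),
    ("Small-spotted Catshark", "Fish", 28.0),
    ("Whale Shark", "Fish", 32.9),
]


def mean_similarity_by_group(rows):
    """Average % similarity for each taxonomic group."""
    grouped = defaultdict(list)
    for _organism, group, similarity in rows:
        grouped[group].append(similarity)
    return {group: sum(vals) / len(vals) for group, vals in grouped.items()}


def least_similar(rows):
    """Return the (organism, group, similarity) row with the lowest similarity."""
    return min(rows, key=lambda row: row[2])


if __name__ == "__main__":
    for group, mean in mean_similarity_by_group(ORTHOLOGS).items():
        print(f"{group}: {mean:.1f}%")
    print("Least similar:", least_similar(ORTHOLOGS))
```

Outside the primates, every listed ortholog is below 42% similarity to the human protein, so the mammalian mean is dominated by the three primate entries.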
When its rate of evolution is compared with those of cytochrome c and fibrinogen alpha, the C1orf159 protein changes at a rate similar to that of the fast-evolving fibrinogen alpha, indicating that C1orf159 evolves relatively quickly.
The Human Protein Atlas reports C1orf159 as an unfavorable prognostic marker for renal and liver cancer and a favorable prognostic marker for urothelial cancer. High expression of C1orf159 is associated with lower survival probability in patients with renal or liver cancer and with higher survival probability in patients with urothelial cancer.
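The text does not say how the rates were compared. One common approach, sketched here as an assumption rather than as the method actually used, is to convert the observed fraction of changed positions into a corrected divergence m = -100 ln(1 - n), which accounts for multiple substitutions at the same site, and then divide by the divergence time. The correction is normally applied to sequence identity; the example instead uses the similarity value listed for the Whale Shark ortholog (32.9%) together with the roughly 450 million years quoted for cartilaginous fish, so the result is illustrative only. All function names are hypothetical.

```python
import math


def corrected_divergence(percent_similarity):
    """Corrected divergence m = -100 * ln(1 - n), where n is the
    fraction of positions that differ (here 1 - similarity / 100)."""
    if not 0.0 < percent_similarity <= 100.0:
        raise ValueError("percent_similarity must be in (0, 100]")
    fraction_changed = 1.0 - percent_similarity / 100.0
    return -100.0 * math.log(1.0 - fraction_changed)


def divergence_per_million_years(percent_similarity, divergence_mya):
    """Corrected divergence accumulated per million years since the split."""
    return corrected_divergence(percent_similarity) / divergence_mya


if __name__ == "__main__":
    # Whale Shark ortholog: 32.9% similarity, ~450 million years of divergence.
    m = corrected_divergence(32.9)
    rate = divergence_per_million_years(32.9, 450.0)
    print(f"Corrected divergence: {m:.1f}")
    print(f"Rate: {rate:.3f} per million years")
```

Applying the same calculation to cytochrome c and fibrinogen alpha orthologs from the same species would give the reference slopes against which a protein's rate can be judged.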