Chromosome 4 open reading frame 50 explained

Chromosome 4 open reading frame 50 is a protein that in humans is encoded by the C4orf50 gene.[1] The protein localizes to the nucleus. C4orf50 has orthologs in vertebrates but not in invertebrates.[2]

Gene

The C4orf50 gene is located on the minus strand of chromosome 4 at position 4p16.2.[3] The gene's longest isoform consists of 11 exons, a coding sequence of 6370 nucleotides, and an upstream in-frame stop codon.[4] Neighboring genes include CRMP1 and JAKMIP1.

Protein

C4orf50 is 1508 amino acids long and has a calculated molecular weight of 30 kDa. Its predicted isoelectric point is approximately pH 5.6.[5] Compared with typical human proteins, it is rich in glutamic acid and arginine and poor in phenylalanine and tyrosine.[6]
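A compositional bias like the one described above can be checked by comparing each residue's share of the sequence with a background frequency. The sketch below is only an illustration of that idea, not the SAPS method itself: the function names, the 1.5-fold threshold, the toy sequence, and the background percentages are all hypothetical placeholders rather than real C4orf50 data.

```python
from collections import Counter


def composition(seq):
    """Return each residue's share of the sequence as a percentage."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def flag_biased(seq, background, fold=1.5):
    """Split residues into over- and under-represented sets.

    A residue counts as over-represented when its share is at least
    `fold` times the background share, and as under-represented when
    it is at most the background share divided by `fold`.
    """
    comp = composition(seq)
    high = sorted(aa for aa, bg in background.items()
                  if comp.get(aa, 0.0) >= bg * fold)
    low = sorted(aa for aa, bg in background.items()
                 if comp.get(aa, 0.0) <= bg / fold)
    return high, low


# Hypothetical inputs: NOT the real C4orf50 sequence or a real background set.
toy_seq = "MEERREELRKAEERLGSEER"
toy_background = {"E": 7.0, "R": 5.5, "F": 4.0, "Y": 3.0}

high, low = flag_biased(toy_seq, toy_background)
print("over-represented:", high)
print("under-represented:", low)
```

With a real protein sequence and a background derived from a reference proteome, the same comparison would indicate which residues stand out, although the cutoff for calling a residue enriched or depleted is a choice that dedicated tools make with proper statistics.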

Tertiary structure

I-TASSER and Phyre2 predict that C4orf50 has a tertiary structure rich in alpha helices, concentrated near the N-terminus and C-terminus.[7][8]

Gene level regulation

Expression

C4orf50 RNA is expressed at low levels in most tissue types, with much higher expression in the brain, testis, adrenal gland, and prostate. Within the brain, it is expressed in specific regions including the hippocampus and striatum; regions with moderate expression include the frontal lobe, parietal lobe, and amygdala. All available RNA-sequencing data show C4orf50 expression in the brain.

Protein level regulation

Modification

C4orf50 is predicted to have 21 phosphorylation sites, one sulfonation site, one N-glycosylation site, and several O-glycosylation sites.[9]

Subcellular localization

The primary subcellular location is the nucleus. Immunofluorescent staining with C4orf50 antibodies shows that C4orf50 is present in the nucleus, but the reason for this localization remains unknown.[10] C4orf50 is less abundant than most human proteins.[10]

Evolution

Orthologs

C4orf50 in Homo sapiens is poorly conserved. It is found in vertebrates but not invertebrates, with orthologs in mammals, reptiles, birds, amphibians, and fish.[11] Table 1 below lists orthologs of C4orf50 from each of these groups. C4orf50 is evolving quickly compared with the reference proteins cytochrome c and fibrinogen alpha, as seen when the divergence rates of the three proteins are compared.

Genus and Species | Common Name | Taxonomic Group | Median Date of Divergence (MYA*) | Accession # | Sequence Length (aa) | Sequence Identity to Human Protein (%) | Sequence Similarity to Human Protein (%)
Homo sapiens | Human | Primate | 0 | XP_047271622 | 1508 | 100 | 100
Tupaia chinensis | Chinese Tree Shrew | Tupaiidae | 85 | XP_027622007 | 1448 | 93 | 53.2
Mus musculus | House Mouse | Rodentia | 87 | XP_006504299 | 1238 | 90 | 41.9
Talpa occidentalis | Iberian Mole | Talpidae | 94 | XP_037386436 | 1364 | 79 | 44.3
Mauremys mutica | Yellow Pond Turtle | Testudines | 319 | XP_044874448 | 1954 | 62 | 30.5
Alligator mississippiensis | American Alligator | Crocodilia | 319 | XP_019333198 | 1893 | 37 | 28.3
Apteryx rowi | Okarito Kiwi | Apterygiformes | 319 | XP_025910622 | 1459 | 8 | 47.2
Aquila chrysaetos chrysaetos | Golden Eagle | Accipitriformes | 319 | XP_040979081 | 1611 | 10 | 38.3
Gallus gallus | Chicken | Galliformes | 319 | XP_046772670 | 1627 | 7 | 44.6
Anser cygnoides | Swan Goose | Anseriformes | 319 | XP_047902118 | 1596 | 18 | 31.7
Falco cherrug | Saker Falcon | Falconiformes | 319 | XP_027669980 | 1518 | 8 | 50.4
Strigops | Kakapo | Psittaciformes | 319 | XP_030347251 | 1497 | 8 | 50.4
Geotrypetes seraphini | Gaboon Caecilian | Dermophiidae | 353 | XP_033815404 | 1897 | 11 | 37.8
Halichoerus grypus | Grey Seal | Phocidae | 94 | XP_035960566 | 1536 | 85 | 51
Amblyraja radiata | Thorny Skate | Rajiformes | 464 | XP_032876992 | 2434 | 74 | 50.8
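
Divergence-rate comparisons of this kind are often made by converting percent identity to a corrected distance and setting it against the divergence date. The article does not state which correction was used, so the sketch below assumes the Poisson correction, d = -ln(1 - p), where p is the fraction of differing sites. The function name and the example values are hypothetical and are not taken from the table.

```python
import math


def poisson_distance(identity_percent):
    """Poisson-corrected distance d = -ln(1 - p), p = fraction of differing sites."""
    if not 0 < identity_percent <= 100:
        raise ValueError("identity must be in the range (0, 100]")
    # 1 - p is the identity fraction, so -ln(1 - p) = ln(100 / identity_percent).
    return math.log(100.0 / identity_percent)


# Hypothetical (divergence date in MYA, percent identity) pairs for illustration.
example_points = [(0, 100.0), (90, 80.0), (320, 50.0)]

distances = [(mya, poisson_distance(ident)) for mya, ident in example_points]
for mya, d in distances:
    print(f"{mya:>4} MYA  corrected distance {d:.3f}")
```

Plotting such distances against divergence date for several proteins gives one line per protein, and a steeper slope suggests faster evolution under this simple model.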

Notes and References

  1. Web site: C4orf50 Gene - GeneCards CD050 Protein CD050 Antibody. 2022-07-29. www.genecards.org.
  2. Web site: Home - Protein - NCBI. 2022-07-29. www.ncbi.nlm.nih.gov.
  3. Web site: C4orf50 chromosome 4 open reading frame 50 [Homo sapiens (human)] - Gene - NCBI. 2022-07-29. www.ncbi.nlm.nih.gov.
  4. PREDICTED: Homo sapiens chromosome 4 open reading frame 50 (C4orf50), transcript variant X2, mRNA. 2022-04-05. en-US.
  5. Web site: ExPASy - Compute pI/Mw tool. 2022-07-29. web.expasy.org.
  6. Web site: SAPS < Sequence Statistics < EMBL-EBI. 2022-07-29. www.ebi.ac.uk.
  7. Web site: http://www.sbg.bio.ic.ac.uk/~phyre2/html/ . 2022-07-29. www.sbg.bio.ic.ac.uk.
  8. Web site: I-TASSER results. 2022-07-29. seq2fun.dcmb.med.umich.edu.
  9. Web site: Services. 2022-07-29. www.healthtech.dtu.dk. en.
  10. Web site: C4orf50 Antibody (PA5-63550). 2022-07-29. www.thermofisher.com.
  11. Web site: BLAST: Basic Local Alignment Search Tool. 2022-07-29. blast.ncbi.nlm.nih.gov.