CCDC121 explained

Coiled-coil domain containing 121 (CCDC121) is a protein encoded by the CCDC121 gene in humans. CCDC121 is located on the minus strand of chromosome 2 and encodes three protein isoforms.[1] All isoforms of CCDC121 contain a domain of unknown function referred to as DUF4515 or pfam14988.

Gene

Aliases, locus and size

Known aliases of CCDC121 include FLJ43364, FLJ13646, hCG_1988995, LOC79635, and coiled-coil domain containing 121.[2]

CCDC121 is located on the minus strand of chromosome 2 at 2q23.3. It is 3,394 base pairs in length.[1]

Isoforms and alternative splicing

CCDC121 produces four different mRNAs: three alternatively spliced variants and one unspliced form.[3] The three alternatively spliced mRNAs give rise to three known protein isoforms. The transcripts for isoforms 1, 2, and 3 are 2,880, 2,762, and 2,361 base pairs in length, respectively.[4] [5] [6] Each of the mRNA variants contains two exons separated by a gt-ag intron.[1]

Protein

Primary sequence, molecular weight and pI

Protein | Accession number | Length (amino acids) | Molecular weight (kDa) | Predicted pI
Isoform 1[7] | XP_005264617 | 442 | 50.8 | 9.80
Isoform 2[8] | NP_001136155 | 440 | 50.9 | 9.81
Isoform 3[9] | NP_078860 | 278 | 33.1 | 9.84

Molecular weight and pI were calculated using ExPASy.[10]
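As a rough illustration of how a molecular weight like those in the table can be estimated from a primary sequence, the sketch below sums standard average residue masses and adds one water for the free termini. It is not the ExPASy implementation, and the function name and the toy peptide are hypothetical; the CCDC121 sequences themselves are not reproduced here.

```python
# Average residue masses in daltons (amino acid minus one water molecule).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the N- and C-terminal groups


def molecular_weight_kda(sequence: str) -> float:
    """Estimate the average molecular weight of an unmodified peptide in kDa."""
    seq = sequence.upper()
    total_da = sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER_MASS
    return total_da / 1000.0


# Hypothetical toy peptide, NOT a CCDC121 sequence.
toy_peptide = "MKRQLEEK"
print(f"{molecular_weight_kda(toy_peptide):.3f} kDa")
```

Running the same function on a full isoform sequence retrieved from the accession numbers above should give a value close to the tabulated one, although small differences from rounding or from the mass table used are possible.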

Compositional analysis

Compositional analysis of all isoforms shows that they have below-average levels of aspartate (D) and valine (V) and above-average levels of glutamine (Q). In addition, they have above-average levels of lysine (K) and arginine (R) groupings. Isoform 3 also exhibits above-average lysine levels and below-average proline and glycine levels. Chimpanzee, dog, and ferret orthologs also exhibited above-average glutamine levels and lysine and arginine groupings.[11]
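The first step of such a compositional analysis, counting how often each residue occurs, can be sketched as follows. This only computes raw percentages; the comparison against reference levels that the cited tool performs is not reproduced. The function name and the toy sequence are hypothetical.

```python
from collections import Counter


def composition_percent(sequence: str) -> dict[str, float]:
    """Return the percentage of each amino acid in a one-letter sequence."""
    seq = sequence.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in sorted(counts.items())}


# Hypothetical toy sequence, NOT a CCDC121 sequence.
toy_sequence = "MKKRQQELKR"
for aa, pct in composition_percent(toy_sequence).items():
    print(f"{aa}: {pct:.1f}%")
```

Deciding whether a residue is above or below average would additionally require a reference composition, for example one derived from the proteome of the same species.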

Secondary and tertiary structure

The secondary structure of CCDC121 was predicted using Ali2D.[12] CCDC121 is predicted to be predominantly alpha-helical, consistent with the presence of its coiled-coil motifs.[13]

The tertiary structure of CCDC121 is composed mostly of alpha helices and contains some random coil.[14]

Domains, motifs and post-translational modifications

CCDC121 contains one domain of unknown function, DUF4515 or pfam14988. It also contains three predicted coiled-coil motifs from residues A165 to E192, L264 to E305 and N363 to E397.[15]
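The residue ranges above can be turned into segment lengths with simple inclusive arithmetic. The sketch below assumes the coordinates refer to isoform 1 (442 amino acids, as the cited prediction indicates) and also expresses each length in heptads, the seven-residue repeat typical of coiled coils. The heptad count is only a rough reading, since predicted boundaries need not coincide with repeat boundaries.

```python
# Predicted coiled-coil spans (inclusive residue numbers) from the text.
COILED_COILS = [(165, 192), (264, 305), (363, 397)]
ISOFORM_1_LENGTH = 442  # assumption: coordinates refer to isoform 1


def span_length(start: int, end: int) -> int:
    """Length of an inclusive residue range."""
    return end - start + 1


total = 0
for start, end in COILED_COILS:
    n = span_length(start, end)
    total += n
    print(f"{start}-{end}: {n} residues, {n / 7:g} heptads")

coverage = 100.0 * total / ISOFORM_1_LENGTH
print(f"Coiled-coil coverage: {total} residues ({coverage:.1f}% of isoform 1)")
```

Under these assumptions the three segments are 28, 42, and 35 residues long, which together make up a little under a quarter of isoform 1.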

CCDC121 is predicted to have post-translational modification sites for acetylation,[16] [17] protein kinase C and casein kinase II phosphorylation,[18] glycation,[19] GalNAc O-glycosylation,[20] SUMOylation,[21] [22] and O-β-GlcNAc attachment.[23]

Subcellular localization

Current evidence suggests that CCDC121 is partially localized to the nucleus. CCDC121 has a predicted nuclear localization signal spanning residues R327 to L337. This sequence has a score of 7, which is consistent with partial nuclear localization.[24] In addition, PSORT II predicted a 56.5% chance that CCDC121 is found in the nucleus.[25]

There is also evidence that CCDC121 is partially located in the cytosol. Cytochemistry studies using the anti-CCDC121 antibody from The Human Protein Atlas indicate that CCDC121 localizes to the cytosol and actin filaments. These results are tentative, and confirmation with other anti-CCDC121 antibodies is needed.[26] Additionally, TargetP did not find a mitochondrial transit peptide, which suggests that CCDC121 is likely not a mitochondrial protein.[27]

Expression and function

CCDC121 is expressed at the highest levels in the testes, ovaries, prostate, and thyroid.[28] Its expression is 40% lower than that of the average gene, so it is considered to be expressed at a low level.[3]

The function of the CCDC121 protein is not yet well understood. There is no known phenotype associated with the CCDC121 gene.[3]

Homology

Rate of molecular evolution

Cytochrome c is a highly conserved protein, whereas fibrinogen is a rapidly evolving protein. CCDC121 has a faster rate of molecular evolution than both of these proteins, suggesting that CCDC121 evolves very rapidly on an evolutionary timescale.

Orthologs

There are 126 confirmed orthologs of CCDC121.[29] CCDC121 orthologs are most abundant in mammals: 122 of the 126 orthologs fall within the Eutheria, Marsupialia, and Monotremata clades, and 119 of those 122 mammalian orthologs are found in eutherian mammals. The four remaining orthologs belong to the two-lined caecilian (Amphibia), the West Indian Ocean coelacanth (Sarcopterygii), and the electric eel and the northern pike (both Actinopterygii). The CCDC121 gene likely appeared 433 million years ago in a common ancestor of Actinopterygii and Sarcopterygii.

Paralogs

CCDC166 is the only known paralog of CCDC121. The two proteins share 23% sequence identity. Both CCDC121 and CCDC166 include the Domain of Unknown Function 4515 (DUF4515), or pfam14988, as a highly conserved sequence.[30]
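A percent identity figure of this kind is typically derived from a pairwise alignment by counting matching columns. The sketch below shows one common convention, in which columns containing a gap in either sequence are excluded from the denominator. Alignment tools differ in how they define the denominator, so this is not necessarily the convention behind the 23% value. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, NOT the CCDC121/CCDC166 alignment.
print(percent_identity("MKR-QLE", "MRRAQ-E"))
```

In the toy alignment, five columns are free of gaps and four of them match, so the function reports 80% identity.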

Clinical significance

Mutations in the CCDC121 gene have been found in patients with certain cancers, including endometrial, lung, bladder, gastric, head and neck, and prostate cancers, but no causal relationship has been determined.[31] [32] CCDC121 may also serve as a marker gene for inner ear development.[33]

References

  1. https://www.genecards.org/cgi-bin/carddisp.pl?gene=CCDC121 GeneCards entry on CCDC121
  2. https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/25833 HGNC (HUGO Gene Nomenclature Committee) entry on CCDC121
  3. https://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/av.cgi?db=human&term=ccdc121&submit=Go NCBI-Aceview entry on CCDC121
  4. PREDICTED: Homo sapiens coiled-coil domain containing 121 (CCDC121), transcript variant X1, mRNA. NCBI Nucleotide. https://www.ncbi.nlm.nih.gov/nuccore/XM_005264560.4
  5. Homo sapiens coiled-coil domain containing 121 (CCDC121), transcript variant 2, mRNA. NCBI Nucleotide. https://www.ncbi.nlm.nih.gov/nuccore/NM_001142683.2
  6. Homo sapiens coiled-coil domain containing 121 (CCDC121), transcript variant 3, mRNA. NCBI Nucleotide. https://www.ncbi.nlm.nih.gov/nuccore/NM_024584.4
  7. coiled-coil domain-containing protein 121 isoform X1 [Homo sapiens]. NCBI Protein. https://www.ncbi.nlm.nih.gov/protein/530368104
  8. coiled-coil domain-containing protein 121 isoform 2 [Homo sapiens]. NCBI Protein. https://www.ncbi.nlm.nih.gov/protein/218083720
  9. coiled-coil domain-containing protein 121 isoform 3 [Homo sapiens]. NCBI Protein. https://www.ncbi.nlm.nih.gov/protein/39979626
  10. https://web.expasy.org/compute_pi/ ExPASy Compute pI/MW tool
  11. https://www.ebi.ac.uk/Tools/seqstats/saps/ Statistical Analysis of Protein Sequences Tool
  12. https://toolkit.tuebingen.mpg.de/tools/ali2d Secondary Structure prediction for CCDC121. Ali2D
  13. Secondary Structure Prediction for CCDC121. Chou-Fasman Secondary Structure Prediction Server (CFSSP). http://www.biogem.org/tool/chou-fasman/
  14. The Phyre2 web portal for protein modeling, prediction and analysis. Kelley LA et al.. Nature Protocols 10, 845-858 (2015) http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index
  15. https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl? Coiled-Coils Prediction. Prediction for CCDC121 isoform 1
  16. http://www.cbs.dtu.dk/services/NetAcet/ NETAcet-1.0 Server
  17. http://terminus.unige.ch/ Terminus--N-Terminal PTM prediction. Swiss Institute of Bioinformatics
  18. http://www.cbs.dtu.dk/services/NetPhos/ NetPhos-3.1 Server. Prediction for CCDC121
  19. http://www.cbs.dtu.dk/services/NetGlycate/ NetGlycate-1.0 Server. Prediction for CCDC121
  20. http://www.cbs.dtu.dk/services/NetOGlyc/ NetOGlyc-4.0 Server. Prediction for CCDC121
  21. http://www.abcepta.com/sumoplot SUMOplot™ Analysis Program. Prediction for CCDC121
  22. GPS-SUMO: Prediction of SUMOylation Sites & SUMO-binding Motifs. Prediction for CCDC121. http://sumosp.biocuckoo.org/online.php
  23. http://www.cbs.dtu.dk/services/YinOYang/ YinOYang 1.2 Server. Prediction for CCDC121
  24. https://web.archive.org/web/20211122095245/http://nls-mapper.iab.keio.ac.jp/cgi-bin/NLS_Mapper_form.cgi NLS Mapper. Prediction for CCDC121. Accessed 2020-05-03; archived 2021-11-22 (original link dead)
  25. https://psort.hgc.jp/form2.html PSORT II Prediction Tool
  26. https://www.proteinatlas.org/ENSG00000176714-CCDC121/cell#rna Human Protein Atlas entry on CCDC121
  27. http://www.cbs.dtu.dk/services/TargetP/ TargetP-2.0 Server
  28. https://www.ncbi.nlm.nih.gov/geo/tools/profileGraph.cgi?ID=GDS3113:181244 NCBI GeoProfile entry on CCDC121. GDS3113 Various normal tissues
  29. https://www.ncbi.nlm.nih.gov/gene/79635#gene-expression NCBI Gene Database entry on CCDC121
  30. Clustal Omega: Multiple Sequence Alignment Tool. Alignment of CCDC166 and CCDC121 isoforms 1 and 2. https://www.ebi.ac.uk/Tools/msa/clustalo/
  31. Zhang J, Huang JY, Chen YN, Yuan F, Zhang H, Yan FH, Wang MJ, Wang G, Su M, Lu G, Huang Y, Dai H, Ji J, Zhang J, Zhang JN, Jiang YN, Chen SJ, Zhu ZG, Yu YY (October 2015). Erratum: Whole genome and transcriptome sequencing of matched primary and peritoneal metastatic gastric carcinoma. Scientific Reports, 5, 15309. PMID 26485306. PMC 4613365. Bibcode 2015NatSR...515309Z. https://doi.org/10.1038/srep15309
  32. https://www.phosphosite.org/proteinAction?id=21651&showAllSites=true#appletMsg PhosphoSitePlus® entry on CCDC121 protein
  33. Liu, Q., Chen, J., Gao, X., Ding, J., Tang, Z., Zhang, C., … Wang, J. (2015). Identification of stage-specific markers during the differentiation of hair cells from mouse inner ear stem cells or progenitor cells in vitro. International Journal of Biochemistry and Cell Biology, 60, 99–111. https://doi.org/10.1016/j.biocel.2014.12.024