FAM120AOS explained

FAM120AOS, or family with sequence similarity 120A opposite strand, codes for uncharacterized protein FAM120AOS, which currently has no known function.[1] Its gene ontology annotation is protein binding.[2] Across most datasets, the thyroid and the placenta are the two tissues with the highest expression levels of FAM120AOS.

The tissue expression pattern of FAM120AOS across multiple normal human tissues was assessed by microarray using GDS3834 data.[3] Three tissues are in the 90th percentile or higher for FAM120AOS gene expression: the bladder, epididymis, and thyroid. The thyroid is in the 91st percentile, while the other two are in the 90th percentile. Since high thyroid expression was also seen across the RNA-seq data,[4] [5] [6] [7] it appears that FAM120AOS expression may be important in the thyroid.

Gene

Common aliases

The common aliases for FAM120AOS are C9orf10OS, FLJ31534, LOC158293, and putative FAM120A opposite strand protein.

Locus

The reported coordinates of the gene differ between genome assemblies. In GRCh38/hg38 it is located at chr9:93,431,441-93,453,601, a span of 22,161 base pairs (bp) on the minus strand of the chromosome.[8] In GRCh37/hg19 it is located at chr9:96,208,776-96,215,874, a span of 7,099 bp, also on the minus strand. The genes found upstream of FAM120AOS on chromosome 9 are FGD3, SUSD, C9orf89, WNK2, C9orf129, and NINJ1. The genes found downstream of FAM120AOS on chromosome 9 are FAM120A and PHF2.
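The span lengths quoted above follow from treating the coordinates as 1-based, inclusive intervals. A minimal sketch of that arithmetic (the helper name `span_length` is illustrative, not part of any genome browser API):

```python
def span_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates as quoted in the text for each assembly.
hg38_length = span_length(93_431_441, 93_453_601)
hg19_length = span_length(96_208_776, 96_215_874)

print(hg38_length)  # 22161
print(hg19_length)  # 7099
```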

Number of exons

The longest isoform of FAM120AOS in humans contains 3 exons.

Span of gene

The mRNA transcript variant that encodes for human FAM120AOS isoform 1 is 5922 bp long and contains an upstream in-frame stop codon (taa) at 807-809 bp.[9]

Transcripts

There are 12 known isoforms of the human FAM120AOS gene. The most common transcript variant is isoform 1, which is 5922 bp in length and is the longer of the two protein-coding variants.[10] Transcript variants 3-12 are all non-coding RNAs, meaning that they do not code for a protein. The only protein-encoding isoforms of the human FAM120AOS gene are isoforms 1 and 2.

Compared with isoform 1, isoform 2 is 5008 bp in length, contains an alternate exon in the 5' UTR, is missing a portion of the 5' coding region, and initiates translation at an alternate start codon.[11] The encoded protein has a shorter, distinct N-terminus compared with isoform 1.

Non-coding RNAs

All of the following variations are described in comparison to isoform 1 of the human FAM120AOS gene. Isoform 3 is 2199 bp and uses an alternate splice site in the first exon.[12] Transcript variants (i.e. isoforms) 6-12 are all candidates for nonsense-mediated mRNA decay (NMD).

Isoform 4 of the gene is 2320 bp, uses an alternate splice site in the first exon, and contains an alternate internal exon.[13] Isoform 5 is 6043 bp and contains an alternate internal exon. Isoform 6 is 5272 bp and contains an alternate first exon and an alternate internal exon.[14] Isoform 7 is 5095 bp and contains an alternate first exon.[15] Isoform 8 is 5129 bp and contains an alternate first exon and an alternate internal exon.[16] Isoform 9 is 5151 bp and contains an alternate first exon.[17] Isoform 10 is 5354 bp and contains an alternate first exon.[18] Isoform 11 is 5475 bp and contains an alternate first exon and an alternate internal exon.[19] Lastly, isoform 12 is 5216 bp and contains an alternate first exon and an alternate internal exon.

Proteins

Isoforms

Two isoforms of the human FAM120AOS gene encode a protein, isoforms 1 and 2. The uncharacterized protein FAM120AOS isoform 1 is 256 amino acids long[20] and the uncharacterized protein FAM120AOS isoform 2 is 74 amino acids long.[21] Uncharacterized protein FAM120AOS isoform 1 is the longer and more abundant isoform found in humans, and contains protein domain Q5T035. The isoform also has a protein interactant, Q5T035-F120S_HUMAN, and CRISPR reagents and clone products for the protein are available.

Molecular weight

Uncharacterized protein FAM120AOS isoform 1 (protein isoform 1) in humans has a calculated molecular weight of 27.8 kDa. A theoretical isoelectric point of 11.93 was determined for the protein using ExPASy.[22] This high isoelectric point indicates that protein isoform 1 is primarily basic. Table 1 shows the isoelectric points and molecular weights for the orthologs of the human FAM120AOS protein 1 across Primates and Artiodactyla.[23] The isoelectric point of the protein remains within a pH of 10.05-11.93 across all orthologs, indicating that the protein is primarily basic in each of them. The molecular weight of the FAM120AOS protein, however, varies greatly between orthologs, from roughly 8 kDa to a maximum of 29.8 kDa. The sequences with a lower molecular weight are composed of fewer amino acids than the sequences with larger molecular weights. These length differences could also reflect different isoforms of the FAM120AOS protein being analyzed.

| Organism | Taxonomic Group | Isoelectric Point | Molecular Weight (in kDa) |
|---|---|---|---|
| Homo sapiens | Primates | 11.93 | 27.9 |
| Pan troglodytes | Primates | 11.92 | 27.7 |
| Pongo abelii | Primates | 11.69 | 27.8 |
| Nomascus leucogenys | Primates | 10.32 | 7.9 |
| Hylobates moloch | Primates | 10.06 | 8.1 |
| Trachypithecus francoisi | Primates | 11.35 | 8.1 |
| Rhinopithecus roxellana | Primates | 11.57 | 8.3 |
| Macaca nemestrina | Primates | 11.35 | 8.1 |
| Papio anubis | Primates | 10.98 | 8.2 |
| Carlito syrichta | Primates | 11.36 | 25.8 |
| Microcebus murinus | Primates | 11.52 | 29.8 |
| Muntiacus muntjak | Artiodactyla | 11.21 | 17.9 |
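The link between sequence length and molecular weight can be illustrated with a rough estimate. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb rather than the method used by the Compute pI/MW tool, and uses sequence lengths quoted elsewhere in this article; the function name is illustrative.

```python
# Assumption: ~110 Da per residue is a rule-of-thumb average, not the
# exact per-residue calculation performed by ExPASy Compute pI/MW.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mw_kda(n_residues: int) -> float:
    """Rough molecular weight in kDa for a protein of n_residues amino acids."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Sequence lengths (aa) as listed for these orthologs in this article.
lengths = {
    "Homo sapiens": 256,
    "Papio anubis": 74,
    "Microcebus murinus": 274,
}

for organism, n in lengths.items():
    print(f"{organism}: {n} aa, about {approx_mw_kda(n):.1f} kDa")
```

The estimates land close to the tabulated values, which supports reading the low molecular weights as a consequence of shorter sequences.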

Amino acid composition

Protein isoform 1 contains two different internal repeats in its amino acid sequence, identified by analyzing the protein sequence with Dotlet JS.[24] The first internal repeat occurs at amino acid positions 41–59 and 88–105. The second internal repeat occurs at amino acid positions 145–153 and 160–168. On the mRNA transcript, there is an upstream in-frame stop codon (taa) at nucleotide positions 806–808, an alternate polyadenylation site at nucleotide positions 2726–2731, and the polyadenylation signal used is at nucleotide positions 5889–5893. The amino acid positions L206-S211, H213, H215, K219-P225, and K227-C233 were found to be conserved across all of the strict orthologs of the human uncharacterized protein FAM120AOS isoform 1.[25] The amino acid G95 was found to be conserved across all Primates and Artiodactyla for which sequences were identified. The human FAM120AOS protein 1 was found to be arginine-rich and poor in glutamic acid and tyrosine.[26]
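A statement such as "arginine-rich" rests on per-residue frequencies. The following is a minimal sketch of that calculation; the input is a hypothetical toy peptide, not the FAM120AOS sequence, and the function name is illustrative.

```python
from collections import Counter


def composition(seq: str) -> dict[str, float]:
    """Fraction of each amino acid (one-letter code) in a protein sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in sorted(counts)}


# Hypothetical toy peptide used only to demonstrate the calculation.
toy_peptide = "MRRSPRLKKTKSRRAG"
comp = composition(toy_peptide)

# List residues from most to least frequent.
for aa, fraction in sorted(comp.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{aa}: {fraction:.3f}")
```

Comparing such fractions against typical proteome-wide frequencies is what flags a residue as enriched or depleted.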

Domains and motifs

The uncharacterized protein FAM120AOS isoform 1 in humans contains the protein domain Q5T035.

Two notable motifs found using a eukaryotic linear motif analysis for the human FAM120AOS protein 1 are TRG_RT_diArg_1 and TRG_NLS_MonoExtN_4.[27] The TRG_RT_diArg_1 motif is a di-arginine retention/retrieval signal that is present on membrane proteins, where it serves for ER localization. The TRG_NLS_MonoExtN_4 motif is a classical nuclear localization signal (NLS), which is possessed by many nuclear proteins, suggesting that the human FAM120AOS protein 1 may be a nuclear protein.

Secondary structure

The secondary structure of the human FAM120AOS protein 1 was predicted by the I-TASSER server and shows 11 alpha helices as follows, in order of position: SER15-TRP18, PRO25-SER27, THR34-TRP40, ALA85-ARG88, LYS111-ALA121, CYS145-ARG155, HIS158-ALA163, LEU169-LYS171, PRO179-ARG198, PRO225-CYS233, and PRO246-PHE252.[28]

Tertiary and quaternary structure

The tertiary structure of the human FAM120AOS protein 1 was predicted by the I-TASSER server with a C-score of -4.00 (C-scores typically range from -5 to 2, with higher values indicating higher confidence). It appears that the outermost parts of the protein are more solvent accessible, while the inner areas are less solvent accessible. In the structure rendering, the protein appears primarily blue, which is again consistent with a basic protein. PSORT II analysis also reported a peripheral likelihood of 1.48 at amino acid position 132.[29] The NUCDISC results reported a pat7 nuclear localization pattern, PLKKTKS (4), starting at amino acid position 168.
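The pat7 pattern is commonly described as a proline followed within three residues by a four-residue segment containing at least three lysine or arginine residues. The sketch below is a simplified reading of that rule, not the PSORT II implementation, and the function name is hypothetical.

```python
BASIC_RESIDUES = set("KR")


def find_pat7_like(seq: str) -> list[int]:
    """Return 0-based indices of P residues that start a pat7-like match.

    Simplified rule (an assumption, not PSORT II's code): a P followed
    within 3 residues by a 4-residue window with at least 3 K/R.
    """
    hits = []
    for i, residue in enumerate(seq):
        if residue != "P":
            continue
        for offset in (1, 2, 3):
            window = seq[i + offset : i + offset + 4]
            if len(window) == 4 and sum(c in BASIC_RESIDUES for c in window) >= 3:
                hits.append(i)
                break
    return hits


# The heptapeptide reported by NUCDISC in the text.
print(find_pat7_like("PLKKTKS"))  # [0]
```

Under this reading, PLKKTKS matches because the window KKTK, two residues after the proline, contains three basic residues.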

Gene regulation

Promoter

There are four different promoters for the human FAM120AOS gene, which are listed in the table below.[30] The promoter used for further analysis below (GXP_1829163) is 1665 base pairs long, spans coordinates 93450944–93452608, and has five coding transcripts.

| Promoter | Size (in base pairs) | Coordinates | Strand | Coding Transcripts |
|---|---|---|---|---|
| GXP_9004065 | 1040 | 93437082-93438121 | - | None (non-coding only) |
| GXP_228179 | 1040 | 93446357-93447396 | - | None (non-coding only) |
| GXP_1829163 | 1665 | 93450944-93452608 | - | 5 |
| GXP_2255852 | 1487 | 93453115-93454601 | - | 2 |

Transcription factor binding sites

The transcription factor binding sites described below were identified on the human FAM120AOS promoter (GXP_1829163).[31]

| Code Name | Full Name | Binding | Matrix Score | Start site | End site |
|---|---|---|---|---|---|
| AP2F | Activator protein 2 | agcGCCAgacggcac | 0.862 | 336 | 350 |
| STEM | Motif composed of binding sites for pluripotency or stem cell factors | cccgtctGCATggcccact | 0.912 | 255 | 273 |
| ZF20 | C2H2 Zinc finger transcription factors 20 | tgcggttACCA | 0.791 | 447 | 457 |
| E2FF | E2F-myc activator/cell cycle regulator | tggacacggGATAatgg | 0.754 | 29 | 45 |
| ZF5F | ZF5 POZ domain zinc finger | ccctgaGCGCcccaggc | 0.957 | 28 | 44 |
| P53F | P53 tumor suppressor | tgcggttaccaaaggCAAGtcagtg | 0.954 | 312 | 336 |
| RXRF | RXR heterodimer binding sites | ttattgacctagGGTCatattatag | 0.857 | 156 | 180 |
| EBOX | E-box binding factors | attatccCGTGtccaga | 0.901 | 466 | 482 |
| ZF02 | C2H2 Zinc finger transcription factors 2 | caaaagcaCCCCcctacacccgc | 0.933 | 91 | 113 |
| AP1R | MAF and AP1 related factors | ttggttGCTGagaaatttctagtag | 0.842 | 356 | 380 |
| PLAG | Pleomorphic adenoma gene | taggGGGGtgcttttgctttcct | 0.871 | 114 | 136 |
| KLFS | Krueppel like transcription factors | agagcttAAAGgattcttc | 0.976 | 118 | 136 |
| ETSF | Human and murine ETS1 factors | ttcagtgaGGAAagcaaaagc | 0.933 | 196 | 216 |
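In the binding sequences above, one stretch of each site is written in capital letters; in MatInspector-style output this conventionally marks the core of the matrix match. A small sketch, assuming that convention, extracts the capitalized core and checks that a site's length agrees with its start and end coordinates (both helper names are illustrative):

```python
import re


def core_sequence(site: str) -> str:
    """Return the first run of capital letters in a binding-site string."""
    match = re.search(r"[A-Z]+", site)
    return match.group(0) if match else ""


def length_matches(site: str, start: int, end: int) -> bool:
    """Check that the site length equals the inclusive coordinate span."""
    return len(site) == end - start + 1


# Two rows taken from the table: (code, binding sequence, start, end).
rows = [
    ("AP2F", "agcGCCAgacggcac", 336, 350),
    ("E2FF", "tggacacggGATAatgg", 29, 45),
]

for code, site, start, end in rows:
    print(code, core_sequence(site), length_matches(site, start, end))
```

The length check is also a convenient way to split fused numeric columns, since only one split of the score, start, and end values gives a span equal to the sequence length.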

Expression pattern

Immunohistochemical staining of the FAM120AOS protein in the human prostate using a FAM120AOS polyclonal antibody indicates the presence of FAM120AOS in the nucleus of glandular cells.[32]

In Homo sapiens (humans), the gene exhibits high levels of expression in the colon, fat, placenta, prostate, and thyroid, as determined through quantitative transcriptomic analysis (RNA-Seq), with respective RPKM values of 12.598, 11.727, 10.978, 11.277, and 13.511. During human fetal development, the gene exhibits the highest levels of expression in the intestine at 20 weeks and the lungs at 17 weeks, as determined in a circular RNA sequencing study, with respective mean RPKM values of 5.066 and 4.365. RNA sequencing of 20 human tissues showed the highest levels of FAM120AOS expression in the placenta, prostate, and thyroid, with respective mean RPKM values of 7.057, 3.978, and 4.396. Transcription profiling by high-throughput sequencing of RNA from 16 human tissues, both individually and in mixtures, also found high levels of FAM120AOS gene expression in the thyroid, with a mean RPKM of 9.518.
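As a small illustration, the RPKM values quoted for the first RNA-Seq dataset can be ranked to show which tissue leads. The variable names below are illustrative; the numbers are those given in the text.

```python
# RPKM values for the first RNA-Seq dataset, as quoted in the text.
rpkm_by_tissue = {
    "colon": 12.598,
    "fat": 11.727,
    "placenta": 10.978,
    "prostate": 11.277,
    "thyroid": 13.511,
}

# Sort tissues from highest to lowest expression.
ranked = sorted(rpkm_by_tissue.items(), key=lambda kv: kv[1], reverse=True)

for tissue, value in ranked:
    print(f"{tissue}: {value}")
```

In this dataset the thyroid ranks first, which matches the pattern reported for the other datasets.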

Transcript level regulation

There are 4 large stem loops predicted in the 5' UTR of the human FAM120AOS transcript variant 1 mRNA.[33] There are 8 miRNA binding sites identified for the same transcript.[34]

| miRNA Name | miRNA sequence | Target Score | Seed Location |
|---|---|---|---|
| hsa-miR-4286 | ACCCCACUCCUGGUACC | 94 | 475 |
| hsa-miR-3059-5p | UUUCCUCUCUGCCCCAUAGGGUGU | 88 | 199, 396 |
| hsa-miR-3152 | UGUGUUAGAAUAGGGGCAAUAA | 87 | 173, 735 |
| hsa-miR-4499 | AAGACUGAGAGGAGGGA | 83 | 730 |
| hsa-miR-129-2-3p | AAGCCCUUACCCCAAAAAGCAU | 83 | 1022 |
| hsa-miR-129-1-3p | AAGCCCUUACCCCAAAAAGUAU | 83 | 1022 |
| hsa-miR-6881-3p | AUCCUCUUUCGUCCUUCCCACU | 82 | 199, 395 |
| hsa-miR-10400-3p | CUGGGCUCCCGGACGAGGCGGG | 81 | 337 |
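Seed locations refer to where the miRNA seed pairs with the transcript. A minimal sketch, assuming the common convention that the seed is nucleotides 2 to 8 of the mature miRNA and that the target site is its reverse complement, is shown below. miRDB's scoring uses additional features beyond this, so this is not its prediction algorithm, and the function names are illustrative.

```python
# RNA complement table for Watson-Crick pairing.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed(mirna: str) -> str:
    """Nucleotides 2-8 of the mature miRNA (assumed seed convention)."""
    return mirna[1:8]


def seed_match_site(mirna: str) -> str:
    """Sequence on the mRNA (5'->3') that would pair with the seed."""
    return seed(mirna).translate(COMPLEMENT)[::-1]


mir_4286 = "ACCCCACUCCUGGUACC"  # sequence as listed in the table
print(seed(mir_4286))             # CCCCACU
print(seed_match_site(mir_4286))  # AGUGGGG
```

Searching a transcript for the returned site (with U replaced by T for a DNA-style record) gives candidate seed positions to compare against the reported locations.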

Protein level regulation

The k-NN prediction for the human FAM120AOS protein 1 placed it in the nucleus of cells. There is a possible transmembrane domain in the protein, spanning amino acid positions 131–148.[35]

Homology/evolution

No paralogs were identified for human FAM120AOS. The most distant detectable homolog of human FAM120AOS is that of Microcebus murinus, with 61.17% sequence identity to the human protein.[36] A total of 11 orthologs were identified for human FAM120AOS protein 1.[37] No proteins with homologous domains to the human FAM120AOS sequence were identified.[38] FAM120AOS seems to be evolving at a moderate rate, between those of cytochrome c and fibrinogen alpha.

| Genus and species | Common Name | Taxonomic group | Date of divergence (in MYA) | Accession number | Sequence length (in aa) | Sequence identity to human protein | Sequence similarity to human protein |
|---|---|---|---|---|---|---|---|
| Homo sapiens | Human | Primates | 0 | NP_942138.2 | 256 | 100.00% | 100% |
| Pan troglodytes | Chimpanzee | Primates | 6.4 | PNI17265.1 | 255 | 98.44% | 100% |
| Pongo abelii | Sumatran orangutan | Primates | 15.2 | PNJ71424.1 | 253 | 95.70% | 100% |
| Nomascus leucogenys | Northern white-cheeked gibbon | Primates | 19.8 | XP_030657822.1 | 73 | 94.12% | 26% |
| Hylobates moloch | Silvery gibbon | Primates | 19.8 | XP_032020454.1 | 86 | 92.65% | 26% |
| Trachypithecus francoisi | Francois' leaf monkey | Primates | 28.81 | XP_033092605.1 | 74 | 92.75% | 26% |
| Rhinopithecus roxellana | Golden snub-nosed monkey | Primates | 28.81 | XP_030775307.1 | 74 | 92.75% | 26% |
| Macaca nemestrina | Southern pig-tailed macaque | Primates | 28.81 | XP_024642522.1 | 74 | 92.75% | 26% |
| Papio anubis | Olive baboon | Primates | 28.81 | XP_003912044.1 | 74 | 92.30% | 26% |
| Carlito syrichta | Philippine tarsier | Primates | 69 | XP_021572479.1 | 236 | 66.82% | 78% |
| Microcebus murinus | Mouse lemur | Primates | 74.1 | XP_020144792 | 274 | 61.17% | 76% |
| Muntiacus muntjak | Indian muntjac | Artiodactyla | 94 | KAB0347543.1 | 161 | 97.18% | 27% |
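Evolutionary rate comparisons of this kind are often made by converting percent identity into a corrected divergence and relating it to divergence time. The source does not state which method was used; the sketch below assumes a simple Poisson-style correction, m = -100 ln(1 - n/100), where n is the observed percent difference, and the function name is illustrative.

```python
import math


def corrected_divergence(identity_pct: float) -> float:
    """Poisson-corrected divergence (per 100 residues) from percent identity.

    Assumption: m = -100 * ln(1 - n/100), with n = 100 - identity.
    """
    if not 0.0 < identity_pct <= 100.0:
        raise ValueError("identity must be in (0, 100]")
    n = 100.0 - identity_pct
    return -100.0 * math.log(1.0 - n / 100.0)


# (divergence time in MYA, % identity) for four orthologs in the table.
orthologs = {
    "Pan troglodytes": (6.4, 98.44),
    "Pongo abelii": (15.2, 95.70),
    "Carlito syrichta": (69.0, 66.82),
    "Microcebus murinus": (74.1, 61.17),
}

for species, (mya, identity) in orthologs.items():
    print(f"{species}: {mya} MYA, m = {corrected_divergence(identity):.2f}")
```

Plotting m against divergence time for FAM120AOS alongside the same curves for cytochrome c and fibrinogen alpha is one way to make the "moderate rate" comparison concrete.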

Function/biochemistry

The function and biochemistry of the human FAM120AOS protein are currently unknown. The single nucleotide polymorphisms (SNPs) did not show any mutations in conserved amino acids, so it is likely that two copies of the FAM120AOS gene are necessary for proper function.

Interacting proteins

The FAM120AOS protein is physically associated with the following proteins: MDFI, ELAVL1, TRIM25, and APEX1.[39] [40] [41] [42]

Clinical significance

A missense mutation in the FAM120AOS protein that changes the threonine at position 248 to isoleucine (T248I) has been linked in one whole-exome sequencing study to coarse facial features, scoliosis, pectus excavatum, skin laxity, hypotonia, GERD, hyperreactive airway disease, and undescended testicles.[43]

Notes and References

  1. Web site: AceView: Gene:FAM120AOS, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs. 2020-10-04. www.ncbi.nlm.nih.gov.
  2. Web site: FAM120AOS family with sequence similarity 120A opposite strand [Homo sapiens (human)] - Gene - NCBI. 2020-10-04. www.ncbi.nlm.nih.gov.
  3. Web site: GDS3834 / 7875. 2020-12-18. www.ncbi.nlm.nih.gov.
  4. Fagerberg L, Hallström BM, Oksvold P, Kampf C, Djureinovic D, Odeberg J, Habuka M, Tahmasebpoor S, Danielsson A, Edlund K, Asplund A, Sjöstedt E, Lundberg E, Szigyarto CA, Skogs M, Takanen JO, Berling H, Tegel H, Mulder J, Nilsson P, Schwenk JM, Lindskog C, Danielsson F, Mardinoglu A, Sivertsson A, von Feilitzen K, Forsberg M, Zwahlen M, Olsson I, Navani S, Huss M, Nielsen J, Ponten F, Uhlén M. Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Molecular & Cellular Proteomics. 13 (2): 397–406. February 2014. PMID 24309898. PMC 3916642. doi:10.1074/mcp.M113.035600.
  5. Szabo L, Morey R, Palpant NJ, Wang PL, Afari N, Jiang C, Parast MM, Murry CE, Laurent LC, Salzman J. Statistically based splicing detection reveals neural enrichment and tissue-specific induction of circular RNA during human fetal development. Genome Biology. 16 (1): 126. June 2015. PMID 26076956. PMC 4506483. doi:10.1186/s13059-015-0690-5.
  6. Web site: Illumina bodyMap2 transcriptome (ID 204271) - BioProject - NCBI. 2020-12-18. www.ncbi.nlm.nih.gov.
  7. Duff MO, Olson S, Wei X, Garrett SC, Osman A, Bolisetty M, Plocik A, Celniker SE, Graveley BR. Genome-wide identification of zero nucleotide recursive splicing in Drosophila. Nature. 521 (7552): 376–379. May 2015. PMID 25970244. PMC 4529404. doi:10.1038/nature14475. Bibcode 2015Natur.521..376D.
  8. Web site: FAM120AOS Gene - F120S Protein - F120S Antibody. 2020-10-04. www.genecards.org.
  9. NCBI. 2020-10-12. Homo sapiens family with sequence similarity 120A opposite strand (FAM120AOS), transcript variant 1, mRNA - NCBI Reference Sequence: NM_198841.4. GenBank Nucleotide. en-US.
  10. 2020-10-12. Homo sapiens family with sequence similarity 120A opposite strand (FAM120AOS), transcript variant 1, mRNA. en-US.
  11. 2020-12-12. Homo sapiens family with sequence similarity 120A opposite strand (FAM120AOS), transcript variant 2, mRNA. en-US.
  12. 2020-07-23. Homo sapiens family with sequence similarity 120A opposite strand (FAM120AOS), transcript variant 3, non-coding RNA. en-US.
  13. 2020-07-23. Homo sapiens family with sequence similarity 120A opposite strand (FAM120AOS), transcript variant 4, non-coding RNA. en-US.
  14. 2020-07-25. Homo sapiens family with sequence similarity 120A opposite strand (FAM120AOS), transcript variant 6, non-coding RNA. en-US.
  15. 2020-07-25. Homo sapiens family with sequence similarity 120A opposite strand (FAM120AOS), transcript variant 7, non-coding RNA. en-US.
  16. 2020-07-25. Homo sapiens family with sequence similarity 120A opposite strand (FAM120AOS), transcript variant 8, non-coding RNA. en-US.
  17. 2020-07-25. Homo sapiens family with sequence similarity 120A opposite strand (FAM120AOS), transcript variant 9, non-coding RNA. en-US.
  18. 2020-07-25. Homo sapiens family with sequence similarity 120A opposite strand (FAM120AOS), transcript variant 10, non-coding RNA. en-US.
  19. 2020-07-25. Homo sapiens family with sequence similarity 120A opposite strand (FAM120AOS), transcript variant 11, non-coding RNA. en-US.
  20. Web site: Uncharacterized protein FAM120AOS isoform 1 [Homo sapiens] - Protein - NCBI. 2020-12-18. www.ncbi.nlm.nih.gov.
  21. Web site: Uncharacterized protein FAM120AOS isoform 2 [Homo sapiens] - Protein - NCBI. 2020-12-18. www.ncbi.nlm.nih.gov.
  22. Gasteiger E, Gattiker A, Hoogland C, Ivanyi I, Appel RD, Bairoch A. ExPASy: The proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Research. 31 (13): 3784–8. July 2003. PMID 12824418. PMC 168970. doi:10.1093/nar/gkg563.
  23. Web site: Compute pI/MW - SIB Swiss Institute of Bioinformatics. 2020-12-19. www.Expasy.org.
  24. Web site: Dotlet JS. 2020-12-18. dotlet.vital-it.ch.
  25. Web site: Clustal Omega < Multiple Sequence Alignment < EMBL-EBI. 2020-12-19. www.ebi.ac.uk.
  26. Web site: EBI Tools. 2020-12-19. www.ebi.ac.uk.
  27. Web site: ELM. 2020-12-19. elm.eu.org.
  28. Web site: I-TASSER server for protein structure and function prediction. 2020-12-19. zhanglab.ccmb.med.umich.edu.
  29. Web site: PSORT II Prediction. 2020-12-19. psort.hgc.jp.
  30. Web site: Gene2Promoter. 2020-12-19. www.genomatix.de.
  31. Web site: MatInspector: Search for transcription factor binding sites. 2020-12-19. www.genomatix.de. Archived 2002-08-12 at https://web.archive.org/web/20020812010040/http://www.genomatix.de/online_help/help_matinspector/matinspector_help.html.
  32. Web site: FAM120AOS Antibodies. ThermoFisher Scientific.
  33. Web site: RNAfold web server. 2020-12-19. rna.tbi.univie.ac.at.
  34. Web site: miRDB - Custom Prediction. 2020-12-19. mirdb.org.
  35. Web site: EBI Tools: Job not available. 2020-12-19. www.ebi.ac.uk.
  36. Web site: LOW QUALITY PROTEIN: uncharacterized protein FAM120AOS [Microcebus mur - Protein - NCBI. https://www.ncbi.nlm.nih.gov/protein/XP_020144792. 2020-12-19. www.ncbi.nlm.nih.gov.
  37. Web site: BLAST: Basic Local Alignment Search Tool. 2020-12-19. blast.ncbi.nlm.nih.gov.
  38. Web site: Human BLAT Search. 2020-12-19. genome.ucsc.edu.
  39. Web site: MDFI - MyoD family inhibitor, isoform CRA_a - Homo sapiens (Human) - MDFI gene & protein. 2020-12-19. www.uniprot.org.
  40. Web site: ELAVL1 - ELAV-like protein 1 - Homo sapiens (Human) - ELAVL1 gene & protein. 2020-12-19. www.uniprot.org. en.
  41. Web site: TRIM25 Gene - TRI25 Protein -TRI25 Antibody. 2020-12-19. www.genecards.org.
  42. Web site: FAM120AOS (RP11-165J3.1) Result Summary - BioGRID. 2020-12-19. thebiogrid.org.
  43. Alazami AM, Patel N, Shamseldin HE, Anazi S, Al-Dosari MS, Alzahrani F, Hijazi H, Alshammari M, Aldahmesh MA, Salih MA, Faqeih E, Alhashem A, Bashiri FA, Al-Owain M, Kentab AY, Sogaty S, Al Tala S, Temsah MH, Tulbah M, Aljelaify RF, Alshahwan SA, Seidahmed MZ, Alhadid AA, Aldhalaan H, AlQallaf F, Kurdi W, Alfadhel M, Babay Z, Alsogheer M, Kaya N, Al-Hassnan ZN, Abdel-Salam GM, Al-Sannaa N, Al Mutairi F, El Khashab HY, Bohlega S, Jia X, Nguyen HC, Hammami R, Adly N, Mohamed JY, Abdulwahab F, Ibrahim N, Naim EA, Al-Younes B, Meyer BF, Hashem M, Shaheen R, Xiong Y, Abouelhoda M, Aldeeri AA, Monies DM, Alkuraya FS. Accelerating novel candidate gene discovery in neurogenetic disorders via whole-exome sequencing of prescreened multiplex consanguineous families. Cell Reports. 10 (2): 148–61. January 2015. PMID 25558065. doi:10.1016/j.celrep.2014.12.015.