C11orf49 Explained

C11orf49 is a protein-coding gene that in humans encodes the C11orf49 protein. It is heavily expressed in brain tissue and in peripheral blood mononuclear cells, the latter being an important component of the immune system.[1][2] The C11orf49 protein is predicted to act as a kinase, and it has been shown to interact with HTT (huntingtin, the protein whose mutation causes Huntington's disease) and APOE2 (an apolipoprotein E variant associated with Alzheimer's disease).[3]

Gene

Aliases

Common aliases are UPF0705, FLJ22210, and MGC4707.

Location

C11orf49 is located at 11p11.2 on the plus strand of human chromosome 11. The gene is 224,830 bp long including introns, and spans positions 46,936,806 to 47,161,635 on chromosome 11.[4]
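The stated length follows from the two coordinates if both endpoints are counted. The sketch below reproduces that arithmetic, assuming the positions are 1-based and inclusive (an assumption, though it is the reading under which the numbers agree); `span_length` is a hypothetical helper name.

```python
def span_length(start: int, end: int) -> int:
    """Length in bp of a genomic interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates of C11orf49 on chromosome 11, as given in the text.
gene_length = span_length(46_936_806, 47_161_635)
print(gene_length)  # 224830
```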

Transcript Variants

There are 7 known transcript variants for the mRNA of C11orf49, with variant 2 encoding the most complete protein. Variant 1 lacks a 3’ splice junction, which results in a truncated 3’ terminus compared to variant 2. Variant 3 uses an alternate splice site at the 3’ end and therefore lacks an internal region near the 3’ terminus compared to variant 2. Variant 4 has an alternate 3’ terminal exon, resulting in a truncated 3’ terminus compared to variant 2. Variant 5 lacks an exon in the 5’ coding region, which results in an upstream start codon, and has an alternate splice site near the 3’ end. This results in a distinct N-terminus and a missing internal region near the 3’ terminus compared to variant 2. Variants 6 and 7 are both candidates for nonsense-mediated mRNA decay (NMD) and do not encode viable proteins.[5]

Name | Accession Number | Number of Exons | Size (bp)
Transcript Variant 1 | NM_001003676.3 | 8 | 1923
Transcript Variant 2 | NM_001003677.3 | 9 | 1668
Transcript Variant 3 | NM_024113.5 | 8 | 1650
Transcript Variant 4 | NM_001003678.3 | 9 | 1159
Transcript Variant 5 | NM_001278222.1 | 8 | 1619
Transcript Variant 6 | NR_103471.2 | 10 | 1895
Transcript Variant 7 | NR_103472.2 | 8 | 1519
Table 1. Known human mRNA transcript variants for C11orf49.

Protein

Isoforms

There are 5 known isoforms for the C11orf49 protein with isoform 2 being the most complete protein, encoded by transcript variant 2.

Name | Accession Number | Size (AA)
Isoform 1 | NP_001003676.1 | 274
Isoform 2 | NP_001003677.1 | 337
Isoform 3 | NP_077018.1 | 331
Isoform 4 | NP_001003678.1 | 326
Isoform 5 | NP_001265151.1 | 322
Table 2. Known human protein isoforms for C11orf49.

Composition

The C11orf49 protein has a molecular weight of 38.1 kDa and an isoelectric point (pI) of about 5.[6] The abundance of each amino acid falls within the normal range, and no conserved repeats, patterns, or charged clusters were detected. No hydrophobic or transmembrane regions were detected either.[7]
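As a rough plausibility check on the stated molecular weight, a protein's mass can be approximated from its length alone. The sketch below assumes the common rule of thumb of about 110 Da per residue and assumes the 38.1 figure refers to the 337-residue isoform 2 listed in Table 2. Neither assumption comes from the cited tool, which computes the mass from the actual sequence. The names `AVERAGE_RESIDUE_MASS_DA` and `estimate_mass_kda` are hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue (an approximation,
# not the sequence-based value reported by the cited tool).
AVERAGE_RESIDUE_MASS_DA = 110.0

# Isoform lengths in amino acids, copied from Table 2.
ISOFORM_LENGTHS = {1: 274, 2: 337, 3: 331, 4: 326, 5: 322}


def estimate_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from the residue count alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


estimates = {iso: estimate_mass_kda(n) for iso, n in ISOFORM_LENGTHS.items()}
print(f"Isoform 2: about {estimates[2]:.1f} kDa")  # about 37.1 kDa
```

The length-based estimate lands within about 1 kDa of the reported value, which is as close as this approximation can be expected to get.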

Protein Domain

The C11orf49 protein is predicted to contain a protein kinase domain near the N-terminus (residues 12-51).[8]

Secondary Structure

Secondary structure prediction tools such as Ali2D, Phyre2, and I-TASSER all predict that the C11orf49 protein is composed mostly of alpha helices, with no predicted beta sheets.[9][10] The locations of these alpha helices are shown to the right of the page.

Tertiary Structure

The tertiary structure predicted by I-TASSER is shown to the right of the page.

Post-Translational Modifications

Phosphorylation

The C11orf49 protein is predicted to be phosphorylated at 4 different sites, mainly on serine residues, but also on one threonine residue.[11]

Position | AA | Kinase
310 | Serine | AGC/Akt
48 | Threonine | AGC/Akt/AKT1
66 | Serine | AGC/Akt
318 | Serine | AGC/Akt
Table 3. Predicted phosphorylation sites for the C11orf49 human protein.

Sumoylation

The C11orf49 protein is predicted to be sumoylated at positions 119 and 320, both lysine residues.

Subcellular Localization

The C11orf49 protein found in humans is predicted to be localized in the cytoplasm.[12]

Gene Level Regulation

Promoters

Genomatix lists 7 promoters for the gene; however, only one of them (GXP_204543) is located at the start of the human C11orf49 gene, and it also has the greatest number of associated transcripts.[13]

Promoter ID | Start Position | End Position | Size (bp) | Orientation | Total # of transcripts
GXP_204543 | 46935524 | 46936819 | 1296 | plus strand | 32
GXP_3162280 | 47050923 | 47051962 | 1040 | plus strand | 1
GXP_3162281 | 47051454 | 47052500 | 1047 | plus strand | 2
GXP_3162283 | 47136696 | 47137735 | 1040 | plus strand | 1
GXP_3162284 | 47153395 | 47154434 | 1040 | plus strand | 1
GXP_3162285 | 47153944 | 47154983 | 1040 | plus strand | 1
GXP_204542 | 47159105 | 47160144 | 1040 | plus strand | 1
Table 4. List of promoters associated with the C11orf49 human gene.
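The reasoning for singling out GXP_204543 can be checked directly against the values in Table 4. The sketch below is illustrative only: it assumes 1-based inclusive coordinates, takes the gene start position (46,936,806) from the Location section, and uses hypothetical names (`PROMOTERS`, `size_matches`, `overlaps_gene_start`).

```python
# (promoter_id, start, end, reported_size_bp, transcripts), copied from Table 4.
PROMOTERS = [
    ("GXP_204543", 46935524, 46936819, 1296, 32),
    ("GXP_3162280", 47050923, 47051962, 1040, 1),
    ("GXP_3162281", 47051454, 47052500, 1047, 2),
    ("GXP_3162283", 47136696, 47137735, 1040, 1),
    ("GXP_3162284", 47153395, 47154434, 1040, 1),
    ("GXP_3162285", 47153944, 47154983, 1040, 1),
    ("GXP_204542", 47159105, 47160144, 1040, 1),
]

GENE_START = 46_936_806  # gene start position given in the Location section


def size_matches(start: int, end: int, reported: int) -> bool:
    """True if the reported size equals the inclusive span of the coordinates."""
    return end - start + 1 == reported


def overlaps_gene_start(start: int, end: int) -> bool:
    """True if the promoter interval contains the first base of the gene."""
    return start <= GENE_START <= end


all_sizes_consistent = all(size_matches(s, e, n) for _, s, e, n, _ in PROMOTERS)
at_gene_start = [pid for pid, s, e, _, _ in PROMOTERS if overlaps_gene_start(s, e)]
most_transcripts = max(PROMOTERS, key=lambda row: row[4])[0]

print(all_sizes_consistent)  # True
print(at_gene_start)         # ['GXP_204543']
print(most_transcripts)      # GXP_204543
```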

Transcription Factors

The following transcription factors are predicted to bind to the GXP_204543 promoter.[14] The higher the matrix score, the more likely the transcription factor is to bind to the promoter. The positions at which these transcription factors bind on the GXP_204543 promoter are shown in the image to the right of the page.

Matrix Family | Detailed Family Info | Detailed Matrix Info | Matrix Score
V$NKXH | NKX homeodomain factors | Homeodomain factor NKX-2.5 | 1
V$GATA | GATA binding factor | GATA-binding factor 3 | 0.992
V$LEFF | LEF1/TCF | Involved in the Wnt signal pathway | 0.991
O$VTBP | Vertebrate TATA binding factor | Cellular and viral TATA box elements | 0.99
V$KLFS | Krueppel like TFs | Gut-enriched Krueppel-like TF | 0.982
V$MYBL | Cellular and Viral myb-like TFs | V-Myb | 0.978
V$E2FF | E2F-myc activator | E2F TF 1 | 0.976
V$MEF3 | MEF3 binding sites | Sine oculis homeobox homolog 2 | 0.972
V$XBBF | X-box binding factors | X-box binding protein RFX1 | 0.966
V$ETSF | Human and murine ETS1 factors | Elk-1 | 0.958
V$PBXC | PBX-MEIS complexes | Pre-B-cell leukemia homeobox 3 | 0.949
V$CAAT | CCAAT binding factors | Cellular and viral CCAAT box | 0.927
V$HEAT | Heat shock factors | Heat shock factor 1 | 0.927
V$MYT1 | MYT1 C2HC zinc finger protein | Myelin TF 1-like, neuronal C2H2 ZF 1 | 0.925
V$GCMF | Chorion-specific TFs | Glial cells missing homolog 1 | 0.902
V$ZF04 | C2H2 zinc finger TF 4 | Zinc finger and BTB domain | 0.9
V$MAZF | Myc associated zinc fingers (MAZ) | MAZ | 0.875
V$PAX9 | Pax-9 binding sites | Zebrafish Pax-9 binding site | 0.848
V$DMRT | DM domain-containing TFs | Mab-3 related TF 1 | 0.817
Table 5. List of binding transcription factors to the GXP_204543 promoter.

Gene Expression

Tissue Specific Expression

Both microarray expression patterns and RNA-Seq data show very high levels of expression in the brain.[15] RNA-Seq data also show high expression in fetal lung tissue. Additional information for other tissues is included to the right of the page.

Conditions of Differential Expression

C11orf49 expression is significantly increased after the overexpression of claudin-1 in lung adenocarcinoma cell lines.[16] Claudin-1 specifically prevents paracellular diffusion of small molecules through tight junctions in the epidermis.

C11orf49 expression is significantly decreased after camptothecin treatment of a renal epithelial cell line.[17] Camptothecin is an alkaloid that inhibits the nuclear enzyme DNA topoisomerase I and has exhibited antitumor activity. It has also been shown to induce apoptosis by altering the permeability of the mitochondrial membrane, which releases cytochrome c.

Post-Transcription Regulation

5' UTR

There is a predicted stem-loop structure in the 5' UTR of the C11orf49 transcript, spanning nucleotides 15-26, shown to the right of the page.[18]

3' UTR

There are predicted stem-loop structures and miRNA binding sites for the 3' UTR of the C11orf49 transcript shown to the right of the page.[19]

Protein-Protein Interactions

The PSICQUIC database indicates that the human C11orf49 protein interacts with the proteins listed in Table 6. All interactions were determined using two-hybrid screening experiments.

Protein | Description
HTT | Huntingtin protein
APOE | Apolipoprotein E
PRKAR1A | cAMP-dependent protein kinase type I-alpha regulatory subunit
FH | Fumarate hydratase
GCA | Grancalcin
PHF1 | PHD finger protein 1
VPS54 | Vacuolar protein sorting-associated protein 54
ZFHX3 | Zinc finger homeobox protein 3
RAB7L1 | RAS oncogene family-like 1
NDRG1 | Stress responsive protein
PNMA5 | Paraneoplastic antigen-like protein 5
TXN2 | Thioredoxin
Table 6. List of proteins that interact with the C11orf49 protein found in humans.

Homology and Evolution

Orthologs and Paralogs

C11orf49 can be found among a wide variety of taxonomic groups, including but not limited to Mammalia, Aves, Reptilia, Amphibia, Cyprinidae, Hemichordata, Cnidaria, Platyhelminthes, Arthropoda, Placozoa, Choanoflagellate, Spizellomyces, and Oomycota.[20] [21] However, C11orf49 could not be found in Insecta or Plantae. There are no known paralogs of C11orf49.

Genus and Species | Common name | Taxonomic group | Divergence (MYA) | Accession # | AA length | Identity (%) | Similarity (%)
Mus musculus | Mouse | Rodentia | 89 | NP_780332.1 | 331 | 92 | 95.5
Gallus gallus | Chicken | Aves | 318 | XP_015142672.1 | 331 | 76.7 | 83.5
Chelonia mydas | Green Sea Turtle | Reptilia | 318 | XP_007054360.2 | 362 | 73.4 | 79.9
Geotrypetes seraphini | Gaboon caecilian | Amphibia | 352 | XP_033784118.1 | 329 | 69.9 | 81.7
Xenopus tropicalis | Tropical Clawed Frog | Amphibia | 352 | NM_001079316.1 | 330 | 61.9 | 77.3
Danio rerio | Zebrafish | Cyprinidae | 433 | NP_001002479.1 | 331 | 54.7 | 72.2
Saccoglossus kowalevskii | Acorn Worm | Hemichordata | 627 | XP_006821066.1 | 299 | 43.6 | 58.2
Nematostella vectensis | Starlet Sea Anemone | Cnidaria | 687 | XP_032240720.1 | 300 | 42.4 | 57.3
Macrostomum lignano | Flatworm | Platyhelminthes | 692 | PAA77967.1 | 404 | 29.7 | 42.3
Stegodyphus mimosarum | African Velvet Spider | Arthropoda | 736 | KFM74201.1 | 321 | 24.3 | 38.5
Trichoplax adhaerens | Trichoplax | Placozoa | 747 | XP_002108042.1 | 263 | 24.7 | 39.6
Salpingoeca rosetta | N/A | Choanoflagellate | 928 | XP_004994083.1 | 231 | 15.5 | 25.5
Spizellomyces palustris | Australian fungus | Spizellomyces | 1017 | TPX67906.1 | 298 | 21.4 | 32.7
Saprolegnia diclina | Cotton Mould | Oomycota | 1552 | XP_008621502.1 | 314 | 22.9 | 35.6
Table 7. List of selected orthologs of C11orf49.

Evolution

History

Saprolegnia diclina is the most distantly related known organism with a C11orf49 ortholog, having diverged from the human lineage approximately 1,552 MYA.[22]

Evolutionary Rate

A molecular clock analysis indicates that C11orf49 has evolved at a faster rate than cytochrome c but at a slower rate than fibrinogen alpha. The graph containing this analysis is to the right of the page.
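The text does not say how the molecular clock analysis was carried out. One common approach, assumed here purely for illustration, converts the observed fraction of differing sites p into a corrected divergence m = -ln(1 - p) and then compares m against divergence time. The sketch below applies that correction to a few identity values from Table 7, treating p as one minus the fractional identity. The names `ORTHOLOGS` and `corrected_divergence` are hypothetical.

```python
import math

# (species, divergence from humans in MYA, % identity), copied from Table 7.
ORTHOLOGS = [
    ("Mus musculus", 89, 92.0),
    ("Gallus gallus", 318, 76.7),
    ("Danio rerio", 433, 54.7),
    ("Nematostella vectensis", 687, 42.4),
]


def corrected_divergence(identity_percent: float) -> float:
    """Corrected divergence m = -ln(1 - p), with p the fraction of differing sites.

    Assumes p can be approximated as 1 - identity; this is an illustrative
    choice, not necessarily the method used for the analysis in the text.
    """
    if not 0.0 < identity_percent <= 100.0:
        raise ValueError("identity must be in the range (0, 100]")
    p = 1.0 - identity_percent / 100.0
    return -math.log(1.0 - p)


for species, mya, identity in ORTHOLOGS:
    m = corrected_divergence(identity)
    print(f"{species}: {mya} MYA, m = {m:.3f}")
```

Plotting m against divergence time for C11orf49 and for reference proteins would give the kind of comparison described above. The data for cytochrome c and fibrinogen alpha are not given in the text, so they are left out here.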

Function

Protein Kinase Activity

C11orf49 is predicted to act as a cAMP-dependent protein kinase.

Clinical Significance

C11orf49 has been shown to interact with the proteins HTT and APOE2, which are associated with Huntington's disease and Alzheimer's disease, respectively. Given the predicted function of C11orf49, these interactions could be related to kinase activity.

C11orf49 expression is significantly increased after overexpression of claudin-1 in lung adenocarcinoma cells.

C11orf49 expression is significantly decreased after camptothecin treatment of a renal epithelial cell line.

Notes and References

  1. GDS596 / 203257_s_at. www.ncbi.nlm.nih.gov. Retrieved 2020-12-17.
  2. C11orf49 Gene - GeneCards CK049 Protein CK049 Antibody. www.genecards.org. Retrieved 2020-12-17.
  3. PSICQUIC View. www.ebi.ac.uk. Retrieved 2020-12-17.
  4. Human BLAT Search. genome.ucsc.edu. Retrieved 2020-12-17.
  5. C11orf49 chromosome 11 open reading frame 49 [Homo sapiens (human)] - Gene - NCBI. www.ncbi.nlm.nih.gov. Retrieved 2020-12-17.
  6. ExPASy - Compute pI/Mw tool. web.expasy.org. Retrieved 2020-12-17.
  7. SAPS < Sequence Statistics < EMBL-EBI. www.ebi.ac.uk. Retrieved 2020-12-18.
  8. Phyre 2 Results for Undefined. www.sbg.bio.ic.ac.uk. Retrieved 2020-12-17.
  9. Bioinformatics Toolkit. toolkit.tuebingen.mpg.de. Retrieved 2020-12-18.
  10. I-TASSER results. zhanglab.ccmb.med.umich.edu. Retrieved 2020-12-18.
  11. SIB Swiss Institute of Bioinformatics Expasy. www.expasy.org. Retrieved 2020-12-18.
  12. PSORT II Prediction. psort.hgc.jp. Retrieved 2020-12-18.
  13. Genomatix: Retrieve and analyze promoters: Query Input. www.genomatix.de. Retrieved 2020-12-18.
  14. Genomatix: MatInspector Input. www.genomatix.de. Retrieved 2020-12-18.
  15. GDS596 / 203257_s_at. www.ncbi.nlm.nih.gov. Retrieved 2020-12-18.
  16. 59175105 - GEO Profiles - NCBI. www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  17. 14476184 - GEO Profiles - NCBI. www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  18. RNAfold web server. rna.tbi.univie.ac.at. Retrieved 2020-12-19.
  19. TargetScanHuman 7.2 predicted targeting of Human C11orf49. www.targetscan.org. Retrieved 2020-12-19.
  20. BLAST: Basic Local Alignment Search Tool. blast.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  21. Human BLAT Search. genome.ucsc.edu. Retrieved 2020-12-19.
  22. TimeTree :: The Timescale of Life. www.timetree.org. Retrieved 2020-12-19.