Chromosome 21 Open Reading Frame 58 (C21orf58) is a protein that in humans is encoded by the C21orf58 gene.[1]
The gene is located on the minus strand of the distal half of the long arm of Chromosome 21 at 21q22.3.[2] Transcript 1, including UTRs, is 22,740 bp and spans the chromosomal locus 46,301,130-46,323,875.
mRNA transcript variants 1-5 encode two validated protein isoforms of C21orf58.[3] Transcript variant 1 encodes the longer, primary isoform (1) (Accession: NP_470860). Transcript variants 2-5 encode the shorter isoform (2). Isoform 2 has a distinct N-terminus in comparison to Isoform 1 resulting from the use of an alternative start codon. A domain of unknown function, DUF4587, is conserved in all variants.
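Transcript variant | Protein isoform | mRNA length (bp) | Protein length (aa) | Exons | DUF4587 position (aa) | |
---|---|---|---|---|---|---|
1 | Isoform 1 | 2975 | 322 | 8 | 234-291 | |
2 | Isoform 2 | 1674 | 216 | 9 | 128-185 | |
3 | Isoform 2 | 2900 | 216 | 7 | 128-185 | |
4 | Isoform 2 | 2941 | 216 | 9 | 128-185 | |
5 | Isoform 2 | 2624 | 216 | 9 | 128-185 |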
The primary encoded protein consists of 322 amino acids, is encoded by 8 exons, and has a molecular weight of 39.0 kDa.[4] [5] The predicted isoelectric point is 10.06, which is consistent with the predicted nuclear localization.
Human C21orf58 Isoform 1 is rich in proline and glutamine, and poor in cysteine, phenylalanine, and tyrosine. The protein is particularly tyrosine-poor, containing no tyrosine residues. Isoform 1 contains 20 more positively charged residues than negatively charged residues, which further supports the high predicted isoelectric point.
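The composition and charge-balance figures above come from simple residue counting. A minimal sketch of that counting follows, assuming a one-letter amino acid sequence as input; the function names and the toy peptide are hypothetical and are not the C21orf58 sequence. Only lysine and arginine are counted as positive and only aspartate and glutamate as negative, which is a simplifying assumption.

```python
from collections import Counter

# Residues treated as charged at neutral pH (histidine deliberately ignored;
# this is a simplifying assumption).
POSITIVE = set("KR")
NEGATIVE = set("DE")


def composition(seq):
    """Return the fractional abundance of each residue present in seq."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def charge_excess(seq):
    """Return (Lys + Arg) minus (Asp + Glu) as a crude charge balance."""
    seq = seq.upper()
    positive = sum(1 for aa in seq if aa in POSITIVE)
    negative = sum(1 for aa in seq if aa in NEGATIVE)
    return positive - negative


# Hypothetical toy peptide for illustration only, NOT the C21orf58 sequence.
toy = "MPRKQPPRHHHQEPKRD"
toy_composition = composition(toy)
toy_excess = charge_excess(toy)
```

Applied to the Isoform 1 sequence (NP_470860), the same count would be expected to reproduce the excess of positively charged residues described above; an excess of basic residues generally corresponds to a high isoelectric point.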
C21orf58 Isoform 1 has three conserved domains: a proline-rich domain, a histidine-rich domain, and DUF4587. The proline-rich domain, Pro175-Pro322, is predicted to mediate protein-protein interactions.[6] The histidine-rich repeat domain, His292-His299, is predicted to facilitate localization.[7] [8] The domain of unknown function, DUF4587 (Arg234-His291), is a member of pfam15248 and is found exclusively in eukaryotes.[9]
C21orf58 contains a nuclear localization signal, Thr135-Leu144.[10]
The secondary structure of C21orf58 is predicted to consist primarily of random coil, with four alpha-helical regions distributed along the protein.[11] [12] [13] Secondary structure predictions for C21orf58 orthologs gave similar results: random coil and four alpha-helical regions, with the addition of beta-sheets throughout.
C21orf58 is predicted to undergo multiple post-translational modifications, including phosphorylation, O-GlcNAcylation, and SUMOylation.[14] [15] [16] [17]
Immunocytochemistry revealed localization of C21orf58 to nucleoplasm and nuclear bodies.[18] Presence of a nuclear localization sequence provides further evidence for protein import into the cell nucleus.
Subcellular localization predictions for C21orf58 based on the amino acid sequence (PSORTII) suggested nuclear localization.[19] Predictions across orthologs agreed with nuclear localization.
C21orf58 is constitutively expressed at low levels across various normal tissues (GDS3113), including but not limited to brain, endocrine, bone marrow, lung, and reproductive tissues.[20]
DNA microarray analysis from various experiments showed variable C21orf58 expression in unique physiological conditions.
C21orf58 was found to be expressed at similar levels through all stages of development.[25]
The mouse ortholog of C21orf58, 2610028H24Rik, was found to be ubiquitously expressed at high levels throughout the mouse brain.[26]
The primary promoter for the longest variant of C21orf58 aligns with the start of the 5' UTR and is 1143 bp in length.[27] The predicted promoter sequence overlaps with the 5' UTR and coding sequence of Pericentrin (PCNT) on the plus strand of Chromosome 21. The transcription factors predicted to bind the promoter are associated with regulation of the cell cycle, neurogenesis, early development, and sex determination.
Transcription factor | Function | |
---|---|---|
PLAG1 | Associated with nuclear import; transcriptional activator | |
WT1 | Role in the development of the urogenital system | |
ZFX | Implicated in mammalian sex determination | |
AP-2 | Activation of genes in early development; expression in neural crest cell lineages | |
E2F4 | Cell cycle control; tumor suppression | |
c-Myb | Regulation of hematopoiesis | |
Elk-1 | Transcriptional activator | |
KLF7 | Cell proliferation, differentiation, and survival; regulates neurogenesis | |
ZBTB33 | Promotes histone deacetylation and the formation of repressive chromatin structures | |
Roaz | Involved in olfactory neuronal differentiation |
Yeast two-hybrid screening confirmed protein-protein interactions with PNMA1, MTUS2, and GRB2.[28] Affinity capture-MS indicated interactions with MTA2, ASH2L, and FAM199X. A two-hybrid prey pooling approach followed by a two-hybrid array revealed interactions with Ccdc136, Ccdc125, KRT37, KRT27, KRT35, SPTA1, MKRN3, USHBP1, and KLHL20.[29]
Predicted interactions involved proteins associated with the cytoskeleton, cell migration, histone modification, and signal transduction.
Interactor | Function | |
---|---|---|
PNMA1 | Neuron- and testis-specific protein[30]; associated with paraneoplastic neurological disorders | |
MTUS2 | Microtubule-associated scaffold protein[31]; role in cell migration and linking of microtubules to the plasma membrane | |
GRB2 | Signal transduction[32] | |
MTA2 | Component of NuRD, a nucleosome remodeling deacetylase complex[33] | |
ASH2L | Component of the Set1/Ash2 histone methyltransferase (HMT) complex[34] | |
Ccdc136 | Acrosome formation in spermatogenesis[35] | |
Ccdc125 | Regulation of cell migration[36] | |
KRT37 | Type I keratin that heterodimerizes with type II keratin to form hair and nails[37] | |
KRT27 | Member of the type I keratin family; involved in intermediate filament formation[38] | |
KRT35 | Type I keratin that heterodimerizes with type II keratin to form hair and nails[39] | |
SPTA1 | Molecular scaffold protein that links the plasma membrane to the actin cytoskeleton[40] | |
MKRN3 | Plays a role in the onset of puberty; part of the ubiquitin-proteasome system[41] | |
USHBP1 | Harmonin-binding protein[42]; actin filament binding | |
KLHL20 | Actin filament binding[43]; adapter of BCR, a negative regulator of apoptosis |
No human paralogs for C21orf58 were identified.
C21orf58 orthologs were identified in bony fish but not in cartilaginous fish.[44] The first 35 residues of DUF4587, Arg234-Pro265, were conserved across ortholog sequences.[45] The most distantly related ortholog identified was in zebrafish.
The rate of C21orf58 evolution was estimated by applying the molecular clock hypothesis. In comparison with fibrinogen alpha and cytochrome c, C21orf58 was found to have evolved at an intermediate rate.
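A molecular clock comparison of this kind is often carried out by converting the percent identity between a human protein and each ortholog into a corrected divergence and relating it to the time since the two species diverged. The sketch below illustrates one way to do this. The source does not state which correction or fitting method was used, so the formula m = -100 ln(1 - n/100), the through-origin fit, and all function names are assumptions for illustration only.

```python
import math


def corrected_divergence(percent_identity):
    """Corrected substitutions per 100 residues from percent identity.

    Uses m = -100 * ln(1 - n/100), where n is the observed percent
    difference. This correction is an assumption, not taken from the source.
    """
    n = 100.0 - percent_identity
    if not 0.0 <= n < 100.0:
        raise ValueError("percent identity must be in (0, 100]")
    return -100.0 * math.log(1.0 - n / 100.0)


def substitution_rate(points):
    """Least-squares slope through the origin of divergence versus time.

    points: iterable of (divergence_time_mya, percent_identity) pairs,
    one pair per ortholog.
    """
    pts = [(t, corrected_divergence(pid)) for t, pid in points]
    numerator = sum(t * m for t, m in pts)
    denominator = sum(t * t for t, _ in pts)
    return numerator / denominator
```

Under these assumptions, a steeper slope indicates faster evolution. Fitting the same slope for a fast-evolving reference such as fibrinogen alpha and a slow-evolving one such as cytochrome c gives the range against which the C21orf58 slope can be placed.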