ZNF337 explained

ZNF337, also known as zinc finger protein 337, is a protein that in humans is encoded by the ZNF337 gene. The ZNF337 gene is located on human chromosome 20 (20p11.21). The gene contains 6 exons and produces a 4,237 base pair mRNA that encodes a protein of 751 amino acids.[1] In addition, alternative splicing results in multiple transcript variants.[2] The ZNF337 gene encodes a zinc finger domain-containing protein; however, neither the gene nor the protein is yet well understood. The gene has been proposed to participate in processes such as DNA-dependent regulation of transcription, and the protein is expected to have molecular functions such as DNA binding, metal ion binding, and zinc ion binding, and to localize to various subcellular locations.[3] [4] ZNF337 has no commonly used aliases; an important paralog of this gene is ZNF875.[5]

Gene

There are no commonly used aliases beyond zinc finger protein 337; a potential one is LOC26152.[6] The locus is on the short arm of chromosome 20 at band 11.21 (20p11.21), and the gene lies on the negative (minus) strand. There are 6 exons in total. The mRNA (from the start of transcription to the polyA site) is 4,237 base pairs long.

Transcripts

The ZNF337 gene has two transcript variants that encode the same 751-amino-acid protein; variant 1 is the longer transcript, while variant 2 differs in the 5’ UTR. There are also three isoforms (X1, X2, and X3). Each isoform corresponds to one of the gene's many splice variants, whereas a transcript is the expressed sequence itself.

Proteins

ZNF337 has a predicted molecular weight of about 86.9 kDa and a predicted isoelectric point (pI) of 9.74.[7] These values are predictions, since post-translational modifications could alter them. As the protein's name suggests, it contains several zinc fingers. It has no high-scoring hydrophobic or transmembrane segments and no positive or negative charge clusters.[8]

Several amino acids occur in ZNF337 in unusual amounts. Glutamate (E), methionine (M), and alanine (A) are low, while cysteine (C) and histidine (H) are high. Cysteine in particular is rarely abundant in protein sequences. ZNF337 is also an unusually basic protein, which favors binding to negatively charged nucleic acids such as DNA or RNA.
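A compositional tally of this kind can be sketched in a few lines of Python. The full ZNF337 sequence is not reproduced in this article, so the sketch uses the immunogen fragment quoted in the Expression Patterns section purely as sample input; percentages computed on a fragment say nothing about the whole protein. The function names are hypothetical, and the "basic minus acidic" count is only a crude stand-in for a real isoelectric point calculation.

```python
from collections import Counter

# Sample input only: the immunogen fragment quoted in the Expression Patterns
# section, not the full 751-residue protein.
FRAGMENT = (
    "ESSQGQRENPTEIDKVLKGIENSRWGAFKCAERGQDFSRKMMVIIHKKAHSRQKLFTCRECHQGFRDESALLLHQN"
)


def composition(seq):
    """Return {residue: percent of sequence} for a one-letter protein string."""
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def basic_minus_acidic(seq):
    """Crude basicity indicator: (K + R + H) minus (D + E) residue counts."""
    counts = Counter(seq)
    return (counts["K"] + counts["R"] + counts["H"]) - (counts["D"] + counts["E"])


if __name__ == "__main__":
    # Print residues from most to least frequent.
    for aa, pct in sorted(composition(FRAGMENT).items(), key=lambda kv: -kv[1]):
        print(f"{aa}: {pct:.1f}%")
    print("basic minus acidic:", basic_minus_acidic(FRAGMENT))
```

Running the same functions on the complete protein sequence would be needed to reproduce the low E/M/A and high C/H pattern described above.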

Domains and Motifs

According to the MyHits Motif Scan program (hosted on ExPASy), six different motif types (Pfam families) are present in ZNF337.[9]

| Motif Type | Amino Acid Sequence Position | e-value |
|---|---|---|
| KRAB (KRAB box) | 12-52 | 6.6e-26 |
| PHD (PHD-finger) | 349-412 | 0.0032 |
| Rpr2 (RNAse P Rpr2/Rpr21/SNM1 domain) | 472-551 | 0.00088 |
| Zf-C2H2 (Zinc finger, C2H2 type) | 208-230 | 4.3e-06 |
| | 236-258 | 3.8e-09 |
| | 264-286 | 6e-07 |
| | 292-314 | 2.4e-08 |
| | 320-342 | 4.6e-07 |
| | 348-370 | 1.9e-09 |
| | 376-398 | 2.8e-07 |
| | 404-426 | 5.9e-09 |
| | 432-454 | 1.2e-07 |
| | 460-482 | 1.8e-08 |
| | 488-510 | 3.1e-07 |
| | 516-538 | 3.8e-07 |
| | 544-566 | 2.1e-06 |
| | 572-594 | 2.2e-06 |
| | 600-622 | 5e-07 |
| | 628-650 | 1.3e-08 |
| | 656-679 | 0.00014 |
| | 685-707 | 2.4e-07 |
| | 713-735 | 1.2e-07 |
| Zf-C3HC4 (Zinc finger, C3HC4 type (RING finger)) | 210-269 | 0.00083 |
| Zf-FCS (MYM-type zinc finger with FCS sequence motif) | 342-385 | 0.02 |

Table 1. Six different motifs within the ZNF337 protein: the KRAB box, PHD finger, Rpr2, zinc finger (C2H2 type), zinc finger (C3HC4 type, RING finger), and zinc finger (MYM-type with FCS sequence motif). Each has a different function.
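The regular layout of the C2H2 zinc fingers listed in Table 1 can be checked with a short script. This is a minimal sketch that only re-uses the coordinates from the table; the helper names are hypothetical. It computes the length of each finger and the distance from the start of one finger to the start of the next.

```python
from collections import Counter

# C2H2 zinc finger coordinates (start, end) as listed in Table 1.
C2H2_FINGERS = [
    (208, 230), (236, 258), (264, 286), (292, 314), (320, 342),
    (348, 370), (376, 398), (404, 426), (432, 454), (460, 482),
    (488, 510), (516, 538), (544, 566), (572, 594), (600, 622),
    (628, 650), (656, 679), (685, 707), (713, 735),
]


def finger_lengths(fingers):
    """Length of each finger in residues (coordinates are inclusive)."""
    return [end - start + 1 for start, end in fingers]


def start_to_start_spacing(fingers):
    """Distance between the start positions of consecutive fingers."""
    starts = [start for start, _ in fingers]
    return [b - a for a, b in zip(starts, starts[1:])]


if __name__ == "__main__":
    print("number of fingers:", len(C2H2_FINGERS))
    print("finger lengths:", Counter(finger_lengths(C2H2_FINGERS)))
    print("start-to-start spacing:", Counter(start_to_start_spacing(C2H2_FINGERS)))
```

With the table's coordinates, the fingers form a tandem array with a nearly constant repeat, which is typical of KRAB-containing C2H2 zinc finger proteins.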

Secondary & Tertiary Structures

The secondary structure of ZNF337 is predicted to have many helices, sheets, turns and coils (especially random coils) as shown below.[10] [11]

Secondary Structure Composition

| Type of Secondary Structure | Number of Amino Acids | Percent Composition |
|---|---|---|
| Alpha Helix | 169 | 22.50% |
| Extended Strand | 154 | 20.51% |
| Random Coil | 428 | 56.99% |
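The percentages in the secondary structure composition follow directly from the residue counts and the 751-residue protein length. The following sketch (with a hypothetical helper name) recomputes them and confirms that the three classes account for every residue.

```python
# Residue counts per predicted secondary structure class, from the table above.
COUNTS = {"Alpha Helix": 169, "Extended Strand": 154, "Random Coil": 428}
PROTEIN_LENGTH = 751  # amino acids in ZNF337


def percent_composition(counts, total):
    """Convert residue counts to percentages rounded to two decimals."""
    return {name: round(100.0 * n / total, 2) for name, n in counts.items()}


if __name__ == "__main__":
    # The three classes should cover the whole protein.
    print("residues accounted for:", sum(COUNTS.values()), "of", PROTEIN_LENGTH)
    for name, pct in percent_composition(COUNTS, PROTEIN_LENGTH).items():
        print(f"{name}: {pct:.2f}%")
```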

The H. sapiens and P. troglodytes secondary structures are extremely similar. In S. dumerili, by contrast, sheets and coils rather than sheets and helices predominate at residue positions 200-300 and 400-500. Likewise, the start of the sequence (positions 0-14) consists mostly of coils and turns in most orthologs, but in some species such as S. dumerili it contains more helices and sheets.

Several tertiary structure modeling programs were unable to construct a model for ZNF337. SWISS-MODEL did construct some models, but they were based on ZNF568 as the template. The ZNF568 protein sequence is 45.20% identical to that of ZNF337, with a sequence similarity of 0.44 and a coverage of 0.37 spanning amino acids 345-623 of the ZNF337 protein sequence.[12] The predicted tertiary structure is shown in Figure 1. In this figure, there are several zinc ion ligands.
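The reported coverage is consistent with the modeled range. As a rough check, assuming coverage means the fraction of ZNF337 residues spanned by the template alignment (an assumption about how the value is defined, not something stated in the source), the calculation looks like this:

```python
PROTEIN_LENGTH = 751  # amino acids in ZNF337


def coverage(start, end, query_length):
    """Return (aligned residues, fraction of the query covered) for an inclusive range."""
    aligned = end - start + 1
    return aligned, aligned / query_length


if __name__ == "__main__":
    aligned, fraction = coverage(345, 623, PROTEIN_LENGTH)
    print(f"{aligned} residues modeled, coverage = {fraction:.2f}")
```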

ZNF568 is a protein-coding gene associated with diseases such as transient neonatal diabetes mellitus. It has transcriptional repression activity, partially through the recruitment of the co-repressor TRIM28, but also has repression activity independently of this interaction. It is specifically important during embryonic development, where it acts as a direct repressor of a placental-specific transcript of IGF2 in early development and regulates convergent extension movements required for axis elongation and tissue morphogenesis in all germ layers. It is also crucial for normal morphogenesis of extraembryonic tissues including the yolk sac, extraembryonic mesoderm and placenta. It may also enhance proliferation or maintenance of neural stem cells.[13]

Gene Level Regulation

Promoter

The promoter region was chosen using ElDorado at Genomatix, which assessed the ZNF337 gene locus for possible promoter regions. Of the six possible promoter sets, promoter set 6 (GXP_8991829) was chosen because it is the one best supported by transcripts (it has six transcript IDs). It starts at position 25696627, ends at position 25697904, and is 1278 base pairs long. Within GXP_8991829 (-), coding transcript GXT_26235925 was chosen because it has 5 exons and 37,403 CAGE tags and corresponds to NCBI accession number XM_006723558 (see Figure 2).

The promoter sequence contains a CpG island with a CpG count of 138. There is also a DNase cluster (score = 1000) within the promoter sequence.
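Both the promoter length and a CpG count are simple to compute. The sketch below assumes the start and end positions are inclusive coordinates, and it counts CpG dinucleotides in a made-up toy sequence, because the actual promoter sequence is not reproduced in this article. The function names are hypothetical.

```python
def region_length(start, end):
    """Length in base pairs of a region given inclusive start and end positions."""
    return end - start + 1


def count_cpg(seq):
    """Count CpG dinucleotides (a C immediately followed by a G) on the given strand."""
    return seq.upper().count("CG")


if __name__ == "__main__":
    # Coordinates of promoter set GXP_8991829 as given in the text.
    print("promoter length:", region_length(25696627, 25697904), "bp")
    # Toy sequence for illustration only; it is NOT the ZNF337 promoter.
    toy = "acgcgtCGttaccg"
    print("CpG count in toy sequence:", count_cpg(toy))
```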

Transcription Factor Binding Sites

Possible transcription factors for the ZNF337 promoter region were determined using ElDorado at Genomatix. These are listed below in Table 2.

| Transcription Factor | Detailed Matrix Information | Anchor Base/Position | Matrix Similarity | Sequence |
|---|---|---|---|---|
| TF2B | Transcription factor II B (TFIIB) recognition element | 984 | 1.0 | ccgCGCC |
| VTBP | Avian C-type LTR TATA box | 21 | 0.814 | ctatagtTAAGaacaat |
| | Avian C-type LTR TATA box | 743 | 0.825 | ttttattTAGGtagccc |
| | Lentivirus LTR TATA box | 314 | 0.83 | gtgTATAatatgctgat |
| | Cellular and viral TATA box elements | 177 | 0.961 | ccctaTAAAtatgtaca |
| | Cellular and viral TATA box elements | 275 | 0.911 | aaataTAAAgtctacgt |
| CAAT | Cellular and viral CCAAT box | 553 | 0.909 | taaaCCATtgagaga |
| CAAT | Nuclear factor Y (Y-box binding factor) | 114 | 0.939 | taccCCAAtcaccct |
| CEBP | CCAAT/enhancer binding protein (C/EBP), epsilon | 289 | 0.974 | gtggtttgGCAAgcc |

Table 2. Possible transcription factors for ZNF337 promoter region.
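The entries of Table 2 can be filtered and summarized programmatically. The sketch below rests on three assumptions that are not stated in the source: that rows without a factor name continue the VTBP family above them, that the uppercase letters in each sequence mark the core of the binding site, and that a matrix similarity of 0.9 is a reasonable cut-off (it is an arbitrary, illustrative threshold). The function names are hypothetical.

```python
# (family, anchor position, matrix similarity, sequence) from Table 2.
# Rows with a blank family in the table are assumed to belong to VTBP.
SITES = [
    ("TF2B", 984, 1.0, "ccgCGCC"),
    ("VTBP", 21, 0.814, "ctatagtTAAGaacaat"),
    ("VTBP", 743, 0.825, "ttttattTAGGtagccc"),
    ("VTBP", 314, 0.83, "gtgTATAatatgctgat"),
    ("VTBP", 177, 0.961, "ccctaTAAAtatgtaca"),
    ("VTBP", 275, 0.911, "aaataTAAAgtctacgt"),
    ("CAAT", 553, 0.909, "taaaCCATtgagaga"),
    ("CAAT", 114, 0.939, "taccCCAAtcaccct"),
    ("CEBP", 289, 0.974, "gtggtttgGCAAgcc"),
]


def core(sequence):
    """Return the uppercase portion of a site (assumed to be the core sequence)."""
    return "".join(ch for ch in sequence if ch.isupper())


def strong_sites(sites, threshold=0.9):
    """Keep sites whose matrix similarity meets an illustrative threshold."""
    return [site for site in sites if site[2] >= threshold]


if __name__ == "__main__":
    # List the retained sites from highest to lowest matrix similarity.
    for family, position, similarity, sequence in sorted(
        strong_sites(SITES), key=lambda site: -site[2]
    ):
        print(f"{family:5s} pos {position:4d} sim {similarity:.3f} core {core(sequence)}")
```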

There are 340 factors from 129 cell types in the Transcription Factor ChIP-seq Clusters track (from ENCODE 3).[14] Only the strong signals (shown in black or dark grey) whose peaks fall within the promoter or enhancer regions are listed in Table 3.

| Location | Transcription Factor – ChIP | Cell Type(s) |
|---|---|---|
| Promoter | CTCF | GM12878 (human lymphoblastoid), H1-hESC (human embryonic stem cells), K562 (myelogenous leukemia cells) |
| Promoter | RFX5 | GM12878 (human lymphoblastoid) |
| Promoter | STAT1 | GM12878 (human lymphoblastoid) |
| Promoter | TAF1 | GM12878 (human lymphoblastoid) |
| Promoter | TRIM22 | GM12878 (human lymphoblastoid) |
| Promoter | REST | H1-hESC (human embryonic stem cells) |
| Promoter | GABPA | HeLa-S3 (cervical cancer cell line) |
| Promoter | MAFK | HeLa-S3 (cervical cancer cell line) |
| Promoter | TBP | HeLa-S3 (cervical cancer cell line) |
| Promoter | FOXA1 | HepG2 (human liver cancer cell line) |
| Promoter | SIN3A | HepG2 (human liver cancer cell line) |
| Promoter | SP1 | HepG2 (human liver cancer cell line) |
| Promoter | GATA2 | K562 (myelogenous leukemia cells) |
| Promoter | MYC | K562 (myelogenous leukemia cells) |
| Promoter | POLR2A | Body of Pancreas |
| Promoter | FOS | Endothelial Cell of Umbilical Vein |

Table 3. Transcription Factor-ChIP Clusters associated with specific cell types.

According to ORegAnno (literature-curated TFBSs), there is no TF-ChIP signal overlap within the promoter/enhancer regions. Most of the ORegAnno citations relate to a “NANP” gene, although the transcription factors CTCF and CEBPA are confirmed in the enhancer region of the ZNF337 gene.

Expression Patterns

Both RNA sequencing data from the NCBI Gene database records and immunohistochemical staining data from the Human Protein Atlas [15] show that ZNF337 is expressed in many tissues. ZNF337 mRNA has low tissue specificity: it is detected in all tissues, most notably in the cerebellum (brain), and at higher levels than the protein, especially in female tissues.

An antibody was developed against a recombinant protein corresponding to the amino acids ESSQGQRENPTEIDKVLKGIENSRWGAFKCAERGQDFSRKMMVIIHKKAHSRQKLFTCRECHQGFRDESALLLHQN. The specificity of the human ZNF337 antibody was verified on a protein array containing the target protein plus 383 other non-specific proteins. The antibody is a rabbit polyclonal IgG purified by immunogen affinity. Staining of human cerebellum with it shows cytoplasmic positivity in Purkinje cells (which regulate and coordinate motor movements through inhibitory functions and neurotransmitters).[16]

Although expression is low to moderate across a wide range of tissues, these results together indicate that expression is highest in the brain, particularly the cerebellum. A few experiments also indicate expression in female (and some male) reproductive tissues.

Transcript Level Regulation

Multiple sequence alignments were created to assess conservation between species. Specifically, a multiple sequence alignment (MSA) of the ZNF337 promoter region in three primates (chimpanzee, human, and rhesus monkey) and a marsupial (opossum) shows little to no conservation at the beginning of the sequences.

There are highly conserved regions at the beginning of both the 5’ UTR and 3’ UTR multiple sequence alignments. These regions could be functionally important for stem-loop formation, miRNA binding, or RNA-binding protein binding.

Protein Level Regulation

Localization

ZNF337 is predicted to localize mainly to the nucleus (95.7%), followed by the mitochondria (4.3%).

Post-Translational Modifications

ZNF337 contains many predicted post-translational modification sites, including phosphorylation sites (for serine and tyrosine kinases),[17] PEST motifs,[18] O-GlcNAc sites,[19] SUMOylation sites,[20] and glycation sites,[21] as seen below:

| Modification | Amino Acid Number (in sequence) |
|---|---|
| Phosphorylation | 46, 109, 127, 155, 287, 446, 474, 483, 672, 695, 708, 743, 745, 751 |
| PEST motif | 598-612 |
| O-GlcNAc sites | 109, 142, 231, 384, 750, 751 |
| SUMOylation | 633 |
| Glycation | 94, 123, 125, 199, 206, 234, 248, 309, 339, 374, 388, 407, 430, 449, 486, 547, 556, 617, 668, 730 |

Table 4. Predicted post-translational modifications. Modifications can alter protein structure, thus affecting overall protein function and viability.
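Residues predicted to carry more than one type of modification may be of particular interest, because phosphorylation and O-GlcNAc modification can compete for the same residue. The sketch below simply intersects the position lists from Table 4; the variable and function names are hypothetical, and all sites are predictions rather than experimentally confirmed modifications.

```python
# Predicted modification sites from Table 4 (residue numbers in ZNF337).
PHOSPHORYLATION = {46, 109, 127, 155, 287, 446, 474, 483, 672, 695, 708, 743, 745, 751}
O_GLCNAC = {109, 142, 231, 384, 750, 751}
PEST_MOTIF = range(598, 613)  # residues 598-612 inclusive


def shared_sites(sites_a, sites_b):
    """Return the sorted residues predicted for both modification types."""
    return sorted(set(sites_a) & set(sites_b))


if __name__ == "__main__":
    print("phosphorylation and O-GlcNAc:", shared_sites(PHOSPHORYLATION, O_GLCNAC))
    print("phosphorylation inside PEST motif:", shared_sites(PHOSPHORYLATION, PEST_MOTIF))
```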

Predicted transmembrane domains, new signal peptides, N-terminal signal peptides, and cytoplasmic predictions

No transmembrane domains were predicted by SOSUI.[22] The signal peptide prediction score is very low and negative at -3.83. The GvH score is also very negative at -8.69 (with a possible cleavage site between amino acids 56 and 57), indicating a low probability of a cleavable signal sequence. Thus, ZNF337 is predicted to have no N-terminal signal peptide. In addition, Reinhardt's method for cytoplasmic/nuclear discrimination gives a cytoplasmic prediction for ZNF337 with a reliability score of 94.1.[23]

The nuclear localization signal score is somewhat low at 0.75. Orthologs (P. troglodytes, S. dumerili, and C. asiatica) were used to check these predictions. Likewise, no N-terminal signal peptides or transmembrane domains were predicted for them, and all of these orthologous proteins gave the same 95.7% prediction of nuclear localization.

Homology/Evolution

An important paralog of the ZNF337 gene is ZNF875.

ZNF337 has many orthologs across a wide variety of species (vertebrates and invertebrates), including primates, bony fishes, rodents, and even some plants, as seen in Table 5 below. No orthologs were found in groups more distant than plants. The most highly conserved amino acids and regions lie in the middle and end of the ZNF337 protein sequence; the beginning of the sequence is less conserved between species, suggesting that function may differ there.

Phylogenetic trees trace the evolution of the ZNF337 gene across species. Primates cluster closest to humans, while the megabat and mouse branch away from the cape golden mole, and the zig-zag eel and flier cichlid branch away from the greater amberjack. Species that diverged from the human lineage longer ago (measured in millions of years) show lower sequence similarity and identity, which is also reflected in the branch distances of the phylogenetic trees.

| Genus and Species | Common Name | Taxonomic Group | Date of Divergence from Human Lineage (MYA) | Accession Number | Sequence Length (aa) | Sequence Identity to Human Protein (%) | e-value |
|---|---|---|---|---|---|---|---|
| Homo sapiens | Human | Primates | 0 | NP_056470 | 751 | 100 | 0 |
| Gorilla gorilla gorilla | Western gorilla | Primates | 8.6 | XP_004061979.1 | 751 | 99.5 | 0 |
| Pongo pygmaeus | Bornean orangutan | Primates | 15.2 | XP_009231663.1 | 753 | 98.3 | 0 |
| Colobus angolensis palliatus | Angola colobus | Primates | 15.2 | XP_011807556.1 | 758 | 96.7 | 0 |
| Aotus nancymaae | Nancy Ma's night monkey | Primates | 42.9 | XP_012324051.1 | 751 | 94.4 | 0 |
| Pan troglodytes | Chimpanzee | Primates | 6.4 | XP_009435254.1 | 751 | 95.6 | 0 |
| Macaca mulatta | Rhesus macaque | Primates | 28.81 | XP_028683917.1 | 751 | 81.4 | 0 |
| Macaca fascicularis | Crab-eating macaque | Primates | 28.81 | XP_015313198.1 | 751 | 81.2 | 0 |
| Cebus capucinus imitator | Panamanian white-faced capuchin | Primates | 42.9 | XP_017376089.1 | 751 | 80.4 | 0 |
| Pan paniscus | Bonobo | Primates | 6.4 | XP_014198483.1 | 827 | 76.8 | 0 |
| Tupaia chinensis | Chinese tree shrew | Scandentia | 85 | XP_006163813.1 | 876 | 50.7 | 0 |
| Carlito syrichta | Philippine tarsier | Primates | 69 | XP_021573536.1 | 807 | 52.5 | 0 |
| Chrysochloris asiatica | Cape golden mole | Afrosoricida | 102 | XP_006877795.1 | 764 | 45.0 | 0 |
| Echinops telfairi | Lesser hedgehog tenrec | Afrosoricida | 102 | XP_030742187.1 | 1487 | 26.1 | 0 |
| Seriola dumerili | Greater amberjack | Carangidae ("Bony fishes") | 433 | XP_022604330.1 | 763 | 30.9 | 0 |
| Oreochromis niloticus | Nile tilapia | Cichlidae ("Bony fishes") | 433 | XP_019222635.1 | 1033 | 10.2 | 0 |
| Archocentrus centrarchus | Flier cichlid | Cichlidae ("Bony fishes") | 433 | XP_030603298.1 | 794 | 28.6 | 0 |
| Mastacembelus armatus | Zig-zag eel | Synbranchiformes | 433 | XP_026164592.1 | 760 | 28.7 | 0 |
| Pteropodidae | Megabat | Chiroptera | 94 | | 751 | 35.4 | 0 |
| Mus musculus | Mouse | Rodentia | 89 | | 751 | 26.5 | 3.00e-153 |
| Ciona intestinalis | Sea squirt | Enterogona | 603 | | 1278 | 15.6 | 3.00e-96 |
| Petromyzontiformes | Sea lamprey | Lamprey | 599 | | 751 | 7.1 | 4.00e-35 |
| Drosophila sechellia | Fruit fly | Fly | 736 | | 751 | 6.9 | 4.00e-35 |
| Pristionchus pacificus | Roundworm | Rhabditida | 736 | | 751 | 2.9 | 6.00e-15 |
| Caenorhabditis briggsae | Nematode | Rhabditida | 736 | | 751 | 1.9 | 2.00e-08 |
| Camellia japonica | Japanese camellia | Plants | 1275 | | 751 | 1.9 | 2.00e-08 |

Table 5. Orthologs to ZNF337.

ZNF337 is evolving very quickly at the molecular level. Compared with fibrinogen, ZNF337 appears to accumulate about the same number of amino acid changes in the same amount of time. It is evolving faster than hemoglobin and than cytochrome c, which is known to evolve slowly.
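Rate comparisons of this kind are usually made by converting percent identity into an estimated number of substitutions per site and relating it to divergence time. The article does not state which correction was used, so the sketch below assumes the simple Poisson correction, d = -ln(1 - p), where p is the fraction of differing sites. It uses a few rows from Table 5 as input; the function name is hypothetical.

```python
import math

# (species, divergence from human lineage in MYA, percent identity) from Table 5.
ORTHOLOGS = [
    ("Gorilla gorilla gorilla", 8.6, 99.5),
    ("Tupaia chinensis", 85, 50.7),
    ("Mus musculus", 89, 26.5),
    ("Seriola dumerili", 433, 30.9),
]


def poisson_distance(percent_identity):
    """Estimated substitutions per site: d = -ln(1 - p), with p the differing fraction."""
    identity = percent_identity / 100.0
    if not 0.0 < identity <= 1.0:
        raise ValueError("percent identity must be greater than 0 and at most 100")
    # 1 - p equals the identical fraction, so d = -ln(identity).
    return -math.log(identity)


if __name__ == "__main__":
    for species, mya, identity in ORTHOLOGS:
        print(f"{species}: {mya} MYA, d = {poisson_distance(identity):.3f}")
```

Plotting these distances against divergence time for ZNF337 and for reference proteins such as fibrinogen and cytochrome c would make the relative rates directly comparable.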

Function/Biochemistry

The ZNF337 gene encodes a zinc finger domain-containing protein; however, neither the gene nor the protein is yet well understood.

The gene has been proposed to participate in processes such as DNA-dependent regulation of transcription, and the protein is expected to have molecular functions such as DNA binding, metal ion binding, and zinc ion binding, and to localize to various subcellular locations.

Because ZNF337 has several post-translational modification sites, alternative protein states may be present that permit ZNF337 to have different forms.

ZNF337 also has a variety of interactions with other proteins as discussed above, suggesting it may have a broad range of action. The different transcription factors demonstrate roles in transcription regulation. The KRAB box in the beginning of the sequence may play an important role in cell differentiation and development as well as regulating viral replication and transcription.[24] PHD fingers are found in nuclear proteins involved in epigenetics and chromatin-mediated transcriptional regulation.[25] Zinc finger C2H2 transcription factors are sequence-specific DNA binding proteins that regulate transcription. They possess DNA-binding domains that are formed from repeated Cys2His2 zinc finger motifs.[26] Also, many proteins containing a RING finger play a key role in the ubiquitination pathway.

Interacting proteins

Only the CEBPA transcription factor, within the strongest DNase HS cluster, was also detected by Genomatix. Genomatix determined that potential transcription factors include TF2B, VTBP, CAAT, and CEBP. The association of CEBPA with the ZNF337 gene is confirmed by the TF-ChIP ENCODE data and ORegAnno. The cluster score for this overlapping transcription factor is 1000. Transcription factors that might bind to regulatory sequences, specifically the enhancer region, include CEBPA (chr20:25670005-25670302) and CTCF (chr20:25670168-25670507).
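The two enhancer intervals given above overlap. A minimal sketch of the overlap calculation follows, assuming both coordinate pairs are inclusive positions on the same chromosome; the function name is hypothetical.

```python
# Enhancer-region binding intervals on chr20, as given in the text.
CEBPA = (25670005, 25670302)
CTCF = (25670168, 25670507)


def overlap(a, b):
    """Return the (start, end) shared by two inclusive intervals, or None if disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


if __name__ == "__main__":
    shared = overlap(CEBPA, CTCF)
    if shared is None:
        print("no overlap")
    else:
        print(f"overlap chr20:{shared[0]}-{shared[1]} ({shared[1] - shared[0] + 1} bp)")
```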

Clinical significance

Diseases associated with the ZNF337 gene include adult astrocytic tumors,[27] the most common glial (brain cell) tumors occurring within the brain and spinal cord.[28] This association is plausible given the high expression of the ZNF337 gene in various parts of the brain (specifically the cerebellum).

There are several notable SNPs in the coding sequence of ZNF337. Most of these are missense or nonsense mutations.[29] [30]

Notes and References

  1. Web site: ZNF337 zinc finger protein 337 [Homo sapiens (human)] - Gene - NCBI. www.ncbi.nlm.nih.gov. 2020-04-29.
  2. 2019-08-22. Homo sapiens zinc finger protein 337 (ZNF337), transcript variant 2, mRNA. en-US.
  3. Deloukas P, Matthews LH, Ashurst J, Burton J, Gilbert JG, Jones M, Stavrides G, Almeida JP, Babbage AK, Bagguley CL, Bailey J, Barlow KF, Bates KN, Beard LM, Beare DM, Beasley OP, Bird CP, Blakey SE, Bridgeman AM, Brown AJ, Buck D, Burrill W, Butler AP, Carder C, Carter NP, Chapman JC, Clamp M, Clark G, Clark LN, Clark SY, Clee CM, Clegg S, Cobley VE, Collier RE, Connor R, Corby NR, Coulson A, Coville GJ, Deadman R, Dhami P, Dunn M, Ellington AG, Frankland JA, Fraser A, French L, Garner P, Grafham DV, Griffiths C, Griffiths MN, Gwilliam R, Hall RE, Hammond S, Harley JL, Heath PD, Ho S, Holden JL, Howden PJ, Huckle E, Hunt AR, Hunt SE, Jekosch K, Johnson CM, Johnson D, Kay MP, Kimberley AM, King A, Knights A, Laird GK, Lawlor S, Lehvaslaiho MH, Leversha M, Lloyd C, Lloyd DM, Lovell JD, Marsh VL, Martin SL, McConnachie LJ, McLay K, McMurray AA, Milne S, Mistry D, Moore MJ, Mullikin JC, Nickerson T, Oliver K, Parker A, Patel R, Pearce TA, Peck AI, Phillimore BJ, Prathalingam SR, Plumb RW, Ramsay H, Rice CM, Ross MT, Scott CE, Sehra HK, Shownkeen R, Sims S, Skuce CD, Smith ML, Soderlund C, Steward CA, Sulston JE, Swann M, Sycamore N, Taylor R, Tee L, Thomas DW, Thorpe A, Tracey A, Tromans AC, Vaudin M, Wall M, Wallis JM, Whitehead SL, Whittaker P, Willey DL, Williams L, Williams SA, Wilming L, Wray PW, Hubbard T, Durbin RM, Bentley DR, Beck S, Rogers J . 6 . The DNA sequence and comparative analysis of human chromosome 20 . Nature . 414 . 6866 . 865–71 . 20–27 December 2001 . 11780052 . 10.1038/414865a . 2001Natur.414..865D . free .
  4. Web site: AceView: Gene:ZNF337, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs. www.ncbi.nlm.nih.gov. 2020-04-29.
  5. Web site: ZNF337 Gene - GeneCards ZN337 Protein ZN337 Antibody. www.genecards.org. 2020-04-29.
  6. Web site: ZNF337 Gene - GeneCards ZN337 Protein ZN337 Antibody. www.genecards.org. 2020-05-02.
  7. Web site: ExPASy - Compute pI/Mw tool. web.expasy.org. 2020-05-02.
  8. Web site: SAPS < Sequence Statistics < EMBL-EBI. www.ebi.ac.uk. 2020-05-02.
  9. Web site: Motif Scan. myhits.isb-sib.ch. en. 2020-04-30.
  10. Web site: CFSSP: Chou & Fasman Secondary Structure Prediction Server. www.biogem.org. 2020-04-30.
  11. Web site: GOR4 secondary structure prediction. npsa-prabi.ibcp.fr. 2020-05-03. Archived 2022-04-20: https://web.archive.org/web/20220420063234/https://npsa-prabi.ibcp.fr/cgi-bin/secpred_gor4.pl.
  12. Web site: SWISS-MODEL. swissmodel.expasy.org. 2020-04-30.
  13. Web site: ZNF568 Gene - GeneCards ZN568 Protein ZN568 Antibody. www.genecards.org. 2020-04-30.
  14. Web site: Human hg38 chr20:25,618,436-25,683,311 UCSC Genome Browser v397. genome.ucsc.edu. 2020-05-03.
  15. Web site: The Human Protein Atlas. www.proteinatlas.org. 2020-05-03.
  16. Web site: ZNF337 Antibody. Novus Biologicals. 2020-05-03.
  17. Web site: GPS 5.0 - Kinase-specific Phosphorylation Site Prediction. gps.biocuckoo.cn. 2020-04-30.
  18. Web site: EMBOSS: epestfind. emboss.bioinformatics.nl. 2020-05-03.
  19. Web site: YinOYang 1.2 Server. www.cbs.dtu.dk. 2020-05-03.
  20. Web site: SUMOplot™ Analysis Program Abcepta. www.abcepta.com. 2020-04-30.
  21. Web site: NetGlycate 1.0 Server. www.cbs.dtu.dk. en. 2020-04-30.
  22. Web site: SOSUIsignal: Result. harrier.nagahama-i-bio.ac.jp. 2020-04-30.
  23. Web site: PSORT WWW Server. psort.hgc.jp. 2020-05-03.
  24. Web site: SMART: KRAB domain annotation. smart.embl.de. en. 2020-05-03.
  25. Web site: SMART: PHD domain annotation. smart.embl.de. en. 2020-05-03.
  26. Web site: Gene Group: C2H2 ZINC FINGER TRANSCRIPTION FACTORS. flybase.org. 2020-05-03.
  27. Web site: ZNF337 Gene - GeneCards ZN337 Protein ZN337 Antibody. www.genecards.org. 2020-05-03.
  28. Web site: Astrocytoma Tumors – Symptoms, Diagnosis and Treatments. www.aans.org. en. 2020-05-03.
  29. Web site: Home - SNP - NCBI. www.ncbi.nlm.nih.gov. 2020-05-03.
  30. Web site: SNP linked to Gene (geneID:26152) Via Contig Annotation. www.ncbi.nlm.nih.gov. 2020-05-03.