CAP-Gly Domain Containing Linker Protein Family Member 4 is a protein that in humans is encoded by the CLIP4 gene.[1] In terms of conserved domains, the CLIP4 protein contains primarily ankyrin repeats and the eponymous CAP-Gly domains. The secondary structure of the CLIP4 protein is largely random coil, with alpha helices making up most of the remainder.[2] CLIP4 mRNA is expressed most strongly in the adrenal cortex and atrioventricular node. The literature on CLIP4's conserved domains and paralogs points toward microtubule regulation as a possible function of CLIP4.
The human CLIP4 gene, also known as Restin-Like Protein 2 (RSNL2),[3] is located on the plus strand of the short (p) arm of chromosome 2 at region 2, band 3 from base pair 29,096,676 to base pair 29,189,643. CLIP4 is 92,968 base pairs in length and consists of 23 exons.
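The stated gene length follows from the two coordinates if both endpoints are counted. A minimal sketch of that arithmetic, assuming 1-based inclusive coordinates (the function name `span_length` is illustrative and not taken from any cited tool):

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based interval that includes both endpoints."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinates quoted in the text for CLIP4 on chromosome 2.
CLIP4_START = 29_096_676
CLIP4_END = 29_189_643

print(span_length(CLIP4_START, CLIP4_END))  # 92968
```

The `+ 1` is what distinguishes an inclusive span from a simple difference of positions.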
Transcript | mRNA size (nucleotides) | |
---|---|---|
CLIP4 transcript variant 1[4] | 4299 | |
CLIP4 transcript variant 2[5] | 4295 | |
CLIP4 transcript variant 3[6] | 2353 |
The human CLIP4 protein is 705 amino acids in length and is composed of two main types of conserved domains: two CAP-Gly domains and numerous ankyrin repeats. The secondary structure of CLIP4 consists largely of random coil, with alpha helices as the second-most abundant structure and beta sheets as the third-most abundant structure.
The isoelectric point of the unprocessed CLIP4 protein is slightly basic (pI 8.62), meaning there is a slight excess of basic amino acids compared to acidic amino acids.[7] The molecular weight is about 65 kDa.[7] The most abundant amino acid in CLIP4 is serine, which makes up 10.7% of the protein.[8] Aligned matching blocks of separated, tandem, and periodic repeats are found between positions 340-345 and 542-547, as well as 447-547 and 564-568. An unusual 9-residue periodic element, consisting of a single lysine followed by eight other amino acids, occurs five times within the protein when compared to the swp23s.q dataset. Another unusual feature is a 7-residue periodic element of a negatively charged amino acid followed by six hydrophobic amino acids, which occurs six times within the protein when compared to the swp23s.q dataset. Two serine spacings and two phenylalanine spacings are also unusually large when compared to the swp23s.q dataset.
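A composition figure such as "serine makes up 10.7% of the protein" is a residue count divided by sequence length. The sketch below shows that calculation on a short, made-up peptide. The sequence `toy` is hypothetical and is not the CLIP4 sequence, and the helper names are illustrative rather than those of the tool cited above.

```python
from collections import Counter


def composition(seq: str) -> dict:
    """Fraction of the sequence contributed by each residue letter."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {residue: n / len(seq) for residue, n in counts.items()}


def most_abundant(seq: str) -> tuple:
    """Return (residue, fraction) for the most frequent residue."""
    fractions = composition(seq)
    residue = max(fractions, key=fractions.get)
    return residue, fractions[residue]


# Hypothetical 10-residue peptide, used only to exercise the functions.
toy = "MSSKASSLKS"
print(most_abundant(toy))  # ('S', 0.5)

# Going the other way: 10.7% of a 705-residue protein is roughly 75 residues.
print(round(0.107 * 705))  # 75
```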
Isoform | Protein size (amino acids) | |
---|---|---|
CLIP4 isoform 1[9] | 705 | |
CLIP4 isoform 2[10] | 599 |
CLIP4 RNA is consistently measured at high levels in the thyroid. High levels of transcription also occur in the adrenal cortex and atrioventricular node.[11] The Human Protein Atlas reports high RNA expression in muscle tissues, with some expression in the skin, endocrine tissues, and proximal digestive tract.[12] Protein expression was likewise greatest in muscle tissues, with some expression in the lung, gastrointestinal tract, liver and gallbladder, and bone marrow and lymphoid tissues.
CLIP4 appears to be highly expressed under Ada3 deficiency.[13] There is also a trend toward higher CLIP4 expression in the absence of U28.
These transcription factors were chosen and organized based on proximity to the promoter and matrix similarity.[14]
Transcription Factor | Detailed Matrix Info | Anchor Base | Matrix Similarity | Sequence | |
---|---|---|---|---|---|
NOLF | Early B-cell factor 1 | 17 | 0.98 | taagagTCCCcagggcagaaaca | |
PAX2 | Zebrafish PAX2 paired domain protein | 18 | 0.8 | aagagtccccagggcagAAACaa | |
AP2F | Transcription factor AP-2, alpha | 16 | 0.98 | ctgcCCTGgggactc | |
AP2F | Transcription factor AP-2, beta | 16 | 0.899 | gagTCCCcagggcag | |
SORY | SRY (sex-determining region Y) box 9, dimeric binding sites | 35 | 0.768 | aAACAaaatccagtgagggagag | |
HNF6 | CUT-homeodomain transcription factor Onecut-2 | 32 | 0.827 | aaacaaAATCcagtgag | |
PAX5 | B-cell-specific activator protein | 40 | 0.815 | acaaaaTCCAgtgagggagagatgcaggg | |
ZF16 | PR/SET domain 15 | 36 | 0.852 | aaatccagtgaGGGA | |
SORY | HMGI(Y) high-mobility-group protein I (Y), architectural transcription factor organizing the framework of a nuclear protein-DNA transcriptional complex | 78 | 0.945 | tggaAATTttctaccttaggagc | |
NFAT | Nuclear factor of activated T-cells 5 | 83 | 0.955 | ttttGGAAattttctacct | |
NFAT | Nuclear factor of activated T-cells 5 | 83 | 0.871 | aggtAGAAaatttccaaaa | |
CEBP | CCAAT/enhancer binding protein (C/EBP), epsilon | 89 | 0.975 | agccttttGGAAatt | |
CAAT | Cellular and viral CCAAT box | 110 | 0.91 | gcagCCATttaatct | |
CAAT | Avian C-type LTR CCAAT box | 165 | 0.875 | cccaCCAAgcagtgg | |
CEBP | CCAAT/enhancer binding protein (C/EBP), gamma | 650 | 0.866 | ctaaTTGCtcaacgt | |
CEBP | CCAAT/enhancer binding protein alpha | 651 | 0.971 | cacgttgaGCAAtta | |
VTBP | Mammalian C-type LTR TATA box | 680 | 0.903 | tgctgTAAAaggcctaa | |
TF2B | Transcription factor II B (TFIIB) recognition element | 983 | 1 | ccgCGCC | |
TF2B | Transcription factor II B (TFIIB) recognition element | 1157 | 1 | ccgCGCC | |
TF2B | Transcription factor II B (TFIIB) recognition element | 1228 | 1 | ccgCGCC |
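Some rows in the table share an anchor base and carry sequences that are reverse complements of one another, namely the two AP2F rows and the two NFAT rows. The sketch below checks this for those two pairs. It only compares the strings printed in the table and makes no claim about how the cited prediction tool reports strands. The helper names are illustrative.

```python
# Translation table for DNA bases, preserving the table's mixed case.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_reverse_complement(a: str, b: str) -> bool:
    """Case-insensitive test that b is the reverse complement of a."""
    return reverse_complement(a).lower() == b.lower()


# Sequence pairs copied from the AP2F and NFAT rows of the table.
pairs = {
    "AP2F": ("ctgcCCTGgggactc", "gagTCCCcagggcag"),
    "NFAT": ("ttttGGAAattttctacct", "aggtAGAAaatttccaaaa"),
}

for family, (first, second) in pairs.items():
    print(family, is_reverse_complement(first, second))
```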
The human CLIP4 mRNA sequence has 12 stem-loop structures in its 5' UTR and 13 stem-loop structures in its 3' UTR. Of these, all 12 stem-loops in the 5' UTR are conserved, whereas only 1 stem-loop in the 3' UTR is conserved.[15]
The human CLIP4 protein is localized to the nuclear membrane.[16] Consistent with its intracellular localization, CLIP4 does not have a signal peptide.[17] It also lacks N-linked glycosylation sites for the same reason.[18] CLIP4 is not cleaved.[19] However, numerous O-linked glycosylation sites are present.[20] Phosphorylation sites are most densely clustered between amino acid positions 400 and 599, although many are also present throughout the rest of the protein.[21]
CAP-Gly domains are often associated with microtubule regulation.[22] In addition, ankyrin repeats are known to mediate protein-protein interactions.[23] Furthermore, CLIP1, a paralog of CLIP4 in humans, is known to bind to microtubules and regulate the microtubule cytoskeleton.[24] The CLIP4 protein is also predicted to interact with various microtubule-associated proteins.[25] As a result, it is likely that the CLIP4 protein, although uncharacterized, is associated with microtubule regulation.
The CLIP4 protein is predicted to interact with several microtubule-associated proteins, namely MAPRE1, MAPRE2, and MAPRE3. It is also predicted to interact with CKAP5 and DCTN1, which are a cytoskeleton-associated protein and a dynactin-associated protein, respectively.
CLIP4 activity is correlated with the spread of renal cell carcinomas (RCCs) within the host and could therefore be a potential biomarker for RCC metastasis in cancer patients.[26] Additionally, measurement of CLIP4 promoter methylation levels using a Global Methylation DNA Index reveals that higher methylation of CLIP4 is associated with increasing severity of gastritis and possible progression to gastric cancer.[27] This indicates that CLIP4 could be used for early detection of gastric cancer.[28] A similar finding was documented for prostate cancer, in which CLIP4 was found to be hypermethylated in patients with the disease.[29]
CLIP4 levels were found to be highly increased in samples with predicted severe fibrosis resulting from chronic hepatitis C virus (HCV) infection.[30] Additionally, the identification of CLIP4 as a novel self-antigen in systemic lupus erythematosus points to a potential role in the disease mechanism.[31]
These orthologs were chosen and organized based on estimated date of divergence from the human protein as well as the global sequence identity.[32]
Binomial Nomenclature | Common Name | Taxonomic Group | Estimated Date of Divergence from Human (MYA) | Accession Number | Sequence Length (AA) | Global Sequence Identity to Human Protein (%) | Global Sequence Similarity to Human Protein (%) | |
---|---|---|---|---|---|---|---|---|
Homo sapiens (Hsa) | Human | Primate | 0 | AAP97312 | 601 | 100 | 100 | |
Aotus nancymaae (Ana) | Ma's night monkey | Primate | 43.2 | XP_012330895 | 704 | 83.5 | 83.7 | |
Sorex araneus (Sar) | Common shrew | Eulipotyphla | 96 | XP_004620056 | 707 | 74 | 78.5 | |
Antrostomus carolinensis (Aca) | Chuck-will's-widow | Aves | 312 | XP_028942997 | 702 | 66.5 | 75.4 | |
Gekko japonicus (Gja) | Schlegel's Japanese gecko | Reptilia | 312 | XP_015270366 | 702 | 63.8 | 73.1 | |
Rhinatrema bivittatum (Rbi) | Two-lined caecilian | Amphibians | 351.8 | XP_029448862 | 707 | 59.5 | 70.5 | |
Callorhinchus milii (Cmi) | Elephant shark | Chondrichthyes | 473 | XP_007895016 | 715 | 52.5 | 65.6 | |
Branchiostoma floridae (Bfl) | Florida lancelet | Leptocardii | 684 | XP_002606824 | 481 | 40.4 | 52.8 | |
Saccoglossus kowalevskii (Sko) | Acorn worm | Enteropneusta | 684 | XP_006822686 | 648 | 35.7 | 47.5 | |
Ixodes scapularis (Isc) | Black-legged tick | Arachnid | 797 | XP_029831090 | 527 | 38.9 | 53 | |
Limulus polyphemus (Lpo) | Atlantic horseshoe crab | Arachnid | 797 | XP_013786376 | 462 | 38 | 51.6 | |
Lottia gigantea (Lgi) | Owl limpet | Gastropods | 797 | XP_009046843 | 669 | 36.3 | 49.3 | |
Mizuhopecten yessoensis (Mye) | Yesso scallop | Bivalvia | 797 | XP_021359747 | 633 | 35.4 | 47.2 | |
Parasteatoda tepidariorum (Pte) | Common house spider | Arachnid | 797 | XP_015914966 | 616 | 34.7 | 47.6 | |
Aplysia californica (Aca) | California sea hare | Gastropods | 797 | XP_012945346 | 653 | 33.7 | 45.7 | |
Crassostrea virginica (Cvi) | Eastern oyster | Bivalvia | 797 | XP_022315879 | 646 | 32.7 | 45.1 | |
Tetranychus urticae (Tur) | Two-spotted spider mite | Arachnid | 797 | XP_015790536 | 652 | 31.9 | 43.5 | |
Centruroides sculpturatus (Csc) | Bark scorpion | Arachnid | 797 | XP_023229484 | 605 | 30.6 | 43.4 | |
Penaeus vannamei (Pva) | Pacific white shrimp | Malacostracans | 797 | XP_027206746 | 681 | 22.9 | 34 | |
Monosiga brevicollis (Mbr) | Choanoflagellate | Choanoflagellatea | 1023 | XP_001748580 | 576 | 25.3 | 40.8 |
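Global sequence identity is the percentage of aligned positions at which two sequences carry the same residue. The sketch below computes it for a pair of already-aligned strings. It assumes the denominator is the full alignment length, gaps included. Alignment tools differ on this convention, and the source does not state which one produced the table values. The toy sequences are hypothetical and unrelated to CLIP4.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same non-gap residue.

    Assumption: the denominator is the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 7-column alignment: 5 identical columns, 1 gap, 1 mismatch.
print(round(percent_identity("MKT-AYS", "MKTQAFS"), 1))  # 71.4
```

Similarity scores, as in the last column of the table, additionally count conservative substitutions as matches and therefore require a substitution matrix, which this sketch does not model.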