KIAA1704, also known as LSR7 (lipopolysaccharide-specific response protein 7), is a protein that in humans is encoded by the GPALPP1 (GPALPP motifs containing 1) gene. The function of KIAA1704 is not yet well understood. KIAA1704 contains one domain of unknown function, DUF3752. The protein contains a conserved, uncharged, repeated motif GPALPP(GF) near the N terminus and an unusual, conserved, mixed charge throughout (alternating readily between positive and negative charges).[1] It is predicted to be localized to the nucleus.[2]
KIAA1704 shows at least a 5-fold loss of expression associated with mantle cell lymphoma.[3] In a second study, researchers used linkage disequilibrium mapping of locus 13q13-14 to investigate potential susceptibility to autism across a 1.5 Mb linkage peak that includes KIAA1704. A single-marker PDTPhase analysis was performed for four SNPs in KIAA1704; however, none of the SNPs showed a statistically significant association.[4]
An expression study found that KIAA1704 is significantly up-regulated in U937 cells (a macrophage-like human cell line) treated with nicotine.[5]
KIAA1704 is found on chromosome 13, at locus q14.12, with the genomic sequence starting at 45,563,687 bp and ending at 45,602,405 bp.[6]
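The genomic span implied by these coordinates can be checked with a little arithmetic. The sketch below assumes the coordinates are 1-based and inclusive at both ends, which the source does not state; `span_length` is a hypothetical helper name used only for illustration.

```python
def span_length(start_bp: int, end_bp: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    if end_bp < start_bp:
        raise ValueError("end coordinate must not precede start coordinate")
    # Both endpoints count as part of the interval, hence the +1.
    return end_bp - start_bp + 1


# Coordinates quoted in the text for KIAA1704 on chromosome 13.
gene_span = span_length(45_563_687, 45_602_405)
print(f"{gene_span:,} bp")
```

Under that assumption the gene spans 38,719 bp, or roughly 38.7 kb; with a half-open convention the figure would be one base shorter.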
KIAA1704 is located on the positive strand and is surrounded by five nearby genes.
KIAA1704 shows ubiquitous low-to-moderate expression across body tissues (below 50%).[8]
Genomatix ElDorado analysis tools predict a promoter 727 base pairs in length that extends into exon 1. There are two predicted transcriptional start sites for this promoter, shown on the adjacent image.[9]
The KIAA1704 promoter showed significant histone 3 lysine 4 trimethylation peaks in K562 cells (an erythroid cell line). The gene also showed increased relative expression in erythroid progenitors, along with its neighboring genes NUFIP1 and TPT1.[10]
An additional study found that the proximal promoter is one of many thousands of direct targets of the transcription factor Myc in vivo.[11]
According to Ensembl, there are four coding splice variants. None of the alternative splice forms is supported by experimental evidence. One splice variant undergoes nonsense-mediated decay, while another is predicted to splice the gene directly in half and retain amino acids 171–340.[12]
NCBI BLAST searches reveal that known mRNA orthologs exist in mammals, reptiles, birds, frogs, and fish with at least 65% sequence identity.[13]
As shown in the table below, KIAA1704 has significantly higher percentages of charged amino acids (D, K, KR, KRED) than the average human protein, and this composition is mostly conserved among its orthologous proteins.[1]
Compositional Analysis | Amino Acid (AA) Abundance H. sapiens | AA Abundance Mus musculus | AA Abundance Gallus gallus | AA Abundance Xenopus tropicalis
---|---|---|---|---
D++ | 42 (12.4%) | 37 (10.7%) | 35 (10%) | 33 (9.8%)
V-- | 6 (1.8%) | 9 (2.6%) | 9 (2.6%) | -----
K+ | 37 (10.9%) | 37 (10.7%) | ----- | -----
L- | 17 (5.0%) | 17 (4.9%) | 19 (5.4%) | -----
KR+ | 62 (18.2%) | 62 (17.9%) | 60 (17.1%) | 56 (16.6%)
KRED++ | 134 (39.4%) | 135 (39.0%) | 135 (38.6%) | 122 (36.1%)
ED+ | 72 (21.2%) | 73 (21.1%) | 75 (21.4%) | 66 (19.5%)
LVIFM- | 54 (15.9%) | 59 (17.1%) | 58 (16.6%) | 57 (16.9%)
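Counts and percentages of this kind can be reproduced from any protein sequence by tallying residues in each group. The following is a minimal sketch, not the tool used in the cited analysis; the function name `group_composition` and the toy sequence are assumptions for illustration, and the toy sequence is not the KIAA1704 sequence.

```python
from collections import Counter

# Residue groups as they appear in the table (single residues and combined sets).
DEFAULT_GROUPS = ("D", "K", "KR", "ED", "KRED", "LVIFM")


def group_composition(sequence: str, groups=DEFAULT_GROUPS) -> dict:
    """Return {group: (count, percent)} for each residue group in a protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    result = {}
    for group in groups:
        # A group's count is the sum of the counts of its member residues.
        total = sum(counts[residue] for residue in group)
        result[group] = (total, round(100 * total / len(seq), 1))
    return result


# Hypothetical 20-residue toy sequence, used only to exercise the function.
toy = "DKREDLVIFMAAAAAAAAAA"
for group, (count, pct) in group_composition(toy).items():
    print(f"{group}: {count} ({pct}%)")
```

As a sanity check on the table, 42 aspartate residues in a 340-residue protein is 42/340, or about 12.4%, which matches the H. sapiens entry for D.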
KIAA1704 has protein orthologs extending as far as plants and fungi, listed in the table below in roughly descending order of identity to the human protein. Mammals show the highest level of conservation, with at least 89 percent identity, followed by birds, frogs, fish, invertebrates, insects, and plants.[13]
Species | Accession Number | Sequence Length (aa) | Sequence Identity to Human Protein (%) | Sequence Similarity to Human Protein (%) | Evolutionary Time to Human Divergence (Million years)
---|---|---|---|---|---
Homo sapiens | NP_061029.2 | 340 | 100 | 100 | 0
Pan troglodytes | XP_509661.2 | 340 | 99 | 100 | 6.4
Macaca mulatta | XP_001094145 | 344 | 97 | 98 | 29.2
Loxodonta africana | XP_003412655 | 341 | 95 | 98 | 98.8
Sus scrofa | XP_001924228 | 346 | 89 | 95 | 92.4
Mus musculus | NP_080453.2 | 346 | 89 | 93 | 94.4
Gallus gallus | NP_001006270 | 350 | 67 | 78 | 371.2
Taeniopygia guttata | XP_002198724 | 342 | 70 | 81 | 400.1
Xenopus tropicalis | NP_001072786 | 338 | 62 | 74 | 400.1
Xenopus laevis | NP_001089474 | 337 | 61 | 74 | 661.2
Danio rerio | NP_001003473 | 405 | 49 | 61 | 782.7
Saccoglossus kowalevskii | XP_002738946 | 350 | 43 | 60 | 1369
Culex quinquefasciatus (Southern house mosquito) | XP_001847636.1 | 335 | 41 | 57 | 782.7
Drosophila ananassae | XP_001954135 | 348 | 34 | 50 | 1215.8
Glycine max | XP_003556198 | 569 | 32 | 51 | 1369
Puccinia graminis | XP_003328471 | 346 | 29 | 46 | 1215.8
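For readers unfamiliar with how an identity percentage is derived, the sketch below computes it from two already-aligned sequences. It assumes identity is the number of identical columns divided by the full alignment length, gaps included, which is the convention BLAST uses in its "Identities" line; other tools use other denominators. The function name and the toy alignment are hypothetical and do not come from the KIAA1704 alignments.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes both inputs come from one pairwise alignment (equal length,
    gaps written as `gap`). Gapped columns count toward the length but
    never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: one substitution (L/V) and one gap.
print(percent_identity("GPALPPGF", "GPAVPP-F"))
```

Sequence similarity, the adjacent column in the table, is computed the same way except that conservative substitutions scored as positive by a substitution matrix also count, so it is never lower than identity.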
Little is currently known about the conserved GPALPP(GF) motif. This motif makes up the neutral segments of this otherwise highly charged protein.
DUF3752 is generally found in eukaryotes and is between 140 and 163 amino acids in length. It belongs to pfam12572, a member of superfamily cl13947.[14]
Conserved Region | H. sapiens Amino Acid Site | Charge (Acidic, Basic, Neutral)
---|---|---
 | 41–49 | Neutral
 | 81–88 | Acidic
GPALPP(GF) | 7–14; 32–37; 92–99; 112–119 | Neutral
IIGP | 110–113; 146–149 | Neutral
DUF3752 | 196–333 | Basic (pI=10.51)
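Residue ranges like those listed for GPALPP(GF) can be located with a simple pattern search. The sketch below assumes that the parenthesized GF is an optional suffix, which is consistent with the table showing both 6-residue and 8-residue sites, and that occurrences are exact matches; real motif instances may be degenerate, in which case a profile-based search would be more appropriate. The function name and toy sequence are hypothetical, and the toy sequence is not the KIAA1704 sequence.

```python
import re

# GPALPP optionally followed by GF (assumed reading of "GPALPP(GF)").
MOTIF = re.compile(r"GPALPP(?:GF)?")


def find_motif_sites(sequence: str, pattern: re.Pattern = MOTIF) -> list:
    """Return (start, end) positions of each match, 1-based and inclusive."""
    # re reports 0-based half-open spans; shifting the start by one gives
    # the 1-based inclusive convention used in protein annotation tables.
    return [(m.start() + 1, m.end()) for m in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence with one 8-residue and one 6-residue occurrence.
toy = "MAAGPALPPGFKKDDGPALPPEE"
for start, end in find_motif_sites(toy):
    print(f"{start}-{end}: {toy[start - 1:end]}")
```

Because `finditer` returns non-overlapping matches scanned left to right, overlapping occurrences would need a lookahead-based pattern instead.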
KIAA1704 is predicted by ExPASy tools to undergo several conserved post-translational modifications, including glycation, O-linked glycosylation, serine phosphorylation, threonine phosphorylation, and several kinase-specific phosphorylations (PKC, PKA, and CKII).[2]
There are four conserved predicted alpha helices located towards the C terminus of the protein. The N terminus is predicted to be dominated by coiled regions.[15]
ExPASy PSORT predicts a 74% chance that the protein is localized to the nucleus.[2]