List of protein tandem repeat annotation software
Computational methods use different properties of protein sequences and structures to find, characterize and annotate protein tandem repeats.
Sequence-based annotation methods
Name | Last update | Usage | Result types | Description | Open source? | Repeat type specific | Reference |
---|---|---|---|---|---|---|---|
ARD2 | 2013 | web | annotated sequence | Neural network | no | alpha-solenoid | [1] |
DECIPHER | 2021 | downloadable | | Detection of tandem and/or interspersed repeats by orthology (DetectRepeats function in R package) | yes | no | [2] |
TRUST | 2004 | downloadable / web | unit position, multiple sequence alignment | Ab-initio determination of internal repeats in proteins. Exploits transitivity of alignments | ? | no | [3] |
T-REKS | 2009 | downloadable / web | repeat unit | Clustering of lengths between identical short strings by using a K-means algorithm | yes | no | [4] |
HHRepID | 2008 | downloadable / web | | Identification of repeats in protein sequences via HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs | no | | [5] |
RADAR | 2018 | downloadable / web | unit position, multiple sequence alignment | Identifies short composition-biased and gapped approximate repeats, as well as complex repeat architectures involving many different types of repeats in a query sequence | yes | no | [6] [7] |
XSTREAM | 2007 | web | unit position, different periods, multiple sequence alignment | Data-mining tool designed to efficiently identify tandem repeat (TR) patterns in biological sequence data. Uses a seed-extension strategy coupled with several post-processing algorithms to analyze FASTA-formatted protein or nucleotide sequences | no | no | [8] |
TRED | 2007 | downloadable | | Defines tandem repeats in terms of edit distance and provides an efficient, deterministic algorithm for finding them | no | no | |
TRAL | 2015 | downloadable | | Detects tandem repeats with both de novo software and sequence profile HMMs; statistical significance analysis of putative tandem repeats, and filtering of redundant predictions | yes | | [9] |
DOTTER | 1995 | downloadable | | Graphical dotplot program for detailed comparison of two sequences | | | [10] |
0j.py | | | | | | | [11] |
PTRStalker | 2012 | downloadable | unit position, multiple sequence alignment | Ab-initio detection of fuzzy tandem repeats in protein amino acid sequences. | | no | [12] |
TRDistiller | 2015 | | | Rapid sorting of tandem repeat (TR)- and no-TR-containing sequences | | | [13] |
REPRO | 2000 | web | | Repeat detection based on a variation of the Smith-Waterman local alignment strategy followed by a graph-based iterative clustering procedure | no | no | [14] |
REP | 2000 | web | | | no | yes | |
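As an illustration of the idea described in the T-REKS row (looking at the lengths between identical short strings), the following is a minimal sketch rather than the published algorithm. It replaces the K-means clustering step with a plain frequency count of distances, and the function names `kmer_distances` and `candidate_period` are hypothetical, chosen only for this example.

```python
from collections import Counter
from typing import Optional


def kmer_distances(seq: str, k: int = 3) -> Counter:
    """Count distances between consecutive occurrences of each identical k-mer."""
    last_seen = {}          # k-mer -> position of its most recent occurrence
    distances = Counter()   # distance -> number of times it was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            distances[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return distances


def candidate_period(seq: str, k: int = 3) -> Optional[int]:
    """Return the most frequent k-mer distance, or None if no k-mer recurs.

    Simplification: a frequency count stands in for the K-means
    clustering of distances that the T-REKS description mentions.
    Ties are broken in favour of the shorter distance.
    """
    distances = kmer_distances(seq, k)
    if not distances:
        return None
    return min(distances, key=lambda d: (-distances[d], d))


# A perfect repeat of a 6-residue unit yields a candidate period of 6.
print(candidate_period("ACDEFG" * 4))  # prints 6
```

A sequence without any recurring k-mer, such as `"ACDEFGHIKL"`, returns `None`. Real repeats are usually degenerate, which is why the listed tools rely on clustering, alignment, or profile methods rather than exact matching alone.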
Structure-based annotation methods
Name | Last update | Usage | Result types | Description | Open source? | Repeat type specific | Reference |
---|---|---|---|---|---|---|---|
TAPO | 2016 | web | unit position | Uses periodicities of atomic coordinates and other types of structural representation, including strings generated by conformational alphabets, residue contact maps, and arrangements of vectors of secondary structure elements | no | no | [15] |
SYMD | 2014 | galaxy | repeat geometry | Detects internally symmetric protein structures through an “alignment scan” procedure in which a protein structure is aligned to itself after circularly permuting the second copy by all possible numbers of residues | no | no | [16] |
RAPHAEL | 2012 | web | repeat probability | Reduces the three-dimensional structure to a wave function, then determines periodicity information | no | no | [17] |
CE-SYMM | 2021 | | | | | | |
ProSTRIP | 2010 | | | | | | |
DAVROS | 2004 | | | | | | |
RQA | 2009 | | | | | | |
OPAAS | 2006 | | | | | | |
Gplus | 2009 | | | | | | |
REUPRED | 2016 | | | | | | |
ConSole | 2015 | | | | | | |
RepeatsDB-Lite | 2017 | | | | | | |
PRIGSA | 2014 | | | | | | |
Swelfe | 2008 | | | | | | |
Frustratometer | 2021 | | | | | | |
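The self-alignment scan described in the SYMD row can be illustrated with a toy example. The sketch below is not SymD itself: instead of aligning three-dimensional coordinates, it assumes the structure has already been reduced to a per-residue string of states (for example secondary-structure letters, similar in spirit to the conformational alphabets mentioned in the TAPO row) and scores simple identity at every circular shift. The names `shift_scan` and `best_shift` are hypothetical.

```python
from typing import List, Optional, Tuple


def shift_scan(states: str) -> List[Tuple[int, float]]:
    """Score the string against every circular permutation of itself.

    Returns (shift, fraction of identical positions) for shifts 1..n-1.
    Assumption: plain character identity stands in for a structural
    alignment score.
    """
    n = len(states)
    scores = []
    for shift in range(1, n):
        rotated = states[shift:] + states[:shift]
        matches = sum(a == b for a, b in zip(states, rotated))
        scores.append((shift, matches / n))
    return scores


def best_shift(states: str) -> Optional[Tuple[int, float]]:
    """Return the highest-scoring shift, preferring the smallest on ties."""
    scores = shift_scan(states)
    if not scores:
        return None
    return max(scores, key=lambda s: (s[1], -s[0]))


# Three copies of a 10-state unit match perfectly at a shift of 10.
print(best_shift("HHHHLLEEEL" * 3))  # prints (10, 1.0)
```

The smallest high-scoring shift gives an estimate of the repeat unit length. A genuine structure-based method would also need to tolerate insertions and unequal unit lengths, which exact string rotation does not.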
Notes and References
- Fournier D, Palidwor GA, Shcherbinin S, Szengel A, Schaefer MH, Perez-Iratxeta C, Andrade-Navarro MA (2013-11-21). "Functional and genomic analyses of alpha-solenoid proteins". PLOS ONE. 8 (11): e79894. doi:10.1371/journal.pone.0079894. PMID 24278209. PMC 3837014. Bibcode:2013PLoSO...879894F.
- Wright ES (2015). "Using DECIPHER v2.0 to Analyze Big Biological Sequence Data in R". The R Journal. 8 (1): 352–359. doi:10.1186/s12859-015-0749-z. PMID 26445311. PMC 4595117.
- Szklarczyk, Radek; Heringa, Jaap (2004-08-04). "Tracking repeats using significance and transitivity". Bioinformatics. 20 (Suppl 1): i311–317. doi:10.1093/bioinformatics/bth911. PMID 15262814. ISSN 1367-4811.
- Jorda, Julien; Kajava, Andrey V. (2009-10-15). "T-REKS: identification of Tandem REpeats in sequences with a K-meanS based algorithm". Bioinformatics. 25 (20): 2632–2638. doi:10.1093/bioinformatics/btp482. PMID 19671691. ISSN 1367-4811.
- Zimmermann, Lukas; Stephens, Andrew; Nam, Seung-Zin; Rau, David; Kübler, Jonas; Lozajic, Marko; Gabler, Felix; Söding, Johannes; Lupas, Andrei N. (2018-07-20). "A Completely Reimplemented MPI Bioinformatics Toolkit with a New HHpred Server at its Core". Journal of Molecular Biology. Computation Resources for Molecular Biology. 430 (15): 2237–2243. doi:10.1016/j.jmb.2017.12.007. PMID 29258817. ISSN 0022-2836. S2CID 22415932.
- Heger, Andreas; Holm, Liisa (2000). "Rapid automatic detection and alignment of repeats in protein sequences". Proteins: Structure, Function, and Genetics. 41 (2): 224–237. doi:10.1002/1097-0134(20001101)41:2<224::aid-prot70>3.0.co;2-z. PMID 10966575. ISSN 0887-3585. S2CID 21757391.
- Lopez, Rodrigo; Paern, Juri; Squizzato, Silvano; Valentin, Franck; Li, Weizhong; McWilliam, Hamish; Goujon, Mickael (2010-07-01). "A new bioinformatics analysis tools framework at EMBL–EBI". Nucleic Acids Research. 38 (suppl_2): W695–W699. doi:10.1093/nar/gkq313. PMID 20439314. PMC 2896090. ISSN 0305-1048.
- Newman, Aaron M.; Cooper, James B. (2007-10-11). "XSTREAM: A practical algorithm for identification and architecture modeling of tandem repeats in protein sequences". BMC Bioinformatics. 8 (1): 382. doi:10.1186/1471-2105-8-382. PMID 17931424. PMC 2233649. ISSN 1471-2105.
- Anisimova, Maria; Xenarios, Ioannis; Zoller, Stefan; Stockinger, Heinz; Murri, Riccardo; Messina, Antonio; Pečerska, Jūlija; Korsunsky, Alexander; Schaper, Elke (2015-09-15). "TRAL: tandem repeat annotation library". Bioinformatics. 31 (18): 3051–3053. doi:10.1093/bioinformatics/btv306. PMID 25987568. ISSN 1367-4803. hdl:20.500.11850/103876.
- Sonnhammer, E. L.; Durbin, R. (1995-12-29). "A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis". Gene. 167 (1–2): GC1–10. doi:10.1016/0378-1119(95)00714-8. PMID 8566757. ISSN 0378-1119.
- Wise, M. J. (2001). "0j.py: a software tool for low complexity proteins and protein domains". Bioinformatics. 17 (Suppl 1): S288–295. doi:10.1093/bioinformatics/17.suppl_1.s288. PMID 11473020. ISSN 1367-4803.
- Pellegrini, Marco; Renda, Maria Elena; Vecchio, Alessio (2012-03-21). "Ab initio detection of fuzzy amino acid tandem repeats in protein sequences". BMC Bioinformatics. 13 (3): S8. doi:10.1186/1471-2105-13-S3-S8. PMID 22536906. PMC 3402919. ISSN 1471-2105.
- Richard, François D.; Kajava, Andrey V. (2014-06-01). "TRDistiller: A rapid filter for enrichment of sequence datasets with proteins containing tandem repeats". Journal of Structural Biology. 186 (3): 386–391. doi:10.1016/j.jsb.2014.03.013. PMID 24681324. ISSN 1047-8477.
- George, Richard A.; Heringa, Jaap (October 2000). "The REPRO server: finding protein internal sequence repeats through the Web". Trends in Biochemical Sciences. 25 (10): 515–517. doi:10.1016/s0968-0004(00)01643-1. PMID 11203383. ISSN 0968-0004.
- Do Viet, Phuong; Roche, Daniel B.; Kajava, Andrey V. (2015-09-14). "TAPO: A combined method for the identification of tandem repeats in protein structures". FEBS Letters. 589 (19 Pt A): 2611–2619. doi:10.1016/j.febslet.2015.08.025. PMID 26320412. ISSN 1873-3468.
- Tai, Chin-Hsien; Paul, Rohit; KC, Dukka; Shilling, Jeffery D.; Lee, Byungkook (2014-07-01). "SymD webserver: a platform for detecting internally symmetric protein structures". Nucleic Acids Research. 42 (Web Server issue): W296–W300. doi:10.1093/nar/gku364. PMID 24799435. PMC 4086132. ISSN 0305-1048.
- Walsh, Ian; Sirocco, Francesco G.; Minervini, Giovanni; Di Domenico, Tomás; Ferrari, Carlo; Tosatto, Silvio C. E. (2012-09-08). "RAPHAEL: recognition, periodicity and insertion assignment of solenoid protein structures". Bioinformatics. 28 (24): 3257–3264. doi:10.1093/bioinformatics/bts550. PMID 22962341. ISSN 1460-2059.