Description: | For predicting protein subcellular localizations |
Scope: | Data input: Protein sequence in FASTA format. Data output: Localization predictions in tab-delimited format. |
Center: | University of Alberta |
Laboratory: | David S. Wishart |
Citation: | [1] |
Released: | 2004 |
Url: | http://webdocs.cs.ualberta.ca/~bioinfo/ |
Frequency: | Last updated in 2014 |
Curation: | Manually curated |
Proteome Analyst (PA) is a freely available web server and online toolkit for predicting protein subcellular localization, that is, where a protein resides in a cell.[2] In the field of proteomics, accurately predicting a protein's subcellular localization is an important step in the large-scale study of proteins. This computational problem is known as protein subcellular localization prediction. Over the last decade, more than a dozen web servers and computer programs have been developed to address it, and Proteome Analyst is one of the better-performing tools. Proteome Analyst makes predictions for both prokaryotic and eukaryotic proteins using a text mining approach.[3] It was originally developed by the Proteome Analyst Research Group at the University of Alberta, was initially released in March 2004, and was last updated in January 2014.
Users submit a request to the Proteome Analyst web server by selecting the organism type and then uploading a text file containing the protein sequence in FASTA format. Proteome Analyst then uses BLAST to search the UniProt database for similar proteins that carry subcellular localization annotations. A machine-learned classifier analyzes the annotation text fields of the most similar proteins found in that search to make the final subcellular localization prediction. Users can view and download the results or ask Proteome Analyst to explain its predictions.
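The input and output formats involved in this workflow can be illustrated with a minimal Python sketch. It is not taken from Proteome Analyst: the function names `parse_fasta` and `format_predictions`, the two-column layout, and the header names are assumptions made for illustration only, and the BLAST and classification steps are left out.

```python
from typing import Dict, Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            fields = line[1:].split()
            # Use the first whitespace-delimited token as the identifier.
            header = fields[0] if fields else ""
            chunks = []
        elif header is not None:
            # Sequence data may span several lines; collect them all.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def format_predictions(predictions: Dict[str, str]) -> str:
    """Render {identifier: localization} as tab-delimited text.

    The column names are hypothetical, not Proteome Analyst's real output.
    """
    rows = ["protein_id\tpredicted_localization"]
    for protein_id, localization in predictions.items():
        rows.append(f"{protein_id}\t{localization}")
    return "\n".join(rows)
```

A caller would pass the uploaded file's contents to `parse_fasta`, run each sequence through whatever prediction step is available, and hand the resulting mapping to `format_predictions`.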
Proteome Analyst consists of more than 30,000 lines of Java code and can be deployed on a computer cluster, where it uses multiple CPUs to run faster. The initial release used a naïve Bayes classifier to make its predictions; the current version uses support vector machine classifiers. Proteome Analyst currently supports subcellular localization predictions for five organism types: three eukaryotic (animals, plants, and fungi) and two prokaryotic (Gram-positive and Gram-negative bacteria).
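To illustrate how a naïve Bayes classifier can assign a localization from annotation text, the following Python sketch trains a multinomial naïve Bayes model with add-one smoothing on a handful of keyword lists. It is a toy under stated assumptions, not Proteome Analyst's implementation: the function names, the keywords, and the training examples are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count labels and per-label tokens from (tokens, label) pairs."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in examples:
        label_counts[label] += 1
        token_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, token_counts, vocabulary


def predict(model, tokens):
    """Return the label with the highest log-posterior for the tokens."""
    label_counts, token_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior: fraction of training examples with this label.
        score = math.log(label_counts[label] / total)
        # Add-one (Laplace) smoothing over the known vocabulary.
        denominator = sum(token_counts[label].values()) + len(vocabulary)
        for token in tokens:
            if token in vocabulary:  # ignore tokens never seen in training
                score += math.log((token_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy training data: annotation keywords paired with a localization.
TRAINING = [
    (["dna-binding", "transcription"], "nucleus"),
    (["transcription", "nuclear"], "nucleus"),
    (["transmembrane", "transport"], "membrane"),
    (["transmembrane", "receptor"], "membrane"),
]

model = train_naive_bayes(TRAINING)
```

With this toy model, `predict(model, ["transcription", "dna-binding"])` favors the nucleus label because those keywords occur only in the nucleus examples. A support vector machine, as used in the current version, would replace the probability model with a learned decision boundary over the same kind of keyword features.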