Structural and physical properties of DNA place important constraints on the binding sites formed on the surfaces of DNA-binding proteins. The characteristics of such binding sites may be used to predict DNA-binding sites from the structural, and even the sequence, properties of unbound proteins. A similar approach has been implemented successfully for predicting protein–protein interfaces; here, it is adopted for predicting DNA-binding sites in DNA-binding proteins. The first attempts to use sequence and evolutionary features to predict DNA-binding sites in proteins were made by Ahmad et al. (2004) and Ahmad and Sarai (2005). Some methods use structural information and therefore require a three-dimensional structure of the protein, while others use only sequence information and can make a prediction without a protein structure.
Structure- and sequence-based prediction of DNA-binding sites in DNA-binding proteins can be performed on the web servers listed below.
DISIS predicts DNA-binding sites directly from the amino acid sequence and hence is applicable to all known proteins. It is based on the physicochemical properties of each residue and its environment, predicted structural features, and evolutionary data, and it uses machine learning algorithms.[1]
DISIS2 receives the raw amino acid sequence and generates all features from it, including secondary structure, solvent accessibility, disorder, B-value, protein-protein interaction, coiled coils, and evolutionary profiles. The number of predicted features is much larger than in DISIS (the previous version). From these features, DISIS2 predicts the DNA-binding residues of DNA-binding proteins.
DNABindR predicts DNA-binding sites from amino acid sequences using machine learning algorithms.[2]
DISPLAR makes a prediction based on properties of the protein structure. Knowledge of the protein structure is required.[3]
BindN makes a prediction based on chemical properties of the input protein sequence. Knowledge of the protein structure is not required.[4]
BindN+ is an upgraded version of BindN that applies support vector machines (SVMs) to sequence-based prediction of DNA- or RNA-binding residues from biochemical features and evolutionary information.[5]
DP-Bind combines multiple methods to make a consensus prediction based on the profile of evolutionary conservation and properties of the input protein sequence. The profile of evolutionary conservation is generated automatically by the web server. Knowledge of the protein structure is not required.[6]
DBS-PSSM[7] and DBS-Pred[8] predict the DNA-binding sites in a protein from its sequence information.
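The sequence-based servers above describe each residue by properties of the residue and its sequence neighbours. A minimal sketch of that general idea is given below; it is not the implementation of any listed server. It assumes a sliding window of five residues, the published Kyte-Doolittle hydropathy scale, and a simplified integer charge per side chain (histidine treated as neutral). The function names `window_features`, `encode_sequence`, and `candidate_positions` are hypothetical, and the final charge threshold is a toy heuristic motivated only by the negative charge of the DNA backbone, standing in for the trained classifiers that the real tools use.

```python
# Kyte-Doolittle hydropathy scale (published values).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: simplified side-chain charge at neutral pH; His counted as 0.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def window_features(sequence, center, half_width=2):
    """Return (net charge, mean hydropathy) for the window around `center`.

    The window is truncated at the sequence ends. Residues missing from
    the lookup tables contribute 0 to both features.
    """
    start = max(0, center - half_width)
    end = min(len(sequence), center + half_width + 1)
    window = sequence[start:end]
    net_charge = sum(CHARGE.get(aa, 0) for aa in window)
    mean_hydropathy = sum(HYDROPATHY.get(aa, 0.0) for aa in window) / len(window)
    return net_charge, mean_hydropathy


def encode_sequence(sequence, half_width=2):
    """Encode every residue of a one-letter sequence as a feature tuple."""
    sequence = sequence.upper()
    return [window_features(sequence, i, half_width) for i in range(len(sequence))]


def candidate_positions(sequence, half_width=2, min_charge=2):
    """Toy heuristic, not a trained model: flag residues (0-based indices)
    whose surrounding window carries a net positive charge."""
    features = encode_sequence(sequence, half_width)
    return [i for i, (charge, _) in enumerate(features) if charge >= min_charge]


if __name__ == "__main__":
    # Made-up sequence with a basic patch (KRK) and an acidic patch (DED).
    print(candidate_positions("GGKRKGGDEDGG"))  # → [1, 2, 3, 4]
```

In a real predictor, the per-residue feature tuples would be extended with evolutionary profiles and predicted structural features, and the threshold rule would be replaced by a classifier trained on known protein-DNA complexes.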