Predicting protein-protein interactions from primary structure
Master of Science
One of the key challenges in the post-genomic era is to understand protein-protein interactions on a large scale. Given the primary structures of proteins and ligands, along with other information, how well can we computationally predict protein-protein interaction networks? We train Naive Bayes classifiers to distinguish positive from negative examples of protein-ligand interactions. Such a predictive model can screen large numbers of potential ligands, saving laboratory time and costs. We demonstrate our approach by predicting interactions between SH3 domains and proline-rich ligands. Using laboratory data, we construct positive and negative examples, learn Naive Bayes models of the ligand binding specificity of eight diverse SH3 domains, and visualize our models using an information-theoretic measure to reveal potential binding sites. We use our classifiers to screen PxxP ligands from SwissProt for the given SH3 domains and demonstrate improvements over existing predictors. To validate our method, we predict a computational interaction network and intersect it with an experimental yeast two-hybrid network, using the methodology and data from Tong et al. [TDN+02]. Our technique produces results comparable to those of Tong et al., even without incorporating their consensus sequences.
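As a rough illustration of the classification step described above, the following minimal sketch trains a position-specific Naive Bayes model on fixed-length peptides and scores a candidate by its log-odds of binding. It is not the thesis implementation: the feature choice (one amino acid per aligned position), the Laplace pseudocount, the function names, and the toy peptides are all assumptions made for this example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def train_position_model(peptides, pseudocount=1.0):
    """Estimate log P(residue | position) from aligned, equal-length peptides.

    Laplace smoothing (an assumption of this sketch) keeps unseen residues
    from receiving zero probability.
    """
    length = len(peptides[0])
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    model = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        model.append({aa: math.log((counts[aa] + pseudocount) / total)
                      for aa in AMINO_ACIDS})
    return model

def log_odds(peptide, pos_model, neg_model, log_prior_ratio=0.0):
    """Naive Bayes log-odds: positions are treated as conditionally independent."""
    score = log_prior_ratio
    for pos, aa in enumerate(peptide):
        score += pos_model[pos][aa] - neg_model[pos][aa]
    return score

# Hypothetical toy data, not taken from the thesis or from Tong et al.
positives = ["RPLPPLP", "RPLPALP", "KPLPPLP"]
negatives = ["AGSTGSA", "GGSAGTA", "ASGTGSG"]

pos_model = train_position_model(positives)
neg_model = train_position_model(negatives)

# A candidate is called a binder when its log-odds score is positive.
is_binder = log_odds("RPLPPLP", pos_model, neg_model) > 0
```

Screening a ligand database under these assumptions amounts to computing `log_odds` for each candidate peptide and ranking by score.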
Biochemistry; Computer science