Electronic Thesis and Dissertation Repository

Protein Interaction Sites Prediction using Deep Learning

Sourajit Basak, The University of Western Ontario

Abstract

The accurate prediction of protein-protein interaction (PPI) binding sites is a fundamental problem in bioinformatics, since proteins usually perform their functions by interacting with other proteins. Experimental methods are slow, expensive, and not very accurate, hence the need for efficient computational methods.

In this thesis, we present a study aimed at improving the performance of DELPHI, currently the best program for binding site prediction. We employ some of the best current techniques from machine learning, including attention and embedding techniques such as BERT and ELMo. This is the first time such tools have been tested on this problem. We tested many architectures on a large dataset and analyzed our findings. Although we succeeded in improving the performance, it is interesting to note that some of the best machine learning techniques failed to provide the expected improvement, a fact that will require further investigation.