
Protein-Protein Interaction Prediction
Abstract
Proteins are essential components of cellular processes and play pivotal roles in many biological functions. Their interactions with other proteins, referred to as protein-protein interactions (PPIs), are fundamental to maintaining cellular functionality. Within a PPI, the specific amino acid residues responsible for binding, termed protein interaction sites, are crucial. Identifying PPIs and protein interaction sites is a significant challenge in systems biology, as traditional experimental methods are resource-intensive and time-consuming. Improving the efficacy of computational methods is therefore imperative.
Computational approaches typically profile proteins or their constituent amino acids and then make predictions from those profiles. Unsupervised methods, such as those employing embeddings, allow profiling without relying on expert-defined features. Once profiling is done, developing robust classifiers becomes the key to improving prediction accuracy.
This thesis introduces novel methods and techniques aimed at improving the prediction of PPIs and protein interaction sites. It offers an overview of proteins and their interactions, together with an exploration of deep learning architectures. It then examines in detail the current state-of-the-art methods for protein interaction site prediction, including PITHIA and Seq-InSite, both of which were developed as part of this research, and underscores their efficacy. Furthermore, the thesis introduces a novel algorithm named C3PI, which demonstrates superior performance in PPI prediction compared to existing models.