Electronic Thesis and Dissertation Repository

Thesis Format

Monograph

Degree

Doctor of Philosophy

Program

Computer Science

Supervisor

Ilie, Lucian

Abstract

Proteins are essential components of cellular processes, playing pivotal roles in various biological functions. Their interactions with other proteins, referred to as protein-protein interactions (PPIs), are fundamental for maintaining cellular functionality. Within a PPI, the specific amino acid residues responsible for bonding, termed protein interaction sites, are crucial. Identifying PPIs and protein interaction sites is a significant challenge in systems biology, as traditional experimental methods are resource-intensive and time-consuming. Thus, enhancing the efficacy of computational methods becomes imperative.

Computational approaches typically involve profiling proteins or their constituent amino acids to make predictions. Unsupervised methods, such as those employing embeddings, offer avenues for profiling without relying on expert-defined features. Following profiling, the development of robust classifiers becomes pivotal for improving prediction accuracy.

This thesis introduces novel methods and techniques for improving PPI prediction and protein interaction site identification. It offers an overview of proteins and their interactions, coupled with an exploration of deep learning architectures. It then examines current state-of-the-art methods in detail, including PITHIA and Seq-InSite, both developed as part of this research, and underscores their efficacy in protein interaction site prediction. Finally, the thesis introduces a novel algorithm named C3PI, which demonstrates superior performance in PPI prediction compared to existing models.

Summary for Lay Audience

Proteins are vital components of cellular processes, playing key roles in various biological functions. Their interactions with other proteins, known as protein-protein interactions (PPIs), are essential for maintaining cellular functionality. Within these interactions, certain amino acid residues, called protein interaction sites, are crucial for bonding. Identifying PPIs and these interaction sites is a significant challenge in systems biology because traditional experimental methods are resource-intensive and time-consuming. Therefore, improving computational methods is critical.

Computational approaches typically involve profiling proteins or their amino acids to make predictions. Unsupervised methods, like those using embeddings, allow for profiling without needing expert-defined features. After profiling, developing strong classifiers is essential for improving prediction accuracy.

This thesis introduces new methods and techniques to enhance predictions in PPI and protein interaction site identification. It provides an overview of proteins and their interactions, along with an exploration of deep learning architectures. The thesis examines current state-of-the-art methodologies, such as PITHIA and Seq-InSite, developed during this research, highlighting their effectiveness in predicting protein interaction sites. Additionally, it introduces a new algorithm named C3PI, which shows superior performance in predicting PPIs compared to existing models.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License

Available for download on Friday, August 21, 2026
