Electronic Thesis and Dissertation Repository

Thesis Format

Integrated Article


Degree

Master of Science

Program

Applied Mathematics

Supervisor

Karttunen, Mikko

2nd Supervisor

Choy, Wing-Yiu

Abstract



Intrinsically disordered proteins (IDPs) are known not only for their roles in disease but also for their conformational flexibility, which makes them difficult to characterize experimentally. We consider the role of theory and simulation in resolving important questions about IDP structure and dynamics, as well as the nature of the charged residues (e.g., glutamate and lysine) in which they are enriched. Specifically, we investigated how the deep-learning-based AlphaFold2 (AF2) predictor estimates disorder content, revealing both strong performance relative to conventional approaches and an important relationship between the AF2 confidence metric and IDP dynamics. We also assessed how well modern molecular dynamics (MD) simulations reproduce the ensembles of two highly charged peptides at various protonation states and of two model IDPs for which new experimental data are available. Our results revealed notable performance discrepancies and the impact of a new Amber force field variant on the resultant structures. The charged residues enriched in IDPs are protonatable; depending on their pKa values, they are protonated or deprotonated at a given solution pH. We considered how MD simulations, combined with non-equilibrium free energy methods and theory, could be used to compute coupled and uncoupled pKa values for more than 140 amino acid residues spanning 13 proteins. We achieved performance that matched or exceeded that of several state-of-the-art alternative approaches.

Summary for Lay Audience

Disordered proteins are implicated in a variety of neurodegenerative diseases and cancers. These proteins are often enriched with charged amino acids, which can impart a high degree of flexibility and allow them to interact with a large number of protein partners. The probability that these amino acids are charged depends on the pH of their environment. Specifically, as the environment becomes more or less acidic, these amino acids may become neutral, which changes how the protein moves. Moreover, every charged amino acid has an associated numeric constant that determines whether it will be charged at a specific pH; this constant differs depending on the particular protein in which the amino acid is found. While experimental tools can help answer questions about protein dynamics and the nature of these "charge constants", such insights are often not readily accessible, and computational methods must be used instead. Here, we investigated two main questions: 1) can disordered proteins be accurately modelled using computational approaches at different pH values? and 2) can the "charge constants" associated with amino acids be reliably predicted using molecular simulation? Importantly, we found that the answer to both is yes.
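
The relationship between an amino acid's "charge constant" (its pKa) and the surrounding pH can be illustrated with the Henderson-Hasselbalch relation. This is a standard textbook result for a single, isolated titratable group, not the simulation-based method used in this work. The sketch below is a minimal illustration only: the function name `protonated_fraction` and the example pKa values (approximate textbook values for free side chains) are assumptions introduced here and are not results from the thesis.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of an isolated titratable group that is protonated at a given pH.

    Uses the Henderson-Hasselbalch relation: 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Approximate textbook side-chain pKa values, assumed for illustration only.
EXAMPLE_PKA = {"glutamate": 4.2, "lysine": 10.5}

for name, pka in EXAMPLE_PKA.items():
    for ph in (2.0, 7.0, 12.0):
        fraction = protonated_fraction(pka, ph)
        print(f"{name:9s} pH {ph:4.1f}: {fraction:.3f} protonated")
```

Note that protonation has opposite consequences for the two examples: a protonated glutamate side chain is neutral, whereas a protonated lysine side chain carries a positive charge. Inside a real protein the pKa shifts with the local environment, which is why it has to be measured or computed for each residue.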