Thesis Format
Integrated Article
Degree
Doctor of Philosophy
Program
Biochemistry
Supervisor
David R. Edgell
2nd Supervisor
Greg B. Gloor
Co-Supervisor
Abstract
CRISPR systems are used for strain-specific bacterial elimination and to enhance bacterial recombineering outcomes. Their effectiveness depends on reliably generating targeted DNA breaks at intended sites using Cas9 directed by an sgRNA. However, not all Cas9/sgRNA combinations lead to the same degree of cleavage. Many groups have collected datasets to analyze Cas9/sgRNA cleavage activity in eukaryotic organisms, but cleavage datasets for bacteria are limited and largely test only a single Cas9 orthologue. Moreover, prediction models trained on these data do not generalize to activities measured in other assays, or to bacteria other than those in which the data was collected. To overcome these problems, I generate a number of high-quality cleavage datasets for pools of sgRNAs using enrichment and depletion experimental setups to identify the sgRNA cleavage landscape for (Tev)SpCas9 and (Tev)SaCas9 in bacteria. Activities measured using enrichment experiments were extensively validated by assaying sgRNAs individually. Cleavage activities for identical sgRNAs measured by enrichment and depletion setups are highly correlated, suggesting congruence between the different measurement modalities. I also identify toxic sgRNA phenotypes that are related to the number and position of mismatches to chromosomal DNA. I tested sgRNA pools containing mismatches relative to their targets, identifying off-target cleavage as one potential mechanism of sgRNA-induced toxicity while simultaneously providing position-dependent cleavage information for model training. The machine learning models crisprHAL and crisprHAL2.0, trained on TevSpCas9 and TevSaCas9 datasets, produce accurate predictions that generalize to relevant organisms such as S. enterica and C. rodentium. I also identify the importance of nucleotides downstream of the PAM sequence for cleavage activity and model predictions.
The models produced show marked increases in predictive accuracy compared to previous models, indicating that the quality of training data is critical for accurate and generalizable performance. The data collected in this thesis helps to further our understanding of the sgRNA requirements for reliable cleavage in bacteria by orthogonal Cas9 enzymes.
Summary for Lay Audience
Gene editing can be likened to scrapbooking, where the placement of the cuts and the scissors you use ultimately define the picture that is created. This simile helps to describe CRISPR systems, where sgRNA/Cas9 complexes act as molecular scissors that cut and edit DNA the same way we would edit a picture. Moreover, native microbiomes composed of diverse bacterial communities that inhabit environments such as the human gut resemble collages of pictures. In some instances, we would like to remove portions of the collage while maintaining the integrity of the rest. This is what happens when using CRISPR systems to selectively eliminate pathogenic bacteria from these complex microbial environments. One problem is that not all scissors (sgRNA/Cas9 combinations) are equally effective at removing the desired portions of the collage (pathogenic bacteria). To overcome this, I collect a large amount of data testing different sgRNA/Cas9 combinations and their effectiveness at generating the cuts we intended. Some combinations are toxic and would remove unintended segments of the collage. We use this data to train a machine learning model that helps us select the best scissors and cut sites. By comparison, this is as if your grandmother (a scrapbooking expert) were there to tell you which scissors to use (likely ones that make frilly edges) and where to cut. Here, the data I collect is like the vast experience your grandmother has accumulated over her life. Together, the data collected helps us understand how proficiently various molecular scissors cleave and furthers our understanding of what is necessary to make the changes we intend.
Recommended Citation
Ham, Dalton T., "Analyzing sgRNA Cleavage Activities for SaCas9 and SpCas9 in Bacteria" (2024). Electronic Thesis and Dissertation Repository. 10554.
https://ir.lib.uwo.ca/etd/10554
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.