Electronic Thesis and Dissertation Repository

Using Machine Learning Models to Address Challenges in Lung Cancer Care

Salma Dammak, Western University

Abstract

Lung cancer is characterized by its aggressiveness, heterogeneity, and wide array of treatments. Choosing the best treatment requires extensive patient information, which is sometimes incomplete, and this creates several clinical challenges. This thesis addresses two such challenges using three machine-learning-based models.

The first challenge is identifying lung squamous cell carcinoma (SqCC) patients who could benefit from immunotherapy based on their tumour mutational burden (TMB). TMB is a promising biomarker, but it is often unavailable in routine clinical practice. To address this, we developed a model that predicts TMB from standard-of-care tumour excision slides. Using 50 slides from 35 centres, we found that a VGG16 model achieved an area under the receiver operating characteristic curve (AUC) of 0.65, demonstrating that TMB status can be inferred from tissue morphology.

A crucial aspect of this study was the need to measure cancer tissue content within tiles from the tumour resections, which requires labour-intensive manual contouring by expert pathologists. To automate this, we developed a separate VGG16-based model using 116 scans of lung SqCC tumour excisions from 35 centres. The model had a median regression error of 4% with a standard deviation of 36%, and an AUC of 0.83 at a 50% cancer content threshold. Automating this step allows TMB prediction models to be scaled up, making them more clinically applicable.
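As a rough illustration of how the regression error and the thresholded AUC reported above relate to each other, the sketch below evaluates predicted cancer content against reference values. It is not the thesis code: the function names, the use of signed error (predicted minus reference), the inclusive `>=` threshold, and the example values are all assumptions made for illustration.

```python
from statistics import median, stdev


def auc_from_scores(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as half), i.e. the Mann-Whitney U formulation."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate_cancer_content(true_frac, pred_frac, threshold=0.5):
    """Summarize regression error and AUC for per-tile cancer content.

    Assumption: error is signed (predicted - reference), and a tile counts
    as "cancer-rich" when its reference content is >= threshold.
    """
    errors = [p - t for t, p in zip(true_frac, pred_frac)]
    labels = [t >= threshold for t in true_frac]
    return {
        "median_error": median(errors),
        "error_sd": stdev(errors),
        # The continuous prediction itself serves as the ranking score.
        "auc": auc_from_scores(labels, pred_frac),
    }


# Made-up values for illustration only; not data from the thesis.
true_frac = [0.10, 0.80, 0.45, 0.95, 0.30, 0.60]
pred_frac = [0.20, 0.70, 0.55, 0.90, 0.25, 0.40]
result = evaluate_cancer_content(true_frac, pred_frac)
print(result)
```

The median error describes typical bias, the standard deviation describes spread, and the AUC describes how well the predictions rank tiles on either side of the threshold, so a model can have a small median error and a large spread at the same time.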

The second clinical challenge is distinguishing benign radiation-induced lung injury (RILI) from tumour recurrence following stereotactic ablative radiotherapy (SABR). SABR is highly effective for early-stage inoperable lung cancer but often leads to RILI, which appears similar to tumour recurrence on CT scans. Accurate differentiation is critical for deciding whether invasive testing or salvage therapy is necessary. Using CT scans from 68 patients showing lesion growth post-SABR, bootstrapped experiments with a random forest classifier yielded an average AUC of 0.66. Notably, the features deemed important by the model were correlated with clinical outcomes, marking an important advancement in the non-invasive distinction of RILI from recurrence.
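One common way to run a bootstrapped experiment of this kind is to train on a resample drawn with replacement, test on the cases left out of that resample, and average the AUC over many rounds. The sketch below shows that loop under stated assumptions: the abstract does not describe the exact protocol, so the out-of-bag evaluation, the round count, and all function names are hypothetical. To keep the example dependency-free, a simple class-mean-difference scorer stands in for the random forest; a real experiment would plug in a random forest (for example scikit-learn's `RandomForestClassifier`) through the `fit` argument. The data are synthetic, not patient data.

```python
import random


def auc_from_scores(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(features, labels, fit, n_rounds=200, seed=0):
    """Average out-of-bag AUC over bootstrap resamples.

    `fit(train_x, train_y)` must return a function mapping one feature
    vector to a score (higher means more likely positive).
    """
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    for _ in range(n_rounds):
        train_idx = [rng.randrange(n) for _ in range(n)]  # with replacement
        in_bag = set(train_idx)
        test_idx = [i for i in range(n) if i not in in_bag]  # out-of-bag
        train_y = [labels[i] for i in train_idx]
        test_y = [labels[i] for i in test_idx]
        # Skip rounds where either split lacks one of the two classes.
        if len(set(train_y)) < 2 or len(set(test_y)) < 2:
            continue
        score = fit([features[i] for i in train_idx], train_y)
        test_scores = [score(features[i]) for i in test_idx]
        aucs.append(auc_from_scores(test_y, test_scores))
    if not aucs:
        raise ValueError("no usable bootstrap rounds")
    return sum(aucs) / len(aucs), len(aucs)


def fit_mean_difference(train_x, train_y):
    """Stand-in model (NOT a random forest): score a case by projecting it
    onto the difference between the two class means."""
    dim = len(train_x[0])

    def class_mean(target):
        rows = [x for x, y in zip(train_x, train_y) if y == target]
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    direction = [a - b for a, b in zip(class_mean(1), class_mean(0))]
    return lambda x: sum(d * v for d, v in zip(direction, x))


# Synthetic cohort of 68 cases with two hypothetical image features.
data_rng = random.Random(1)
labels = [i % 2 for i in range(68)]
features = [[data_rng.gauss(0.8 * y, 1.0), data_rng.gauss(0.0, 1.0)]
            for y in labels]
mean_auc, rounds_used = bootstrap_auc(features, labels, fit_mean_difference)
print(f"mean out-of-bag AUC over {rounds_used} rounds: {mean_auc:.2f}")
```

Evaluating only on out-of-bag cases keeps each test set disjoint from its training set, which matters with a cohort this small because reusing training cases for testing would inflate the AUC.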

These studies highlight the potential of machine learning to address critical clinical challenges in lung cancer care, particularly in the context of novel treatments like immunotherapy and SABR.