Introduction: Head and neck squamous cell carcinoma (HNSCC) is primarily treated with surgery. This surgery is guided by a pathologist, who intraoperatively examines removed tissue for cancer and dysplasia (precancerous epithelial tissue). Because dysplasia can be difficult to detect, it is sometimes not removed, which may result in HNSCC recurrence. There is therefore a great need to detect dysplasia more accurately. Machine learning (ML; the use of algorithms to train mathematical models) has been successfully applied to other medical detection problems, making it an attractive approach for this task. In this study, we aim to build and evaluate a tool based on a convolutional neural network (CNN; a type of ML model) to detect dysplasia on HNSCC pathology slides.

Methods: Pathologist-contoured, digitized frozen section slides from seventeen HNSCC surgeries were preprocessed and tiled using MATLAB and the Groovy programming language. In Python, the slides were used to train, validate, and optimize a VGG16 CNN in a transfer learning approach. Model testing was reserved for future work. The tool was evaluated with quantitative performance metrics and with binary heatmaps integrated into the digital pathology software QuPath.

Results: The model’s accuracy, sensitivity, specificity, and positive predictive value (PPV) in validation were 83%, 74%, 83%, and 1.3%, respectively. Validation area under the curve (AUC) was 0.84. Qualitative comparison of the validation heatmaps with corresponding pathologist annotations revealed correct detection of most dysplasia but abundant false positive detection of nondysplastic epithelial tissue.
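
For readers less familiar with these metrics, the following minimal Python sketch shows how accuracy, sensitivity, specificity, and PPV are conventionally derived from confusion-matrix counts. The function name `confusion_metrics` and the tile counts are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return standard binary classification metrics from confusion-matrix counts.

    tp/fn: dysplastic tiles classified correctly / missed
    tn/fp: nondysplastic tiles classified correctly / flagged as dysplastic
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # fraction of true dysplasia detected
        "specificity": tn / (tn + fp),  # fraction of nondysplastic tissue cleared
        "ppv": tp / (tp + fp),          # fraction of positive calls that are correct
    }


# Hypothetical, heavily imbalanced counts (NOT taken from the study).
metrics = confusion_metrics(tp=74, fp=1700, tn=8300, fn=26)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because PPV divides true positives by all positive calls, a large pool of negative tiles can produce many false positives and a low PPV even when sensitivity and specificity look reasonable.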

Conclusions: Low PPV and frequent false positives on the heatmaps suggest that the current tool struggles to discriminate between dysplasia and normal tissue, making it inappropriate for clinical use. The poor model performance may be explained by model limitations, small tile size, and substantial class imbalance. Encouragingly, much of the nondysplastic epithelium classified by the tool as dysplastic had some dysplasia-like characteristics, suggesting that the model identifies some pathologically meaningful features. Future work may seek to improve model performance by applying a precursor model to screen out non-epithelial cells, thereby rebalancing the classes. This work represents the first steps towards building a novel ML-based model to detect dysplasia on HNSCC surgery slides. If a model of this type can be improved, it could be used by pathologists to detect dysplasia more easily and accurately during HNSCC surgery, which would in turn increase the efficacy of this treatment.
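
The link between class imbalance and low PPV can be illustrated with the standard relationship between PPV, sensitivity, specificity, and the prevalence of the positive class. The sketch below is an illustration only: the function name `ppv_from_rates` and the prevalence values are assumptions, and it presumes that the reported sensitivity and specificity would stay fixed as the class balance changes, which need not hold in practice.

```python
def ppv_from_rates(sensitivity, specificity, prevalence):
    """Positive predictive value implied by Bayes' rule for a binary classifier."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Sensitivity and specificity as reported for validation; the prevalence
# values are hypothetical and are not figures from the study.
for prevalence in (0.003, 0.03, 0.3):
    ppv = ppv_from_rates(sensitivity=0.74, specificity=0.83, prevalence=prevalence)
    print(f"prevalence={prevalence:.3f} -> PPV={ppv:.3f}")
```

Under these assumptions, a PPV near 1.3% corresponds to dysplastic tiles making up roughly 0.3% of the data, which is an inference from the formula and not a number reported by the study. The same formula shows why screening out non-epithelial tiles, and so raising the share of dysplastic tiles, would be expected to raise PPV.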