Electronic Thesis and Dissertation Repository

Thesis Format

Monograph

Degree

Master of Science

Program

Computer Science

Supervisor

Ling, Charles X.

Abstract

Pathologists often identify colon cancers by inspecting whole-slide images (WSIs), which are high-resolution scans of colon tissues extracted through colonoscopy. One specific case of colon cancer, named pseudoinvasion, is hardly differentiable from true invasion even under careful inspection by a panel of expert pathologists. Therefore, pathologists seek help from artificial intelligence. A type of deep learning model called the convolutional neural network (CNN) has been used extensively in image classification. Unfortunately, WSIs are too large and contain a rich amount of detailed information, making the direct use of CNNs on WSIs very slow and training them impossible without millions of training samples. However, the number of labelled WSIs for true/pseudo-invasion classification is limited, making the task extremely challenging.

Since it is almost impossible to classify WSIs directly using CNNs, we first identify the tissue types on the WSIs and then aggregate the results into a final output. We propose two multi-zoom-level patch-based methods for tissue type recognition and one method for aggregation. The first method focuses on accuracy: it identifies tissue types patch by patch at three different zoom levels using three CNNs, and then applies weighted averaging to combine the classification results. Our second method focuses on efficiency by classifying image patches at a low zoom level and then proceeding only to selected patches at higher zoom levels. Finally, we design a shallow CNN that aggregates the per-patch results of the two proposed tissue-type recognition methods into slide-level results for WSIs. Collaborating with pathologists, we collect a private dataset by identifying 150 WSIs and annotating 50 of them. We apply self-supervised learning on a public dataset and transfer the learned representations to our private dataset to improve the performance of our models under limited data.
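
The pipeline described above can be illustrated with a minimal sketch. The code below is a hypothetical simplification, not the thesis implementation: cnn_low, cnn_mid, and cnn_high stand in for the three per-zoom-level patch classifiers, the zoom-level weights are placeholder values, and the aggregator is any callable playing the role of the shallow slide-level CNN. It only shows how per-patch probabilities from three zoom levels could be combined by weighted averaging and then passed to a slide-level aggregator.

```python
import numpy as np

# Hypothetical sketch of the accuracy-oriented, multi-zoom-level tissue
# recognition and slide-level aggregation described above. The classifiers
# and weights below are illustrative assumptions, not values from the thesis.

def classify_patch(patch_pyramid, cnn_low, cnn_mid, cnn_high,
                   weights=(0.2, 0.3, 0.5)):
    """Combine tissue-type probabilities for one patch region seen at three zoom levels.

    patch_pyramid: dict holding the same region cropped at three zoom levels,
                   e.g. {"low": ..., "mid": ..., "high": ...}.
    Returns a single probability vector over tissue types.
    """
    probs = np.stack([
        cnn_low(patch_pyramid["low"]),    # coarse context
        cnn_mid(patch_pyramid["mid"]),    # intermediate detail
        cnn_high(patch_pyramid["high"]),  # fine cellular detail
    ])
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalise the zoom-level weights
    return np.average(probs, axis=0, weights=w)


def classify_slide(patch_pyramids, aggregator, **classifiers):
    """Aggregate per-patch tissue predictions into a slide-level decision.

    aggregator stands in for the shallow CNN: any callable mapping the set of
    per-patch probability vectors to a true/pseudo-invasion label.
    """
    patch_probs = np.stack([classify_patch(p, **classifiers)
                            for p in patch_pyramids])
    return aggregator(patch_probs)
```

The efficiency-oriented second method would replace the fixed three-level evaluation with a conditional one: patches are first classified at the low zoom level, and only selected regions are cropped and re-classified at higher zoom levels.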

Our experiments show that our methods can recognize tissue types with high accuracy and reasonable efficiency, and aggregate the results into a final true/pseudo-invasion classification with promising accuracy under limited data. We developed a web-based tool for the WSI true/pseudo-invasion classification task. The tool can be accessed at http://ai4path.ca/#/.

Summary for Lay Audience

Pathologists often identify colon cancers by inspecting whole-slide images (WSIs), which are high-resolution scans of colon tissues extracted through colonoscopy. One specific case of colon cancer, named pseudoinvasion, is hardly differentiable from true invasion even under careful inspection by a panel of expert pathologists. Therefore, pathologists seek help from artificial intelligence. Deep learning has been used extensively in image classification. Unfortunately, WSIs are too large and contain a rich amount of detailed information, making traditional deep learning methods impossible to use without millions of labelled images. However, the number of labelled WSIs for true/pseudo-invasion classification is limited, making the task extremely challenging.

Since it is almost impossible to classify WSIs directly with deep learning models, we first identify the tissue types on the WSIs and then aggregate the results into a final output. We propose two multi-zoom-level patch-based methods for tissue type recognition and one method for aggregation. The first method focuses on accuracy by classifying the entire WSI three times at different zoom levels. The second method focuses on efficiency by classifying only selected regions at each zoom level. We also collaborate with pathologists to collect a small private dataset for our task, and we apply transfer learning to improve the performance of our models.

Our methods achieved high accuracy with reasonable efficiency under a limited amount of data. We developed a web-based tool for the WSI true/pseudo-invasion classification task. The tool can be accessed at http://ai4path.ca/#/.

Creative Commons License

Creative Commons Attribution 4.0 License
This work is licensed under a Creative Commons Attribution 4.0 License.
