Electronic Thesis and Dissertation Repository

Thesis Format

Integrated Article

Degree

Doctor of Philosophy

Program

Epidemiology and Biostatistics/Computer Science

Supervisor

Lizotte, Daniel J

Abstract

This thesis was motivated by the potential to use "everyday data", especially data collected in electronic health records (EHRs) as part of healthcare delivery, to improve primary care for clients facing complex clinical and/or social situations. Artificial intelligence (AI) techniques can identify patterns in these data or use them to make predictions, producing information that can be used to learn about and inform care delivery. Our first objective was to understand and critique the body of literature on AI and primary care. We achieved this through a scoping review, which found that the field was at an early stage of maturity and primarily focused on clinical decision support for chronic conditions in high-income countries, with little primary care involvement and little evaluation of models in real-world settings.

Our second objective was to demonstrate how AI methods can be applied to problems in descriptive epidemiology. To achieve this, we collaborated with the Alliance for Healthier Communities, which provides team-based primary health care through Community Health Centres (CHCs) across Ontario to clients who experience barriers to regular care. We described sociodemographic, clinical, and healthcare use characteristics of their adult primary care population using EHR data from 2009 to 2019. We used both simple statistical and unsupervised learning techniques, applied with an epidemiological lens. In addition to substantive findings, we identified potential avenues for future learning initiatives, including the development of decision support tools, and the methodological considerations these involve.

Our third objective was to advance interpretable AI methodology that is well suited to heterogeneous data and applicable in clinical epidemiology as well as other settings. To achieve this, we developed a new hybrid feature- and similarity-based model for supervised learning. The model has two versions, each fit by convex optimization with a sparsity-inducing penalty on the kernel (similarity) portion of the model. We compared our hybrid models with solely feature-based and solely similarity-based approaches, using synthetic data and using CHC data to predict future loneliness or social isolation. We also proposed a new strategy for kernel construction with indicator-coded data.
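
As a rough illustration of what a hybrid feature- and similarity-based model with a sparsity-inducing penalty on the kernel portion might look like, the sketch below fits f(x) = b0 + x'beta + sum_i alpha_i k(x, x_i) by proximal gradient descent, with an L1 penalty on alpha only. It is not the thesis's method: the squared-error loss, the RBF kernel, the solver, and all function and parameter names (fit_hybrid, lam, gamma) are assumptions made for illustration, and the abstract does not specify either of the two model versions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Pairwise RBF similarities between rows of A and rows of B (assumed kernel)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(X, y, b0, beta, alpha, lam, gamma=1.0):
    """Convex objective: mean squared error / 2 plus an L1 penalty on alpha only."""
    resid = y - (b0 + X @ beta + rbf_kernel(X, X, gamma) @ alpha)
    return 0.5 * np.mean(resid ** 2) + lam * np.abs(alpha).sum()

def fit_hybrid(X, y, lam=0.1, gamma=1.0, n_iter=500):
    """Proximal gradient (ISTA) on the objective above.

    Feature coefficients (beta) and the intercept are unpenalized;
    only the kernel weights (alpha) are soft-thresholded.
    """
    n, p = X.shape
    K = rbf_kernel(X, X, gamma)
    Z = np.hstack([np.ones((n, 1)), X, K])   # intercept | features | kernel columns
    L = np.linalg.norm(Z, 2) ** 2 / n         # Lipschitz constant of the smooth part
    w = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        grad = Z.T @ (Z @ w - y) / n
        w = w - grad / L
        w[1 + p:] = soft_threshold(w[1 + p:], lam / L)
    return w[0], w[1:1 + p], w[1 + p:]

def predict(X_new, X_train, b0, beta, alpha, gamma=1.0):
    """Feature part plus similarity to the training points with nonzero alpha."""
    return b0 + X_new @ beta + rbf_kernel(X_new, X_train, gamma) @ alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 3))
    y = X[:, 0] + np.sin(X[:, 1]) + 0.1 * rng.normal(size=60)
    b0, beta, alpha = fit_hybrid(X, y, lam=0.05)
    print("feature coefficients:", np.round(beta, 3))
    print("training points with nonzero kernel weight:", int(np.count_nonzero(alpha)))
```

In a sketch like this, the feature coefficients can be read as in an ordinary linear model, while the nonzero kernel weights pick out a subset of training points whose similarity to a new case adjusts the prediction. A larger lam leaves fewer such points.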

Altogether, this thesis advanced AI for primary care, both in general and for a particular health care organization, while making research contributions to epidemiology and to computer science.

Summary for Lay Audience

This thesis was motivated by the potential to use "everyday data", meaning data generated through activities outside formal research settings, to improve primary care for clients facing complex clinical and/or social situations. Artificial intelligence (AI) and its subfield machine learning include techniques that can analyze these data and provide information to help guide care delivery, such as personalized treatment recommendations or risk estimates. In our first study we summarized the state of AI and primary care research, finding that the field was at an early stage of maturity, with knowledge gaps in how best to develop, implement, and evaluate AI for primary care.

Our second study was done in collaboration with the Alliance for Healthier Communities, which provides team-based primary health care through Community Health Centres (CHCs) across Ontario to clients who otherwise experience barriers to regular care. We performed a large-scale description of sociodemographic, clinical, and healthcare characteristics of their adult primary care clients from 2009 through 2019 to learn about this population and about areas where AI and decision support tools may be useful. We additionally identified methodological considerations for AI to work well in primary care settings. To accomplish this, we used both simple statistical techniques traditionally used in descriptive epidemiology and techniques from machine learning that can capture more complex patterns in the data. Our approach can be followed to improve population-level descriptions in other settings as well.

In our third study we developed new machine learning methods for analyzing large, diverse datasets, such as electronic health records from CHCs. We combined two existing techniques, feature learning and kernel learning, into a single hybrid model. We demonstrated how to interpret our models and use them for prediction and for epidemiological studies, using both synthetic data and a case study predicting social isolation and loneliness in the Alliance population. We also proposed a new way to capture similarity between clients, for use in the kernel part of our model, in terms of deviations from population-level expectations.

Altogether, this thesis advanced AI for primary care while making methodological contributions to the fields of epidemiology and computer science.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.