Electronic Thesis and Dissertation Repository

Thesis Format

Monograph

Degree

Master of Science

Program

Computer Science

Collaborative Specialization

Artificial Intelligence

Supervisor

Lizotte, Daniel J.

Abstract

Reinforcement learning (RL) has helped improve decision-making in several applications. However, applying traditional RL is challenging in some applications, such as rehabilitation of people with a spinal cord injury (SCI). Among other factors, using RL in this domain is difficult because there are many possible treatments (i.e., large action space) and few patients (i.e., limited training data). Treatments for SCIs have natural groupings, so we propose two approaches to grouping treatments so that an RL agent can learn effectively from limited data. One relies on domain knowledge of SCI rehabilitation and the other learns similarities among treatments using an embedding technique. We then use Fitted Q Iteration to train an agent that learns optimal treatments. Through a simulation study designed to reflect the properties of SCI rehabilitation, we find that both methods can help improve the treatment decisions of physiotherapists, but the approach based on domain knowledge offers better performance.
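As a rough illustration of how grouping treatments can interact with Fitted Q Iteration, the sketch below runs a tabular version of the algorithm in which actions share value estimates within a group. It is not the thesis implementation: the function name `fitted_q_iteration`, the `action_group` mapping, the discrete state encoding, the toy transitions, and the use of a per-cell mean as the regression step are all assumptions made for illustration.

```python
import numpy as np


def fitted_q_iteration(transitions, n_states, action_group, n_groups,
                       gamma=0.9, n_iters=50):
    """Tabular Fitted Q Iteration over (state, action group) pairs.

    transitions: list of (state, action, reward, next_state, done) tuples.
    action_group: hypothetical mapping from action index to group index.
    """
    s = np.array([t[0] for t in transitions])
    a = np.array([t[1] for t in transitions])
    r = np.array([t[2] for t in transitions], dtype=float)
    s_next = np.array([t[3] for t in transitions])
    done = np.array([t[4] for t in transitions], dtype=bool)
    g = np.asarray(action_group)[a]  # group of each logged action

    q = np.zeros((n_states, n_groups))
    for _ in range(n_iters):
        # Bellman targets computed from the current Q estimate.
        target = r + gamma * np.where(done, 0.0, q[s_next].max(axis=1))
        # Regression step. With one-hot (state, group) features, the
        # least-squares fit is the mean target in each cell (assumed
        # regressor; unseen cells are left at 0).
        sums = np.zeros_like(q)
        counts = np.zeros_like(q)
        np.add.at(sums, (s, g), target)
        np.add.at(counts, (s, g), 1.0)
        q = np.divide(sums, counts, out=np.zeros_like(q), where=counts > 0)
    return q


# Toy data (invented): 3 states, 4 treatments collapsed into 2 groups.
action_group = [0, 0, 1, 1]
transitions = [
    (0, 0, 0.0, 1, False),
    (0, 2, 0.0, 0, False),
    (1, 1, 1.0, 2, True),
    (1, 3, 0.0, 1, False),
]
q = fitted_q_iteration(transitions, n_states=3, action_group=action_group,
                       n_groups=2, gamma=0.5)
greedy_group = q.argmax(axis=1)  # best treatment group in each state
```

Because every treatment in a group updates the same table cell, a transition observed for one treatment also informs the estimate for its group-mates, which is the sense in which grouping can help when data are limited.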

Summary for Lay Audience

Reinforcement learning (RL) is a field of study that aims to build decision-making systems. These systems base their decisions on the current and past state of the world, the actions already taken, and the actions that may be taken in the future. RL has improved decision-making in several applications, but applying traditional RL remains challenging in some settings, such as the rehabilitation of people with a spinal cord injury (SCI). Among other factors, using RL to aid decision-making for SCI treatment is difficult because there are many possible treatments (i.e., a large action space) and few patients (i.e., a limited training dataset). However, SCI treatments have a structure that allows them to be grouped, which makes it possible to learn about a treatment even when that treatment was not selected. In this work, we propose two approaches to grouping treatments so that an RL agent can learn effectively from limited data. One relies on domain knowledge of SCI rehabilitation, and the other learns similarities among treatments using treatment embedding (inspired by word embedding). We then use Fitted Q Iteration, an iterative algorithm that estimates the value of each action in every patient state, to learn which treatments are best in each state. Patient states range, for example, from being unable to sit independently to having full walking capacity. Through a simulation study designed to reflect the properties of SCI rehabilitation, we find that both methods can be used to improve the treatment decisions of physiotherapists, but the approach based on domain knowledge offers better performance.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
