Electronic Thesis and Dissertation Repository

Thesis Format

Monograph

Degree

Master of Science

Program

Computer Science

Collaborative Specialization

Artificial Intelligence

Supervisor

Mohsenzadeh, Yalda

2nd Supervisor

Lizotte, Daniel

Abstract

Recent progress in contrastive representation learning has been shown to yield robust representations that can avoid catastrophic forgetting in continual learning tasks. Most of these methods avoid forgetting by limiting changes to the components of the deep neural network (DNN) that hold significant information about previously seen tasks. While these methods have been successful in preserving the aspects of learned parameters believed to be most relevant for distinguishing previous classes, the retained parameters may be overfitted to seen data, leading to poor generalization even though “forgetting” is avoided. Inspired by the modulation of early sensory neurons by top-down feedback projections from cortical neurons in perception and visual processing, we propose a class-incremental continual learning algorithm that identifies and attempts to preserve weights that contribute to the model performing well on new, unseen classes by assessing their generalizability on a small predictive batch drawn from the next episode of data. With experiments on popular image classification datasets, we demonstrate the effectiveness of the proposed approach. We also explain how using the model’s first encounter with new data to simulate a feedback signal for modulating the plasticity of weights provides more information for training than the loss value alone, and how it can guide the model’s learning through new experiences.

Summary for Lay Audience

Continual learning is the field concerned with training a neural network on a sequence of tasks, each defined by its own dataset. A major problem this field attempts to solve is catastrophic forgetting, in which a neural network's performance on previously learned tasks rapidly decreases as it learns new ones, in contrast to how humans learn. Previous work has made significant progress in producing neural networks whose output representations (a vector for each data sample, such as an image) are robust to forgetting. Most of these methods avoid forgetting by limiting changes to the components of the neural network that hold significant information about previously seen tasks. While these methods have been successful in preserving the aspects of learned parameters believed to be most relevant for distinguishing previous classes of data, the retained parameters may work well on the data they were trained on yet perform poorly on similar data they have not seen; that is, they may lack generalizability even though “forgetting” is avoided. Inspired by the modulation of early sensory neurons (those near the eyes) by top-down feedback from higher-level neurons in the brain during the processing of visual stimuli, this thesis proposes a continual learning algorithm that identifies and attempts to preserve neurons and connections that contribute to the model performing well on new, unseen classes by assessing their performance on a small subset of the next episode of data. Experiments on popular image classification datasets demonstrate the effectiveness of the proposed approach. The thesis also explains how using the model’s first encounter with new data to simulate a feedback signal that modulates how much neurons are allowed to change (plasticity) provides more information for training than using the loss value alone (the quantity used to train the network and an indicator of its performance), and how it can guide the model’s learning through new experiences.
