Electronic Thesis and Dissertation Repository

Thesis Format

Monograph

Degree

Doctor of Philosophy

Program

Computer Science

Supervisor

Daley, Mark

Abstract

This thesis aims to examine case studies of data representation in artificial intelligence in order to generate insights regarding model behavior and efficacy. Our first case study concerns neural networks and presents results detailing the concept of their mathematical equivalence. We demonstrate that the class of networks equivalent to a given feedforward neural network with piecewise linear activation functions can be represented as a semi-algebraic set on the network coefficients. The second major exploration of data representation utilizes interdisciplinary techniques in order to qualitatively describe natural and synthetic hierarchies in the image domain. We find that our approach predicts model performance in domain generalization (DG) tasks. The core thread throughout this work is a philosophy that emphasizes taking new vantages on popular AI data representations and leveraging the resulting insights in order to lay the groundwork for new metrics of learning.
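As a small illustration of what network equivalence can mean, the sketch below uses a well-known symmetry of ReLU networks: multiplying one hidden unit's incoming weights and bias by a positive constant and dividing its outgoing weights by the same constant leaves the computed function unchanged. This is not the construction used in the thesis; the layer sizes, the random weights, and the names `forward` and `rescale_hidden_unit` are assumptions chosen only for the example.

```python
import random


def relu(z):
    """Piecewise linear activation: max(0, z)."""
    return z if z > 0.0 else 0.0


def forward(W1, b1, W2, b2, x):
    """One-hidden-layer ReLU network: W2 * relu(W1 * x + b1) + b2."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]


def rescale_hidden_unit(W1, b1, W2, unit, c):
    """Return new coefficients with hidden unit `unit` rescaled by c > 0.

    Incoming weights and bias are multiplied by c, outgoing weights are
    divided by c. Because relu(c * z) == c * relu(z) for c > 0, the
    network function is unchanged.
    """
    if c <= 0:
        raise ValueError("c must be positive for the ReLU scaling symmetry")
    W1s = [[w * c for w in row] if i == unit else list(row)
           for i, row in enumerate(W1)]
    b1s = [b * c if i == unit else b for i, b in enumerate(b1)]
    W2s = [[w / c if j == unit else w for j, w in enumerate(row)]
           for row in W2]
    return W1s, b1s, W2s


# Hypothetical network: 3 inputs, 4 hidden units, 2 outputs.
rng = random.Random(0)
W1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
b1 = [rng.uniform(-1, 1) for _ in range(4)]
W2 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
b2 = [rng.uniform(-1, 1) for _ in range(2)]

# A second coefficient vector that differs from the first.
W1s, b1s, W2s = rescale_hidden_unit(W1, b1, W2, unit=1, c=2.0)

# Compare the two networks on random inputs.
gap = 0.0
for _ in range(200):
    x = [rng.uniform(-5, 5) for _ in range(3)]
    y = forward(W1, b1, W2, b2, x)
    ys = forward(W1s, b1s, W2s, b2, x)
    gap = max(gap, max(abs(a - b) for a, b in zip(y, ys)))

print("largest output difference:", gap)
```

The two coefficient vectors are different points in parameter space that define the same function. Note that the constraint `c > 0` is a polynomial inequality on the coefficients, which gives a rough sense of why conditions of this kind fit naturally into a semi-algebraic description.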

Summary for Lay Audience

When developing an artificial intelligence (AI) protocol, selecting data representations is an important process. The representations we choose for our data are often influenced by the properties we gather from the data. It is equally true that selecting a data representation will guide our understanding of the data itself. Neural networks, some of our most prominent tools, are often considered a ‘black box’ from certain scientific vantages. Similarly, with regard to image domains, studies in the field of domain generalization have noted unexplained asymmetries. Through an investigation of these two categories with the tools of data representation, this research aims to expand the theoretical foundation of novel metrics for evaluating AI models and understanding the data on which they are trained.
