Electronic Thesis and Dissertation Repository

Extensions of Classification Method Based on Quantiles

Yuanhao Lai, The University of Western Ontario

Abstract

This thesis addresses the problem of classification in general, with a particular focus on heavy-tailed or skewed data. The classification problem is first formalized within statistical learning theory, and several important classification methods are reviewed; among them, the distance-based classifiers, including the median-based classifier and the quantile-based classifier (QC), are especially well suited to heavy-tailed or skewed inputs. However, QC is limited by its model capacity and by the accumulation of errors in high dimensions. The objective of this study is to investigate more general methods while retaining the merits of QC.
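To fix ideas, the following is a minimal sketch of a quantile-based classifier in the spirit described above: each class is summarized by its per-variable empirical theta-quantiles, and a point is assigned to the class with the smallest total check (pinball) loss across variables. This is an illustrative reconstruction, not the thesis's own code; all names here are hypothetical. With theta = 0.5 it reduces to a median-based classifier under the L1 distance.

```python
import numpy as np

def check_loss(x, q, theta):
    # Check (pinball) loss: theta*(x-q) if x >= q, else (1-theta)*(q-x).
    u = x - q
    return np.where(u >= 0, theta * u, (theta - 1.0) * u)

class QuantileClassifier:
    """Illustrative sketch of a quantile-based classifier (QC)."""

    def __init__(self, theta=0.5):
        self.theta = theta

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # Class-by-variable matrix of empirical theta-quantiles.
        self.q_ = np.stack([np.quantile(X[y == c], self.theta, axis=0)
                            for c in self.classes_])
        return self

    def predict(self, X):
        # scores[i, k] = total check loss of sample i against class k.
        scores = np.stack([check_loss(X, qk, self.theta).sum(axis=1)
                           for qk in self.q_], axis=1)
        return self.classes_[np.argmin(scores, axis=1)]
```

Because the check loss depends on the data only through distances to quantiles, the classifier is insensitive to heavy tails and, for theta away from 0.5, can adapt to skewness in the class-conditional distributions.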

We present four extensions of QC, given in the chronological order that traces the development of our research. The first extension, the ensemble quantile classifier (EQC), treats QC as a base learner in ensemble learning to increase model capacity, and introduces weight-decay regularization to mitigate the accumulation of errors in high dimensions. The second extension, the multiple quantile classifier (MQC), further enhances model capacity by allowing several quantile-difference transformations for each variable. The third extension, the factorized multiple quantile classifier (FMQC), adds higher-order interactions to MQC through a computationally efficient adaptive factorization machine. The fourth extension, the deep multiple quantile classifier (DeepMQC), embeds MQC in the flexible framework of deep neural networks, opening up applications to a wider range of tasks. We discuss the theoretical motivation for each method, and numerical studies on synthetic and real datasets demonstrate the improvements achieved by the proposed methods.
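The EQC/MQC ideas can be sketched as a two-stage procedure: per-variable quantile-difference features (differences of check losses to the two class quantiles, at one or several levels theta) feed a linear classifier trained with weight decay. The code below is a schematic binary-classification reconstruction under those assumptions, using plain gradient descent on the logistic loss; function names, the choice of theta levels, and the optimizer are all illustrative, not the thesis's implementation.

```python
import numpy as np

def check_loss(x, q, theta):
    # Check (pinball) loss, as used by quantile-based classifiers.
    u = x - q
    return np.where(u >= 0, theta * u, (theta - 1.0) * u)

def mqc_features(X, Q0, Q1, thetas):
    # Q0, Q1: (len(thetas), n_vars) per-class, per-level quantiles.
    # One quantile-difference feature per (variable, theta) pair.
    cols = [check_loss(X, Q0[t], th) - check_loss(X, Q1[t], th)
            for t, th in enumerate(thetas)]
    return np.hstack(cols)

def fit_mqc(X, y, thetas=(0.25, 0.5, 0.75), lam=0.1, lr=0.1, steps=500):
    # Binary labels y in {0, 1}; lam is the weight-decay strength that
    # plays the role of EQC's regularization against accumulated errors.
    Q0 = np.stack([np.quantile(X[y == 0], th, axis=0) for th in thetas])
    Q1 = np.stack([np.quantile(X[y == 1], th, axis=0) for th in thetas])
    Z = mqc_features(X, Q0, Q1, thetas)
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))        # logistic link
        g = p - y                                      # loss gradient
        w -= lr * (Z.T @ g / len(y) + lam * w)         # grad + weight decay
        b -= lr * g.mean()
    return Q0, Q1, thetas, w, b

def predict_mqc(model, X):
    Q0, Q1, thetas, w, b = model
    return (mqc_features(X, Q0, Q1, thetas) @ w + b > 0).astype(int)
```

Using several levels theta per variable is what distinguishes MQC from EQC in this sketch: each level contributes its own feature and weight, so the linear stage can combine information from different parts of each class-conditional distribution.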