Electronic Thesis and Dissertation Repository

Degree

Doctor of Philosophy

Program

Statistics and Actuarial Sciences

Supervisor

Provost, Serge B.

2nd Supervisor

Ahmed, Syed Ejaz

Joint Supervisor

Abstract

This thesis advocates the use of shrinkage and penalty techniques for estimating the parameters of a regression model that comprises both parametric and nonparametric components, and it develops semi-nonparametric density estimation methodologies that are applicable in a regression context.

First, a moment-based approach is introduced whereby a univariate or bivariate density function is approximated by a suitable initial density function that is adjusted by a linear combination of orthogonal polynomials. Such adjustments are shown to be mathematically equivalent to adjustments based on standard polynomials in one or two variables. When extended to density estimation, in which case sample moments are used, the proposed technique readily lends itself to the modeling of massive univariate or bivariate data sets. The resulting density functions are also shown to be expressible as kernel density estimates via the Christoffel-Darboux formula. Additionally, it is established that a set of n observations is entirely specified by its first n moments.
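The standard-polynomial form of such an adjustment can be sketched as follows. This is a minimal illustration rather than the thesis's own implementation: it assumes the approximant is written as the initial density multiplied by a polynomial whose coefficients are chosen so that the first n moments of the approximant match the target moments. The function names, the uniform initial density on [0, 1], and the Beta(2, 3) target are all assumptions made for the example. In an estimation setting, the target moments would be replaced by sample moments.

```python
import numpy as np

def polynomial_adjustment_coeffs(target_moments, base_moments):
    """Solve M xi = mu for the polynomial coefficients xi (hypothetical helper).

    target_moments: mu_0, ..., mu_n of the density being approximated
    base_moments:   m_0, ..., m_{2n} of the initial (base) density
    M[j, i] = m_{i+j}, so that the j-th moment of base_pdf(x) * sum_i xi_i x^i
    equals mu_j for j = 0, ..., n.
    """
    mu = np.asarray(target_moments, dtype=float)
    m = np.asarray(base_moments, dtype=float)
    n = len(mu) - 1
    if len(m) < 2 * n + 1:
        raise ValueError("need base moments up to order 2n")
    M = np.array([[m[i + j] for i in range(n + 1)] for j in range(n + 1)])
    return np.linalg.solve(M, mu)

def approx_density(x, xi, base_pdf):
    """Evaluate base_pdf(x) * (xi_0 + xi_1 x + ... + xi_n x^n)."""
    return base_pdf(x) * np.polynomial.polynomial.polyval(x, xi)

# Assumed example: uniform initial density on [0, 1], Beta(2, 3) target.
n = 3
mu = [24.0 / ((j + 2) * (j + 3) * (j + 4)) for j in range(n + 1)]  # Beta(2, 3) moments
m = [1.0 / (k + 1) for k in range(2 * n + 1)]                      # uniform moments

def uniform_pdf(x):
    return np.ones_like(x, dtype=float)

xi = polynomial_adjustment_coeffs(mu, m)
x = np.linspace(0.0, 1.0, 5)
f_hat = approx_density(x, xi, uniform_pdf)
print(xi)
print(f_hat)
```

Because the Beta(2, 3) density is itself a cubic polynomial on [0, 1], a degree-3 adjustment of the uniform density recovers it up to rounding error. The moment matrix becomes ill-conditioned as n grows, which is one practical reason for working with orthogonal polynomials instead.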

It is also explained that a univariate bona fide density approximation can be obtained by assuming that the derivative of the logarithm of the density function under consideration is expressible as a rational function or a polynomial. An explicit representation of the density function so obtained is derived, and jointly sufficient statistics for its parameters are identified. Extensions of the proposed methodology to density estimation and multivariate settings are then discussed. In fact, this approach constitutes a generalization of Pearson's system of frequency curves. Several illustrative examples, including regression applications, are presented.
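The polynomial case can be worked out in a few lines. The notation below (coefficients c_k, degree m, normalizing constant C, sample size N) is introduced here for illustration and is not taken from the thesis; the constant C is assumed to exist, that is, the exponential is assumed integrable over the support.

```latex
\frac{d}{dx}\ln f(x) = \sum_{k=0}^{m} c_k x^k
\quad\Longrightarrow\quad
f(x) = C \exp\!\left( \sum_{k=0}^{m} \frac{c_k}{k+1}\, x^{k+1} \right).

% Log-likelihood of an independent sample x_1, \dots, x_N:
\ell(c_0, \dots, c_m) = N \ln C + \sum_{k=0}^{m} \frac{c_k}{k+1} \sum_{i=1}^{N} x_i^{k+1}.
```

The data enter the log-likelihood only through the power sums of orders 1 to m + 1, so by the factorization theorem these power sums are jointly sufficient for the coefficients. Pearson's system corresponds to the rational case in which the derivative of ln f(x) has a linear numerator and a quadratic denominator, that is, the form (a_0 + a_1 x)/(b_0 + b_1 x + b_2 x^2).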

Finally, an iterative algorithm involving shrinkage and pretest techniques is introduced for estimating the parameters of a certain semi-nonparametric model. It is theoretically established and numerically verified that the proposed estimators are more accurate than the unrestricted ones. This methodology is successfully applied to a mass spectrometry data set.
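For readers unfamiliar with shrinkage and pretest estimators, the basic idea can be sketched in an ordinary linear model. This is not the iterative algorithm or the semi-nonparametric model of the thesis; it is a generic illustration in which an unrestricted least-squares fit is combined with a restricted fit that sets a block of coefficients to zero. The function name, the simulated data, and the choice of a positive-part Stein-type weight are assumptions made for the example, and the critical value is supplied by the caller (7.815 is approximately the 0.95 quantile of a chi-square distribution with 3 degrees of freedom).

```python
import numpy as np

def shrinkage_and_pretest(X1, X2, y, critical_value):
    """Hypothetical helper: combine unrestricted and restricted OLS fits.

    Restriction tested: the coefficients of X2 are all zero.
    Requires X2 to have at least 3 columns for the Stein-type weight.
    """
    n, p1 = X1.shape
    p2 = X2.shape[1]
    X = np.hstack([X1, X2])

    # Unrestricted fit on all columns; restricted fit on X1 only.
    beta_u, *_ = np.linalg.lstsq(X, y, rcond=None)
    b1_r, *_ = np.linalg.lstsq(X1, y, rcond=None)
    beta_r = np.concatenate([b1_r, np.zeros(p2)])

    # Wald-type statistic for the restriction, approximately chi-square(p2) under it.
    rss_u = np.sum((y - X @ beta_u) ** 2)
    rss_r = np.sum((y - X1 @ b1_r) ** 2)
    s2 = rss_u / (n - p1 - p2)
    T = (rss_r - rss_u) / s2

    # Pretest: keep the restricted fit unless the restriction is rejected.
    beta_pt = beta_u if T > critical_value else beta_r

    # Positive-part Stein-type shrinkage toward the restricted fit.
    weight = max(0.0, 1.0 - (p2 - 2) / T)
    beta_s = beta_r + weight * (beta_u - beta_r)

    return {"unrestricted": beta_u, "restricted": beta_r,
            "pretest": beta_pt, "shrinkage": beta_s,
            "statistic": T, "weight": weight}

# Simulated data in which the restriction is true (assumed for illustration).
rng = np.random.default_rng(0)
n_obs = 200
X1 = rng.normal(size=(n_obs, 2))
X2 = rng.normal(size=(n_obs, 3))
y = X1 @ np.array([1.0, -2.0]) + rng.normal(size=n_obs)

est = shrinkage_and_pretest(X1, X2, y, critical_value=7.815)
print(est["statistic"], est["weight"])
print(est["shrinkage"])
```

The pretest estimator makes an all-or-nothing choice between the two fits, whereas the shrinkage estimator moves smoothly between them according to how strongly the data contradict the restriction.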
