Electronic Thesis and Dissertation Repository


Doctor of Philosophy


Statistics and Actuarial Sciences


W. John Braun


Nonparametric density estimators are used to estimate an unknown probability density while making minimal assumptions about its functional form. Although this low reliance on modelling assumptions is a benefit, the performance of nonparametric estimators can be improved when auxiliary information about the density's shape is incorporated into the estimate. Auxiliary information can take the form of shape constraints, such as unimodality or symmetry, that the estimate must satisfy. Finding the constrained estimate is usually a difficult optimization problem, however, and a consistent framework for finding estimates across a variety of problems is lacking.

This thesis proposes finding shape-constrained density estimates by starting with a pilot estimate obtained by standard methods and then adjusting its shape until the constraints are satisfied. This strategy is part of a general approach in which a constrained estimation problem is defined by four components: an estimator, a method of shape adjustment, a constraint, and an objective function. Optimization methods are developed to suit this approach, with a focus on kernel density estimation under a variety of constraints. Two methods of shape adjustment are examined in detail. The first is data sharpening, for which two optimization algorithms are proposed: a greedy algorithm that runs quickly but can handle only a limited set of constraints, and a particle swarm algorithm that is suitable for a wider range of problems. The second is the method of adjustment curves, for which it is often possible to use quadratic programming to find optimal estimates.
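The pilot-then-adjust idea can be illustrated with a minimal sketch. It is not the greedy or particle swarm algorithm described above. As a simplified stand-in for data sharpening, it assumes a Gaussian kernel, a fixed bandwidth, and a one-parameter adjustment that shrinks every data point toward the sample median, and it searches for the smallest shrinkage at which the kernel estimate has a single mode on an evaluation grid. The function names (`kde`, `count_modes`, `shrink_to_unimodal`), the data, and the bandwidth are all hypothetical choices made for illustration.

```python
import numpy as np

def kde(grid, data, h):
    """Gaussian kernel density estimate with bandwidth h, evaluated on grid."""
    z = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(data) * h * np.sqrt(2.0 * np.pi))

def count_modes(density):
    """Count local maxima of a density evaluated on an increasing grid."""
    slope = np.diff(density)
    # A mode is a point where the slope turns from positive to non-positive.
    return int(np.sum((slope[:-1] > 0) & (slope[1:] <= 0)))

def shrink_to_unimodal(data, h, grid, steps=101):
    """Return (t, adjusted data) for the smallest candidate shrinkage t in
    [0, 1] whose kernel estimate has exactly one mode on the grid."""
    centre = np.median(data)
    for t in np.linspace(0.0, 1.0, steps):
        # t = 0 leaves the data unchanged; t = 1 moves every point to the median.
        adjusted = centre + (1.0 - t) * (data - centre)
        if count_modes(kde(grid, adjusted, h)) == 1:
            return t, adjusted
    raise ValueError("no unimodal estimate found; widen the evaluation grid")

# Illustrative data only: two well-separated clusters.
data = np.array([-2.2, -2.1, -2.0, -1.9, -1.8, 1.8, 1.9, 2.0, 2.1, 2.2])
grid = np.linspace(-5.0, 5.0, 501)
h = 0.5

pilot = kde(grid, data, h)                       # unconstrained pilot estimate
t, adjusted = shrink_to_unimodal(data, h, grid)  # adjust until the constraint holds
constrained = kde(grid, adjusted, h)             # estimate from the adjusted data
```

In this sketch the four components of the general approach are the Gaussian kernel estimator, the shrinkage adjustment, the unimodality constraint, and the amount of shrinkage `t` as the objective to be minimized. A realistic data sharpening method would move each point individually rather than through a single shared parameter.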

The methods presented here can be used for univariate or higher-dimensional kernel density estimation with shape constraints. They can also be extended to other estimators, in both the density estimation and regression settings. As such, they constitute a step toward a truly general optimizer that can be used on arbitrary combinations of estimator and constraint.