Electronic Thesis and Dissertation Repository

Degree

Doctor of Philosophy

Program

Statistics and Actuarial Sciences

Supervisor

Dr. Duncan Murdoch

2nd Supervisor

Dr. Wenqing He

Joint Supervisor

Abstract

Variable selection is a difficult problem in statistical model building. Identifying cost-efficient diagnostic factors is very important to health researchers, but most variable selection methods do not take into account the cost of collecting data for the predictors. Our focus is the trade-off between statistical significance and the cost of collecting data for the statistical model. We have developed a Branching LARS (BLARS) procedure that selects and estimates the important predictors to build a model that is not only good at prediction but also cost-efficient. The BLARS method extends the LARS variable selection method to incorporate the various costs of factors, and employs a branch-and-bound search to accelerate the search process. Both additive and non-additive costs are addressed. The R package branchLars, which implements BLARS, is described. We show that a "cheaper" model can be selected by sacrificing a user-selected amount of model accuracy.
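
The accuracy-versus-cost trade-off described in the abstract can be illustrated with a small sketch. The code below is not BLARS and does not use the branchLars package. It is a generic branch-and-bound search over predictor subsets that minimizes the residual sum of squares plus a penalty on additive collection costs. The function names (`rss`, `best_subset`), the penalty weight `gamma`, the toy data, and the cost vector are all assumptions made for illustration. The least-squares fit has no intercept and assumes the selected columns are linearly independent.

```python
def rss(X, y, subset):
    """Residual sum of squares of a no-intercept least-squares fit on `subset`.

    Assumes the chosen columns are linearly independent (illustrative only).
    """
    cols = sorted(subset)
    k, n = len(cols), len(y)
    if k == 0:
        return sum(v * v for v in y)
    # Augmented normal equations [X'X | X'y] restricted to the chosen columns.
    M = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in cols]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in cols]
    # Gaussian elimination with partial pivoting.
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(M[r][p]))
        M[p], M[piv] = M[piv], M[p]
        for r in range(p + 1, k):
            f = M[r][p] / M[p][p]
            for q in range(p, k + 1):
                M[r][q] -= f * M[p][q]
    # Back substitution for the coefficients.
    b = [0.0] * k
    for p in reversed(range(k)):
        b[p] = (M[p][k] - sum(M[p][q] * b[q] for q in range(p + 1, k))) / M[p][p]
    return sum((y[r] - sum(b[j] * X[r][cols[j]] for j in range(k))) ** 2
               for r in range(n))


def best_subset(X, y, costs, gamma):
    """Minimize rss(S) + gamma * sum(costs[i] for i in S) by branch and bound."""
    p = len(costs)
    best = {"score": float("inf"), "subset": frozenset()}
    visited = 0

    def search(included, j):
        nonlocal visited
        visited += 1
        cost_in = sum(costs[i] for i in included)
        if j == p:  # leaf: every predictor has been decided
            score = rss(X, y, included) + gamma * cost_in
            if score < best["score"]:
                best["score"], best["subset"] = score, included
            return
        # Lower bound: adding predictors never increases RSS, and costs are
        # nonnegative, so no completion of this node can score below this.
        bound = rss(X, y, included | set(range(j, p))) + gamma * cost_in
        if bound >= best["score"]:
            return  # prune: this branch cannot beat the incumbent
        search(included | {j}, j + 1)  # branch: include predictor j
        search(included, j + 1)        # branch: exclude predictor j

    search(frozenset(), 0)
    return best["subset"], best["score"], visited


# Hypothetical toy data: 8 observations, 4 predictors (the last is constant).
X = [[float(r), float(r * r % 5), float((3 * r + 1) % 7), 1.0] for r in range(8)]
noise = [0.1, -0.2, 0.05, 0.0, -0.1, 0.2, -0.05, 0.1]
y = [2.0 * row[0] + 0.5 * row[2] + e for row, e in zip(X, noise)]
costs = [1.0, 1.0, 5.0, 1.0]  # hypothetical cost of collecting each predictor

for gamma in (0.0, 1.0, 50.0):
    subset, score, visited = best_subset(X, y, costs, gamma)
    print(gamma, sorted(subset), round(score, 3), visited)
```

Raising `gamma` makes expensive predictors harder to justify, so the selected model becomes cheaper at the expense of fit. This mirrors, in a simplified penalized form, the idea of giving up a user-selected amount of accuracy for a cheaper model.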

Included in

Biostatistics Commons
