Kang-in Lee

Doctor of Philosophy


In this dissertation, five topics related to the process and prediction of forward stepwise logistic regression are investigated.

Forward stepwise logistic regression involves both a selection criterion and a stopping criterion. Seven selection criteria are used: the likelihood ratio statistic, the statistic of Lawless and Singhal (1978), the Wald statistic, the score statistic, the statistic of Peduzzi, Hardy, and Holford (1980), Lee and Koval's statistic (LK), and a sweep operator's statistic (SW). Five stopping criteria are used: the $\chi^2$ test based on a fixed $\alpha$ level, the minimum value of ERR, the minimum value of the $C_{\rm p}$ statistic (Hosmer, 1989), the minimum value of the Akaike information criterion (Akaike, 1974), and the minimum value of Schwarz's criterion (Schwarz, 1978).

The apparent error rate (ARR) tends to underestimate the true error rate (ERR). In our study, the estimated true error rate (ERR) is obtained as ERR = ARR + $\omega$, where $\omega$ is the parametric estimate of the bias of ARR given by Efron (1986).

We use Monte Carlo simulation with both multivariate normal and multivariate binary independent variables, and we implement the simulation with SAS/IML programs. We then analyze the experimental design to identify which factors of the distribution of the independent variables affect the various outcomes.

The results address the five topics in turn. First, we recommend the best $\alpha$ level for the $\chi^2_{(\alpha)}$ stopping criterion. Second, we compare the order of variables selected by different selection criteria. Third, we investigate the effects of different structures of predictor variables on ARR, $\omega$, and ERR. Fourth, we compare the sizes of subset models determined by different stopping criteria. Finally, we compare the performances of selection and stopping criteria in terms of ERR.
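
As a rough illustration of the quantities named above, the following Python sketch computes the apparent error rate, the relation ERR = ARR + $\omega$, and the standard forms of the Akaike information criterion and Schwarz's criterion for a fitted binary model. It is not the dissertation's SAS/IML code. The function names, the 0.5 classification cutoff, the toy data, and the value of `omega` are assumptions made for the example; in particular, `omega` is taken as a given input and Efron's (1986) bias estimate is not computed here.

```python
import math


def log_likelihood(y, p):
    """Bernoulli log-likelihood of 0/1 outcomes y under fitted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))


def aic(loglik, k):
    """Akaike information criterion: -2 log L + 2k, with k estimated parameters."""
    return -2.0 * loglik + 2.0 * k


def schwarz(loglik, k, n):
    """Schwarz's criterion: -2 log L + k log(n), with n observations."""
    return -2.0 * loglik + k * math.log(n)


def apparent_error_rate(y, p, cutoff=0.5):
    """Share of the fitting sample misclassified by the fitted model itself.

    The cutoff of 0.5 is an assumption of this sketch.
    """
    wrong = sum(1 for yi, pi in zip(y, p) if (pi >= cutoff) != (yi == 1))
    return wrong / len(y)


def estimated_true_error_rate(arr, omega):
    """ERR = ARR + omega, where omega is a supplied estimate of the bias of ARR."""
    return arr + omega


# Toy data (hypothetical): observed outcomes and fitted probabilities.
y = [1, 0, 1, 1, 0, 0, 1, 0]
p = [0.9, 0.2, 0.6, 0.4, 0.3, 0.7, 0.8, 0.1]

arr = apparent_error_rate(y, p)                     # 2 of 8 misclassified
err = estimated_true_error_rate(arr, omega=0.05)    # omega is a made-up value
ll = log_likelihood(y, p)
print(arr, err, aic(ll, k=3), schwarz(ll, k=3, n=len(y)))
```

Under a minimum-value stopping rule of the kind listed above, one of these quantities would be evaluated for each candidate subset model along the forward path, and the subset with the smallest value would be retained.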



