
Evaluating quantitative methods for intercategorical-intersectionality research: a simulation study
Abstract
This simulation study evaluated the accuracy of eight quantitative methods in predicting outcomes for intersectionally defined subgroups. The methods included two forms of single-level regression with interaction terms, cross-classification, multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA), and four decision tree methods: classification and regression trees (CART), conditional inference trees, chi-square automatic interaction detector, and random forest. The simulated datasets varied by outcome variable type, input variable types, sample size, and the size and direction of the effects. Predictive accuracy improved with increasing sample size for all methods except CART. At small sample sizes, random forest and MAIHDA generally produced the most precise predictions. Although both methods performed well for prediction, variable selection by random forest was suboptimal, as were the confidence interval coverage and power of MAIHDA main-effect coefficients. The methods best suited to intersectional prediction therefore differ from those best suited to variable identification, highlighting that different objectives and data scenarios require different methods.