Pattern Discovery in Brain Imaging Genetics via SCCA Modeling with a Generic Non-convex Penalty
URL with Digital Object Identifier
Brain imaging genetics intends to uncover associations between genetic markers and neuroimaging quantitative traits. Sparse canonical correlation analysis (SCCA) can discover bi-multivariate associations and select relevant features, and is becoming popular in imaging genetic studies. The L1-norm function is not only convex, but also singular at the origin, which is a necessary condition for sparsity. Thus most SCCA methods impose ℓ 1 -norm onto the individual feature or the structure level of features to pursuit corresponding sparsity. However, the ℓ1 -norm penalty over-penalizes large coefficients and may incurs estimation bias. A number of non-convex penalties are proposed to reduce the estimation bias in regression tasks. But using them in SCCA remains largely unexplored. In this paper, we design a unified non-convex SCCA model, based on seven non-convex functions, for unbiased estimation and stable feature selection simultaneously. We also propose an efficient optimization algorithm. The proposed method obtains both higher correlation coefficients and better canonical loading patterns. Specifically, these SCCA methods with non-convex penalties discover a strong association between the APOE e4 rs429358 SNP and the hippocampus region of the brain. They both are Alzheimer's disease related biomarkers, indicating the potential and power of the non-convex methods in brain imaging genetics.