Title: The out-of-sample prediction error of the square-root lasso and related estimators
Abstract: We study the classical problem of predicting an outcome variable, Y, using a linear combination of a d-dimensional covariate vector, X. We are interested in linear predictors whose coefficients solve: inf_β (E[(Y - < β, X >)^r])^(1/r) + 𝛿 || β ||, where r >1 and 𝛿 > 0 is a regularization parameter. We provide conditions under which linear predictors based on these estimators minimize the worst-case prediction error over a ball of distributions determined by a type of max-sliced Wasserstein metric. A detailed analysis of the statistical properties of this metric yields a simple recommendation for the choice of regularization parameter. The suggested order of 𝛿, after a suitable normalization of the covariates, is typically d/n, up to logarithmic factors. Our recommendation is computationally straightforward to implement, pivotal, has provable out-of-sample performance guarantees, and does not rely on sparsity assumptions about the true data generating process. This is joint work with Jose Montiel Olea, Amilcar Velez and Johannes Wiesel.
Bio: Cynthia Rush is an Associate Professor of Statistics at Columbia University. She received a Ph.D. and M.A. in Statistics from Yale University in 2016 and 2011, respectively, and she completed her undergraduate coursework at the University of North Carolina at Chapel Hill where she obtained a B.S. in Mathematics in 2010. Her research focuses on high-dimensional statistics, message passing algorithms, statistical robustness, and information theory.