A central question in statistical learning is how to design algorithms that not only perform well on training data but also generalize to new, unseen data. In this talk, we tackle this question by formulating a data-driven distributionally robust optimization (DRO) problem, which seeks a solution that minimizes the worst-case expected loss over a family of distributions close to the empirical distribution in Wasserstein distance. In the first part, we derive a tractable reformulation of the DRO problem via strong duality, obtained using a novel constructive proof in which the worst-case distribution has a concise structure. In the second part, we establish a close connection between DRO and regularization. Such a connection suggests a systematic way to regularize high-dimensional, non-convex problems. This is demonstrated with two applications: learning high-dimensional distributions with deep neural networks, and learning heterogeneous consumer preferences with a mixed logit choice model.
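To give a concrete flavor of the DRO-regularization connection described above, the following sketch illustrates a well-known special case from the Wasserstein DRO literature: for a linear model with an L-Lipschitz loss and a type-1 Wasserstein ball over the feature space (Euclidean ground metric), the worst-case expected loss equals the empirical loss plus a norm penalty on the weights. The specific data, function names, and radius `rho` below are illustrative assumptions, not the talk's actual experiments.

```python
import numpy as np

# Illustrative sketch (not the speaker's implementation): for a linear
# classifier with the logistic loss, which is 1-Lipschitz in its scalar
# argument, the type-1 Wasserstein DRO objective over a ball of radius
# rho around the empirical distribution reduces to
#   empirical logistic loss  +  rho * ||w||_2,
# i.e., worst-case robustness acts as norm regularization.

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))          # synthetic features
y = rng.choice([-1.0, 1.0], size=n)  # synthetic +/-1 labels

def logistic_loss(w, X, y):
    # Empirical (sample-average) logistic loss of the linear model x -> <w, x>.
    return np.mean(np.log1p(np.exp(-y * (X @ w))))

def dro_objective(w, X, y, rho):
    # Regularized reformulation of the Wasserstein DRO objective:
    # empirical loss plus the robustness penalty rho * ||w||_2.
    return logistic_loss(w, X, y) + rho * np.linalg.norm(w)

w = rng.normal(size=d)
print(logistic_loss(w, X, y))
print(dro_objective(w, X, y, rho=0.1))
```

Setting `rho = 0` recovers ordinary empirical risk minimization, while larger radii trade empirical fit for robustness to distributional perturbations.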
Rui Gao is a Ph.D. candidate in Operations Research in the H. Milton Stewart School of Industrial & Systems Engineering at the Georgia Institute of Technology. His research interests lie at the intersection of data-driven decision-making under uncertainty and statistical learning. Specific application areas include deep learning, revenue management, and power systems design. His work has been recognized with several paper competition awards, including finalist for the INFORMS Nicholson Student Paper Competition 2016, finalist for the Computational Management Science Best Student Paper Award 2017, winner of the INFORMS Data Mining Best Paper Competition 2017, and runner-up for the INFORMS Computing Society Student Paper Award 2017.