Title: A Phase Transition in Gradient Descent for Wide, Deep Neural Networks
Abstract: Recent investigations into infinitely wide deep neural networks have given rise to intriguing connections between deep networks, kernel methods, and Gaussian processes. Moving away from the infinite-width limit, one may ask to what extent finite-width neural networks can be described by adding perturbative corrections to these results. We identify a regime that appears to lie sharply outside such a description. The choice of learning rate in gradient descent is a crucial factor, naturally categorizing the dynamics of deep neural networks into two classes that are separated by a (sharp) phase transition as networks become wider. I will describe the distinct signatures of the two phases, how they are elucidated in a class of simple, solvable models, and the implications for neural network performance.
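
A minimal sketch of the underlying intuition (not drawn from the talk itself): for a toy quadratic loss, full-batch gradient descent converges only when the learning rate is below a threshold set by the largest curvature eigenvalue (2 / lambda_max in this linear case), so the learning rate cleanly separates two dynamical regimes. The specific threshold, the toy model, and the variable names below are illustrative assumptions; the wide-network analysis in the talk concerns nonlinear models, where the above-threshold phase need not simply diverge.

# Toy illustration (assumed, not from the talk): a learning-rate threshold
# separates convergent and divergent gradient-descent dynamics for a
# linear model with squared loss.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data and a linear model f(x) = w . x with squared loss.
X = rng.normal(size=(20, 5))
y = rng.normal(size=20)
H = X.T @ X / len(X)                 # Hessian of the loss in w
lam_max = np.linalg.eigvalsh(H)[-1]  # largest curvature eigenvalue

def run_gd(eta, steps=200):
    """Return the loss trajectory of full-batch gradient descent."""
    w = np.zeros(X.shape[1])
    losses = []
    for _ in range(steps):
        resid = X @ w - y
        losses.append(0.5 * np.mean(resid ** 2))
        w -= eta * (X.T @ resid) / len(X)
    return np.array(losses)

small = run_gd(0.5 / lam_max)   # below threshold: monotone convergence
large = run_gd(2.5 / lam_max)   # above threshold: this quadratic toy diverges

print(f"eta < 2/lam_max : final loss {small[-1]:.3e} (converges)")
print(f"eta > 2/lam_max : final loss {large[-1]:.3e} (diverges in this toy)")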
Bio: Yasaman Bahri is a Research Scientist at Google. Her recent work has focused on understanding deep learning and bridging the gap between theory and practice. She has broad multi-disciplinary interests that span machine learning as well as neighboring fields. She was trained as a theoretical condensed matter physicist, in the area of strongly correlated systems, and received her Ph.D. in Physics from UC Berkeley in 2017.