Dr. Dan Roberts, Research Affiliate at the Center for Theoretical Physics at MIT, will give a presentation during the CAII Seminar Series on Monday, November 8 at 11:00 a.m. The talk is titled "The Principles of Deep Learning Theory."
View Seminar here: https://go.ncsa.illinois.edu/2021CAIIFallSeminarSeries
Abstract:
Deep learning is an exciting approach to modern artificial intelligence based on artificial neural networks. The goal of this talk is to put forth a set of principles that enable us to theoretically analyze deep neural networks of actual relevance. In doing so, we will explain why such a goal is even attainable in theory and how we are able to get there in practice.
To begin, we will discuss how physical intuition and the approach of theoretical physics can be brought to bear on this problem, borrowing from the "effective theory" framework of physics. For context, we will recount how similar ideas were used to connect the thermodynamic effective description of artificial machines from the industrial age to the first-principles theory of microscopic components provided by statistical mechanics. In order to make progress on deep learning, we will need to understand the statistics of initialized deep networks and determine the dynamics of such an ensemble when learning from data. To make this tractable, we will have to take the structure of neural networks into account. Developing a perturbative 1/n expansion around the limit of infinite hidden-layer width, we will find a principle of sparsity that will let us describe effectively deep networks of practical, large-but-finite width. We will thus see that useful neural networks should be sparse -- hence the preference for larger and larger models -- but not too sparse -- so that they are also deep.
This talk is based on a book, "The Principles of Deep Learning Theory," co-authored with Sho Yaida and based on research also in collaboration with Boris Hanin. It will be published next year by Cambridge University Press.
Speaker Bio:
Dan Roberts is currently a Research Affiliate at the Center for Theoretical Physics at MIT, an Affiliate of the NSF AI Institute for Artificial Intelligence and Fundamental Interactions, and a Principal Researcher at Salesforce. Previously, he was Co-Founder and CTO of Diffeo, a collaborative AI company acquired by Salesforce, a research scientist at Facebook AI Research (FAIR) in NYC, and a Member of the School of Natural Sciences at the Institute for Advanced Study in Princeton, NJ. Dan received a Ph.D. from MIT, funded by a Hertz Foundation Fellowship and the NDSEG, and he studied at Cambridge and Oxford as a Marshall Scholar. Dan's research has centered on the interplay between physics and computation, and previously he has focused on the relationship between black holes, quantum chaos, computational complexity, randomness, and how the laws of physics are related to fundamental limits of computation.
All presentations will be recorded and will be available on the CAII website shortly after the presentation.