Computer Science Speakers Calendar


iDS^2 Mini-Workshop on Asymptotics and Non-Asymptotics in Control and Reinforcement Learning: Stochastic Control Problems, and How You Can Solve Yours

Event Type
Conference/Workshop
Sponsor
Coordinated Science Laboratory
Virtual
Date
Apr 16, 2021, 12:30 - 3:00 pm (with a break from 1:30 pm to 2:00 pm)
Speaker
Sean Meyn (University of Florida)
Registration
Register here
Contact
Maxim Raginsky
E-Mail
maxim@illinois.edu
Abstract: Convergence theory for reinforcement learning is sparse: barely existent for Q-learning outside of Watkins' special case, and the situation is even worse for RL with nonlinear function approximation. This is unfortunate, given the current interest in neural networks. What's more, every user of RL knows that it can be insanely slow and unreliable. The talk will begin with explanations for slow convergence based on a combination of statistical reasoning and nonlinear dynamical systems theory. The special sauce in this lecture is an approach to universal stability of RL based on generalizations of Zap Q-learning.
 
Apologies in advance: there will be no finite-n bounds in this lecture — all asymptotic. We will see why there is little hope for useful finite-n bounds when we consider algorithms with "noise" that has memory (such as in standard Markovian settings).
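For readers new to the algorithm family under discussion, the basic Watkins Q-learning recursion can be sketched on a toy problem. Everything below (the two-state MDP, rewards, discount factor, and function names) is an illustrative assumption for this sketch, not material from the talk:

```python
import random

# Toy deterministic 2-state MDP (illustrative assumption):
# action 0 = stay (reward 0); action 1 = switch states, earning
# reward 1 only on the move 0 -> 1.  With discount GAMMA = 0.5, solving
# the Bellman equations by hand gives Q*(0,1) = 4/3 and Q*(1,1) = 2/3.
GAMMA = 0.5

def env_step(state, action):
    """Environment transition: returns (next_state, reward)."""
    if action == 0:
        return state, 0.0
    return 1 - state, (1.0 if state == 0 else 0.0)

def q_learning(num_steps=20000, seed=0):
    """Tabular Watkins Q-learning with uniform exploration and 1/n step size."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0], [0.0, 0.0]]       # Q[state][action]
    visits = [[0, 0], [0, 0]]
    state = 0
    for _ in range(num_steps):
        action = rng.randrange(2)       # explore uniformly at random
        nxt, reward = env_step(state, action)
        visits[state][action] += 1
        alpha = 1.0 / visits[state][action]   # classical 1/n step size
        td_error = reward + GAMMA * max(Q[nxt]) - Q[state][action]
        Q[state][action] += alpha * td_error  # Watkins' update
        state = nxt
    return Q
```

On this deterministic toy the iterates approach the hand-computed Q* values; the talk's concern is precisely that, in general Markovian settings with function approximation, such recursions can converge very slowly or not at all, which motivates matrix-gain variants like Zap Q-learning.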

REFERENCES:

  1. S. Chen, A. M. Devraj, F. Lu, A. Busic, and S. Meyn. Zap Q-Learning with nonlinear function approximation. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 16879–16890. Curran Associates, Inc., 2020. arXiv:1910.05405.
  2. A. M. Devraj, A. Busic, and S. Meyn. Fundamental design principles for reinforcement learning algorithms. In K. G. Vamvoudakis, Y. Wan, F. L. Lewis, and D. Cansever, editors, Handbook on Reinforcement Learning and Control. Springer, 2021.
  3. S. Meyn. Control Systems and Reinforcement Learning. Cambridge University Press, 2021 (draft available upon request).