*Presentation will be recorded.
Abstract:
In this talk, we will present two vignettes on the problem of online control of the Linear Quadratic Regulator (LQR) when the dynamics are non-stationary and unknown. LQR is arguably the simplest Markov Decision Process, and it serves as fertile ground for developing new frameworks for studying online and robust control policies. In the first part of the talk, we will present a minimax dynamic-regret-optimal policy under two somewhat strong assumptions: (i) the noise process is independent across time steps, and (ii) the total variation of the dynamics over T time steps is sublinear in T (we do not assume this variation is known). In the second part, we will relax both of these assumptions. Since dynamic regret minimization is too strong a goal in this setting, we instead propose a policy that guarantees bounded-input bounded-output stability of the closed loop. The talk will highlight how different perspectives on online control under non-stationary dynamics lead to novel statistical and algorithmic questions.
Bio:
Varun Gupta is a Visiting Associate Professor in the Computer Science Department at Northwestern University. He obtained his PhD in computer science from Carnegie Mellon University and his bachelor's degree in computer science and engineering from the Indian Institute of Technology Delhi. His research interests include stochastic modeling and optimization, applied probability, algorithm design and analysis, and mechanism design. He is particularly interested in the modeling and optimization of resource allocation policies for multi-server and distributed systems (e.g., third-party logistics, cloud infrastructure, health care) from a queueing-theoretic perspective, as well as learning and control in non-stationary environments. He serves as an Associate Editor for the INFORMS journal Operations Research. His research has won the Test of Time Award at INFOCOM and the Best Publication Award from the MSOM Service Science SIG.