Offline reinforcement learning (RL) learns a good decision-making strategy from a pre-collected dataset without directly interacting with the environment, and is a promising paradigm for applying RL to real-world applications where active exploration and intervention are difficult. That said, many basic theoretical questions that are directly relevant to practice remain open, from how to perform model selection to how to train under non-exploratory data. In this talk I will briefly introduce some of my group's recent works that make progress on these open problems.
Nan Jiang is an assistant professor of Computer Science at the University of Illinois Urbana-Champaign. Prior to joining UIUC, he was a postdoctoral researcher at Microsoft Research NYC. He received his PhD in Computer Science and Engineering from the University of Michigan. His research interests lie in the theory of reinforcement learning (RL), mostly focusing on sample efficiency. Specific research topics include the sample complexity of exploration under function approximation, off-policy evaluation and policy selection, batch RL with insufficient data coverage, and spectral learning of dynamical systems.