Abstract:
We consider the problem of nonlinear stochastic optimal control. The optimal feedback law can be synthesized via the solution of an associated Dynamic Programming (DP) problem. Unfortunately, the DP problem is fundamentally intractable owing to Bellman's infamous “Curse of Dimensionality”. We show that the optimal deterministic feedback law has a perturbation structure, in that higher-order terms in the feedback expansion do not affect the lower-order terms, and that it is near optimal, to fourth order in a small noise parameter, to the optimal stochastic policy. We show that satisfying the Minimum Principle is sufficient to obtain the globally optimal open-loop solution for deterministic nonlinear control problems, which then determines all the higher-order feedback terms. Furthermore, we show that this perturbation structure is lost in the stochastic problem, and empirical results show that, in practice, the deterministic feedback law offers superior performance. We consider the generalization to the partially observed optimal control problem via the automated construction of an “information state”. Utilizing a Minimum Principle for the optimal control in the information state, we show global optimality of the open loop and the perturbation structure of the optimal control for the information-state problem.
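For concreteness, the following is a minimal sketch of the perturbation structure in our own notation (the symbols are illustrative, not taken from the talk). Given a nominal open-loop trajectory (\bar{x}_t, \bar{u}_t) and the state deviation \delta x_t = x_t - \bar{x}_t, the optimal deterministic feedback law expands as

u_t(x_t) = \bar{u}_t + K_t \, \delta x_t + O(\|\delta x_t\|^2),

where the zeroth-order term \bar{u}_t is the globally optimal open loop obtained from the Minimum Principle, and the linear gain K_t is completely determined by \bar{u}_t, for instance via a Riccati recursion along the linearization (A_t, B_t) of the dynamics about the nominal trajectory; the higher-order terms leave \bar{u}_t and K_t unchanged.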
We consider the implications for Reinforcement Learning (RL) and data-based control. Most RL techniques search over a complex global nonlinear feedback parametrization, which makes them suffer from long training times as well as high solution variance. Instead, we advocate searching over a local perturbation feedback representation consisting of an open-loop sequence and an associated optimal linear feedback law that is completely determined by the open loop. We show that this alternative approach, termed decoupled data-based control (D2C), results in highly efficient training; that the answers obtained are globally optimal despite the local parametrization, with negligible variance; that the resulting closed-loop performance is superior to that of global state-of-the-art RL techniques; and that the approach is easily generalized to partially observed problems. We present several applications to complex robotic control problems, including swimming and tensegrity robots, as well as the control of nonlinear Partial Differential Equations (PDEs), such as material microstructure control.
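As a rough illustration, here is a minimal, hedged Python sketch of the D2C two-step structure on a toy double integrator; the dynamics, costs, and the finite-difference open-loop optimizer are stand-ins of our own, not the authors' implementation. In a fully data-based setting the linearization in Step 2 would itself be identified from rollout data; the known toy model is used here for brevity.

import numpy as np

dt, T, n, m = 0.1, 30, 2, 1
A = np.array([[1.0, dt], [0.0, 1.0]])   # toy double-integrator dynamics
B = np.array([[0.0], [dt]])
Q, R, Qf = np.eye(n), 0.1 * np.eye(m), 10.0 * np.eye(n)
x0, x_goal = np.array([1.0, 0.0]), np.zeros(n)

def rollout(u_seq, noise=0.0, gains=None, x_nom=None):
    """Simulate the system; optionally apply u = u_bar + K (x - x_bar)."""
    x, xs, cost = x0.copy(), [x0.copy()], 0.0
    for t in range(T):
        u = u_seq[t].copy()
        if gains is not None:
            u += gains[t] @ (x - x_nom[t])
        cost += (x - x_goal) @ Q @ (x - x_goal) + u @ R @ u
        x = A @ x + B @ u + noise * np.random.randn(n) * dt
        xs.append(x.copy())
    return np.array(xs), cost + (x - x_goal) @ Qf @ (x - x_goal)

# Step 1: open-loop optimization (simple finite-difference gradient descent,
# standing in for any model-free first-order method).
u_bar, eps, lr = np.zeros((T, m)), 1e-4, 2e-3
for it in range(300):
    _, c0 = rollout(u_bar)
    grad = np.zeros_like(u_bar)
    for t in range(T):
        for j in range(m):
            u_pert = u_bar.copy()
            u_pert[t, j] += eps
            _, c1 = rollout(u_pert)
            grad[t, j] = (c1 - c0) / eps
    u_bar -= lr * grad
x_bar, _ = rollout(u_bar)

# Step 2: linear feedback gains via the finite-horizon LQR Riccati recursion
# along the nominal trajectory; the gains are completely determined by the
# open loop found in Step 1.
P, K = Qf.copy(), []
for t in reversed(range(T)):
    K_t = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A + B @ K_t)
    K.append(K_t)
K = K[::-1]

# Closed-loop rollout under process noise: u_t = u_bar_t + K_t (x_t - x_bar_t).
_, cost_cl = rollout(u_bar, noise=0.1, gains=K, x_nom=x_bar)
print("closed-loop cost under noise:", cost_cl)

The point of the sketch is the decoupling: the open-loop search in Step 1 never references the feedback gains, and the gains in Step 2 are a deterministic function of the converged open-loop trajectory.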
Bio:
Suman Chakravorty obtained his B.Tech. in Mechanical Engineering in 1997 from the Indian Institute of Technology, Madras and his PhD in Aerospace Engineering from the University of Michigan, Ann Arbor in 2004. He joined the Aerospace Engineering Department at Texas A&M University, College Station, in August 2004 as an Assistant Professor, where he is currently a Professor. Dr. Chakravorty’s research interests lie in the estimation and control of stochastic dynamical systems, with applications to robotic planning and control and to situational awareness problems. He has served as an Associate Editor for the ASME Journal of Dynamic Systems, Measurement, and Control and the IEEE Robotics and Automation Letters.