Webinar Series: Mathematical Foundations of Data Science

August 25th, 2020 - 3pm ET

Policy Optimization for Linear Optimal Control with Guarantees of Robustness 

Abstract

Policy optimization (PO) is a key ingredient of modern reinforcement learning (RL) and can be used for the efficient design of optimal controllers. In control design, the policies to be implemented are generally subject to constraints on the closed-loop system, such as stability, robustness, and/or safety. Hence PO is, by its nature, a constrained optimization problem in most cases; it is also nonconvex, and analyzing its global convergence is generally very challenging. Compounding the challenge, some safety-critical constraints, such as closed-loop stability or the H∞-norm constraint that guarantees robustness of the system, can be difficult to enforce on the controller while it is being learned as the PO method proceeds. We have recently overcome this difficulty for a special class of such problems, which I will discuss in this presentation.

Specifically, I will introduce the problem of PO for H2 linear control with a guarantee of robustness according to the H∞ criterion, for both continuous- and discrete-time linear systems. I will argue, with justification, that despite the nonconvexity of the problem, PO methods can enjoy global convergence. More importantly, I will show that the iterates of two specific PO methods (namely, natural policy gradient and Gauss-Newton) automatically preserve the H∞-norm constraint (i.e., the robustness) during the iterations, thus enjoying what we refer to as an “implicit regularization” property. Furthermore, under certain conditions, convergence to the globally optimal policies features globally sublinear and locally superlinear rates. Because this optimal robust control model is inherently equivalent to risk-sensitive linear control and linear quadratic (LQ) dynamic games, these results carry over to those settings as a byproduct, which I will also discuss. The talk will conclude with some informative simulations and a brief discussion of extensions to the model-free framework and the associated sample complexity analyses.
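
For readers unfamiliar with the two PO methods named above, the following is a minimal sketch of natural policy gradient and Gauss-Newton updates on a much simpler stand-in problem: a scalar, discrete-time linear quadratic regulator with no H∞ constraint. It is not the mixed H2/H∞ formulation of the talk and does not demonstrate the implicit regularization result. The function names, the system parameters (a, b, q, r), the initial gain, and the step sizes are all illustrative assumptions.

```python
# Sketch only: scalar discrete-time LQR, x_{t+1} = a*x_t + b*u_t, u_t = -k*x_t,
# stage cost q*x^2 + r*u^2. No H-infinity constraint is modeled here.

def closed_loop_cost(k, a, b, q, r):
    """Solve the scalar Lyapunov equation p = q + r*k^2 + (a - b*k)^2 * p."""
    a_cl = a - b * k
    if abs(a_cl) >= 1.0:
        raise ValueError("gain k is not stabilizing")
    return (q + r * k * k) / (1.0 - a_cl * a_cl)


def gradient_factor(k, p, a, b, r):
    """E_k = (r + b*p*b)*k - b*p*a; it vanishes at the optimal gain."""
    return (r + b * p * b) * k - b * p * a


def natural_pg(k0, a, b, q, r, eta, steps):
    """Natural policy gradient iteration: k <- k - 2*eta*E_k."""
    history = [k0]
    k = k0
    for _ in range(steps):
        p = closed_loop_cost(k, a, b, q, r)
        k = k - 2.0 * eta * gradient_factor(k, p, a, b, r)
        history.append(k)
    return history


def gauss_newton(k0, a, b, q, r, eta, steps):
    """Gauss-Newton iteration: k <- k - 2*eta*E_k / (r + b*p*b)."""
    history = [k0]
    k = k0
    for _ in range(steps):
        p = closed_loop_cost(k, a, b, q, r)
        k = k - 2.0 * eta * gradient_factor(k, p, a, b, r) / (r + b * p * b)
        history.append(k)
    return history


if __name__ == "__main__":
    # Hypothetical open-loop-unstable plant (a > 1) and a stabilizing initial gain.
    a, b, q, r, k0 = 1.2, 1.0, 1.0, 1.0, 1.0
    print("natural PG  :", natural_pg(k0, a, b, q, r, eta=0.1, steps=50)[-1])
    print("Gauss-Newton:", gauss_newton(k0, a, b, q, r, eta=0.5, steps=10)[-1])
```

In this unconstrained scalar setting, the Gauss-Newton update with step size 1/2 reduces to classical policy iteration, which is why it needs far fewer steps than the natural policy gradient iteration.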

(Based on joint work with Kaiqing Zhang and Bin Hu, UIUC)

Bio

Tamer Başar has been with the University of Illinois at Urbana-Champaign since 1981, where he holds the academic positions of Swanlund Endowed Chair; Center for Advanced Study (CAS) Professor of Electrical and Computer Engineering; Professor, Coordinated Science Laboratory; Professor, Information Trust Institute; and Affiliate Professor, Mechanical Science and Engineering. He is also the Director of the Center for Advanced Study. At Illinois, he has also served as Interim Dean of Engineering and Interim Director of the Beckman Institute for Advanced Science and Technology. He is a member of the US National Academy of Engineering; a Fellow of IEEE, IFAC, and SIAM; a past president of the IEEE Control Systems Society (CSS); the founding president of the International Society of Dynamic Games (ISDG); and a past president of the American Automatic Control Council (AACC). He has received several awards and recognitions over the years, including the highest awards of IEEE CSS, IFAC, AACC, and ISDG, the IEEE Control Systems Technical Field Award, and a number of international honorary doctorates and professorships, most recently an honorary doctorate from KTH, Sweden. He has over 900 publications in systems, control, communications, optimization, networks, and dynamic games, including books on non-cooperative dynamic game theory, robust control, network security, wireless and communication networks, and stochastic networks. He was Editor-in-Chief of the IFAC Journal Automatica between 2004 and 2014, and is currently editor of several book series. His current research interests include stochastic teams, games, and networks; multi-agent systems and learning; data-driven distributed optimization; epidemics modeling and control over networks; security and trust; energy systems; and cyber-physical systems.

Sponsor

Georgia Institute of Technology
Northwestern University
Pennsylvania State University
Princeton University
University of Illinois at Urbana-Champaign
National Institute of Statistical Sciences
Harvard University

Location

Online Webinar
Tamer Başar, UIUC