From Optimization to Control: An Algorithmic Perspective
Abstract: In this talk, we draw an explicit analogy across four problem classes in optimization and control with a unified solution characterization. This viewpoint allows for a systematic transformation of algorithms from one domain to the other. With this in mind, we exploit two linear structural constraints specific to control problems with finite state-action pairs to approximate the Hessian in a second-order-type algorithm from optimization. This leads to novel first-order control algorithms with the same computational complexity as (model-based) value iteration and (model-free) Q-learning, while exhibiting empirical convergence behavior similar to (model-based) policy iteration and (model-free) Zap Q-learning, with very low sensitivity to the discount factor. If time permits, we also discuss how an interesting analogy between the convex conjugate operator and the Fourier transform can reduce the typical time complexity of the dynamic programming operation from O(XU) to O(X + U), where X and U denote the sizes of the discrete state and input spaces, respectively.
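As a concrete reference point for the O(XU) figure in the abstract, the short Python sketch below runs plain tabular value iteration on a finite deterministic control problem. It is an illustrative baseline only, not the speaker's algorithm, and all names (X, U, f, cost, gamma) are assumptions introduced here; each Bellman sweep ranges over all X * U state-input pairs, which is the per-operation cost that the conjugate/Fourier analogy aims to reduce to O(X + U).

    # Illustrative baseline only (not the speaker's method): tabular value
    # iteration on a finite deterministic control problem. Each Bellman sweep
    # evaluates all X * U state-input pairs, i.e., the O(XU) dynamic
    # programming cost mentioned in the abstract.
    import numpy as np

    X, U = 200, 50                        # sizes of the discrete state and input spaces (assumed)
    gamma = 0.95                          # discount factor (assumed)
    rng = np.random.default_rng(0)

    f = rng.integers(0, X, size=(X, U))   # deterministic dynamics: next state f[x, u]
    cost = rng.random((X, U))             # stage cost c(x, u)

    V = np.zeros(X)                       # value function over the X states
    for _ in range(1000):
        # Bellman operator: for every state, minimize over all U inputs -> O(X * U) work per sweep.
        V_new = np.min(cost + gamma * V[f], axis=1)
        if np.max(np.abs(V_new - V)) < 1e-10:
            break
        V = V_new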
Location
- 172 St. George St.
- Toronto, Ontario
- Canada M5R 0A3
- Building: MIE, University of Toronto
- Room Number: SF B650
Speakers
Peyman Mohajerin Esfahani, Operations Research, Mechanical & Industrial Engineering, University of Toronto; Delft Center for Systems and Control, Mechanical Engineering, Delft University of Technology
Biography:
Peyman Mohajerin Esfahani is an Associate Professor at the University of Toronto and Delft University of Technology. He joined TU Delft in October 2016; prior to that, he held several research appointments at EPFL, ETH Zurich, and MIT between 2014 and 2016. He received his BSc and MSc degrees from Sharif University of Technology, Iran, and his PhD degree from ETH Zurich. He currently serves as an associate editor of the IEEE Transactions on Automatic Control, Mathematical Programming, Operations Research, and the Open Journal of Mathematical Optimization. He was one of the three finalists for the Young Researcher Prize in Continuous Optimization, awarded by the Mathematical Optimization Society in 2016, and a recipient of the 2016 George S. Axelby Outstanding Paper Award from the IEEE Control Systems Society. He received the ERC Starting Grant and the INFORMS Frederick W. Lanchester Prize in 2020, and he is the recipient of the 2022 European Control Award.