Towards a Systems Theory for Optimization Algorithms

Workshop at the Conference on Decision and Control 2021, Sunday, December 12, 1pm-5pm (Time zone is UTC)

Last update: August 31, 2021

**Organizers (alphabetical order)**

Christian Ebenbauer (Chair of Intelligent Control Systems, RWTH Aachen University, Germany, christian.ebenbauer[at]ic.rwth-aachen.de)

Carsten Scherer (Chair for Mathematical Systems Theory, University of Stuttgart, Germany, carsten.scherer[at]imng.uni-stuttgart.de)
If you require any further information, feel free to contact the organizers.
**Title and abstract of the proposed workshop**

Title: Towards a Systems Theory for Optimization Algorithms

Abstract: Over the past decade, a variety of research activities on the analysis and synthesis of optimization algorithms using ideas and concepts from control theory have been initiated. These activities are motivated by the basic fact that many optimization algorithms are dynamical systems, so that tools from control theory can be used to analyze and synthesize algorithms. A further motivation is the increasing use of online optimization in feedback loops, enabled by the ready availability of online computing power in all kinds of systems. Prominent examples of such algorithms are receding horizon schemes and extremum control. It can be expected that artificial intelligence techniques and learning algorithms based on large data sets will further accelerate this development.
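As a minimal illustration of this viewpoint (an example sketch, not part of the workshop material): gradient descent on a strongly convex quadratic is a linear discrete-time system, and its worst-case convergence rate is the spectral radius of the system matrix. The quadratic and step size below are illustrative choices.

```python
import numpy as np

# Gradient descent on f(x) = 0.5 * x^T Q x viewed as the linear
# discrete-time system x_{k+1} = (I - alpha*Q) x_k.
Q = np.diag([1.0, 10.0])          # strongly convex quadratic: m = 1, L = 10
alpha = 2.0 / (1.0 + 10.0)        # classical step size 2/(m + L)

A = np.eye(2) - alpha * Q         # "system matrix" of the algorithm
rho = max(abs(np.linalg.eigvals(A)))  # spectral radius = worst-case linear rate

x = np.array([1.0, 1.0])
for _ in range(50):
    x = A @ x                     # one iteration = one step of the dynamical system

print(rho)                        # (L - m)/(L + m) = 9/11 ≈ 0.818
print(np.linalg.norm(x) < 1e-3)   # True
```

Statements about the algorithm (rate, robustness) become statements about this system, which is what enables control-theoretic analysis and design.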

For this reason, the dynamics of optimization algorithms play a key role in future control and AI systems and need to be taken into account in a fundamental fashion when analyzing or designing complex control systems.

The goals of this workshop are

i) to give an introduction and an overview of these research activities,

ii) to present recent results and

iii) to discuss future challenges and developments.

**Workshop length**

Half-day (4 hours).

**List of contributors and affiliations (alphabetical order)**

Saverio Bolognani (ETH, Switzerland)

Florian Dörfler (ETH, Switzerland)

Christian Ebenbauer (RWTH Aachen University, Germany)

Guilherme França (University of California Berkeley, USA)

Francois Glineur (University of Louvain, Belgium)

Dennis Gramlich (RWTH Aachen University, Germany)

Julien Hendrickx (University of Louvain, Belgium)

Laurent Lessard (Northeastern University, USA)

Michael Muehlebach (Max Planck Institute for Intelligent Systems, Germany)

Carsten Scherer (University of Stuttgart, Germany)

Adrien Taylor (INRIA, Paris, France)

Bryan Van Scoy (Miami University in Ohio, USA)

**Brief synopsis of the workshop content and major topics**

Introduction

Analysis and Synthesis of Optimization Algorithms: Robust Control, Worst-Case and Distributed Approaches

Continuous-time Perspectives on Optimization Algorithms

Optimization in Feedback Loops: Feedback Optimization and Extremum Control

**Brief description of the intended audience, diversity, objectives**

The goal of this workshop is to give an introduction to and an overview of current research activities at the interface of control, optimization algorithms, and optimization in feedback loops, which have gained considerable interest over the past years. Participants will thus receive a concise overview of these research directions and the main underlying ideas and methods.

The workshop starts at a very basic level but also covers ongoing research activities. Most talks will include a short tutorial part as an introduction. Overall, the workshop is intended to be attractive to a broad audience, including PhD students, post-docs, professors, and researchers from industry.

The workshop covers various research streams of a highly promising future research direction located at the interface of control and optimization. The speakers of the workshop are leading experts from five different countries, affiliated with renowned universities and other research institutions.

The workshop also aims to bundle various research streams and to foster the idea of a novel systems theory for optimization algorithms.

**Preliminary Schedule**

The schedule of the workshop is depicted in Table 1. It comprises five parts, each featuring a research direction or methodological approach in one or two talks of either 20 or 30 minutes. Moreover, the workshop starts with a short introduction/overview and concludes with a summary and a discussion of future research topics.

Table 1: Schedule of the workshop

| Part | Time | Speaker | Title/Topics |
| --- | --- | --- | --- |
| – | 1:00pm – 1:20pm | C. Ebenbauer, D. Gramlich, C. Scherer | Introduction |
| 1 | 1:20pm – 2:00pm | L. Lessard | Rate-Robustness trade-off in first-order optimization algorithms |
| 1 | | B. Van Scoy | Distributed Optimization |
| 2 | 2:00pm – 2:30pm | J. Hendrickx, F. Glineur, A. Taylor | A principled and automated tight analysis of first order optimization methods |
| 3 | 2:30pm – 3:00pm | C. Scherer, C. Ebenbauer | Convex Synthesis of Accelerated Gradient Algorithms |
| – | 3:00pm – 3:30pm | | Break |
| 4 | 3:30pm – 4:10pm | M. Muehlebach | A robust control perspective on lower bounds for gradient-based optimization algorithms |
| 4 | | G. França | Dissipative symplectic integration and optimization |
| 5 | 4:10pm – 4:50pm | F. Dörfler | Enabling and accelerating nested multi-level algorithms via predictive sensitivity |
| 5 | | S. Bolognani | Optimization Algorithms as Robust Feedback Controllers |
| – | 4:50pm – 5:00pm | All | Summary, Future Research |

**Prerequisites**

Basic knowledge about optimization and control.

**Title, speaker, abstract, brief bios**

**Title: Introduction**

Speaker: C. Ebenbauer, D. Gramlich, C. Scherer

Abstract: We provide an introduction to and an overview of the various research activities centered around the theme of the workshop.

Brief Bio(s): Christian Ebenbauer received his MS (Dipl.-Ing.) in Telematics (Computer Science and Electrical Engineering) from Graz University of Technology, Austria, in 2000 and his PhD (Dr.-Ing.) in Mechanical Engineering from the University of Stuttgart, Germany, in 2005. After having completed his PhD, he was a Postdoctoral Associate and an Erwin Schrödinger Fellow at the Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, USA. From April 2009 until July 2021, he was a full professor at the Institute for Systems Theory and Automatic Control, University of Stuttgart, Germany. Since August 2021 he has held the Chair of Intelligent Control Systems at RWTH Aachen University, Germany. His research interests lie in the areas of dynamical systems, control theory, optimization and computation. Recent research projects focus on system-theoretic approaches to optimization algorithms, extremum seeking, and receding horizon decision making.

Dennis Gramlich is a PhD student at the Chair of Intelligent Control Systems at RWTH Aachen University, Germany, under the supervision of Christian Ebenbauer. He studied Mathematics and Engineering Cybernetics at the University of Stuttgart, Germany, and specialised in Control Theory, Scientific Computing and Learning Theory. He is working at the intersection of Control Theory, Optimisation and Machine Learning, with a recent focus on the analysis and design of optimisation algorithms using tools from control theory.

Carsten Scherer received his Ph.D. degree in mathematics from the University of Würzburg (Germany) in 1991. In 1993, he joined Delft University of Technology (The Netherlands), where he held positions as an assistant and associate professor. From December 2001 until February 2010 he was a full professor within the Delft Center for Systems and Control at Delft University of Technology. Since March 2010 he has held the SimTech Chair for Mathematical Systems Theory in the Department of Mathematics at the University of Stuttgart (Germany). Dr. Scherer acted as the chair of the IFAC technical committee on Robust Control (2002-2008), and he has served as an associate editor for the IEEE Transactions on Automatic Control, Automatica, Systems and Control Letters and the European Journal of Control. Since 2013 he has been an IEEE Fellow “for contributions to optimization-based robust controller synthesis”. Dr. Scherer’s main research activities cover various topics in applying optimization techniques for developing new advanced controller design algorithms and their application to mechatronics and aerospace systems.

**Title: Rate-Robustness trade-off in first-order optimization algorithms**

Speaker: Laurent Lessard

Abstract: We consider the trade-off between convergence rate and sensitivity to additive gradient noise in the design of first-order methods. We consider three classes of functions: (1) strongly convex quadratics, (2) strongly convex functions with Lipschitz gradients, and (3) functions satisfying a one-point strong convexity property. For each case, we present a tractable way to compute both convergence rate and sensitivity to additive gradient noise for a broad family of first-order methods. We present optimal or near-optimal algorithm designs for the three aforementioned function classes. Each design has a single scalar tuning parameter that explicitly and provably trades off convergence rate and sensitivity to additive gradient noise. Each design consists of a simple analytic update rule requiring at most one timestep of memory, similar to well-known accelerated methods such as Polyak’s Heavy Ball method and Nesterov’s Fast Gradient method. When tuned as aggressively as possible, our proposed algorithms recover the algorithms with fastest known convergence rates for each function class. When tuned to be more robust, our proposed algorithms are novel.
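The trade-off can be seen already for plain gradient descent on a scalar quadratic with noisy gradient evaluations. The sketch below is an illustration under simplified assumptions (the function, noise model, and step sizes are our choices, not the talk's algorithm designs); it compares an aggressive and a robust tuning.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_gd(alpha, m=1.0, sigma=1.0, iters=20000):
    """Scalar gradient descent on f(x) = 0.5*m*x^2 with additive gradient
    noise. Returns (contraction rate, empirical steady-state std of x)."""
    x = 0.0
    tail = []
    for k in range(iters):
        x = x - alpha * (m * x + sigma * rng.standard_normal())
        if k > iters // 2:
            tail.append(x)          # sample the stationary behavior
    return abs(1 - alpha * m), float(np.std(tail))

rate_fast, noise_fast = noisy_gd(alpha=0.9)   # aggressive tuning
rate_slow, noise_slow = noisy_gd(alpha=0.1)   # robust tuning

# Aggressive tuning contracts faster but amplifies the noise more:
print(rate_fast < rate_slow)    # True
print(noise_fast > noise_slow)  # True
```

Here the step size plays the role of the single scalar tuning parameter that trades off convergence rate against noise sensitivity.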

Brief Bio: Laurent Lessard is an Associate Professor of Mechanical and Industrial Engineering at Northeastern University. His interests include robust control, distributed control, and optimization.

**Title: Distributed Optimization**

Speaker: Bryan Van Scoy

Abstract: In distributed optimization, a group of agents cooperatively optimize a common objective by communicating with neighboring agents and performing local computations. Pressing applications such as large-scale machine learning and multi-agent systems have fueled a significant amount of research on this problem in the past decade. In this talk, we illustrate how distributed algorithms can be analyzed using tools from robust control. By solving a small semidefinite program, we obtain worst-case performance guarantees for a given algorithm, even in realistic scenarios in which the communication network among the agents varies with time. Such an analysis tool enables us to systematically prove convergence bounds for a wide range of known algorithms while also paving the way for the design of new algorithms with strong performance guarantees.
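As a toy illustration of the setting (the mixing matrix, step-size schedule, and local costs below are assumptions made for this sketch, not material from the talk), decentralized gradient descent over a four-agent ring drives all agents to the minimizer of the average cost:

```python
import numpy as np

# Decentralized gradient descent (DGD) sketch: n agents minimize the
# average of local quadratics f_i(x) = 0.5*(x - b_i)^2 by averaging with
# neighbors (doubly stochastic W) and taking local gradient steps.
n = 4
b = np.array([1.0, 2.0, 3.0, 4.0])        # local minimizers; global optimum = mean(b)
W = np.array([[0.5, 0.25, 0.0, 0.25],     # mixing matrix of a ring graph
              [0.25, 0.5, 0.25, 0.0],
              [0.0, 0.25, 0.5, 0.25],
              [0.25, 0.0, 0.25, 0.5]])

x = np.zeros(n)                            # one scalar decision variable per agent
alpha0 = 0.5
for k in range(5000):
    grad = x - b                           # local gradients, evaluated in parallel
    x = W @ x - alpha0 / (k + 1) * grad    # consensus step + diminishing-step gradient

print(np.allclose(x, b.mean(), atol=0.05)) # True: all agents near the optimum 2.5
```

The robust-control analysis in the talk certifies such schemes by a small semidefinite program instead of case-by-case convergence proofs.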

Brief Bio: Bryan Van Scoy is an Assistant Professor of Electrical and Computer Engineering at Miami University in Ohio. His research interests include control, optimization, and multi-agent systems.

**Title: A principled and automated tight analysis of first order optimization methods**

Speaker: Julien Hendrickx, Francois Glineur, Adrien Taylor

Abstract: First-order optimization methods, like many other classes of algorithms, are very often characterized in terms of their worst-case performance. From a practical point of view, performance guarantees are mathematical proofs which mainly rely on precise combinations of inequalities derived from assumptions on the functions and methods. This often leads to potentially long, technical and not necessarily intuitive proofs that can be hard to analyze for those who were not involved in their development. Moreover, proofs found in the literature very frequently produce worst-case bounds that are not as tight as possible, leading to conservatism and making performance comparisons and algorithmic tuning more challenging.

In this presentation, we show how and why most convergence proofs of first-order optimization algorithms are actually very similar in nature, by relating them to the essence of worst-case analysis: computing worst-case scenarios. We then show how this point of view offers a principled way to approach and produce worst-case analyses in the context of first-order methods, which can in many cases be automated. This approach has led to exact worst-case bounds for a multitude of first-order methods, together with examples of worst-case instances and new optimal algorithmic parameters. The presentation will be example-based, featuring methods such as the gradient method and its accelerated variants, projected gradient methods, distributed methods, Douglas-Rachford, and methods involving inexact gradients. All the results that will be presented can be reproduced numerically using the Performance Estimation Toolbox (PESTO): https://github.com/AdrienTaylor/Performance-Estimation-Toolbox

Brief Bio(s): François Glineur received dual engineering degrees from Université de Mons and CentraleSupélec in 1997, and a PhD in Applied Sciences from Université de Mons in 2001. He visited Delft University of Technology and McMaster University as a postdoctoral researcher, then joined Université catholique de Louvain where he is currently a full professor of applied mathematics at the Engineering School, member of the Center for Operations Research and Econometrics and the Institute of Information and Communication Technologies, Electronics and Applied Mathematics. He was vice-dean of the Engineering School between 2016 and 2020. He has supervised ten doctoral students and authored more than sixty publications on optimization models and methods (with a focus on convex optimization and algorithmic efficiency) and their engineering applications, as well as on nonnegative matrix factorization and applications to data analysis.

Julien M. Hendrickx has been a professor of mathematical engineering at UCLouvain, in the Ecole Polytechnique de Louvain, since 2010. He obtained an engineering degree in applied mathematics (2004) and a PhD in mathematical engineering (2008) from the same university. He was a visiting researcher at the University of Illinois at Urbana-Champaign in 2003-2004, at the National ICT Australia in 2005 and 2006, and at the Massachusetts Institute of Technology in 2006 and 2008. He was a postdoctoral fellow at the Laboratory for Information and Decision Systems of the Massachusetts Institute of Technology in 2009 and 2010, holding postdoctoral fellowships of the F.R.S.-FNRS (Fund for Scientific Research) and of the Belgian American Educational Foundation. He was also a resident scholar at the Center for Information and Systems Engineering (Boston University) in 2018-2019, holding a WBI.World excellence fellowship. Doctor Hendrickx is the recipient of the 2008 EECI award for the best PhD thesis in Europe in the field of Embedded and Networked Control, and of the Alcatel-Lucent-Bell 2009 award for a PhD thesis on original new concepts or applications in the domain of information or communication technologies.

Adrien Taylor is a researcher at Inria and Ecole Normale Supérieure in Paris. He obtained his PhD from Université catholique de Louvain under the supervision of Julien Hendrickx and François Glineur. His research interests include optimization and machine learning, with a particular emphasis on semidefinite programming and first-order methods. Adrien Taylor was a finalist for the triennial Tucker Prize and is a recipient of the 2018 IBM-FNRS innovation award and the 2018 ICTEAM thesis award.

**Title: Convex Synthesis of Accelerated Gradient Algorithms**

Speaker: Carsten Scherer, Christian Ebenbauer

Abstract: Gradient descent is a well-established technique for solving optimization problems in the current era of big data and machine learning. Recent years have witnessed a strong interest in so-called accelerated gradient algorithms, which have superior convergence properties compared to standard gradient descent applied to convex optimization problems. It has been known for a long time that such algorithms can be viewed as a linear time-invariant discrete-time system in feedback with the gradient of the to-be-minimized function as a nonlinearity. This point of view provides an immediate link to the so-called absolute stability or Lur’e problem in systems theory. It opens up the possibility to apply advanced tools from robust control in order to determine the convergence rate of a given accelerated gradient algorithm, as has only rather recently been shown by relying on so-called Zames-Falb multipliers within the framework of integral quadratic constraints. Remarkably, these techniques permit the tuning of the parameters of Nesterov’s celebrated accelerated algorithm in order to improve its convergence speed, which has led to the so-called triple momentum algorithm.

A much more challenging task is the direct computational design of algorithms in order to achieve optimal convergence rates. This question falls into the area of robust feedback controller synthesis. Since the initial suggestion of this approach towards understanding algorithms, it has been an open problem how to design optimal algorithms by solving a convex program. In this talk we survey how to formulate algorithm design as a robust feedback controller synthesis problem. We reveal that the convergence of general gradient algorithms requires a particularly structured plant in the synthesis problem, and we exploit this structure in order to translate the algorithm design problem into a tractable convex semi-definite program. We further show that interpolation techniques from robust control allow us to determine explicit formulas for the best possible convergence rates that are achievable with general algorithms and the class of causal Zames-Falb multipliers. Finally, we show that these techniques seamlessly extend to extremum control, with the goal of designing controllers that drive the output of a linear system to the minimum of a convex function with an optimal rate of convergence.
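The feedback interpretation underlying this line of work can be sketched as follows. The heavy-ball method serves here as a stand-in example (with Polyak's tuning on an illustrative quadratic; the talk's synthesis results are not reproduced): the algorithm is an LTI system in feedback with the gradient as a static nonlinearity.

```python
import numpy as np

# Heavy-ball iteration x_{k+1} = x_k - alpha*grad(x_k) + beta*(x_k - x_{k-1})
# written as an LTI system in feedback with the gradient nonlinearity u = grad(y):
#   xi_{k+1} = A xi_k + B u_k,   y_k = C xi_k,   xi_k = (x_k, x_{k-1}).
m, L = 1.0, 10.0
alpha = 4.0 / (np.sqrt(L) + np.sqrt(m)) ** 2                       # Polyak tuning
beta = ((np.sqrt(L) - np.sqrt(m)) / (np.sqrt(L) + np.sqrt(m))) ** 2

I2 = np.eye(2)
A = np.block([[(1 + beta) * I2, -beta * I2],
              [I2,              0 * I2]])
B = np.vstack([-alpha * I2, 0 * I2])
C = np.hstack([I2, 0 * I2])

Q = np.diag([m, L])
grad = lambda y: Q @ y            # the gradient enters only through the feedback path

xi = np.concatenate([np.ones(2), np.ones(2)])   # x_0 = x_{-1} = (1, 1)
for _ in range(200):
    y = C @ xi
    xi = A @ xi + B @ grad(y)

print(np.linalg.norm(C @ xi) < 1e-8)            # True: converged to the minimizer 0
```

In this form, (A, B, C) are the design variables of a controller synthesis problem, while the gradient is the uncertain block handled by multiplier techniques.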

Brief Bio(s): See above.

**Title: A robust control perspective on lower bounds for gradient-based optimization algorithms**

Speaker: Michael Muehlebach

Abstract: The talk will highlight how tools from robust control can be used to derive lower bounds on the complexity of gradient-based algorithms. We will adopt a continuous-time perspective and normalize time in order to make the discussion of continuous-time convergence rates meaningful. We reduce the multi-dimensional problem to a single dimension, recover well-known lower bounds from the discrete-time setting, and provide insight into why these lower bounds occur. We present algorithms that achieve the proposed lower bounds, even when the function class under consideration includes certain non-convex functions.

Brief Bio: Michael Muehlebach obtained his PhD from ETH Zurich in 2018 under the supervision of Prof. R. D’Andrea. He then joined the group of Prof. Michael I. Jordan at the University of California, Berkeley as a postdoctoral researcher. Since 2021, he has led the independent research group Learning and Dynamical Systems at the Max Planck Institute for Intelligent Systems in Tuebingen. He is interested in a wide variety of subjects, including dynamics, machine learning, control, and optimization. During his Ph.D. he developed approximations of the constrained linear quadratic regulator problem and applied them to model predictive control. He also designed learning and control algorithms for a balancing robot and a flying machine. During his postdoctoral research he analyzed momentum-based optimization algorithms from a dynamical systems point of view. He received the Outstanding D-MAVT Bachelor Award for his Bachelor’s degree and the Willi Studer Prize for the best Master’s degree. His doctoral thesis was awarded the ETH Medal and the HILTI prize for innovative research. He was also awarded the Branco Weiss Fellowship and the Emmy Noether Fellowship, which generously fund his research group.

**Title: Dissipative symplectic integration and optimization**

Speaker: Guilherme França

Abstract: Symplectic integrators are the preferred approach when simulating conservative Hamiltonian systems because they preserve the underlying (symplectic) geometry of these systems, implying faithful imitation of the continuum dynamics and long-term stability. However, similar results for dissipative systems are unknown due to the absence of a conservation law. We will show how structure-preserving techniques can be extended to general dissipative Hamiltonian systems, generalizing key results on symplectic integrators to such cases. This approach is of particular interest to optimization: it allows us to understand accelerated optimization methods from this perspective and to construct new methods based on the simulation of dissipative physical systems.
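A minimal sketch of the idea, assuming a quadratic potential and hand-picked damping and step size (our illustrative choices, not the talk's constructions): one well-known structure-preserving scheme for dissipative Hamiltonian systems splits each step into an exact integration of the dissipative part followed by a symplectic Euler step for the conservative part.

```python
import numpy as np

# "Conformal" symplectic Euler for the dissipative Hamiltonian system
#   dq/dt = p,   dp/dt = -gamma*p - grad_f(q),
# which damps energy and settles at a minimizer of f.
gamma, h = 1.0, 0.05
Q = np.diag([1.0, 5.0])
grad_f = lambda q: Q @ q                   # grad of f(q) = 0.5 q^T Q q, minimum at 0

q = np.array([2.0, -1.0])
p = np.zeros(2)
for _ in range(2000):
    p = np.exp(-gamma * h) * p - h * grad_f(q)   # exact dissipation flow, then kick
    q = q + h * p                                # drift

print(np.linalg.norm(q) < 1e-6)                  # True: converged to the minimizer
```

The exponential factor integrates the friction term exactly, so the discrete map inherits the contraction structure of the continuous dissipative flow; this is the flavor of structure preservation the talk generalizes.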

Brief Bio: Guilherme França received the PhD degree in theoretical physics from the Institute of Theoretical Physics IFT-UNESP/ICTP-SAIFR. He was a Postdoctoral Fellow with the Physics Department, Cornell University, a Postdoctoral Fellow with the Mathematical Institute for Data Science, Johns Hopkins University, and currently he is a Research Scientist at University of California, Berkeley. He has worked on several topics of mathematical physics and non-perturbative field theory, and more recently he is interested in connections between physics, dynamical systems and statistical mechanics with optimization and machine learning.

**Title: Enabling and accelerating nested multi-level algorithms via predictive sensitivity**

Speaker: Florian Dörfler

Abstract: Many widely used optimization algorithms involve nested levels, where the iterations of the lower levels need to converge before the next upper-level iteration proceeds; examples include dual ascent, interior point methods, and bilevel gradient descent. Such multi-level algorithms can be represented as control systems with multiple time scales. A classical approach for multiple-time-scale systems is to design simple controllers within each time scale, then connect them through singular perturbation terms, and finally certify stability of the interconnected system for a sufficiently large time-scale separation via singular perturbation analysis. However, such a time-scale-separated interconnection slows down the convergence of the interconnected system. Instead, the recently proposed predictive-sensitivity approach makes it possible to interconnect time-scale-separated systems on a single time scale while preserving their stability, without requiring any time-scale separation. To this end, the predictive-sensitivity approach uses a feed-forward term to modify the dynamics of the faster systems in order to anticipate the dynamics of the slower ones. In this workshop, we analyze how to enable and accelerate nested multi-level algorithms via the predictive-sensitivity approach. We illustrate the utility of our approach with numerical case studies.

Brief Bio: Florian Dörfler is an Associate Professor at the Automatic Control Laboratory at ETH Zürich and the Associate Head of the Department of Information Technology and Electrical Engineering. He received his Ph.D. degree in Mechanical Engineering from the University of California at Santa Barbara in 2013, and a Diplom degree in Engineering Cybernetics from the University of Stuttgart in 2008. From 2013 to 2014 he was an Assistant Professor at the University of California Los Angeles. His primary research interests are centered around control, optimization, and system theory with applications in network systems, in particular electric power grids. He is a recipient of the distinguished young research awards by IFAC (Manfred Thoma Medal 2020) and EUCA (European Control Award 2020). His students were winners or finalists for Best Student Paper awards at the European Control Conference (2013, 2019), the American Control Conference (2016), the Conference on Decision and Control (2020), the PES General Meeting (2020), and the PES PowerTech Conference (2017). He is furthermore a recipient of the 2010 ACC Student Best Paper Award, the 2011 O. Hugo Schuck Best Paper Award, the 2012-2014 Automatica Best Paper Award, the 2016 IEEE Circuits and Systems Guillemin-Cauer Best Paper Award, and the 2015 UCSB ME Best PhD award.

**Title: Optimization Algorithms as Robust Feedback Controllers**

Speaker: Saverio Bolognani

Abstract: We consider the problem of driving a stable system to an operating point that minimizes a given cost, and of tracking such an optimal point in the presence of exogenous disturbances. This class of problems arises naturally in many engineering applications, e.g. in the real-time operation of electric power systems. We will show that this task can be achieved by interpreting and using iterative optimization algorithms as robust output-feedback controllers. Special care is required in order to satisfy input and output constraints at all times and to maintain stability of the closed-loop system. We will see how both these properties can be guaranteed at the controller design stage.
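A minimal sketch of this interpretation (the plant, cost, and gains below are illustrative assumptions, and the input/output constraints discussed in the talk are omitted): an integral gradient controller, built from the plant's steady-state gain only, steers the output of a stable discrete-time LTI system to the minimizer of a quadratic cost.

```python
import numpy as np

# Feedback-optimization sketch: a gradient-based controller drives the
# output y of a stable plant toward the minimizer of phi(y) = 0.5*||y - r||^2,
# using only the plant's steady-state gain G = C (I - A)^{-1} B.
A = np.array([[0.5, 0.1],
              [0.0, 0.4]])
Bm = np.eye(2)
Cm = np.eye(2)
G = Cm @ np.linalg.solve(np.eye(2) - A, Bm)   # steady-state input-output gain
r = np.array([1.0, -2.0])                     # desired (unconstrained) output

x = np.zeros(2)                               # plant state
u = np.zeros(2)                               # controller state = plant input
eta = 0.02                                    # controller slower than the plant
for _ in range(5000):
    y = Cm @ x
    u = u - eta * G.T @ (y - r)               # gradient step on phi(y(u))
    x = A @ x + Bm @ u                        # plant dynamics run in closed loop

print(np.linalg.norm(Cm @ x - r) < 1e-6)      # True: output tracks the optimizer
```

Because the controller acts on the measured output rather than on a model of the disturbance, the same loop tracks the optimal operating point when exogenous disturbances shift it, which is the robustness property emphasized in the talk.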

Brief Bio: Saverio Bolognani received the B.S. degree in Information Engineering, the M.S. degree in Automation Engineering, and the Ph.D. degree in Information Engineering from the University of Padova, Italy, in 2005, 2007, and 2011, respectively. In 2006-2007, he was a visiting graduate student at the University of California at San Diego. In 2013-2014 he was a Postdoctoral Associate at the Laboratory for Information and Decision Systems of the Massachusetts Institute of Technology in Cambridge (MA). He is currently a Senior Researcher at the Automatic Control Laboratory at ETH Zurich. His research interests include the application of networked control system theory to power systems, distributed control and optimization, and cyber-physical systems.