Technical Program
Monday, May 7
Talks in SF1105, Sandford Fleming Building
9:30-10:30 | Andrew Lewis, Queen's U. | The exponential map for time-varying and parameter-dependent vector fields |
---|---|---|
The exponential map for vector fields on a manifold does not exist,
at least not in any normal sense, i.e., as a mapping from vector fields to
diffeomorphisms. The difficulty, of course, is that vector fields are not
generally complete. Moreover, this lack of completeness is about as
debilitating as one could hope, in that the infimum over the set of initial
states of the length of the time interval on which an integral curve is
defined is zero. An appropriate substitute for the exponential map is devised, one whose formulation includes vector fields depending on time and on a parameter. Two novel mathematical tools play a crucial role in the development. First, suitable topologies for sets of vector fields allow an elegant and uniform treatment of vector fields across varying regularity classes. Second, sheaves and groupoids for vector fields and diffeomorphisms allow for systematic localisation of the components of what will become the exponential map. It will be illustrated that an exponential map of the kind described should play an essential role in problems of local controllability. |
||
10:30-11:00 | James Forbes, McGill U. | H_inf-Optimal Parallel Feedforward Control |
There are well-established feedback control methods capable of enforcing a wide range of closed-loop system properties. Arguably the most important closed-loop property is stability, which ultimately boils down to pole placement. Interestingly, feedback control is not, in general, capable of placing closed-loop zeros. Although closed-loop zeros do not impact stability, they do drastically impact closed-loop performance. Parallel feedforward control is capable of placing closed-loop zeros by creating an augmented system with a new output. Existing parallel feedforward control methods are rather limited in that they are only applicable to SISO systems and focus only on ensuring that all closed-loop zeros are minimum phase. Moreover, there do not exist any *optimal* parallel feedforward controller synthesis methods. This talk will discuss optimal parallel feedforward controller synthesis where the difference between the original system output and the augmented system output is minimized in an H_inf sense. Two parallel feedforward controller design methods will be discussed, both of which employ linear matrix inequality (LMI) tools. First, an indirect method will be discussed in which the minimum gain of a negative feedback controller is maximized, which in turn yields an H_inf-optimal parallel feedforward controller when inverted. Next, a direct method will be discussed that designs the parallel feedforward controller directly by imposing a nonzero minimum gain condition on the augmented system, which in turn ensures that the augmented system is minimum phase. Both the direct and indirect methods can be weighted, allowing the designer to limit the difference between the true plant and augmented plant over a bandwidth. Numerical examples demonstrating the application of the direct and indirect parallel feedforward methods will be discussed. | ||
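As a toy illustration of why parallel feedforward can place zeros (a sketch under assumed data, not from the talk): for a hypothetical non-minimum-phase plant G(s) = (s - 1)/(s + 2), adding a constant parallel feedforward k gives the augmented map G(s) + k, whose zeros are the roots of num + k·den and can be moved into the left half-plane:

```python
import numpy as np

def augmented_zeros(num, den, k):
    """Zeros of G(s) + k, where G = num/den (polynomial coefficient lists).

    A constant parallel feedforward k creates the augmented output
    y_a = y + k*u, so the augmented numerator is num + k*den.
    """
    n = np.polyadd(num, np.polymul([k], den))
    return np.roots(n)

# Hypothetical non-minimum-phase plant G(s) = (s - 1)/(s + 2): zero at s = +1.
num, den = [1.0, -1.0], [1.0, 2.0]

print(augmented_zeros(num, den, 0.0))  # original zero at s = +1 (non-minimum phase)
print(augmented_zeros(num, den, 1.0))  # augmented zero at s = -0.5 (minimum phase)
```

With k = 1 the augmented numerator is (s - 1) + (s + 2) = 2s + 1, so the augmented system is minimum phase even though the original plant is not; the talk's LMI methods choose such feedforwards optimally.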
Coffee break | ||
11:30-12:00 | Philip McCarthy, U. of Waterloo | Stability of a Class of Discrete-Time Dynamics on Nilpotent and Solvable Lie Algebras, with Applications |
Systems on matrix Lie groups are common in
engineering applications. For example, rotational and translational
dynamics, oscillators, and quantum systems can be modelled on matrix
Lie groups. Kinematic models on matrix Lie groups, unlike most
nonlinear systems, admit exact closed-form solutions for piecewise
constant inputs. This enables sampled-data control using an exact
discretization. We present our results for sampled-data
synchronization and regulation of systems on solvable matrix Lie
groups. Solvability is a generalization of the notion of nilpotency,
which is in turn a generalization of commutativity. The Lie algebra
of a solvable Lie group can be decomposed into the direct sum of an
abelian subalgebra and a nilpotent ideal, where the latter generates
a decreasing sequence of algebraic ideals that terminates in the
origin. This algebraic structure admits a decreasing sequence of
dynamically invariant subspaces, which we leverage for control
design. A common control objective for a network of agents is synchronization. This can mean, for example, driving a collection of oscillators to the same phase, or a fleet of vehicles to the same position and orientation. We propose a nonlinear control law, which in local coordinates resembles the classical consensus algorithm for linear systems. In the most general case, the synchronization error is locally asymptotically stable. Convergence is exponentially fast in the special cases of nilpotent and commutative Lie groups. If SO(2) is not a subgroup of the state space, then global asymptotic stability of the synchronization error to zero is achieved. Regulation is one of the main problems addressed by control theory; its goals comprise simultaneous asymptotic stability, disturbance rejection, and driving a problem-specific quantity to zero, such as reference tracking error. We consider plants on commutative matrix Lie groups driven by both discrete- and continuous-time exosystems. We propose a controller that solves the regulator problem wherever the principal matrix logarithm is well-defined. We also propose a state estimator that exists under the same conditions, and prove a separation principle. In local coordinates, the closed-loop dynamics resemble those of a linear system. The controller and state estimator are shown to be resilient to nonlinear "wrapping" phenomena, which are characteristic of dynamics on quotient spaces. |
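The classical linear consensus update that the proposed control law resembles in local coordinates can be sketched as follows (a minimal Euclidean example with a hypothetical 4-agent cycle graph; the talk's Lie-group setting is more general):

```python
import numpy as np

def consensus_step(x, A, eps):
    """One step of the classical linear consensus update:
    x_i <- x_i + eps * sum_j A[i, j] * (x_j - x_i)."""
    return x + eps * (A @ x - A.sum(axis=1) * x)

# Hypothetical 4-agent cycle graph; scalar states converge to their average.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
x = np.array([0.0, 1.0, 2.0, 3.0])
for _ in range(200):
    x = consensus_step(x, A, eps=0.2)
print(x)  # all entries approach the initial average, 1.5
```

For this linear update the step size must satisfy eps < 1/(max degree) for convergence; here eps = 0.2 < 0.5.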
||
Lunch break | ||
13:30-14:00 | Kexue Zhang, Queen's U. | Impulsive Systems with Time-Delay: Theory and Applications |
Impulsive systems are dynamical systems subject to state jumps at a sequence of discrete-time moments; they are modelled by impulsive differential equations and have applications in a variety of scenarios, including secure communication, network synchronization, and vaccination against epidemic diseases. Because time-delay arises commonly in evolution processes, control schemes, and physical systems, it has been widely considered for impulsive systems. Furthermore, time-delay is also unavoidable in the sampling and transmission of the impulsive information. The study of impulsive systems with time-delay is thus essential. In this talk, we introduce the fundamental theory of impulsive functional differential equations, which are the mathematical models of impulsive time-delay systems. Stability of impulsive systems with time-delay is also discussed, with a focus on stability results for systems with delay-dependent impulsive effects. As applications of these stability results, consensus problems of multi-agent systems are studied via impulsive protocols. Finally, some preliminary results on distributed convex optimization via continuous-time algorithms with impulsive communication are discussed. | ||
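A minimal sketch of the impulsive mechanism (a hypothetical scalar system without delay; the talk's functional differential equations generalize this): between impulse times the state flows continuously, and at each impulse it jumps. Stability can hinge on the impulses, as when an unstable flow is tamed by contracting jumps:

```python
import math

def impulsive_trajectory(x0, a, b, T, n_impulses):
    """Scalar impulsive system: xdot = a*x between impulses,
    x(t_k^+) = b * x(t_k^-) at equally spaced impulse times t_k = k*T.
    Returns the state just after each impulse (exact closed form)."""
    xs, x = [], x0
    for _ in range(n_impulses):
        x = b * (x * math.exp(a * T))  # flow for time T, then jump
        xs.append(x)
    return xs

# Hypothetical example: unstable flow (a = 0.5) stabilized by impulses that
# halve the state every T = 1; per-period factor 0.5*e^0.5 ≈ 0.824 < 1.
traj = impulsive_trajectory(1.0, a=0.5, b=0.5, T=1.0, n_impulses=20)
print(traj[-1])  # decays toward zero
```

The per-period factor b·e^{aT} < 1 is the standard stability condition for this scalar example.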
14:00-15:00 | Martin Guay, Queen's U. | Output regulation for a class of unknown non-linear dynamical systems with unknown disturbance dynamics |
In this presentation, a design technique is proposed
to solve a dynamic output regulation feedback control problem for a
class of nonlinear dynamical control systems with unknown dynamics
subject to unmeasured disturbances. The disturbances are generated by
an exosystem whose dynamics are also assumed to be unknown. The
input-output dynamics of the nonlinear system are assumed to be
minimum phase with a strong relative degree of known
order. An estimation technique is used to simultaneously compute a
stabilizing feedback that acts as an internal model of the unknown
exosystem dynamics. The stability analysis demonstrates that the
closed-loop system achieves semi-global practical asymptotic
stability and practical convergence of the output
of the system to zero. The application of this technique to the
design of distributed leader-follower formation control systems will
be discussed. The presentation will also introduce two extensions of the model-free output-regulation approach to more general classes of problems. One extension that will be addressed is the problem of output regulation in the context of extremum seeking control. In this application, the regulation of an unknown nonlinear system to an unknown optimal set-point is considered. This is in contrast to traditional output regulation, where the set-point must be known a priori. The solution of this problem is complicated by the fact that the desired extremum point occurs at a singularity of the system, where the generation of an internal model is difficult. A second extension of the proposed technique will be to consider the solution of output regulation problems in the presence of non-minimum phase dynamics. Although such systems can be treated in a model-based formulation, the model-free approach introduces some interesting challenges that will be discussed. The objective is to provide an introduction to some new developments in the area of extremum seeking control and nonlinear adaptive control. The focus will be on design techniques that are suitable when only limited knowledge of the nonlinear system's dynamics is available. Issues related to controller performance and controller design will also be discussed. |
||
15:00-15:30 | Jon Sensinger, U. of New Brunswick | Approaching human-machine interface theory as an optimal control problem rather than a signal processing bottleneck |
Imagine a system in which one agent (A) interacts with a plant or semi-autonomous agent (B). Agent A chooses which goals are worth achieving and which tasks will be needed to achieve them. This agent then sends control signals to agent B, but with substantial signal corruption, and receives partial feedback from agent B, but again with substantial signal corruption and delay. In this application, agent A is a vastly sophisticated black box and cannot be directly changed, but agent B can be optimized (albeit without knowledge of the goals or tasks for which it should be optimized). Given that the goals of agent A are unknown and information transfer is heavily corrupted, this is a challenging control problem. This control problem exists in many human-machine interfaces, in which humans (agent A) interact with a machine (agent B) to complete a task. For interfaces that have substantial delays and significant corruption (such as brain-computer interfaces, myoelectric prosthesis control, or exoskeleton control), researchers have typically approached the problem from a signal processing standpoint, developing pattern recognition and machine learning approaches. Approaching the problem from a control perspective seems likely to yield better performance, if a policy can be found that analytically captures the salient details of human cognition, remains agnostic to task formation in generating control policy, and handles some of the quirks prominent in humans, such as substantial multiplicative noise. Our lab has approached this control problem by modeling agent A as an adaptive optimal stochastic feedback controller. Optimal stochastic feedback controllers analytically describe humans surprisingly well, in part because even if humans have not reached optimal solutions, they typically strive for them, and in part because most stochastic human processes are well modeled as Gaussian distributions.
This approach produces control policies that are agnostic to specific tasks (for example, in some cases the same policy will result in a force-control law; in others a motion-control law; and in others a complex control law). It can explain many of the unique aspects of human-machine interfaces (such as exploiting uncontrolled manifolds, or generating synergies). Optimal stochastic feedback control has emerged as the leading framework within computational motor control over the last 15 years. For our purposes, however, we are more interested in understanding how the analytic model of agent A is used to optimize the dynamics and observability of agent B. Because the model for agent A is analytically based, it provides substantial insight into how the design of agent B indirectly affects the control decisions of agent A, in turn enabling optimization of the multi-agent system. In this presentation we will provide a brief overview of the success that optimal stochastic feedback control has had in describing human motor control. We will then briefly describe some of our modeling and psychophysics experiments that verify that human cost functions are well approximated by analytically tractable (e.g., quadratic) variants of this framework. We will end the talk by discussing some of the control frameworks we have developed for designing agent B, and their ability to succeed in goals that have remained elusive to researchers for over 40 years. Throughout the talk, we will highlight open questions for the control-theoretic community. | ||
Coffee break | ||
16:00-16:30 | Miaomiao Wang, U. of Western Ontario | Globally Exponentially Stable Nonlinear Observers for 3D Inertial Navigation |
The development of reliable pose (orientation and position) and linear velocity estimation is instrumental in many applications, such as autonomous underwater vehicles and unmanned aerial vehicles. Classical pose and linear velocity estimation approaches are based on nonlinear filtering techniques such as the extended Kalman filter (EKF), the unscented Kalman filter (UKF) or the particle filter. However, there is no general proof of global convergence for these filters. Recently, nonlinear observers on Lie groups, which take into account the topological properties of the motion space, have made their appearance in the literature. In this talk, we will discuss the design of globally exponentially convergent nonlinear hybrid observers on Lie groups, relying on an inertial measurement unit (IMU) and landmark measurements. To the best of our knowledge, this type of observer on Lie groups, endowed with such strong stability properties, has never been proposed in the literature. | ||
16:30-17:30 | Daniel Miller, U. of Waterloo | Classical Adaptive Control Revisited: Linear-Like Convolution Bounds and Exponential Stability |
While the original classical parameter adaptive controllers do not handle noise or unmodelled dynamics well, redesigned versions have been proven to have some tolerance; however, exponential stabilization and a bounded gain on the noise is rarely proven. Here we consider a classical pole placement adaptive controller using the original projection algorithm rather than the commonly modified version; we impose the assumption that the plant parameters lie in a convex, compact set. In our estimator we restrict the parameter estimates to this convex set, and we demonstrate that the corresponding adaptive controller ensures that the closed-loop system exhibits a very desirable property: there are linear-like convolution bounds on the key variables, which confers exponential stability and a bounded noise gain; these properties, in turn, can be leveraged to prove tolerance to unmodelled dynamics and plant parameter variation. We emphasize that there is no persistent excitation requirement of any sort; the improved performance arises from the vigilant nature of the parameter estimator. | ||
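For intuition only (a hedged sketch with hypothetical first-order plant data, and a box standing in for a general convex, compact set): the ideal projection algorithm updates the parameter estimate only along the current regressor direction, and the estimate is then projected back into the assumed parameter set:

```python
import numpy as np

def projection_update(theta, phi, y, lo, hi):
    """One step of the ideal (unmodified) projection estimator for
    y = phi^T theta*, followed by projection onto the box [lo, hi]
    (a simple stand-in for a general convex, compact set)."""
    if phi @ phi > 0:
        theta = theta + phi * (y - phi @ theta) / (phi @ phi)
    return np.clip(theta, lo, hi)

# Hypothetical first-order plant y(t) = a*y(t-1) + b*u(t-1), true (a, b) = (0.8, 1.0),
# assumed to lie in the box [0.5, 1.5] x [0.5, 1.5].
rng = np.random.default_rng(0)
a, b = 0.8, 1.0
theta = np.array([1.5, 1.5])                 # initial guess on the box boundary
lo, hi = np.array([0.5, 0.5]), np.array([1.5, 1.5])
y_prev = 0.0
for t in range(200):
    u = rng.standard_normal()                # exciting input, for the illustration only
    y = a * y_prev + b * u
    phi = np.array([y_prev, u])
    theta = projection_update(theta, phi, y, lo, hi)
    y_prev = y
print(theta)  # approaches (0.8, 1.0)
```

The clipping step is the box special case of projection onto a convex, compact set; with noiseless data the estimation error is non-increasing and shrinks whenever the regressor excites the error direction.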
17:30-18:00 | Mohamad Shahab, U. of Waterloo | Multi-Estimator-Based Adaptive Control Which Provides Exponential Stability: The First-Order Case |
Classical adaptive controllers provide asymptotic
stabilization; neither exponential stabilization nor bounded noise
gain is typically proven. In recent work it is shown that these
desired results can be achieved by using an estimator based on the
original ideal projection algorithm (together with a restriction of
the parameter estimates to a given compact convex set), rather than
using the commonly used modified classical algorithm. Here the goal is
to remove the convexity requirement. To this end, we consider the first-order case with unknown plant parameters belonging to a closed and bounded (compact) uncertainty set of controllable pairs. The first step of our approach is to observe that the compact uncertainty set can be covered by two disjoint convex compact sets, each of controllable pairs. For each of the two convex compact sets, we design an estimator together with the corresponding one-step-ahead controller. We apply a switching logic to choose between these two choices. We prove that the resulting controller guarantees linear-like convolution bounds on the closed-loop behavior, which implies exponential stability and a bounded noise gain. |
Tuesday, May 8
Talks in BA1180, Bahen Centre for Information Technology
9:00-9:30 | Minyi Huang, Carleton U. | Linear Quadratic Mean Field Games: asymptotic solvability and the fixed point approach |
---|---|---|
Mean field game theory has been developed largely following two routes
by different researchers. One may solve a large-scale game first and
derive a limit for the solution when the population size increases,
which can be called the direct (or bottom up) approach (see e.g. Lasry
and Lions, 2006, 2007). The second route is to solve an optimal
control problem of a single agent based on mean field approximations
and formalize a fixed point problem, and this is called the fixed
point (or top-down) approach (see e.g. Huang, Caines, and Malhame,
2003, 2006, 2007). So far, investigation of the connection between
the two approaches has been scarce. In this work we contribute to this
direction within the framework of linear-quadratic (LQ) mean field
games with a finite time horizon. We first introduce an asymptotic solvability notion for LQ games, which means that for all sufficiently large population sizes, the corresponding game has a set of feedback Nash strategies satisfying a minor regularity condition. This formulation falls within the direct approach but differs from many existing works in this category, since we do not restrict to decentralized information from the beginning. We provide a necessary and sufficient condition for asymptotic solvability and show that in this case the solution converges to a mean field limit. This is accomplished by developing a re-scaling method to derive a low-dimensional ordinary differential equation (ODE) system, in which a non-symmetric Riccati ODE has a central role. We next review the well-studied two-point boundary value (TPBV) problem in the fixed point approach, and describe the necessary and sufficient condition for its solvability. We show that asymptotic solvability implies feasibility of the fixed point approach, but the converse is not true. We further examine the long-time behavior of the Riccati ordinary differential equations in the asymptotic solvability problem and address non-uniqueness of solutions in the TPBV problem. This is joint work with Mengjie Zhou. |
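The Riccati ODEs at the core of the asymptotic-solvability analysis reduce, in the simplest scalar single-agent LQ case, to a familiar form that can be integrated backward from the terminal condition. A minimal explicit-Euler sketch with hypothetical scalar data (not the non-symmetric system of the talk):

```python
def lq_riccati_backward(a, b, q, r, T, n):
    """Explicit Euler integration, backward in time, of the scalar LQ
    Riccati ODE  -dP/dt = 2*a*P - (b**2/r)*P**2 + q,  P(T) = 0,
    returning P(0). Riccati ODEs of this type are the building block of
    the solvability conditions in the LQ setting."""
    dt = T / n
    P = 0.0
    for _ in range(n):
        # in the backward-time variable tau = T - t:  dP/dtau = 2aP - (b^2/r)P^2 + q
        P += dt * (2 * a * P - (b * b / r) * P * P + q)
    return P

# Hypothetical scalar data: a = 1, b = 1, q = 1, r = 1, horizon T = 5.
P0 = lq_riccati_backward(1.0, 1.0, 1.0, 1.0, 5.0, 5000)
print(P0)  # approaches the stabilizing algebraic Riccati root 1 + sqrt(2) ≈ 2.414
```

Over long horizons P(0) approaches the stabilizing root of the algebraic Riccati equation 2P - P² + 1 = 0, here 1 + √2.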
||
9:30-10:30 | Roland Malhamé, Polytechnique Montréal | Min_LQG games and collective discrete choice problems |
We introduce a novel class of finite horizon linear
quadratic Gaussian games involving distinct potential finite
destination states, interpreted as discrete choices under social
pressure. The model provides stylized interpretations of opinion
swings in elections, the dynamics of discrete societal choices, as
well as a framework for achieving communication-constrained group
decision making in micro-robotic-based exploration. Two distinct
cases are considered: (i) The zero noise or “deterministic” case where
agents are initially randomly distributed over their range space; (ii)
The fully stochastic case. Under mild technical conditions, the
existence of epsilon-Nash equilibria is established in both cases,
although these equilibria may in general be multiple. The
corresponding agent control strategies are of a decentralized nature
and are characterized in each case by the fixed points of a specific
finite dimensional operator. Individual agent destination choices are
fixed at the outset in case (i), while by contrast, their probability
distribution evolves randomly along trajectories in case (ii), with a
deterministic limit for the complete population as the latter grows to
infinity. This is joint work with Rabih Salhab and Jérôme Le Ny. |
||
10:30-11:00 | Mengjie Zhou, Carleton U. | Mean Field Games with Poisson Jumps and Impulse Control: Threshold Policies |
We consider mean field games (MFG) in a continuous
time competitive Markov decision process framework. Each player's
state has pure jumps modelled by a self-weighted compound Poisson
process subject to impulse control. We focus on analyzing the
steady-state equation system of the mean field game. The best
response is determined as a threshold policy, and the stationary
distribution of the state is derived in terms of the threshold
value. Further we investigate the existence of a solution to the
general MFG equation via a fixed point theorem. We consider a
specific family of the cost rate function as an example. We find that
we can solve for an explicit form of the stationary state
distribution under threshold policy by solving a Volterra-type
integral equation. This is joint work with M. Huang. |
||
Coffee break | ||
11:30-12:00 | Peter Caines, McGill U. | Stability of Receding Horizon Control with Smooth Value Functions |
Receding Horizon Control (RHC) is a very effective control methodology that has been employed in an extensive range of industrial applications. The stability of systems under RHC is of great importance for its application and has been extensively studied. However, most of the stability results involve terminal costs or constraints, which are sometimes not computationally desirable. Recent studies consider the stability of systems under RHC without such terminal costs or constraints. In this work, it is shown that smoothness of the value function is sufficient to ensure stability for control-affine systems under RHC laws with no terminal cost. In order to find the infimum of the set of stabilizing horizons, a simple ODE problem based on the linearized system is developed that provides the sets of stabilizing and destabilizing horizons. It is shown that, under certain conditions, the exact infimum of the stabilizing horizons can be found without the need to solve the nonlinear optimal control problem. Simulations are provided to illustrate the application of these methods to some nonlinear systems. | ||
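The RHC mechanism without a terminal cost can be sketched in the linear-quadratic case (a hypothetical discrete-time example, not the control-affine setting of the talk): at each step a finite-horizon problem with zero terminal weight is solved and only the first input is applied:

```python
import numpy as np

def rhc_gain(A, B, Q, R, N):
    """First-step feedback gain of a horizon-N LQ problem with NO terminal
    cost (P_N = 0), computed by backward Riccati recursion."""
    P = np.zeros_like(Q)
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
    return K

# Hypothetical marginally unstable double-integrator-like system.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

K = rhc_gain(A, B, Q, R, N=10)
x = np.array([[1.0], [1.0]])
for _ in range(50):
    x = (A - B @ K) @ x              # apply the first input, then re-plan
print(np.linalg.norm(x))            # horizon N = 10 is stabilizing here
```

Here a horizon of N = 10 happens to be stabilizing for this system; the talk's concern is characterizing exactly which horizons are, without solving the full optimal control problem.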
Lunch break | ||
13:30-14:00 | Zach Kroeze, U. of Toronto | Motion Primitives for Integrator Systems for Control Problems with LTL Specifications |
Some of the most recent advancements in the field of
control theory deal with control problems with complex
specifications. One method for approaching such problems is to encode
the low-level dynamics with a transition system, and encode the
high-level specifications with Linear Temporal Logic (LTL). Current
research in this area either assumes a transition system is given, or
constructs the transition system using a triangulation of the state
space and assumes an affine feedback can be found on each simplex of
the triangulation. In either case, there are no guarantees that the
low-level control design can solve the high-level specifications. We
propose a novel low-level control design based on Reach Control
theory which yields a correct-by-design transition system. The power
of this method is that it can leverage the breadth of research
already in place to solve high-level LTL specifications. This presentation will cover the construction of the low-level controllers for integrator systems of arbitrary dimensions. The design is based on atomic motion primitives, canonical behaviours in the output space that are realized by way of closed-loop vector fields in the state space. We design three atomic motion primitives: Hold, Forward, and Backward. The associated closed-loop system behaviours correspond to: the output remaining within a segment [0,d], the output leaving [0,d] through y=d, and the output leaving [0,d] through y=0. These atomic motion primitives can then be combined to achieve more complex behaviours in higher dimensions. Although designing closed-loop feedbacks to achieve the desired motion primitives is not complex in low dimensions, the problem becomes infeasible in large dimensions. We utilize an inductive method to design the motion primitives for an integrator system of arbitrary dimension. That is, we design the motion primitives for dimension k, and through dynamic extension, design the motion primitives for dimension k+1. We conclude by stating additional properties of the motion primitives which validate their use in existing research on LTL specifications. |
||
14:00-15:00 | Xiang Chen, U. of Windsor | Control of Discrete-Time Systems with Quantized Lossy Channel |
Quantized lossy channels are commonly used to model the nature of transmission in networked control systems. In this talk, the stabilizing control problem for discrete-time systems with both quantization error and multiplicative random noise in the actuating channel (a quantized lossy channel) is discussed. First, quadratic mean square (QMS) stability is introduced in light of both the structured uncertainty and the stochastic noise in the system, which naturally enables the application of mixed H2/H∞ design to tackle the underlying problem. Both state feedback and observer-based output feedback stabilizing control are presented for such a system in the sense of QMS. Necessary and sufficient conditions are established, with a class of output feedback stabilizing controllers characterized in this case. It is also shown that the same framework can be applied to address multi-objective control of discrete-time systems over the lossy actuating channel. | ||
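To see why multiplicative channel noise matters for mean square stability, consider a hypothetical scalar sketch (illustrative only; the talk's LMI machinery handles the general case): with loop x⁺ = a·x − (1 + σ·w)·k·x and zero-mean, unit-variance w, the second moment evolves by an explicit factor:

```python
def mean_square_factor(a, k, sigma):
    """Per-step growth factor of E[x^2] for the scalar loop
    x+ = a*x - (1 + sigma*w)*k*x,  w zero-mean with unit variance:
    E[x+^2] = ((a - k)^2 + sigma^2 * k^2) * E[x^2]."""
    return (a - k) ** 2 + sigma ** 2 * k ** 2

# Hypothetical unstable plant a = 1.2 over a noisy actuating channel.
print(mean_square_factor(1.2, k=1.2, sigma=0.5))  # 0.36 < 1: mean square stable
print(mean_square_factor(1.2, k=1.2, sigma=1.0))  # 1.44 > 1: the noise destabilizes
```

The gain k = a that is deadbeat in the noiseless case loses mean square stability once σ²k² exceeds 1, which is the kind of trade-off the QMS framework quantifies.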
15:00-15:30 | Serdar Yuksel, Queen's U. | Decentralized Stochastic Control: Structural, Existence and Approximation Results |
This talk is concerned with decentralized stochastic
control (or dynamic team) problems and their optimal solutions. After
a review on decentralized stochastic control, strategic measures for
such problems will be introduced; these are the probability measures
induced on the space of measurement and action sequences by admissible
decentralized control policies which satisfy various conditional
independence properties. Conditions ensuring the compactness of sets
of strategic measures will be established for both static and dynamic
teams. These will lead to existence results on optimal solutions for
both static and dynamic teams. The results are applicable to teams
which are either static or static reducible, as well as teams which
are classical. Properties such as convexity and Borel measurability
of the sets of strategic measures will be studied; it will be shown
that measures induced by deterministic policies form the extreme
points of a properly expanded set of strategic measures, thus
establishing the optimality of such policies. It will be shown that
such sets are Borel, leading to positive implications for a general
form of dynamic programming for sequential dynamic teams, which allows
Hans Witsenhausen's standard form for sequential stochastic control to
be formulated in a well-defined recursive form for general
spaces. Furthermore, this approach will give rise to very weak
conditions for the existence of optimal control policies,
significantly generalizing earlier results. Finally, through a proper
approximation of the sets of strategic measures by those induced with
quantization of measurement and action spaces, asymptotic optimality
of finite model representations for a large class of dynamic team
problems will be established. These lead to asymptotic optimality of
quantized control policies. The celebrated counterexample of
Witsenhausen will be discussed throughout the talk to illustrate the
salient aspects of information structures in decentralized stochastic
control, and demonstrate the structural, existence and approximation
results. In part, joint work with Naci Saldi and Tamas Linder. |
||
Coffee break | ||
16:00-16:30 | Ali D. Kara, Queen's U. | Robustness to Incorrect System Models in Stochastic Control |
We study robustness properties of partially observed
Markov decision processes when we have incorrect information about
the transition kernels. Suppose a decision maker has an incomplete
model for the transition kernel of the process and finds an optimal
control policy for this incomplete model. Thus, the calculated policy
is not optimal for the actual system. The question we ask is the
following: If the kernel model available to the controller converges
to the true kernel model, does the cost achieved by the optimal
control policy designed for the incorrect kernel converge to the true
optimal cost in the discounted-cost setup? To address this problem we define a number of convergence notions corresponding to weak, setwise and total variation convergence of probability measures in the context of controlled transition kernels. Using counterexamples, we show that the answer to the question is not positive in general for weak and setwise convergence, whereas the result is positive under the stronger notion of total variation convergence. However, by imposing regularity conditions on the measurement models (such as additive-noise measurement systems) and on the kernel itself (such as weak continuity), we show that robustness can indeed be achieved even under weak convergence of transition kernels. In particular, compared to the existing literature, we obtain strictly refined and stronger robustness results that are applicable even to errors quantifiable under weak convergence criteria, in addition to total variation criteria. We apply our results to a number of case studies, such as deterministic robust control problems as well as empirical learning in (data-driven) stochastic control. In particular, suppose that an estimate of a controlled transition kernel is obtained using empirical data from the evolving system. We will show that under mild conditions weak convergence of the estimated kernel to the true kernel takes place almost surely, whereas total variation convergence typically does not occur in the absence of stronger conditions. These results imply empirical consistency properties leading to practically useful robustness results for data-driven learning models, since often, in engineering applications, system models are learned through training data. Joint work with Serdar Yuksel. |
||
16:30-17:30 | Aditya Mahajan, McGill U. | Optimal decentralized control of partially nested teams: sufficient statistics and separation of estimation and control |
One of the most celebrated results in centralized stochastic control of linear
quadratic systems is the two-way separation between estimation and control:
the optimal control action is a linear function of the state estimate, where
the control gain is the same as the gain for the system
with perfect state observation; furthermore, the state estimate is computed
recursively using a linear filtering equation, where the filtering gain is the
same as the gain for the system without any control inputs. The state of
affairs in decentralized stochastic control is not so elegant. In decentralized control of linear quadratic systems, non-linear strategies can outperform linear strategies (as illustrated by the Witsenhausen counterexample). Even if attention is restricted to linear strategies, the best linear strategy may not have a finite-dimensional representation (as illustrated by the Whittle and Rudge counterexample). Even in cases where it can be shown that the optimal control is a linear function of the state estimate, the corresponding Riccati and Kalman filtering equations are coupled and cannot be solved separately (as illustrated by different variations of the two-player problem with partially nested teams). Thus, it is not always possible to identify sufficient statistics, and even in instances where sufficient statistics can be identified, there is no separation between estimation and control. In this work, we argue that the reason separation results have not been established for general partially nested teams is that decentralized estimation is fundamentally different from centralized state estimation. We argue that when this difference is taken into account, the separation between estimation and control becomes apparent. We illustrate our point using the asymmetric one-step delay sharing model with two agents. Joint work with Mohammad Afshari. |
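For contrast with the decentralized case, the celebrated centralized two-way separation can be sketched numerically (hypothetical system matrices): the control gain and the filtering gain come from two independent Riccati computations, one for (A, B) with the control weights and one, by duality, for (A, C) with the noise covariances:

```python
import numpy as np

def dare_gain(A, B, Q, R, iters=500):
    """Fixed-point iteration of the discrete algebraic Riccati equation;
    returns the stabilizing gain. Control and filtering use the SAME
    routine: the Kalman gain is the Riccati gain of the dual system."""
    P = np.eye(A.shape[0])
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
    return K

# Hypothetical system: x+ = A x + B u + w,  y = C x + v.
A = np.array([[1.0, 0.2], [0.0, 1.0]])
B = np.array([[0.0], [0.2]])
C = np.array([[1.0, 0.0]])
Qc, Rc = np.eye(2), np.eye(1)              # control weights
W, V = 0.1 * np.eye(2), 0.1 * np.eye(1)    # process / measurement noise covariances

K_ctrl = dare_gain(A, B, Qc, Rc)           # depends only on (A, B, Qc, Rc)
L_filt = dare_gain(A.T, C.T, W, V).T       # depends only on (A, C, W, V), by duality
print(K_ctrl, L_filt)
```

Neither computation references the other; it is exactly this decoupling that fails in the partially nested settings the talk addresses, where the Riccati and filtering equations become coupled.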
||
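The two-way separation recalled at the start of this abstract can be made concrete for a scalar LQG problem: the control gain comes from a Riccati iteration that never sees the noise statistics, and the filter gain comes from a dual iteration that never sees the cost weights. A minimal sketch (the system numbers are illustrative, not from the talk):

```python
def lqr_gain(a, b, q, r, iters=200):
    # control Riccati iteration: depends only on the dynamics (a, b) and
    # the cost weights (q, r), never on the noise statistics
    p = q
    for _ in range(iters):
        p = q + a * p * a - (a * p * b) ** 2 / (r + b * p * b)
    return a * p * b / (r + b * p * b)

def kalman_gain(a, c, w, v, iters=200):
    # filter Riccati iteration: depends only on the dynamics (a, c) and
    # the noise variances (w, v), never on the cost weights
    s = w
    for _ in range(iters):
        s = w + a * s * a - (a * s * c) ** 2 / (v + c * s * c)
    return a * s * c / (v + c * s * c)

# The optimal LQG controller is u = -K * xhat, with K and L computed
# independently of each other; by duality the two iterations are the
# same map, so with matching arguments the gains coincide.
K = lqr_gain(0.9, 1.0, 1.0, 0.5)
L = kalman_gain(0.9, 1.0, 1.0, 0.5)
assert abs(K - L) < 1e-12
```

The structural point is that neither function takes the other's data as an argument: that is the separation.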
17:30-18:00 | Mohammad Akbari, Queen's U. | Distributed Online Optimization over Time-varying Networks |
Many scenarios concerning the coordination of
multi-agent systems can be addressed in the framework of distributed
optimization problems, in which a group of agents cooperatively aims to
minimize a common objective. The problem has a variety of applications,
including localization, robust estimation, and formation control. Practical scenarios of distributed optimization often take place in highly dynamic environments, where uncertainty plays a central role, e.g., estimation using sensor networks where the observations of the sensors change with time due to the presence of noise. Hence, the estimation error, which is modeled as a sum of individual cost functions, changes over time. In particular, in some key scenarios the change in the individual cost functions is highly nonstationary and cannot be well modeled by a random process. One way to address the unpredictability of the changes in the individual cost functions is to consider the so-called ``online'' optimization setting. In online optimization, a decision maker observes a sequence of cost functions in which, at each time step, the cost function is revealed only after a state is chosen by the decision maker and a cost is incurred. In this setting, due to the lack of access to the cost functions before the decision is made, the decision maker faces a so-called ``regret'', which is the difference between its accumulated cost over time and the accumulated cost incurred by the best fixed decision, assuming that all functions are known in advance. The objective in online optimization is to design a protocol that bounds the regret sublinearly. In the distributed version of the online optimization problem, however, the agents exchange information over a communication network which is modeled by a graph. This communication structure is often time-varying. At each time step, each agent has access to its own state and the states of its neighbors at that time, along with information on its previous cost functions. Agents decide on their future states using this information.
As soon as the new private cost function is revealed to an agent, it faces an ``individual'' regret, which is the difference between the accumulated network cost incurred by the agent's state estimates and the accumulated cost incurred by the best fixed state chosen by a decision maker with access to the cost functions in advance. The main objective of this work is to design an algorithm, to be applied by each agent, that still guarantees a sublinear bound on the individual regrets. The limited nature of the available information and the time-varying communication graphs add heavily to the complexity of this problem. We prove, under the assumptions that the communication graph is directed, time-varying, and uniformly strongly connected, and that the cost functions are strongly convex with bounded subgradients, that our proposed algorithm achieves a sublinear bound on the individual regret. Rates of convergence for the regret are also derived. |
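The flavor of such an algorithm can be sketched numerically (this is an illustrative consensus-plus-gradient scheme, not the talk's exact protocol; the targets, mixing matrices, and step sizes are made up, and the private costs are held fixed in time so the behavior is easy to see):

```python
import numpy as np

# Each agent i holds a state x_i and at time t incurs the private cost
# f_t^i(x) = (x - a_i)^2; the network cost is the sum over agents.
N, T = 4, 2000
a = np.array([1.0, 2.0, 3.0, 6.0])   # assumed private targets
x = np.zeros(N)

# Two doubly stochastic mixing matrices, alternated over time to mimic
# a time-varying (jointly connected) communication graph.
W1 = np.array([[0.5, 0.5, 0.0, 0.0],
               [0.5, 0.5, 0.0, 0.0],
               [0.0, 0.0, 0.5, 0.5],
               [0.0, 0.0, 0.5, 0.5]])
W2 = np.array([[0.5, 0.0, 0.0, 0.5],
               [0.0, 0.5, 0.5, 0.0],
               [0.0, 0.5, 0.5, 0.0],
               [0.5, 0.0, 0.0, 0.5]])

regret = np.zeros(N)
x_star = a.mean()                    # minimizer of the total network cost
for t in range(1, T + 1):
    W = W1 if t % 2 else W2
    # individual regret: network cost at the agent's state, minus the
    # network cost at the best fixed state
    regret += np.array([np.sum((xi - a) ** 2) - np.sum((x_star - a) ** 2)
                        for xi in x])
    grad = 2 * (x - a)               # subgradient of each private cost
    x = W @ x - grad / (2 * t)       # consensus step + diminishing step size

# All agents approach the network-wide minimizer, so the per-step excess
# cost vanishes and each cumulative individual regret grows sublinearly.
```

With strongly convex costs a 1/t step size is the standard choice; the regret accumulates a large term only in the first few rounds.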
Poster session on May 8
Posters are displayed across from BA1180 during coffee breaks, 11:00AM-11:30AM and 3:30PM-4PM
Sajjad Edalatzadeh, U. of Waterloo | Optimal Actuator Design for Semi-Linear Systems | |
Actuator location and design are important design
variables in controller synthesis for distributed parameter systems.
Finding the best actuator location or shape to control a distributed
parameter system can significantly reduce the cost of the control and
improve its effectiveness. The numerous practical applications in
flexible structures, fluids, heat conductors, and chemical processes
have made this an active area of research in recent years. A
number of theoretical results have been obtained for linear models,
but few for nonlinear models.
Many systems, of course, contain nonlinearities. One example is the control of vibrations in railway tracks. Railway tracks rest on ballast, which is known to exhibit nonlinear behavior. The placement of actuators and the control of vibrations in railway tracks are of interest in the engineering literature. This poster presents our results on optimal actuator design for a general class of abstract semilinear systems. Conditions that guarantee the existence of an optimizer, as well as explicit optimality equations, are described. Many nonlinearities satisfy these conditions, including the standard model for railway tracks. |
||
Longhao Qian, U. of Toronto | Path Following Control of Multiple Quadrotor UAVs Carrying A Rigid-body Slung Payload | |
The nonlinear modeling and control design of multiple quadrotors carrying a slung payload will be presented. Slung payload delivery is an emerging application for quadrotors that shows promise for commercial use. Lifting a payload using cables connected to the vehicles can offer advantages such as increased energy efficiency and enhanced payload capacity through cooperative payload sharing. In this application, the path following controller is an essential ingredient in achieving autonomous, stable flight. The system's under-actuation and complex nonlinear multi-body dynamics render the nonlinear control design sophisticated and non-intuitive. The governing set of multi-body equations of motion, obtained using Kane's method, is presented. The equations of motion completely describe the payload swing motion and the coupling effect between the quadrotors and the payload. A model-based nonlinear controller is proposed to achieve asymptotic path following. The generic nonlinear controller is capable of stabilizing a rigid-body payload carried by two or more quadrotors. Simulation results and important aspects of the stability proof are also presented. | ||
Yinan Li, U. of Waterloo | Robustly Complete Control Synthesis via Interval Analysis | |
This talk considers the control synthesis problem for nonlinear systems with respect to temporal logic specifications. Work in this area is driven by the rising demand for the understanding and control of cyber-physical systems, which is challenging because of hybrid dynamics and complex specifications. A motivating example is the robot motion planning problem. While subject to mechanical constraints and dynamics, robots are designed to fulfill tasks such as pickup-delivery, parts assembly, surveillance, and persistent monitoring. Such control specifications can be effectively expressed by temporal logic. To deal with hybrid dynamics and to automate the control design process, we propose algorithmic control synthesis methods which are proven to be sound and robustly complete, in the sense that control strategies can be found whenever the given specification is robustly realizable. In this talk, we focus on fixed-point characterizations of the initial states satisfying key temporal logic specifications for control purposes (e.g., invariance, reachability, Buchi, and co-Buchi) and on the practical computation of control strategies over a continuum state space. We will show how interval arithmetic and a branch-and-bound scheme are integrated to guarantee the soundness and robust completeness of our proposed algorithms. The performance of the proposed algorithms will be demonstrated using several examples drawn from different applications and our recently developed software tool, ROCS. | ||
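The interval branch-and-bound idea can be illustrated in one dimension (a toy sketch, not the ROCS implementation; the map and sets are made up). The task is to verify that f(x) = x - x^2 maps [0,1] into a target interval: bisect whenever the interval enclosure is inconclusive, and give up below a resolution limit. The tight target [0,1] touches the image at the boundary and so is not robustly verifiable, while a slightly inflated target is, which mirrors the robust-completeness guarantee:

```python
def f_interval(a, b):
    # naive interval enclosure of f(x) = x - x^2 on [a, b], 0 <= a <= b
    return a - b * b, b - a * a

def verify(a, b, lo, hi, min_width=1e-3):
    """Branch-and-bound check that f([a, b]) is inside [lo, hi]."""
    fa, fb = f_interval(a, b)
    if lo <= fa and fb <= hi:
        return True                 # enclosure inside target: done
    if b - a < min_width:
        return False                # cannot conclude robustly
    m = (a + b) / 2                 # branch: bisect and recurse
    return verify(a, m, lo, hi, min_width) and verify(m, b, lo, hi, min_width)

# robustly contained in the slightly inflated target [-0.1, 1]
assert verify(0.0, 1.0, -0.1, 1.0)
# the tight target [0, 1] is touched at the boundary: not robustly verifiable
assert not verify(0.0, 1.0, 0.0, 1.0)
```

The over-approximation from interval arithmetic gives soundness; the resolution limit turns near-boundary specifications into honest "unknown" answers rather than wrong ones.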
Dena Firoozi, McGill U. | Mean Field Game Systems with Switching and Stopping Strategies: A Hybrid Optimal Control Approach | |
In this work a novel framework, combining Mean Field
Games (MFG) and Hybrid Optimal Control (HOC) theory, is presented to
obtain a unique $\epsilon$-Nash equilibrium for a non-cooperative
game with switching and stopping times. We consider the case where
there exists one major agent with significant influence on the system
together with a large population of minor agents (within two
subpopulations), each with individually asymptotically negligible
effect on the whole system. Each agent has stochastic linear dynamics
with quadratic costs, and the agents are coupled in their dynamics by
the average state of the minor agents (the empirical mean field). The hybrid
feature enters via the indexing by discrete states of (i) the switching
of the major agent or (ii) the cessation of one or both subpopulations of
minor agents. Optimal switching and stopping time strategies, together
with best response control actions for, respectively, the major agent
and each of the minor agents, are established so as to yield
the equilibrium. Joint work with Ali Pakniyat (U. Michigan) and Peter E. Caines (McGill) | ||
Sina Sanjari, Queen's U. | On the optimality relation between stochastic teams with finitely and infinitely many agents | |
A stochastic team is a group of agents acting together in a noisy
environment to optimize a common cost function, while not necessarily
sharing all the available information. We will present sufficient
conditions under which team optimal policies of static teams with
countably infinitely many decision makers arise as the limit of a
sequence of team optimal policies of static teams with $N$ decision
makers as $N \to \infty$. This approach is a counterpart
of the more commonly utilized method (in mean-field theory) where one
first studies the infinite limit and tries to infer
$\epsilon$-optimality for large $N$; that approach typically
establishes only person-by-person optimality and not global optimality
for team problems. In contrast, our approach is able to establish
global optimality for the setup with infinitely many decision makers. We
will also investigate the symmetric setup as an important special
case, where stronger convergence results can be obtained. For
symmetrically optimal teams, we will show that under mild conditions,
team optimal policies of static teams with countably infinitely many
decision makers arise as the limit of the sequence of team optimal
policies of static teams with $N$ decision makers as $N \to \infty$. In
contrast, for the general case, under more restrictive conditions, we
will show that if the team optimal policies of decision makers
$i=1,\dots,N$ of static teams with $N$ decision makers
converge uniformly, they determine a team optimal policy
for the static team with countably infinitely many decision
makers. We will also present applications of our results, including Linear Quadratic Gaussian (LQG) teams whose costs are coupled through the state and/or control. In addition, we will show that the team optimal policy for LQG teams with a classical information structure can be obtained using the presented technique. This is important because this result, while well known in the stochastic control literature via standard dynamic programming techniques, has not been investigated using the static reduction method; hence this approach can be viewed as a first step toward addressing optimality for infinite-horizon partially nested dynamic LQG problems, for which the optimality of linear policies is, to our knowledge, in general an open problem. Joint work with Serdar Yuksel | ||
Sadegh Rahnamoon, U. of Toronto | State-Based Control of Timed Discrete-Event Systems | |
The problem of supervisor synthesis for discrete-event systems (DES) has been proven to be NP-hard. Nevertheless, state tree structures (STS) have turned out to be a computationally efficient framework for designing supervisors for very large-scale systems. This framework, however, relies on a state-based control theory. To extend STS to timed systems, known as timed discrete-event systems (TDES), a state-based control theory for TDES is required. In this work, such a theory will be introduced, and it will be proven that this approach is indeed equivalent to the existing language-based control theory for TDES. |
Wednesday, May 9
Talks in SF1101, Sandford Fleming Building
9:00-10:00 | Bahman Gharesifard, Queen's U. | Bilinear Control Systems: Old and New |
---|---|---|
I present some recent results related to the stabilization and controllability of sparse bilinear control systems, where by “sparse” it is meant that the underlying Lie algebra of matrices is restricted to belong to a vector subspace given by a zero pattern. Time permitting, I may also present recent work on the so-called “distinguished sets” of a semisimple Lie algebra and explain their role in the ensemble controllability of bilinear control systems, as well as in their constructive controllability. Nevertheless, the main objective of the talk is educational: to familiarize students with some classical results on bilinear control systems. | ||
10:00-11:00 | Ilia Polushin, U. of Western Ontario | Scattering-Based Stabilization of Networks of Dissipative Systems |
In this presentation, an overview of recently developed methods for scattering-based stabilization of interconnections of nonlinear dissipative systems is presented. Scattering transformation techniques have been used in the theory of electric networks, particularly transmission lines and networks with delays, since the middle of the twentieth century. In teleoperation systems, scattering-based stabilization is currently among the most popular methods to deal with instabilities caused by force reflection in the presence of communication delays. The stabilizing effect of the conventional scattering transformation is based on the fact that the scattering operator transforms a passive system into a system with L2-gain less than or equal to one. In this talk, we discuss methods for extending scattering-based stabilization techniques to interconnections of arbitrary dissipative systems with quadratic supply rates. The extension is based on a generalization of the notion of conic systems to the non-planar case, where the cone's center is a subspace of dimension, generally speaking, greater than one. The class of non-planar conic systems is quite general and, in fact, coincides with that of dissipative systems with quadratic supply rates. For a feedback interconnection of non-planar conic systems, we discuss a graph separation condition for finite-gain $L_2$-stability which is derived in terms of the parameters of the systems' dynamic cones. Subsequently, we present a procedure for the design of generalized scattering transformations that allow for rendering the dynamic characteristics of a non-planar conic system into an arbitrary prescribed cone of compatible dimensions. The ability of the scattering transformation to change the parameters of a system's dynamic cone can be used for stabilization purposes.
Specifically, stability of interconnections of non-planar conic systems can be achieved by designing scattering transformation(s) that reshape the subsystems' cones in such a way that an appropriate stability condition (a graph separation condition in the non-delayed case, or a small-gain condition in the presence of communication delays) is satisfied. The developed scattering transformation techniques are applied to the problem of stabilization of interconnections of non-planar conic systems, with and without communication delays. Applications to the coupled stability problem in robotics, stabilization of networks of dissipative systems with multiple communication delays, and force-reflecting teleoperation are discussed. | ||
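The property the abstract relies on, that the scattering operator maps a passive system to one with L2-gain at most one, can be checked numerically for static maps y = k*u (an illustrative sketch; one common wave-variable convention is assumed):

```python
import numpy as np

def scatter(u, y):
    # wave variables of a conventional scattering transformation:
    # passivity <u, y> >= 0 is equivalent to ||b|| <= ||a||
    a = (u + y) / np.sqrt(2.0)
    b = (u - y) / np.sqrt(2.0)
    return a, b

rng = np.random.default_rng(0)
u = rng.standard_normal(100)

# y = k*u is passive for k >= 0: the scattered map has gain at most one
for k in [0.0, 0.5, 2.0, 10.0]:
    a, b = scatter(u, k * u)
    assert np.linalg.norm(b) <= np.linalg.norm(a) + 1e-12

# a non-passive map (k < 0) violates the unit-gain bound
a, b = scatter(u, -0.5 * u)
assert np.linalg.norm(b) > np.linalg.norm(a)
```

For y = k*u the scattered gain is |1-k|/(1+k), which is at most one exactly when k >= 0; this is the scalar shadow of the general result.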
11:00-11:30 | Shuang Gao, McGill U. | Graphon-LQR Control of Complex Networks of Linear Systems |
To achieve control objectives for extremely complex and very large scale networks using standard methods is essentially intractable. In this work, we further develop our previously proposed graphon control methodology to approximately regulate complex network systems using graphon theory and the theory of infinite dimensional systems. Conditions for the exact and approximate controllability of graphon dynamical systems are investigated. Approximation schemes are developed to generate control laws that regulate large network systems with linear quadratic costs. The convergence of states and of costs under the approximation schemes is then proved. Finally, examples of the application of graphon-LQR control to complex networks with randomly sampled weightings are demonstrated. | ||
11:30-12:00 | Jayakumar Subramanian, McGill U. | Renewal Monte Carlo: Renewal Theory-Based Reinforcement Learning Algorithm |
In recent years, reinforcement learning has emerged
as a leading framework to learn how to act optimally in unknown
environments. Policy gradient methods have played a prominent role in
the success of reinforcement learning. Such methods have two critical
components: policy evaluation and policy improvement. In the policy
evaluation step, the performance of a parameterized policy is
evaluated, while in the policy improvement step, the policy parameters
are updated using stochastic gradient ascent. Monte Carlo (MC) and temporal difference (TD) methods are the two main approaches to policy evaluation. In MC methods, the performance of a policy is estimated using the return of a single sample path; in TD methods, the value (or action-value) function is guessed and this guess is iteratively improved using temporal differences. MC methods are attractive because they have zero bias, are simple and easy to implement, and work for both discounted and average reward setups as well as for models with continuous states and actions. However, they suffer from various drawbacks. First, they have high variance because a single sample path is used to estimate performance. Second, they are not asymptotically optimal for infinite horizon models because it is effectively assumed that the model is episodic; in infinite horizon models, the trajectory is arbitrarily truncated to treat the model as episodic. Third, the policy improvement step cannot be carried out in tandem with policy evaluation. One must wait until the end of the episode to estimate the performance, and only then can the policy parameters be updated. It is for these reasons that MC methods are largely ignored in the literature, which focuses almost exclusively on TD methods. In this paper, a new Monte Carlo based online reinforcement learning algorithm, called Renewal Monte Carlo (RMC), is introduced for infinite horizon models with a designated start state. RMC retains the advantages of the Monte Carlo approach, including low bias, simplicity, and ease of implementation, while circumventing its key drawbacks of high variance and delayed (end-of-episode) updates. RMC works for both discounted and average reward setups for models with discrete and continuous state and action spaces. To the best of our knowledge, this is the first time renewal theory based methods have been used for the discounted reward case in reinforcement learning. The key idea of RMC is as follows.
Under any reasonable policy, the closed-loop system is positive recurrent and there are states that are visited infinitely often. One of these recurrent states is picked as a reference state. Successive visits to the reference state can be viewed as a regenerative process. When the reference state is the same as the start state, the expected infinite-horizon discounted reward equals the ratio of the expected discounted first passage reward to the expected discounted first passage time. Using this idea, sample path based estimates of performance and performance derivatives are derived. We show that the parameters of a parametrized policy converge to a local optimum under this algorithm. Detailed numerical studies are presented to illustrate the performance of RMC. |
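The renewal identity behind this construction can be checked exactly on a toy Markov chain (the chain and rewards below are illustrative, not from the paper). If R is the expected discounted reward accumulated over one cycle from the reference state s0 and E[gamma^tau] is the expected discount over the cycle, the strong Markov property gives V(s0) = R + E[gamma^tau] V(s0), i.e. V(s0) = R / (1 - E[gamma^tau]); dividing numerator and denominator by (1 - gamma) exhibits this as a ratio of discounted cycle reward to discounted cycle time.

```python
import numpy as np

# A tiny 3-state chain under a fixed policy (illustrative numbers).
P = np.array([[0.2, 0.5, 0.3],
              [0.4, 0.3, 0.3],
              [0.5, 0.2, 0.3]])
r = np.array([1.0, 0.0, 2.0])
gamma = 0.9
s0 = 0                                   # reference (renewal) state

# Exact discounted value: V = (I - gamma P)^{-1} r
V = np.linalg.solve(np.eye(3) - gamma * P, r)

# First-passage quantities back to s0, restricted to the other states:
#   h(s) = E[sum_{t < tau} gamma^t r_t | start s]   (reward until hitting s0)
#   g(s) = E[gamma^tau | start s]                   (discount at the hit)
idx = [1, 2]
Q = gamma * P[np.ix_(idx, idx)]
h = np.linalg.solve(np.eye(2) - Q, r[idx])
g = np.linalg.solve(np.eye(2) - Q, gamma * P[idx, s0])

# One step out of s0, then first passage back:
R_cycle = r[s0] + gamma * (P[s0, idx] @ h)       # expected discounted cycle reward
G_cycle = gamma * (P[s0, s0] + P[s0, idx] @ g)   # E[gamma^tau]

# Renewal identity: V(s0) = R_cycle / (1 - G_cycle)
assert abs(V[s0] - R_cycle / (1 - G_cycle)) < 1e-9
```

RMC replaces the two linear solves with sample-path estimates of the cycle quantities, which is what makes online, low-bias updates possible.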