An introduction to adaptive control and reinforcement learning for discrete-time deterministic
linear systems. Topics include: discrete-time state space models; stability of discrete-time
systems; parameter adaptation laws; error models in adaptive control; persistent excitation;
controllability and pole placement; observability and observers; classical regulation in
discrete time; adaptive regulation; dynamic programming; value and policy iteration;
Q-learning. Labs involve control design using MATLAB.
The following table lists the lecture topics. The Weekly Events column lists
suggested readings from the course notes (distributed on Quercus) as well as
quiz and exam dates. This schedule may be updated as the semester progresses,
so check this webpage periodically.
| Week | Date | Lecture | Topics | Weekly Events |
|---|---|---|---|---|
| 1 | Jan 6 | 1 | Introduction | |
| | | 2 | State space models, SS --> TF, controllable and observable canonical forms | |
| | | 3 | Time response | |
| 2 | Jan 13 | 4 | Solution of SS models, computing A^k | |
| | | 5 | Transient response and pole locations | |
| | | 6 | Stability for discrete-time systems | |
| 3 | Jan 20 | 7 | Lyapunov's method | Quiz 1, Jan 19 |
| | | 8 | Stability for LTI systems | |
| | | 9 | Controllability, pole placement | |
| 4 | Jan 27 | 10 | Deadbeat control, PBH test | |
| | | 11 | Adaptive control: static error model | |
| | | 12 | Adaptive control: dynamic error model | |
| 5 | Feb 3 | 13 | Adaptive control: static error model theory | Quiz 2, Feb 2 |
| | | 14 | Adaptive control: dynamic error model theory | |
| | | 15 | Adaptive control: dynamic error model theory | |
| 6 | Feb 10 | 16 | Observability, observers, separation principle | |
| | | 17 | Lab 2 preparation | |
| | | 18 | Dynamic programming: finite horizon | |
| | Feb 17 | | Reading Week | |
| 7 | Feb 24 | 19 | Dynamic programming: infinite horizon | Quiz 3, Feb 23 |
| | | 20 | Dynamic programming: value and policy iterations | |
| | | 21 | Offline value and policy iterations using Q functions | |
| 8 | Mar 3 | 22 | Reinforcement learning: temporal difference error | |
| | | 23 | Lab 3 preparation | |
| | | 24 | Reinforcement learning: value function approximation | |
| 9 | Mar 10 | 25 | Midterm, March 10, 6-8pm | |
| | | 26 | Reinforcement learning: value function approximation | |
| | | 27 | Reinforcement learning: Q functions | |
| 10 | Mar 17 | 28 | Reinforcement learning: online policy and value iterations | |
| | | 29 | Lab 4 preparation | |
| | | 30 | Regulator problem | |
| 11 | Mar 24 | 31 | Internal model principle, regulator design | Quiz 4, Mar 23 |
| | | 32 | Adaptive regulator problem | |
| | | 33 | Adaptive regulator design | |
| 12 | Mar 31 | 34 | Adaptive regulator design | |
| | | 35 | TBD | |
| | | 36 | TBD | |
Labs are MATLAB-based and performed in groups of two or three in BA3114. You may select your own
lab partners, or your assigned practical TA can help you form a group.
Each team presents its preparation to the lab TA at the start of the lab session.
Each team submits its exported MATLAB Live Script (including the preparation section) as a PDF or
HTML file by 5pm on the due date listed below. The preparation and lab are
worth 2 + 8 = 10 marks. Labs start at 10 minutes
after the hour. Teams that arrive late or do not show up will receive a deduction of 8 marks.
Instructions for graduate students only: you may work on your own and you do not need to
attend a lab session. Submit both the preparation and the report as one submission on Quercus
by the due date listed below.