An introduction to adaptive control and reinforcement learning for discrete-time deterministic
linear systems. Topics include: discrete-time state space models; stability of discrete-time
systems; parameter adaptation laws; error models in adaptive control; persistent excitation;
controllability and pole placement; observability and observers; classical regulation in
discrete time; adaptive regulation; dynamic programming; value and policy iteration;
Q-learning. Labs involve control design using MATLAB.
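To give a flavour of the computations in the course, here is a minimal MATLAB sketch (not course-provided code; the model matrices and pole locations are illustrative, and the `place` command assumes the Control System Toolbox is available) of a discrete-time state-space model, a stability check via the eigenvalues of A, and pole placement:

```matlab
% Discrete-time model x(k+1) = A*x(k) + B*u(k): a double integrator
% sampled at Ts = 0.1 (illustrative values, not from the course notes).
A = [1 0.1; 0 1];
B = [0.005; 0.1];

% Stability: a discrete-time LTI system is asymptotically stable iff
% every eigenvalue of A lies strictly inside the unit circle.
disp(abs(eig(A)))          % both magnitudes equal 1: not asymptotically stable

% Pole placement (Control System Toolbox): choose state feedback
% u(k) = -K*x(k) so that the eigenvalues of A - B*K land where we want.
K = place(A, B, [0.5 0.6]);
disp(abs(eig(A - B*K)))    % 0.5 and 0.6: strictly inside the unit circle
```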
The following table shows the lecture topics. The Weekly Events column shows
suggested reading from the course notes (distributed on Quercus) as well as
quiz and exam dates. This schedule may be updated as the semester progresses,
so it's a good idea to check this webpage periodically.
| Week | Date | Lecture | Topics | Weekly Events |
|------|------|---------|--------|---------------|
| 1 | Jan 6 | 1 | Introduction | Chapter 2 |
| | | 2 | Difference equations, z-transforms | |
| | | 3 | Solving difference equations using z-transforms, transfer functions | |
| 2 | Jan 13 | 4 | State space models, SS --> TF, controllable and observable canonical forms | |
| | | 5 | Time response | Quiz 1 |
| | | 6 | Solution of SS models, computing A^k | |
| 3 | Jan 20 | 7 | Transient response and pole locations | |
| | | 8 | Stability for discrete-time systems | Chapter 3 |
| | | 9 | Lyapunov's method | |
| 4 | Jan 27 | 10 | Stability for LTI systems | |
| | | 11 | Controllability, pole placement | Quiz 2 |
| | | 12 | Deadbeat control, PBH test | Chapter 4 |
| 5 | Feb 3 | 13 | Adaptive control: static error model | |
| | | 14 | Adaptive control: dynamic error model | |
| | | 15 | Adaptive control: static error model theory | Chapter 5 |
| 6 | Feb 10 | 16 | Adaptive control: dynamic error model theory | |
| | | 17 | Adaptive control: dynamic error model theory | Quiz 3 |
| | | 18 | Observability, observers, separation principle | |
| | Feb 17 | | Reading Week | |
| 7 | Feb 24 | 19 | Lab 2 preparation | Chapter 6 |
| | | 20 | Regulator problem | |
| | | 21 | Internal model principle, regulator design | |
| 8 | Mar 3 | 22 | Adaptive regulator problem | Chapter 7 |
| | | 23 | Adaptive regulator design | |
| | | 24 | Adaptive regulator design | |
| 9 | Mar 10 | 25 | Midterm, March 10, 5-7pm | |
| | | 26 | Lab 3 preparation | Chapter 8 |
| | | 27 | Dynamic programming: finite horizon | |
| 10 | Mar 17 | 28 | Dynamic programming: infinite horizon | |
| | | 29 | Dynamic programming: value and policy iterations | |
| | | 30 | Offline value and policy iterations using Q functions | |
| 11 | Mar 24 | 31 | Lab 4 preparation | |
| | | 32 | Reinforcement learning: temporal difference error | Quiz 4 |
| | | 33 | Reinforcement learning: value function approximation | Chapter 9 |
| 12 | Mar 31 | 34 | Reinforcement learning: value function approximation | |
| | | 35 | Reinforcement learning: Q functions | |
| | | 36 | Reinforcement learning: online policy and value iterations | |
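The value iteration covered in the dynamic programming weeks can be prototyped in a few lines. Below is a hedged MATLAB sketch (the model and cost weights are arbitrary placeholders, not course material) of value iteration for the infinite-horizon discrete-time LQR problem, where the Bellman update for a quadratic value function V(x) = x'Px reduces to the Riccati recursion:

```matlab
% Value iteration for the discrete-time LQR problem (illustrative sketch).
% Stage cost x'*Q*x + u'*R*u; quadratic value function V_k(x) = x'*P_k*x.
A = [1 0.1; 0 1];  B = [0.005; 0.1];   % same illustrative model as above
Q = eye(2);  R = 1;

P = zeros(2);                           % V_0 = 0
for k = 1:500                           % Bellman update = Riccati recursion
    P = Q + A'*P*A - (A'*P*B) / (R + B'*P*B) * (B'*P*A);
end
K = (R + B'*P*B) \ (B'*P*A);            % greedy policy u(k) = -K*x(k)

% Sanity check against MATLAB's solver (Control System Toolbox):
disp(norm(K - dlqr(A, B, Q, R)))        % should be near zero
```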
Labs are MATLAB-based and performed in groups of two or three in BA3114. You may select your own
lab partners, or your assigned practical TA can help you form a group.
Each team submits its preparation on Quercus at the start of the lab session, and submits its
exported MATLAB Live Script as a PDF or HTML file by 5pm on the due date listed below.
The preparation and lab are worth 2 + 8 = 10 marks.
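If you are on MATLAB R2022a or later, the export step can be scripted; a one-line sketch (the filename is a placeholder) is:

```matlab
% Export a Live Script to PDF from the command line (R2022a or later;
% on earlier releases use Save As / Export in the Live Editor).
export("lab1.mlx", "lab1.pdf");   % "lab1.mlx" is a placeholder filename
```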
Instructions for graduate students only: you may work on your own and you do not need to
attend a lab session. Submit both the preparation and the report as one submission on Quercus
by the due date listed below.