An introduction to adaptive control and reinforcement learning for discrete-time deterministic
linear systems. Topics include: discrete-time state space models; stability of discrete-time
systems; parameter adaptation laws; error models in adaptive control; persistent excitation;
controllability and pole placement; observability and observers; classical regulation in
discrete time; adaptive regulation; dynamic programming; value and policy iteration;
Q-learning. Labs involve control design using MATLAB.
The following table shows the lecture topics. The "Weekly Events" column lists
suggested reading from the course notes (distributed on Quercus) as well as
quiz and exam dates. This schedule may be updated as the semester progresses,
so check this webpage periodically.
| Week | Date | Lecture | Topics | Weekly Events |
| 1 | Jan 9 | 1 | Introduction | Chapter 2 |
|  |  | 2 | Difference equations, z-transforms |  |
|  |  | 3 | Solving difference equations using z-transforms, transfer functions |  |
| 2 | Jan 16 | 4 | State space models, SS --> TF, controllable and observable canonical forms |  |
|  |  | 5 | Time response | Quiz 1 |
|  |  | 6 | Solution of SS models, computing A^k |  |
| 3 | Jan 23 | 7 | Transient response and pole locations |  |
|  |  | 8 | Stability for discrete-time systems | Chapter 3 |
|  |  | 9 | Lyapunov's method |  |
| 4 | Jan 30 | 10 | Lyapunov's method |  |
|  |  | 11 | Stability for LTV systems | Quiz 2 |
|  |  | 12 | Stability for LTI systems | Chapter 4 |
| 5 | Feb 6 | 13 | Controllability, pole placement |  |
|  |  | 14 | Deadbeat control, PBH test |  |
|  |  | 15 | Observability, observers, separation principle | Chapter 5 |
| 6 | Feb 13 | 16 | Adaptive control: static error model |  |
|  |  | 17 | Adaptive control: dynamic error model | Quiz 3 |
|  |  | 18 | Parameter convergence for static error model |  |
|  | Feb 19 |  | Reading Week |  |
| 7 | Feb 27 | 19 | Parameter convergence for dynamic error model |  |
|  |  | 20 | Robust parameter adaptation |  |
|  |  | 21 | Regulator problem | Chapter 6 |
| 8 | Mar 5 | 22 | Internal model principle |  |
|  |  | 23 | Regulator design |  |
|  |  | 24 | Adaptive regulator problem | Chapter 7 |
| 9 | Mar 12 | 25 | Adaptive regulator design | Midterm, March 11, 5-7pm |
|  |  | 26 | Adaptive regulator design |  |
|  |  | 27 | Adaptive regulator design |  |
| 10 | Mar 19 | 28 | Dynamic programming: finite horizon | Chapter 8 |
|  |  | 29 | Dynamic programming: infinite horizon |  |
|  |  | 30 | Dynamic programming: infinite horizon |  |
| 11 | Mar 26 | 31 | Dynamic programming: value and policy iterations |  |
|  |  | 32 | Offline value and policy iterations using Q functions | Quiz 4 |
|  |  | 33 | Reinforcement learning: temporal difference error |  |
| 12 | Apr 2 | 34 | Reinforcement learning: value function approximation | Chapter 9 |
|  |  | 35 | Reinforcement learning: Q functions |  |
|  |  | 36 | Reinforcement learning: online policy and value iterations |  |
| 13 | Apr 9 | 37 | Review |  |
|  |  | 38 | Review |  |
Labs are MATLAB-based and performed in groups of two or three in BA3114. You may select your own
lab partners, or your assigned practical TA can help you form a group.
Each team submits a preparation on Quercus at the start of the lab session, and its exported
MATLAB Live Script as a PDF or HTML file by 5pm on the due date.
The preparation is worth 2 marks and the lab 8 marks, for a total of 10 marks.