Abstract:
Conventionally, optimal control solutions require a priori, offline knowledge of the system dynamics. Adaptive optimal control (AOC) methods overcome this limitation by allowing dynamic controllers to approximate optimal control solutions while estimating system parameters online. For the continuous-time linear quadratic regulator (LQR), the optimal control is obtained from the nonlinear algebraic Riccati equation (ARE). This thesis first surveys previous research on optimal and adaptive optimal control. It then builds on a policy iteration algorithm for computing online, on-policy adaptive solutions to the LQR problem. The proposed algorithm is a filter-based, explorized approach to designing an AOC for systems with unknown dynamics, built on a two-layer low-pass filter architecture. The first layer removes the need to sense state derivatives and to evaluate computationally expensive finite window integrals (FWI), while the second layer provides the algebraic relations needed to avoid "intelligent" data storage techniques. An exploration signal is injected into the control input to guarantee stability and convergence. We conclude with analytical guarantees for the stability of the closed-loop dynamics and with the corresponding simulation setup.
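For reference, the continuous-time ARE mentioned above takes the standard form below; the symbols follow the usual LQR convention (system matrices A, B, state weight Q ⪰ 0, input weight R ≻ 0) and are not notation fixed by this thesis:

```latex
% Standard continuous-time algebraic Riccati equation for the LQR problem
% \dot{x} = A x + B u, \quad J = \int_0^{\infty} \left( x^{\top} Q x + u^{\top} R u \right) dt
A^{\top} P + P A - P B R^{-1} B^{\top} P + Q = 0
% The optimal state-feedback control is then
u^{*}(t) = -R^{-1} B^{\top} P \, x(t) = -K x(t)
```

Since the ARE is quadratic in P, solving it directly requires knowledge of A and B, which motivates the online, model-free policy iteration approach summarized in the abstract.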