Introduction to Reinforcement Learning

This documentation is an additional resource for students and a reliable, interactive knowledge base for anyone interested in reinforcement learning (RL). It summarizes the material covered in:

- the Reinforcement Learning MOOC, taught by Adam White and Martha White;
- the University of Alberta's courses:
  - CMPUT 655, instructed by Professors Michael Bowling and Simone Parisi;
  - CMPUT 365, for which I have served as a teaching assistant under Professor Marlos C. Machado.

The following chapters cover the fundamental ideas in reinforcement learning. They start with multi-armed bandits, which introduce the exploration vs. exploitation trade-off in a simplified setting, and extend to Markov Decision Processes (MDPs), which formalize sequential decision-making problems. We then review dynamic programming, Monte Carlo estimation, and temporal-difference learning, examining how each uses returns to update value functions and policies. Along the way, we introduce planning techniques and show how function approximation allows RL to scale to larger state spaces. Finally, we wrap up with policy gradient approaches, which learn parameterized policies directly, setting the stage for more advanced algorithms and real-world applications.
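As a rough illustration of the exploration vs. exploitation trade-off mentioned above, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit. The function name `run_epsilon_greedy`, the arm means, the unit-variance Gaussian rewards, and the default `epsilon=0.1` are all assumptions chosen for illustration; they are not taken from the courses listed above.

```python
import random


def run_epsilon_greedy(true_means, epsilon=0.1, steps=2000, seed=0):
    """Run epsilon-greedy on a Gaussian bandit (illustrative sketch).

    Returns the per-arm value estimates and the per-arm pull counts.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # sample-average estimate of each arm's value
    counts = [0] * n_arms       # number of times each arm has been pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Hypothetical reward model: Gaussian noise around the arm's true mean.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental sample-average update of the chosen arm's estimate.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


estimates, counts = run_epsilon_greedy([0.2, 0.5, 0.9])
print("estimates:", [round(q, 2) for q in estimates])
print("pull counts:", counts)
```

With a small `epsilon` the agent mostly pulls the arm it currently believes is best while still sampling the others occasionally. Setting `epsilon=0` makes it purely greedy, which can lock onto a suboptimal arm.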

References

- Sutton, R. S., & Barto, A. G. Reinforcement Learning: An Introduction (2nd Edition).
- University of Alberta - Reinforcement Learning MOOC. Coursera: Reinforcement Learning Specialization by Professors Adam and Martha White.
- Foundations of Deep RL (2021). YouTube Video Series by Professor Pieter Abbeel.
- UC Berkeley - CS 285: Deep Reinforcement Learning (2023). YouTube Lecture Series by Professor Sergey Levine.
- Stanford - CS 234: Reinforcement Learning (2024). YouTube Lecture Series by Professor Emma Brunskill.
- DeepMind x UCL - Introduction to Reinforcement Learning (2015). YouTube Lecture Series by Professor David Silver.
- University of Alberta - CMPUT 365: Introduction to Reinforcement Learning (2023). Course Website by Professor Csaba Szepesvári.

1. Introduction to Reinforcement Learning
2. Markov Decision Processes
3. Dynamic Programming
4. Monte Carlo Methods
5. Temporal-Difference Learning
6. Planning and Learning with Tabular Methods
7. Prediction and Control with Function Approximation
8. Policy Gradient Methods
