Introduction to Reinforcement Learning
This documentation is a supplementary resource for students and a reliable, interactive knowledge base for anyone interested in reinforcement learning (RL). It summarizes the material covered in:
- Reinforcement Learning MOOC, taught by Adam White and Martha White;
- University of Alberta's courses:
  - CMPUT 655, instructed by Professors Michael Bowling and Simone Parisi;
  - CMPUT 365, for which I have served as a teaching assistant under Professor Marlos C. Machado.
The next chapters cover the fundamental ideas in reinforcement learning. They start with multi-armed bandits, which introduce the exploration vs. exploitation trade-off in a simplified setting, and then move on to Markov Decision Processes (MDPs), which formalize sequential decision-making problems. We review dynamic programming, Monte Carlo estimation, and temporal-difference learning, and examine how each method uses returns to update value functions and policies. Along the way, we introduce planning techniques and show how function approximation allows RL to scale to larger state spaces. Finally, we wrap up with policy gradient approaches that learn parameterized policies directly, setting the stage for more advanced algorithms and real-world applications.
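To make the exploration vs. exploitation trade-off mentioned above a little more concrete, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit. It is an illustration only, not code from the courses listed: the function name `run_epsilon_greedy`, the Gaussian reward model with unit variance, and the default values of `epsilon`, `steps`, and `seed` are all assumptions chosen for the example.

```python
import random


def run_epsilon_greedy(true_means, epsilon=0.1, steps=1000, seed=0):
    """Sample-average epsilon-greedy on a Gaussian bandit (illustrative sketch).

    true_means: hypothetical mean reward of each arm (unknown to the agent).
    Returns the value estimates, pull counts, and total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # current estimate of each arm's value
    counts = [0] * n_arms       # number of times each arm was pulled
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Assumed reward model: Gaussian noise with unit variance.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental sample-average update of the chosen arm's estimate.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("estimates:", [round(q, 2) for q in estimates])
    print("pull counts:", counts)
```

Setting `epsilon` to 0 makes the agent purely greedy, while setting it to 1 makes it choose uniformly at random; values in between trade off trying under-sampled arms against pulling the arm that currently looks best.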
References
- Sutton, R. S., & Barto, A. G. Reinforcement Learning: An Introduction (2nd Edition)
- University of Alberta - Reinforcement Learning MOOC
  Coursera: Reinforcement Learning Specialization by Professors Adam and Martha White.
- Foundations of Deep RL (2021)
  YouTube Video Series by Professor Pieter Abbeel.
- UC Berkeley - CS 285: Deep Reinforcement Learning (2023)
  YouTube Lecture Series by Professor Sergey Levine.
- Stanford - CS 234: Reinforcement Learning (2024)
  YouTube Lecture Series by Professor Emma Brunskill.
- DeepMind x UCL - Introduction to Reinforcement Learning (2015)
  YouTube Lecture Series by Professor David Silver.
- University of Alberta - CMPUT 365: Introduction to Reinforcement Learning (2023)
  Course Website by Professor Csaba Szepesvári.
1. Introduction to Reinforcement Learning
2. Markov Decision Processes
3. Dynamic Programming
4. Monte Carlo Methods
5. Temporal-Difference Learning
6. Planning and Learning with Tabular Methods
7. Prediction and Control with Function Approximation
8. Policy Gradient Methods