Introduction to Reinforcement Learning

This documentation is intended as a supplementary resource for students and as a reliable, interactive knowledge base for anyone interested in reinforcement learning (RL). It summarizes material from the textbook and courses listed under References.

The following chapters cover the fundamental ideas in reinforcement learning. They start with multi-armed bandits, which introduce the exploration-exploitation trade-off in a simplified setting, and extend to Markov Decision Processes (MDPs), which formalize sequential decision-making problems. We review dynamic programming, Monte Carlo estimation, and temporal-difference learning, and examine how each estimates expected return to update value functions and policies. Along the way, we introduce planning techniques and show how function approximation lets RL scale to larger state spaces. Finally, we cover policy gradient methods, which learn parameterized policies directly, setting the stage for more advanced algorithms and real-world applications.
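As a small preview of the exploration-exploitation trade-off mentioned above, here is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit. The function name `run_epsilon_greedy`, the Gaussian reward model with unit variance, and the arm means are all assumptions chosen for illustration; they are not taken from the referenced courses.

```python
import random


def run_epsilon_greedy(true_means, epsilon=0.1, steps=5000, seed=0):
    """Run epsilon-greedy on a hypothetical Gaussian bandit.

    Returns (estimates, counts): the sample-average value estimate
    and the number of pulls for each arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # current value estimate per arm
    counts = [0] * n_arms       # how often each arm was pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Assumed reward model: Gaussian noise around the arm's true mean.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental sample-average update of the chosen arm's estimate.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


estimates, counts = run_epsilon_greedy([0.2, 0.5, 0.9])
print("estimates:", [round(q, 2) for q in estimates])
print("pull counts:", counts)
```

With `epsilon=0` the agent never explores and can lock onto a suboptimal arm; with `epsilon=1` it never uses what it has learned. Values in between trade the two off.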

References

  1. Sutton, R. S., & Barto, A. G. Reinforcement Learning: An Introduction (2nd Edition)

  2. University of Alberta - Reinforcement Learning MOOC
    Coursera: Reinforcement Learning Specialization by Professors Adam and Martha White.

  3. Foundations of Deep RL (2021)
    YouTube Video Series by Professor Pieter Abbeel.

  4. UC Berkeley - CS 285: Deep Reinforcement Learning (2023)
    YouTube Lecture Series by Professor Sergey Levine.

  5. Stanford - CS 234: Reinforcement Learning (2024)
    YouTube Lecture Series by Professor Emma Brunskill.

  6. DeepMind x UCL - Introduction to Reinforcement Learning (2015)
    YouTube Lecture Series by Professor David Silver.

  7. University of Alberta - CMPUT 365: Introduction to Reinforcement Learning (2023)
    Course Website by Professor Csaba Szepesvári.