Foundations of Deep Reinforcement Learning
This content is a direct translation of the Foundations of Deep RL lecture series from Professor Pieter Abbeel's YouTube channel.
For more detailed coverage of the topics, I highly recommend watching the original videos.
The lecture series is designed to build a strong foundation in deep reinforcement learning, enabling students such as myself to understand current developments and to pursue their own research and applications in the field.
📄️ 1. Foundations of RL
This first post covers Markov Decision Processes (MDPs), exact solution methods, and the maximum entropy formulation. We'll define the main components of an MDP (states, actions, transition probabilities, and rewards) and look at exact dynamic-programming methods such as value iteration and policy iteration. While these methods work well for small-scale problems, they don't scale to larger ones, which motivates the more advanced techniques covered in later posts.
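As a taste of the exact solution methods, here is a minimal value-iteration sketch on a tiny MDP. The two-state, two-action transition probabilities and rewards are invented purely for illustration; only the Bellman backup itself is the point.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP (numbers chosen only for illustration).
# P[s, a, s'] = transition probability, R[s, a] = immediate reward.
n_states, n_actions, gamma = 2, 2, 0.9
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

V = np.zeros(n_states)
for _ in range(500):
    # Bellman optimality backup: Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] V[s']
    Q = R + gamma * P @ V
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:   # stop once the value function has converged
        break
    V = V_new

policy = Q.argmax(axis=1)   # greedy policy with respect to the converged values
```

Because the backup is a gamma-contraction, the loop converges geometrically; the catch, as noted above, is that the table `V` grows with the state space, which is exactly why these exact methods don't scale.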
📄️ 2. Deep Q-Learning
In the previous post, we laid the groundwork by exploring Markov Decision Processes (MDPs) and exact solution methods like value iteration and policy iteration. In this post, we'll build on those concepts to delve into deep Q-learning, an essential component of deep reinforcement learning.
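To connect the two posts, here is a tabular sketch of the Q-learning update that deep Q-learning builds on; the deep variant replaces the table `Q[s][a]` with a neural network. The toy chain environment below is invented for illustration, and the behavior policy is uniformly random, which is legitimate since Q-learning is off-policy.

```python
import random

# Tabular Q-learning on a toy deterministic chain (environment invented for illustration).
n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
alpha, gamma = 0.1, 0.95
Q = [[0.0] * n_actions for _ in range(n_states)]

def step(s, a):
    """Move along the chain; reaching the right end yields reward 1 and ends the episode."""
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    done = (s2 == n_states - 1)
    return s2, (1.0 if done else 0.0), done

random.seed(0)
s = 0
for _ in range(20000):
    a = random.randrange(n_actions)              # uniform exploration (Q-learning is off-policy)
    s2, r, done = step(s, a)
    target = r if done else r + gamma * max(Q[s2])
    Q[s][a] += alpha * (target - Q[s][a])        # temporal-difference step toward the Bellman target
    s = 0 if done else s2

greedy = [max(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states)]
```

After training, the greedy policy moves right in every non-terminal state. The single line computing `target` is the piece that deep Q-learning keeps, while swapping the lookup table for a function approximator trained on that same target.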
📄️ 3. Policy Gradients
These are my rough notes.
📄️ 4. TRPO and PPO
These are my rough notes.
📄️ 5. DDPG and SAC
These are my rough notes.
📄️ 6. Model-based RL
These are my rough notes.