Paper Summaries
📄️ R-max – A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning
Brafman, R., and Tennenholtz, M. 2003. R-max – a general polynomial time algorithm for near-optimal reinforcement learning. J. Mach. Learn. Res., 3, pp. 213–231.
📄️ Bounding-Box Inference for Error-Aware Model-Based Reinforcement Learning
Talvitie, Erin J., et al. “Bounding-Box Inference for Error-Aware Model-Based Reinforcement Learning.” Reinforcement Learning Conference, 2024.