Reinforcement Learning 104: Deep Q-Networks

Deep Q-Networks: DQN, experience replay, prioritized experience replay, target networks, reward clipping, overestimation bias, Double DQN, and Dueling DQN.

July 2, 2026 · 25 min · Mateusz Pieniak

Reinforcement Learning 103: Approximate Methods

Approximate model-free RL: function approximation, regression targets, loss functions, semi-gradient TD, approximate SARSA, Expected SARSA, Q-learning, and the deadly triad.

June 23, 2026 · 19 min · Mateusz Pieniak

Reinforcement Learning 102: Q-learning & SARSA

Model-free RL: Monte Carlo and Temporal Difference control, Q-learning, SARSA, Expected SARSA, and exploration.

June 21, 2026 · 27 min · Mateusz Pieniak

Reinforcement Learning 101: Policy Iteration & Value Iteration

Model-based RL: MDPs, value functions, action-value functions, Bellman equations, and contraction mapping, with full mathematical proofs for Policy Iteration and Value Iteration.

June 17, 2026 · 21 min · Mateusz Pieniak