RLHF Illustrated Guide

Learn Reinforcement Learning from Human Feedback through interactive visualizations, intuitive analogies, and hands-on examples.

Analogy Visuals

These illustrations introduce the narrative lenses used across the guide. Browse them to understand each storytelling perspective.

Retro policy training

Treat RLHF like a classic arcade challenge: policies learn by chasing new high scores.


Draft, edit, refine

Follow the writing student and mentor through iterative feedback and revisions.

∑ reward_t ⋅ γ^t (Proof Verified)

Whiteboard your steps

Trace every deduction on a collaborative whiteboard to keep reasoning grounded.

Push the frontier

Peek at frontier alignment topics like Constitutional AI and tool-use orchestration.

What RLHF does and why it matters

Training reward models from preference data

Core RL optimization techniques