SemiAnalysis podcast

RL Systems Mind the Gap: Matching Trainer and Generator Throughput

Tuesday, June 16, 2026
Kimbo Chen
RL Training Infrastructure, GRPO, PipelineRL, Async RL, Policy Staleness, RL Sandbox Infra, CPU Requirements, TCO Analysis, Thinking Machines Tinker