We’re told that AI progress is slowing down, that pre-training has hit a wall, that scaling laws are running out of road. Yet we’re releasing this episode in the middle of a wild couple of weeks that ...
Łukasz Kaiser, co-author of the Transformer paper and a research scientist at OpenAI, explains why the narrative of an AI slowdown is fundamentally wrong. The conversation shows how reasoning models represent a paradigm shift comparable to the Transformer itself, delivering exponential capability gains through reinforcement learning rather than through scaling pre-training alone. Kaiser offers rare insight into the engineering realities of frontier AI development, from GPU allocation to the surprising limitations that remain (such as failing simple first-grade math puzzles), and argues that the combination of improved reasoning, multimodal capabilities, and tool use will drive the next wave of AI advancement.
Kaiser dismantles the AI slowdown narrative, explaining that while pre-training has reached the upper part of its S-curve, reasoning models represent an entirely new paradigm delivering better results at the same cost. He draws parallels to Moore's Law, where smooth exponential progress masks multiple underlying technology transitions, and explains how reinforcement learning for reasoning is still in its early, high-growth phase.
Kaiser describes the large number of 'obvious' improvements still available in frontier AI development, spanning engineering infrastructure, data quality, synthetic data generation, and multimodal capabilities. He emphasizes that much of this work is unglamorous engineering rather than breakthrough science, but that it will deliver substantial capability gains.
Kaiser provides a technical explanation of reasoning models, distinguishing them from base LLMs through their chain-of-thought generation and reinforcement learning training. He explains how RL allows models to learn verification and self-correction strategies, but currently requires verifiable domains like math and coding, limiting broader application.
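To make the distinction concrete, here is a minimal, self-contained toy in Python. It assumes a caricature "policy" over reasoning strategies rather than a real language model: the loop samples a chain of thought, checks the final answer against a verifier, and upweights whatever led to a verified result. This is an illustration of why verifiable domains like math and coding are the natural starting point for this kind of RL, not a description of OpenAI's training stack.

```python
import random

# Hedged, self-contained toy of RL on a verifiable task (exact-match arithmetic).
# This is NOT OpenAI's pipeline; it only shows the shape of the loop: sample a
# chain of thought, verify the final answer, reinforce whatever verified.

PROBLEMS = [{"prompt": "2 + 3", "expected": "5"}, {"prompt": "7 * 6", "expected": "42"}]

def sample_chain_of_thought(policy, prompt):
    """Stand-in for model sampling: pick a 'reasoning strategy', produce an answer."""
    strategy = random.choices(list(policy), weights=list(policy.values()))[0]
    if strategy == "compute step by step":
        answer = str(eval(prompt))           # the careful strategy gets it right
    else:
        answer = str(random.randint(0, 99))  # the guessing strategy usually fails
    return strategy, answer

def verify(problem, answer):
    """The reward is checkable; this is what makes math/coding tractable for RL."""
    return 1.0 if answer == problem["expected"] else 0.0

def rl_step(policy, problems, samples=8):
    for problem in problems:
        for _ in range(samples):
            strategy, answer = sample_chain_of_thought(policy, problem["prompt"])
            # Crude stand-in for a policy-gradient update: upweight strategies
            # whose answers passed verification.
            policy[strategy] += 0.1 * verify(problem, answer)
    return policy

policy = {"compute step by step": 1.0, "guess": 1.0}
for _ in range(20):
    policy = rl_step(policy, PROBLEMS)
print(policy)  # weight drifts toward the strategy that yields verifiable answers
```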
Kaiser shares the surprisingly distributed origin of the Transformer paper, revealing that the eight co-authors were never all in the same room at once. He explains how different researchers approached the problem from multiple angles: attention mechanisms, knowledge storage, and the critical engineering work needed to make training actually function.
Kaiser describes his transition from Google Brain (which grew from ~40 to 4,000 people during his tenure) to OpenAI during COVID, motivated by the desire to work in smaller teams and the challenges of remote work at a massive organization. He notes that frontier AI labs are more similar to each other than different, with the real gap being between academia and industry.
Kaiser explains how the economics of serving billions of users fundamentally changed AI development priorities. OpenAI shifted from training only the largest possible models to optimizing for cost-effectiveness through smaller models and distillation, while maintaining the ability to scale pre-training when economically justified.
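As a rough illustration of the distillation idea Kaiser mentions, the sketch below shows the standard textbook recipe (Hinton-style soft targets): a small student model is trained to match the output distribution of a large teacher, so serving cost drops while most of the capability is retained. It assumes PyTorch and random logits as stand-ins, and makes no claim about how OpenAI actually distills its production models.

```python
import torch
import torch.nn.functional as F

# Hedged sketch of knowledge distillation (generic recipe, not OpenAI's internals).

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Softening both distributions with a temperature exposes the teacher's
    # relative preferences over tokens, not just its top-1 choice.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between teacher and student; scaled by T^2 as in Hinton et al.
    return F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature**2

# Toy usage with random logits over a 50k-token vocabulary.
teacher_logits = torch.randn(4, 50_000)                       # frozen large model's outputs
student_logits = torch.randn(4, 50_000, requires_grad=True)   # small model being trained
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```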
Kaiser reveals that the evolution from GPT-4 to GPT-5.1 involved less fundamental change than users might think. The biggest shift was adding reasoning via RL and synthetic data, while 5.1 specifically represents mostly post-training improvements around safety, tone control, and reducing hallucinations through better tool use and verification.
Kaiser demonstrates a critical limitation of current reasoning models through a striking example: frontier models that achieve gold medals at Mathematical Olympiads cannot solve simple first-grade visual math puzzles. This 'jagged' capability profile reveals how reasoning models excel in narrow, well-trained domains but struggle with basic multimodal reasoning and in-context learning.
Kaiser frames the central question in AI research: whether reasoning capabilities will be sufficient to achieve human-like generalization, or if fundamentally different architectural approaches are needed. He emphasizes that we won't know until we've exhausted current approaches, comparing it to 'driving fast in a fog.'
Kaiser explains the technical challenges and solutions behind GPT-5.1 Codex Max, designed for week-long software engineering tasks. The system uses context compaction (summarization and selective forgetting) to operate across millions of tokens, while training prevents the model from getting lost in long feedback loops.
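The sketch below illustrates one plausible shape of context compaction, assuming a toy token counter and a placeholder summarizer; the real Codex Max mechanism is not public at this level of detail. The point is only the structure: once the transcript exceeds a budget, older steps are collapsed into a summary while recent steps stay verbatim.

```python
# Hedged sketch of context compaction for a long-running agent (illustrative only).

TOKEN_BUDGET = 8_000   # assumed budget for this toy example
KEEP_RECENT = 20       # most recent messages kept verbatim

def count_tokens(text: str) -> int:
    # Crude proxy: a real system would use the model's tokenizer.
    return len(text.split())

def summarize(messages: list[str]) -> str:
    # Placeholder: in practice this would be a model call that keeps goals,
    # decisions, and open TODOs, and drops tool output that is no longer needed.
    return f"[summary of {len(messages)} earlier steps]"

def compact(history: list[str]) -> list[str]:
    total = sum(count_tokens(m) for m in history)
    if total <= TOKEN_BUDGET:
        return history
    old, recent = history[:-KEEP_RECENT], history[-KEEP_RECENT:]
    return [summarize(old)] + recent

# Simulate a long task whose raw transcript would far exceed the budget.
history: list[str] = []
for step in range(10_000):
    history.append(f"step {step}: ran tests, edited files, got feedback " * 3)
    history = compact(history)
print(len(history), sum(count_tokens(m) for m in history))
```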
Kaiser addresses concerns about AI replacing all work by pointing to persistent limitations (the first-grade math failures), the paradox of the translation industry (which has grown despite automation), and the fundamental question of trust. He argues that while some jobs will change dramatically, there will always be things people want humans to do, especially in high-stakes scenarios.
What’s Next for AI? OpenAI’s Łukasz Kaiser (Transformer Co-Author)