AI Vibe Check: The Actual Bottleneck In Research, SSI’s Mystique, & Spicy 2026 Predictions
Ari Morcos and Rob Toews return for their spiciest conversation yet. Fresh from NeurIPS, they debate whether AI models are truly plateauing or whether we're myopically focused on LLMs while breakthroughs happen elsewhere. They explain why effectively infinite capital at the major labs may constrain innovation, describe the narrow 'Goldilocks zone' where RL works, and argue that U.S. chip restrictions accelerated China's chip self-sufficiency by years. The conversation covers OpenAI's structural vulnerabilities, SSI's mystique around Ilya's 'two words,' and closes with bold 2026 predictions, including Sam Altman's potential departure from OpenAI.
Ari reflects on attending his 11th NeurIPS, noting that attendance has grown from roughly 2,000 in 2015 to 30,000 today. Despite feeling 'late' to the field in 2015, he now believes we're still extremely early in AI. NeurIPS, however, has become more about PR than about sharing work: labs like Meta and Google now prohibit publishing meaningful research, creating a 'Goldilocks zone' where papers must be good enough to pass review but not important enough to matter for their models.
Rob argues models are clearly plateauing, citing smaller incremental gains since GPT-4 and fundamental limitations in continual learning and sample efficiency. Ari counters that we're in an 'LLM bubble,' not an AI bubble: while consumer LLMs may be plateauing, video models and other modalities show no signs of slowing down. The community's myopic focus on language models misses breakthroughs happening elsewhere in AI.
Ari explains that RL only works in a specific 'Goldilocks zone': the model must understand enough that on-policy guesses have some chance of being correct, yet not so much that the sparse reward no longer teaches it anything. Coding hit this zone, but many domains haven't. The challenge isn't just applying RL; it's getting models to the readiness point where RL can be effective, which requires significant pre-work in most enterprise domains.
Ari argues that unlimited funding at major labs is a 'huge detriment' because constraints breed innovation. Chinese labs have achieved incredible results with far fewer resources than individual Western labs because they must innovate on efficiency. When you have infinite capital, throwing more data and compute at problems becomes easier than solving hard research problems around sample efficiency, even though we've known about diminishing returns from scaling laws since Kaplan's foundational work.
Debating recursive self-improvement and the automated 'AI researcher,' Ari challenges whether the bottleneck is actually a lack of ideas or the compute to test them. OpenAI's researchers likely already have great ideas; the constraint is GPU bandwidth and time to experiment. An AI researcher only accelerates progress if it has dramatically higher hit rates than human researchers, much as AI for drug discovery works because current hit rates are 0.0001%. The analogy to bio is apt, but it assumes research hit rates are similarly abysmal.
SSI maintains unprecedented secrecy: no phones are allowed, and employees avoid discussing AI at all for fear of inadvertent reveals. Rob shares an anecdote in which Ilya told an interviewer there are 'two words' that would explain SSI's approach, but refused to share them. Two explanations exist: either they've genuinely kept the next big thing secret, or there's nothing to leak. Ilya's track record (correctly calling scaling in 2018) demands respect, but the aggressive secrecy makes it impossible to evaluate.
OpenAI's 'code red' signals the end of its aura of infallibility as Google demonstrates advantages in talent depth, compute resources, and, crucially, cash position. Google has hundreds of billions of dollars on its balance sheet while OpenAI burns cash at unprecedented rates, projecting $150B of cumulative burn before reaching profitability in 2029 (versus Uber's previous record of $40B). That leaves OpenAI completely dependent on capital markets, unlike Google, which can self-fund indefinitely.
Ari calls the chip restrictions 'one of the stupidest things we've ever done,' arguing they stem from viewing China as a 'copycat factory' unable to innovate, a view that was true in the 1990s but is completely false today. China is genuinely innovating in AI (DeepSeek's papers are must-reads at Western labs), and the restrictions likely accelerated its domestic chip development by 5-10 years. The CCP's ability to plan on 10-40 year horizons, unlike democracies bound to election cycles, makes this a catastrophic strategic error.
Ari reports being 'more bearish' on Meta's superintelligence team after numerous short-tenure departures and cultural challenges. The Llama 4 'purge' removed many exceptional researchers, a sign of organizational dysfunction in which Meta pays a premium for AI talent while simultaneously letting great researchers go. FAIR, once a beloved research org, is 'on its last legs,' with researchers deeply unhappy and seeking exits. The shift from open source (which rehabilitated Zuck's image) to closed source removes key tailwinds.
Rob predicts Sam Altman won't be OpenAI's CEO by the end of 2026, citing the shifting narrative, his scattered focus (BCI, space, chips), and the need for 'straight-laced' leadership ahead of an IPO, drawing a comparison to the Travis-to-Dara transition at Uber. Ari predicts a >50% chance that the world's best model will be Chinese open source at least once in 2026, with a >20% chance it is still the best at year-end. Jacob predicts infrastructure will create more value than applications in 2026 after years of app dominance, as model stability finally enables real infra companies to emerge.