We Invented Momentum Because Math is Hard [Dr. Jeff Beck]
Dr. Jeff Beck, mathematician turned computational neuroscientist, joins us for a fascinating deep dive into why the future of AI might look less like ChatGPT and more like your own brain.
Dr. Jeff Beck argues that scaling transformers alone won't achieve human-like AI: we need brain-inspired, object-centered models grounded in physical reality. He presents a framework combining Bayesian inference with sparse, structured world models that can be trained on smaller datasets, enable continual learning, and support true systems engineering. Key innovations include a 'lots of little models' approach, physics discovery algorithms, and solving the sim-to-real gap by grounding in macroscopic physics rather than language.
Beck explains why he believes the brain operates as a Bayesian inference engine, citing behavioral experiments showing that humans combine multiple sensory cues near-optimally, weighting each by its reliability. The brain constantly processes information to maintain estimates of low-level sensory statistics, even for signals that never reach conscious perception.
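To make the cue-combination claim concrete, here is a minimal sketch (not from the episode; the function name and numbers are illustrative) of the precision-weighted fusion rule that those behavioral experiments test:

```python
import numpy as np

def combine_cues(means, variances):
    """Precision-weighted fusion of independent Gaussian cues.
    The Bayes-optimal estimate weights each cue by its inverse
    variance (precision)."""
    precisions = 1.0 / np.asarray(variances, dtype=float)
    weights = precisions / precisions.sum()
    fused_mean = np.dot(weights, means)
    fused_variance = 1.0 / precisions.sum()  # always <= the best single cue
    return fused_mean, fused_variance

# Hypothetical numbers: a reliable visual cue and a noisy haptic cue
# estimating an object's position (cm).
mean, var = combine_cues(means=[10.0, 14.0], variances=[1.0, 4.0])
print(mean, var)  # 10.8, 0.8 -- pulled toward the reliable cue
```

The signature of optimality the experiments look for is exactly that fused variance: it is lower than the variance of either cue alone.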
Beck uses momentum in physics as an example of how we choose mathematical frameworks for computational convenience rather than because they necessarily reflect reality. Causal models are preferred because they simplify calculations and point to effective intervention points.
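As a toy illustration of that convenience (textbook physics, not from the episode): conservation of momentum and kinetic energy lets us solve a 1D elastic collision in closed form, without ever modeling the forces during contact:

```python
def elastic_collision_1d(m1, v1, m2, v2):
    """Final velocities after a 1D elastic collision, derived purely
    from conservation of momentum and kinetic energy. The messy contact
    forces never have to be modeled -- that is the bookkeeping
    convenience 'momentum' buys us."""
    v1_final = ((m1 - m2) * v1 + 2 * m2 * v2) / (m1 + m2)
    v2_final = ((m2 - m1) * v2 + 2 * m1 * v1) / (m1 + m2)
    return v1_final, v2_final

# Hypothetical example: a 2 kg cart at 3 m/s hits a 1 kg cart at rest.
print(elastic_collision_1d(2.0, 3.0, 1.0, 0.0))  # (1.0, 4.0)
```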
Discussion of why macroscopic causal relationships matter more than microscopic ones: they align with our affordances and our ability to act. Downward causation is justified when we've correctly identified a useful macroscopic variable that renders the microscopic details irrelevant.
Beck argues that automatic differentiation (autograd) was more important than the transformer architecture itself, turning AI development from careful mathematical construction into an engineering problem. This enabled rapid experimentation but shifted attention away from structured, brain-like models.
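As a sketch of why autograd mattered so much: a forward-mode automatic differentiator fits in a few lines, and once derivatives come for free, building models becomes assembly rather than calculus. This toy (dual numbers) is a stand-in for the reverse-mode autograd that real frameworks use:

```python
from dataclasses import dataclass

@dataclass
class Dual:
    """Forward-mode automatic differentiation via dual numbers:
    carry a value and its derivative through every operation."""
    val: float
    dot: float  # derivative with respect to the input

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other, 0.0)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other, 0.0)
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

    __rmul__ = __mul__

def derivative(f, x):
    """Exact derivative of f at x -- no symbolic math, no finite differences."""
    return f(Dual(x, 1.0)).dot

# f(x) = 3x^2 + 2x: derivative at x = 4 is 6*4 + 2 = 26.
print(derivative(lambda x: 3 * x * x + 2 * x, 4.0))  # 26.0
```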
Beck's core thesis: AI needs cognitively inspired models grounded in macroscopic physics, not language. The approach uses sparse, structured, object-centered models that enable systems engineering and creative problem-solving rather than mere pattern matching.
Critique of grounding models in language (as in the LangChain approach) versus grounding them in physical reality. Language is an unreliable representation of both the world and our thought processes; self-report is the least reliable form of experimental data.
Overview of techniques making Bayesian inference tractable at scale: normalizing flows, natural-gradient methods, and improved sampling. The active inference community has historically favored breadth (evangelism) over depth (solving hard problems).
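For a sense of the first of these, here is a minimal sketch of the change-of-variables idea behind normalizing flows, using a single affine layer (real flows stack many learned invertible layers); the parameters are illustrative:

```python
import numpy as np

class AffineFlow:
    """Minimal normalizing flow: an invertible affine map applied to a
    standard normal base distribution. The change-of-variables formula
    turns an otherwise intractable density query into bookkeeping."""
    def __init__(self, scale, shift):
        self.scale, self.shift = scale, shift

    def forward(self, z):           # sample: base noise -> data space
        return self.scale * z + self.shift

    def log_prob(self, x):          # exact density of transformed samples
        z = (x - self.shift) / self.scale
        base_logp = -0.5 * (z**2 + np.log(2 * np.pi))
        return base_logp - np.log(np.abs(self.scale))  # |det Jacobian| term

flow = AffineFlow(scale=2.0, shift=1.0)      # hypothetical parameters
samples = flow.forward(np.random.randn(5))
print(flow.log_prob(samples))                # exact log-densities
```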
Revolutionary approach: instead of one giant model, train thousands of small object-specific models that can be composed. Train on houses and on parks separately, then combine; the book model learned in one context works in the other.
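A hypothetical sketch of how such composition might look (the interfaces and names are ours, not from the episode): each object carries its own small dynamics model, and a scene is just the set of models active in it:

```python
# Hypothetical sketch of the 'lots of little models' idea. Each object
# gets its own small dynamics model; a scene composes whichever object
# models are present, so the same book model works in any context.
class ObjectModel:
    def __init__(self, name, predict):
        self.name = name
        self.predict = predict  # state -> predicted next state

def compose(models):
    """A scene model is a composition of per-object models: each object
    updates its own slice of the state (interactions between objects
    are handled separately -- see the adjacency-matrix sketch below)."""
    def scene_step(state):
        return {m.name: m.predict(state[m.name]) for m in models}
    return scene_step

book = ObjectModel("book", lambda s: s)                  # books just sit there
door = ObjectModel("door", lambda s: min(s + 0.1, 1.0))  # doors swing open

house = compose([book, door])   # trained-on-houses context
park = compose([book])          # trained-on-parks context: same book model
print(house({"book": 0.0, "door": 0.3}), park({"book": 0.0}))
```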
Objects are defined by their interaction patterns, with multiple adjacency matrices representing different interaction types (e.g., distinct forces). Maintaining Bayesian uncertainty about these interactions enables continual learning when novel situations arise.
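One rough way this might be coded up (all names are illustrative): one adjacency matrix per interaction type, with Beta-Bernoulli counts expressing uncertainty about each edge, so that new evidence is just a cheap count update:

```python
import numpy as np

# Hypothetical sketch: one adjacency matrix per interaction type, with
# Beta-Bernoulli uncertainty over each possible edge.
objects = ["robot", "box", "cat"]
interaction_types = ["contact", "support"]
n = len(objects)

# Beta counts per interaction type: shape (type, i, j), uniform prior.
alpha = np.ones((len(interaction_types), n, n))
beta = np.ones((len(interaction_types), n, n))

def observe(kind, i, j, interacted):
    """Update belief about whether objects i and j interact via `kind`."""
    k = interaction_types.index(kind)
    if interacted:
        alpha[k, i, j] += 1
    else:
        beta[k, i, j] += 1

def edge_probability(kind, i, j):
    """Posterior mean probability that the interaction edge exists."""
    k = interaction_types.index(kind)
    return alpha[k, i, j] / (alpha[k, i, j] + beta[k, i, j])

observe("contact", 0, 2, True)            # robot touched the cat once
print(edge_probability("contact", 0, 2))  # ~0.67 -- still quite uncertain
```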
Concrete example of the system's advantages: a warehouse robot encounters a cat it has never seen before. A surprise signal fires, the robot queries a shared model bank, receives candidate models, tests each hypothesis against its observations, and incorporates the cat model that fits best. This demonstrates knowing what you don't know.
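A toy rendering of that loop, with all names, thresholds, and numbers purely illustrative: surprise measured as negative log-likelihood under the current models, a query to the model bank when it spikes, and adoption of the best-scoring hypothesis:

```python
import numpy as np

def surprise(observation, model):
    """Surprise as negative log-likelihood under a model."""
    return -model.log_likelihood(observation)

class GaussianModel:
    def __init__(self, name, mean, std):
        self.name, self.mean, self.std = name, mean, std
    def log_likelihood(self, x):
        return float(-0.5 * ((x - self.mean) / self.std) ** 2
                     - np.log(self.std * np.sqrt(2 * np.pi)))

known = [GaussianModel("box", mean=0.0, std=1.0)]     # what the robot knows
bank = [GaussianModel("cat", mean=5.0, std=1.0),      # shared model bank
        GaussianModel("dog", mean=8.0, std=1.0)]

obs = 5.2                                        # something box-unlike appears
if min(surprise(obs, m) for m in known) > 4.0:   # surprise threshold
    best = max(bank, key=lambda m: m.log_likelihood(obs))
    known.append(best)             # adopt the best-fitting hypothesis
    print("adopted:", best.name)   # -> adopted: cat
```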
Current robotics fails to transfer from simulation to reality because game engines prioritize visual plausibility over physical accuracy, and robot 'brains' lack world structure. Both accurate physics simulators and structured internal models are needed.
Critique of reward-based alignment: reward functions are arbitrary and lead to degenerate behavior. Humans align by discussing their beliefs, separating disagreements about beliefs from disagreements about values. Beck proposes using AIs as oracles, or solving alignment through explicit belief models that make this separation possible.
Discussion of cellular automata as Turing-complete systems with emergent properties. Beck focuses less on how emergence arises from simple rules and more on the mathematical properties of the resulting macroscopic objects, which aligns with human cognitive biases.
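For readers who want to experiment, a complete Rule 110 automaton (one of the Turing-complete cellular automata in question) fits in a few lines; the emergent macroscopic 'objects' are the persistent patterns in its output:

```python
# Rule 110: a one-dimensional cellular automaton proven Turing-complete.
# A few lines of local rules produce persistent macroscopic structures.
RULE = 110

def step(cells):
    """Apply Rule 110: each cell's next state is the rule bit indexed
    by its 3-cell neighborhood (with wraparound at the edges)."""
    n = len(cells)
    return [
        (RULE >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

cells = [0] * 31 + [1] + [0] * 31   # single live cell in the middle
for _ in range(20):
    print("".join(".#"[c] for c in cells))
    cells = step(cells)
```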
Program synthesis has promise, but current approaches lack datasets of well-written programs. Tony Zador's work on genetically encoding neural networks suggests a path forward: learn patterns across solutions to traverse architecture space sensibly.