
Google AI: Release Notes

Koray Kavukcuoglu: “This Is How We Are Going to Build AGI”

November 25, 2025•48m•9,342 words•Google

Description

Join Logan Kilpatrick and Koray Kavukcuoglu, CTO of Google DeepMind and Chief AI Architect of Google, as they discuss Gemini 3 and the state of AI! Their conversation includes the reception of Gemin...

Summary

Koray Kavukcuoglu, CTO of Google DeepMind and Chief AI Architect of Google, discusses Gemini 3's positive reception and the philosophy behind building AGI through real-world product integration. He emphasizes that innovation—not just scaling—is critical for continued progress, and that Google is co-building AGI with customers through tight integration between research and products. The conversation covers key focus areas including instruction following, internationalization, tool calling, and agentic coding capabilities, while highlighting DeepMind's evolution from a research lab to an engineering-driven organization shipping state-of-the-art models.

Jump to Topic

Gemini 3 Reception and Benchmark Philosophy

Discussion of Gemini 3's positive reception and the role of benchmarks in measuring progress. Koray explains that benchmarks naturally saturate as technology advances, requiring new frontiers to be defined. Real-world user feedback from products like the Gemini app and AI Studio is emphasized as the most important measure of progress beyond static benchmarks.

•Scores on benchmarks like HLE and ARC-AGI have risen from 1-2% to 40%+, showing rapid advancement
•GPQA Diamond remains challenging but is approaching saturation, indicating the need for new benchmarks
•Real-world usage across scientists, students, lawyers, and engineers is the ultimate measure of progress
•Benchmarks and model development must evolve hand-in-hand as technology pushes new frontiers

Critical Focus Areas: Instruction Following, Internationalization, and Tool Use

Koray outlines the key areas where Gemini models are optimized: instruction following, internationalization, function/tool calls, and code generation. He emphasizes that tool calling and coding are force multipliers for intelligence, enabling models to integrate with existing tools and allowing creative people to become builders through vibe coding.

•Instruction following ensures models understand and execute user requests accurately
•Internationalization is critical for Google's global reach—Gemini 3 Pro shows major improvements in historically underserved languages
•Function/tool calls enable models to use existing tools and write their own, multiplying intelligence
•Code generation democratizes building—'vibe coding' transforms creative people into productive builders
•Code is the basis for integrating with most digital activities, making it essential for broad reach

Product Integration as Core to AGI Development Strategy

Koray explains his dual role as CTO of Google DeepMind and Chief AI Architect of Google, emphasizing that building AGI requires tight integration with products and users. Antigravity, AI Studio, AI Mode in Search, and other products serve as critical launch partners providing essential user feedback that drives model improvements.

•Building AGI requires co-development with customers through real-world products, not isolated lab research
•Antigravity provided instrumental feedback in the last 2-3 weeks before the Gemini 3 launch
•Product integration enables direct user signals that are critical for understanding where models need improvement
•The engineering mindset—thinking about safety and security from the ground up—is essential for real-world deployment
•Safety and security teams are embedded in post-training processes, not added at the end

Team Google: Massive Cross-Functional Collaboration

Discussion of how Gemini 3 represents a massive Team Google effort spanning continents and thousands of contributors. The model shipped simultaneously across multiple products (AI Mode, the Gemini app, Antigravity) on day one, requiring unprecedented coordination across DeepMind and Google product teams globally.

•Gemini releases involve teams across Europe, Asia, and globally—not just DeepMind
•Simultaneous shipping with AI Mode, the Gemini app, and other products requires product teams to be integrated during development
•This level of coordination is only possible because products work together with research during the development cycle
•The scale rivals major programs like Apollo in terms of contributor count and coordination complexity

Agentic Coding and Tool Use: The Next Frontier

Koray identifies agentic actions and coding as major growth areas where significant room for improvement remains. While Gemini 3 serves 90-95% of coding use cases well, there is still work to be done. The focus has evolved from pure multimodal capabilities to real-world integration through products like Antigravity.

•Agentic actions and coding represent the most exciting growth areas with substantial room for improvement
•Gemini 3 serves 90-95% of coding use cases well but isn't perfect—specific gaps remain
•Historical focus on multimodal capabilities has shifted toward real-world product integration and user feedback
•The journey from research environment to engineering mindset has been critical for progress in these areas

Generative Media Models: Nano Banana and Convergence

Discussion of generative image/video models and their convergence with text models. Koray explains how Nano Banana and Nano Banana Pro demonstrate natural technology convergence, with architectures becoming more unified. The Pro version leverages Gemini 3 Pro architecture for complex document understanding and infographic generation.

•Generative image models date back 10-20 years in AI research, but text became the domain with faster progress
•Technology convergence is happening naturally as architectures align between text and image domains
•Nano Banana Pro uses the Gemini 3 Pro architecture tuned for image generation, showing a family-of-models approach
•Complex use cases: feeding large document sets, asking questions, and generating infographics all work together
•Input-output multimodality is becoming unified, though full convergence requires more innovation

Path to Unified Multimodal Models

Koray discusses the technical challenges of creating truly unified models that handle text, code, and image generation in a single checkpoint. The output space is critical for learning signals, and achieving pixel-perfect image quality while maintaining text/code performance requires significant innovation.

•Architectures are aligning between modalities, making unified models increasingly feasible
•Output space is critical—current learning signals come primarily from code and text
•Image generation requires both pixel-perfect quality and conceptual coherence, making it harder to train jointly
•Scientific method applies: hypotheses are tested, sometimes they work, sometimes they don't
•Near future will see more convergence, but it requires finding the right model innovations

DeepMind's Evolution: From Research Lab to Engineering Organization

Koray reflects on his journey as the first deep learning researcher at DeepMind in 2012 and the organization's evolution. The team has learned to organize large-scale efforts (from 25-person papers to 2,500+ contributors), maintain its research culture while adopting an engineering mindset, and balance exploration with execution.

•Joined DeepMind in 2012 as first deep learning researcher when AI-focused startups were rare
•Early DeepMind pioneered 25-person research papers when that was uncommon in academia
•Gemini 3 has 2,500+ contributors across Google, showing massive scale-up in coordination
•Learned to organize around specific missions through DQN, AlphaGo, AlphaZero, AlphaFold projects
•Merged research culture with engineering mindset: main line of models with systematic exploration

Innovation as the Critical Risk Factor

Koray emphasizes that running out of innovation is the biggest risk for Gemini, not execution or scaling. He rejects the notion that the recipe is figured out, stressing that building AGI requires continuous innovation at multiple scales—within Gemini, across DeepMind, and in Google Research.

•Biggest risk for Gemini is running out of innovation, not execution challenges
•Does not believe the recipe is figured out—building intelligence requires ongoing innovation
•Innovation happens at different scales: within Gemini project, across DeepMind, and in Google Research
•Deep Think models demonstrate the exploration approach: using challenging targets (IMO, ICPC) to evolve general capabilities
•Gemini is the goal (intelligence), not a specific architecture—architecture will evolve through innovation

Team Culture and the Underdog Journey

Koray discusses the importance of team culture, trust, and humility in tackling hard scientific problems. He acknowledges Google was catching up in LLMs 2.5 years ago but has reached the leadership group through innovation. The team's ability to work together across exhausting challenges while maintaining focus on building intelligence the right way is emphasized.

•Team culture of trust, giving people opportunities, and tackling challenging technical problems together is essential
•Honest about being behind in LLMs 2.5 years ago—had to catch up and innovate unique solutions
•Now in leadership group with good rhythm and dynamic, but still more work to do
•Google's scale and full-stack approach (data centers to chips to networking) is an advantage, not a burden
•Goal is building intelligence the right way—putting all minds and innovation toward that mission
