# How OpenAI Builds for 800 Million Weekly Users: Model Specialization and Fine-Tuning
In this episode, a16z GP Martin Casado sits down with Sherwin Wu, Head of Engineering for the OpenAI Platform, to break down how OpenAI organizes its platform across models, pricing, and infrastructure.
Sherwin Wu, Head of Engineering for OpenAI's Platform, discusses how OpenAI has evolved from pursuing a single general-purpose model to building a portfolio of specialized models and customization tools. The conversation covers OpenAI's unique dual strategy of running both ChatGPT (800M weekly users) and a developer API, the shift from prompt engineering to context engineering, and how fine-tuning APIs with reinforcement learning enable companies to leverage proprietary data. Wu also explains why models resist abstraction, making them 'anti-disintermediation technology,' and how deterministic agent workflows are proving more practical than fully autonomous AI for many enterprise use cases.
Discussion of OpenAI's unusual dual strategy of operating both ChatGPT (reaching 800M weekly users, 10% of global population) and a developer API that powers competitors. Wu explains how growth and mission alignment reduce internal tension, and introduces the concept of models as 'anti-disintermediation technology' that resist abstraction.
Wu describes the industry's dramatic shift from believing in a single AGI model to embracing model proliferation and specialization. He explains how different models excel at different tasks (GPT-5 for planning, Composer for fast iteration, Codex for coding) and why this diversity is actually beneficial for the ecosystem.
Deep dive into OpenAI's fine-tuning capabilities, particularly the new reinforcement fine-tuning (RFT) API that allows companies to leverage proprietary data to create world-class specialized models. Wu explains the evolution from basic supervised fine-tuning to RL-based customization and the potential for data-sharing arrangements.
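To make the API surface concrete, here is a minimal sketch of kicking off a reinforcement fine-tuning job with the OpenAI Python SDK. The training file, grader configuration, and model snapshot are illustrative placeholders, not details from the episode; consult OpenAI's fine-tuning documentation for the exact grader schema a given use case needs.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative: proprietary examples uploaded as JSONL ahead of time.
training_file = client.files.create(
    file=open("proprietary_tasks.jsonl", "rb"),  # placeholder dataset
    purpose="fine-tune",
)

# RFT scores sampled outputs with a grader instead of imitating labeled
# completions. This grader is a bare-bones exact-string check; real
# deployments typically use richer (e.g., model-based) graders.
job = client.fine_tuning.jobs.create(
    model="o4-mini-2025-04-16",  # assumed RFT-capable snapshot
    training_file=training_file.id,
    method={
        "type": "reinforcement",
        "reinforcement": {
            "grader": {
                "type": "string_check",
                "name": "exact_match",
                "input": "{{sample.output_text}}",
                "reference": "{{item.correct_answer}}",
                "operation": "eq",
            },
        },
    },
)
print(job.id, job.status)
```

The job runs asynchronously; once it finishes, the resulting fine-tuned model ID can be passed as the `model` parameter in ordinary inference calls.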
Wu explains how the industry's thinking has evolved from believing prompt engineering would become obsolete to recognizing context engineering as critical. The focus has shifted from crafting perfect prompts to designing what tools, data, and retrieval mechanisms models have access to.
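To illustrate the distinction: under context engineering, the prompt string stays simple, and the effort goes into choosing which retrieved data and tools accompany each call. A minimal sketch against the Chat Completions API; `retrieve_docs` and the `lookup_contract` tool are hypothetical stand-ins for whatever retrieval and tooling a real system would wire in.

```python
from openai import OpenAI

client = OpenAI()

def retrieve_docs(query: str) -> str:
    """Hypothetical retrieval step: return the top-k internal document
    snippets relevant to the query (vector store, keyword search, etc.)."""
    return "...top-k document snippets..."

question = "What is our refund policy for enterprise contracts?"

# The "engineering" is in the context: what the model can see (retrieved
# docs) and what it can do (declared tools), not in a clever prompt.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user",
         "content": f"Context:\n{retrieve_docs(question)}\n\nQuestion: {question}"},
    ],
    tools=[{
        "type": "function",
        "function": {
            "name": "lookup_contract",  # hypothetical tool
            "description": "Fetch a customer's contract terms by account ID.",
            "parameters": {
                "type": "object",
                "properties": {"account_id": {"type": "string"}},
                "required": ["account_id"],
            },
        },
    }],
)
print(response.choices[0].message)
```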
Discussion of OpenAI's pricing strategy, including why usage-based pricing is a 'one-way ratchet' that companies never abandon once adopted. Wu explains the cost-plus approach for API pricing and explores why outcome-based pricing may not be necessary when test-time compute already correlates with value.
Wu addresses OpenAI's open source strategy, explaining why releasing GPT-OSS doesn't create cannibalization risk and actually strengthens the ecosystem. He discusses the distinction between open weights and true open source, and why inference difficulty creates natural moats.
Wu explains OpenAI's Agent Builder product and the surprising discovery that most enterprise work requires deterministic, SOP-driven workflows rather than fully autonomous agents. He discusses two types of work: knowledge-based (like coding) versus procedural (like customer support), and why regulated industries need constrained model behavior.
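As a sketch of the deterministic, SOP-driven pattern Wu describes (an illustration of the pattern, not OpenAI's Agent Builder itself): the branching logic lives in ordinary code, and the model is invoked only at fixed, narrowly scoped steps. The downstream actions below are hypothetical placeholders.

```python
from openai import OpenAI

client = OpenAI()

def classify(ticket: str) -> str:
    """Constrained model step: the model only picks a category."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Classify the support ticket as exactly one of: refund, bug, other."},
            {"role": "user", "content": ticket},
        ],
    )
    return resp.choices[0].message.content.strip().lower()

def handle_ticket(ticket: str) -> str:
    """Deterministic SOP: the procedure, not the model, decides what
    happens next, so every possible path is enumerable in advance."""
    category = classify(ticket)
    if category == "refund":
        return "escalate_to_billing"
    if category == "bug":
        return "file_engineering_ticket"
    return "route_to_human_agent"

print(handle_ticket("I was double-charged last month."))
```

The trade-off is flexibility for auditability, which is why this shape suits the regulated, procedural work Wu contrasts with open-ended knowledge work.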