At 22, Brendan Foody is both the youngest Conversations with Tyler guest ever and the youngest unicorn founder on record. His company Mercor hires the experts who train frontier AI models—from poets g...
Brendan Foody, 22-year-old CEO of Mercor (fastest-growing unicorn ever), discusses how his company hires experts to train frontier AI models. The conversation covers AI evaluation methodologies, the future of knowledge work shifting toward building RL environments, labor market efficiency, and concrete insights on hiring practices. Key revelations include 30% annual improvement in AI performance on economically valuable tasks, why rubrics matter more than raw data, and how most knowledge workers will transition from doing repetitive analysis to training AI agents within five years.
Mercor hires domain experts (poets, economists, lawyers) to create evaluation rubrics for AI models rather than just providing raw text. The key insight is that experts teach models once, which then scales to billions of users. Discussion covers how to identify good poets, the importance of some disagreement among graders, and why rubrics are more valuable than raw training data.
Mercor's research with top domain experts finds frontier models improving 30% annually on economically valuable tasks, with GPT-5 scoring 64% overall. The methodology involves surveying hundreds of experts about how they spend their time, then creating corresponding prompts and rubrics weighted by economic value (salary or customer willingness to pay).
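To make the weighting idea concrete, here is a minimal sketch of an economic-value-weighted score. It is not Mercor's actual method: the `TaskResult` type, the task names, the weights, and the scores are all hypothetical and chosen only to show how weighting can move an overall number away from a simple average.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    name: str               # hypothetical task label
    economic_weight: float  # hypothetical weight, e.g. salary-weighted time share
    rubric_score: float     # fraction of rubric criteria met, in [0, 1]


def weighted_score(results: list[TaskResult]) -> float:
    """Return the mean rubric score, weighted by each task's economic weight."""
    total_weight = sum(r.economic_weight for r in results)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(r.economic_weight * r.rubric_score for r in results)
    return weighted_sum / total_weight


# Made-up example: the high-value task is the one the model does worse on.
results = [
    TaskResult("draft contract clause", economic_weight=3.0, rubric_score=0.5),
    TaskResult("summarize filing", economic_weight=1.0, rubric_score=0.9),
]
overall = weighted_score(results)
unweighted = sum(r.rubric_score for r in results) / len(results)
print(f"weighted: {overall:.2f}, unweighted: {unweighted:.2f}")
```

In this made-up case the weighted score comes out below the unweighted mean, because the task carrying most of the economic weight is the one with the lower rubric score.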
Deep philosophical discussion on whether to enshrine current aesthetic standards or historical ones. Foody argues for eventually modeling taste from every era, allowing personalization. Explores the tension between what average users prefer versus what top experts consider quality, and whether we should model historical poets like Milton rather than contemporary ones.
Foody predicts that within 5 years, most high-end knowledge workers will transition from doing repetitive tasks to building RL environments and training agents. This represents a shift similar to software development - fixed cost investment in teaching agents, then unlimited reuse. Society will become a massive reinforcement learning machine.
Concrete discussion of what data would be most valuable for training models. Foody argues that evaluation data (rubrics, test questions with answers, unit tests) is more valuable than raw output data. Proposes that economics journals should send referee reports and submissions, and top seminars should be recorded and anonymized.
Foody explains Mercor's hiring methodology, which uses their own AI technology. The key mistake most companies make is not measuring actual job skills, instead relying on vibe-based conversations. For technical roles, give candidates projects and grade them. For non-technical roles, drill into similar past experiences and talk to references.
The fundamental inefficiency in labor markets is disaggregation - candidates apply to dozens of jobs, companies consider a tiny fraction of candidates. LinkedIn has distribution but lacks effective matching. The solution requires AI-powered matching combined with the shift toward fractional, remote work and model training.
Critical framework for understanding job displacement. Software engineering is highly price-elastic: a 10x efficiency gain may lead to 10x more engineers building 100x more software. Other domains, like accounting or customer support, may have inelastic demand. Early-stage capital allocation and product distribution also show high elasticity.
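The elasticity argument can be checked with a small worked example. This is a simplified sketch, not a model from the episode: it assumes constant-elasticity demand and that the price per unit of output falls by the same factor as the efficiency gain. The function name `labor_multiplier` and the elasticity values are illustrative assumptions.

```python
def labor_multiplier(efficiency_gain: float, elasticity: float) -> tuple[float, float]:
    """Return (output multiplier, labor multiplier) after an efficiency gain.

    Assumptions (illustrative only):
    - price per unit of output falls by the same factor as efficiency_gain
    - quantity demanded scales as price ** (-elasticity)
    """
    if efficiency_gain <= 0:
        raise ValueError("efficiency_gain must be positive")
    output = efficiency_gain ** elasticity  # demand response to the price drop
    labor = output / efficiency_gain        # each worker now produces more
    return output, labor


# Elastic demand (elasticity 2): 10x efficiency gives 100x output and 10x labor.
elastic_output, elastic_labor = labor_multiplier(10.0, 2.0)

# Inelastic demand (elasticity 0.5): output grows, but less labor is needed.
inelastic_output, inelastic_labor = labor_multiplier(10.0, 0.5)

print(elastic_output, elastic_labor)
print(round(inelastic_output, 2), round(inelastic_labor, 2))
```

Under these assumptions, labor demand rises only when elasticity exceeds 1. An elasticity of 2 reproduces the "10x more engineers building 100x more software" case, while an elasticity below 1 means the same efficiency gain reduces the labor required.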
Discussion of the strong statistical correlation between dyslexia and entrepreneurship. Foody's theory: dyslexic people learn to delegate early out of necessity, a critical founder skill that competent people often don't develop until later. Also forces focus on comparative advantages and big-picture thinking rather than getting lost in details.
Mercor's next focus is building evaluations for long-horizon tasks (days/weeks) with multiple tool integrations - bridging the gap between model intelligence and actual enterprise usefulness. The past two years focused on intelligence; the next phase is about practical utility in real workflows.
Brendan Foody on Teaching AI and the Future of Knowledge Work