If your MCP server has dozens of tools, it's probably built wrong. You need tools that are specific and clear for each use case, but you also can't have too many. This creates an almost impossible trade-off.
Alex Rattray, CEO of Stainless (the API company behind OpenAI and Anthropic's SDKs), reveals why current MCP implementations are fundamentally broken and proposes a radical solution: code execution sandboxes instead of traditional tool calls. He shares how his team uses AI internally with custom Git-based knowledge repositories, explains the critical context window and security challenges plaguing MCP adoption, and outlines a vision where AI agents write and execute code directly against APIs rather than making dozens of individual tool calls.
Alex explains APIs as the 'dendrites of the internet'—the fundamental connections that enable all modern software. He positions Stainless's core mission as making computer-to-computer communication easier, which naturally extends to enabling LLMs to interact with APIs through MCP (Model Context Protocol).
Alex reveals the fundamental problem with MCP: to replicate what humans can do in a dashboard, you'd need to expose hundreds of API endpoints as tools, which burns through the entire context window and confuses models. Current MCP servers are severely limited compared to their web UI counterparts, restricting AI capabilities to just a few operations.
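To make the scale of that problem concrete, here is a rough illustration (not from the episode) of what a single MCP tool definition looks like; the endpoint name and schema are hypothetical. Each definition costs a few hundred tokens, so mirroring hundreds of endpoints quickly swamps the context window.

```typescript
// Hypothetical MCP tool definition for one endpoint out of hundreds.
// The shape (name, description, JSON Schema inputSchema) follows the MCP spec.
const listInvoicesTool = {
  name: "list_invoices",
  description:
    "List invoices for a customer. Supports filtering by status and creation date.",
  inputSchema: {
    type: "object",
    properties: {
      customer_id: { type: "string", description: "Customer to list invoices for" },
      status: { type: "string", enum: ["draft", "open", "paid", "void"] },
      created_after: { type: "string", format: "date-time" },
      limit: { type: "integer", maximum: 100 },
    },
    required: ["customer_id"],
  },
};

// Rough arithmetic: ~300 tokens per definition x 400 endpoints ≈ 120,000 tokens
// consumed before the model has made a single call.
```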
Alex shares how he personally uses MCP servers (Notion, HubSpot, Gong, Postgres) to query business data across multiple systems. He maintains a Git repository where Claude stores curated notes, customer quotes, and SQL queries for future reference, creating a persistent knowledge base that reduces repeated MCP calls.
Alex outlines current best practices: keep tool counts low, make descriptions precise and specific, minimize input parameters, return minimal response data, and invest heavily in evaluation systems. He emphasizes the need for product management discipline to identify high-value use cases rather than trying to expose entire APIs.
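A minimal sketch of what those practices can look like with the MCP TypeScript SDK; the tool name, the CRM client, and its search call are hypothetical stand-ins, not something described in the episode. The point is one narrow, precisely described tool with few inputs and a deliberately trimmed response.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

declare const crm: any; // hypothetical CRM SDK client, injected elsewhere

const server = new McpServer({ name: "crm-tools", version: "0.1.0" });

// One narrow, high-value tool instead of mirroring the whole API surface.
server.tool(
  "find_open_deals_for_company",
  "Return the open deals for a single company, newest first. Use when the user asks about pipeline for a specific account.",
  { company_domain: z.string().describe("Company website domain, e.g. acme.com") },
  async ({ company_domain }) => {
    // Hypothetical SDK call; any real integration would differ.
    const deals = await crm.deals.search({ domain: company_domain, status: "open" });
    // Return only the fields the model needs, not the raw API payload.
    const trimmed = deals.map((d: any) => ({ name: d.name, amount: d.amount, stage: d.stage }));
    return { content: [{ type: "text", text: JSON.stringify(trimmed) }] };
  },
);
```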
Alex proposes a revolutionary approach: instead of dozens of MCP tools, give models just two—one to execute TypeScript code using the API's SDK, and one to search documentation. This reduces context usage to ~1,000 tokens upfront, eliminates pagination overhead, and leverages models' superior code-writing abilities over tool selection.
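A sketch of that two-tool server, again using the MCP TypeScript SDK; `runInSandbox` and `searchDocs` are placeholders for an isolated runtime and a documentation index, which the episode does not prescribe.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

// Placeholders: an isolated TypeScript runtime and a doc-search backend.
declare function runInSandbox(code: string): Promise<{ output: string }>;
declare function searchDocs(query: string): Promise<string[]>;

const server = new McpServer({ name: "api-code-exec", version: "0.1.0" });

// Tool 1: run model-written TypeScript against the API's SDK in a sandbox.
server.tool(
  "execute_code",
  "Execute TypeScript that uses the pre-installed API SDK. Returns stdout and the value of the final expression.",
  { code: z.string().describe("TypeScript source to run") },
  async ({ code }) => {
    const result = await runInSandbox(code);
    return { content: [{ type: "text", text: result.output }] };
  },
);

// Tool 2: let the model look up API/SDK docs on demand instead of loading them upfront.
server.tool(
  "search_docs",
  "Full-text search over the API reference and SDK docs. Returns the most relevant sections.",
  { query: z.string() },
  async ({ query }) => {
    const hits = await searchDocs(query);
    return { content: [{ type: "text", text: hits.join("\n\n") }] };
  },
);
```

With only these two definitions in context, the model can write its own loops and filters in code rather than paginating through many individual tool calls.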
Alex argues that security must happen at the API layer through OAuth with granular permissions, not by limiting MCP tool exposure. The code execution sandbox should restrict network access to only approved API endpoints, preventing models from making unauthorized external connections.
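One way such an egress restriction might look inside the sandbox, as an assumed design rather than anything specified in the episode: the sandboxed code receives a patched `fetch` that only reaches an approved API host, with the caller's OAuth token attached outside the model's control.

```typescript
// Hypothetical approved API host; real deployments would derive this from config.
const ALLOWED_HOSTS = new Set(["api.example.com"]);

export function makeSandboxFetch(accessToken: string): typeof fetch {
  return async (input, init) => {
    const url = new URL(
      typeof input === "string" ? input : input instanceof URL ? input.toString() : input.url,
    );
    // Block any network access outside the approved API endpoints.
    if (!ALLOWED_HOSTS.has(url.hostname)) {
      throw new Error(`Network access to ${url.hostname} is not permitted in this sandbox`);
    }
    // Attach the caller's OAuth token; its granular scopes enforce permissions at the API layer.
    const headers = new Headers(init?.headers);
    headers.set("Authorization", `Bearer ${accessToken}`);
    return fetch(url, { ...init, headers });
  };
}
```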
Dan pushes Alex on go-to-market strategy, arguing that AI products that win (Stable Diffusion, Claude Code) are those willing to be YOLO early, while cautious approaches (DALL-E's private beta, Codex CLI's restrictions) fall behind. Individual developers need access now, even with security trade-offs.
Alex envisions a future where AI-written code for one-off tasks (like refunding a customer) becomes production software. When the same task repeats, the AI commits the code to the repo, turning exploratory chat interactions into permanent automation. Tool building becomes purely prompt engineering.
MCP Servers: Teaching AI to Use the Internet Like Humans