What Anthropic's MCP Launch Got Right — And What's Still Missing in 2026
An architectural review of the Model Context Protocol: its standardizing triumphs, multi-tenant auth challenges, and the path to edge-hosted agent ecosystems.
The Birth of an Integration Standard
When Anthropic open-sourced the Model Context Protocol (MCP), it sent shockwaves through the AI developer ecosystem.
Before MCP, the world of LLM tool-calling was a fragmented mess. If you wanted to build an integration for an AI assistant, you had to write custom tool schemas and glue code for every individual framework. A tool built for LangChain could not run inside LlamaIndex, Cursor, or Claude Desktop without a complete rewrite.
Anthropic solved this fragmentation by introducing a universal, open standard: MCP.
By standardizing how clients, hosts, and servers exchange tool definitions and resources over a lightweight JSON-RPC 2.0 protocol, MCP did for AI tools what LSP (Language Server Protocol) did for IDE compilers.
Yet, as we build increasingly complex, multi-user transactional agents in 2026, we are bumping against the hard architectural limits of the original MCP specification.
This post provides a deep architectural review of what Anthropic's MCP launch got right, and the three critical Model Context Protocol limitations that developers must solve to scale agents in production.
What MCP Got Right: The Core Triumphs
Anthropic's design of the MCP protocol is brilliant in its simplicity and focus. They nailed three architectural patterns that have solidified MCP as the industry standard:
1. Simple JSON-RPC 2.0 over Stdio and SSE
MCP uses a highly lightweight, transport-agnostic JSON-RPC 2.0 interface. It natively supports two communication channels:
- Stdio (Standard Input/Output): Perfect for local agents. Your IDE editor (like Claude Code) spawns the MCP server as a local subprocess, writing JSON requests to
stdinand reading tool execution responses fromstdout. - SSE (Server-Sent Events): Ideal for web applications. The client establishes a persistent HTTP connection to receive real-time tool events.
2. Standardized Resource and Prompt Templates
MCP goes beyond simple "tool calling." It establishes standard schemas for:
- Resources: Exposing static files, databases, or schemas directly to the LLM context.
- Prompts: Pre-defined system instructions and user templates that the model can fetch dynamically.
3. Local Client Decoupling
By decoupling the integration code from the LLM framework, you can register an MCP server once, and it works instantly across Claude Desktop, Claude Code, Cursor, Cline, or custom Python scripts without writing a single line of SDK glue code.
What's Still Missing: The Three Critical Limitations
While MCP is perfect for running local scripts on a single laptop, it falls apart when you try to scale it to a multi-tenant web application.
Here are the three critical architectural limitations of the current MCP specification:
┌─────────────────────────────────────────────────────────────────────────┐
│ The 3 Missing Pillars of Multi-Tenant MCP │
├─────────────────────────────────────┬───────────────────────────────────┤
│ 1. Multi-Tenant Auth │ Current: Plaintext local tokens │
│ │ Wanted: PKCE Token Isolation │
├─────────────────────────────────────┼───────────────────────────────────┤
│ 2. Edge Performance │ Current: Slow local subprocesses │
│ │ Wanted: Edge CDN routing (CF) │
├─────────────────────────────────────┼───────────────────────────────────┤
│ 3. Discoverability │ Current: Hardcoded local files │
│ │ Wanted: Central Dynamic Registry │
└─────────────────────────────────────┴───────────────────────────────────┘
Limitation 1: The Multi-Tenant Authentication Gap
In the current local MCP model, server credentials (like Slack access tokens or Stripe API keys) are stored in plain-text inside your local configuration files (e.g. ~/.claude.json or claude_desktop_config.json):
{
"mcpServers": {
"slack": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-slack"],
"env": {
"SLACK_BOT_TOKEN": "xoxb-your-plaintext-token"
}
}
}
}
This works for a developer's local machine, but it is a security disaster for multi-user SaaS applications. If you build a web app where 10,000 customers connect their own Slack accounts, you cannot spin up 10,000 separate local subprocesses, nor can you store 10,000 plaintext master tokens on your primary agent server.
The protocol completely lacks a native standard for delegated, vaulted token authorization (OAuth with PKCE).
Limitation 2: Subprocess Execution Overhead
Local stdio MCP servers require spawning a new OS subprocess for every tool call. This introduces significant CPU overhead and execution latency (often 500ms–1.5s per tool call), which scales poorly for high-traffic web backends.
To support real-time interactions, the integration layer must run in serverless, edge-computed environments (like Cloudflare Workers) where cold starts are under 10ms and API translations resolve in under 50ms.
Limitation 3: The Discovery and Registry Gap
There is no central, trusted registry for MCP servers. Developers are forced to browse random GitHub repositories or hardcode config paths manually. There is no standard for dynamic discovery: if an agent lands on a new API, it cannot automatically query a registry to resolve the correct tool mappings.
How wmcp.sh Solves the Missing Pieces
At wmcp.sh, we designed our edge-hosted gateway specifically to address these three critical limitations:
- PKCE Token Isolation Proxying: We keep raw integration keys out-of-band. The agent server is only issued a lightweight, scoped session token, and all tool requests route through our hardware-isolated vault proxy.
- Edge Routing (Cloudflare Workers): Our gateway is compiled and hosted on Cloudflare Workers, providing sub-50ms execution times worldwide and bypassing slow local subprocess starts.
- The Storefront & API Registry: We maintain a central, validated directory of public API and e-commerce schemas, enabling agents to dynamically discover and map tools on the fly.
Here is a Python script illustrating how easily a client can query tools dynamically from our edge registry, bypassing local subprocess starts entirely:
import requests
def discover_and_list_tools(target_url: str) -> dict:
# Query our dynamic edge gateway
r = requests.get(
"https://wmcp.sh/api/v1/tools",
params={"url": target_url},
timeout=6
)
if r.status_code == 200:
data = r.json()
print(f"[Registry] Resolved {len(data.get('tools', []))} tools for host: {data.get('host')}")
return data
return {}
# Dynamic registry lookup
res = discover_and_list_tools("https://www.everlane.com/products/mens-organic-cotton-tee")
Help Shape the Future of MCP
Anthropic gave us an incredible gift: a universal, open standard for AI tools.
But to transition from local developer toys to highly secure, multi-tenant enterprise agent applications, we must solve the authentication, performance, and discoverability gaps.
By adopting edge-hosted routing gateways and decentralized token vaults, we can build a secure, robust, and lightning-fast tool ecosystem that scales.
Explore our edge routing schemas at wmcp.sh and join the movement to scale the Model Context Protocol today.