Dec 14, 2025 10 min read

MCP is not the problem

The protocol is fine. The real question is when to use one, and most of the loudest critiques are really about that, not about MCP itself.

On this page

Every few weeks someone publishes a fresh “MCP is dead” post. I read it, nod at the specific complaint, then watch the headline get screenshotted into a thread that concludes the protocol itself is a mistake.

It isn’t. It’s just a protocol. The real problem is that we reach for an MCP server when something simpler would do, then blame the protocol when the agent buckles under what we shipped.

The loudest critiques are all worth taking seriously. Each one points at something real. None of them points at MCP.

The context-bloat complaint

This is the most common critique, and the most empirically grounded. The benchmark numbers are genuinely bad.

Zechner clocked Playwright at ~13.7k tokens and Chrome DevTools at ~18k. Each one eats 7–9% of a 200k window before you say hello. (source)
One writeup burned 143k of 200k tokens on tool definitions before the user’s first message, with three popular servers loaded. (The MCP Tax)
Cloudflare’s case is the most extreme: 2,500 endpoints via naive MCP took 1.17M tokens, versus around 1,000 with a search-and-execute layer plus a JS sandbox. (Code Mode)

The numbers are real, but notice what they’re about. They describe shoving every tool from every server into the prompt eagerly. That’s a deployment pattern, not the protocol.

That pattern is on the way out, and the fix has shaken out into three design schools.

Three schools of dynamic discovery

1. Search-then-load. Anthropic shipped its Tool Search Tool in November, and Claude Code is rolling it out as the default. Servers tag tools with defer_loading: true.

There are two flavors, regex and BM25. The agent searches a catalog of up to 10,000 tools and gets three to five tool_reference blocks back that expand inline.

Expansion happens server-side, so the system-prompt prefix stays intact and prompt caching keeps working. Anthropic reports Opus 4.5 climbing from 79.5% to 88.1% on large-toolset benchmarks with 85% fewer definition tokens.

2. Code-as-discovery. Anthropic’s Code Execution with MCP writeup from last month reframes the toolset as a filesystem the agent can browse.

The client lays each server out as ./servers/<name>/<tool>.ts. The agent lists the directory, reads only the files it needs, and writes TypeScript that imports them into a sandbox. They report 150k tokens dropping to 2k.

Cloudflare Code Mode pushes the same idea further. Two tools, search() and execute(), accept JavaScript and run in a V8 isolate against the full OpenAPI spec.

Their reported result: 2,500 endpoints in roughly a thousand tokens.

3. Lazy schema walk. open-mcp.org ships an expandSchema tool that breaks a tool’s input schema into one-level chunks.

Instead of paying the full schema cost upfront, the agent walks the tree on demand. It’s Tool Search applied to a single tool’s parameters instead of a whole catalog.

SEP-1576, the protocol-level proposal for context bloat, is converging on the same shape.

None of this is free. Search-then-load adds round trips that hurt agents trying to plan multi-step trajectories upfront. Code Mode needs a sandbox the client doesn’t always have. Lazy walks trade context for latency.

For small toolsets (under ~30 tools), eager injection still wins. Cursor caps configs at 40 tools. Cline ships a marketplace with filters. Servers can refuse to expose what isn’t relevant.

But the design space exists. The gap between naive MCP and well-deployed MCP is now an order of magnitude wide, and what’s left after the mitigations is your choice to wire up nine servers for a session that needs two.

The security complaint

This one is more serious. I want to be careful not to wave it away.

Simon Willison’s lethal trifecta frames it cleanly. Any agent with private data, untrusted content, and external comms is exploitable. Hardening doesn’t save you.

Invariant Labs’ GitHub MCP exploit showed it concretely: a malicious public-repo issue tricked an agent into exfiltrating from a private repo through the official server.

They’ve also documented tool poisoning, where a server swaps its tool description after the user has already approved it.

These are real vulnerabilities, and they’ve shipped to production.

But reread the trifecta. There’s no “M” in it. There’s no “P” either. The trifecta is a property of any tool-using agent.

Swap MCP for direct REST, for a CLI, for an in-process function, and the trifecta still bites. What MCP added is a common surface that makes the attacks easier to study and patch, not a new class of attack.

FastMCP’s confused-deputy advisories are MCP-specific. They got CVEs, patches, and writeups within weeks. That’s how protocols mature.

The honest read: MCP doesn’t give you security for free. Don’t deploy a server with the threat model of an internal RPC endpoint. That’s a deployment failure, not a protocol one.

The “wrong abstraction” complaint

This is the most interesting critique, and the one I think gets closest to the truth. It’s also where the framing keeps slipping.

The strongest essays in the genre all make the same move. They compare MCP to a specific better choice for a specific case:

Armin Ronacher: Your MCP Doesn’t Need 30 Tools, It Needs Code
Jeremiah Lowin: Stop Converting Your REST APIs to MCP
Mario Zechner: MCP vs CLI benchmarks
Peter Steinberger: Peekaboo 2.0: free the CLI from MCP shackles

For coding agents, the better choice is usually a CLI. Agents already know shells. The surface is text in, text out. You don’t need a protocol for that.

For procedural knowledge (“how do we deploy”), it’s a Skill or a doc the agent loads on demand. For a known, stable REST API, it’s direct HTTP. Hand over the spec and let it call.

Lowin’s framing is the one I keep coming back to:

An API that is “sophisticated” for a human is one with rich, composable, atomic parts. An API that is “sophisticated” for an agent is one that is ruthlessly curated and minimalist.

That’s right. It’s also the strongest argument for MCP existing. A ruthlessly curated, agent-shaped surface is exactly what an MCP server should be.

The mistake is to expose your internal API one-to-one through MCP and call it a day. Thoughtworks correctly put naive REST-to-MCP conversion in the Radar’s Hold ring.

MCP isn’t wrong. You skipped the design step.

The same logic runs in reverse. If your tool is already a CLI an agent can run, you don’t need to wrap it in MCP.

If your tool is a stateful session with discovery semantics a CLI can’t represent (Claude Desktop hitting your local Obsidian, ChatGPT reaching into your Linear workspace), MCP is exactly what you want.

Speakeasy’s defense puts it well. It’s a TCP-handshake-shaped problem. Nobody complains about TCP’s three-way handshake anymore. It earned its keep.

What does a server that clears all three look like?

Three names come up most when people praise specific MCP implementations: Context7, Sentry, and Linear. Run each one through the critiques above and only one walks out clean.

Server	Surface	Context-bloat	Lethal trifecta	Wrong abstraction
Context7	2 tools (`resolve-library-id`, `query-docs`)	Pass. Tiny footprint.	Pass. Public docs only, no private state, no privileged write.	Pass. The catalog is the value; a CLI can’t ship one.
Sentry	~15 tools across issues, events, alerts	Pass. Curated verbs.	Fail. Error payloads carry attacker-controlled strings (user agents, request bodies, stack traces from prod). Pair that with private telemetry and an agent that can post elsewhere, and the trifecta fully assembles.	Pass. Workspace-scoped, externally consumed.
Linear	~20 tools across issues, projects, cycles	Pass.	Fail. Email-to-issue, support integrations, and public intake forms all funnel attacker-controlled text into private workspace state.	Pass. Same shape as Sentry.

Sentry and Linear are good arguments for “well-curated surface.” Their teams are not careless. The reason they don’t clear the gauntlet is that the data they serve is fundamentally weaponizable, and the protocol is not what protects you from that. Context7 wins by accident of domain as much as by design: public library docs are simply less dangerous to expose.

The lesson is not “only build read-only public-data servers.” It’s that the trifecta is a property of the deployment surface, not the protocol. If your MCP server crosses two of the three legs, your hardening work happens at deploy time, not in your tool list.

Should this even be an MCP server?

If you’re partway into a build and the call isn’t obvious, work the questions below. The recommendation updates as you answer.

Decision picker

Should this be an MCP server?

01Who will actually call this tool?

02What are you packaging?

03Does it need session state, auth, or workspace context?

04Does it touch private user or workspace data?

05Does it ingest content from sources you do not control?

Recommendation (0/5 answered)

Pick an option above to see a recommendation.

A few rules the picker encodes:

If you write both ends of the loop, the protocol is friction. Ship a CLI or hit the API directly.
If you’re packaging how-to instead of do-thing, you want a Skill, not a tool. A markdown file the agent reads on demand is lighter than a tool definition that costs tokens every session.
Stateful, cross-session, externally consumed: that is exactly MCP’s sweet spot, and the only place it earns its overhead.
Private data plus untrusted content plus external comms still assembles the trifecta no matter what wrapper you pick. Harden the deployment; do not hope the protocol saves you.

So when is MCP the right call?

The decision rule I’ve ended up using:

Use MCP when the consumer is a client you don’t control and the surface is stateful. Claude Desktop, ChatGPT, Cursor, an IDE plugin. None of those can ship arbitrary integration code.

What they need is a uniform way to discover and invoke tools you wrote. That’s the host-to-tool standardization MCP actually delivers.

The ecosystem signal is hard to ignore. MCP support has landed in Claude Desktop, ChatGPT, Cursor, Windsurf, and OpenAI’s Agents SDK. Competitors converging on the same boundary is not an accident.

Use a CLI or direct API when the consumer is your own agent. If you’re writing both ends, the protocol overhead is friction with no upside.

Pipe the CLI. Call the API. Give the agent a sandbox and let it write code against your SDK.

Use a Skill or a prompt file when you’re packaging procedural knowledge, not a tool. A markdown file the agent reads to learn how to deploy is a different thing from a tool the agent calls to deploy.

Curate ruthlessly when you do build an MCP server. Five well-designed tools beat thirty mechanically generated ones, always. The agent reads all of them on every session, so design for that.

The plumbing isn’t the strategy

The line I keep coming back to is Zechner’s: “The protocol is just plumbing. What matters is whether your tool helps or hinders the agent’s ability to complete tasks.”

That’s the entire argument. MCP is plumbing. Reasonable plumbing, increasingly well-specified, with security bugs that keep getting patched.

It does one thing well: let a client you don’t control talk to a tool you wrote. Most of the failure modes pinned on it are really failures of the question one level up.

Did this thing need to be an MCP server at all?

If you’ve written off MCP because of a token-count screenshot or a security writeup, you’ve written off the wrong thing.

Write off the deployment pattern. Write off the reflex to wrap everything. Keep the protocol around for the cases it was built for, and reach for it on purpose.

MCP is not the problem

The context-bloat complaint

Three schools of dynamic discovery

The security complaint

The “wrong abstraction” complaint

What does a server that clears all three look like?

Should this even be an MCP server?

So when is MCP the right call?

The plumbing isn’t the strategy

The agent stack I actually ship with

Multi-agent is where context goes to die

Less memory, more context

The context-bloat complaint

Three schools of dynamic discovery

The security complaint

The “wrong abstraction” complaint

What does a server that clears all three look like?

Should this even be an MCP server?

So when is MCP the right call?

The plumbing isn’t the strategy

Related writing

The agent stack I actually ship with

Multi-agent is where context goes to die

Less memory, more context