
MCP vs REST: When Each Makes Sense for AI Agents

REST is battle-tested, but MCP was built for agents. Here's when to use each — with a decision matrix, migration notes, and the honest case for both.

By toolchew

2,002 words · 11 min read

REST is what you reach for when a human developer writes the integration. MCP is what you reach for when an AI agent does it at runtime. WorkOS put it cleanly: “REST APIs serve developers. MCP serves AI agents.” The difference in caller type drives every architectural tradeoff downstream — statefulness, discovery, bidirectionality, and ecosystem depth.

This article is for backend and full-stack developers deciding how to expose services to AI agents, or evaluating whether to layer MCP on top of an existing REST API.

The architectural split

REST is an architectural style layered on HTTP, which RFC 9110 (June 2022) defines as a “stateless application-level protocol for distributed, collaborative, hypertext information systems.” Every request carries everything the server needs to understand it. No session. No negotiation. Clients must know the endpoints in advance — that’s the contract.

MCP (Model Context Protocol, spec version 2025-11-25) is the opposite. The spec describes it as a “stateful session protocol focused on context exchange and sampling coordination.” It runs on JSON-RPC 2.0, borrows heavily from the Language Server Protocol design, and defines three roles: Hosts (LLM apps like Claude Desktop or Cursor), Clients (connectors maintaining one session per server), and Servers (capability providers).

| Dimension | REST | MCP (2025-11-25) |
|---|---|---|
| State | Stateless | Stateful, per-session |
| Discovery | Static OpenAPI docs (build-time) | tools/list RPC (runtime) |
| Schema | OpenAPI/Swagger | JSON Schema 2020-12, per tool, live |
| Direction | Request-response only | Bidirectional (server can invoke LLM, elicit user input) |
| Auth | Flexible (API keys, OAuth 2.0, custom) | OAuth 2.1 + PKCE, mandatory for HTTP transport |
| Caching | Native (Cache-Control, ETag) | Not defined |
| Long-running ops | Custom polling/webhooks | Native Tasks primitive (experimental) |
| Browser support | Native | Requires MCP client library |
| Horizontal scaling | Any node, any request | Session affinity required |
| Governance | IETF | Linux Foundation / AAIF (Dec 2025) |

Where MCP wins

Runtime tool discovery

This is MCP’s clearest structural advantage. An AI agent doesn’t know your endpoint catalogue the way a developer does. With REST, somebody reads the docs and hard-codes the calls. With MCP, the agent sends tools/list at session start and gets back a live catalog — machine-readable JSON Schema 2020-12 per tool, including what each parameter means and whether the tool is read-only or destructive.

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/list",
  "params": { "cursor": "optional-cursor-value" }
}

When the tool set changes, the server pushes notifications/tools/list_changed. The client re-fetches. No stale documentation for an agent to misread.
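The discovery loop above can be sketched client-side. The message shapes follow the spec (a tools/list result carries a `tools` array with `name`, `description`, and `inputSchema`; the change push is `notifications/tools/list_changed`), but the `fetch_tools` wiring and `ToolCatalog` class are illustrative stand-ins, not an SDK API:

```python
# Client-side tool catalog that refreshes when the server pushes
# notifications/tools/list_changed. fetch_tools is any callable that
# performs the tools/list round-trip and returns its result dict.

def parse_tools_result(result: dict) -> dict:
    """Index a tools/list result by tool name."""
    return {tool["name"]: tool for tool in result.get("tools", [])}

class ToolCatalog:
    def __init__(self, fetch_tools):
        self._fetch = fetch_tools
        self._tools = parse_tools_result(self._fetch())  # initial tools/list

    def handle_notification(self, message: dict) -> None:
        # Server signalled a change: drop the cache and re-fetch the catalog.
        if message.get("method") == "notifications/tools/list_changed":
            self._tools = parse_tools_result(self._fetch())

    def schema_for(self, name: str) -> dict:
        return self._tools[name]["inputSchema"]
```

The point of the pattern: the agent never reasons over stale tool definitions, because the server tells it exactly when to re-fetch.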

This also addresses the N×M adapter problem. Without a protocol standard, M tools connecting to N AI clients requires M×N custom integrations. MCP collapses that to M servers + N clients, each speaking the same protocol — the USB analogy the community keeps reaching for.

Multi-turn sessions and bidirectionality

REST is caller-initiated. Always. An LLM can call your API, but your API can never ask the LLM anything back.

MCP sessions go both ways. During an active session, a server can request sampling (ask the LLM to make an inference call), request elicitation (ask the user for structured input mid-workflow), and push progress notifications. The server becomes an active participant in the agent loop, not a passive data endpoint.
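A sampling request is just another JSON-RPC message, sent server-to-client. The method name `sampling/createMessage` and the `messages`/`maxTokens` params follow the spec; the id counter and the idea of a helper function are illustrative:

```python
import itertools

_ids = itertools.count(1)  # illustrative per-session request id counter

def sampling_request(prompt: str, max_tokens: int = 256) -> dict:
    """JSON-RPC request a server sends to ask the client's LLM for one inference."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "sampling/createMessage",
        "params": {
            "messages": [
                {"role": "user", "content": {"type": "text", "text": prompt}}
            ],
            "maxTokens": max_tokens,
        },
    }
```

Note the direction: in REST this message has no equivalent, because the server has no channel on which to send it.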

Long-running operations

MCP 2025-11-25 includes an experimental Tasks primitive with a proper state machine: working → input_required → working → completed / failed / cancelled. A tool call returns a taskId immediately. The agent polls tasks/get or blocks on tasks/result. The input_required state lets the server pause and ask the user for something before continuing.

Build this in REST and you’re constructing webhook infrastructure, polling loops, and custom state storage. MCP standardizes the pattern.
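A minimal sketch of that state machine, with the state names from the spec; the transition table and `Task` class are illustrative, not the SDK's API:

```python
# Allowed transitions for the Tasks lifecycle described above.
# completed / failed / cancelled are terminal states.
ALLOWED = {
    "working": {"input_required", "completed", "failed", "cancelled"},
    "input_required": {"working", "cancelled"},
    "completed": set(),
    "failed": set(),
    "cancelled": set(),
}

class Task:
    def __init__(self, task_id: str):
        self.task_id = task_id
        self.state = "working"  # a tool call returns this taskId immediately

    def transition(self, new_state: str) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
```

In REST, this table is exactly the custom state storage you would otherwise design, document, and test yourself.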

Standardized auth

MCP mandates OAuth 2.1 with PKCE (S256 required) for HTTP transport. Access tokens go in Authorization: Bearer headers, never URI query strings. Resource Indicators (RFC 8707) are mandatory. For agentic workflows that need to delegate credentials across tool boundaries, this uniformity saves real engineering time.
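The S256 challenge derivation is small enough to show in full. This follows RFC 7636: the client keeps a random verifier secret and sends only its SHA-256 hash in the authorization request; registration and token exchange are out of scope here:

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) for the S256 method (RFC 7636)."""
    # 32 random bytes -> 43-char base64url verifier, padding stripped
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # The challenge is the base64url-encoded SHA-256 of the ASCII verifier
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```

The verifier travels only in the final token request, which is why an intercepted authorization code is useless on its own.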

Where REST wins

Caching

HTTP caching is 30 years deep. Cache-Control, ETag, Last-Modified, Vary — every CDN, proxy, and browser understands these natively. High-volume read APIs (product catalogs, dashboards, anything serving traffic at scale) can shed backend load by orders of magnitude through standard cache headers. MCP defines no caching semantics at the protocol level.

Browser clients

Every browser supports HTTP. Calling a REST endpoint from JavaScript requires zero dependencies. MCP requires the TypeScript SDK (v1.29.0, March 2026) or the Python SDK. That’s not an option in browser contexts where you can’t install libraries.

Stateless horizontal scaling

REST’s statelessness means any load-balanced node handles any request. MCP sessions require session affinity — sticky routing or a shared session store across nodes. At high throughput, neither option is free.

Deterministic developer integration

When a human writes the integration — not an LLM — REST is the right tool. You know the endpoint, the method, the parameters. No reasoning required in the call path. REST is how you build a reliable, testable, version-controlled integration.

Ecosystem depth

API gateways (Zuplo, Kong), test clients (Postman, HTTPie), monitoring stacks, documentation tooling — all treat REST as the default. MCP’s ecosystem is growing fast (97M monthly SDK downloads, 10,000+ servers as of November 2025, Linux Foundation governance as of December 2025), but it cannot match REST’s three-decade head start.

Idempotent retries

RFC 9110 defines GET as safe (§9.2.1) and GET, PUT, and DELETE as idempotent (§9.2.2), so a failed request can be safely retried. MCP has no equivalent retry contract, so the caller has to decide whether a partial tool execution is safe to run again.
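A retry policy keyed to those semantics is a one-liner to express: retry idempotent methods blindly, never retry POST. `send` here is an illustrative stand-in for your HTTP client:

```python
# Methods RFC 9110 defines as safe or idempotent, and thus safe to retry
IDEMPOTENT = {"GET", "HEAD", "OPTIONS", "PUT", "DELETE"}

def request_with_retry(send, method: str, url: str, attempts: int = 3):
    """Retry transport failures only when HTTP semantics guarantee it's safe."""
    tries = attempts if method.upper() in IDEMPOTENT else 1
    last_error = None
    for _ in range(tries):
        try:
            return send(method, url)
        except ConnectionError as exc:
            last_error = exc
    raise last_error
```

With an MCP tool call, no such table exists at the protocol level; the closest substitute is per-tool annotations the client chooses to trust.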

The hybrid angle: it’s not either/or at the network layer

MCP’s Streamable HTTP transport (the current standard since March 2025, replacing the deprecated HTTP+SSE dual-connection design from 2024) is HTTP at the wire level — JSON-RPC 2.0 payloads over standard HTTP POST and GET. A single endpoint handles both request types. The server returns application/json for single responses and text/event-stream for streaming. MCP-Session-Id in headers manages session state. Last-Event-ID enables stream resumption after a disconnect.

The practical implication: MCP and REST traffic can share the same API gateway, TLS termination, and WAF. Cloudflare Workers supports createMcpHandler from the Agents SDK natively; existing infrastructure can serve both client types without separate network paths. “MCP vs REST” is partly a false dichotomy at the transport layer — it’s more accurately a question of which protocol contract to present on top of HTTP.
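To make the single-endpoint idea concrete, here is a sketch of the two response paths and of an SSE event carrying an id for Last-Event-ID resumption. The header and content-type names match the spec; the dispatch function itself is illustrative:

```python
def response_content_type(accept_header: str, streaming: bool) -> str:
    """Pick the response shape for one POST to the MCP endpoint."""
    wants_stream = "text/event-stream" in accept_header
    if streaming and wants_stream:
        return "text/event-stream"  # server will emit a stream of SSE events
    return "application/json"       # single JSON-RPC response body

def sse_event(event_id: int, data: str) -> str:
    # The id: line is what lets a client resume via Last-Event-ID after a drop
    return f"id: {event_id}\ndata: {data}\n\n"
```

Because both branches are plain HTTP, the same gateway, TLS termination, and WAF rules apply to REST and MCP traffic alike.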

Migrating from REST to MCP

If you have a REST API today and want to expose it to AI agents, the common mistake is 1:1 mapping. The GitHub API problem illustrates why: 600+ REST endpoints, direct-mapped to MCP tools, creates an overwhelming decision space for LLMs. StackOne’s production report measured it: 50 tools burn 100,000+ tokens on tool definitions alone, before any meaningful conversation begins.

The right approach:

  1. Generate stubs from OpenAPI — use codegen as a starting point, not the finish line
  2. Prune aggressively — keep tools that are distinct and decision-relevant for agent workflows; collapse similar low-level endpoints
  3. Rewrite descriptions for LLM reasoning — REST docs explain what parameters do; MCP descriptions should tell the model when and why to invoke the tool
  4. Build higher-level abstractions — replace three sequential REST calls with one MCP tool that handles the workflow internally
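Step 4 is the one that changes token economics most. The sketch below is entirely illustrative (the endpoint paths, tool name, and ticket domain are invented): one task-level tool hides three sequential REST round-trips the agent would otherwise have to plan itself:

```python
def create_and_assign_ticket(rest_get, rest_post, title: str, assignee_email: str) -> dict:
    """One agent-facing tool; three REST calls hidden inside.

    rest_get / rest_post are stand-ins for your HTTP client."""
    user = rest_get(f"/users?email={assignee_email}")      # call 1: resolve the user
    ticket = rest_post("/tickets", {"title": title})       # call 2: create the ticket
    rest_post(f"/tickets/{ticket['id']}/assignee",         # call 3: assign it
              {"user_id": user["id"]})
    return {"ticket_id": ticket["id"], "assigned_to": user["id"]}
```

The agent sees one tool with one schema and one description; the sequencing, error surface, and intermediate ids never enter its context window.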

LangChain’s langchain-mcp-tools v0.3.3 (March 2026, Python 3.11+) and the OpenAI Agents SDK both support MCP natively for agent frameworks. Known limitation of both: text results only. Resources, Prompts, and Sampling require direct SDK integration.

Performance: what the numbers actually show

No direct MCP-vs-REST latency benchmark from a primary source exists. What does exist:

  • MCP implementation benchmarks (tmdevlab, 3.9M requests, 50 concurrent users, k6 load testing): Java and Go hit ~1,624 RPS at 0.85ms average latency; Node.js is 559 RPS at 10.66ms; Python (FastMCP 2.12.0) is 292 RPS at 26.45ms.
  • WorkOS production estimate: MCP adds approximately 100–300ms overhead per call vs. a direct custom REST integration — the cost of the JSON-RPC hop on top of the underlying API call.

The language you pick for your MCP server matters more than most people expect. In those numbers, Go and Java deliver roughly 3× the throughput of Node and 5× that of Python, with average latency 12–30× lower at the MCP layer.

Security: what you’re actually taking on

MCP’s stateful, trust-heavy design opens attack surfaces REST’s stateless model doesn’t have. Documented incidents:

  • April 2025: Researchers published analysis of prompt injection via tool descriptions, tool permission abuse, and lookalike tool attacks
  • June 2025: CVE-2025-49596 — critical RCE via supply chain attack, patched
  • Asana MCP incident: “Confused Deputy” attack — server cached context without re-verifying tenant identity across session boundaries

The 2025-11-25 spec mandates mitigations: servers must validate the Origin header and return 403 on invalid origins; must bind to localhost by default; access tokens must never appear in URI query strings. Still, a 2025 Hacker News thread described 95% of community MCP servers as low-quality, with misleading or dangerous tool descriptions.
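The Origin check is the cheapest of those mitigations to get right, and it belongs before any session handling. A sketch (the allowlist is illustrative; non-browser clients that send no Origin header pass through, which is a policy choice you should make deliberately):

```python
# Reject browser requests from unknown origins with 403, per the spec's
# guidance against DNS-rebinding style attacks on localhost servers.
ALLOWED_ORIGINS = {"http://localhost", "http://127.0.0.1"}

def check_origin(headers: dict) -> int:
    """Return the HTTP status an MCP HTTP transport should answer with."""
    origin = headers.get("Origin")
    if origin is not None and origin not in ALLOWED_ORIGINS:
        return 403  # invalid origin: refuse before touching session state
    return 200      # no Origin (non-browser client) or an allowed one
```

Pair it with binding to localhost by default and keeping tokens out of query strings, and you have covered the spec's three mandatory baseline mitigations.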

If you’re deploying MCP in production, treat tool description quality as a security surface, not just a UX detail.

Recommendation matrix

| Situation | Use | Reason |
|---|---|---|
| AI agent discovering tools at runtime | MCP | tools/list + push notifications on change |
| Multi-turn session with mid-workflow user input | MCP | Tasks primitive, input_required state, elicitation |
| Long-running async ops with status tracking | MCP | Native Tasks state machine |
| 3+ AI clients hitting the same tools | MCP | Eliminates N×M adapter problem |
| Human developer writing deterministic code | REST | Known endpoints, no LLM reasoning in the call path |
| High-volume reads with caching | REST | Native HTTP cache headers, CDN-friendly |
| Browser-hosted UI calling backend | REST | No client library needed |
| Existing REST API with occasional AI use | REST + MCP wrapper | Add an MCP server on top; don’t rewrite |
| Enterprise audit trail, existing gateway | REST | 30-year ecosystem, existing compliance tooling |
| High-throughput horizontal scaling | REST | Stateless = any node handles any request |

Verdict

Use MCP when the caller is an LLM that needs to reason about what to call. Use REST when the caller is a developer who already knows.

For most teams today: keep your REST API. Build a thin MCP server on top of it, designed around task-level tools rather than resource-level operations. You get AI agent compatibility without rewriting a working system. Railway, Cloudflare Workers, and similar platforms all handle Streamable HTTP deployments without specialized infrastructure.

If you’re building AI tooling from scratch and the primary consumers will be LLMs, start with MCP. The runtime discovery, bidirectionality, and standardized auth pay off immediately. Claude Code and Cursor both support MCP natively if you want a testbed without standing up a full agent framework.

Caveats

  • The MCP Tasks primitive is experimental in 2025-11-25 — check current spec status before building production workflows on it.
  • The tmdevlab benchmarks compare MCP implementations against each other, not against REST. The WorkOS 100–300ms overhead is a production estimate, not a controlled experiment.
  • Zuplo, Cloudflare Workers, LangChain, Claude Code, Cursor, and Railway links are affiliate links — see disclosure above.
  • MCP server quality varies widely. Evaluate tool descriptions carefully before trusting a third-party server in a production agent workflow.

References