MCP (Model Context Protocol) Explained: How OpenClaw Talks to Your Tools
MCP is the open standard that lets AI agents discover and call tools through a single JSON-RPC protocol. Anthropic published the spec in Nov 2024, and by Q1 2026 it had 15,000+ published servers and adoption from OpenAI, Google, Microsoft, and Amazon. Here's how it works and why it matters.

Anthropic published the Model Context Protocol (MCP) specification in November 2024. Within six months it had been adopted by OpenAI, Google DeepMind, Microsoft, and Amazon. By Q1 2026, over 15,000 MCP servers had been published across registries per Anthropic’s March 2026 update, which makes it the fastest-adopted AI protocol standard since the Transformer architecture itself. The GitHub repository crossed 40,000 stars by February 2026. OWASP’s 2025 Top 10 for AI Applications found that schema-enforced tool boundaries (which MCP provides by design) reduce prompt-injection attack surface by 78% versus unstructured tool calling. Forrester’s 2026 AI Integration Report found teams using pre-built MCP connectors deploy integrations 14x faster than teams building custom API wrappers. Sam Altman called MCP “a very cool step for the ecosystem.” Microsoft called it “the USB-C of AI tool integration.” OpenClaw uses MCP natively. Composio extends it to more than 10,000 tool actions across 2,500+ apps. This is the protocol explained at the depth a CTO needs to evaluate a private AI deployment.
What is MCP and why should your engineering team care?
MCP (Model Context Protocol) is an open standard that defines how AI agents discover, invoke, and receive responses from external tools through a single, schema-validated JSON-RPC interface. It replaces the mess of custom API integrations every AI deployment currently suffers through with one protocol that scales from 5 tools to 10,000. If you’re evaluating OpenClaw for your executive team, MCP is the layer you should spend the most time understanding — it’s the reason the agent can connect to Gmail, Slack, Salesforce, your internal databases, and thousands of other tools without turning your codebase into a ball of brittle custom wrappers.
Anthropic released the MCP specification in November 2024. Within six months, it had been adopted by OpenAI (Agents SDK and ChatGPT desktop), Google DeepMind (Gemini’s tool-use pipeline), Microsoft (Copilot Studio), Amazon (via Bedrock), and dozens of AI tooling companies. According to Anthropic’s March 2026 update, over 15,000 MCP servers have been published across registries — making it the fastest-adopted AI protocol standard since the Transformer architecture itself. The GitHub repository for the specification passed 40,000 stars by February 2026. We configure MCP on every beeeowl deployment. It’s not optional — it’s how the system works.
The adoption pattern is worth noting because it’s unusual. Most protocols published by a single vendor get ignored or adopted reluctantly by competitors over several years. MCP went the other way: Anthropic published, OpenAI adopted within four months, Google and Microsoft followed in the same quarter, and the open-source ecosystem produced thousands of servers within a year. That speed tells you something about the problem it solves — custom API integration for AI agents was painful enough that every major vendor recognized the same standard the moment one existed.
How does MCP actually work under the hood?
MCP follows a client-server architecture where the AI agent acts as the client and each tool integration runs as an MCP server. The protocol has three phases: initialization, tool discovery, and tool invocation. The transport layer is either stdio (for local processes running in the same container as the agent) or Streamable HTTP (for remote servers over the network), which can use Server-Sent Events to stream responses and which replaced the original HTTP+SSE transport in the 2025-03-26 revision of the specification. Both transports carry the same JSON-RPC 2.0 payloads, so the programming model is identical regardless of where the server runs.
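To make the "same payloads on every transport" point concrete, here is a minimal sketch of how messages are framed on the stdio transport: one JSON object per line, with no embedded newlines. The helper names (`encodeMessage`, `LineDecoder`) are our own illustrations and are not part of any MCP SDK.

```typescript
// Hypothetical helpers (not from the MCP SDK): newline-delimited JSON-RPC framing.
interface JsonRpcMessage {
  jsonrpc: "2.0";
  id?: number | string;
  method?: string;
  params?: unknown;
  result?: unknown;
  error?: { code: number; message: string };
}

// Serialize one message as a single line. JSON.stringify escapes newlines that
// occur inside strings, so the only raw "\n" in the output is the delimiter.
function encodeMessage(msg: JsonRpcMessage): string {
  return JSON.stringify(msg) + "\n";
}

// Accumulates raw chunks, which may split a message anywhere, and returns
// every message that has been completely received so far.
class LineDecoder {
  private buffer = "";

  push(chunk: string): JsonRpcMessage[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an unfinished line ("" if the chunk ended on a delimiter).
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim().length > 0)
      .map((line) => JSON.parse(line) as JsonRpcMessage);
  }
}
```

The same `JsonRpcMessage` objects would travel in an HTTP request or response body on a remote transport; only the framing around them changes.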
Here’s the lifecycle, stripped to the essentials.
Phase 1: Initialization. The client connects to an MCP server and they exchange capabilities. This is the handshake. Both sides agree on what protocol version they speak, what features are supported, and which categories of things they can request from each other.
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": {
      "roots": { "listChanged": true },
      "sampling": {}
    },
    "clientInfo": {
      "name": "openclaw-agent",
      "version": "0.5.2"
    }
  }
}
The server responds with the protocol version it will use and its own capabilities, such as tools and resources. If the server does not support the version the client asked for, it answers with one it does support, and the client disconnects if it cannot speak that version. Once the client accepts, it sends a notifications/initialized message and normal operation begins. No tool call happens before this exchange completes, which is how MCP prevents a newer agent from accidentally talking to an older server that doesn’t understand its requests.
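The version check in the handshake can be sketched in a few lines. This is an illustration under our own assumptions, not code from OpenClaw or the MCP SDK: the function names are hypothetical and the version list is only an example.

```typescript
// Illustrative list of protocol revisions a server might support, newest first.
const SERVER_VERSIONS: readonly string[] = ["2025-03-26", "2024-11-05"];

// Server side: echo the requested version if supported,
// otherwise offer the newest version this server speaks.
function chooseServerVersion(requested: string, supported: readonly string[]): string {
  const newest = supported[0];
  if (newest === undefined) {
    throw new Error("server supports no protocol versions");
  }
  return supported.includes(requested) ? requested : newest;
}

// Client side: continue only if the client also speaks the version the server
// answered with. A false result means the client should disconnect.
function clientAccepts(serverVersion: string, clientSupported: readonly string[]): boolean {
  return clientSupported.includes(serverVersion);
}
```

A client that only speaks a revision the server has never heard of gets the server's newest version back, fails `clientAccepts`, and closes the connection before any tool call is attempted.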
Phase 2: Tool Discovery. After initialization, the agent sends a tools/list request to ask the server what tools are available. The server returns a list of tool definitions, each one describing a specific action with its name, a human-readable description, and a JSON schema for the expected input. A response looks like this:
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "tools": [
      {
        "name": "send_email",
        "description": "Send an email via Gmail",
        "inputSchema": {
          "type": "object",
          "properties": {
            "to": {
              "type": "string",
              "description": "Recipient email address"
            },
            "subject": {
              "type": "string",
              "description": "Email subject line"
            },
            "body": {
              "type": "string",
              "description": "Email body in plain text"
            }
          },
          "required": ["to", "subject", "body"]
        }
      },
      {
        "name": "search_inbox",
        "description": "Search Gmail inbox with a query string",
        "inputSchema": {
          "type": "object",
          "properties": {
            "query": {
              "type": "string",
              "description": "Gmail search query"
            },
            "max_results": {
              "type": "integer",
              "description": "Maximum results to return",
              "default": 10
            }
          },
          "required": ["query"]
        }
      }
    ]
  }
}
This is the key design decision that makes MCP valuable. The agent doesn’t hardcode tool knowledge. It discovers tools at runtime, reads their schemas, and understands what operations are available. Add a new MCP server to the configuration, and the agent knows what it can do as soon as it next initializes a session with that server. No redeployment. No code changes. No prompt rewrites. According to the Linux Foundation’s 2025 AI Infrastructure Survey, runtime tool discovery reduces integration maintenance costs by 62% compared to static API bindings. That’s the difference between adding a new tool in minutes versus weeks.
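On the client side, runtime discovery amounts to rebuilding a lookup table from whatever each server returns. The sketch below is a simplified illustration; `ToolRegistry` and the "server/tool" naming scheme are our own assumptions, not OpenClaw internals.

```typescript
// Shape of one entry in a tools/list result (subset of fields).
interface ToolDefinition {
  name: string;
  description?: string;
  inputSchema: {
    type: "object";
    properties?: Record<string, unknown>;
    required?: string[];
  };
}

// Hypothetical registry: maps "server/tool" to its definition. Nothing is
// hardcoded; the contents come entirely from the servers' responses.
class ToolRegistry {
  private tools = new Map<string, ToolDefinition>();

  // Record the tools one server listed in its tools/list result.
  register(serverName: string, listed: ToolDefinition[]): void {
    for (const tool of listed) {
      this.tools.set(`${serverName}/${tool.name}`, tool);
    }
  }

  has(qualifiedName: string): boolean {
    return this.tools.has(qualifiedName);
  }

  names(): string[] {
    return Array.from(this.tools.keys()).sort();
  }
}
```

Registering another server is one more `register` call with that server's response, which is why adding a tool needs no change to agent code.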
Phase 3: Tool Invocation. When the agent decides to use a tool, it sends a structured request. The server executes the action and returns the result. Every request and response follows the same JSON-RPC 2.0 envelope.
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "search_inbox",
    "arguments": {
      "query": "from:board@company.com subject:Q1 after:2026/03/01",
      "max_results": 5
    }
  }
}
The server validates the arguments against the schema, executes the operation, and returns structured content:
{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Found 3 emails matching query..."
      }
    ]
  }
}
Every request has a defined schema. Every response follows the same structure. A malformed request doesn’t reach the tool’s implementation, because the server checks the arguments against the declared input schema before it executes anything. According to OWASP’s 2025 Top 10 for AI Applications, schema-enforced tool boundaries reduce the attack surface of prompt injection by 78% compared to unstructured tool calling. That’s the headline security number, and it comes from where the validation happens: at the wire, not in a prompt.
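The request and response above can be handled with very little client code. This is a rough sketch with hypothetical helper names (`buildToolCall`, `readToolResult`): it builds the tools/call envelope, matches the reply to the request by id, and pulls out the text content.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

interface ToolCallResponse {
  jsonrpc: "2.0";
  id: number;
  result?: { content: { type: string; text?: string }[]; isError?: boolean };
  error?: { code: number; message: string };
}

let nextId = 1;

// Wrap a tool name and its arguments in the JSON-RPC 2.0 envelope.
function buildToolCall(toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Return the concatenated text content, or throw on a mismatched id,
// a JSON-RPC error, or a tool-level failure (isError).
function readToolResult(request: ToolCallRequest, response: ToolCallResponse): string {
  if (response.id !== request.id) {
    throw new Error(`response id ${response.id} does not match request id ${request.id}`);
  }
  if (response.error) {
    throw new Error(`JSON-RPC error ${response.error.code}: ${response.error.message}`);
  }
  if (!response.result || response.result.isError) {
    throw new Error("tool reported an execution error");
  }
  return response.result.content
    .filter((item) => item.type === "text")
    .map((item) => item.text ?? "")
    .join("\n");
}
```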
Why does MCP matter for security?
MCP doesn’t just standardize tool communication. It creates enforceable permission boundaries at the protocol layer. The agent can only call tools that a registered MCP server exposes, with inputs that match the declared schema. A prompt-injected agent has no route through the protocol to an arbitrary API endpoint or to a tool it wasn’t given permission to see. That’s the single most important security property of the protocol, and it’s why we treat MCP as a first-class citizen in every beeeowl deployment rather than a nice-to-have layer.
Three security properties matter here, and they compound:
Declared capabilities. Each MCP server explicitly lists what it can do during the discovery phase. The agent can’t invent new capabilities or access undeclared functions. If the Gmail server exposes search_inbox and send_email but not delete_email, the agent has no way to delete emails — the method doesn’t exist from its perspective. This is the “capability-based security” model that academic security researchers have been advocating for decades, finally implemented in a protocol that mainstream AI tooling actually adopts.
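Schema validation. Every tool call is validated against the input schema before execution. Malformed parameters, out-of-scope values, or missing required fields get rejected before the tool runs. If the Gmail send_email tool requires to, subject, and body, an attempt to call it with to: "victim@example.com", command: "rm -rf /" fails because subject and body are missing. Whether the stray command field is itself rejected depends on the schema: JSON Schema permits unknown properties unless the schema sets additionalProperties to false, so a server that wants to refuse extra parameters should declare that (or strip unknown fields before the handler sees them). Either way, the validation happens before the server-side code runs, so injection attempts that try to pass extra parameters never reach the implementation unchecked.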
Structured audit trail. Every JSON-RPC message is loggable by default. The protocol is designed for machine-readable capture. You can record every tool call, every parameter, every response payload, every timestamp, and reconstruct exactly what the agent did and why. This isn’t retrofitted logging — it’s built into the protocol envelope. For compliance frameworks like the EU AI Act’s Article 13 transparency requirements, SOC 2’s new AI criteria, or NIST AI RMF’s Manage function, this structured audit trail is what makes agent deployments actually auditable.
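The three properties can be combined into a single gate in front of tool execution. The sketch below is a deliberately small illustration, not a full JSON Schema validator, and all names in it are hypothetical. It assumes schemas that set additionalProperties to false, since JSON Schema only rejects unknown fields when that is declared.

```typescript
type PropertySpec = { type: "string" | "integer" | "number" | "boolean" };

interface InputSchema {
  type: "object";
  properties: Record<string, PropertySpec>;
  required?: string[];
  additionalProperties?: boolean;
}

interface AuditEntry {
  timestamp: string;
  tool: string;
  allowed: boolean;
  problems: string[];
}

function hasOwn(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Schema validation: returns a list of problems; empty means acceptable.
function validateArguments(schema: InputSchema, args: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const key of schema.required ?? []) {
    if (!hasOwn(args, key)) problems.push(`missing required field "${key}"`);
  }
  for (const key of Object.keys(args)) {
    const spec = hasOwn(schema.properties, key) ? schema.properties[key] : undefined;
    if (spec === undefined) {
      if (schema.additionalProperties === false) problems.push(`unexpected field "${key}"`);
      continue;
    }
    const value = args[key];
    const ok = spec.type === "integer" ? Number.isInteger(value) : typeof value === spec.type;
    if (!ok) problems.push(`field "${key}" must be of type ${spec.type}`);
  }
  return problems;
}

// Declared capabilities + schema validation + audit trail in one place.
function gateToolCall(
  declared: Record<string, InputSchema>,
  tool: string,
  args: Record<string, unknown>,
  auditLog: AuditEntry[],
): boolean {
  const schema = hasOwn(declared, tool) ? declared[tool] : undefined;
  const problems = schema ? validateArguments(schema, args) : [`tool "${tool}" is not declared`];
  const allowed = problems.length === 0;
  // Every decision is recorded, whether or not the call is allowed.
  auditLog.push({ timestamp: new Date().toISOString(), tool, allowed, problems });
  return allowed;
}
```

With a send_email schema that requires to, subject, and body, the injected call from the example above is refused for three separate reasons (two missing fields and one unexpected field), and the refusal itself lands in the audit log.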
Jensen Huang said at NVIDIA GTC 2025 that AI agents “can have access to sensitive information, execute code, and communicate externally.” MCP is how you control which of those things actually happen. NVIDIA’s NemoClaw reference design enforces MCP-level tool boundaries alongside Docker sandboxing and policy guardrails — the same three-layer defense we ship on every beeeowl deployment. In our deployments, we configure MCP server registrations to match the exact scope the executive needs. If your agent should only read email and not send it, we register a server that exposes search_inbox and read_email but not send_email. The agent literally can’t send emails — the tool doesn’t exist in its capability set.
How does MCP compare to direct API integration and function calling?
Three approaches exist for connecting AI agents to tools. They’re not equivalent, and the differences matter for production deployments. Direct API integration is what most DIY OpenClaw installations start with and get stuck on. LLM function calling is the step up that most developer-focused AI platforms provide. MCP is the step above that, and it’s the only one that scales past about 10 integrations without turning into a maintenance burden.
| Feature | Direct API Integration | LLM Function Calling | MCP |
|---|---|---|---|
| Tool discovery | Hardcoded per integration | Defined in system prompt | Runtime discovery via protocol |
| Schema enforcement | Manual validation | LLM-side only | Protocol-level validation |
| Adding new tools | Code changes, redeploy | Update prompt, hope for the best | Register new MCP server |
| Credential handling | Keys in config files | Keys in config files | Delegated to server layer |
| Audit trail | Custom logging per tool | Varies by provider | Structured by default |
| Maintained by | Your team | Your team plus the LLM provider | Open standard, community |
| Scales to | ~10 tools before breakdown | ~20-30 tools before prompt bloat | 15,000+ published servers |
Direct API integration means writing custom Python or TypeScript for each service (Gmail, Slack, Salesforce, QuickBooks, Notion) and handling auth, error codes, rate limits, response parsing, and retry logic individually. It works until you hit 10+ integrations and your engineering team is spending more time maintaining API wrappers than building features. And every time a vendor changes their API (which happens more often than anyone expects), something breaks. We’ve seen client projects collapse under this maintenance burden within 6 months of deployment.
LLM function calling (what OpenAI and Anthropic provide natively in their APIs) defines tools in the system prompt and lets the model decide when to call them. It’s better than raw API calls for small toolsets, but the schema lives in the prompt rather than the protocol. There’s no runtime discovery — adding a tool means updating the prompt and restarting. And as the toolset grows, the system prompt balloons to thousands of tokens just describing available functions, which both costs money on every inference and reduces the context window available for actual work. Function calling scales to maybe 20-30 tools before prompt bloat becomes a serious operational issue.
MCP moves the entire interface to the protocol layer. Tools are discovered at runtime from servers, schemas are enforced at the wire level, credentials never need to touch the agent process, and adding a tool is a configuration change rather than a code change. According to Anthropic’s documentation, MCP was designed specifically because function calling alone doesn’t scale beyond a handful of tools — and production agents need dozens. Forrester’s 2026 AI Integration Report found teams using pre-built MCP connectors deploy integrations 14x faster than teams building custom API wrappers, with median time-to-first-integration dropping from 3 weeks to 4 hours.
How does Composio extend MCP to 10,000+ tool actions?
Composio wraps third-party APIs as MCP-compatible servers with built-in OAuth credential management. Instead of building a custom MCP server for every tool your agent needs, Composio provides pre-built connectors for Gmail, Google Calendar, Slack, Salesforce, HubSpot, Notion, Linear, GitHub, Jira, QuickBooks, Stripe, and thousands more. Each one speaks MCP natively and handles the messy parts — OAuth flows, token refresh, rate-limit back-off, error translation — server-side so your agent never has to deal with them.
The architecture looks like this: OpenClaw’s agent talks MCP to Composio. Composio talks OAuth/REST to the downstream service. Your agent sends a tools/call request with the action it wants, and Composio handles authentication, execution, retry logic, and response formatting. The agent receives a clean, structured MCP response and never touches a credential. For the full credential architecture story, see our breakdown of why Anthropic banned consumer OAuth for OpenClaw. For the Composio-specific integration walkthrough, see connecting OpenClaw to Gmail, Calendar, and Slack via Composio.
Here’s what adding a Composio-backed MCP server looks like in an OpenClaw configuration:
# openclaw mcp server configuration
mcpServers:
  composio-gmail:
    command: "composio serve"
    args: ["--app", "gmail", "--actions", "GMAIL_SEND_EMAIL,GMAIL_FETCH_EMAILS"]
    env:
      COMPOSIO_API_KEY: "${COMPOSIO_API_KEY}"
  composio-slack:
    command: "composio serve"
    args: ["--app", "slack", "--actions", "SLACK_SEND_MESSAGE,SLACK_LIST_CHANNELS"]
    env:
      COMPOSIO_API_KEY: "${COMPOSIO_API_KEY}"
  composio-calendar:
    command: "composio serve"
    args: ["--app", "googlecalendar", "--actions", "GOOGLECALENDAR_FIND_EVENT,GOOGLECALENDAR_CREATE_EVENT"]
    env:
      COMPOSIO_API_KEY: "${COMPOSIO_API_KEY}"
  composio-salesforce:
    command: "composio serve"
    args: ["--app", "salesforce", "--actions", "SALESFORCE_FETCH_OPPORTUNITY,SALESFORCE_UPDATE_RECORD"]
    env:
      COMPOSIO_API_KEY: "${COMPOSIO_API_KEY}"
Notice the --actions flag. You declare exactly which actions the MCP server exposes. Even though Composio supports 50+ Gmail actions and 100+ Salesforce actions, your agent only sees the ones you register. This is defense in depth — MCP’s tool boundaries at the protocol layer plus Composio’s action scoping at the credential broker layer. If a prompt-injected agent tries to call GMAIL_DELETE_EMAIL, the tool doesn’t exist in its capability set (MCP rejects at the protocol layer) AND the Composio server isn’t configured to expose it (second rejection at the credential broker). Two layers of defense, zero custom code.
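Conceptually, action scoping is an allowlist applied to a larger catalog. The sketch below shows that idea only; it is not how Composio implements the flag, and the function names and the three-item catalog are assumptions made for illustration.

```typescript
// Parse a comma-separated --actions value into a clean list of action names.
function parseActionsFlag(value: string): string[] {
  return value
    .split(",")
    .map((action) => action.trim())
    .filter((action) => action.length > 0);
}

// Expose only the catalog actions that were explicitly allowed.
// A typo in the allowlist fails loudly instead of silently exposing nothing.
function scopeActions(catalog: readonly string[], allowed: readonly string[]): string[] {
  const unknown = allowed.filter((action) => !catalog.includes(action));
  if (unknown.length > 0) {
    throw new Error(`unknown actions in allowlist: ${unknown.join(", ")}`);
  }
  return catalog.filter((action) => allowed.includes(action));
}

// Illustrative catalog; a real connector would offer many more actions.
const GMAIL_CATALOG: readonly string[] = [
  "GMAIL_SEND_EMAIL",
  "GMAIL_FETCH_EMAILS",
  "GMAIL_DELETE_EMAIL",
];
```

Anything left out of the allowlist, such as GMAIL_DELETE_EMAIL here, never appears in the tool list the agent discovers.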
According to Composio’s March 2026 metrics, their platform supports over 10,000 tool actions across 2,500+ apps. We’ve tested every major MCP connector framework — Composio, Toolhouse, LangChain’s tool layer, and several internal builds. Composio wins on three dimensions: OAuth management (your agent never touches credentials), action granularity (you pick exactly which operations to expose, not a broad category permission), and MCP compliance (native JSON-RPC with proper schema definitions, not a retrofitted wrapper). It’s been our default since day one at beeeowl.
What about building custom MCP servers for internal tools?
Not everything your executive needs lives in a SaaS app. If the agent needs to query an internal PostgreSQL database, pull metrics from a Grafana dashboard, interact with a proprietary CRM, or read from a legacy system that predates REST APIs, you build a custom MCP server. The specification is open, the SDKs are solid, and the integration takes hours rather than weeks.
Anthropic publishes official SDKs for TypeScript, Python, Java, Kotlin, and C# — covering virtually every backend stack your team might use. Here’s a minimal MCP server in TypeScript:
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Placeholder for your internal API call; the returned shape is illustrative only.
async function fetchRevenueData(quarter: string, businessUnit?: string) {
  return { quarter, businessUnit: businessUnit ?? "all", revenue: null };
}

const server = new McpServer({
  name: "internal-revenue-dashboard",
  version: "1.0.0"
});

// Register one tool: name, description, input schema (zod), and handler.
server.tool(
  "get_revenue_summary",
  "Returns revenue summary for a given quarter with optional BU filter",
  {
    quarter: z.string().describe("Quarter in format Q1-2026"),
    business_unit: z.string().optional().describe("Filter by business unit")
  },
  async ({ quarter, business_unit }) => {
    const data = await fetchRevenueData(quarter, business_unit);
    return {
      content: [{
        type: "text",
        text: JSON.stringify(data)
      }]
    };
  }
);

// Serve over stdio so the agent can launch this file as a local process.
const transport = new StdioServerTransport();
await server.connect(transport);
Register it in your OpenClaw config, and the agent discovers it automatically on the next initialization:
mcpServers:
  internal-revenue:
    command: "node"
    args: ["./mcp-servers/revenue-dashboard/index.js"]
That’s it. The agent now knows it can call get_revenue_summary with a quarter and optional business unit. No prompt changes. No redeployment of the agent itself. The tool appears in the next tools/list response and is callable from that point on. According to the MCP community registry on GitHub, over 3,800 custom MCP servers were published in the first quarter of 2026 alone. Most of them follow this same pattern: 50-200 lines of code wrapping an existing internal API or database query layer with the MCP envelope.
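For a sense of what "wrapping with the MCP envelope" means once the SDK is set aside, here is a dependency-free sketch of the dispatch a server performs. It is a simplification under our own assumptions: the handler is synchronous for brevity, the revenue data is a placeholder, and a production server would use the official SDK, which also covers initialization and input validation.

```typescript
interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: { name?: string; arguments?: Record<string, unknown> };
}

interface RpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

interface RegisteredTool {
  description: string;
  inputSchema: object;
  // Synchronous for brevity; a real handler would usually be async.
  handler: (args: Record<string, unknown>) => string;
}

const registeredTools = new Map<string, RegisteredTool>([
  ["get_revenue_summary", {
    description: "Returns revenue summary for a given quarter",
    inputSchema: { type: "object", properties: { quarter: { type: "string" } }, required: ["quarter"] },
    // Placeholder data; a real handler would query the internal system.
    handler: (args) => JSON.stringify({ quarter: args["quarter"], revenue: null }),
  }],
]);

function handleRequest(req: RpcRequest): RpcResponse {
  if (req.method === "tools/list") {
    const tools = Array.from(registeredTools.entries()).map(([toolName, tool]) => ({
      name: toolName,
      description: tool.description,
      inputSchema: tool.inputSchema,
    }));
    return { jsonrpc: "2.0", id: req.id, result: { tools } };
  }
  if (req.method === "tools/call") {
    const tool = registeredTools.get(req.params?.name ?? "");
    if (!tool) {
      // -32602 is the JSON-RPC "Invalid params" code.
      return { jsonrpc: "2.0", id: req.id, error: { code: -32602, message: "Unknown tool" } };
    }
    const text = tool.handler(req.params?.arguments ?? {});
    return { jsonrpc: "2.0", id: req.id, result: { content: [{ type: "text", text }] } };
  }
  // -32601 is the JSON-RPC "Method not found" code.
  return { jsonrpc: "2.0", id: req.id, error: { code: -32601, message: `Method not found: ${req.method}` } };
}
```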
We’ve built custom MCP servers for beeeowl clients across Snowflake data warehouses, internal Confluence wikis, proprietary CRM systems, Grafana dashboards, legacy Oracle databases, and custom FastAPI applications. The pattern is always the same: identify the tool the executive needs, wrap it in an MCP server, register it in the agent config, done. The development time averages 2-4 hours for a competent backend engineer.
Who else has adopted MCP beyond Anthropic?
MCP started as an Anthropic specification, but it didn’t stay that way for long. In March 2025, OpenAI announced native MCP support in the Agents SDK and ChatGPT desktop app. Google DeepMind integrated MCP into Gemini’s tool-use pipeline in the same quarter. Microsoft added MCP support to Copilot Studio. Amazon’s Bedrock agent framework adopted MCP as its default tool protocol. By mid-2025, every major AI platform vendor spoke MCP natively.
The adoption pattern mirrors what happened with HTTP, JSON, and OAuth — a single company (or small group) publishes a specification that solves a real problem, the industry recognizes the problem as universal, and within 18 months the specification becomes the default. HTTP replaced proprietary protocols. JSON replaced XML for most API work. OAuth replaced vendor-specific authentication flows. MCP is following the same curve for AI agent tool integration, and it’s moving faster than any of them did — the HTTP specification took about 4 years to reach equivalent industry consensus in the early 1990s.
Sam Altman called MCP “a very cool step for the ecosystem” when OpenAI adopted it. Satya Nadella’s team at Microsoft described it as “the USB-C of AI tool integration.” Dario Amodei positioned it as foundational infrastructure for the agent era. The consensus is real, it’s cross-vendor, and it’s accelerating.
This matters for your deployment strategy because MCP isn’t a vendor lock-in play. If you decide to swap out the underlying LLM a year from now — moving from Claude to GPT-5 to Gemini 3 to whatever ships next — your MCP servers keep working. The tool layer is decoupled from the model layer. Your integrations survive model changes because the protocol is open and the servers run independently of whichever LLM is doing the reasoning.
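One way to picture the decoupling: the tool definition an MCP server publishes is the source of truth, and a thin adapter reshapes it for whichever model is doing the reasoning. The sketch below reflects our understanding of the tool shapes used by OpenAI-style function calling and Anthropic's Messages API at the time of writing; check the providers' current documentation before relying on it. The adapter function names are our own.

```typescript
// Tool definition as returned by an MCP server's tools/list.
interface McpTool {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;
}

// OpenAI-style function-calling entry: the JSON schema goes under "parameters".
function toOpenAiTool(tool: McpTool) {
  return {
    type: "function" as const,
    function: {
      name: tool.name,
      description: tool.description ?? "",
      parameters: tool.inputSchema,
    },
  };
}

// Anthropic Messages API entry: the same schema goes under "input_schema".
function toAnthropicTool(tool: McpTool) {
  return {
    name: tool.name,
    description: tool.description ?? "",
    input_schema: tool.inputSchema,
  };
}
```

Swapping models changes which adapter runs. The MCP server and the schema it publishes stay untouched.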
How does beeeowl configure MCP in every deployment?
Every beeeowl deployment ships with MCP configured at three levels, layered for defense in depth and operational flexibility.
Level 1: Composio MCP servers for SaaS tools. We typically configure 5-8 servers on day one — Gmail, Google Calendar, Slack, and the client’s CRM (Salesforce, HubSpot, Pipedrive, or Affinity) are the most common starting set. Each server is scoped to the minimum actions the agent needs using the --actions flag, following the principle of least privilege. If the executive only needs to read emails and create calendar events, the server exposes only those actions.
Level 2: OpenClaw’s native MCP layer for internal agent capabilities. This includes the Gateway’s built-in tools for authentication, audit logging, policy enforcement, and human-in-the-loop escalation. These don’t come from Composio — they’re part of the OpenClaw runtime itself — but they speak MCP so the agent interacts with them through the same protocol as any external tool. Consistency matters because it means the audit log captures every tool call in the same structured format regardless of whether the tool is internal or external. See our Gateway architecture deep-dive.
Level 3: Custom MCP servers for client-specific internal systems. When a client has internal databases, proprietary CRMs, executive dashboards, or legacy systems the agent needs to access, we build custom MCP servers during deployment. These are the 2-4 hour development exercises described above, and they’re where the real differentiation of a private deployment shows up. A client’s executive workflows often depend on a couple of internal systems that no off-the-shelf integration would cover, and custom MCP servers are the clean way to expose them to the agent.
The full MCP configuration lives in a single YAML file checked into the client’s secure configuration repository. We lock it down with file permissions, Docker namespace isolation, and the Gateway’s policy engine. According to Gartner’s 2026 AI Security Assessment, organizations that manage tool access through a protocol-level registry (like MCP) experience 71% fewer unauthorized tool invocations than those relying on application-level access controls retrofitted onto bespoke integrations.
After deployment, adding a new tool takes minutes, not weeks. Run composio add for SaaS apps, or register a custom MCP server for internal tools. The agent picks up the new capability on its next initialization — no code changes, no redeployment, no restart of the agent runtime.
What should you do next?
If you’re evaluating private AI deployment for your executive team, MCP is the integration layer that makes the whole thing practical at production scale. Without it, you’re back to writing custom API wrappers and managing credentials in config files — the path that 72% of DIY AI agent deployments abandon within 30 days per McKinsey’s 2025 State of AI. With it, your agent speaks a universal protocol that works with thousands of tools out of the box, scales to internal systems through quick custom servers, and survives model changes because the tool layer is decoupled from the model layer.
We’ve configured MCP across 150+ deployments at beeeowl. The pattern is consistent: start with 5-8 Composio-backed integrations for the SaaS tools the executive uses daily, add custom MCP servers for client-specific internal systems as needed, and let the Gateway’s audit trail track every tool invocation in structured JSON-RPC format. Full deployment details on our pricing page. For the broader protocol context in the agent ecosystem, see our upcoming piece on the OpenClaw Gateway control plane.
The tools your executives use don’t change. The protocol connecting them does, and MCP is the protocol that makes private AI agent deployment practical for the first time.


