Schema DriftMCP HealthSilent Failures

Schema Drift in MCP: The Silent Failure Your Agents Cannot Detect

A field was renamed in a community MCP server update. Agents kept calling the tool with the old field name, got empty results, and hallucinated downstream answers for three days before anyone noticed. Schema drift is the silent failure that your agents cannot detect on their own.

April 2, 2026·8 min read·LangSight Engineering
Schema Drift in MCP: The Silent Failure Your Agents Cannot Detect
Schema DriftThe silent failure your agents can't detect

What is schema drift?

Every MCP tool has an input schema — a JSON Schema definition that describes what arguments the tool accepts. When an agent connects to an MCP server, it calls tools/list and receives these schemas. The LLM uses them to construct correct tool call arguments.

Schema drift occurs when the MCP server changes a tool's schema between versions. A parameter gets renamed. A field type changes from string to integer. A new required argument is added. An entire tool is removed or replaced with a different name.

The agent, which was tested and deployed against the old schema, continues calling tools with the old argument names and types. Depending on how the server handles unrecognized arguments, this results in one of three outcomes — each progressively harder to detect.

Three failure modes of schema drift

Mode 1: Hard failure (best case)

The MCP server validates incoming arguments against the new schema and rejects calls with invalid arguments. The agent gets a clear error: "error": "Unknown parameter: customer_id. Did you mean: account_id?"

This is the best outcome because it is immediately visible. The agent fails loudly, the error appears in traces, and someone investigates. Unfortunately, many MCP servers do not do strict input validation — they accept unknown arguments silently.

Mode 2: Partial results (harder to detect)

The server accepts the call but ignores the unrecognized argument. The tool returns results, but without the intended filter. Instead of returning customer #456, it returns all customers (or the first customer, or an empty result set depending on the implementation).

# Before schema drift: the agent sends the correct field name
tool_call: get_customer(customer_id=456)
→ Returns: { "name": "Acme Corp", "plan": "enterprise" }

# After schema drift: field renamed to account_id
# The agent still sends customer_id (the old name)
tool_call: get_customer(customer_id=456)
→ Returns: { "name": null, "plan": null }
# Server ignored the unknown field, returned empty result

# The agent now reasons over empty data:
"I could not find any customer with that ID. The customer
 may not exist in our system."
# WRONG — the customer exists, the field name just changed

This is the most dangerous mode. The tool call succeeds (no error), the agent gets data back, but the data is wrong or empty. The agent then confidently reasons over incorrect data and provides the user with a wrong answer — which looks correct because it is well-formatted and articulate.

Mode 3: Semantic shift (hardest to detect)

The field name stays the same but its meaning changes. A status field that previously accepted "active" | "inactive" now accepts "enabled" | "disabled" | "suspended". The agent sends "active", the server does not recognize it, and returns results for all statuses — or no results.

Semantic shifts are nearly impossible to detect without comparing the full schema definition (including enum values, descriptions, and constraints) between versions.

Why agents cannot detect schema drift

When an agent session starts, the client calls tools/list and gets the current tool schemas. The agent uses these schemas for that session. But the schemas the agent was tested against might be different from the schemas the server is currently serving.

The agent has no memory of what the schema looked like when it was tested and deployed. It sees the current schema, constructs arguments based on the LLM's understanding of the current schema, and makes the call. If the schema changed between the last deployment and the current session, the agent does not know.

Even if the agent re-fetches schemas at the start of each session (which most frameworks do), this does not help. The agent's behavior was tuned against the old schema. The LLM's system prompt, examples, and training data all reference the old field names and types. The new schema is different, but the agent does not know what changed or how to adapt.

How LangSight tracks schema drift

LangSight stores a snapshot of every MCP server's tool schemas on every health check. When the health checker runs (every 30 seconds by default), it calls tools/list, computes a hash of the full schema response, and compares it against the last known hash.

If the hash changes, LangSight generates a detailed diff showing exactly what changed:

$ langsight mcp-health

Schema drift detected on crm-mcp:

  Tool: get_customer
  Change type: BREAKING — field renamed
  Before: { "customer_id": { "type": "string", "required": true } }
  After:  { "account_id":  { "type": "string", "required": true } }

  Tool: search_contacts
  Change type: COMPATIBLE — new optional field added
  Before: { "query": { "type": "string" } }
  After:  { "query": { "type": "string" }, "limit": { "type": "integer", "default": 50 } }

  Tool: delete_customer
  Change type: REMOVED — tool no longer available

  Impact: 3 agents use crm-mcp (support-agent, onboarding-agent, billing-agent)
  Action: review changes before upgrading agents

The diff categorizes changes into three types:

  • BREAKING — field renamed, field type changed, required field added, tool removed. Agents will fail or produce incorrect results.
  • COMPATIBLE — new optional field added, new tool added, description updated. Agents will continue working but may not take advantage of new capabilities.
  • SUSPICIOUS — description changed significantly (possible poisoning), enum values changed (semantic shift). Requires manual review.

The rug pull attack

Schema drift is usually accidental — a developer renames a field without considering backward compatibility. But it can also be intentional.

A "rug pull" attack works like this: an attacker publishes a useful, well-reviewed MCP server. It gains adoption — hundreds of agents depend on it. Then the attacker pushes an update that changes tool descriptions to include poisoned instructions, or changes tool schemas to redirect data to attacker-controlled endpoints.

If teams auto-update MCP server dependencies (which many do), the poisoned version deploys silently. The tool names are the same. The schemas look similar. But the behavior has changed.

Schema drift detection catches this because it detects any change — including description changes that might contain injection patterns. Combined with LangSight's poisoning detector, the alert includes both "the schema changed" and "the new description contains suspicious patterns."

Versioning strategies

Pin exact versions

Never use latest or unpinned versions for MCP servers in production. Pin the exact version or commit hash. This ensures that schema changes only happen when you explicitly upgrade.

# .langsight.yaml — pinned versions
servers:
  - name: crm-mcp
    transport: stdio
    command: "uvx --from crm-mcp==2.1.4 crm-server"
    schema_pin: "sha256:a1b2c3d4..."  # hash of expected schema

Schema pinning

Beyond version pinning, pin the expected schema hash. If the server returns a different schema than expected — even if the version number has not changed — LangSight alerts. This catches scenarios where a server binary is modified without changing its version (compromised supply chain).

Staged rollouts

When upgrading an MCP server version, do not upgrade all agents at once. Upgrade one agent, monitor for 24 hours, check for schema-related errors and anomalies, then roll out to the rest. LangSight's per-agent session data makes it easy to compare error rates before and after the upgrade.

Key takeaways

  • Schema drift is the most under-monitored MCP failure mode. It causes silent data corruption, not loud errors. Agents return wrong answers confidently.
  • Three failure modes, all dangerous: hard failures (detectable), partial results (subtle), and semantic shifts (nearly invisible).
  • Agents cannot detect schema drift on their own. They see the current schema, not the difference between current and tested schema.
  • Automated detection is essential. Snapshot schemas on every health check, diff on change, categorize as BREAKING/COMPATIBLE/SUSPICIOUS.
  • Pin versions and schemas. Never auto-update MCP servers in production. Pin both the version and the expected schema hash.

Related articles

Detect schema drift before your agents break

LangSight snapshots tool schemas on every health check and alerts on any change — with diffs, categorization, and impact analysis. Self-host free, Apache 2.0.

Get started →