On MCP security

Posted on April 6, 2025 • 3 minutes • 618 words

LLM MCP security encompasses numerous critical concerns.

Recently, Invariant Labs notified the community of a new type of attack called tool poisoning .

Their article raises several security concerns that have been largely overlooked amid the excitement around new AI applications with MCP.

Free-form text instruction

The root cause of tool poisoning attack is that the agent ingests MCP’s free-form text output from tools/list without any verification. This unverified content is then shared with other contexts like tools/call or whatever else the agent is processing.

This is a hard problem that none of the current MCP servers can really prevent. It’s up to the client (the agent) to fix that.

One of the possible solutions is for MCP specs to require servers return tool’s description in a strict schema that describes execution plan instead of free-form instruction. Then the agent decides how to act on it.

This would make it easier to validate tool intentions and prevent manipulation through malicious text instructions. Additionally, having a standardized schema would improve interoperability between different MCP implementations and agent frameworks.

In an ideal world, we need something like Coq/TLA+ for tools to prove that with this execution plan, they can achieve that task. This formal verification approach would provide mathematical guarantees about tool behavior, eliminating entire classes of security vulnerabilities that current systems struggle to address.

If we can do this, we can limit what agent can do with tools & disallow tool-to-agent instruction entirely.

Out-of-band review for sensitive MCP

Some companies implement a second model/tool (a gatekeeper LLM) to verify MCP tool description, but I don’t think this is the optimal approach.

While several companies are trying this strategy, it lacks deterministic guarantees and can introduce additional complexity without solving the underlying security issues.

Tool permissions

MCP currently doesn’t have specifications or requirements for tool capability boundaries. For example, a tool might advertise itself as performing calculations while actually reading your SSH keys and uploading them elsewhere.

Solution: Use a sandboxed environment, such as a container, microVM, or WebAssembly.

Shameless plug: This is part of the reason why I built hyper-mcp .

hyper-mcp uses a plugin system that lets you extend MCP server capabilities. Need to interact with GitHub? Just add the GitHub plugin.

When designing hyper-mcp, I made two key architectural decisions:

OCI for plugin distribution: The Open Container Initiative provides robust security features that we should leverage: image signing, signature verification, included SBOMs, and more. This allows us to build on existing security infrastructure rather than reinventing it.
WASM for sandboxing: WebAssembly provides a secure-by-default execution environment with precise control over capabilities.
- WASI (WebAssembly System Interface) allows for fine-grained permission controls, making it perfect for running untrusted code safely. This approach combines performance with security, allowing tools to run at near-native speed while maintaining strong isolation guarantees.
- The portable nature of WASM also enables tools to run consistently across different platforms and environments.

hyper-mcp leverages WebAssembly (WASI) to create a sandboxed environment for each tool to run in. Tools are locked down by default with no filesystem access and no network access, unless they explicitly declare what permissions they need to function.

hyper-mcp is still susceptible to tool shadowing attack but its security model is better than none.

Conclusion

As AI agents and LLMs become more prevalent, securing MCP implementations must be a priority. The combination of structured schemas for tool descriptions and proper sandboxing represents our best path forward.

While no solution is perfect yet, awareness of these attack vectors is the first step toward building more secure AI systems.

The community needs to collaborate on establishing better security standards by directly contributing to the MCP specs; before widespread adoption makes these issues more difficult to address.