Setting Boundaries: Getting Zero-Trust Tool Calling Right for Agentic AI (www.macawsecurity.com)

0 points 2 days ago ago | visit original

🤖 AI Summary

Researchers and practitioners are warning that the biggest security gap in agentic AI isn’t just prompt injection but the lack of explicit control/data-plane boundaries when LLMs get tool access. Mixing untrusted input with tool-calling enables three core attack vectors—intent hijacking, tool chaining, and context poisoning—and common mitigations (blocklists, prompt engineering, detection models) are brittle or probabilistic. The proposed fix reframes the problem as an architectural one: make tools the enforcement boundary and enforce intent and policy with cryptographic guarantees, not heuristics. The team introduces “Authenticated Workflows”: every actor (LLM, tool, app, user) has a cryptographic identity; intent is expressed as signed, policy-bound invocations; receivers verify signatures and enforce policies before execution; and every call emits an attestation chain. For LLMs/MCP this includes Authenticated Prompts, depth limits and intent binding so evolving prompts cannot escalate privileges. Effective permissions are the cryptographic intersection of User Intent ∩ App Policy ∩ Tool Policy ∩ System Policy ∩ Context State, making prompt injection unable to forge authority. Implementations (SecureOpenAI, SecureMCP) reportedly block complex injection chains while remaining transparent to developers—akin to mTLS for tool-calling—offering a provable, zero-trust way to secure agentic AI at scale.

Loading comments...

loading comments...