Prompt injection tool misuse
A malicious prompt causes the agent to misuse a tool, breaching intended boundaries.
- See attack-chain-template.md for full structure.
- Related: docs/01-threat-model.md, docs/02-attack-surfaces.md, patterns/secure-tool-calling.md
An attacker hides directives inside ordinary user input so the agent treats them as legitimate goals, calls a tool with attacker-chosen parameters, and reaches the downstream system before any policy check has a chance to fire.
- Untrusted input
- Agent
- Tool
- Downstream system
- Intent reinterpreted, — no policy check
Defence source-labels every input, separates instructions from data, forces a policy decision before any tool is invoked, and routes the call through a tool broker so that even a successful injection cannot reach the downstream system unchecked.
- 1Source labelling — on every input
- 2Instruction-data — separation
- 3Policy decision — before action
- 4Tool broker: — allowlist and schema validation
- 5Outcome control — and end-to-end trace