Poisoned retrieved context
Retrieved context is manipulated to influence agent behaviour or outputs.
- See attack-chain-template.md for full structure.
- Related: docs/01-threat-model.md, patterns/memory-security.md
An attacker plants a document the retriever will rank highly so that when the agent fetches context for an unrelated task it pulls in attacker-controlled content and treats it as authoritative evidence for a high-impact decision.
- Attacker plants — poisoned docRetriever ranks — by relevance
- Retriever ranks — by relevanceAgent treats content — as evidence
- Agent treats content — as evidenceAction on — false premise
- Action on — false premiseOrganisational — impact
Defence checks every retrieval result for provenance and freshness, attaches an explicit trust label, and routes it through a policy decision so that low-trust or stale content cannot silently shape the agent’s reasoning.
- Retrieval — resultProvenance — check
- Provenance — checkFreshness — filter
- Freshness — filterTrust label — applied
- Trust label — appliedPolicy — decision
- Policy — decisionAgent — reasoning