Skip to content

Hidden instruction document ingestion

Instructions are hidden in ingested documents, causing the agent to act unexpectedly.

An attacker conceals directives inside a document — in metadata, comments, or invisible text — so when retrieval ranks it highly the agent ingests it and treats the embedded instructions as control rather than evidence, overriding policy or invoking tools the user never asked for.

  1. Document with — hidden instructionsEmbedding — and ranking
  2. Embedding — and rankingHigh-rank — retrieval
  3. High-rank — retrievalAgent treats — text as control
  4. Agent treats — text as controlPolicy override / — unintended tool call



Defence checks provenance and freshness at ingestion, scans for instruction-shaped content, enforces instruction-data separation in the agent’s context, and runs every action through a policy decision and runtime guardrail so embedded directives cannot quietly become commands.

  1. 1Provenance and freshness — check at ingestion
  2. 2Instruction-shape — anomaly detection
  3. 3Instruction-data — separation in context
  4. 4Policy decision — before action
  5. 5Runtime guardrail — on drift from approved task