Agentic attack chains
Agentic attack chains describe how local weaknesses compose into organisational impact. The risk is rarely a single prompt, response, or tool call. It is the path from influence, to intent, to action, to authority, to state, to impact.
This document is a defensive model. It does not provide operational exploit instructions. It helps teams identify where a chain can start, how it can move across the execution system, what evidence should be preserved, and where controls should interrupt the path.
For the surface map behind these chains, read Attack Surfaces: Agentic Execution Systems. For the underlying failure modes, read the threat model.
Progressive Breach Model
Agentic breach chains often follow this progression:
- Untrusted language or data
- Intent or goal compromise
- Tool, workflow, or code action
- Credential or delegated authority use
- Memory, state, or context change
- Propagation across agents or systems
- Unsafe autonomous action
- Organisational impact
The breach progression below shows eight chronological stages from untrusted input to organisational impact.
```mermaid
timeline
    title Progressive breach progression
    Influence : Untrusted language or data
    Intent : Goal compromise
    Action : Tool, workflow, or code
    Authority : Credential or delegated authority
    State : Memory or context change
    Propagation : Across agents or systems
    Autonomy : Unsafe autonomous action
    Impact : Organisational impact
```
Not every incident follows every stage. Some chains stop at unsafe disclosure or misleading advice. Others skip memory or multi-agent propagation. The model is useful because it makes defenders ask where influence becomes action and where action becomes durable impact.
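The ordered stages can be expressed in code so that controls and evidence can be attached per stage. This is a minimal sketch; the enum names and the `earliest_interruption` helper are illustrative, not part of any existing library.

```python
from enum import IntEnum

class BreachStage(IntEnum):
    """Ordered stages of the progressive breach model (illustrative names)."""
    INFLUENCE = 1    # untrusted language or data enters
    INTENT = 2       # goal compromise
    ACTION = 3       # tool, workflow, or code action
    AUTHORITY = 4    # credential or delegated authority use
    STATE = 5        # memory, state, or context change
    PROPAGATION = 6  # across agents or systems
    AUTONOMY = 7     # unsafe autonomous action
    IMPACT = 8       # organisational impact

def earliest_interruption(reached: list) -> BreachStage:
    """Given the stages observed in an incident, return the earliest one,
    i.e. where a control should have interrupted the chain first."""
    return min(reached)
```

Ordering the stages makes the defensive question concrete: whichever stage an incident reached first is where the missing control sits.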
Chain Stages
| Stage | What changes | Control opportunity | Evidence to preserve |
|---|---|---|---|
| Influence enters | User input, retrieved text, tool output, message, ticket, email, or web content affects reasoning. | Label source, trust level, freshness, and sensitivity. | Prompt, source metadata, retrieval query, tool result, or message origin. |
| Intent shifts | The agent reinterprets the goal, prioritises a competing objective, or hides risk. | Keep user intent separate and re-check alignment before action. | User goal, planning trace, policy decision, and risk explanation. |
| Boundary is crossed | The agent selects a tool, workflow, code path, MCP server, skill, extension, or second agent. | Validate tool choice against task, identity, data sensitivity, and expected outcome. | Tool selection reason, parameters, capability metadata, and policy result. |
| Authority is used | A token, service identity, user session, approval, or delegated permission enables action. | Broker credentials and require scoped, task-bound authority. | Effective identity, scope, lifetime, approval, and secret-handling record. |
| State changes | Memory, files, tickets, repositories, queues, summaries, or downstream records are updated. | Review, scope, expire, and audit state changes. | Diff, memory write, state owner, provenance, and rollback path. |
| Propagation occurs | Another agent, workflow, human, or system treats manipulated output as trusted input. | Authenticate and label cross-agent messages and shared artefacts. | Sender, recipient, task scope, shared state, and downstream action trace. |
| Impact lands | Data, code, cloud resources, communications, financial processes, operations, or customers are affected. | Apply outcome controls, approval gates, rollback, and incident response. | Final action, business owner, customer or operational effect, and remediation record. |
Chain Pattern 1: Instruction Influence To Tool Action
In this pattern, untrusted language changes how the agent interprets the user’s goal. The agent then selects a tool or workflow that can affect a real system.
Defensive concern: the system treats instruction-like text as control input even though it came from an untrusted source.
Control points:
- Mark external language as data unless it is explicitly trusted to instruct behaviour.
- Compare the planned tool call with the user’s original goal.
- Require policy review for tools that write, delete, send, deploy, approve, or change configuration.
- Preserve the source text, tool parameters, policy result, and downstream effect in one trace.
Merge-ready evidence: reviewers can see why the action was selected, which source influenced it, which authority was used, and why the action matched the user’s approved intent.
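The control points above can be sketched as a planning step that keeps external language in a data channel and routes state-changing verbs through policy review. All names here (`WRITE_VERBS`, `plan_tool_call`) are hypothetical, not a real framework API.

```python
# Verbs that change real systems always require a policy decision first.
WRITE_VERBS = {"write", "delete", "send", "deploy", "approve", "configure"}

def plan_tool_call(user_goal: str, external_text: str, tool: str, verb: str) -> dict:
    # External language is wrapped as untrusted data; it is never
    # concatenated into the instruction channel.
    context = {"role": "data", "content": external_text, "trusted": False}
    # Risky verbs route through policy review before execution.
    needs_review = verb in WRITE_VERBS
    # One trace links source, goal, tool, and policy requirement.
    return {
        "user_goal": user_goal,
        "context": context,
        "tool": tool,
        "verb": verb,
        "policy_review_required": needs_review,
    }
```

The design choice is the separation of channels: the agent may read the external text, but only the user goal and the policy decision determine whether a write-class tool runs.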
Chain Pattern 2: Poisoned Context To Misleading Approval
In this pattern, retrieved context supplies misleading evidence. The agent uses that evidence to justify an approval request or a sensitive recommendation.
Defensive concern: the human approver sees a confident summary but not the provenance, freshness, trust level, or uncertainty behind it.
Control points:
- Attach source, freshness, sensitivity, and trust labels to retrieved evidence.
- Require multiple or higher-trust sources for high-impact actions.
- Show the approver the relevant context, the uncertainty, and the expected effect.
- Log the approval decision with the evidence available at the time.
Merge-ready evidence: the approval record includes what the reviewer saw, which sources supported the decision, and how the system handled uncertainty.
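One way to enforce the multiple-source control point is a gate that only marks an approval request as ready when enough fresh, high-trust evidence backs it. The `Evidence` record and the 30-day freshness threshold are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    source: str
    trust: str        # e.g. "high", "medium", "low"
    age_days: int
    sensitivity: str

def approval_ready(evidence: list, high_impact: bool) -> bool:
    """High-impact actions need at least two fresh, high-trust sources;
    low-impact actions need at least one source of any kind."""
    fresh_high = [e for e in evidence if e.trust == "high" and e.age_days <= 30]
    if high_impact:
        return len(fresh_high) >= 2
    return len(evidence) >= 1
```

The approver would still see the labels themselves; the gate only prevents a confident summary built on a single stale or low-trust source from reaching them unflagged.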
Chain Pattern 3: Tool Result To Memory Persistence
In this pattern, a temporary tool response or external message becomes stored memory. Later, the agent retrieves that memory as trusted context and uses it to guide action.
Defensive concern: a short-lived influence becomes persistent state without provenance, review, or expiry.
Control points:
- Treat memory writes as state changes, not as harmless notes.
- Require provenance, owner, scope, reason, and expiry for memory entries.
- Prevent secrets, policy overrides, and untrusted behavioural instructions from entering memory.
- Log future memory reads that influence tool calls, approvals, or decisions.
Merge-ready evidence: memory entries can be inspected, corrected, expired, deleted, and tied back to the source that created them.
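A minimal sketch of a gated memory write, assuming an in-process dict as the store: every entry carries provenance, owner, and expiry, secret-like values are refused, and expired entries are treated as absent on read. The secret-detection regex is deliberately simple and illustrative.

```python
import re
import time

# Illustrative pattern; real deployments would use a proper secret scanner.
SECRET_PATTERN = re.compile(r"(api[_-]?key|password|token)\s*[:=]", re.I)

def write_memory(store: dict, key: str, value: str, *, provenance: str,
                 owner: str, ttl_seconds: int) -> bool:
    """Gate memory writes: require provenance, owner, and expiry; refuse secrets."""
    if SECRET_PATTERN.search(value):
        return False  # secrets must not become persistent agent state
    store[key] = {
        "value": value,
        "provenance": provenance,
        "owner": owner,
        "expires_at": time.time() + ttl_seconds,
    }
    return True

def read_memory(store: dict, key: str):
    """Expired entries are dropped rather than returned as trusted context."""
    entry = store.get(key)
    if entry is None or entry["expires_at"] <= time.time():
        store.pop(key, None)
        return None
    return entry
```

Because every entry records its source, a later incident review can tie a memory read that influenced a tool call back to the tool response or message that created it.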
Chain Pattern 4: Broad Authority To Downstream Change
In this pattern, the agent acts through credentials that are broader than the task requires. A narrow request can therefore affect repositories, cloud resources, SaaS records, customer communications, or operational workflows outside the intended scope.
Defensive concern: the system cannot prove that authority was limited to the approved task and expected outcome.
Control points:
- Broker credentials per task, action, and approval.
- Use short-lived scopes and deny actions outside the approved boundary.
- Show effective identity, scope, and expected effect before high-impact actions.
- Preserve audit evidence that connects identity, policy, approval, and outcome.
Merge-ready evidence: each downstream change can be traced to a task-bound identity, explicit scope, policy decision, and approval record where required.
Chain Pattern 5: Cross-Agent Propagation
In this pattern, one agent’s output becomes another agent’s input through a queue, shared memory, ticket, comment, orchestration system, or delegated task. The receiving agent treats the content as trusted enough to act.
Defensive concern: trust labels, scope, and authority are lost when work moves between agents.
Control points:
- Authenticate agent-to-agent messages and label origin, trust level, and delegated scope.
- Prevent one agent from silently granting authority or policy exceptions to another.
- Treat shared memory, task artefacts, and queues as boundaries.
- Link traces across agents so incident review can reconstruct the full path.
Merge-ready evidence: the organisation can show which agent produced an artefact, which agent consumed it, what authority was delegated, and what downstream action followed.
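Authenticating and labelling agent-to-agent messages can be sketched with an HMAC over an envelope that carries origin, trust level, and delegated scope, assuming a shared key between the agents. The envelope fields are illustrative.

```python
import hashlib
import hmac
import json

def sign_message(key: bytes, origin: str, trust: str, scope: str, body: str) -> dict:
    """Wrap agent output in a signed envelope carrying its labels."""
    envelope = {"origin": origin, "trust": trust, "scope": scope, "body": body}
    payload = json.dumps(envelope, sort_keys=True).encode()
    envelope["mac"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return envelope

def verify_message(key: bytes, envelope: dict) -> bool:
    """Reject any envelope whose origin, trust, scope, or body was altered."""
    received = dict(envelope)
    mac = received.pop("mac", "")
    payload = json.dumps(received, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(mac, expected)
```

Because the labels are inside the signed payload, a downstream agent cannot silently upgrade the trust level or widen the delegated scope of work it received.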
Chain Pattern 6: Observability Gap To Repeated Failure
In this pattern, the organisation cannot connect prompts, retrieved context, tool calls, memory changes, credentials, approvals, and downstream actions. The incident may be noticed only after impact, and the same weakness can recur because the path is not visible.
Defensive concern: governance relies on final outputs or isolated logs rather than a reconstructable action path.
Control points:
- Link prompt, context, tool, credential, memory, approval, output, and downstream-action records.
- Review traces for multi-step behaviour and risky tool combinations.
- Test workflows that include memory, tools, approvals, autonomy, and agent-to-agent hand-offs.
- Use incident evidence to improve policies, evaluations, and approval design.
Merge-ready evidence: reviewers can reconstruct what happened, why it was allowed, what changed, and which control should prevent recurrence.
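Linking the records is conceptually simple: every event in a workflow carries the same trace id so the path can be replayed in order. This is a minimal sketch with hypothetical names, not a tracing product.

```python
import uuid

class Trace:
    """Link prompt, context, tool, credential, memory, approval, and
    downstream-action records under one trace id."""

    def __init__(self):
        self.trace_id = str(uuid.uuid4())
        self.events = []

    def record(self, kind: str, **details):
        self.events.append({"trace_id": self.trace_id, "kind": kind, **details})

    def reconstruct(self) -> list:
        """Return the ordered kinds of event, i.e. the action path."""
        return [e["kind"] for e in self.events]
```

The value is in the join key, not the storage: once prompt, tool, credential, and approval records share a trace id, "why was this allowed" becomes a query rather than a forensic reconstruction.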
Interruption Checklist
For any proposed agentic workflow, ask where the chain would be interrupted:
- Can untrusted language be prevented from becoming control instruction?
- Can goal alignment be checked before a risky tool call?
- Can policy decisions evaluate intent, authority, data sensitivity, and likely impact together?
- Can credentials be scoped to the task and revoked after use?
- Can memory writes be reviewed, expired, corrected, and traced?
- Can cross-agent messages preserve origin, trust level, and delegated scope?
- Can humans see enough evidence before approving sensitive action?
- Can the full path from influence to outcome be reconstructed after the fact?
If the answer is unclear, the system is difficult to govern even if the model appears safe in isolation.
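The checklist can be run mechanically: any question answered "no" or left unclear is a stage where the chain may run uninterrupted. The item names below are shorthand for the eight questions above and are purely illustrative.

```python
# Shorthand keys for the eight interruption questions (illustrative names).
CHECKLIST = [
    "untrusted_language_blocked",
    "goal_alignment_checked",
    "policy_evaluates_context",
    "credentials_task_scoped",
    "memory_writes_governed",
    "cross_agent_labels_preserved",
    "approvers_see_evidence",
    "path_reconstructable",
]

def governability_gaps(answers: dict) -> list:
    """Return checklist items answered 'no' (False) or left unclear (None).
    An empty list means every interruption point has a positive answer."""
    return [item for item in CHECKLIST if answers.get(item) is not True]
```

Treating "unclear" the same as "no" matches the text above: a system whose interruption points cannot be answered is difficult to govern regardless of how safe the model looks in isolation.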
Engineering Patterns That Implement Each Interruption
Each interruption question above maps to one or more secure engineering patterns. The patterns describe the boundaries, decision points, audit edges, and deny or revise branches that engineers can build to. Use the secure engineering patterns overview for the full map; the table below is the quick lookup.
| Interruption question | Pattern(s) |
|---|---|
| Can untrusted language be prevented from becoming control instruction? | Secure Agent Runtime, Memory Security, Secure MCP |
| Can goal alignment be checked before a risky tool call? | Secure Agent Runtime, Secure Tool Calling |
| Can policy decisions evaluate intent, authority, data sensitivity, and likely impact together? | Secure Agent Runtime, Secure Tool Calling |
| Can credentials be scoped to the task and revoked after use? | Credential And Token Boundaries |
| Can memory writes be reviewed, expired, corrected, and traced? | Memory Security |
| Can cross-agent messages preserve origin, trust level, and delegated scope? | Planned multi-agent pattern |
| Can humans see enough evidence before approving sensitive action? | Secure Agent Runtime, Secure Tool Calling |
| Can the full path from influence to outcome be reconstructed after the fact? | Secure Agent Runtime, audit evidence sections in all five patterns |
Relationship To Defence Architecture
These chains show where controls need to operate. The defence architecture organises those controls into layers for identity, policy decisions, runtime guardrails, tool brokering, credential brokering, memory and context controls, observability, human approval, audit, outcome control, and governance.
The interaction between agent, policy, tool broker, tools, memory, and audit is shown below as a sequence diagram. The diagram is about who calls whom and in what order.
```mermaid
sequenceDiagram
    participant User
    participant Agent
    participant Policy
    participant Broker as Tool broker
    participant Tool as Tool or MCP
    participant Memory
    participant Audit
```