Landscape map

The security boundary for agentic AI is the execution system around the model: the prompts it receives, the context it retrieves, the tools it can call, the credentials it can use, the memory it can update, the code it can write or run, the approvals it can request, and the downstream systems it can affect.

This landscape map gives readers a common frame before moving into the threat model. It treats agentic systems as execution environments, not as isolated model endpoints.

The diagram below shows the agentic execution system as five stacked layers, with the control posture wrapping every layer.

1Control posture: Observe → Interpret → Constrain → Audit
2Inputs: instructions, retrieved context, memory
3Agent reasoning
4Action layer: tools, MCP, code, credentials, approvals
5Downstream systems and assurance evidence

The Protected Object

In a model-centred system, the protected object is often the prompt, the completion, or the data sent to and from the model. In an agentic system, the protected object is broader: it is the system of action that forms around language, tools, state, identity, and authority.

Component	Security question
Instructions	Which messages, prompts, policies, and delegated goals can shape behaviour?
Context	Which retrieved documents, data sources, and conversation state influence decisions?
Tools	Which functions, APIs, systems, files, and workflows can the agent invoke?
Credentials	Which user, service, or delegated authority does the action use?
Memory	Which facts, preferences, summaries, and learned state can persist across turns or sessions?
Code execution	Which generated scripts, shell commands, notebooks, or automation paths can change systems?
Approvals	Which actions require human review, and what evidence does the reviewer see?
Downstream systems	Which repositories, SaaS platforms, cloud resources, data stores, and communication channels can be changed?

The central question is therefore not only whether a model output is safe. It is what the agentic system can do, under whose authority, with which context, through which tools, and under what controls.

Language As An Execution Layer

Language becomes part of the execution layer when instructions can change action. A prompt, retrieved document, issue comment, ticket, email, chat message, web page, or tool response may influence whether an agent reads data, calls an API, writes code, updates memory, opens a pull request, sends a message, or changes configuration.

This does not mean language is executable in the same way as a binary or script. It means language can participate in execution paths by steering systems that have authority. Security therefore needs to inspect more than text safety. It needs to understand the relationship between instruction, intent, authority, tool choice, data sensitivity, and outcome.

Useful control questions include:

Which instructions are trusted, untrusted, system-owned, user-owned, or retrieved from external sources?
Which instructions can override, redirect, or reinterpret the user’s goal?
Which tool calls or memory writes can be triggered by language alone?
Which actions require policy checks, approval gates, sandboxing, or credential brokering before execution?

Main Risk Surfaces

Agentic risk appears wherever language, authority, state, and action meet.

Surface	Failure focus	Control question
Instruction flow	Untrusted instructions influence goals, policies, or tool choices.	Which instruction sources are allowed to steer behaviour, and which must be treated as data?
Retrieved context	External or stale context changes interpretation, priorities, or decisions.	How is context sourced, labelled, filtered, and bounded before use?
Tool interfaces	Tools expose unsafe actions, broad parameters, weak validation, or risky composition.	What can each tool do, and how is intent, authority, input, and output checked?
Credentials and tokens	Agents act with excessive, unclear, or poorly scoped authority.	Which identity is used for each action, and can credentials be limited per task?
Memory	Persistent state stores manipulated facts, preferences, summaries, or instructions.	What may be written to memory, who can influence it, and how is it reviewed or expired?
Code and automation	Generated code, scripts, or file operations create side effects beyond the reviewed output.	Where can code run, what can it touch, and what evidence is preserved?
MCP, skills, and extensions	Tool servers or packaged capabilities become authority-bearing execution boundaries.	How are capabilities discovered, trusted, configured, monitored, and revoked?
Human approvals	Reviewers approve actions without enough context, risk signal, or diff visibility.	What must a reviewer see before approving an action?
Multi-agent communication	One compromised agent influences another agent, queue, workflow, or shared memory.	How are messages authenticated, scoped, and constrained across agent boundaries?
Monitoring and evaluation	Logs, traces, benchmarks, and tests miss multi-step behaviour and downstream effects.	Can the organisation observe action paths, not only final responses?

These surfaces overlap. A retrieved document can influence a tool call. A tool response can update memory. A memory entry can affect a later approval request. A token can turn a weak instruction into an organisational action.

How Failures Compose

Most agentic failures are not single events. They are paths across the execution system.

Untrusted instructionCompromised intent
Compromised intentRisky tool selection
Risky tool selectionDelegated authority use
Delegated authority usePersistent state or downstream change
Persistent state or downstream changeBroader organisational impact

Different systems will expose different paths, but the pattern is consistent: a weak boundary in one place becomes more serious when it is connected to tools, credentials, memory, automation, or other agents.

Common composition patterns include:

An instruction attack changes how the agent interprets the user’s goal, then tool access turns that interpretation into action.
Poisoned context changes the evidence base, then memory makes the change persistent across sessions.
Broad credentials allow a narrow task to affect systems outside the user’s intended scope.
A weak approval path lets a human approve an action without seeing the instruction source, tool parameters, or likely impact.
A compromised tool, skill, or MCP server becomes a bridge between language-level influence and system-level side effects.
In a multi-agent workflow, one agent’s manipulated output becomes another agent’s trusted input.

Phase 5 will expand these paths into detailed breach-chain models. This phase establishes the shared landscape and vocabulary.

Control Posture

Securing agentic systems requires controls that operate across the execution environment:

Control posture	What it must cover
Observe	Prompts, retrieved context, tool calls, credential use, memory reads and writes, approvals, outputs, and downstream actions.
Interpret	Intent, instruction source, authority, data sensitivity, tool risk, policy fit, and likely outcome.
Constrain	Tool permissions, credential scopes, memory writes, code execution, data movement, approval gates, and autonomous actions.
Audit	Evidence for review, incident response, assurance, governance, and continuous improvement.

The threat model turns this landscape into specific failure modes and defender questions.