AI agents as attack pivots: the new lateral movement

A structural shift in cross-system compromise

Christian Schneider · 4 Mar 2026 · 16 min read
TL;DR
AI agents create a new form of lateral movement by bridging isolated systems through delegated authority and tool access – without new network paths or stolen credentials. Prompt injection exploits their autonomy and shared instruction-data channel, as shown in real-world incidents like Clinejection and unauthorized npm publishes. Multiple frameworks now recognize the pattern. Defenses require treating agents as trust boundaries with scoped access, strong identity, taint tracking, and segmentation.

Read on if your organization deploys AI agents that connect to multiple systems — the lateral movement risk they introduce isn't just theoretical.

This post is part of my series on securing agentic AI systems. Previous posts covered prompt injection amplification, threat modeling with a five-zone lens, MCP security architecture, RAG attack surfaces, and memory poisoning. Subscribe to my RSS feed to follow along.

A third class of lateral movement

In February 2026, a security researcher disclosed how a single GitHub issue title – just a sentence with a prompt injection payload – could compromise an AI coding assistant’s entire CI/CD pipeline. Eight days later, an unauthorized party used exactly that to compromise an npm publish token to push a poisoned package update; every developer who installed during the eight-hour window before detection got an unwanted payload (details below). The attacker never touched a single target machine directly. The bridge between a public comment field and a privileged software supply chain was an AI agent doing exactly what it was designed to do: read issues, run commands, publish packages. Your SIEM wouldn’t have flagged any of it.

For decades, lateral movement meant one of two things. Network-based: an attacker hops between VLANs, pivots through RDP sessions, exploits trust relationships between subnets. Identity-based: stolen credentials, Kerberos ticket abuse, token replay across services. Both are well-understood, and the defense playbooks for both are mature.

AI agents introduce something different. They move across systems not through network connections or credential replay, but through natural language instructions and tool invocations. The agent doesn’t need network access to the target system – it already has authenticated API connections to multiple systems as part of its normal operation. An attacker who compromises the agent’s input doesn’t need to steal credentials or exploit a network path. The agent’s own legitimate permissions become the attack surface.

The obvious pushback: isn’t this just identity-based lateral movement? The agent has credentials, it uses authenticated APIs – that’s credential abuse, not a new category. The distinction matters. In identity-based movement, the attacker acquires identity material (tokens, tickets, secrets) and replays it across services. In agent-mediated movement, the attacker never touches the credentials. They subvert the decision layer that already wields legitimate identities, injecting control flow through untrusted content. The pivot is a confused-deputy attack – the agent acts on behalf of the attacker using its own ambient authority, not because the attacker stole anything, but because the agent was persuaded. That’s a fundamentally different defensive problem.

I’m calling this agent-mediated lateral movement – a third class of pivot that sits alongside the network and identity dimensions. Orca Security independently coined “AI Lateral Movement” to describe the same phenomenon, and their research provides compelling proof-of-concept evidence. But the structural pattern is broader than any single vendor’s framing: the Promptware Kill Chain analysis (Brodt et al.) shows how prompt injection has evolved into a multistep, malware-like process that enables lateral movement across agentic AI systems and connected resources. Something fundamental changed.

The numbers suggest this isn’t an edge case. The Cisco State of AI Security 2026 report found that 83% of organizations plan to deploy agentic AI, but only 29% feel ready to secure those deployments. That gap between deployment ambition and security readiness is where agent-mediated lateral movement thrives.

How agents become bridges

What makes agents uniquely dangerous as pivot points? No previous technology combined all three of these properties:

Broad tool access
A single agent connects to email, CRM, databases, code repositories, cloud APIs, file systems, and more. The OWASP AI Vulnerability Scoring System (AIVSS) calls this the “External Tool Control Surface” – and unlike traditional middleware with narrow, well-defined interfaces, an agent’s tool surface is effectively unbounded. Each connected system is a potential pivot target.
Execution autonomy
The agent acts without human approval at each system boundary. When a vulnerability is exploited in System A, the agent propagates the attacker’s instructions to Systems B, C, and D without anyone reviewing the action. The agents are trusted to cross boundaries that humans would think twice about.
Natural language as the instruction channel
This is the structural root of the problem. Instructions and malicious payloads share the same channel – the agent literally cannot distinguish trusted instructions from untrusted data at an architectural level. The Cloud Security Alliance’s Agentic Trust Framework calls this the collapsed “instruction boundary.” Attackers inject instructions through any content the agent processes: email bodies, file metadata, issue titles, order comments, Slack messages.

The combination creates what I think of as a trust bridge: a low-trust input surface (a public GitHub issue, an email, a Slack message) is connected through the agent to a high-trust system (CI/CD pipelines, cloud infrastructure, payment systems) that was never designed to receive instructions from that input source. The agent is the bridge, and its legitimate permissions are the road.

Terminology

Three terms recur throughout this post, and they’re worth pinning down here because the rest of the argument depends on them.

Agent-mediated lateral movement
The specific attack pattern: an attacker uses an AI agent’s legitimate, authenticated connections to pivot between systems that have no direct trust relationship, by injecting instructions through content the agent processes. It differs from automation abuse in SOAR or ITSM systems because the attack vector is natural language, not API manipulation or workflow misconfiguration.
Trust bridge
The structural condition that enables it: a source zone (low-trust input), a bridge mechanism (agent with tool access and execution autonomy), and a destination zone (high-trust system) – connected only because the agent spans both.
Toxic combinations
A term coined by Pillar Security’s taint-flow analysis – what happens when individually safe tool permissions combine through an agent to create dangerous input-output paths; related to what Simon Willison calls the “lethal trifecta” of sensitive data, untrusted inputs, and outbound communication.
Agent-mediated lateral movement: every step uses legitimate permissions, and security monitoring sees normal behavior throughout
Agent-mediated lateral movement: every step uses legitimate permissions, and security monitoring sees normal behavior throughout

Attack chains: incident and demonstrations

This isn’t theoretical. The Clinejection supply chain compromise is a confirmed real-world incident. The other two cases that follow are staged security research demonstrations – but they show the same structural pattern generalizing across platforms: low-trust input, AI agent as pivot, high-trust action across a system boundary.

Clinejection: GitHub issue to npm compromise

Security researcher Adnan Khan discovered a vulnerability chain in the Cline AI coding assistant’s GitHub Actions workflow. The demonstrated attack chain: a crafted issue with prompt injection in the title triggered Cline’s AI triage agent (Claude). The agent executed a malicious bash command, which poisoned the GitHub Actions cache. The cached payload stole the npm publish token during the next release cycle.

Eight days after public disclosure, an unauthorized party used a compromised npm publish token (GHSA-9ppg-jx86-fqw7) to publish cline@2.3.0. The only modification: a postinstall script that globally installed an unauthorized package. A corrected version (2.4.0) was published roughly eight hours later. Only the CLI was affected – the VS Code extension and JetBrains plugin were not compromised. Public information confirms the token compromise and the unauthorized publish; whether the attacker executed every step of the demonstrated injection chain is not established in the advisory, but very likely.

Count the boundaries crossed: a public GitHub issue to an AI triage agent, to shell execution, to CI/CD cache state, to npm publish credentials, to the npm registry, and ultimately to developer machines. The agent bridged an untrusted comment field and a privileged software supply chain. No network intrusion. No memory exploit. Just a sentence in an issue title.

Agent-mediated lateral movement in cloud and e-commerce

Security researchers have demonstrated agent-mediated lateral movement (which they call “AI Lateral Movement”) across two platforms:

In the Prowler proof-of-concept (a cloud security scanner), prompt injection was embedded in EC2 instance metadata tags – a field rarely treated as an input vector. The AI remediation agent processed the tags as instructions and was coerced into invoking tools beyond its intended scope. In environments with write-capable tools, the same pattern can escalate to privileged actions across the account.
In a separate attack hypothesis against Open Mercato (an AI‑supportive CRM/ERP foundation framework), an order comment field carried injected instructions to the AI customer service agent. The staged scenario demonstrated how a business data field – something meant for “please leave at the door” – becomes an instruction carrier for an agent with backend access.

Here’s what gets me about both demonstrations: traditional security controls saw nothing. No network anomalies, no credential theft, no privilege escalation events in the logs. The agent used its own legitimate permissions at every step. If you’ve spent any time tuning SIEM rules for lateral movement detection, you’ll appreciate how completely this bypasses the playbook.

MCP as the literal bridge mechanism

The Cisco State of AI Security 2026 report documented attack scenarios where malicious GitHub issues with hidden instructions were processed by agents via Model Context Protocol (MCP) servers, leading to private repository data exfiltration. Cisco’s framing is direct: the “connective tissue” of the AI ecosystem has created “a vast and often unmonitored attack surface.”

I covered MCP-specific attack vectors in depth in my MCP security architecture post. What this post adds is the broader pattern: MCP is one bridge mechanism, but the agent-as-pivot problem exists regardless of the specific protocol. If you’re evaluating MCP servers for your agent stack right now, that post is the place to start.

Jake Williams (IANS Faculty) puts it bluntly: "[Model Context Protocol] will be the AI-related security issue of 2026" (IANS, February 2026).

Mapping to the five-zone lens

In my threat modeling post, I introduced a five-zone discovery lens for tracing attack paths through agentic systems. Every agent-as-pivot attack maps to this framework, and seeing the pattern helps explain why traditional security controls miss them:

Zone 1 — Input processing
Where the injection enters: a GitHub issue title (Clinejection), EC2 metadata tags (Prowler), an order comment field (Open Mercato). Each is a data field that the agent processes as potential instructions.
Zone 2 — Agent reasoning
Where goal hijacking occurs: in every case, the agent’s planning loop is redirected to serve the attacker’s objectives. The agent executes attacker-controlled instructions as its own planned actions – there’s no “exploitation” in the traditional sense, just persuasion.
Zone 3 — Tool execution
Where the bridge completes: the agent’s legitimate tool access becomes the attacker’s execution surface: bash commands (Clinejection), cloud API calls (Prowler), and backend operations (Open Mercato).
Zone 4 — Memory and state
Where persistence is established: in the Clinejection case, GitHub Actions cache poisoning abused shared CI workflow state—not agentic memory in the strict sense, but a persistence layer that outlived the initial execution context. In contrast, true agent memory poisoning affects long-lived instruction or retrieval stores. In both cases, a one-time injection can become a durable foothold for the attacker.
Zone 5 — Output and inter-agent communication
Where compromise propagates: when agents pass outputs to other agents or systems, the compromise cascades. The OWASP Top 10 for Agentic Applications (2026) captures these patterns explicitly: ASI07 (Insecure Inter-Agent Communication) and ASI08 (Cascading Failures) describe exactly this cross-system propagation.

The attack enters through Zone 1, hijacks Zone 2, executes through Zone 3, persists via Zone 4, and propagates through Zone 5. Traditional security tools typically monitor within a single zone. Agent-mediated lateral movement crosses all five.

Notice the pattern across every case: the attacker did not breach the network perimeter or exploit a software vulnerability. Instead, they injected instructions into an AI-powered workflow. The agent’s own legitimate permissions were the entire attack surface. That changes how you defend.

Framework convergence

What convinced me this is a real structural shift, not just a collection of incidents, is the framework convergence. Six independent organizations arrived at the same conclusion from different angles:

OWASP ASI Top 10
Already referenced in the five-zone mapping above, dedicates four of its ten items to cross-system bridging: ASI03 (Identity & Privilege Abuse), ASI04 (Agentic Supply Chain Vulnerabilities), ASI07 (Insecure Inter-Agent Communication), and ASI08 (Cascading Failures).
OWASP AIVSS
The OWASP AI Vulnerability Scoring System introduces an Agentic AI Risk Score that layers amplification factors – autonomy, tool use, multi-agent interactions, non-determinism, and self-modification – on top of CVSS v4.0 base scores, directly quantifying how agent capabilities amplify traditional vulnerabilities.
CSA MAESTRO
The MAESTRO framework maps cross-layer attack propagation across its seven layers.
MITRE ATLAS
ATLAS now includes a dedicated Lateral Movement tactic and agentic techniques such as AI Agent Tool Invocation and Exfiltration via AI Agent Tool Invocation, plus mitigations like Restrict AI Agent Tool Invocation on Untrusted Data and Human In-the-Loop for AI Agent Actions (release notes).
Agentic Trust Framework
Josh Woodruff’s Agentic Trust Framework (CSA, February 2026) identifies five execution boundaries that agents collapse.
Viral Agent Loop
Jiang et al. introduce the “Viral Agent Loop” (February 2026) – a model where agents act as vectors for self-propagating worms without exploiting code-level flaws, advocating a Zero-Trust Runtime Architecture that treats context as untrusted control flow.

The terminology converges across independent sources: Cisco warns that the “connective tissue” linking agents, models, and enterprise systems is an unmonitored attack surface. F5 and others describe the security challenge of securing AI-driven integrations and runtime pathways that didn’t exist before. When multiple research groups and industry players independently describe the same structural phenomenon with parallel metaphors, it underscores that this isn’t just theoretical.

The security paradox

There’s an irony here that’s worth sitting with. Security AI agents – the ones designed to monitor, detect, and respond – require access to SIEM data, vulnerability scans, threat intelligence, identity stores, and network topology. If compromised, an attacker doesn’t just get data access. They get a complete map of what you can detect, what you can’t, where your blind spots are, and how you respond. The agent designed to protect the infrastructure becomes, if compromised, the most valuable pivot point in the entire environment.

Every post in this series has focused on business AI agents – coding assistants, customer service bots, enterprise automation. But the same structural vulnerabilities apply to security agents, with higher-value data access and broader system visibility. If you’re deploying AI into your SOC, this isn’t a “nice to consider.” It’s the highest-stakes version of the trust bridge problem.

Treating agents as trust boundaries

So how do you defend against a pivot that uses legitimate permissions, generates no network anomalies, and crosses systems through natural language? Not by watching for the attack – by that point it looks identical to normal agent behavior. The answer, I think, comes from treating every agent as a trust boundary – not just a tool, but an entity that requires the same scrutiny as a privileged user or a network perimeter.

The Agentic Trust Framework (Josh Woodruff, CSA, February 2, 2026) structures this around five questions:

Who are you?
Assign each agent a unique cryptographic identity—not an inherited user context. Agents should be managed as Non-Human Identities (NHIs): machine credentials, service accounts, and API keys now vastly outnumber human users in most enterprises. The correct practice is to issue short-lived, role-specific credentials for every agent instance, kept distinct from the user credentials that originally launched the agent. This demands full lifecycle governance: provisioning, rotation, revocation, and audit. I’ll explore this further in an upcoming post.
What are you doing?
Observability, anomaly detection, and intent analysis. You cannot fully eliminate prompt injection, but you can detect when an agent’s behavior deviates from its declared scope. Define explicit operational baselines, monitor for goal hijacking (the agent pursuing objectives it was never assigned), and ensure that model outputs never automatically translate into authority without validation.
What are you eating? What are you serving?
Input validation, data protection, and output governance. Apply least-privilege data access per agent, per session, per task. Track data lineage so you always know what content the agent ingested and what it produced. Adopt taint-flow analysis (as highlighted by Pillar Security) to map which input-output combinations create unacceptable risk. Define sources (public tickets, emails, Slack, order notes, cloud tags), sinks (script execution, git writes, IAM changes, payments, outbound messages), and propagation points (memory, summaries, inter-agent handoffs) – then enforce policy at the sinks. Block, or require explicit approval, whenever tainted data is about to trigger a privileged action.
Where can you go?
This is where Simon Willison’s “Agentic Trifecta” comes into play: sensitive data, untrusted input, and outbound communication—any two may be justified, but combining all three in a single agent session is toxic (Willison, June 2025). Treat tool access controls like network segmentation. Design your architecture so that no agent session can ever hold all three at once, and apply special scrutiny and monitoring whenever an agent crosses two.
What if you go rogue?
Put in place circuit breakers, kill switches, and real containment plans. Elevate high-risk, cross-system actions for human approval. Monitor agent behaviors with the same rigor as privileged user accounts—because that’s effectively what AI agents are.

What to do about it

If you take one thing from this post, let it be this: agents are trust boundaries. Treat them like you would treat a privileged user account or a network perimeter, not like a productivity tool.

Start by mapping your agent bridges. For every deployed agent, identify which systems it connects and which of those connections cross trust boundaries. If an agent reads from an untrusted source and writes to a privileged system, you have a trust bridge that needs controls.

Then break the toxic combinations. Segment agent tool access so that no single agent spans a low-trust input and a high-trust output. This is network segmentation thinking applied to tool access. Use OWASP AIVSS to score each agent deployment: its Agentic AI Risk Score layers amplification factors (autonomy, tool access, multi-agent interactions) on top of CVSS base scores, giving you a single number to prioritize the deployments with the widest bridge spans.

Traditional SIEM rules won’t catch an agent using its own legitimate permissions to pivot across systems – there are no anomalous network connections or failed logins to trigger alerts. You need behavioral baselines specific to agent activity. At minimum, log five things per agent action: input provenance (source system and trust label), tool invocation (tool name, arguments, result size), policy decision (allowed or blocked, with reason), human approval events (when required), and cross-system side effects (any write actions). Alert when the pattern shifts, not when a rule fires. If your cloud security agent suddenly starts querying HR data it has never touched before, that deviation is your detection signal. This directly aligns with MITRE ATLAS mitigations around restricting tool invocation on untrusted data and requiring human-in-the-loop for high-risk agent actions.

Finally, prepare for the supply chain scenario. The Cisco State of AI Security 2026 report warns of a “SolarWinds of AI” – a mass compromise through a widely used AI library or foundation model. Your agent inventory and kill-switch capability determine how quickly you can respond. Audit your agent dependencies the way you audit npm packages: pin versions, review changelogs, and maintain a revocation path for each major integration.

Treat your agents as trust boundaries, not just productivity tools. Unscoped agents don’t automate work—they automate compromise.



If this resonated...
I help organizations assess and secure their AI agent deployments through agentic AI security assessments, covering agent-mediated lateral movement, trust bridge analysis, MCP security, and defense architecture. If your agents connect systems that weren’t designed to interact, get in touch to map your exposure.