Coding Agent Security

Assessment info at a glance: Download the one-pager (PDF).

Securing how your engineers use AI coding agents

Most engineering organizations are already shipping code written with the help of AI coding agents. GitHub Copilot, Cursor, Claude Code, Codex, OpenCode, Windsurf, Devin, Aider, and a long tail of MCP-based extensions are in production use today, often before any formal security or legal review has happened. The result is a familiar pattern: developers love it, productivity numbers move, and the security team has no defensible answer to the question “how is this governed?”

This assessment gives you that answer in two sessions. We work through a structured 78-question workshop covering nine domains: adoption, governance, identity and secrets, MCP and tool connectivity, code and data exposure, generation safety, supply chain, execution sandboxing, and monitoring. From the answers you get a prioritized risk overview and a two-wave remediation roadmap covering the next nine months.

The structure is simple: a first session to gather where you actually stand, and a second to walk through the results and the roadmap.

Before the assessment we have a short scoping call. It pins down which agents and teams are in scope, what topics deserve the most depth, and who should be in the room for each session.

If your concern is the AI agents you build for customers (chatbots, autonomous orchestrations, multi-agent systems), see Agentic AI Security instead. That engagement covers attacks against your AI product. This one covers attacks against your software supply chain through the AI tools your engineers use to build everything else.

Why existing controls miss this

Coding agents do not look like a new threat surface from the outside. The developer still opens an IDE, writes code, opens a PR. Your SAST runs, your dependency scanner runs, code review happens. On paper, nothing changed.

What changed is who is writing the code, what code is leaving your network, and what privileges the writing process holds.

The agent runs as the developer. A coding agent with shell, file write, and network tools inherits the developer’s effective privileges. On a workstation with daily local admin or stored cloud credentials, a prompt-injected agent is a credentialed insider. Sandboxing of agent execution and tightly scoped tool permissions matter more in an agent-assisted workflow than they used to.

The vendor sees more than you think. Inline completions send small windows. Chat with project context sends open files. Cursor Background Agents and similar cloud workers clone whole repositories into vendor infrastructure. Most organizations have not assessed which of these paths their teams actually use, which vendors hold the resulting code, and what the contractual data-handling commitments are.

MCP is the fastest-growing integration surface in your dev stack and the least mature on the security side. MCP servers receive credentials, hold privileged tool access, and ship as packages from npm, PyPI, GitHub, or vendor sites. They get pulled in with the same casual ceremony as any other dev dependency, then handed shell, database, and cloud SDK access. Tool descriptions become part of the agent’s context, which makes a malicious description a persistent prompt injection.

Shadow adoption is the norm, not the exception. Personal vendor accounts bypass SSO, DPAs, and audit. Engineers paste production stack traces into chat boxes. Slopsquatted dependencies suggested by overconfident agents land in lockfiles. None of this shows up in a standard application security review.

This assessment is built specifically for that gap. It is not a pentest, not a code review, and not a training. It is a structured workshop and a defensible report that lets you say, with evidence, where you are and what you intend to do about it.

Assessment coverage: nine domains

The workshop and report are organized into the same nine domains. Each gets its own severity verdict in the final report, with findings mapped to recognized standards.

Inventory and adoption

What is actually in use, by whom, and where it runs.

We talk through which agents are in active use across your engineering org — sanctioned, tolerated, or known shadow usage — and how that distributes across teams, geographies, and seniority. We go through deployment modes (IDE plugin, CLI, web, code-review bot, autonomous CI/CD agent) and, separately, runtime locations (local IDE process, local CLI, corporate-hosted runner, vendor cloud background agent, browser-based cloud workspace). The runtime location distinction is doing most of the work here: a local plugin keeps code on the dev machine; a vendor cloud background agent clones the repo into vendor infrastructure. That single difference reshapes the downstream risk picture.

We also cover adoption scale, shadow visibility (do you actually know who is using what), corporate-versus-personal account split, repository scope rules, and the environments where agents run (laptop, devcontainer, Codespaces/Gitpod/Coder, CI/CD, production debugging, customer environments).

Governance and policy

The written scaffolding around AI coding adoption.

We discuss whether you have a formal AI coding policy, whether new agents and MCP servers go through a real approval gate or are self-service, how risk is classified, and what vendor due diligence has happened on the agent vendors holding your source code. Training and awareness coverage is part of this domain too: secret handling in prompts, prompt injection in pasted content, IP and license risk, social engineering of agents through poisoned documents, slopsquatting awareness, and destructive-action awareness all belong here.

Many engagements find this domain at “ad hoc” while pipeline-side controls are already mature. That asymmetry tends to show up later, when someone asks for evidence.

Identity, authentication, and secrets

Who the agent is, what credentials it holds, and what stops your secrets from leaving the workstation.

We discuss the predominant identity model (corporate SSO versus personal vendor accounts versus shared service accounts), SSO and SCIM enforcement across agent vendors, and secret-redaction controls on agent traffic. We capture where dev-side secrets actually live (centralized secret manager, OS keychain, .env files, IDE config, shell history, source) because a coding agent with file or shell tools can read whatever the developer can read.

Token scope and lifetime get specific attention. Many environments default to full repo write with mixed or indefinite lifetimes. That is a one-line finding with a measurable remediation. We also cover chat-history leakage risk, MCP server credential provisioning, and session isolation across users, repos, and projects.

MCP and tool connectivity

The integration layer that is growing faster than its security controls.

We take stock of the MCP servers in active use (GitHub, Slack, Atlassian, internal forks, and so on), go through source vetting before adoption, hosting models (local on dev machine, devcontainer, central internal service, vendor SaaS), and network egress posture. Tool description audits and rug-pull detection are part of this domain: a benign MCP server today can become malicious tomorrow through an upstream change.

The capability surface is the other half. We go through which tool categories your agents can invoke (shell execution, file write inside or outside the workspace, network requests, browser automation, database access, cloud SDK and IaC tools, email or messaging send, code execution sandboxes) and the tool approval defaults. The difference between “every action requires confirmation” and “full auto mode” reshapes your blast radius more than any other single setting.

Third-party marketplaces (Cursor extensions, IDE plugin stores, MCP directories) get their own item because they are a discovery layer with limited safety guarantees.

For the threat lens this domain draws on, see my blog post on securing MCP with a defense-first architecture.

Code context and data exposure

What code leaves your network, where it goes, and what the vendor is allowed to do with it.

Different products send different amounts of code. Inline completions send small windows. Chat with project context can send entire open files. Background agents clone the repo into vendor infrastructure. We discuss which of these paths your teams actually use and whether sensitive repositories (PCI, PHI, customer crypto material, critical business logic) get special handling or are treated like everything else.

We walk through what is documented around vendor data handling (training opt-out, zero-retention modes, region pinning, DPA, BYOK), prompt telemetry scope, the availability of self-hosted options for sensitive workloads, agent long-term memory controls, and the IP classification driving repo-level rules. Customer-data exposure through agent prompts is a common pain point and gets its own line.

Code generation safety and quality gates

What comes back from the agent, and what catches the bad parts before they ship.

We discuss whether human review of AI-generated code is mandatory, optional, or skipped, and whether AI-authored code is attributed (tag, comment, or commit trailer). Then we walk through the security scanning stack actually wired into the pipeline (SAST, secret scanning, SCA, IaC, container, license compliance) because AI-generated code concentrates certain bug classes: weak crypto, broad CORS, eval/exec, SQL string concatenation, unsafe deserialization, TLS verification disabled.

Prompt-injection defenses for source inputs (issues, PR comments, code comments, retrieved documents that the agent consumes) are part of this domain. So are destructive-action controls (pre-execution diff or preview, confirmation prompts, allowlists, sandbox execution, snapshots), test coverage requirements for AI-generated changes, commit signing and attribution for agent-authored commits, and license-contamination checks.

Supply chain and dependency risk

The packages, binaries, and rule files that ride along with AI-assisted development.

Slopsquatting (AI-hallucinated or recently-registered typosquatted packages) is the highest-traffic operational risk in this domain. We go through the controls that matter here: internal registry or proxy with allowlist, lockfile enforcement, pre-install scanning, cooling periods on newly-published packages, and human review for AI-suggested dependencies.

Beyond packages, the agent itself is a binary that runs with developer privileges. We discuss how agent binaries reach machines (vendor-signed installer, corporate app store, internal mirror, public marketplace, direct web download), and how MCP servers themselves are pulled in (signing, version pinning, internal fork or mirror, update review). Third-party rule packs, system prompts, and personas (Cursor rules, CLAUDE.md fragments, Copilot custom instructions, agent persona libraries) get their own item because they are unsigned text fragments that shape agent behavior and can carry hidden instructions.

SBOM coverage of AI-introduced dependencies and the contractual stance on whether your code trains vendor models complete this domain.

Execution environment and sandboxing

Where agent-initiated commands actually run, and what stops them from going further than they should.

We discuss where shell commands and code that the agent runs actually execute (laptop host OS, local devcontainer, remote dev environment, vendor-hosted runner, CI/CD runner, production read-only, production with write), the strength of shell sandboxing (microVM, container with restricted privileges, container with default privileges, process-only, none), and dev-environment network egress controls.

Dev-machine hardening (EDR, full-disk encryption, MDM, host firewall, least-privilege local user, application allowlisting) gets explicit coverage because the agent inherits the host’s containment. Agent access to production, the use of coding agents inside the CI/CD pipeline (PR review bots, autonomous bug fixes, test generation, security-finding triage), and whether agents can bypass branch protections or CODEOWNERS round out this domain.

Monitoring, audit, and incident response

The visibility that everything else depends on.

We discuss what is logged centrally (prompts, completions, tool and MCP invocations with arguments, network traffic from MCP servers, commits and PRs authored by agents, authentication and session events), retention policy, whether logs are SIEM-integrated, and whether anomaly detection is tuned to agent behavior (unusual tool sequences, exfiltration patterns, destructive-action spikes, off-hours activity, cross-repo activity from one session).

We also cover whether an incident response playbook exists for the four most common agent-specific scenarios (poisoned MCP server, leaked agent token, prompt-injection-induced destructive action, vendor data exposure) and whether credential revocation has been timed end-to-end against a tabletop. Forensic capability (full session replay, transcript export, tool-call audit log, network capture) gets its own item because reconstruction is what turns an incident into a postmortem rather than a guess.

The category closes with a self-assessed maturity score from 0 to 10. It anchors the report and gives you a measurable baseline to improve against.

What you receive

At the end of the engagement you receive three artifacts.

A risk overview. A prioritized view of risks across the nine domains, grounded in your actual workshop answers rather than a generic checklist. Each item names what it is, why it matters in your environment, and the severity given your stated usage profile.

A two-wave remediation roadmap. Wave 1 for the next 0–3 months covers the foundational changes. Wave 2 for 3–9 months covers the strengthening work. Each roadmap item points back to the specific risk it addresses.

A residual risk outlook. Where you stand after each wave, what risk remains, and which categories are structural rather than fixable. Useful for board and leadership communication.

The report is written for two readers: the CISO or head of engineering reading the executive summary, and the platform or detection engineer reading the risk overview and roadmap line by line.

Engagement approach

Scoping call

A short call to pin down what is in scope before any prep happens.

Which agents are sanctioned, tolerated, or known to be in shadow use. Which teams are in scope and what topics deserve the most depth. Whether the workshop is remote or onsite, half day or full day, single workshop or split across two half-days. Who should be in the room: typically someone from security, someone from platform or developer experience who knows the actual tool inventory, and at least one developer who uses the agents every day.

Assessment workshop (Session 1)

We work through the 78 questions live, in the agreed setting.

The workshop is facilitated, not interrogative. Questions are surfaced in the order that builds context: adoption first, then governance, then identity and secrets, then the integration layer, and so on. Where the answer is “we don’t know,” I mark it explicitly rather than guessing. That has its own value: items marked as unknown become explicit scope-limitations in the report rather than silent gaps.

Report production

Remote, shortly after the workshop.

I take the workshop output and produce the risk overview, the two-wave roadmap, and the residual risk outlook. Each finding’s severity is weighed against how the agents are actually used in your environment. The report carries individual analysis and depth rather than a generic checklist.

Risk and roadmap session (Session 2)

We go through the report together.

We go through the risk overview category by category. Your team challenges severities and adds context where it helps. From there we work the two-wave roadmap from the top, agreeing owners and timeline assumptions for Wave 1 items so the document is actionable when you leave the session. The output is the same report with your annotations folded in, plus an action sheet for the first 90 days.

Follow-up check-in (optional)

A short follow-up once you have had time to work with the report.

We review progress against Wave 1 items, surface new risks that have emerged since the assessment (new MCP servers, new agents, new repositories in scope), and decide whether the assessment should be re-run on a regular cadence as your posture evolves.

Customer-side and consultant-side time is between one and two days each, split across the two sessions, plus the optional follow-up. The calendar duration is typically one to two weeks because the report preparation between the sessions runs asynchronously.

Is this assessment right for you?

This engagement is the right fit when AI coding agents are already in production use across your engineering organization, when leadership or customers are asking how AI-assisted development is governed, or when you have started to feel the gap between heavy MCP adoption and the lack of vetting, logging, and incident response around it.

It is also useful as a baseline before larger architectural decisions: choosing between vendor cloud background agents and self-hosted alternatives, designing the next iteration of your dev-environment hardening, deciding which repositories should never see a coding agent, or building out the policy that will land in your acceptable-use document next quarter.

If your starting point is earlier, say you are still exploring whether to adopt agents, running a pilot in one team, or your security team is new to this topic, a 3-hour custom focus session on the highest-leverage controls makes more sense than the full assessment. The intro can be turned into the assessment later when you are ready.

For background on the threat patterns this assessment is built around, see my research and blog series and the Agentic AI Security engagement, which is the right one if your concern is the AI systems your customers interact with rather than the AI tooling your engineers use.

Ready to assess how AI coding agents are used in your org? Let’s talk about scoping the two-session package for your team.