Application Security Insights

From LLM to agentic AI: prompt injection got worse

Thu, 29 Jan 2026 06:32:00 GMT

Agentic AI attack chains

TL;DR

Agentic AI systems transform prompt injection from an isolated model manipulation into coordinated multi-tool attack chains. According to the OWASP Top 10 for Agentic Applications 2026, what was once a single manipulated output can now hijack an agent’s planning, execute privileged tool calls, persist malicious instructions in memory, and propagate attacks across connected systems. Organizations deploying agentic AI must implement defense-in-depth controls including input validation on all data sources, goal-lock mechanisms, tool sandboxing with minimal privileges, and strategic human-in-the-loop approval for high-impact actions.

Prompt injection has topped the OWASP Top 10 for LLM Applications since the list’s inception. For simple chatbot integrations, this vulnerability typically meant a user could trick the model into ignoring its instructions or leaking its system prompt. Annoying, sometimes embarrassing, but often contained.

Then came the era of Agentic AI.

In June 2025, researchers disclosed EchoLeak (CVE-2025-32711), a zero-click prompt injection vulnerability in Microsoft 365 Copilot. Without any user interaction, an attacker’s carefully crafted email could coerce Copilot into accessing internal files and transmitting their contents to an attacker-controlled server. A single injection, delivered via a benign-looking email, cascaded through the agent’s retrieval capabilities to exfiltrate chat logs, OneDrive files, SharePoint content, and Teams messages.

This is the new reality. What was once a single manipulated output has become orchestrated multi-tool chains achieving unintended outcomes. The business impact is severe: unauthorized data exfiltration, regulatory exposure under GDPR and similar frameworks, reputational damage from compromised AI assistants acting on behalf of your organization, and potential liability when an agent takes actions your users never authorized. And as organizations race to deploy agentic systems (Gartner predicts 33% of enterprise applications will utilize agentic AI by 2028), the attack surface is expanding faster than most security teams realize.

In this post, I will walk through why agentic systems fundamentally amplify prompt injection risks, how to evolve your security controls for this new paradigm, and the defense-in-depth architecture patterns that can help contain the blast radius when, not if, an injection succeeds.

The amplification effect

To understand why prompt injection becomes dramatically worse in agentic systems, we need to examine what changes when you move from a stateless LLM call to an autonomous agent.

In a traditional LLM integration, prompt injection (OWASP LLM01) typically affects a single model interaction. The attacker manipulates the prompt, the model produces an unintended output, and that output is returned to the user or passed to one downstream system. The blast radius is limited by the scope of that single inference call.

Agentic systems change this equation entirely. The OWASP Top 10 for Agentic Applications 2026 introduces ASI01 (Agent Goal Hijack), which captures the broader agentic impact where a manipulated input doesn’t just alter one output. It redirects goals, planning, and multi-step behavior across the entire agent workflow.

Consider the differences in attack progression. In a simple LLM chatbot, an attacker injects a prompt that makes the model reveal its system prompt or produce harmful content. The damage is contained to that conversation. In an agentic system, that same injection can now hijack the agent’s planning process, causing it to select different tools than intended. The agent might execute those tools with the user’s inherited privileges. Results from one compromised tool call flow into the next iteration of reasoning. The agent might persist malicious instructions in memory for future sessions. And in multi-agent architectures, the compromised agent can propagate tainted instructions to peer agents.

The key insight from the OWASP Agentic security guidance is this: agents amplify existing LLM vulnerabilities. What was a single manipulated output becomes an orchestrated multi-tool kill chain achieving unintended outcomes.

The “Promptware kill chain”

Researchers have begun modeling these multi-step attacks using a framework they call the Promptware Kill Chain, treating prompt injection payloads as a new class of malware that executes in natural language space rather than machine code.

The kill chain proceeds through five stages:

Initial access occurs when the payload enters the LLM’s context via direct or indirect prompt injection, through user input, a poisoned document, a malicious email, a website with hidden malicious commands, or compromised RAG data.
Privilege escalation happens when jailbreaking techniques bypass safety training, allowing the payload to overcome the model’s built-in guardrails.
Persistence is achieved when the payload corrupts long-term memory, ensuring it survives across sessions.
Lateral movement spreads the attack across users, devices, connected services, or other agents in multi-agent architectures.
The attacker achieves their actions on objective, whether that is data exfiltration, unauthorized transactions, or system compromise.

This model helps explain why traditional prompt injection defenses, focused solely on input filtering, fail in agentic contexts. By the time you detect the injection, the agent may have already executed multiple tool calls, persisted malicious data, and propagated to other systems.

Indirect injection

The primary agentic attack vector

While direct prompt injection (where a user explicitly crafts malicious input) remains a concern, indirect prompt injection has emerged as the dominant threat vector for agentic systems.

Indirect injection occurs when malicious instructions are embedded in external data sources that the agent retrieves and processes: documents summarized by a RAG pipeline, emails processed by an assistant, web pages fetched during research, calendar invitations parsed for scheduling, code repositories analyzed during development, and API responses from third-party services.

The agent cannot reliably distinguish between legitimate content and attacker-controlled instructions. As OpenAI acknowledged in December 2025, prompt injection “is unlikely to ever be fully solved” because it represents a fundamental architectural challenge: blending trusted and untrusted inputs in the same context window.

This is why the EchoLeak attack was so effective. The injection payload was embedded in a benign-looking email, a data source Copilot was designed to process. The payload didn’t need to trick a human; it only needed to be parsed by the agent’s retrieval system.

The MCP attack surface

As agentic AI adoption accelerates, the Model Context Protocol (MCP) has emerged as a standard for connecting LLMs to external tools. While MCP provides a structured way to define tool capabilities, it also introduces a significant attack surface.

Tool poisoning occurs when attackers embed malicious instructions within the descriptions of MCP tools. The LLM uses this metadata to determine which tools to invoke, meaning compromised descriptions can manipulate the model into executing unintended tool calls, without the user seeing anything suspicious.

Rug pull attacks exploit the fact that MCP tools can mutate their definitions after installation. You approve a safe-looking tool on Day 1, and by Day 7 it has quietly modified its behavior to exfiltrate your API keys.

Cross-tool contamination happens in environments with multiple MCP servers, where a compromised server can influence the behavior of legitimate tools through shared context or memory.

Defending against MCP attacks

To mitigate these risks, implement several safeguards:

Pin tool definitions by computing and storing a hash of each MCP tool’s schema and description at approval time, then verify this hash on each invocation. Any mutation triggers a re-approval workflow.
Implement tool isolation by running each MCP server in a separate process or container with its own credential scope, preventing cross-tool contamination.
Monitor for behavioral drift by logging tool invocation patterns and alerting on anomalies such as a “read-only” tool suddenly attempting write operations or network calls to unexpected domains.
Establish a vendor assessment process that evaluates MCP tool providers for security practices before installation, treating them with the same rigor as any third-party dependency.

I’ll cover MCP security in a dedicated post soon, so stay tuned.

Evolving your security controls

The migration checklist

If you are moving from simple LLM integrations to agentic architectures, or building agentic systems from scratch, here are the security controls that must evolve.

Input validation must expand

For traditional LLM integrations, input validation typically focused on the user prompt: checking length limits, filtering known injection patterns, and perhaps running a classifier to detect malicious intent.

For agentic systems, you must validate every data source the agent touches. This includes user prompts (direct injection defense), RAG corpus contents (indirect injection defense), tool responses and API payloads, email and document contents before summarization, MCP tool descriptions and metadata, and inter-agent messages in multi-agent architectures.

The validation approach should combine syntactic checks (length limits, format validation), semantic analysis (“Does this content contain instruction-like patterns?”), and provenance tracking (“Where did this data originate, and do we trust that source?”).

For practical implementation, consider deploying prompt-injection classifiers such as LLM Guard, complemented by output-validation frameworks like Guardrails AI, as validation and control layers around the LLM. These open-source tools help detect common injection patterns and enforce constraints at different stages of the pipeline, ideally before untrusted content can influence agent behavior.

In a RAG pipeline, tag each retrieved chunk with its source and trust level, then include this provenance metadata in the context so downstream validation can apply appropriate scrutiny.

Output handling requires context-aware encoding

The principle from OWASP LLM05 (Improper Output Handling) becomes even more critical in agentic systems: treat all model output as untrusted user input.

Before any LLM-generated content flows to a downstream system, apply context-appropriate encoding. For HTML contexts, use HTML entity encoding. For SQL contexts, use parameterized queries. Never let the LLM generate raw SQL that is directly executed. For shell contexts, avoid this entirely if possible; if you must, use sandboxing and strict allowlists rather than blocklists. For JavaScript contexts, apply JSON encoding and strict Content Security Policies. For inter-agent messages, validate structure and content before processing.

The key insight is that LLM output should never be passed directly to any interpreter, whether that is a database engine, a shell, a browser, or another agent, without proper validation, encoding, and guards.

Privilege scope must be per-tool, per-task

In simple integrations, you might give the LLM access to a single API with a long-lived token. Agentic systems demand a more granular approach.

Implement per-tool privilege profiles that define exactly what each tool can access, what actions it can perform, what rate limits apply, and what egress destinations are allowed. An email summarization tool should have read-only access to email, not the ability to send or delete messages.

Use short-lived, task-scoped credentials rather than persistent tokens. If an agent needs database access for a specific query, issue a token that expires after that task completes and is scoped to read-only access on the relevant tables.

Consider the blast radius of each privilege grant. If this tool were compromised via prompt injection, what is the worst-case outcome? Design your privilege model to minimize that worst case.

Human-in-the-loop must be strategic

The OWASP Agentic guidance emphasizes human-in-the-loop (HITL) controls for high-impact actions. But HITL can become a bottleneck, or worse, a rubber-stamp exercise where reviewers approve everything without scrutiny.

Design HITL to be strategic rather than exhaustive. Categorize actions by impact: read-only operations might proceed automatically, while write operations require review, and destructive or irreversible operations require explicit confirmation with a preview of what will happen.

Implement pre-execution diffs that show the reviewer exactly what the agent intends to do before it does it. For a file modification, show the diff; for an email send, show the full message and recipients; for a database write, show the exact records that will change.

Protect against HITL fatigue by batching similar low-risk requests and making sure high-risk requests are rare enough that reviewers give them genuine attention. If reviewers are approving hundreds of requests per day, the control has failed.

Memory isolation prevents cross-session contamination

Agentic systems often maintain memory across sessions to provide context and personalization. This memory becomes a persistence vector for prompt injection attacks. An attacker who can write to the agent’s memory can influence all future interactions.

Implement memory segmentation that isolates user sessions and domain contexts from each other. One user’s conversation should never leak into another user’s context. Where shared memory is necessary (for example, organizational knowledge), implement strict validation before any content is committed to shared state.

Scan all memory writes for instruction-like content. If a user’s conversation includes text that looks like a system prompt or tool invocation, that should trigger additional scrutiny before persistence.

Maintain snapshots and rollback capabilities so you can recover from memory poisoning attacks.

Defense-in-depth for agentic systems

Single-layer defenses fail against multi-step attacks. The solution is defense-in-depth: multiple independent security controls at each layer of the agentic architecture, so that a failure in one control does not lead to complete compromise.

Layer 1: Input Perimeter

At the input perimeter, implement prompt injection classifiers that detect known attack patterns. Route all natural-language inputs, whether from users, documents, or external systems, through these classifiers. Apply content disarm and reconstruction (CDR) to documents before the agent processes them, stripping potentially malicious elements while preserving legitimate content.

Maintain trust levels for different input sources. Direct user input might be “medium trust,” while content from external websites is “low trust,” and verified internal systems are “high trust.” These trust levels should influence how aggressively you validate and constrain the content.

Layer 2: Goal and Planning Validation

Before the agent executes a plan, validate that the plan aligns with the intended goal. Define explicit, auditable goals in the system configuration, not just in the system prompt, which can be manipulated.

Implement goal-lock mechanisms that detect unexpected shifts in the agent’s objectives. If a user asked for email summarization and the agent is suddenly planning to access the file system, that deviation should trigger an alert or require confirmation.

Use a separate validation model (distinct from the primary agent) to assess whether the planned actions are consistent with the stated goal. This “guardian” pattern works by feeding the agent’s proposed plan to a smaller, faster model with a strict prompt: “Given the user’s original request X, does this plan contain any actions that are not directly necessary to fulfill X? Flag any file system access, network calls, or data exports that appear unrelated to the stated goal.” This provides defense against attacks that successfully compromise the primary model’s reasoning, at the cost of additional latency and compute. That’s a worthwhile tradeoff for high-stakes operations.

Layer 3: Tool Execution Sandboxing

Run all tool executions in isolated sandboxes with restricted network access, file system access, and privilege levels. The agent should never run as root or with administrative privileges.

Implement outbound network allowlists so that even a compromised tool cannot exfiltrate data to arbitrary destinations or establish C2 channels. If a tool needs to make HTTP requests, specify exactly which domains it can contact.

For code execution capabilities, increasingly common in agentic systems, use taint tracking on generated code and require safe interpreters that restrict dangerous operations. Ban eval() and equivalent functions with untrusted content.

Layer 4: Output Validation and Encoding

Before any output reaches a downstream system or user, validate that it conforms to expected formats and does not contain suspicious patterns. Apply context-appropriate encoding as described earlier.

Implement anomaly detection on outputs to identify responses that deviate significantly from expected patterns. This can catch attacks that successfully evade input-side defenses.

Layer 5: Monitoring and Response

Log all agent actions, tool invocations, memory operations, and inter-agent communications. These logs should be tamper-evident and retained long enough to support incident investigation.

Implement real-time anomaly detection that can identify attack patterns across the kill chain: unusual sequences of tool calls, unexpected data access patterns, signs of privilege escalation or lateral movement.

Maintain kill switches that can immediately revoke an agent’s credentials and halt its operations if a compromise is detected. In multi-agent systems, implement circuit breakers that can isolate a compromised agent from its peers.

Checklist for agentic security

When reviewing code that implements agentic AI features, use this checklist:

For input handling, ask: Are all user inputs validated before reaching the LLM? Are indirect inputs (files, URLs, emails, RAG data) sanitized? Is there a trust classification for different input sources?
For output handling, ask: Is LLM output encoded appropriately for the target context? Is there validation before downstream use? Are parameterized queries used for any database operations?
For privilege scope, ask: Does each tool have minimum necessary permissions? Are credentials short-lived and task-scoped? Is there a documented blast radius for each privilege grant?
For human approval, ask: Are high-impact actions gated by human confirmation? Is there a pre-execution preview? Is the approval flow resistant to fatigue attacks?
For memory handling, ask: Is memory properly segmented by user and session? Are memory writes scanned for injection patterns? Is there rollback capability?
For monitoring, ask: Are all agent actions logged with sufficient detail? Is there anomaly detection? Are kill switches and circuit breakers implemented?

Quick wins: where to start

If you cannot implement the full defense-in-depth architecture immediately, prioritize these five controls that provide the highest security ROI for the least effort:

Implement outbound network allowlists. Most agentic systems do not need to contact arbitrary internet destinations. Restrict egress to only the domains your tools legitimately require. This single control can prevent most data exfiltration scenarios.
Require human approval for all write and delete operations. Start with a simple rule: any action that modifies external state requires a human click. You can refine the granularity later.
Deploy a prompt injection classifier on all external inputs. These checks can be integrated easily and will catch the most common injection patterns in documents and emails.
Audit your current MCP tool permissions. Create a simple spreadsheet listing each tool, what it can access, and what happens if it is compromised. This exercise alone often reveals unnecessary privileges that can be immediately revoked.
Enable comprehensive logging. You cannot detect what you do not log. Make sure all tool invocations, their inputs, and their outputs are recorded with timestamps and user context.

In the long term: Build the complete defense-in-depth architecture, including goal validation, memory isolation, and real-time anomaly detection. Establish incident response procedures specific to agent compromise.

The shift to agentic AI is inevitable and offers tremendous value. But it also requires us to evolve our security thinking from protecting individual model interactions to securing autonomous systems that plan, decide, and act across multiple steps and services. Organizations that build security in from the start will be the ones that succeed. Those that scramble to retrofit controls after the first headline-grabbing breach will not.

If this resonated…

If you’re working on GenAI or agentic systems and want to better understand the security risks, I help teams with threat modeling, architecture reviews, and practical hardening. Details are here: Agile Threat Modeling and Security Architecture.

Published at: https://christian-schneider.net/blog/prompt-injection-agentic-amplification/

Dependency cooldowns: a simple supply chain fix

Tue, 27 Jan 2026 06:45:00 GMT

The golden hour problem

TL;DR

Most supply chain attacks—including short-lived campaigns like the Nx incident and the recent wormable Shai-Hulud incarnations—exploit a narrow window between malicious package publication and detection. In DevSecOps consulting engagements, simple cooldown policies have proven effective at eliminating exposure: a zero-cost 7-day delay breaks the attacker’s time advantage and neutralizes short-lived blast radii easily.

Most supply chain attacks share a common pattern: malicious code gets published to a package registry, and within hours it’s already been downloaded thousands of times before anyone notices. By the time security researchers flag the package or the registry removes it, the damage is done.

This is the golden hour of supply chain attacks: the window where attackers race to compromise systems before their malicious package gets detected and removed. They exploit the immediate-adoption culture of modern development. When a popular package releases a new version, CI/CD pipelines worldwide pull it automatically within minutes, giving attackers just enough time to compromise thousands of build systems.

Consider the Nx supply chain attack from August 2025: Malicious packages were published to npm at 22:32 UTC on August 26. NPM was alerted at 02:58 UTC and removed all affected versions within an hour. Total exposure window: roughly 4–5 hours. Yet in that brief period, thousands of developers had their secrets exfiltrated, including SSH keys, GitHub tokens, and API credentials. The malware even attempted to leverage local AI CLI tools for reconnaissance, a disturbing first in supply chain attacks.

There’s a remarkably simple countermeasure that breaks this attack model entirely: dependency cooldowns.

What are dependency cooldowns?

A dependency cooldown is exactly what it sounds like: a waiting period before your tooling accepts new package versions. Instead of immediately adopting version 1.2.4 when it’s published, you wait 5 to 10 days before considering it for your project.

This approach works because of simple economics. Attackers publishing malicious packages face a race against time. Registry security teams, automated malware scanners, and the security community are constantly scanning for suspicious packages. Most malicious packages get detected and removed within days, often hours. A 7-day cooldown means you never touch packages during their most dangerous period.

The math is compelling: if malicious packages are typically removed within 24–72 hours, even a 7-day cooldown gives you a comfortable safety margin. Organizations with cooldown policies during the Nx incident were simply never exposed since the malicious versions had been removed days before their pipelines would have considered them.

It’s important to be precise about what cooldowns solve: Cooldowns address version freshness risk, the risk of blindly adopting new, unvetted releases. They do not mitigate known vulnerability risk. Once a vulnerability is identified and a fix is published, the risk calculus flips: delay becomes the dangerous option.

From a business perspective, the Nx and Shai-Hulud incidents exposed thousands of build systems to credential theft. Even without assigning specific costs per compromised environment, incidents of this scale translate into massive organizational impact across response effort, recovery time, and long-term risk exposure. A cooldown policy costs nothing and would have prevented this entire class of attack.

Tool support

Several dependency management tools now support cooldowns natively:

Dependabot introduced the cooldown option in mid-2025, allowing you to specify minimum age requirements before version updates are proposed. You can configure different delays based on semantic version changes, with longer waits for major versions and shorter ones for patches. Dependabot’s cooldown applies only to routine version updates, not security updates, so CVE patches should still flow through promptly. See the Dependabot cooldown documentation for configuration details.

Teams should still periodically validate this behavior in their own repositories. Cooldown logic is applied at update runtime, and overly broad configuration or exclusions can silently suppress updates if not tested.

Renovate offers similar functionality through its minimumReleaseAge setting (previously called stabilityDays). Renovate creates branches for pending updates but marks them with a “pending” status check until the cooldown expires. If you have automerge enabled, updates won’t merge until they’ve aged sufficiently. A notable behavior change in Renovate 42: packages without a release timestamp are now treated as if they haven’t passed the cooldown period, which is safer than the previous behavior. The Renovate minimum release age documentation covers the configuration options.

In Renovate setups with broad package rules, security updates can still appear “pending” unless explicitly excluded from cooldown logic. For this reason, security-specific rules are strongly recommended.

pnpm added the minimum-release-age setting in version 10.16, which filters packages by publish date and automatically remaps dist-tags to versions that meet the age requirement. This preserves semantic version compatibility while enforcing your security delay.

For ecosystems without native cooldown support, lock files provide a manual alternative. Tools like Poetry, uv, or Go modules with go.sum pin exact versions, including transitive dependencies, so newly published releases are never pulled in implicitly. Even when updates are scheduled weekly or bi-weekly, the refresh is a conscious, explicit step: you update the lock file, review the diff, and only then accept newer versions. This creates a de-facto cooldown window, ensuring that dependencies must “age” until the next planned refresh instead of being adopted immediately after release. The key is treating dependency updates as a deliberate, reviewable activity rather than something that happens automatically in the background.

A common misconfiguration trap

One recurring failure mode I see in audits is teams enabling cooldowns, assuming they are “safe,” and then relaxing their active monitoring of security advisories. Cooldowns reduce exposure to unknown malicious releases. They do nothing for known vulnerabilities already present in your dependency tree.

Without active vulnerability alerting and triage, cooldowns can actually increase dwell time for exploitable CVEs. Cooldowns are a preventive control, not a detective one.

Transitive dependencies: the hidden risk

Here’s a point that’s easy to miss: cooldowns must effectively apply to your entire dependency graph, not just direct dependencies. A malicious package introduced as a transitive dependency can still reach production even if your direct imports are carefully curated.

Modern dependency update tools can account for this, when they are used with lockfiles and conservative update policies. Tools like Dependabot and Renovate operate on the resolved dependency graph, meaning updates (including transitives) are proposed via lockfile changes rather than silently flowing in. As long as lockfiles are committed and updates are gated, transitive dependencies won’t change unless you explicitly accept an update.

A dangerous anti-pattern is allowing floating transitive dependencies in production while only cooling down direct dependencies. This recreates the golden-hour problem one level down the graph, exactly where attackers increasingly aim.

If you rely on manual version pinning or ecosystems without strong lockfile enforcement, this safety net disappears. In those cases, you must regularly regenerate and review the full dependency graph (for example via mvn dependency:tree, pip-compile, or equivalent tooling) to detect unexpected transitive additions or version shifts.

Handling urgent security patches

Cooldowns work best when paired with an explicit security SLA, for example: critical dependency CVEs must be triaged within 24 hours and patched within 72.

Cooldowns should apply to routine updates, not emergency security patches. Dependabot explicitly excludes security updates from cooldown rules. Renovate allows you to force immediate updates for specific packages through its Dependency Dashboard or security-specific rules.

For emergency overrides, establish a clear process. The security team should approve bypasses with documented justification. Record all cooldown bypasses in your security log for audit purposes.

Such fast-tracked packages deserve additional scrutiny. Where feasible, perform manual or automated review of the delta: look for obfuscation, dynamic code execution, unexpected network access, or new persistence mechanisms. Once the normal cooldown period expires, re-verify that the package remains trustworthy.

What cooldowns don’t protect against

Let's be clear about the strengths and limitations.

Dependency cooldowns are effective against:

Compromised maintainer accounts with short-lived malicious releases
Automated malware injection and wormable release pipelines

They are not effective against:

Typosquatting attacks using similar package names
Long-term maintainer compromise
Zero-day vulnerabilities where fixes must be applied immediately

In other words: cooldowns buy you time, not certainty. Use that time to let scanners run, advisories surface, and the community react. Then decide from a position of information, not urgency.

Cooldowns are one layer in a defense-in-depth strategy. Combine them with SBOM generation, vulnerability scanning using tools like Trivy or Grype, code signing verification, and regular dependency audits. I’ll cover code signing and attestation of dependencies in a dedicated post soon, so stay tuned.

Getting started today

Rule of thumb: Delay unknown updates by default, fast-track known security fixes deliberately.

If you take nothing else from this post, implement a 7-day cooldown on your automated CI/CD dependency updates this week. The configuration is minimal, the protection is immediate, and the risk reduction is real.

For teams worried about being “slowed down”: you’re likely already waiting days or weeks between dependency updates in practice. Cooldowns simply formalize this delay and make sure it applies consistently, including on that one rushed Friday afternoon deploy.

Attackers are counting on you to adopt their malicious packages immediately. Make them wait.

Building secure pipelines?

Adding security to CI/CD is easy to start and hard to get right. I help teams do it properly. More info: DevSecOps Pipeline Consulting.

Published at: https://christian-schneider.net/blog/dependency-cooldowns-supply-chain-defense/

Ship fast, but guard faster: securing DevOps itself

Sat, 24 Jan 2026 16:00:00 GMT

Attack surfaces inside CI/CD

TL;DR

Your CI/CD pipelines have become high-leverage attack targets—not your application code. This post distills the Break-the-Chain controls from my Real-World DevOps Attacks keynote: replace long-lived credentials with OIDC federation, pin all actions to SHA hashes, sign artifacts with Sigstore, enforce minimal GITHUB_TOKEN permissions, and isolate your build environments.

This blog post is not about scanning your application code for vulnerabilities. That topic has been written about endlessly. Instead, it’s about securing your DevOps itself: your workflows, automation, secrets, registries, and supply chains. The infrastructure attackers increasingly target.

According to GitGuardian’s 2025 State of Secrets Sprawl report, secret-scanning tools detected 23.8 million leaked credentials in public repositories last year. And that’s only what was found. In another incident, a single poisoned workflow compromised 23,000+ repositories in March 2025. Then there’s the Shai-Hulud worm, which spread through the npm ecosystem twice in late 2025. Speed multiplies everything: delivery and disaster.

This post distills the defensive Break-the-Chain controls from my Real-World DevOps Attacks keynote into an actionable field manual. We’ll skip the blow-by-blow incident autopsies (watch the keynote for those) and focus on what actually blocks, detects, and contains the next breach.

Four attack vectors

Modern CI/CD pipelines present attackers with four primary entry points. Each requires a distinct defensive mindset.

Secrets & Credentials Long-lived tokens sitting in environment variables, config files, or workflow logs. Recent breaches have demonstrated how a single compromised secret store can cascade into customer breaches across an entire ecosystem.

Workflow & Action Poisoning Exploits the trust we place in automation. Malicious pull requests or hijacked third-party actions execute attacker code inside your runner with your permissions. The tj-actions/changed-files incident showed how one compromised action can harvest secrets from thousands of downstream projects within hours.

Artifact & Registry Tampering Targets the outputs of your build process. Unsigned images, poisoned packages, or hijacked release binaries become trojans delivered through your own deployment pipelines. Some attacks on build tooling went undetected for months while quietly exfiltrating credentials from CI environments worldwide.

Dependency & Supply Chain Compromise Typosquatting, maintainer takeovers, and malicious lifecycle scripts exploit the implicit trust in open-source ecosystems. The XZ Utils backdoor proved that even heavily-scrutinized projects can be subverted through patient social engineering.

Let’s examine the Break-the-Chain controls for each.

Secrets hygiene

Secrets are what attackers want most. They require layered protection.

SHORT means eliminating long-lived credentials entirely. Replace static Personal Access Tokens with OIDC federation wherever possible. GitHub Actions can authenticate directly with AWS, Azure, and GCP using short-lived, automatically-rotated tokens that never touch disk. No static keys to leak. No secrets to rotate manually. The authentication happens through cryptographic identity verification rather than shared secrets.

SHRINK addresses the blast radius when credentials do leak. Every secret should have the smallest permission set possible. For the built-in GITHUB_TOKEN, explicitly declare read-only defaults at the repository level. Never grant write permissions unless the job actually requires them, and document why.

SEPARATE isolates secret access by environment. Use GitHub Environments to gate production secrets behind approval workflows and branch protections. Staging secrets should never unlock production resources. Compromising a development workflow shouldn’t automatically grant access to production infrastructure.

SHIELD means detecting and responding to leaks before attackers can exploit them. Enable secret scanning with push protection. When a secret is detected, the commit is blocked before it reaches the repository. Combine this with automated rotation and comprehensive audit logging so you can trace exactly who accessed what and when.

Workflow hardening

Workflows are code. Treat them with the same rigour you apply to application security.

LOCK addresses the mutability problem. Tags are mutable. A compromised maintainer can retag a malicious release to an existing version number, and every workflow referencing that tag will silently start running attacker code. Always pin actions to the full commit SHA, which is immutable. Use Dependabot or Renovate to automate SHA updates while preserving this immutability guarantee.

LIMIT restricts the power of the GITHUB_TOKEN. Set the repository default to read-only and require explicit permission elevation per job. This forces developers to think about what permissions each workflow actually needs rather than running everything with maximum privileges.

SCAN means analyzing workflow files for dangerous patterns before they reach production. Flag patterns like pull_request_target combined with actions/checkout of the PR head. This combination allows untrusted code from external contributors to execute with write permissions to your repository.

RESTRICT controls which actions can run at all. Use GitHub’s Actions Policies to allow only actions from verified creators or your organization. Block marketplace actions by default and explicitly allowlist vetted dependencies. This prevents developers from casually adding untrusted automation.

SANDBOX addresses the self-hosted runner problem. If you use self-hosted runners, treat them as ephemeral and untrusted. Spin up fresh VMs per job, never persist state between runs, and network-isolate them from production infrastructure. A compromised runner should never become a pivot point into your internal network.

Artifact integrity

If you can’t verify it, you can’t trust it.

SIGN creates a verifiable chain of custody. Use Sigstore and Cosign to sign container images and binaries. Keyless signing with OIDC identity ties signatures to your CI/CD workflow identity rather than long-lived signing keys that could be stolen. The signature proves who built the artifact.

ATTEST proves where and how an artifact was built. SLSA (Supply-chain Levels for Software Artifacts) attestations capture the build environment, inputs, and process. GitHub Actions can generate SLSA Level 3 attestations automatically, providing tamper-evident provenance that auditors and downstream consumers can verify.

VERIFY closes the loop by enforcing signature checks before deployment. Configure your container runtime to reject unsigned images. In Kubernetes, use admission controllers like Sigstore Policy Controller or Kyverno to block any image that lacks valid signatures or attestations from reaching your clusters.

REPRODUCE provides the strongest defense: reproducible builds allow independent verification that source code produces a specific binary. If anyone can rebuild your artifact from source and get bit-for-bit identical output, you’ve eliminated single points of compromise in your build infrastructure.

Dependency security

Your dependencies are your attack surface. The npm ecosystem learned this painfully in November 2025 when the Shai-Hulud 2.0 worm (dubbed “The Second Coming”) compromised 796 npm packages totalling over 20 million weekly downloads. The self-replicating malware hijacked maintainer accounts from widely-used projects, then used npm’s preinstall lifecycle hooks to execute before installation even completed. It harvested credentials from local filesystems and cloud environments, exfiltrating them to attacker-controlled repositories. The attack included a “dead man’s switch” that threatened to wipe user home directories if its channels were severed.

A disciplined approach helps contain such attacks.

MIRROR means pulling dependencies through a private registry like Artifactory, Nexus, or GitHub Packages. This provides caching, auditability, and a kill-switch. When an upstream package is compromised, you can block it at your mirror before it reaches any build environment.

LOCK enforces deterministic builds. Commit package-lock.json, go.sum, or requirements.txt with pinned hashes. Reject builds where lockfiles are missing or modified without explicit review. This prevents silent dependency updates that could introduce compromised versions.

SCAN detects malicious patterns before they execute. npm packages can run arbitrary code during preinstall, postinstall, and similar hooks. That’s exactly how Shai-Hulud spread. Use tools that analyze lifecycle scripts before installation, or disable them entirely in CI with npm ci --ignore-scripts. For rapid incident response scanning, I’ve open-sourced quick-npm-module-scanner, a lightweight tool that lets you scan for adjustable IoCs when new threats emerge.

ISOLATE segments your build environments. Run dependency installation in isolated, network-restricted containers. If a malicious package attempts to exfiltrate data, network policies should block egress to anything except your approved registries.

In response to Shai-Hulud, npm has accelerated its security roadmap. Trusted publishing uses OIDC tokens instead of stored credentials, which means there’s nothing to steal from developer machines. npm provenance checks that published packages match a trusted workflow origin; if stolen tokens are used outside that path, publish attempts are rejected. In December 2025, npm permanently revoked all classic tokens and replaced them with short-lived session tokens. These ecosystem-level changes don’t eliminate risk, but they significantly raise the bar for attackers.

Five moves this month

Theory is worthless without action. Here are five concrete improvements you can schedule immediately:

Audit your GITHUB_TOKEN permissions. Review all workflows. Set repository defaults to read-only. Explicitly declare minimal permissions per job. This single change limits the blast radius of any workflow compromise.

Replace one PAT with OIDC. Pick your most sensitive deployment workflow. Migrate from static credentials to OIDC federation with your cloud provider. Once you’ve done it once, the pattern becomes repeatable.

Pin all actions to SHAs. Replace tag references with commit SHAs across all your workflows. Configure Dependabot to manage updates while maintaining immutability.

Enable secret scanning with push protection. This single setting prevents the most common class of credential leaks before they happen. The friction is minimal; the protection is substantial.

Run a 45-minute threat huddle. Gather your team. Sketch your CI/CD data flows on a whiteboard. Ask: “Where could an attacker inject code? What would they steal? How would we know?” Document the risks and prioritize mitigations.

Think like an attacker

The controls above are reactive. They address known attack patterns. To stay ahead, you need to think like an attacker.

Threat modeling your CI/CD infrastructure reveals blind spots that checklists miss. Map your pipelines end-to-end: source control, build triggers, secret stores, artifact registries, deployment targets. For each component, ask what trust boundaries exist, what happens if this component is compromised, and how you would detect it.

If you want help building a threat model tailored to your infrastructure, or need hands-on guidance implementing these controls, check out my DevSecOps Pipeline consulting and Agile Threat Modeling services.

Closing thoughts

DevOps velocity is a competitive advantage, but only if your pipelines don’t become the attack vector. The incidents we’ve seen aren’t sophisticated nation-state operations. They’re opportunistic exploitation of basic hygiene failures.

Here’s the thing: the Break-the-Chain controls aren’t complicated. Short-lived credentials. Pinned dependencies. Signed artifacts. Minimal permissions. None of these require exotic tooling or massive budgets. They require discipline.

Ship fast, but guard faster.

Published at: https://christian-schneider.net/blog/ship-fast-but-guard-faster/

12 steps to secure software

Sat, 13 Apr 2024 13:00:00 GMT

Secure software development

TL;DR

Drawing from my experience conducting OWASP SAMM assessments and DevSecOps implementations across dozens of organizations, I’ve identified 12 technical leverage points that provide the highest security ROI. The sequence matters: start with patch management and infrastructure hardening (the most exploited vulnerabilities), then progress through static analysis, secure coding, and threat modeling, before tackling the higher-effort controls like encryption and SIEM. Each step includes specific tool recommendations, DevSecOps integration guidance, and cloud-specific implementations for AWS and Azure.

In the rapidly evolving landscape of digital technology, the Secure Software Development Lifecycle (SSDLC) emerges as a crucial bastion against the ever-increasing threats in cyberspace. Yet, many companies, particularly those at the nascent stages of their cybersecurity journey, grapple with where to begin. This article aims to demystify the path forward, spotlighting the low-hanging technical fruits in secure software development that can substantially bolster your defenses.

12 technical leverage points

The twelve steps I’ve outlined are intentionally focused on technical measures, chosen for their ability to scale swiftly across a corporation and make an immediate impact on enhancing cybersecurity. However, it’s crucial to recognize that the journey to robust IT security doesn’t end here: Process and organizational measures play an equally vital role in creating a comprehensive defense strategy. These aspects, which encompass the broader cultural and procedural framework within which technology operates, will be the focus of my follow-up article, ensuring a holistic approach to securing your digital landscape.

Within this article, each of the twelve security steps is not only dissected for its inherent value but is also aligned with DevSecOps principles, highlighting its relevance in integrating security into continuous delivery and deployment workflows. Additionally, for organizations leveraging the cloud, guidance is provided on how each step can be effectively applied in a cloud-based setting, ensuring a comprehensive security posture that resonates with both traditional and modern IT environments.

0.Awareness

Before we dive into the 12 essential steps to secure your software development, let’s talk about my sneaky step 0: Awareness. It's like the invisible ink of my security blueprint: not officially one of the 12, but underpinning everything we do. Just remember, while implementing these steps, awareness should already be twinkling in the back of your mind, laying the foundation for a fortress of security.

Why

Awareness is the foundational step in any cybersecurity strategy, crucial for understanding the importance of security measures and fostering a culture of vigilance. By recognizing the potential risks and the impact of security breaches, organizations can prioritize and commit to comprehensive security practices.

NIST mandates the implementation of security awareness and training programs as part of its comprehensive cybersecurity guidelines, ensuring that all personnel are educated about their roles in safeguarding information systems. For more details, refer to NIST Special Publication 800-50.

How

Cultivate awareness through regular training sessions, engaging seminars, and updated security briefings that keep all employees informed about the latest security threats and best practices. Additionally, utilize internal newsletters, security awareness posters, and e-learning modules to ensure that security remains a visible and ongoing priority throughout the organization.

Incorporate Live Hacking Events as powerful eye-openers to demonstrate real-world vulnerabilities and the ease with which breaches can occur.

Now, let’s begin to uncover the 12 technical leverage points…

1.Patch Management of Systems & Dependencies

Why	This is an excellent starting point, as keeping systems and dependencies up-to-date through Software Composition Analysis (SCA) is one of the most effective ways to protect against known vulnerabilities with relatively low effort.
How	Implementing this step can involve using scanners like Grype or Trivy to detect vulnerabilities in your built artifacts, and tools like OWASP Dependency Check or OWASP Dependency Track for managing library dependencies. These tools scan your project dependencies against a database of known vulnerabilities, providing insights and recommendations for updates or patches.
Effort	Medium: The initial setup and integration into your development workflow can take some time, but once configured, these tools run automatically, making the effort mostly upfront and then periodic for updates and reviews.
DevSecOps	Yes: These tools can be seamlessly integrated into DevSecOps CI/CD pipelines.
Cloud	Cloud environments often automate the patching of hosted services and infrastructure, significantly easing this aspect of security management. However, for custom applications and third-party dependencies, the responsibility usually falls on the cloud customer to ensure they are regularly updated, leveraging tools provided by the cloud platform for automation where possible: For AWS, consider to build an end-to-end AWS DevSecOps CI/CD pipeline which also covers dependency checking. For Azure, GitHub Advanced Security for Azure DevOps also covers dependency checking.

2.Hardening of Infrastructure & Configuration

Why	Hardening systems early in the security enhancement process is wise, as it reduces the attack surface by eliminating unnecessary services and securing configurations, providing a strong foundation for subsequent steps.
How	Hardening Infrastructure and Configuration involves adhering to secure standards like CIS Benchmarks, using specific tools for Docker and Kubernetes security assessments, leveraging minimal footprint images for containers, employing IaC scanning with tools like KICS, and conducting Linux system audits with Lynis.
Effort	High: Initial setup and understanding of benchmark standards can be time-intensive, but adopting automation tools can streamline the process.
DevSecOps	Somewhat: Security scans of infrastructure can be automated to run at regular intervals outside of commit pipelines, ensuring ongoing security assessments without impeding the continuous integration process. IaC scanners can be integrated into CI/CD pipelines to catch misconfigurations early.
Cloud	Utilize the cloud provider’s best practices like setting up security groups, network access controls, and ensuring that default configurations are changed to secure settings. Automation and template-based deployments can help maintain consistency across environments. Where possible, leverage cloud-native security tools to monitor and enforce security configurations: For AWS, consider using `cfn-nag`, `cdk-nag`, Checkov, TFLint and others to scan Infrastructure-as-Code (IaC) definitions. Also execute the CIS Benchmarks scans within the cloud to ensure secure configurations. Using AWS Inspector can also help to assess the security and compliance of the applications running on AWS. For Azure, consider using the Microsoft Security DevOps extension to scan Infrastructure-as-Code (IaC) definitions. Also execute the CIS Benchmarks scans within the cloud to ensure secure configurations.

3.Static Code Analysis

Why	Introducing automated tools to identify potential security issues in the codebase is a logical next step after securing the underlying infrastructure, as it builds security directly into the software development process.
How	Static Code Analysis can be performed using commercial tools or a blend of the following open-source tools, depending on your programming language: Find Security Bugs (Java, Groovy, Scala, Kotlin) Security Code Scan (.NET) GoSec (Go) Brakeman (Ruby) Bandit (Python) NodeJsScan (Node.js) PHP CS Security Audit (PHP) SonarQube / SonarCloud Semgrep
Effort	Medium: While integration into the development workflow is straightforward, setting up and configuring the tools to suit your specific needs may require some initial investment in time.
DevSecOps	Yes: These tools can be easily integrated into CI/CD pipelines to automatically scan code for vulnerabilities.
Cloud	Cloud-based tools can be configured to scan source code during the build process, providing real-time feedback to developers on security issues, for example: For AWS, integrating SonarCloud with AWS CodePipeline can create a comprehensive end-to-end AWS DevSecOps CI/CD pipeline. For Azure, integrating the Microsoft Security DevOps extension and Microsoft Security Code Analysis or GitHub Advanced Security for Azure DevOps is beneficial.

4.Secure Coding Requirements

Why	Establishing and adhering to secure coding standards is a natural progression from static code analysis, ensuring that developers are guided by best practices from the outset.
How	Develop Secure Coding Requirements by customizing guidelines based on your tech stack, referencing OWASP Top 10 and OWASP API Top 10 for common vulnerabilities and using OWASP Cheat Sheet Series for best practices in specific areas like authentication, session management, and encryption. It’s crucial to include fundamental security principles such as Separation of Data & Code, Input Validation, Least Privilege, Fail Safe Defaults, Encapsulation and others.
Effort	Low: Customizing secure coding guidelines requires just an initial investment to align with your specific environment.
DevSecOps	No: Secure coding requirements are more about setting an initial standard rather than direct CI/CD integration. Compliance is typically verified through code scans, already covered in Step 3.
Cloud	Use cloud-based development environments that enforce these practices and offer real-time feedback to developers on security issues. Seize the resources provided by cloud platforms to educate developers on secure coding practices and ensure compliance with security standards: For AWS, explore the AWS Well-Architected Framework, particularly the Security Pillar, which provides best practices and strategies to help you secure your workloads. For Azure, Microsoft’s Secure Development Best Practices on Azure offer a series of articles detailing security activities and controls for cloud application development.

5.Secure Coding Training for Developers

Why	Training developers on security best practices (referencing the secure coding requirements) complements the previous step by reinforcing the importance of security and empowering developers to write secure code. NIST mandates the implementation of security awareness and training programs as part of its comprehensive cybersecurity guidelines, ensuring that all personnel are educated about their roles in safeguarding information systems. For more details, refer to NIST Special Publication 800-50.
How	Effective security training should be engaging, with a mix of hands-on exercises and real-world scenarios that resonate with different roles within the organization, from developers and testers to architects and ops teams. Tailoring content to the company’s tech stack and deployment strategy makes the training more relevant and impactful. The training should emphasize the overarching aspect of Defense in Depth: Layering security measures so that if one mechanism fails, another is in place to protect the system.
Effort	High: Creating comprehensive, role-specific training that’s both informative and engaging demands substantial resources but stands as a strategic investment towards fostering a sustainable security culture and mitigating risks effectively.
DevSecOps	No: Security training isn’t directly integrated into DevSecOps pipelines.
Cloud	Leverage online platforms and cloud provider resources for up-to-date security training tailored to cloud development. Encourage participation in cloud-specific security training programs and certifications to build awareness and expertise: For AWS, explore the AWS Training and Certification for Security, which offers various resources to help developers understand security best practices, including secure coding for cloud environments. For Azure, Trust Center provides a comprehensive guide on developer security best practices.

This step is positioned later in the sequence primarily because comprehensive security training does not scale as swiftly across larger organizations compared to earlier measures. In smaller companies, implementing widespread training might be more feasible early on due to fewer personnel, allowing for quicker, organization-wide education.

6.Internal Application Security Verification

Why	Conducting internal reviews and verifications of application security helps identify and mitigate issues early, leveraging the foundation built by the preceding steps.
How	For Internal Application Security Verification, utilizing frameworks like OWASP ASVS provides a comprehensive checklist for verifying the security of web applications against industry-standard benchmarks. Tailoring these guidelines to fit the specific needs of your organization enhances the effectiveness of your security practices.
Effort	Medium to High: Implementing a thorough security verification process using standards like OWASP ASVS requires an initial investment in understanding and adapting the guidelines, but the ongoing effort ensures robust application security.
DevSecOps	Somewhat: While OWASP ASVS isn’t directly integrated into DevSecOps pipelines, its guidelines can inform automated security testing and code review processes, helping to maintain high security standards throughout the development lifecycle.
Cloud	Conduct security assessments using cloud-native tools or third-party solutions integrated with the cloud environment: For AWS, the AWS Security Hub provides a comprehensive view of your security state within AWS environments and can automate checks against security industry standards. For Azure, the Azure Security Center offers a unified infrastructure security management system that strengthens the security posture.

7.Threat Modeling

Why	This step involves a more strategic assessment of potential threats and is appropriately positioned after some internal security measures have been established, allowing for more informed threat modeling.
How	Initiate threat modeling with guidance from the Threat Modeling Manifesto to understand principles and practices. Integrating free tools like AttackTree and Threagile offer structured methodologies for identifying threats in a top-down and bottom-up manner, respectively. Typically, this process is led by a security architect or a dedicated coach for the initial models to ensure thoroughness and accuracy.
Effort	Medium: Initially high as the team learns the process and creates the first set of models with expert guidance. Over time, maintaining and updating these models as part of regular development cycles requires significantly less effort, becoming more efficient as the team gains experience.
DevSecOps	Somewhat: Threat Modeling can be partially integrated by using tools with APIs or CLIs, such as Threagile and AttackTree, to automate checks against threat model outcomes.
Cloud	Use threat modeling tools with cloud-specific templates to assess risks and design mitigations. Ingest cloud-specific tool suggestions from the cloud provider’s security resources: For AWS, the AWS Well-Architected Tool can help you review your workloads against AWS best practices and identify potential security risks. For Azure, the Microsoft Threat Modeling Tool provides threat modeling capabilities to help you identify and mitigate security risks in your Azure environment.

8.External Penetration Testing

Why	Bringing in external experts to test the security of applications adds an additional layer of scrutiny and is well-timed after internal assessments and threat modeling.
How	Penetration testing to uncover vulnerabilities not prevented or detected earlier comes in various forms: Black Box with no prior knowledge, Grey Box with some knowledge about the architecture and tech stack, and White Box with full knowledge usually including source code access. For structured methodologies, refer to the OWASP Testing Guide for comprehensive insights. Also, it’s important to establish a feedback loop from the findings to understand the root causes and adjust internal practices accordingly, preventing future regressions.
Effort	High: Due to the need for specialized skills and the comprehensive nature of the tests. External penetration tests are periodic but crucial for uncovering vulnerabilities that internal measures might miss.
DevSecOps	Somewhat: Integrating DAST tools into CI/CD pipelines offers a step towards automation by simulating external attacks on live applications, although it doesn’t encompass the full scope of manual penetration testing. Insights from both DAST and manual tests should inform and enhance DevSecOps practices to preemptively address potential vulnerabilities.
Cloud	Use approved methods and tools to test the security of cloud-hosted applications and infrastructure from an external perspective, identifying vulnerabilities that could be exploited by attackers. Coordinate with cloud providers to comply with their penetration testing policies and procedures: For AWS, refer to AWS Penetration Testing. For Azure, refer to Azure Penetration Testing.

9.Secure Authentication & Authorization

Why	Focusing on robust authentication and authorization mechanisms is crucial and requires the secure foundation established by earlier steps to be effectively implemented.
How	Enhance your application’s security by elevating Authentication and Authorization mechanisms. Implement Multi-Factor Authentication (MFA), enforce strong password policies, and apply the principle of Least Privilege across all access points. Utilize trusted authentication providers and protocols to ensure robust security. In Microservice architectures, adopt a Zero Trust approach by propagating tokens inside the backend (using a Service Mesh might help here), thereby decentralizing authorization and making it harder for attackers to move laterally within the system. This strategy integrates advanced security practices into your application’s authentication and authorization layers, significantly reducing the likelihood of unauthorized access and enhancing overall security posture.
Effort	High: Implementing advanced authentication mechanisms like MFA and integrating Zero Trust architecture requires a significant initial setup and ongoing management to ensure compliance and effectiveness.
DevSecOps	Somewhat: To effectively integrate authentication and authorization into DevSecOps, consider automating authorization testing as recommended by OWASP. Utilize the guidelines in the Authorization Testing Automation Cheat Sheet to ensure that authorization mechanisms are consistently validated throughout the development lifecycle, enhancing security and compliance with minimal manual effort.
Cloud	Implement cloud-native identity and access management (IAM) services to manage user identities, permissions, and access controls. Utilize multi-factor authentication, role-based access control, and least privilege principles to secure access to cloud resources: For AWS, AWS IAM allows you to set up and manage permissions in a granular manner. For enabling MFA for end users in AWS, refer to the Amazon Cognito documentation. For Azure, Entra AD offers comprehensive identity and access management, both in the cloud and on-premises. For enabling MFA for end users in AWS, refer to the Entra AD B2C documentation.

This step is positioned later due to the potentially high effort required to enhance authentication and authorization mechanisms, particularly in complex, large-scale, or legacy systems. This step can be more resource-intensive and challenging to implement across grown architectures. However, if your architecture is simpler or currently under development, prioritizing this step earlier could be more feasible and impactful.

10.Encryption of Sensitive Data

Why	Encrypting sensitive data is crucial not only for ensuring privacy and security but also for complying with data protection regulations like GDPR and CCPA. Following the principles of privacy-by-design, encryption should be integrated after secure authentication and authorization frameworks are established to provide a comprehensive security posture. As Amazon CTO Werner Vogels famously said: “Dance like nobody’s watching, encrypt like everyone is.” This emphasizes the need to assume that external scrutiny is constant, underscoring the importance of rigorous data protection practices.
How	Implement encryption effectively by distinguishing between in-transit and at-rest requirements: For in-transit data, secure all forms of communication, not just HTTPS traffic, by implementing protocols like TLS for JDBC and other messaging protocols. For data at-rest, focus on employing strong encryption standards and robust key management practices. This involves not only choosing the right encryption algorithms but also ensuring the secure generation, storage, and handling of encryption keys to prevent unauthorized access.
Effort	Medium to High: Implementing effective encryption strategies involves setting up protocols for both data in transit and at rest, alongside managing encryption keys securely. The effort includes both technical implementation and ongoing management to ensure compliance and maintain security.
DevSecOps	Somewhat: Encryption practices should be integrated into DevSecOps by automating key management and renewal processes, and by ensuring encryption standards are maintained throughout the software development lifecycle.
Cloud	Employ cloud services for data encryption, both at rest and in transit, using the cloud provider’s built-in encryption capabilities. Ensure proper key management practices, possibly using the cloud provider’s key management service, to secure encryption keys. For AWS, encrypting sensitive data using AWS Key Management Service (KMS). Regarding data protection regulations, refer to AWS Artifact. For Azure, using Azure Key Vault for managing and encrypting keys and secrets. Regarding data protection regulations, refer to the Service Trust Portal.

This step is positioned later in the process because implementing encryption across large or legacy architectures often requires considerable effort and can be resource-intensive. This makes it a more challenging step for established systems. However, if your system architecture is straightforward or currently in the developmental phase, integrating encryption earlier might be more manageable and beneficial.

11.Security Monitoring & Incident Response Plan

Why	Implementing Security Information and Event Management (SIEM) systems and developing incident response plans are complex but essential steps for identifying and responding to security incidents efficiently.
How	Implement a Security Information and Event Management (SIEM) system to gain real-time insights into your application’s security posture. Additionally, integrating OSSEC can enhance host-based intrusion detection by monitoring parameters such as log files, file integrity, and rootkit detection. This allows for swift detection and response to threats. Utilize open-source tools like ELK Stack (Elasticsearch, Logstash, Kibana) for logging and monitoring. Develop a robust Incident Response Plan to effectively manage and mitigate the impacts of security incidents, ensuring continuity and maintaining trust. Refer to the NIST Incident Response Guide for structured approaches. Remember to include post-mortem root-cause analysis to learn from incidents and improve security practices as a feedback loop to the aforementioned steps.
Effort	High: Setting up a SIEM system and developing an Incident Response Plan require significant investment in terms of both time and expertise. Initial configuration, integrating various tools like OSSEC with the ELK Stack, and ensuring all components are working cohesively are complex tasks. Regular updates and training are also necessary to maintain efficacy.
DevSecOps	No: Direct integration of SIEM into DevSecOps CI/CD pipelines is not typically feasible, as SIEM functions primarily focus on security monitoring and incident management within live production environments.
Cloud	Develop an incident response plan that leverages cloud services for rapid response and recovery, and ensure it’s regularly tested and updated. Adopt cloud-based SIEM solutions that offer integrated logging, monitoring, and real-time analysis of security alerts: For AWS, refer to AWS Security Hub as well as AWS GuardDuty for intelligent threat detection. For Azure, Azure Sentinel provides extensive SIEM capabilities, offering integrated threat intelligence, real-time analysis, and rapid response.

12.Security Metrics & Key Performance Indicators

Why	Finally, establishing metrics and Key Performance Indicators (KPIs) for ongoing monitoring and improvement of security practices is a strategic way to close the loop, ensuring continuous assessment and enhancement of the security posture.
How	Implement metrics that highlight areas needing attention and track improvements over time. Dashboards should be utilized to make security metrics visible to project managers, emphasizing the importance of security within project KPIs. Tools like OWASP Defect Dojo can be used for tracking lower-level security issues, but it’s crucial to also aggregate these findings into higher-level metrics that can be visualized and monitored in broader management tools. This approach not only highlights vulnerabilities but also supports proactive security management by making (in)security a visible and quantifiable aspect of project performance. It’s crucial to develop KPIs from security metrics (such as findings from SCA, SAST, DAST, Penetration Tests, and Threat Modeling) that incorporate the time factor to monitor improvements or persistence of vulnerabilities over time. Metrics like the Average Time of Exposure of high-risk findings in production environments are especially valuable, as they provide clear indicators of how quickly security issues are being addressed and resolved.
Effort	Medium to High: Establishing and maintaining security metrics and KPIs requires a significant upfront effort to define the right metrics, integrate data collection tools, and set up dashboards. Once established, the ongoing effort required to analyze and update these key figures based on current data is reduced with automation of tools.
DevSecOps	Yes: Integrating security metrics and KPIs into DevSecOps involves incorporating automated scanners and tools to continuously track and report on these metrics throughout the CI/CD pipeline.
Cloud	Use cloud monitoring and management tools to track security metrics and KPIs. These should include measures of compliance with security policies, incident response times, effectiveness of security controls, and user access and activity monitoring: For AWS, CloudWatch provides detailed monitoring and analytics. For Azure, Azure Monitor is key for tracking security metrics and KPIs.

Defining and integrating metrics and KPIs into an organization's processes is a complex challenge, which I will explore in depth in my upcoming article on Process & Organization aspects.

Does the sequence matter?

The journey towards a Secure SDLC doesn’t require a simultaneous overhaul but a strategic, step-by-step approach. Each organization’s unique context, risk profile, and resource availability can influence the prioritization of these steps. However, for businesses operating with limited resources, progressing from step 1 to 12 provides a logical and efficient pathway:

Notably, the initial focus on Patch Management of Software & Dependencies, coupled with the Hardening of Infrastructure & Configuration, is strategically designed to mitigate the risk of the most common and easily exploitable vulnerabilities. These initial steps are crucial in defending against automated, untargeted attacks perpetrated by threat actors using automated exploit kits, effectively targeting the “lowest-hanging fruits” among potential vulnerabilities.

This approach not only prioritizes immediate defenses against prevalent risks but also sets a solid foundation for advancing through the subsequent leverage points with a progressively fortified posture.

Looking ahead

In embracing these initial steps, companies can not only elevate their security but also lay a foundational culture of security that permeates every aspect of the development lifecycle. For those seeking to navigate these waters with expert guidance, services such as Secure Software Development Training, DevSecOps Coaching, and Agile Threat Modeling offer a beacon, ensuring that your journey towards secure software development is both strategic and seamless.

Process and organizational aspects

While the above presented twelve technical adjustments can significantly enhance your security posture, they represent just one facet of a comprehensive security strategy. The more layered organizational and process-oriented enhancements will be the subject of a forthcoming article, providing a holistic roadmap to a mature, robust Secure SDLC.

Vendor partnerships and their unique challenges

Addressing vendor-related security within the software ecosystem necessitates a nuanced approach to the presented twelve steps, particularly when considering the diversity of vendor relationships:

Custom Development Vendors who program specifically for your needs and deliver the source code.
On-Premise Software Vendors supplying built software for deployment within your own infrastructure.
SaaS Providers offering solutions hosted on their platforms, accessible over the internet.

Each type of vendor engagement introduces distinct security considerations, from scrutinizing the delivered source code of custom development partners to ensuring the secure configuration and ongoing maintenance of on-premise solutions, and evaluating the data handling and storage practices of SaaS providers.

Given the intricacies and the importance of securing each touchpoint, a focused examination of Vendor Application Security Testing (VAST) becomes indispensable.

If this resonated…

Subscribe to my newsletter below to ensure you’re first to be notified of my follow-up articles.

Published at: https://christian-schneider.net/blog/12-steps-to-secure-software/

Micro Attack Simulations

Fri, 20 Oct 2023 09:00:00 GMT

Micro Attack Simulations

TL;DR

While organizations with mature security programs benefit from full Red Team and Purple Team exercises, many organizations building their security capabilities need a more accessible approach. Micro Attack Simulations fill this gap by validating specific security controls through targeted exercises—testing both technical defenses like intrusion detection and non-technical aspects like escalation procedures and crisis management. Combined with Attack Tree modeling, this approach provides organizations at any maturity level with actionable insights into their cyber resilience.

I was interviewed before my talk at DeepSec 2023 about the topic of my talk, about how to improve cyber resilience by adopting Micro Attack Simulations. This interview was also cross-published on the DeepSec blog: Improving Cyber Resilience Through Micro Attack Simulations.

Pre-Talk Interview

With the increasing adoption of Red Teaming and Purple Teaming in the cybersecurity industry, organizations that have achieved high levels of security maturity can greatly benefit from these activities. However, organizations at the onset of building a security program are often left out. This talk introduces Micro Attack Simulations, an innovative approach that allows organizations to validate specific security controls without waiting for full-blown Red Teaming exercises.

Micro Attack Simulations focus on assessing single or multiple security controls that are already implemented, providing a valuable approach for organizations aiming to bolster their cyber resilience. These simulations not only focus on technical aspects but also consider non-technical security controls such as escalation procedures and reporting paths during security incidents. As a result, organizations can derive specific Red Team unit tests and perform a gap analysis of existing security controls.

The talk will include an anonymized case study that shows the modeling of potential attack trees and the technical execution of a Micro Attack Simulation. The simulation’s goal was to validate security controls around a successful ransomware attack on the server infrastructure, including the encryption and exfiltration of sensitive customer data. The simulation involved actual data encryption, multi-node compromise using Cobalt Strike, separate custom-written out-of-band command-and-control channels, and even placing ransom notes and sending ransom emails to the organization’s official press and communication channels to test crisis management processes.

Please tell us the top 5 facts about your talk.

The talk introduces the novel concept of Micro Attack Simulations, a focused approach to validate individual or multiple security controls in an organization’s security setup, which is combined with Attack Tree modeling.
The simulations are designed to assess not only the technical security controls like firewalls and intrusion detection systems, but also non-technical aspects like escalation procedures and crisis management.
The simulation uses a multi-method approach, incorporating tools like Cobalt Strike and custom-written out-of-band command-and-control channels for a comprehensive assessment.
By combining the Micro Attack Simulations with an Attack Tree approach, the holistic view of an organization’s cybersecurity resilience is still maintained.
The talk will feature a real-world, anonymized case study involving an elaborate simulation of a ransomware attack, aiming to validate the security controls related to detection, response, data encryption, C2 and exfiltration.

How did you come up with it? Was there something like an initial spark that set your mind on creating this talk?

The initial spark came from observing a gap in the industry; while well-established organizations with mature security programs were benefiting from Red and Purple Teaming exercises, smaller organizations or those in the early or intermediate stages of building their security programs were often left behind. When combined with Attack Tree modeling, Micro Attack Simulations can bridge this gap and provide a tailored, modular approach to validate security controls even at the nascent stages of a security program

Why do you think this is an important topic?

The topic is crucial because as cyber threats evolve, so must our defensive strategies. Traditional security assessment methods often require a high level of maturity and resources, making them inaccessible for organizations that are still maturing their security posture. Micro Attack Simulations streamline the validation process, making it easier, quicker, and more cost-effective for organizations at varying levels of security maturity.

Is there something you want everybody to know – some good advice for our readers maybe?

Always consider security as a multi-faceted problem; it’s not just about technology but also about processes and people. One overlooked security control or a poorly designed escalation process (kicking in too late for effective defense) can render even the most advanced technical defenses useless. Never underestimate the importance of non-technical controls in your security architecture.

A prediction for the future – what do you think will be the next innovations or future downfalls when it comes to your field of expertise / the topic of your talk in particular?

In the future, I believe we’ll see a move towards automated Micro Attack Simulations, with machine learning algorithms helping to predict potential vulnerable spots and adjust security controls in real-time. However, the downfall could be an over-reliance on automated systems, which might lead to a lack of human oversight and potentially new, unanticipated types of vulnerabilities. As always, it keeps to be challenging.

Key Takeaways

Micro Attack Simulations bridge maturity gaps: Organizations don’t need to wait for full Red Team readiness—targeted simulations validate specific controls at any maturity level.
Technical and non-technical controls matter equally: A poorly designed escalation process can render advanced technical defenses useless.
Attack Tree modeling provides structure: Combining simulations with Attack Tree approaches maintains a holistic view while testing specific controls.
Real-world validation beats theoretical assessments: Actual data encryption, C2 channels, and crisis management testing reveal gaps that audits miss.
Future trend: automation with human oversight: Machine learning will help predict vulnerabilities, but over-reliance on automation creates new risks.

Interested in running your own simulation?

Ready to evaluate your cybersecurity resilience and preparedness? Explore how the Micro Attack Simulations can be a crucial part in assessing your defensive strategies.

Published at: https://christian-schneider.net/blog/micro-attack-simulations/