Copy Fail: The Exploit That Leaves No Trace on Disk

Peter Cummings • May 13, 2026 • 9 min read

In 2017, a logic flaw was introduced into the Linux kernel's crypto subsystem. In April 2026, it became a CISA KEV entry.

CVE-2026-31431, which the security community calls Copy Fail, is a local privilege escalation in the kernel's algif_aead module. Any unprivileged local user. 732 bytes of Python. Root. No race condition. No special tooling. No network access required. It works unmodified across Ubuntu, RHEL, Amazon Linux, Debian, and SUSE: every mainstream distribution built since 2017. The researchers who found it did so in about an hour of AI-assisted analysis.

If your security posture still relies on perimeter defenses and manual patching cycles, Copy Fail should be the wake-up call that changes how you think about Linux security entirely.


The Part That Matters Most

Here is the detail that every security team running file integrity monitoring, EDR, or SIEM-based detection needs to sit with:

The exploit is entirely in-memory. The binary on disk never changes.

Your file integrity tool sees nothing. When the system reboots, the page cache reloads clean. No artefact. No log. No alert. The physical file on disk is untouched from first syscall to root shell.

This is the architecture of the modern privilege escalation problem on Linux. It doesn't beat your detection tools. It routes around them entirely — through the identity and privilege layer that those tools were never designed to govern.


What Copy Fail Actually Is

The flaw lives inside algif_aead — the kernel module that exposes AEAD (Authenticated Encryption with Associated Data) cipher operations to userspace via AF_ALG sockets. Back in 2017, a performance optimisation was committed to the kernel that allowed the cryptographic subsystem to perform in-place operations, reusing source memory as the destination. It was a reasonable efficiency improvement at the time.

What nobody noticed was the side effect: when userspace feeds page-cache pages into this pipeline via splice(), those pages end up in a writable destination scatterlist. An attacker can use this to perform a controlled 4-byte write into the kernel's page cache — targeting any readable file on the system, including setuid binaries like /usr/bin/su.

The exploit works in four steps:

  1. Bind an AF_ALG socket to authencesn(hmac(sha256),cbc(aes))
  2. Splice page-cache pages of /usr/bin/su into the crypto pipeline
  3. Write controlled shellcode bytes into the cached binary, 4 bytes at a time
  4. Execute su — the kernel loads the patched binary from page cache and yields a root shell

The HMAC check fails, as expected. The attacker doesn't care. The corruption persists in the page cache regardless, so the next execution of su runs the modified code while the file on disk stays byte-for-byte identical. Traditional file integrity monitoring never sees it.
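
To make the split between disk and memory concrete, here is a minimal Python sketch that hashes a file twice: once through ordinary buffered reads, which are served from the page cache, and once through O_DIRECT reads, which ask the kernel to go to the storage device. It is an illustration only, not a vetted detector and not something taken from the Copy Fail research. The function names are hypothetical, and some filesystems reject O_DIRECT or quietly fall back to buffered I/O, so a match does not prove the cache is clean.

```python
import hashlib
import mmap
import os

CHUNK = 1 << 20  # 1 MiB, a multiple of common logical block sizes (O_DIRECT needs aligned I/O)


def sha256_buffered(path):
    """Hash the file as ordinary reads see it (page cache when the file is cached)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_direct(path):
    """Hash the file via O_DIRECT reads; return None where that is not possible."""
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:
        return None  # platform without O_DIRECT
    try:
        fd = os.open(path, os.O_RDONLY | o_direct)
    except OSError:
        return None  # filesystem refuses O_DIRECT
    buf = mmap.mmap(-1, CHUNK)  # anonymous mapping gives a page-aligned buffer
    digest = hashlib.sha256()
    try:
        remaining = os.fstat(fd).st_size
        while remaining > 0:
            n = os.readv(fd, [buf])
            if n <= 0:
                break
            digest.update(buf[:n])
            remaining -= n
    except OSError:
        return None
    finally:
        buf.close()
        os.close(fd)
    return digest.hexdigest()


def cache_matches_disk(path):
    """True/False when both hashes are available, None when O_DIRECT is unavailable."""
    direct = sha256_direct(path)
    if direct is None:
        return None
    return direct == sha256_buffered(path)
```

Calling cache_matches_disk("/usr/bin/su") on a healthy host would be expected to return True. A False result means the cached copy and the on-disk copy differ, which is worth investigating whatever the cause.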

CVSS score: 7.8 HIGH. Affected distributions: Ubuntu, Amazon Linux 2023, RHEL, SUSE, Debian, Arch, Fedora, Rocky, Alma, and practically everything else running a kernel built between 2017 and the patch date.


Why This Is Different From Most CVEs

Most privilege escalation vulnerabilities have friction. They require specific kernel versions, particular memory layouts, or timing windows that make reliable exploitation difficult. They're dangerous in theory; exploitable in practice only by skilled attackers with time and resources.

Copy Fail is the opposite. It is deterministic. The same 732-byte Python script — no modification, no tuning — produces a root shell on every major enterprise Linux distribution. Researchers at Theori verified it directly against Ubuntu 24.04 LTS, Amazon Linux 2023, RHEL 10.1, and SUSE 16, in a single take, unmodified.

And here's what elevates the threat from serious to critical in multi-tenant environments: the page cache is shared across the entire host. A container with enough primitives doesn't just compromise itself — it compromises the node. It crosses tenant boundaries. A CI runner executing an untrusted pull request becomes root on the build server. A Kubernetes pod becomes a node escape. Any shared dev box, jump host, or shell-as-a-service offering becomes a free pass to full system compromise.

Theori's AI system, Xint Code, found this in approximately one hour of automated scan time against the Linux crypto/ subsystem with a single operator prompt. No harnessing. No human-guided iteration.

Nine years. Hidden in plain sight. Discovered by an AI in sixty minutes.


The Uncomfortable Truth About Your Linux Blind Spot

Here's the conversation I hear most often from security leaders at mid-to-large enterprises:

"We have a vulnerability management program. We track CVEs. We patch on a schedule."

I don't doubt it. But I have a follow-up question: How long does it actually take for a critical kernel CVE to be patched across your entire Linux estate?

Not in theory. In practice. On the long tail of servers that aren't in the managed golden image. On the jump hosts that "someone else looks after." On the CI runners that spin up from a base AMI that hasn't been refreshed in four months. On the Kubernetes nodes where the node image is managed by a separate team.

The answer, for most enterprises I've spoken with, is measured in weeks — sometimes months. And during that entire window, Copy Fail means that any unprivileged shell access — a compromised service account, a leaked developer credential, a web RCE that lands in the context of www-data — becomes a guaranteed path to root. Entirely in memory. Entirely invisible to your FIM and EDR stack.

This isn't a criticism. It's a structural challenge that almost every organisation running Linux at scale faces. The problem isn't that security teams aren't trying. The problem is that most organisations have no real-time visibility into which of their Linux servers are running vulnerable kernels, who has local access to those servers, and how quickly they can act.


Patching Is Mandatory — But It Is Not Evidence

Patching this CVE is mandatory if you're in scope for CISA KEV or operating under DORA and NIS2 ICT risk obligations. But patching is not the same as knowing which accounts had local access to affected systems, and what paths to root existed before you applied it.

That evidence is what regulators are starting to ask for.

The upstream fix is mainline commit a664bf3d603d, which reverts the 2017 optimisation. Every major distribution is shipping the backport now. Patch your kernels. Until you can do that, disable the algif_aead module:

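# Run as root: block future loads, then unload the module if it is currently loaded
echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif.conf
rmmod algif_aead 2>/dev/null || true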
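
Once that is rolled out, it is worth confirming that the mitigation actually stuck on each host. The Python sketch below is one way to do that, under stated assumptions: it reads /proc/modules and the files in /etc/modprobe.d, and the function names are hypothetical. It cannot tell you anything about kernels where algif_aead is compiled in rather than built as a module.

```python
import glob
import re


def module_loaded(proc_modules_text, name="algif_aead"):
    """True if `name` is listed as a loaded module in /proc/modules content."""
    return any(
        line.split()[0] == name
        for line in proc_modules_text.splitlines()
        if line.strip()
    )


def install_rule_blocks(conf_text, name="algif_aead"):
    """True if a modprobe.d snippet maps `install <name>` to /bin/false or /bin/true."""
    pattern = re.compile(
        rf"^\s*install\s+{re.escape(name)}\s+(/usr)?/bin/(false|true)\b"
    )
    return any(pattern.match(line) for line in conf_text.splitlines())


def host_status(modules_path="/proc/modules", conf_glob="/etc/modprobe.d/*.conf"):
    """Summarise mitigation state for the local host."""
    with open(modules_path) as f:
        loaded = module_loaded(f.read())
    blocked = False
    for path in glob.glob(conf_glob):
        with open(path) as f:
            if install_rule_blocks(f.read()):
                blocked = True
    return {"loaded": loaded, "install_blocked": blocked}
```

A mitigated host should report loaded as False and install_blocked as True from host_status(). Any other combination means the module is still loaded or can still be loaded on demand.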

For untrusted workloads — containers, CI runners, sandboxes — block AF_ALG socket creation via seccomp regardless of patch state.
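
As an illustration of what that seccomp rule can look like, the Python sketch below generates a minimal profile in the JSON format that Docker and other OCI runtimes accept. It is a deny-list sketch under stated assumptions, not a hardened profile: it allows everything except socket() calls whose first argument is AF_ALG (address family 38 on Linux). Passing it to a runtime replaces that runtime's default profile rather than extending it, and folding the rule into an existing profile needs care because of how conflicting socket rules are resolved. The function name is hypothetical.

```python
import json

AF_ALG = 38  # Linux address family number for AF_ALG


def build_profile():
    """Return a seccomp profile that fails socket(AF_ALG, ...) and allows the rest."""
    return {
        "defaultAction": "SCMP_ACT_ALLOW",
        "syscalls": [
            {
                "names": ["socket"],
                "action": "SCMP_ACT_ERRNO",
                # Match only when argument 0 (the address family) equals AF_ALG.
                "args": [{"index": 0, "value": AF_ALG, "op": "SCMP_CMP_EQ"}],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_profile(), indent=2))
```

Saved to a file, the output can be passed to Docker with --security-opt seccomp=<file>. In Kubernetes the equivalent is a Localhost seccomp profile referenced from the pod's securityContext.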

But the minimum viable response is exactly that: minimum. The deeper lesson is about what happens between disclosure and patch, what your identity and access posture looked like during that window, and whether you can prove it to an auditor. Because the next Copy Fail is already being found. The same AI system that uncovered this one has surfaced more high-severity bugs that are currently in coordinated disclosure.


What Good Looks Like After Copy Fail

Three things need to be true simultaneously for an organisation to handle vulnerabilities like Copy Fail with confidence — during the exposure window, not just after the patch is applied.

1. You need kernel-level visibility into who had local access to every server.

Copy Fail requires an unprivileged local user account. That means your exposure surface is defined entirely by your Linux identity posture — who has shell access, what accounts exist across your estate, which identities are dormant, which service accounts are running on which hosts. If you can't answer "who could reach a shell on my vulnerable servers during the exposure window," you cannot produce the evidence regulators under DORA or NIS2 are beginning to require.

At LinuxGuard, we call this Identity Intelligence — a real-time, cross-host map of every identity across your Linux estate, from human users to non-human identities (which outnumber humans 80:1 on most enterprise Linux estates), with risk signals including dormancy, shared credentials, and privilege scope. Our Authority Object Graph lets you ask: "Which identities had local access on servers running kernels in the Copy Fail vulnerable range?" — and get an answer in a single query, not a multi-hour SIEM investigation.

2. You need configuration-level response, not just detection.

Copy Fail can be mitigated at the kernel module level before a patch is available. But applying that mitigation — disabling algif_aead across thousands of servers — requires the ability to detect the exposure, generate a remediation proposal, get approval, and execute across the fleet safely, inside maintenance windows, with rollback if something goes wrong.

This is exactly the workflow LinuxGuard's Config Manager is designed to enable: configuration governance across 11 identity-relevant Linux configuration types, with workflow-driven remediation that requires approval, respects maintenance windows, and auto-rolls-back on validation failure. No "our SSH broke at 3 AM" moments.

3. You need vulnerability context tied to identity and access — not just a patch queue.

The scariest version of Copy Fail isn't the server in your managed production tier. It's the build server, the jump host, the internal notebook environment — the servers where "someone has access" means many people have access, and where patching is an afterthought because "it's internal."

When we talk about Vulnerability Control at LinuxGuard, we're talking about a triage queue ranked by actual risk: CVSS combined with CISA KEV status, EPSS score, and real-time exploitability — correlated against the identity and access picture on each server. A KEV-listed CVE on a multi-tenant jump host with 40 active identities is a different priority from the same CVE on a single-tenant production server accessed only by three engineers. The numbers should not look the same on your dashboard. And crucially — when you do patch, you retain the evidence of what the posture looked like before.
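
To show the shape of that ranking, here is a small Python sketch. It is not LinuxGuard's scoring model. The weights, the formula, and the Finding fields are all hypothetical, chosen only to show how the same CVE can land at different positions in the queue once identity exposure is factored in.

```python
import math
from dataclasses import dataclass


@dataclass
class Finding:
    host: str
    cvss: float            # base score, 0.0 to 10.0
    in_kev: bool           # listed in CISA KEV
    epss: float            # exploit probability, 0.0 to 1.0
    local_identities: int  # active identities with local access


def priority(f):
    """Illustrative score: severity scaled by exploitation signals and exposure."""
    score = f.cvss / 10.0
    score *= 2.0 if f.in_kev else 1.0                   # assumed KEV weight
    score *= 1.0 + f.epss                               # assumed EPSS weight
    score *= 1.0 + math.log2(1 + f.local_identities)    # more identities, more exposure
    return round(score, 3)


def triage(findings):
    """Return findings ordered from highest to lowest priority."""
    return sorted(findings, key=priority, reverse=True)


if __name__ == "__main__":
    # The EPSS value here is a placeholder, not a published figure.
    jump = Finding("jump-host", 7.8, True, 0.5, 40)
    prod = Finding("prod-db", 7.8, True, 0.5, 3)
    for f in triage([prod, jump]):
        print(f.host, priority(f))
```

With identical CVSS, KEV, and EPSS inputs, the jump host with 40 identities sorts above the production server with three, which is the behaviour the paragraph above describes.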


The AI Angle: When Attackers Move Faster Than Your Patch Cycle

Copy Fail was found by an AI system in approximately one hour, from a single prompt, with no human-guided iteration. The same scan surfaced multiple other high-severity bugs currently in coordinated disclosure.

This is not a warning about AI being dangerous. It's a warning about the asymmetry developing between how quickly vulnerabilities can be discovered and weaponised, and how slowly most organisations can respond.

If an AI can find a nine-year-old high-severity vulnerability in sixty minutes, the assumption that your Linux estate is "probably fine" between major disclosed CVEs is no longer a safe operating posture. The surface area of kernel code is vast. The tooling to explore it is becoming faster and more accessible. The lag between discovery and public weaponised exploit is shrinking.

The right response is not panic. It is building infrastructure that can respond faster — that can get from "CVE disclosed" to "all vulnerable hosts patched" in days, not weeks, with an auditable record of every action taken and every identity that was in scope during the window.


What You Should Do This Week

Immediate actions:

  • Audit which of your Linux servers are running kernels in the Copy Fail vulnerable range (built 2017 → April 2026)
  • Disable algif_aead on hosts that cannot be patched immediately
  • Apply seccomp restrictions on CI runners, Kubernetes pods, and containers to block AF_ALG socket creation
  • Patch to a distribution kernel that includes mainline commit a664bf3d603d
  • Treat any local code execution on unpatched hosts as a root-equivalent compromise — in memory, invisible to FIM and EDR
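
One way to check the second and third actions from inside a host or container is to probe whether the AEAD interface is still reachable. The Python sketch below only opens an AF_ALG socket and binds it to the algorithm named earlier, then closes it. It performs no splice and no crypto operation. Two caveats, both assumptions about your environment: on a host where the module is not blocked, the bind may cause the kernel to load algif_aead on demand, and "reachable" means only that the interface is exposed, not that the kernel is unpatched. The function name and return strings are hypothetical.

```python
import socket

AEAD_ALG = "authencesn(hmac(sha256),cbc(aes))"


def probe_af_alg_aead(alg=AEAD_ALG):
    """Report whether an unprivileged process can bind an AF_ALG AEAD socket."""
    family = getattr(socket, "AF_ALG", None)
    if family is None:
        return "unsupported"  # platform or Python build without AF_ALG
    try:
        sock = socket.socket(family, socket.SOCK_SEQPACKET, 0)
    except OSError:
        return "blocked"      # e.g. seccomp denies socket(AF_ALG), or no kernel support
    try:
        sock.bind(("aead", alg))
        return "reachable"
    except OSError:
        return "blocked"      # e.g. algif_aead disabled or algorithm unavailable
    finally:
        sock.close()


if __name__ == "__main__":
    print(probe_af_alg_aead())
```

Run as an unprivileged user inside the workload you care about, for example a CI job or a pod, a result of "blocked" suggests the seccomp rule or module block is doing its job there.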

Structural questions to ask your team:

  • How long did it take for the Copy Fail patch to propagate across our entire Linux estate, including long-tail infrastructure?
  • Do we know every identity with local shell access on our servers — including service accounts and non-human identities — during the exposure window?
  • Can we produce that evidence for a DORA or NIS2 ICT risk review?
  • Can we apply kernel module-level mitigations across the fleet without a multi-day change management process?
  • Are our CI runners and build systems treated as security-sensitive shared infrastructure, or as throwaway compute?

The Posture This Era Requires

Copy Fail is a landmark vulnerability not just for what it does, but for what it represents. The era of infrequent, high-effort kernel exploits may be ending. AI-assisted discovery is changing the economics. The attack surface of your Linux kernel is being audited continuously by systems that don't sleep, don't take holidays, and don't need a consultant's retainer fee.

And the exploits they find route around your existing tooling — not through it. FIM doesn't help when the disk never changes. EDR doesn't help when there's no process anomaly to detect. SIEM doesn't help when there's no log entry to correlate.

The posture this era requires isn't just better patching. It's real-time identity visibility, kernel-level telemetry, and the ability to move from detection to remediation faster than your adversary can weaponise the next finding — with the audit evidence to prove you were in control throughout.

That's what we're building at LinuxGuard. Copy Fail is exactly the kind of event that clarifies why it matters.

If you want to understand your current Linux identity and vulnerability exposure — including what the posture looked like during the Copy Fail window — book a conversation with our team. The 28-day LinuxGuard Linux Identity & Security Audit was designed precisely for moments like this one.


LinuxGuard delivers identity-centric Linux security built on eBPF telemetry — real-time visibility into who can do what, who is doing what, and where your Linux environment drifts from secure posture. No kernel modules. No heavy agents. Learn more at linuxguard.io.

Peter Cummings — IT Security & AI expert with 20+ years’ experience. Founder of LinuxGuard. Passionate about automation, least privilege, and scalable cloud solutions.
