Copy Fail (CVE-2026-31431): a Linux kernel bug that turns any user into root, explained in plain English

May 8, 2026|10 min read

On April 29, 2026, a security firm called Theori publicly disclosed a Linux kernel vulnerability that has been quietly sitting in every mainstream Linux distribution for nine years. They named it Copy Fail. Its catalogue number is CVE-2026-31431.

The short version: any non-root user on a vulnerable Linux server can become root. The exploit is 732 bytes of Python, requires no compiler, no special skills, and works on the first try. CISA added it to its Known Exploited Vulnerabilities list on May 1, 2026, with a federal patching deadline of May 15, 2026.

Hostney has already patched all servers in the fleet. The rest of this article is for everyone else – what the bug is, why it matters, and what to do if you run Linux anywhere other than Hostney.

What Copy Fail actually does

Linux has a feature called the Crypto API that lets programs ask the kernel to encrypt or decrypt data on their behalf. Most programs never touch it directly – disk encryption, IPsec, and kernel TLS reach it through internal kernel paths that ordinary users cannot call, while SSH and most TLS libraries do their cryptography in userspace. But there is a userspace door into the same machinery, called AF_ALG, which any program can open.

In 2017, a kernel developer optimized one corner of this code. The change was small and looked correct: instead of copying user data to a scratch buffer before encrypting, the kernel would encrypt in place, reusing the input buffer as the output buffer. This saves a memory copy on every operation. Reasonable change, applied without controversy, sat in the kernel for nine years.

The problem: that optimization assumed the input data was a regular memory buffer the user owned. Linux has another feature called splice() that lets programs efficiently move data between files and sockets without copying. When a program splices a file into an AF_ALG socket, the kernel does not actually copy the file contents – it hands the cryptographic engine direct references to the page cache, which is the kernel’s in-memory copy of files.

You can probably see where this is going.

If a user splices /usr/bin/su into a crypto socket, the kernel’s encryption code now treats pages of the page cache as scratch buffers. One specific cryptographic algorithm, called authencesn, writes 4 bytes past the end of its output region as part of normal operation. With the 2017 in-place optimization, those 4 bytes get written into the page cache itself. The kernel just modified its own cached copy of /usr/bin/su.

The attacker controls those 4 bytes. They control which file. They control which offset. They repeat the operation a few times and have written shellcode into the in-memory copy of a setuid-root program. The next time anyone runs su, the kernel runs the shellcode with root privileges.

The on-disk file is never touched. File integrity monitoring tools like AIDE and Tripwire see nothing because, on disk, /usr/bin/su is bit-for-bit identical to the day it was installed. The corrupted version exists only in the kernel’s memory – which is also the version the kernel actually executes.

Why this is worse than most kernel bugs

There are kernel privilege escalation vulnerabilities every year. Most never make news. This one matters more than usual for five reasons.

It is deterministic. Most kernel exploits involve a race – the attacker has to win a timing window, often retrying thousands of times before catching the kernel in the right state. Copy Fail does not race. The bug is a straight-line logic error: do these steps, get root. First try.

It is universal. The vulnerable code shipped in Linux 4.14 in 2017. Every kernel released since then, up to the fixed versions, contains it. The proof-of-concept exploit, written by Theori, is 732 bytes of Python using only the standard library, with no compiled components. It works without modification on Ubuntu, Debian, RHEL, SUSE, Amazon Linux, and pretty much every other Linux distribution from the last nine years. You do not need to know what kernel version is running. You do not need to leak any addresses. You just run the script.

It is stealthy. Because the corruption lives in the page cache and never gets written back to disk, conventional file integrity monitoring misses it. A reboot clears the corruption (page cache is volatile), which means a server compromised this way looks clean again after the first reboot – while the attacker still has whatever persistence they established during the window they had root.

It crosses container boundaries. On Kubernetes nodes, on Docker hosts, on any system where multiple workloads share a kernel, the page cache is shared. A compromised pod can corrupt the page-cache copy of a setuid binary used by another pod, or by the host. This makes Copy Fail effectively a one-shot container escape on a huge population of cloud platforms.

It is being exploited in the wild. CISA listed it as a Known Exploited Vulnerability on May 1. Microsoft’s security team reports observing exploitation attempts. Palo Alto’s Unit 42 has published detection signatures. This is not theoretical.

Who actually needs to worry

Not everyone running Linux is in equal danger. Copy Fail requires the attacker to already have a normal user account on the target system. If you run a personal laptop with no other users and no remote shell access, the only way someone exploits this on your machine is if they are already running code on it – in which case you have a different problem.

The systems where this bug genuinely matters:

Multi-tenant hosting. Shared hosting providers, VPS providers, and any platform where unrelated customers share a server. A single compromised customer account becomes a path to compromise the entire host – which means every other customer on that host. Hostney is in this category, which is why we patched immediately.

Kubernetes and container platforms. Workloads in pods run with non-root users by design. Until now, that was a meaningful security boundary. Copy Fail collapses it. Any pod that gets compromised – through an unrelated application vulnerability, a supply-chain attack on a base image, anything – becomes a path to corrupting setuid binaries on the host node.

CI/CD systems. Build systems that run code from pull requests. The pull request runs as a normal user inside a container. Copy Fail makes that user root on the build host. Every secret on that host is exposed.

Cloud function platforms and SaaS. Anywhere customer code runs in a sandbox alongside other customer code on the same host kernel. The sandbox boundary leaks.

Single-tenant servers with multiple users. Old-school shared servers where a team has individual logins. One compromised developer account becomes a root compromise of everyone’s work.

If your Linux box is a single-tenant server you are the only user of, Copy Fail is still a vulnerability – any code an attacker manages to run on it becomes root code. But it is not the standalone-disaster scenario it is for the categories above.

What you should actually do

The fix is straightforward in principle and slightly annoying in practice, depending on how you maintain your servers.

Step 1: Patch the kernel. The fixed kernel versions upstream are 6.18.22, 6.19.12, and 7.0. Every major distribution has either shipped patched kernel packages or has published a timeline. Run your distribution’s update tool, install the new kernel, reboot.

Check your distribution’s own advisory page for the current status:

Ubuntu: https://ubuntu.com/security/CVE-2026-31431
Red Hat: https://access.redhat.com/security/cve/CVE-2026-31431
SUSE: https://www.suse.com/security/cve/CVE-2026-31431
Debian: https://security-tracker.debian.org/tracker/CVE-2026-31431
Azure Kubernetes Service: see the AKS release notes for the latest patched node-image VHD.

These pages are the authoritative source – they update in real time as patches roll out, which is more reliable than any list a blog post can publish.

Step 2: If you cannot patch immediately, disable the vulnerable module. The bug lives in a kernel module called algif_aead. You can prevent the module from loading with two commands, run as root:

echo "install algif_aead /bin/true" > /etc/modprobe.d/disable-algif.conf
modprobe -r algif_aead || true

The first line tells modprobe to run /bin/true, a no-op, instead of loading the module. The second unloads it if it is currently loaded; the || true keeps the command safe to run from a script even if the module is not present.
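To confirm the workaround took effect, something like the following sketch can help. It assumes the standard locations /proc/modules and /etc/modprobe.d; the function names module_loaded and module_blocked are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers for checking the algif_aead workaround.

# Succeeds if algif_aead appears in the module list (default: /proc/modules).
module_loaded() {
    modules_file="${1:-/proc/modules}"
    grep -q '^algif_aead ' "$modules_file" 2>/dev/null
}

# Succeeds if a .conf file in the directory (default: /etc/modprobe.d)
# contains an "install algif_aead ..." override.
module_blocked() {
    conf_dir="${1:-/etc/modprobe.d}"
    grep -qsE '^[[:space:]]*install[[:space:]]+algif_aead[[:space:]]' "$conf_dir"/*.conf
}

if module_loaded; then
    echo "algif_aead is still loaded"
else
    echo "algif_aead is not loaded"
fi

if module_blocked; then
    echo "an install override for algif_aead is present"
else
    echo "no install override for algif_aead found"
fi
```

Both checks should pass before you consider the workaround in place: the override only stops future loads, so a module that was already loaded stays loaded until it is removed or the machine reboots.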

This disables AF_ALG access to the AEAD crypto path entirely. Some context for what you give up: almost nothing in normal use of Linux relies on this. Disk encryption (LUKS, dm-crypt), IPsec, kernel TLS, and the kernel keyring use the kernel’s internal cryptographic paths, and SSH and the common TLS libraries do their cryptography in userspace by default; none of them depend on the AF_ALG interface. The only programs that break are ones that explicitly use AF_ALG – which is rare in a default server install but not impossible. Some appliance images, hardware-accelerated crypto setups, libkcapi-based tooling, and a handful of IPsec userspace helpers do use it. On a general-purpose Linux server it is almost always safe; on an embedded or appliance system, check before disabling. In any case, the risk of disabling this module is much lower than the risk of leaving it enabled and unpatched on a multi-tenant host.

Step 3: For container platforms, add AF_ALG to your seccomp deny list. If you operate Kubernetes or any container platform where workloads should not be opening cryptographic sockets in the first place, deny socket(AF_ALG, ...) at the seccomp layer so even unpatched kernels are not exposed by user workloads.
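As a rough illustration of such a rule, the sketch below prints a minimal seccomp profile in the JSON format used by Docker and by Kubernetes localhost profiles. It rests on a few assumptions: AF_ALG is address family 38 on Linux, the workloads are 64-bit (the 32-bit x86 socketcall path is not covered), and the function name print_profile and the file name deny-af-alg.json are made up for this example. The profile allows everything else, so in a real deployment you would more likely merge the rule into the profile you already run.

```shell
#!/bin/sh
# Print a minimal seccomp profile that makes socket(AF_ALG, ...) fail.
# AF_ALG is address family 38; the rule matches on the first argument
# of socket(2). Every other syscall is allowed by this sketch.
print_profile() {
    cat <<'EOF'
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": ["socket"],
      "action": "SCMP_ACT_ERRNO",
      "args": [
        { "index": 0, "value": 38, "op": "SCMP_CMP_EQ" }
      ]
    }
  ]
}
EOF
}

# Example use (file name is arbitrary):
#   print_profile > deny-af-alg.json
#   docker run --security-opt seccomp=deny-af-alg.json <image>
```

The profile is printed rather than written to a fixed path so it can be reviewed or piped into whatever tooling distributes profiles to your nodes.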

Step 4: Treat anything that was compromised before patching as compromised, period. If you operate a shared platform and an attacker had a foothold during the window between disclosure (April 29) and your patch landing, assume they used Copy Fail to escalate to root. The deterministic, no-trace nature of this bug means you cannot tell from logs whether it was used. Recycling nodes is the safe response.

How a 9-year-old bug stayed hidden

Three things had to be true for Copy Fail to exist, and all three were independently reasonable:

The authencesn algorithm was added years ago to support IPsec extended sequence numbers. It writes 4 bytes of scratch space past the end of its primary output. Correct, documented, fine.
Later, support for AEAD ciphers was added to algif. The userspace cryptographic interface gained the ability to ask for these algorithms. Correct, documented, fine.
In 2017, an in-place optimization was added to avoid an unnecessary memory copy in the AEAD path. The patch reviewers checked that the input buffer was the right size; nobody traced through what happened when the input buffer came from splice()-mapped page-cache pages with authencesn’s 4-byte scratch quirk. Correct in isolation, fine for every test case anyone ran.

Each commit, on its own, was right. The intersection of all three created the bug. This is the failure mode behind a meaningful percentage of long-lived security bugs – no individual change was wrong, but the combined behaviour of correct individual changes drifted into a place no human reviewer had a model for.

There is one more thing worth noting about how Copy Fail was found. According to Theori’s own writeup, the bug was discovered by an AI-assisted analysis tool called Xint Code in roughly an hour, after a single operator prompt. That detail matters because it suggests we are at the start of a wave of long-buried bugs being surfaced by tooling that can finally read enough of the kernel at once to spot the kind of cross-feature interaction that nobody noticed for nine years. Copy Fail probably will not be the last of these.

It is also why “we have read all the code” is not a substitute for fuzzing, sandboxing, and least-privilege defaults. Android, by way of comparison, is not vulnerable to Copy Fail – not because the Android kernel does not have the bug (it does), but because Android’s SELinux policy denies AF_ALG socket creation to every process except one specific debug helper. The bug exists; the door is bolted.
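To see whether that door is bolted for a given process on your own systems, a small probe along these lines can be run inside the container or user session you care about. It is a sketch under assumptions: python3 is installed (it is used only because plain shell cannot call socket(2)), and the function name af_alg_status is made up for this example. The probe only creates and closes a socket; it does not bind to any algorithm.

```shell
#!/bin/sh
# af_alg_status: prints "open", "blocked" or "unknown".
#   open    - this process can create an AF_ALG socket
#   blocked - the kernel, seccomp or an LSM refused the socket
#   unknown - python3 is missing or was built without AF_ALG support
af_alg_status() {
    command -v python3 >/dev/null 2>&1 || { echo unknown; return 0; }
    python3 - <<'EOF'
import socket
try:
    socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0).close()
    print("open")
except AttributeError:
    print("unknown")
except OSError:
    print("blocked")
EOF
}

af_alg_status
```

A result of "open" only shows that the AF_ALG family is reachable from this process, not that the kernel is vulnerable; a patched kernel also reports "open" unless a deny rule is in place.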

What we did at Hostney

We patched all servers in the fleet. That is the entire Hostney-specific portion of this story.

If you are wondering whether your server is patched, the simplest check on most distributions is:

uname -r

Compare the output to your distribution’s advisory page. If the running kernel is at or above the fixed version listed there, you are patched. If you have not rebooted since the kernel update, you are not patched yet – a patched kernel package on disk does nothing until the machine boots into it.
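For kernels that follow upstream version numbers, the comparison can be sketched as below, using only the fixed versions named earlier in this article (6.18.22, 6.19.12, and 7.0). Treat it as a rough aid, not a verdict: distribution kernels backport fixes without changing the upstream number, so anything outside those series is reported as unknown and the advisory page stays authoritative. The function names are made up for this example, sort -V (GNU coreutils or BusyBox) is assumed, and the reboot check assumes installed kernels appear as directories under /lib/modules.

```shell
#!/bin/sh
# version_ge A B: succeeds if dotted version A is >= B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# upstream_status RELEASE: prints "fixed", "vulnerable" or "unknown".
upstream_status() {
    base="${1%%-*}"    # drop the distro suffix: 6.18.25-generic -> 6.18.25
    case "$base" in
        6.18|6.18.*) min=6.18.22 ;;
        6.19|6.19.*) min=6.19.12 ;;
        *)
            # Outside the 6.18 and 6.19 series, only 7.0 and later is
            # known to be fixed; everything else needs the advisory page.
            if version_ge "$base" 7.0; then echo fixed; else echo unknown; fi
            return 0 ;;
    esac
    if version_ge "$base" "$min"; then echo fixed; else echo vulnerable; fi
}

# newest_installed [DIR]: newest kernel release with a modules directory.
newest_installed() {
    ls "${1:-/lib/modules}" 2>/dev/null | sort -V | tail -n 1
}

running="$(uname -r)"
echo "running kernel $running: upstream status $(upstream_status "$running")"

newest="$(newest_installed)"
if [ -n "$newest" ] && [ "$newest" != "$running" ]; then
    echo "newest installed kernel is $newest: a reboot may be pending"
fi
```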

The takeaway

Copy Fail is not a sophisticated attack. It is a four-byte write into the kernel’s memory, in a place nobody thought a user could reach, that has been waiting in plain sight since 2017. Its lessons are old ones: defence in depth matters, default-deny matters, the page cache is an attack surface, and in-place optimizations are dangerous in code paths where an attacker can control the input source.

Patch your kernel. Reboot. If you cannot patch immediately, disable the module. If you operate anything where untrusted users share a kernel – hosting, containers, CI – this should be the highest-priority item on your list this week. The window of “patches available, exploitation tooling public” is the window in which most of the damage from any major vulnerability happens.
