
WordPress high availability and high traffic hosting

Learning Center | Hostney (hostney.com) | April 20, 2026 | 12 min read

Short answer: high availability (HA) and high traffic are two different problems, and most sites searching for HA actually need the high-traffic solution. High traffic is solved with aggressive caching, a CDN, and a well-tuned single node. High availability is solved with redundancy across multiple nodes so one failing does not take the site down. Confusing the two leads to overcomplicated setups that cost more and break more than they prevent.

This guide explains what each actually means, the WordPress stack components that need redundancy for true HA, when you actually need HA versus just better caching, and how a properly architected hosting stack covers most real-world scale without the cost and complexity of enterprise HA.

High availability vs high traffic

These terms get used interchangeably in hosting marketing, but they describe different problems.

High traffic means your site gets a lot of requests per second. A news site during a breaking story, a store during a product launch, a blog post that hits the front page of Hacker News. The question is: can your infrastructure serve all those requests without timing out?

High availability means your site stays online even when something fails. A disk crashes. A network switch goes down. A bad deploy takes the origin server offline. The question is: does the site keep serving visitors while you fix it?

A site can be one without the other. A static marketing page served from a CDN is highly available (edge nodes everywhere, any one failing is transparent) but not handling meaningful traffic. A single VPS running an e-commerce store can handle tons of traffic if cached well, but it is not highly available – reboot the box and the store is down.

The two needs do overlap. True enterprise WordPress hosting (for sites like university portals, government services, Fortune 500 corporate sites) needs both. But for the vast majority of WordPress sites – even ones getting millions of page views per month – the answer is “solve the traffic problem, and redundancy at the CDN layer covers most availability needs.”

What "high traffic" actually looks like

Before assuming you need HA infrastructure, measure what your site is actually doing and what it could do.

A well-tuned single node running OpenResty + PHP-FPM + object caching can serve:

  • Cached requests: 2,000 – 10,000 requests per second from the edge cache, depending on request size and hardware. Most hits never touch PHP.
  • Uncached PHP requests: 50 – 300 requests per second, depending on what the plugins are doing. WooCommerce checkout is slower than a marketing page.
  • Database-heavy requests: 20 – 100 per second before the database becomes the bottleneck.

At those numbers, a single server handles hundreds of thousands to millions of daily visitors without breaking a sweat – provided most traffic hits the cache. The most common mistake is measuring traffic in visits per month, dividing by the seconds in a month, getting a tiny average, and concluding there is nothing to plan for. Traffic is bursty. A site averaging 10 req/s might spike to 200 req/s during a newsletter send or a viral share.
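The capacity arithmetic is easy to sanity-check. Here is a back-of-envelope sketch in Python; the 20x burst multiplier and 95% cache-hit ratio are illustrative assumptions, not measurements from any real site:

```python
def average_rps(monthly_pageviews: int) -> float:
    """Naive average request rate implied by a monthly page-view count."""
    return monthly_pageviews / (30 * 24 * 3600)

def peak_rps(monthly_pageviews: int, burst_multiplier: float = 20.0) -> float:
    """Peak-rate estimate: size for a multiple of the average, not the
    average itself, because real traffic arrives in bursts."""
    return average_rps(monthly_pageviews) * burst_multiplier

def origin_rps(peak: float, cache_hit_ratio: float) -> float:
    """Requests per second that actually reach PHP after the edge cache."""
    return peak * (1.0 - cache_hit_ratio)

# 3M page views/month averages ~1.2 req/s, but the spike is what matters:
avg = average_rps(3_000_000)        # ~1.2 req/s
peak = peak_rps(3_000_000)          # ~23 req/s at an assumed 20x burst
uncached = origin_rps(peak, 0.95)   # ~1.2 req/s of those ever touch PHP
```

Even a 20x burst on a mid-sized site leaves the origin doing roughly one uncached PHP request per second, which is why the cache-hit ratio matters far more than raw hardware.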

What kills sites during traffic spikes is almost never the hardware. It is:

  • A caching plugin misconfigured so logged-in users bypass cache (if 20% of traffic is logged in and skips the page cache, origin load can jump an order of magnitude compared with a fully cached site)
  • WooCommerce cart fragments fired on every page (forces AJAX on every request, bypasses page cache)
  • Third-party scripts that block rendering (not a server issue, but users bounce before the spike matters)
  • Database queries that were fine at 10 rows and are catastrophic at 10,000 (your traffic grew but the queries did not scale)

Fix those before buying bigger servers. See WordPress database optimization and how to speed up WordPress for the pre-HA optimization work.
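The logged-in-bypass effect is easy to quantify. A sketch of the arithmetic (the 98% anonymous cache-hit ratio is an assumed figure for illustration):

```python
def origin_load(total_rps: float, logged_in_fraction: float,
                anon_hit_ratio: float = 0.98) -> float:
    """Requests/sec reaching PHP when logged-in traffic bypasses the page
    cache entirely and anonymous traffic almost always hits it."""
    logged_in = total_rps * logged_in_fraction                      # never cached
    anon_miss = total_rps * (1 - logged_in_fraction) * (1 - anon_hit_ratio)
    return logged_in + anon_miss

# A 200 req/s spike, fully anonymous: ~4 req/s reach PHP.
# The same spike with 20% logged in: ~43 req/s, roughly a 10x jump.
```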

The HA stack: what needs to be redundant

If you have actually outgrown a single-node setup, here is what “high availability WordPress” means in infrastructure terms. Every one of these components is a single point of failure in a vanilla WordPress install, and each needs its own redundancy strategy.

Web server + PHP

Multiple web nodes sit behind a load balancer. Any one node failing drops capacity but does not take the site down. The load balancer itself needs to be redundant (or managed by a cloud provider that handles that for you – AWS ALB, Cloudflare Load Balancer, etc.).

The gotcha: WordPress was designed around “one machine runs everything.” Sticky sessions are not required for most WordPress traffic, but plugins that rely on PHP session files (some form builders, some e-commerce add-ons) need session data shared across nodes. Usually via Redis, sometimes via shared storage.
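The failover behaviour behind a load balancer can be sketched in a few lines. This is a toy model of health-checked routing, not how any particular balancer is implemented:

```python
import itertools

class RoundRobinBalancer:
    """Toy round-robin picker that skips upstreams marked unhealthy,
    modelling (very loosely) what a health-checked load balancer does:
    one node failing drops capacity, not availability."""

    def __init__(self, nodes):
        self.healthy = {node: True for node in nodes}
        self._ring = itertools.cycle(nodes)

    def mark(self, node, is_healthy):
        self.healthy[node] = is_healthy

    def pick(self):
        # Try each node at most once per call before giving up.
        for _ in range(len(self.healthy)):
            node = next(self._ring)
            if self.healthy[node]:
                return node
        raise RuntimeError("no healthy upstream nodes")
```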

Database

This is the hardest part of HA WordPress. The database stores all the content, all the user data, all the orders, all the options. A single MySQL server going down takes the site down even if you have ten web nodes.

Options in increasing order of complexity:

  • Primary-replica replication: one writeable primary, one or more read replicas. If the primary dies, promote a replica. Downtime during promotion, typically 30 seconds to a few minutes depending on tooling. WordPress core does not support read-replica splitting out of the box – plugins like HyperDB can, but they add complexity.
  • Galera Cluster / Group Replication: synchronous multi-primary replication. Any node can accept writes, nodes stay in sync. More expensive (latency on writes, you pay for consistency) but no failover downtime. Operationally demanding – split-brain scenarios are a real risk.
  • Managed database service: let a cloud provider handle it (Amazon RDS, Google Cloud SQL). Simpler but expensive, and you still need to configure WordPress to talk to the failover endpoint.

For most WordPress sites, a single well-backed-up database with a good restore procedure is enough. HA database starts to matter when uptime SLAs are contractual and minutes of downtime cost real money.
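The read/write-splitting idea behind tools like HyperDB reduces to a routing rule plus one consistency caveat. A minimal illustration (hypothetical function, not HyperDB's actual API):

```python
import re

# Statements that are safe to send to a read replica.
READ_ONLY = re.compile(r"^\s*(SELECT|SHOW|DESCRIBE|EXPLAIN)\b", re.IGNORECASE)

def route_query(sql: str, session_has_written: bool = False) -> str:
    """Send reads to a replica and everything else to the primary. After a
    write, keep the session pinned to the primary so it reads its own
    writes despite replication lag."""
    if READ_ONLY.match(sql) and not session_has_written:
        return "replica"
    return "primary"
```

The pinning flag is the part plugins get wrong most often: without it, a user can save a post and immediately be shown the stale pre-save copy from a lagging replica.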

File storage (wp-content/uploads)

Multiple web nodes means the media library needs to be on shared storage. Three common approaches:

  • Network filesystem: NFS, GlusterFS, CephFS. Works but introduces latency on every media load, and the filesystem itself becomes a single point of failure unless it is clustered.
  • Object storage: upload to S3 (or compatible – R2, Wasabi, Backblaze B2), serve from a CDN. Requires a plugin to redirect uploads there, but scales nearly infinitely and has built-in redundancy.
  • Replicated storage: something like SeaweedFS or MinIO between nodes. More control, more complexity.

Object storage is usually the right answer for HA WordPress. The migration from “uploads on disk” to “uploads in S3” is a one-time project that simplifies everything downstream.
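Moving uploads to object storage is mostly a path-mapping exercise plus an upload call. A sketch, assuming an S3-compatible endpoint; the endpoint and bucket names are placeholders:

```python
import mimetypes
from pathlib import Path

def object_key(local_path: str) -> str:
    """Map a local media path to its object-storage key, preserving the
    year/month layout WordPress already uses under wp-content/uploads."""
    parts = Path(local_path).parts
    i = parts.index("uploads")  # raises ValueError outside the uploads tree
    return "/".join(("wp-content", "uploads") + parts[i + 1:])

def content_type(local_path: str) -> str:
    return mimetypes.guess_type(local_path)[0] or "application/octet-stream"

# The actual upload with boto3 against an S3-compatible endpoint would look
# roughly like this (endpoint and bucket names are placeholders):
#
#   s3 = boto3.client("s3", endpoint_url="https://<endpoint>")
#   s3.upload_file(path, "media-bucket", object_key(path),
#                  ExtraArgs={"ContentType": content_type(path)})
```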

Session and object cache

If you are running Redis for object cache (and you should be for any busy site), Redis itself needs HA. Redis Sentinel for failover, or Redis Cluster for sharding. Managed Redis services (AWS ElastiCache, Google Memorystore) handle this for you.

WordPress will work with a single Redis instance, but if that instance goes down, the site either slows to a crawl (if it falls back to database queries) or errors out (depending on the object cache plugin configuration).
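That fallback behaviour is worth spelling out. Below is a simplified, hypothetical sketch of a cache layer that degrades instead of erroring when Redis disappears; real object-cache drop-ins add timeouts, retries, and reconnection logic on top of this:

```python
class FallbackObjectCache:
    """Sketch of the degrade-gracefully behaviour an object-cache layer
    needs: if Redis is unreachable, fall through to the slow path (the
    database) instead of erroring out. `redis_client` is any object with
    get/set that raises ConnectionError when the server is down."""

    def __init__(self, redis_client):
        self.redis = redis_client
        self.degraded = False  # flip once so we stop hammering a dead Redis

    def get(self, key, slow_lookup):
        if not self.degraded:
            try:
                hit = self.redis.get(key)
                if hit is not None:
                    return hit
            except ConnectionError:
                self.degraded = True
        value = slow_lookup()  # e.g. the MySQL query the cache was hiding
        if not self.degraded:
            try:
                self.redis.set(key, value)
            except ConnectionError:
                self.degraded = True
        return value
```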

DNS and the CDN layer

DNS needs redundancy too, though most managed DNS providers (Cloudflare, Route53, NS1) handle this by default. The CDN in front of the origin is often the most effective availability layer – even if the origin goes completely down, cached pages continue serving from the edge for the duration of their TTL. A well-configured CDN with long cache TTLs and stale-while-revalidate can hide minutes to hours of origin downtime from most visitors.

The simpler path: caching plus a CDN handles most "HA" needs

Here is what most sites are missing when they think they need HA infrastructure.

A site with:

  • A well-tuned origin (OpenResty + PHP-FPM + Redis object cache + MySQL)
  • Aggressive page caching with long TTLs
  • A CDN in front (Cloudflare, Fastly, Bunny) caching the HTML output
  • Stale-while-revalidate configured so expired cache is served while the origin regenerates
  • Automated backups tested regularly

…survives almost everything that is not a full-region cloud outage. If the origin server goes down, the CDN continues serving cached content for hours. During that window, you fix the origin. Visitors never notice unless they try to log in, submit a form, or perform some other uncached action.
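The mechanism behind that window is a pair of Cache-Control extensions defined in RFC 5861: stale-while-revalidate and stale-if-error. A sketch of the header and the outage math, with illustrative TTL values:

```python
def cache_control(max_age: int, swr: int, sie: int) -> str:
    """Cache-Control header that lets a CDN keep serving during an origin
    outage: stale-while-revalidate hides slow refreshes, stale-if-error
    (both from RFC 5861) keeps serving stale copies on 5xx and timeouts."""
    return (f"public, max-age={max_age}, "
            f"stale-while-revalidate={swr}, stale-if-error={sie}")

def covered_by_cache(outage_s: int, max_age: int, sie: int,
                     age_at_outage: int = 0) -> bool:
    """Will a cached copy remain servable for the whole outage?"""
    return age_at_outage + outage_s <= max_age + sie

# A 1-hour TTL with 24 hours of stale-if-error hides a 2-hour origin outage:
header = cache_control(3600, 60, 86400)
```

CDN support for these directives varies, so check your provider's documentation rather than assuming the header alone is enough.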

This is not “real” HA – the origin is still a single point of failure, and uncached traffic dies until you fix it. But for most sites, “most visitors see the cached version during an origin outage” is functionally indistinguishable from HA, and it costs a fraction of what a multi-node setup costs to operate.

The traffic threshold where this approach stops being enough is surprisingly high. A single well-tuned origin behind a CDN can absorb traffic levels that would otherwise push you toward enterprise-grade HA infrastructure. The CDN absorbs the cacheable requests, the origin handles the uncached ones, and the combination is robust enough for almost every WordPress use case.

When you actually need real HA

HA infrastructure starts to make sense when any of the following is true:

  • Contractual uptime SLA of 99.99% or higher. That is 52 minutes of downtime per year. A single-node setup cannot realistically guarantee that – one kernel panic or disk failure blows the budget for a decade.
  • Revenue loss per minute of downtime is in the thousands of dollars. A major retail site during Black Friday, a SaaS product used by enterprise customers, a payment processor. If a 10-minute outage costs more than the annual HA hosting budget, HA pays for itself.
  • Most traffic is uncached. Logged-in WordPress users (membership sites, LMS, multisite admin areas), WooCommerce checkouts, real-time personalization. If the majority of requests cannot be served from cache, your origin is doing all the work and origin redundancy becomes essential.
  • Write traffic is high and bursty. A comment system, a live-blog, a voting platform. When writes are constant and cannot be served from cache, database HA starts to matter.
  • Compliance or regulatory requirements. Some frameworks (HIPAA, certain government contracts, financial services) require documented redundancy and failover procedures.

If none of the above applies, a well-tuned single-node WordPress with a CDN is almost certainly what you want. Do not pay for HA you will never use.
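Before committing to an SLA tier, it is worth running the downtime arithmetic explicitly:

```python
def downtime_budget_minutes(uptime_sla: float) -> float:
    """Allowed downtime per year, in minutes, for a given uptime fraction."""
    return 365 * 24 * 60 * (1 - uptime_sla)

# downtime_budget_minutes(0.999)   -> ~525.6 min/year (one good node can do it)
# downtime_budget_minutes(0.9999)  -> ~52.6 min/year (one incident blows the year)
# downtime_budget_minutes(0.99999) -> ~5.3 min/year (automated-failover territory)
```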

Hosting options for HA WordPress

Based on what you need, the options fall into a few camps.

Self-managed

You rent VMs (DigitalOcean, Linode, AWS EC2), configure the stack yourself, manage failover yourself. Most flexibility, cheapest at scale if you have in-house expertise, highest operational burden. Tools like HyperDB and WP-CLI help with the database and automation pieces, but you are on the hook for every failure mode.

Managed HA-WordPress platforms

Vendors like Pantheon, WP Engine (for large plans), Kinsta, and Pressable offer multi-node WordPress hosting with database failover built in. You pay more per site, but the HA infrastructure is the product. Suitable for sites that need “our HA is someone else’s problem.”

Enterprise cloud providers

AWS, Google Cloud, Azure – you build WordPress on top using managed services (RDS, ElastiCache, S3, ALB). Most flexibility at the cost of complexity. Usually overkill for standalone WordPress sites, but the right fit for sites that are part of larger infrastructure footprints.

How Hostney handles high traffic

Every site on Hostney runs in its own isolated container on a well-tuned web node. The stack is built for sustained throughput without the complexity overhead of multi-node orchestration:

  • OpenResty with edge caching serves cached requests without touching PHP. This is what absorbs traffic spikes – a viral post hitting the homepage is served from the edge at wire speed.
  • PHP-FPM per-account containers with per-version isolation. A runaway plugin in one account cannot affect any other site on the server.
  • HTTP/3 (QUIC) across all web servers. Reduced connection overhead matters most on mobile and international traffic, which is where real-world spikes tend to hit.
  • Redis object caching handles WordPress sessions, transients, and frequently-queried data. This is what keeps logged-in users fast when the page cache does not apply.
  • Edge bot protection runs in OpenResty itself, filtering malicious traffic (scrapers, exploit scanners, brute-force probes) before it ever reaches PHP. When a spike is the result of an attack rather than real users, the origin never sees it.
  • DDoS hardening and rate limiting at the nginx layer.
  • HUC proxy layer handling load distribution across the infrastructure.

This stack handles the traffic patterns of real-world WordPress sites – publishers with viral posts, stores running campaigns, membership sites with logged-in sessions – without needing to spin up dedicated clusters. Edge caching plus proper origin tuning scales further than most sites assume. The architecture prioritizes the 99% case: get the edge right, isolate the origin properly, and the server handles everything most sites will ever throw at it.

What is next: origin plus edge caching servers

The next step in the Hostney architecture is multi-region availability: an origin server backed by multiple caching edge servers. Writes and PHP execution stay on the origin. Edge servers in different locations serve the cacheable reads with synchronized caching, so visitors get responses from whichever edge is closest. An origin blip continues to serve from every edge location while the issue is resolved.

This matches how WordPress traffic actually splits – heavily read-dominated, with writes concentrated in admin and checkout flows. It delivers the availability benefit of multi-node without the operational complexity of cross-region database replication, and it extends the edge-first philosophy the platform already runs on.

Common mistakes when planning for scale

Buying HA infrastructure before optimizing the origin. A single well-tuned node with proper caching outperforms three undertuned nodes every time, and costs less to run. Measure what your current setup can actually handle before assuming you need more nodes.

Assuming traffic means “HA needed.” High traffic is solved by caching. High availability is solved by redundancy. If the origin node rarely fails and visitors rarely see the origin anyway (because everything is cached), you do not have an availability problem.

Overpaying for managed HA-WordPress tiers. Enterprise-grade HA WordPress hosting is genuinely expensive because the infrastructure is genuinely expensive. If your site does not have the uptime requirements or revenue-per-minute that justifies the cost, a standard managed host plus a CDN is a better value.

Ignoring the CDN layer. A $20/month Cloudflare Pro plan or equivalent delivers more availability benefit than upgrading from a $50 VPS to a $200 VPS. Fix the edge first, then worry about origin redundancy.

Using “HA” to mean “good backups.” Backups are not HA. HA keeps the site running during a failure. Backups let you recover from a disaster but always involve downtime. You need both, for different reasons.

Thinking “the cloud” is automatically HA. Running WordPress on a single AWS EC2 instance is not any more available than running it on a single VPS anywhere else. HA requires multi-node architecture, which you have to build (or pay someone to build).

Summary

High traffic and high availability are different problems. Most sites chasing HA actually need caching plus a CDN, and the combination handles nearly any traffic pattern a WordPress site will realistically encounter. True HA – multi-node web servers, database failover, shared storage, session sharing – is genuinely useful for sites with strict uptime SLAs, high write traffic, or enterprise compliance requirements. For everyone else, it is overkill.

Hostney is built around edge-first architecture: OpenResty caching at the edge, isolated PHP-FPM containers per account, HTTP/3, Redis object caching, and bot protection that filters attack traffic before it ever reaches the origin. That combination handles the traffic patterns of real-world WordPress sites, and the origin-plus-edge-caching-servers model on our roadmap extends it further into multi-region availability.

If you are in the early stages of planning for scale, start by optimizing what you have. Fix caching, add a CDN, clean up the database, profile the slow plugins. You will discover your existing infrastructure can handle much more than you assumed, and when you eventually do need the next tier of redundancy, you will know exactly what the bottleneck is rather than buying infrastructure to solve imaginary problems.
