Rate limiting is the first tool most people reach for when automated traffic becomes a problem. Set a threshold, count requests per IP, block anything that exceeds it. It is simple, it is well understood, and every web server supports it out of the box.
It is also not enough. Not even close.
Rate limiting solves one specific problem: a single IP sending too many requests too quickly. That describes the simplest possible bot attack. The kind that a teenager with curl could run. The attacks that actually waste server resources in production look nothing like that.
This post explains what rate limiting actually catches, why it misses the majority of real-world bot traffic, and what it takes to fill the gaps. We are going to be specific because vague claims about “advanced bot protection” are not useful to anyone.
What rate limiting does well
Before we explain where rate limiting fails, it is worth being clear about where it works. Rate limiting is effective against a specific class of attacks:
Single-IP floods. An IP sending hundreds of requests per second to your site. This is the simplest bot pattern and rate limiting handles it perfectly. Set a threshold, exceed it, get blocked. Done.
Brute force from a single source. An IP hammering your login page with credential attempts at high speed. If your web server enforces a strict rate limit on the login endpoint, say 6 requests per minute, the attacker gets through 6 attempts and then hits a wall.
Simple denial of service. A single machine trying to overwhelm your server with volume. Rate limiting caps the damage any one source can do.
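The per-IP threshold logic behind all three cases can be sketched as a minimal fixed-window counter. This is an illustration of the mechanism, not a production implementation; the IP address and the 6-per-minute login limit mirror the numbers above.

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Per-IP fixed-window request counter (sketch; values are illustrative)."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counts = defaultdict(int)          # (ip, window index) -> count

    def allow(self, ip, now=None):
        now = time.time() if now is None else now
        key = (ip, int(now // self.window))
        self.counts[key] += 1
        return self.counts[key] <= self.limit

# The strict login limit from the text: 6 requests per minute per IP.
login = FixedWindowLimiter(limit=6, window_seconds=60)
results = [login.allow("203.0.113.9", now=1000.0 + i) for i in range(10)]
# The first 6 attempts pass; the rest hit the wall within the same window.
```

This is exactly the model every web server implements: cheap, stateless beyond a counter, and blind to everything except volume per source.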
These are real attacks and rate limiting stops them. If your web server does not have rate limiting configured, you should fix that. It is table stakes.
But it is also where the protection ends.
The three problems rate limiting cannot solve
Rate limiting operates on a single dimension: requests per time window, per IP. That design creates three structural blind spots that no amount of threshold tuning can fix.
Problem 1: Distributed attacks fly under every threshold
A botnet with 1,000 compromised devices, each sending 2 requests per minute to your site, generates 2,000 requests per minute of malicious traffic. Your rate limit is set to 120 requests per minute per IP. No single IP comes anywhere close to the threshold.
From rate limiting’s perspective, nothing is happening. One thousand IPs, each making a perfectly reasonable number of requests, each appearing to be a normal visitor. The aggregate load is crushing your server, but no individual source is misbehaving.
This is not a theoretical concern. Distributed attacks are the norm, not the exception. Botnets consist of thousands of compromised residential devices. Each one has a legitimate-looking IP address from a normal ISP. Each one sends a small number of requests. Rate limiting sees 1,000 well-behaved visitors. Your server sees 2,000 malicious requests per minute.
You cannot fix this by lowering the threshold. If you set the rate limit to 10 requests per minute per IP, you start blocking legitimate visitors who click through your site at a normal pace. Your contact form submissions fail. Your WooCommerce checkout breaks. Your admin dashboard becomes unusable. You have traded a bot problem for a usability problem.
The fundamental issue is that per-IP thresholds cannot detect coordination. They see each IP in isolation. Coordinated attacks are designed to look normal at the individual level.
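The arithmetic of the blind spot is easy to demonstrate. This sketch uses the numbers from the scenario above (1,000 IPs, 2 requests per minute each, a 120-per-minute threshold); the IP labels are placeholders.

```python
# 1,000 hypothetical botnet IPs, each sending 2 requests/minute,
# checked against a 120 requests/minute per-IP threshold.
PER_IP_LIMIT = 120
requests_per_ip = {f"bot-{i}": 2 for i in range(1000)}

blocked = [ip for ip, n in requests_per_ip.items() if n > PER_IP_LIMIT]
aggregate_load = sum(requests_per_ip.values())
# blocked stays empty while aggregate_load is 2,000 requests/minute:
# no individual source misbehaves, yet the server absorbs the full attack.
```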
Problem 2: Slow bots stay under the radar indefinitely
An attacker sending one request every 10 seconds to your login page generates 6 requests per minute. That is well within any reasonable rate limit. But over the course of a day, that is 8,640 login attempts from a single IP address.
Rate limiting does not care. The requests arrive slowly enough that no threshold is exceeded at any point. The attacker can run this for weeks. The requests consume PHP workers, generate database queries, and trigger password hashing operations, all at a rate that rate limiting considers normal.
This pattern is common with credential stuffing. Attackers have lists with billions of leaked passwords. They are not in a hurry. A slow, steady drip of login attempts from a single IP is invisible to rate limiting but still accomplishes the attacker’s goal over time.
You might think you can catch this with a longer time window. Instead of 120 requests per minute, set it to 500 requests per hour. But now you have introduced a different problem. A legitimate visitor who spends 20 minutes browsing your WooCommerce store, adding items to a cart, checking out, and viewing order confirmations might generate 200-300 requests in an hour. Bursty, normal human behavior starts hitting hourly limits that were designed to catch bots.
Longer windows do not help because human traffic is bursty and bot traffic can be arbitrarily slow. There is no time window where the two do not overlap.
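The drip arithmetic from the scenario above works out in two lines:

```python
# One login attempt every 10 seconds, as in the example above.
interval_s = 10
per_minute = 60 // interval_s            # 6/minute: under any reasonable per-IP limit
per_day = per_minute * 60 * 24           # 8,640 attempts/day from a single IP
```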
Problem 3: Intent is invisible to request counting
Rate limiting measures volume. It cannot measure intent.
An IP that sends 5 requests in a minute could be a human browsing your blog. It could also be a scanner that just checked /.env, /.git/config, /wp-config.php.bak, /phpinfo.php, and /server-status. Both look identical to rate limiting. Five requests. Well under the limit. No action taken.
But those five requests tell you everything you need to know. No human types those paths into a browser. A scanner probing for configuration files and exposed admin panels is not a visitor. It is a threat. And it revealed itself completely in five requests, all of which rate limiting allowed through without a second thought.
The same applies to login-focused traffic. If 100% of an IP’s requests over two hours are POST requests to /wp-login.php, that is a single-purpose brute forcer. A human logging into WordPress visits the login page, enters credentials, and then navigates the dashboard. They request CSS files, images, admin pages. A bot does none of that. It hits the login endpoint and nothing else.
Rate limiting cannot distinguish between “5 normal page views” and “5 vulnerability probes.” Both are 5 requests. The difference is what they are requesting and why, which is not something request counting can see.
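The difference is visible the moment you look at paths instead of counts. Here is a sketch of that comparison, using the trap paths named above; the helper function and the example sessions are illustrative.

```python
# Two 5-request sessions that rate limiting treats identically.
TRAP_PATHS = {"/.env", "/.git/config", "/wp-config.php.bak",
              "/phpinfo.php", "/server-status"}

def looks_like_scanner(paths):
    # One trap hit is enough to reveal intent, regardless of request volume.
    return any(p in TRAP_PATHS for p in paths)

human = ["/", "/blog/first-post", "/about", "/contact", "/blog/second-post"]
scanner = ["/.env", "/.git/config", "/wp-config.php.bak",
           "/phpinfo.php", "/server-status"]
# Both sessions are five requests; only path inspection separates them.
```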
What actually catches the traffic rate limiting misses
The attacks that bypass rate limiting are not exotic edge cases. They are the majority of real-world bot traffic. Distributed scanning, slow credential stuffing, and targeted probing are the baseline, not the exception.
Stopping them requires looking at dimensions that rate limiting ignores: what is being requested, how the client behaves, whether it looks like a real browser, and what it is doing across multiple sites over time.
Behavioral signals: what the client reveals about itself
A real browser and a bot script make the same HTTP request, but they do not look the same at the connection level.
Cookie behavior. Every response from our servers includes a small security cookie. A real browser stores it and sends it back on subsequent requests automatically. Most bots do not bother with cookies. An IP that sends 50 requests and never returns the cookie is almost certainly not a browser. Rate limiting cannot see this. It counts requests, not cookie headers.
Request headers. Browsers send a consistent set of headers: Accept-Language, Accept-Encoding, a Referer on navigational clicks. Bots often send minimal headers or inconsistent combinations. An IP sending hundreds of requests with no Accept-Language header and no Referer on any of them is following a pattern that humans do not produce.
TLS fingerprinting. The TLS handshake that establishes an encrypted connection includes details about the client’s capabilities: supported cipher suites, extensions, compression methods. Different HTTP libraries produce different TLS fingerprints. A request claiming to be Chrome 120 but presenting a TLS fingerprint that matches Python’s requests library is lying about what it is. Rate limiting never sees the TLS handshake at all.
None of these signals is definitive on its own. Someone using a privacy-focused browser might strip headers. A monitoring service might not return cookies. But when multiple behavioral signals align, they paint a picture that request counting never could.
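A sketch of how these signals might combine into a score. The point values, field names, and fingerprint IDs below are invented for illustration; they are not our production rules.

```python
KNOWN_BROWSER_TLS = {"chrome-120", "firefox-121", "safari-17"}   # placeholder IDs

def behavior_score(req):
    """Each missing browser-like signal adds points (values are illustrative)."""
    score = 0
    if not req.get("returned_cookie", False):
        score += 2                      # never returns the security cookie
    headers = req.get("headers", {})
    if "Accept-Language" not in headers:
        score += 1
    if "Referer" not in headers:
        score += 1
    if req.get("tls_fingerprint") not in KNOWN_BROWSER_TLS:
        score += 3                      # handshake does not match any real browser
    return score

browser = {"returned_cookie": True,
           "headers": {"Accept-Language": "en-US", "Referer": "https://example.com/"},
           "tls_fingerprint": "chrome-120"}
script = {"returned_cookie": False, "headers": {},
          "tls_fingerprint": "python-requests"}
# The browser scores 0; the script accumulates points on every dimension.
```

No single check is a verdict on its own, which is why the signals are summed rather than treated as individual blockers.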
Intent signals: what the request is actually doing
Some requests reveal intent immediately, regardless of volume.
Honeypot paths. No legitimate visitor requests /.env, /.git/config, /wp-config.php.bak, or /c99.php. These paths do not exist on any normal website. If an IP requests them, it is scanning for vulnerabilities. One request is enough to know. Rate limiting would need hundreds of requests from this IP before it intervened, by which point the scanner has already found what it was looking for and moved on.
Login-only traffic. An IP whose entire request history consists of POST requests to login endpoints, across multiple unrelated websites, is not a human who happens to be logging into a lot of sites. It is working through a credential list. The signal is not the rate of requests but the complete absence of any other activity.
Path entropy. A scanner cycling through a list of known exploit paths produces requests with high path diversity and high 404 rates. It requests /solr/admin, /actuator/health, /cgi-bin/test.cgi, /.aws/credentials, none of which exist on the target site. The combination of many unique paths and a high percentage of 404 responses is a strong signal of automated reconnaissance. Rate limiting sees the 404s but does not analyze the pattern.
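The path-entropy signal can be sketched as a combination of unique-path count and 404 ratio. The thresholds below are illustrative, not tuned values.

```python
def recon_signal(requests, min_unique=10, min_404_ratio=0.8):
    """Flag high path diversity combined with a high 404 rate (sketch)."""
    paths = [p for p, _ in requests]
    statuses = [s for _, s in requests]
    unique_paths = len(set(paths))
    ratio_404 = statuses.count(404) / len(statuses)
    return unique_paths >= min_unique and ratio_404 >= min_404_ratio

scanner = [(f"/probe-{i}", 404) for i in range(20)]      # 20 unique paths, all 404
shopper = [("/product/1", 200), ("/product/2", 200),
           ("/cart", 200), ("/checkout", 200)]
# The scanner trips the signal; normal browsing does not.
```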
Cross-site visibility: what one site cannot see alone
This is the gap that no single-site solution can fill.
An attacker sending 5 login attempts to your site does not trigger any alarm. But the same IP sending 5 login attempts to each of 200 different sites on the same hosting platform has made 1,000 login attempts total. From any individual site’s perspective, the traffic is trivial. From the platform’s perspective, it is an obvious credential stuffing campaign.
The same pattern applies to vulnerability scanning. A scanner that sends 3 requests to each of 500 sites has sent 1,500 probes across 500 targets while staying invisible to every single one of them individually. Only a system that aggregates signals across all targets can see the coordination.
This is why platform-level detection matters. A WordPress security plugin running on your site can only see traffic to your site. It cannot know that the same IP just scanned 499 other sites in the last hour. Rate limiting at the server level has the same limitation. It counts requests per IP to your server, not per IP across the fleet.
On our platform, when an IP accumulates signals across multiple sites, those signals compound into a threat score that reflects the full picture. An IP with a low score on any individual site can still reach enforcement thresholds when its behavior across all sites is considered together.
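The aggregation itself is simple once the signals flow to one place. This sketch mirrors the 200-site example above; the event feed format is hypothetical.

```python
from collections import defaultdict

# Hypothetical event feed: (site, ip, login attempts), as in the example above.
events = [(f"site-{s}", "203.0.113.7", 5) for s in range(200)]

per_site = defaultdict(int)      # what each site sees in isolation
fleet = defaultdict(int)         # what the platform sees in aggregate
for site, ip, attempts in events:
    per_site[(site, ip)] += attempts
    fleet[ip] += attempts
# Each site sees 5 trivial attempts; the fleet view shows a 1,000-attempt campaign.
```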
Same-endpoint detection: catching what rate limiting structurally misses
Standard rate limiting counts all requests from an IP together. But some attack patterns are only visible when you look at requests to a specific URL.
An IP making 15 requests per minute spread across different pages on your site is normal browsing. The same IP making 15 requests per minute all to /wp-login.php is a brute force attack. Rate limiting treats both identically because the total request count is the same.
Our servers track per-IP, per-URL request patterns separately from global rate limiting. If an IP hits the same endpoint repeatedly within a short window, regardless of its overall request rate, that pattern triggers escalation. This catches the slow credential stuffing that global rate limiting misses: an attacker sending one login attempt every four seconds flies under a per-IP rate limit but accumulates 15 hits to the same URL in a minute, which is a different and more specific signal.
This same mechanism catches content scrapers. A bot fetching the same product page 60 times in two minutes is not a customer comparing prices. It is harvesting your content. The overall request rate might be modest, but the per-URL concentration is abnormal.
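Per-endpoint tracking can be sketched as a sliding window keyed on (IP, URL) rather than IP alone. The 15-hit threshold and 60-second window mirror the example above and are illustrative, not our exact configuration.

```python
from collections import defaultdict, deque

class EndpointTracker:
    """Sketch of per-(IP, URL) sliding-window counting, kept separate from
    any global per-IP limit. Threshold and window are illustrative."""
    def __init__(self, threshold=15, window_s=60.0):
        self.threshold = threshold
        self.window_s = window_s
        self.hits = defaultdict(deque)          # (ip, url) -> recent timestamps

    def record(self, ip, url, now):
        q = self.hits[(ip, url)]
        q.append(now)
        while q and now - q[0] > self.window_s:
            q.popleft()                         # drop hits outside the window
        return len(q) >= self.threshold         # True -> escalate to a challenge

tracker = EndpointTracker()
# One login attempt every 4 seconds: under a global limit, but 15 hits
# to the same URL inside a minute.
flags = [tracker.record("198.51.100.5", "/wp-login.php", now=i * 4.0)
         for i in range(15)]
# The 15th hit to the same endpoint trips the escalation.
```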
The false positive problem
Rate limiting’s biggest practical weakness is not that it misses bots. It is that it hits humans.
Shared IP addresses are everywhere. Corporate offices route hundreds of employees through a single IP. Mobile carriers assign the same IP to thousands of devices. University networks, coffee shop WiFi, VPN exit nodes, all of these concentrate many real users behind one address.
Set your rate limit to 120 requests per minute per IP. An office with 50 people browsing your WooCommerce store at lunchtime generates 300 requests per minute from one IP. Rate limiting blocks the office. Your biggest potential customer gets a 429 error page instead of your product catalog.
Lower the limit to catch more bots and you block more humans. Raise it to reduce false positives and you let more bots through. There is no sweet spot because the problem is not the threshold. The problem is that request counting cannot distinguish between one bot making 200 requests and 50 humans making 4 requests each from the same IP.
Multi-signal detection does not have this problem in the same way. Fifty humans browsing your store from a shared office IP will all accept cookies, send normal browser headers, request a variety of pages, and present normal TLS fingerprints. A bot making 200 requests from the same IP will likely ignore cookies, send minimal headers, hit the same endpoints repeatedly, and present a non-browser TLS fingerprint. The request count is similar but everything else is different.
This is why behavioral signals reduce false positives instead of increasing them. They add dimensions that separate bot traffic from human traffic in ways that pure request counting cannot.
The escalation model: rate limiting as one layer, not the only layer
Rate limiting works best when it is the first layer, not the only one.
Here is how the layers interact on our servers when a bot starts attacking:
Layer 1: Rate limiting. The web server enforces per-IP request limits. The fastest, dumbest attacks get stopped here instantly. A bot blasting 500 requests per second gets a 429 response with no processing overhead. This layer is cheap to run and catches the obvious cases.
Layer 2: Per-endpoint detection. Requests that pass rate limiting are evaluated for per-URL concentration. An IP that stays under the global rate limit but hits the same endpoint repeatedly gets flagged and challenged with a proof-of-work puzzle. This catches slow brute force and content scraping that rate limiting misses.
Layer 3: Behavioral analysis. Every request is evaluated for browser-like behavior: cookie handling, request headers, TLS fingerprint, path patterns. An IP that accumulates enough non-browser signals gets challenged regardless of its request rate. This catches sophisticated bots that deliberately stay under rate limits and target different URLs.
Layer 4: Platform-wide scoring. All signals from all servers are aggregated into a per-IP threat score that updates on a regular cycle. An IP that looks fine on any single site but is scanning across dozens of sites gets caught here. Scores compound over time: an IP that was challenged and failed to solve it picks up more points than one that was never challenged.
Each layer catches what the previous layer misses. Rate limiting catches the fast floods. Per-endpoint detection catches the slow hammering. Behavioral analysis catches the sophisticated mimicry. Platform-wide scoring catches the distributed coordination.
Removing any one layer does not break the system, but it creates a gap that specific attack patterns can exploit. Rate limiting alone leaves the gaps covered by layers 2, 3, and 4 wide open.
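The four-layer flow above reduces to a short decision cascade. The point thresholds here are invented for illustration; the ordering (cheapest check first) is the point.

```python
def decide(rate_exceeded, endpoint_hammering, behavior_points, fleet_score):
    """Sketch of the layered escalation described above (thresholds illustrative)."""
    if rate_exceeded:                   # Layer 1: cheap, catches fast floods
        return "429"
    if endpoint_hammering:              # Layer 2: per-URL concentration
        return "challenge"
    if behavior_points >= 5:            # Layer 3: non-browser signals
        return "challenge"
    if fleet_score >= 10:               # Layer 4: cross-site threat score
        return "block"
    return "allow"
```

Each condition only fires on traffic the earlier, cheaper layers let through, which is what keeps the expensive checks affordable.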
What this means in practice
Consider three attack scenarios and how they play out with rate limiting alone versus a multi-layer approach:
Scenario 1: Distributed credential stuffing. A botnet of 2,000 residential IPs, each sending 3 login attempts to your WordPress site.
With rate limiting alone: 6,000 login attempts, zero IPs blocked. Every request is under the threshold. Your login page processes all 6,000 attempts, consuming PHP workers and database connections for each one.
With multi-layer detection: The IPs are flagged for login-only traffic patterns. The ones that fail to return the security cookie pick up behavioral points. The ones that are also targeting other sites on the platform pick up cross-site signals. Within one or two scoring cycles, the majority are challenged with proof-of-work. The ones that cannot solve it (most of them) are blocked across all servers.
Scenario 2: Vulnerability scanning. A scanner probing 50 known exploit paths on your site at a rate of one request every 5 seconds.
With rate limiting alone: 12 requests per minute from one IP. Completely normal from a rate perspective. The scanner checks every path on its list without triggering any limit. If it finds an exposed configuration file or unpatched plugin, it reports back to the attacker.
With multi-layer detection: The first request to /.env or /.git/config hits a honeypot trap. The IP is immediately challenged. If it continues probing and hits more trap paths, the score compounds. Within minutes, the scanner is blocked without ever reaching a real page on your site.
Scenario 3: Content scraper. A bot fetching every product page on your WooCommerce store at a moderate rate, building a copy of your catalog.
With rate limiting alone: The scraper makes 30 requests per minute, well under any reasonable limit. Over the course of a day, it downloads your entire product catalog including descriptions, images, and pricing. Your competitor now has a copy of your store.
With multi-layer detection: The scraper hits the same URL patterns repeatedly with identical behavior. It does not accept cookies. It does not request CSS or JavaScript (because it does not render pages). It sends no Referer headers. The per-URL concentration triggers endpoint detection. The behavioral signals compound. The scraper gets challenged and then blocked.
Rate limiting is necessary but not sufficient
We are not arguing against rate limiting. We use it. It is configured on every server we run, with specific limits for login endpoints, API routes, and general traffic. It stops the simplest attacks cheaply and efficiently.
But treating rate limiting as your complete bot protection strategy is like locking your front door and leaving the windows open. It covers the most obvious entry point and ignores everything else.
The attacks that waste real server resources in production, that slow down real sites and cost real money, are the ones designed to bypass request counting. They distribute across thousands of IPs. They throttle to stay under thresholds. They target specific endpoints at rates that look human. They mimic browser behavior just well enough to avoid simple filters.
Stopping those attacks requires looking at what rate limiting cannot see: behavior, intent, cross-site patterns, and the difference between a browser and a script pretending to be one.
On our platform, rate limiting is one layer of a multi-signal detection system that evaluates every request across multiple dimensions. It runs automatically on every server. If you want to see what it catches on your site, the data is in the Firewall section of your control panel.
If you are relying on rate limiting alone, whether through your hosting provider, a WordPress plugin, or a custom configuration, it is worth understanding what is getting through. The answer is probably more than you think.