What Is Rate Limiting and Why Does It Matter
Rate limiting is a control that puts a ceiling on how many times a client can call your API within a set time window. Think of it as a turnstile: it lets people through at a normal pace, but slams shut when someone tries to sprint through 10,000 times in a minute. Without it, your endpoints are a wide-open door to every automated attack tool on the internet.
Authentication endpoints are the highest-value target. Your /api/auth/login or /api/users/login endpoint does exactly one thing: it accepts a username and password and decides whether to let that person in. That single responsibility makes it the most abused endpoint in any application. An attacker who can submit unlimited login attempts has, in practice, unlimited time to find the right combination.
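To make the "ceiling per time window" idea concrete, here is a minimal sketch of a fixed-window counter, one of the simplest ways a limiter can be built. The class and field names are hypothetical, and the in-memory map is an assumption for illustration; real deployments usually keep counters in a shared store.

```typescript
// Minimal in-memory fixed-window rate limiter (illustrative sketch only).
// All names are hypothetical; this is not any specific library's implementation.
interface WindowState {
  count: number;       // requests seen in the current window
  windowStart: number; // timestamp (ms) at which the current window began
}

class FixedWindowLimiter {
  private readonly clients = new Map<string, WindowState>();
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if the client is over its limit.
  allow(clientId: string, now: number = Date.now()): boolean {
    const state = this.clients.get(clientId);
    // First request from this client, or the previous window expired: open a new window.
    if (state === undefined || now - state.windowStart >= this.windowMs) {
      this.clients.set(clientId, { count: 1, windowStart: now });
      return true;
    }
    if (state.count >= this.maxRequests) {
      return false; // a real server would answer with HTTP 429 here
    }
    state.count += 1;
    return true;
  }
}
```

With `new FixedWindowLimiter(10, 15 * 60 * 1000)`, the eleventh call to `allow()` for the same client inside 15 minutes returns `false`, while other clients are unaffected.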
Two attacks that rate limiting stops cold
Brute force is the simplest form. An automated tool cycles through candidate passwords for a target account (short passwords, common substitutions, dictionary words) until one works. Without rate limiting, a single server with a standard internet connection can submit roughly 10,000 to 50,000 requests per minute to a typical REST endpoint. At 10,000 attempts per minute, a list of the 100,000 most common passwords is exhausted in 10 minutes. Exhaustive search is slower (all 26^6 ≈ 309 million six-character lowercase passwords take about three weeks at that rate, and eight-character ones take decades), but attackers rarely need it when so many users pick passwords that already appear on those lists.
Credential stuffing is more sophisticated and far more common in 2025. Attackers buy or download lists of username-password pairs leaked from previous data breaches — billions of them are freely available. They then test those exact credentials against your login endpoint. Because a large percentage of users reuse passwords across sites, the success rate on a credential-stuffing run against an unprotected endpoint typically sits between 1% and 3%. Against a user base of 10,000, that means 100 to 300 accounts compromised in a single automated pass.
A standard auth endpoint without rate limiting can receive and process 10,000+ requests per minute. A credential-stuffing tool running at 2,500 attempts per minute — a conservative estimate — can test 37,500 username-password combinations in 15 minutes. Even at a 0.5% hit rate, that is 187 compromised accounts before your morning coffee.
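The arithmetic behind those figures is simple enough to check yourself. The helper names below are hypothetical; the inputs are the rates quoted above.

```typescript
// Back-of-the-envelope yield of a credential-stuffing run (illustrative helpers).
function credentialsTested(attemptsPerMinute: number, minutes: number): number {
  return attemptsPerMinute * minutes;
}

// hitRate is a fraction, e.g. 0.005 for 0.5%. Partial accounts are rounded down.
function expectedCompromises(tested: number, hitRate: number): number {
  return Math.floor(tested * hitRate);
}

const tested = credentialsTested(2500, 15);             // 37500 combinations
const compromised = expectedCompromises(tested, 0.005); // 187 accounts
console.log({ tested, compromised });
```

Plugging in your own traffic ceiling and user count gives a quick sense of how much exposure an unthrottled login endpoint carries.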
When an account is compromised, the consequences cascade fast. Your customer's data is exposed. If your app stores payment methods or personal information, liability follows. Your support queue fills with account-recovery requests. Your brand reputation takes the damage — not the attacker's. And in regulated industries, a single confirmed account takeover may trigger breach-notification obligations under GDPR, CCPA, or sector-specific rules.
Rate limiting does not require a significant engineering investment. For the most common stacks, it is a short block of configuration. The return on those few lines is that a single client can no longer run the brute-force and credential-stuffing attacks described above at machine speed: it gets a handful of attempts per window and then a 429.
How Our Scanner Checks for Rate Limiting
The scanner makes a controlled series of rapid consecutive requests to your auth endpoint and inspects both the HTTP response codes and the response headers. We do not log in to your application, exchange credentials, or need access to your code. We need only your app's public URL.
Specifically, we check for three signals:
- 429 responses. A correctly implemented rate limiter returns HTTP 429 (Too Many Requests) once a client exceeds the configured threshold. If we send 50 consecutive requests and every one returns 200 or 401, that is a finding.
- Rate limit headers. Well-behaved APIs include headers that tell the client its current quota status: Retry-After, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset. Absence of these headers alongside absence of 429s is a clear signal of no limiting in place.
- Backoff behavior. We measure whether response times increase as request volume rises, a sign of server-side throttling even without explicit 429s. Throttling without correct status codes is a partial implementation, not a pass.
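The first two signals can be approximated with a short probe like the one below. This is a rough sketch, not the scanner's actual implementation: the function and type names, the placeholder payload, and the injected `fetcher` parameter are all assumptions made for illustration, and the timing-based backoff check is left out for brevity.

```typescript
// Hypothetical rate-limit probe; names and payload are illustrative assumptions.
interface ProbeResponse {
  status: number;
  headers: { get(name: string): string | null };
}

type Fetcher = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<ProbeResponse>;

interface ProbeResult {
  first429At: number | null;  // 1-based index of the first 429 response, if any
  rateLimitHeaders: string[]; // quota headers observed on any response
  verdict: 'limited' | 'headers-only' | 'none';
}

const QUOTA_HEADERS = ['Retry-After', 'X-RateLimit-Limit', 'X-RateLimit-Remaining', 'X-RateLimit-Reset'];

async function probeRateLimit(url: string, fetcher: Fetcher, attempts = 50): Promise<ProbeResult> {
  let first429At: number | null = null;
  const seen = new Set<string>();

  for (let i = 1; i <= attempts; i++) {
    const res = await fetcher(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // Placeholder values only; no real credentials are sent.
      body: JSON.stringify({ username: 'probe@example.invalid', password: 'not-a-real-password' }),
    });
    for (const name of QUOTA_HEADERS) {
      if (res.headers.get(name) !== null) seen.add(name);
    }
    if (res.status === 429) {
      first429At = i;
      break; // limiter confirmed, so stop sending requests
    }
  }

  const rateLimitHeaders = QUOTA_HEADERS.filter((h) => seen.has(h));
  const verdict: ProbeResult['verdict'] =
    first429At !== null ? 'limited' : rateLimitHeaders.length > 0 ? 'headers-only' : 'none';
  return { first429At, rateLimitHeaders, verdict };
}
```

In a runtime with a global `fetch`, it can be passed in directly as the `fetcher` argument. A `none` verdict after 50 requests corresponds to the finding described in the first bullet.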
We check the most common auth endpoint paths used by vibe-coded and framework-scaffolded apps: /api/auth/login, /api/login, /auth/signin, and /api/users/login. If your application uses a non-standard path, you can specify it manually on the scan form.
We do not attempt authentication. We do not submit real passwords. We do not store or log the test request payload. The scan is structurally equivalent to what a load testing tool would do in a controlled QA environment — directed at your login endpoint, not at your application logic.
Results are returned in under 2 minutes. If a rate limit is detected, we confirm what type and at what threshold. If no rate limit is found, we surface it as a P0 Critical finding with the full CVSS breakdown and a recommended fix for your specific stack.
What "No Rate Limiting" Means in CVSS v3 Terms
CVSS (the Common Vulnerability Scoring System) is the industry-standard framework for quantifying how severe a security flaw is. Version 3.1 scores vulnerabilities across eight base metrics and produces a base score from 0 to 10. Here is how a missing rate limit on an authentication endpoint scores.
A score of 9.1 lands firmly in the Critical band (9.0–10.0). The reason it scores this high is the combination of three factors: the attack is network-accessible (no physical proximity needed), requires no privileges whatsoever, and results in complete compromise of the affected accounts when successful. It stops short of the maximum because the availability impact is not rated High: hammering an unthrottled login endpoint degrades service somewhat but does not typically cause complete downtime.
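For readers who want to reproduce a base score, the sketch below implements the CVSS v3.1 base-score formula with the metric weights from the published specification. The vector in the usage note is an assumption: the per-metric breakdown is not reproduced in this section, so it is simply one vector that matches the factors described above and evaluates to 9.1.

```typescript
// CVSS v3.1 base-score sketch using the weights from the public specification.
const WEIGHTS: Record<string, Record<string, number>> = {
  AV: { N: 0.85, A: 0.62, L: 0.55, P: 0.2 },
  AC: { L: 0.77, H: 0.44 },
  UI: { N: 0.85, R: 0.62 },
  C: { H: 0.56, L: 0.22, N: 0 },
  I: { H: 0.56, L: 0.22, N: 0 },
  A: { H: 0.56, L: 0.22, N: 0 },
};

// Privileges Required is weighted differently when Scope is Changed (S:C).
const PR_WEIGHTS: Record<string, Record<string, number>> = {
  U: { N: 0.85, L: 0.62, H: 0.27 },
  C: { N: 0.85, L: 0.68, H: 0.5 },
};

// "Roundup" from the spec: the smallest one-decimal value >= the input,
// computed on scaled integers to avoid floating-point drift.
function roundUp(value: number): number {
  const scaled = Math.round(value * 100000);
  return scaled % 10000 === 0 ? scaled / 100000 : (Math.floor(scaled / 10000) + 1) / 10;
}

// Accepts a vector such as "AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N".
function cvssBaseScore(vector: string): number {
  const m: Record<string, string> = {};
  for (const part of vector.split('/')) {
    const [key, value] = part.split(':');
    m[key] = value;
  }
  const scopeChanged = m.S === 'C';
  const iss = 1 - (1 - WEIGHTS.C[m.C]) * (1 - WEIGHTS.I[m.I]) * (1 - WEIGHTS.A[m.A]);
  const impact = scopeChanged
    ? 7.52 * (iss - 0.029) - 3.25 * Math.pow(iss - 0.02, 15)
    : 6.42 * iss;
  const exploitability =
    8.22 * WEIGHTS.AV[m.AV] * WEIGHTS.AC[m.AC] * PR_WEIGHTS[m.S][m.PR] * WEIGHTS.UI[m.UI];
  if (impact <= 0) return 0;
  const total = scopeChanged ? 1.08 * (impact + exploitability) : impact + exploitability;
  return roundUp(Math.min(total, 10));
}
```

Under these formulas, `cvssBaseScore('AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N')` returns 9.1. Treat that vector as an assumption and compare it against the vector shown in your own report.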
For context: a 9.1 CVSS score sits in the same Critical band as Log4Shell (10.0) and ranks above Heartbleed (7.5, which is High rather than Critical, though that bug predates CVSS 3.x). Any P0 finding in our system triggers immediate CTO escalation and is flagged in your dashboard within 4 hours. Missing rate limiting on auth endpoints always classifies as P0.
For the full security audit breakdown on vibe-coded apps, including how we score each dimension out of 100, see our Cursor security audit guide.
What Our Scanner Returns for a Rate Limit Finding
Every finding we surface follows a structured format. This is the exact output you would receive for a missing rate limit on a Node.js/Express auth endpoint.
The finding ships with this structure plus a detailed recommended fix block, stack-specific code, and an estimated remediation time. For Code Care customers on a Growth Retainer or Scale Retainer, our engineering team opens a PR with the fix already written — you review and approve, we deploy on your signal.
How to Fix Rate Limiting on Your Stack
For the three stacks that appear most frequently in the apps we scan — Node.js/Express, Supabase edge functions, and Vercel edge middleware — here is exactly what to add.
Node.js / Express
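// npm install express-rate-limit
const rateLimit = require('express-rate-limit');

const loginLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 10, // 10 attempts per window per IP
  standardHeaders: true, // Send the standardized RateLimit-* headers
  legacyHeaders: true, // Also send the X-RateLimit-* headers
  message: {
    error: 'Too many login attempts. Try again in 15 minutes.'
  }
});

// `app` is your existing Express application instance.
// Behind a reverse proxy or load balancer, configure 'trust proxy'
// so that req.ip is the real client IP rather than the proxy's.
app.use('/api/auth/login', loginLimiter);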
This is a common baseline: 10 attempts per 15-minute window per IP address. For higher-security applications, drop max to 5 and combine with a CAPTCHA trigger after the third failure.
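The stricter profile can be expressed as a small decision function. This is a sketch under assumptions: the function name and return values are hypothetical, and it presumes you already track failed attempts per client somewhere (the rate limiter alone does not tell you how many attempts failed).

```typescript
// Hypothetical escalation policy: CAPTCHA after the third failure, block at the limit.
type LoginGate = 'allow' | 'require-captcha' | 'block';

function loginGate(failedAttempts: number, captchaAfter = 3, maxAttempts = 5): LoginGate {
  if (failedAttempts >= maxAttempts) return 'block';            // answer with HTTP 429
  if (failedAttempts >= captchaAfter) return 'require-captcha'; // challenge before the next try
  return 'allow';
}
```

Called before each login attempt, `loginGate(2)` still allows the request, `loginGate(3)` asks for a CAPTCHA, and `loginGate(5)` blocks until the window resets.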
Supabase Edge Functions
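// In your Supabase edge function (Deno runtime)
import { Ratelimit } from 'npm:@upstash/ratelimit';
import { Redis } from 'npm:@upstash/redis';

const ratelimit = new Ratelimit({
  // Read the Upstash REST credentials from the function's environment
  redis: new Redis({
    url: Deno.env.get('UPSTASH_REDIS_REST_URL')!,
    token: Deno.env.get('UPSTASH_REDIS_REST_TOKEN')!
  }),
  limiter: Ratelimit.slidingWindow(10, '15 m'),
  analytics: true
});

Deno.serve(async (req) => {
  // x-forwarded-for may hold a comma-separated chain; the first entry is the client
  const identifier = req.headers.get('x-forwarded-for')?.split(',')[0].trim() ?? 'anonymous';
  const { success, reset } = await ratelimit.limit(identifier);
  if (!success) {
    // reset is a Unix timestamp in milliseconds; never advertise a zero-second wait
    const retryAfter = Math.max(1, Math.ceil((reset - Date.now()) / 1000));
    return new Response('Too many requests', {
      status: 429,
      headers: { 'Retry-After': String(retryAfter) }
    });
  }
  // ...your existing login handling continues here and returns a Response
});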
Vercel Edge Middleware
// middleware.ts: runs at the edge before your route handler
import { Ratelimit } from '@upstash/ratelimit';
import { Redis } from '@upstash/redis';
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, '15 m')
});

export async function middleware(req: NextRequest) {
  // Use the first x-forwarded-for entry as the client IP
  // (req.ip was removed from NextRequest in Next.js 15)
  const ip = req.headers.get('x-forwarded-for')?.split(',')[0].trim() ?? 'unknown';
  const { success, reset } = await ratelimit.limit(ip);
  if (!success) {
    // reset is a Unix timestamp in milliseconds
    const retryAfter = Math.max(1, Math.ceil((reset - Date.now()) / 1000));
    return NextResponse.json(
      { error: 'Too many requests' },
      { status: 429, headers: { 'Retry-After': String(retryAfter) } }
    );
  }
  return NextResponse.next();
}

// The matcher restricts this middleware to the login route
export const config = { matcher: ['/api/auth/login'] };
Run our free scanner again after deploying. The rescan confirms your 429 responses are returning correctly, the Retry-After header is present, and the X-RateLimit-* headers are populated. A finding that was P0 with a score contribution of −12 points typically becomes a clean pass on rescan.
Real-World Scan Example
The score improvement below is representative of what rate limiting alone can contribute to an overall Launch Readiness Score.
/api/auth/login. After adding express-rate-limit with a 10-attempts/15-minute window, a rescan 48 hours later scored 88/100. The P0 was cleared. Three P1s relating to missing security headers were addressed in the same PR batch.
The founder had shipped the app in six days using Cursor. The auth endpoint had been live for three weeks before the scan. In that window, the server logs showed 847 requests to /api/auth/login from a single IP over a 40-minute period — a credential-stuffing attempt that went undetected because there was no alerting in place either. The full Cursor security audit covers how we approach apps built with AI-assisted IDEs.
Rate limiting took approximately 25 minutes to implement and deploy. The full rescan was delivered in under 2 minutes. The app moved from a score that would concern an enterprise procurement team to one that passes standard vendor security questionnaires.