Rate Limiting in Express That Actually Works (And 3 Ways It Usually Doesn't)
express-rate-limit is the right starting point. Here's how to configure it for production, when its in-memory store breaks, and the three subtle mistakes that make rate limits useless.
express-rate-limit is the standard rate-limiting library for Express, and it works correctly when configured correctly. The problem is that the default configuration is wrong for most production deployments, and the failures are silent — your rate limiting "works" but is either trivially bypassed or accidentally blocking legitimate users.
Here's the production-grade setup and the three mistakes to avoid.
The basic setup
npm install express-rate-limit
import rateLimit from 'express-rate-limit';
const limiter = rateLimit({
windowMs: 60 * 1000,
max: 100, // 100 requests per minute per IP
standardHeaders: true,
legacyHeaders: false,
});
app.use(limiter);
This works for a single Express process serving requests directly. Most production deployments are neither single-process nor direct.
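Under the hood this is a fixed-window counter per key. A minimal sketch of that model (for intuition only; not the library's actual code) makes the failure modes below easier to reason about:

```javascript
// Minimal fixed-window counter — the model express-rate-limit implements
// (a sketch for intuition, not the library's actual implementation).
function makeLimiter({ windowMs, max }) {
  let windowStart = Date.now();
  const counts = new Map();
  return function allow(key, now = Date.now()) {
    if (now - windowStart >= windowMs) { // window elapsed: start fresh
      windowStart = now;
      counts.clear();
    }
    const n = (counts.get(key) || 0) + 1;
    counts.set(key, n);
    return n <= max;
  };
}

const allow = makeLimiter({ windowMs: 60_000, max: 3 });
const t0 = Date.now();
console.log(allow('1.2.3.4', t0)); // true
console.log(allow('1.2.3.4', t0)); // true
console.log(allow('1.2.3.4', t0)); // true
console.log(allow('1.2.3.4', t0)); // false — over the limit
console.log(allow('1.2.3.4', t0 + 61_000)); // true — new window
```

Everything that follows is about what feeds that `key` (the IP problem) and where `counts` lives (the storage problem).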
Mistake 1: trusting the wrong IP
By default, express-rate-limit reads the IP from req.ip. This is req.socket.remoteAddress in stock Express — the address of whoever connected to your server. If you're behind a load balancer, CDN, or reverse proxy, that's the proxy's address, not the user's.
Result: every request from any user behind your CDN gets rate-limited as one address. You either set the limit so high it's pointless, or you accidentally rate-limit your entire user base when one user goes over.
The fix:
app.set('trust proxy', 1); // or true, depending on your topology
This tells Express to read X-Forwarded-For and use the real client IP. Validate the topology — trust only as many proxy hops as you actually have; otherwise attackers can spoof the header.
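The hop-count rule is easier to see with a small helper that mirrors how Express resolves req.ip under trust proxy (a hypothetical illustration, not Express's actual source):

```javascript
// Hypothetical helper mirroring how Express picks req.ip under
// `trust proxy: <n>` (an illustration, not Express's actual source).
function resolveClientIp(socketAddr, xffHeader, trustedHops) {
  // Order addresses nearest-first: the socket peer, then the
  // X-Forwarded-For entries read right to left.
  const chain = [socketAddr, ...xffHeader.split(',').map(s => s.trim()).reverse()];
  // Trust the first `trustedHops` entries; the next one is the client.
  return chain[Math.min(trustedHops, chain.length - 1)];
}

// One trusted hop (our load balancer at 172.16.0.3):
console.log(resolveClientIp('172.16.0.3', '203.0.113.7', 1)); // 203.0.113.7

// An attacker (1.2.3.4) forges the header before the LB appends their IP.
// With the correct hop count we still get the real address:
console.log(resolveClientIp('172.16.0.3', '9.9.9.9, 1.2.3.4', 1)); // 1.2.3.4
// Trusting one hop too many lets the forged value through:
console.log(resolveClientIp('172.16.0.3', '9.9.9.9, 1.2.3.4', 2)); // 9.9.9.9
```

That last line is why `trust proxy: true` behind a single load balancer is dangerous: it trusts the whole chain, including whatever the client made up.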
For Cloudflare specifically, you can key on req.headers['cf-connecting-ip'], which Cloudflare sets at its edge — though it's only trustworthy if all traffic is forced through Cloudflare:
const limiter = rateLimit({
windowMs: 60 * 1000,
max: 100,
keyGenerator: (req) => req.headers['cf-connecting-ip'] || req.ip,
});
Mistake 2: in-memory storage with multiple processes
express-rate-limit's default store is an in-memory MemoryStore. If you run more than one Express process (PM2 cluster mode, Kubernetes pods, multiple servers), each process has its own counter. An attacker hitting your service can spread their requests across processes and effectively multiply their limit by the process count.
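The multiplication is easy to demonstrate with two independent in-memory counters standing in for two pods (a toy sketch, not the real MemoryStore):

```javascript
// Toy sketch of why per-process MemoryStores multiply the effective limit
// (illustration only; not the library's actual MemoryStore).
class MemoryCounter {
  constructor() { this.hits = new Map(); }
  hit(ip) {
    const n = (this.hits.get(ip) || 0) + 1;
    this.hits.set(ip, n);
    return n;
  }
}

const LIMIT = 100;
const processes = [new MemoryCounter(), new MemoryCounter()]; // two pods
let allowed = 0;
for (let i = 0; i < 180; i++) {
  // a round-robin load balancer spreads one attacker's requests evenly
  if (processes[i % 2].hit('203.0.113.7') <= LIMIT) allowed++;
}
console.log(allowed); // 180 — every request passes despite a "limit" of 100
```

Each pod saw only 90 requests, so neither ever said no. The attacker got 1.8× the intended limit with two pods; with ten pods it's 10×.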
The fix: use a shared store. Redis is the standard:
npm install rate-limit-redis ioredis
import rateLimit from 'express-rate-limit';
import { RedisStore } from 'rate-limit-redis';
import Redis from 'ioredis';
const redis = new Redis(process.env.REDIS_URL);
const limiter = rateLimit({
windowMs: 60 * 1000,
max: 100,
store: new RedisStore({
sendCommand: (...args) => redis.call(...args),
}),
});
If your app already uses Redis (many Express apps do, for sessions or caching), this is a 5-line addition. If it doesn't, the cost of adding Redis just for rate limiting is real — consider whether infrastructure-layer rate limiting (AWS WAF, Cloudflare) is enough.
Mistake 3: one global limit for all endpoints
A 100/min limit makes sense for /api/users but is way too low for /api/autocomplete and way too high for /auth/login. A single global limit is guaranteed to be wrong for a large share of your endpoints.
The fix: per-endpoint limits, with stricter limits on auth and write endpoints:
const generalLimiter = rateLimit({
windowMs: 60 * 1000,
max: 100,
});
const authLimiter = rateLimit({
windowMs: 5 * 60 * 1000,
max: 5, // 5 attempts per 5 minutes for auth
skipSuccessfulRequests: true, // only count failures
});
const autocompleteLimiter = rateLimit({
windowMs: 60 * 1000,
max: 1000, // legitimate users do hit autocomplete a lot
});
app.use(generalLimiter);
app.post('/auth/login', authLimiter, loginHandler);
app.get('/api/autocomplete', autocompleteLimiter, autocompleteHandler);
The auth limit with skipSuccessfulRequests: true is specifically a credential-stuffing defense — it counts only failed attempts, so legitimate users who type their password correctly aren't penalized.
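The accounting behind skipSuccessfulRequests can be sketched as a counter that refunds successful attempts (assumed behavior for illustration, not the library's source):

```javascript
// Sketch of skipSuccessfulRequests-style accounting: every attempt is
// counted up front, then refunded if it succeeds (illustration only).
function makeFailureLimiter(max) {
  const counts = new Map();
  return function attempt(key, succeeded) {
    const n = (counts.get(key) || 0) + 1;
    if (n > max) return 'blocked';
    counts.set(key, succeeded ? n - 1 : n); // refund successful attempts
    return 'allowed';
  };
}

const login = makeFailureLimiter(5);
for (let i = 0; i < 5; i++) login('alice', false); // five wrong passwords
console.log(login('alice', false)); // 'blocked'

const login2 = makeFailureLimiter(5);
for (let i = 0; i < 20; i++) login2('bob', true); // successes never accumulate
console.log(login2('bob', true)); // 'allowed'
```

A credential-stuffing bot burns through its five failures and stops; a user who logs in correctly twenty times in a row never touches the limit.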
Per-user vs per-IP limiting
For authenticated endpoints, per-user limiting (by user ID) is more accurate than per-IP. A whole office network NAT'd to one IP shouldn't share a single rate limit:
const userLimiter = rateLimit({
windowMs: 60 * 1000,
max: 200,
keyGenerator: (req) => req.user?.id || req.ip,
});
For unauthenticated endpoints, you have to use IP — there's no other identifier. For authenticated ones, prefer user ID.
What rate limiting doesn't solve
Rate limiting handles volume from a single source. It doesn't handle:
- Distributed attacks. Many sources, each below threshold. Need different detection (see credential stuffing detection).
- Bot traffic. Bots can stay under your rate limit and still cause real cost. Block them at the IP layer using reputation feeds — see block bot traffic in Express.
- Application-layer abuse. Cart cycling, coupon stacking, business-logic attacks. Rate limiting doesn't see them as abuse.
Rate limiting is one layer. The full stack also includes IP-reputation firewall, bot blocking, and behavioral detection.
What this costs
express-rate-limit is free. Redis costs whatever your hosting charges (often $5–$30/month for a managed instance, $0 if self-hosted). For most Express apps, total rate-limiting cost is under $10/month.
For a fully managed alternative that bundles rate limiting with bot blocking and IP reputation: see the free SecureNow Firewall. The firewall preload handles the IP-reputation layer; pair with express-rate-limit for application-aware limits.
A working production config
Putting it together:
import rateLimit from 'express-rate-limit';
import { RedisStore } from 'rate-limit-redis';
import Redis from 'ioredis';
app.set('trust proxy', 1);
const redis = new Redis(process.env.REDIS_URL);
const store = new RedisStore({ sendCommand: (...args) => redis.call(...args) });
const general = rateLimit({ store, windowMs: 60_000, max: 100 });
const auth = rateLimit({ store, windowMs: 5 * 60_000, max: 5, skipSuccessfulRequests: true });
const write = rateLimit({ store, windowMs: 60_000, max: 30 });
app.use(general);
app.post('/auth/login', auth, loginHandler);
app.post('/api/orders', write, ordersHandler);
That's the production-ready version. Add the firewall preload for IP-layer protection and you're covered.
Frequently Asked Questions
What's wrong with the default express-rate-limit setup?
Three things: it uses in-memory storage by default (breaks across processes), it doesn't account for proxies (rate-limits everyone behind your CDN as one IP), and the default limit of 5 requests per window is far too aggressive for legitimate web users.
Should I use Redis or in-memory?
Redis if you have more than one Express process. In-memory works for single-instance deployments but rate limits become per-process if you scale horizontally — defeating the purpose.
What about distributed rate limiting at the infrastructure layer?
AWS API Gateway, Cloudflare, NGINX — all do rate limiting at higher layers. They're appropriate for coarse limits (1000 req/min/IP). Application-layer rate limiting handles the per-endpoint, per-user nuances.
Can rate limiting break legitimate traffic?
Yes, if misconfigured. The two common failures: limiting based on the proxy IP instead of the real client IP, and limiting too aggressively on endpoints that legitimate users hit in bursts (autocomplete, polling). Test against realistic user traffic patterns before deploying.