Rate Limiting in Express That Actually Works (And 3 Ways It Usually Doesn't)
express-rate-limit is the right starting point. Here's how to configure it for production, when its in-memory store breaks, and the three subtle mistakes that make rate limits useless.
express-rate-limit is the standard rate-limiting library for Express, and it works correctly when configured correctly. The problem is that the default configuration is wrong for most production deployments, and the failures are silent — your rate limiting "works" but is either trivially bypassed or accidentally blocking legitimate users.
Here's the production-grade setup and the three mistakes to avoid.
The basic setup
npm install express-rate-limit
import rateLimit from 'express-rate-limit';
const limiter = rateLimit({
windowMs: 60 * 1000,
max: 100, // 100 requests per minute per IP
standardHeaders: true,
legacyHeaders: false,
});
app.use(limiter);
This works for a single Express process serving requests directly. Most production deployments are neither single-process nor direct.
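Under the hood this is a fixed-window counter per key. A minimal sketch of that model (for intuition only; not the library's actual code) makes the failure modes below easier to reason about:

```javascript
// Minimal fixed-window counter — the model express-rate-limit implements
// (a sketch for intuition, not the library's actual implementation).
function makeLimiter({ windowMs, max }) {
  let windowStart = Date.now();
  const counts = new Map();
  return function allow(key, now = Date.now()) {
    if (now - windowStart >= windowMs) { // window elapsed: start fresh
      windowStart = now;
      counts.clear();
    }
    const n = (counts.get(key) || 0) + 1;
    counts.set(key, n);
    return n <= max;
  };
}

const allow = makeLimiter({ windowMs: 60_000, max: 3 });
const t0 = Date.now();
console.log(allow('1.2.3.4', t0)); // true
console.log(allow('1.2.3.4', t0)); // true
console.log(allow('1.2.3.4', t0)); // true
console.log(allow('1.2.3.4', t0)); // false — over the limit
console.log(allow('1.2.3.4', t0 + 61_000)); // true — new window
```

Everything that follows is about what feeds that `key` (the IP problem) and where `counts` lives (the storage problem).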
Mistake 1: trusting the wrong IP
By default, express-rate-limit reads the IP from req.ip. This is req.socket.remoteAddress in stock Express — the address of whoever connected to your server. If you're behind a load balancer, CDN, or reverse proxy, that's the proxy's address, not the user's.
Result: every request from any user behind your CDN gets rate-limited as one address. You either set the limit so high it's pointless, or you accidentally rate-limit your entire user base when one user goes over.
The fix:
app.set('trust proxy', 1); // or true, depending on your topology
This tells Express to read X-Forwarded-For and use the real client IP. Validate the topology — trust only as many proxy hops as you actually have; otherwise attackers can spoof the header.
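The hop-count rule is easier to see with a small helper that mirrors how Express resolves req.ip under trust proxy (a hypothetical illustration, not Express's actual source):

```javascript
// Hypothetical helper mirroring how Express picks req.ip under
// `trust proxy: <n>` (an illustration, not Express's actual source).
function resolveClientIp(socketAddr, xffHeader, trustedHops) {
  // Order addresses nearest-first: the socket peer, then the
  // X-Forwarded-For entries read right to left.
  const chain = [socketAddr, ...xffHeader.split(',').map(s => s.trim()).reverse()];
  // Trust the first `trustedHops` entries; the next one is the client.
  return chain[Math.min(trustedHops, chain.length - 1)];
}

// One trusted hop (our load balancer at 172.16.0.3):
console.log(resolveClientIp('172.16.0.3', '203.0.113.7', 1)); // 203.0.113.7

// An attacker (1.2.3.4) forges the header before the LB appends their IP.
// With the correct hop count we still get the real address:
console.log(resolveClientIp('172.16.0.3', '9.9.9.9, 1.2.3.4', 1)); // 1.2.3.4
// Trusting one hop too many lets the forged value through:
console.log(resolveClientIp('172.16.0.3', '9.9.9.9, 1.2.3.4', 2)); // 9.9.9.9
```

That last line is why `trust proxy: true` behind a single load balancer is dangerous: it trusts the whole chain, including whatever the client made up.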
For Cloudflare specifically, you can key on req.headers['cf-connecting-ip'], which Cloudflare sets at its edge — though it's only trustworthy if all traffic is forced through Cloudflare:
const limiter = rateLimit({
windowMs: 60 * 1000,
max: 100,
keyGenerator: (req) => req.headers['cf-connecting-ip'] || req.ip,
});
Mistake 2: in-memory storage with multiple processes
express-rate-limit's default store is an in-memory MemoryStore. If you run more than one Express process (PM2 cluster mode, Kubernetes pods, multiple servers), each process has its own counter. An attacker hitting your service can spread their requests across processes and effectively multiply their limit by the process count.
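The multiplication is easy to demonstrate with two independent in-memory counters standing in for two pods (a toy sketch, not the real MemoryStore):

```javascript
// Toy sketch of why per-process MemoryStores multiply the effective limit
// (illustration only; not the library's actual MemoryStore).
class MemoryCounter {
  constructor() { this.hits = new Map(); }
  hit(ip) {
    const n = (this.hits.get(ip) || 0) + 1;
    this.hits.set(ip, n);
    return n;
  }
}

const LIMIT = 100;
const processes = [new MemoryCounter(), new MemoryCounter()]; // two pods
let allowed = 0;
for (let i = 0; i < 180; i++) {
  // a round-robin load balancer spreads one attacker's requests evenly
  if (processes[i % 2].hit('203.0.113.7') <= LIMIT) allowed++;
}
console.log(allowed); // 180 — every request passes despite a "limit" of 100
```

Each pod saw only 90 requests, so neither ever said no. The attacker got 1.8× the intended limit with two pods; with ten pods it's 10×.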
The fix: use a shared store. Redis is the standard:
npm install rate-limit-redis ioredis
import rateLimit from 'express-rate-limit';
import { RedisStore } from 'rate-limit-redis';
import Redis from 'ioredis';
const redis = new Redis(process.env.REDIS_URL);
const limiter = rateLimit({
windowMs: 60 * 1000,
max: 100,
store: new RedisStore({
sendCommand: (...args) => redis.call(...args),
}),
});
If your app already uses Redis (many Express apps do, for sessions or caching), this is a 5-line addition. If it doesn't, the cost of adding Redis just for rate limiting is real — consider whether infrastructure-layer rate limiting (AWS WAF, Cloudflare) is enough.
Mistake 3: one global limit for all endpoints
A 100/min limit makes sense for /api/users but is way too low for /api/autocomplete and way too high for /auth/login. A single global limit is guaranteed to be wrong for a large share of your endpoints.
The fix: per-endpoint limits, with stricter limits on auth and write endpoints:
const generalLimiter = rateLimit({
windowMs: 60 * 1000,
max: 100,
});
const authLimiter = rateLimit({
windowMs: 5 * 60 * 1000,
max: 5, // 5 attempts per 5 minutes for auth
skipSuccessfulRequests: true, // only count failures
});
const autocompleteLimiter = rateLimit({
windowMs: 60 * 1000,
max: 1000, // legitimate users do hit autocomplete a lot
});
app.use(generalLimiter);
app.post('/auth/login', authLimiter, loginHandler);
app.get('/api/autocomplete', autocompleteLimiter, autocompleteHandler);
The auth limit with skipSuccessfulRequests: true is specifically a credential-stuffing defense — it counts only failed attempts, so legitimate users who type their password correctly aren't penalized.
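The accounting behind skipSuccessfulRequests can be sketched as a counter that refunds successful attempts (assumed behavior for illustration, not the library's source):

```javascript
// Sketch of skipSuccessfulRequests-style accounting: every attempt is
// counted up front, then refunded if it succeeds (illustration only).
function makeFailureLimiter(max) {
  const counts = new Map();
  return function attempt(key, succeeded) {
    const n = (counts.get(key) || 0) + 1;
    if (n > max) return 'blocked';
    counts.set(key, succeeded ? n - 1 : n); // refund successful attempts
    return 'allowed';
  };
}

const login = makeFailureLimiter(5);
for (let i = 0; i < 5; i++) login('alice', false); // five wrong passwords
console.log(login('alice', false)); // 'blocked'

const login2 = makeFailureLimiter(5);
for (let i = 0; i < 20; i++) login2('bob', true); // successes never accumulate
console.log(login2('bob', true)); // 'allowed'
```

A credential-stuffing bot burns through its five failures and stops; a user who logs in correctly twenty times in a row never touches the limit.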
Per-user vs per-IP limiting
For authenticated endpoints, per-user limiting (by user ID) is more accurate than per-IP. A whole office network NAT'd to one IP shouldn't share a single rate limit:
const userLimiter = rateLimit({
windowMs: 60 * 1000,
max: 200,
keyGenerator: (req) => req.user?.id || req.ip,
});
For unauthenticated endpoints, you have to use IP — there's no other identifier. For authenticated ones, prefer user ID.
What rate limiting doesn't solve
Rate limiting handles volume from a single source. It doesn't handle:
- Distributed attacks. Many sources, each below threshold. Need different detection (see credential stuffing detection).
- Bot traffic. Bots can stay under your rate limit and still cause real cost. Block them at the IP layer using reputation feeds — see block bot traffic in Express.
- Application-layer abuse. Cart cycling, coupon stacking, business-logic attacks. Rate limiting doesn't see them as abuse.
Rate limiting is one layer. The full stack also includes IP-reputation firewall, bot blocking, and behavioral detection.
What this costs
express-rate-limit is free. Redis costs whatever your hosting charges (often $5–$30/month for a managed instance, $0 if self-hosted). For most Express apps, total rate-limiting cost is under $10/month.
For a fully managed alternative that bundles rate limiting with bot blocking and IP reputation: see the free SecureNow Firewall. The firewall preload handles the IP-reputation layer; pair with express-rate-limit for application-aware limits.
A working production config
Putting it together:
import rateLimit from 'express-rate-limit';
import { RedisStore } from 'rate-limit-redis';
import Redis from 'ioredis';
app.set('trust proxy', 1);
const redis = new Redis(process.env.REDIS_URL);
const store = new RedisStore({ sendCommand: (...args) => redis.call(...args) });
const general = rateLimit({ store, windowMs: 60_000, max: 100 });
const auth = rateLimit({ store, windowMs: 5 * 60_000, max: 5, skipSuccessfulRequests: true });
const write = rateLimit({ store, windowMs: 60_000, max: 30 });
app.use(general);
app.post('/auth/login', auth, loginHandler);
app.post('/api/orders', write, ordersHandler);
That's the production-ready version. Add the firewall preload for IP-layer protection and you're covered.
Frequently Asked Questions
What's wrong with the default express-rate-limit setup?
Three things: it uses in-memory storage by default (breaks across processes), it doesn't account for proxies (rate-limits everyone behind your CDN as one IP), and the default limit of 5 requests per window is far too aggressive for legitimate web users.
Should I use Redis or in-memory?
Redis if you have more than one Express process. In-memory works for single-instance deployments but rate limits become per-process if you scale horizontally — defeating the purpose.
What about distributed rate limiting at the infrastructure layer?
AWS API Gateway, Cloudflare, NGINX — all do rate limiting at higher layers. They're appropriate for coarse limits (1000 req/min/IP). Application-layer rate limiting handles the per-endpoint, per-user nuances.
Can rate limiting break legitimate traffic?
Yes, if misconfigured. The two common failures: limiting based on the proxy IP instead of the real client IP, and limiting too aggressively on endpoints that legitimate users hit in bursts (autocomplete, polling). Test against realistic user traffic patterns before deploying.