Your pm.max_children Math Is Wrong: Why Averages Kill Production Stability

TL;DR: Don't set pm.max_children based on average worker memory; use P95 RSS instead. Measure it under peak traffic, apply a 1.2x safety factor, then cap workers by CPU (8–12/core for IO-bound, 2–4/core for CPU-bound). Set pm.max_requests to recycle workers, request_terminate_timeout to kill stuck ones, and monitor the FPM status page for saturation signals.

Most engineers configure pm.max_children by dividing available RAM by average worker memory. Simple division, clean number, ship it. But production traffic doesn't follow averages: a handful of heavy requests can push workers well past that number, and suddenly your "safe" config is swapping to disk or getting OOM-killed.

Instead of sizing for the average, size for what actually breaks things: P95 memory usage and how many workers your CPUs can realistically handle.

The Trap of Average-Based Sizing

Not all requests are equal. Most of your traffic is probab
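The sizing rule from the TL;DR can be sketched in a few lines of Python. This is a minimal illustration, not a definitive tool: the function names (`p95`, `max_children`), the sample RSS values, and the 4096 MB memory budget are all hypothetical, and it assumes you have already collected per-worker RSS samples (in MB) under peak traffic and know how much RAM is left for PHP-FPM after other services.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    # Integer arithmetic for ceil(0.95 * n) avoids float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100  # 1-based rank
    return ordered[rank - 1]


def max_children(rss_samples_mb, available_mb, cores, workers_per_core, safety=1.2):
    """Hypothetical helper: the lower of the memory cap and the CPU cap.

    workers_per_core follows the article's guidance: 8-12 for IO-bound
    workloads, 2-4 for CPU-bound ones.
    """
    per_worker_mb = p95(rss_samples_mb) * safety  # P95 RSS plus safety factor
    memory_cap = int(available_mb // per_worker_mb)
    cpu_cap = cores * workers_per_core
    return max(1, min(memory_cap, cpu_cap))


# Made-up samples: most workers are light, a couple are heavy.
samples = [60] * 18 + [150] * 2

average_based = 4096 // (sum(samples) // len(samples))  # naive RAM / average
p95_based = max_children(samples, available_mb=4096, cores=4, workers_per_core=8)

print(average_based, p95_based)
```

With these made-up numbers the average is 69 MB, so naive division allows 59 workers, while the P95-based calculation allows 22. The gap is the headroom the average-based figure silently spends on the assumption that no worker ever handles a heavy request.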
