Your pm.max_children Math Is Wrong: Why Averages Kill Production Stability
How-To · Systems

via Dev.to · Mahmoud Alatrash

TL;DR: Don't set pm.max_children based on average worker memory; use P95 RSS instead. Measure it under peak traffic, apply a 1.2x safety factor, then cap workers by CPU (8–12/core for IO-bound, 2–4/core for CPU-bound). Set pm.max_requests to recycle workers and request_terminate_timeout to kill stuck ones, and monitor the FPM status page for saturation signals.

Most engineers configure pm.max_children by dividing available RAM by average worker memory. Simple division, clean number, ship it. But production traffic doesn't follow averages: a handful of heavy requests can push workers well past that number, and suddenly your "safe" config is swapping to disk or getting OOM-killed.

Instead of sizing for the average, size for what actually breaks things: P95 memory usage and how many workers your CPUs can realistically handle.

The Trap of Average-Based Sizing

Not all requests are equal. Most of your traffic is probab
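
As a rough illustration of the sizing rule in the TL;DR, here is a minimal Python sketch. It takes per-worker RSS samples in MB, computes a nearest-rank P95, applies the 1.2x safety factor, and caps the result by a per-core worker limit. The function names, the parameters, and the nearest-rank percentile method are assumptions made for this example, not something the article prescribes; the samples would have to come from your own measurements under peak traffic.

```python
def p95(samples_mb):
    """Nearest-rank 95th percentile of per-worker RSS samples (MB)."""
    if not samples_mb:
        raise ValueError("need at least one RSS sample")
    ordered = sorted(samples_mb)
    # 1-based nearest rank: ceil(0.95 * n), computed with integers
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def suggest_max_children(samples_mb, available_mb, cores,
                         workers_per_core, safety_factor=1.2):
    """Hypothetical helper: the lower of the memory-bound and CPU-bound limits.

    available_mb     -- RAM you are willing to give to PHP-FPM workers
    workers_per_core -- e.g. 8-12 for IO-bound, 2-4 for CPU-bound workloads
    """
    per_worker_mb = p95(samples_mb) * safety_factor
    memory_bound = int(available_mb // per_worker_mb)
    cpu_bound = cores * workers_per_core
    # Never suggest fewer than one worker
    return max(1, min(memory_bound, cpu_bound))
```

With made-up samples where most workers sit near 60 MB and a few reach 200 MB, the average suggests far more workers than the P95-based figure allows, which is the gap the article warns about.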
