
Production Maintenance Drill: An OPS Checklist in Action
Validating Production Readiness Through Hands-On Practice
In this assignment, I stepped into the shoes of a production support engineer and performed a comprehensive maintenance drill on a live EC2 instance running an Nginx web server. The goal was simple but critical: validate that the production server is healthy, secure, and reliable, and document every step of the process. This drill simulated exactly what real DevOps and SRE teams do when they investigate production issues, perform routine maintenance, or respond to incidents.

📋 Phase 1: Network & Connectivity Checks

Before diving into application-level checks, I verified that the server could actually communicate with the outside world. Network issues are often the root cause of "it's down" scenarios.

Check Network Interfaces

bash
    echo "Manjay Verma - Maintenance Drill"
    ip a

What I observed: Active network interfaces with a valid private IP.

Why it matters: If no interface is up, the server cannot communicate with anything, including you.

Verify Default Gateway

bash
    echo "Manjay Verma
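
As a complement to the manual interface check in Phase 1, here is a minimal sketch of how the same observation could be scripted so the drill yields a clear pass or warn result. It is not part of the original checklist: the helper names `list_ipv4_interfaces` and `has_usable_interface` are hypothetical, and the sketch assumes the `iproute2` one-line output format of `ip -o -4 addr show` (interface name in the second field, `inet` in the third, address/prefix in the fourth).

```bash
#!/usr/bin/env bash

# Hypothetical helper: reads `ip -o -4 addr show` output on stdin and prints
# "<interface> <address>" for every non-loopback interface with an IPv4 address.
list_ipv4_interfaces() {
  awk '$2 != "lo" && $3 == "inet" { split($4, a, "/"); print $2, a[1] }'
}

# Hypothetical helper: succeeds when stdin describes at least one
# non-loopback interface that has an IPv4 address.
has_usable_interface() {
  [ -n "$(list_ipv4_interfaces)" ]
}

# Run the live check only when the `ip` command is available on this host.
if command -v ip >/dev/null 2>&1; then
  if ip -o -4 addr show | has_usable_interface; then
    echo "OK: at least one non-loopback interface has an IPv4 address"
  else
    echo "WARN: no non-loopback IPv4 address found"
  fi
fi
```

Keeping the parsing in functions that read from stdin means the logic can be exercised against saved `ip` output as well as on the live server. The check only confirms that an address is assigned; it does not prove the gateway or the outside world is reachable.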

