
Question for teams doing chaos engineering: how do you choose experiment targets?
While working on a side project related to service reliability, I ran into a question I’d like to put to people actually running chaos experiments.

Most chaos engineering discussions focus on the types of experiments (latency injection, pod failure, network faults, etc.). Something less obvious is how teams choose where to run experiments in the first place.

In a system with many microservices, there are lots of possible targets. Do teams typically:

- rotate through services over time
- prioritize ones that caused incidents
- focus on critical dependency paths
- rely on platform/SRE intuition
- something else?

Interested to hear how this works in real environments.
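To make the question concrete, here is a rough sketch of how the first three options could be folded into one ranking. It is not a description of how any team does this. The `Service` fields, the `priority` function, and the weights are all hypothetical, picked only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Service:
    # All fields are hypothetical inputs a team might track per service.
    name: str
    days_since_last_experiment: int  # rotation signal
    incidents_last_quarter: int      # incident-history signal
    on_critical_path: bool           # dependency-path signal


def priority(svc: Service,
             w_rotation: float = 1.0,
             w_incidents: float = 5.0,
             w_critical: float = 20.0) -> float:
    """Weighted sum of the three signals; the weights are arbitrary placeholders."""
    return (w_rotation * svc.days_since_last_experiment
            + w_incidents * svc.incidents_last_quarter
            + w_critical * (1 if svc.on_critical_path else 0))


def rank_targets(services: list[Service]) -> list[Service]:
    """Return services ordered from highest to lowest priority."""
    return sorted(services, key=priority, reverse=True)


services = [
    Service("checkout", days_since_last_experiment=10,
            incidents_last_quarter=3, on_critical_path=True),
    Service("search", days_since_last_experiment=40,
            incidents_last_quarter=0, on_critical_path=False),
    Service("avatars", days_since_last_experiment=5,
            incidents_last_quarter=1, on_critical_path=False),
]

ranked = [s.name for s in rank_targets(services)]
print(ranked)
```

With these made-up numbers, a service on the critical path with recent incidents outranks one that simply has not been tested in a while. Whether real teams weight things this way, or rely on intuition instead, is exactly what I'm asking.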

