Server and Application Reliability
“The loveliest trick of the Devil is to persuade you that he does not exist!” – C. Baudelaire, 1864
In IT, the hardest system issue to solve is the one you don’t know (or won’t admit) you have. I’m not saying that IT professionals willfully ignore or explain away instability. Far from it! We tend to work tirelessly to create environments that are robust, that can survive the greatest reasonable amount of perturbation, and that keep delivering consistent services.
It’s the “tirelessly” part that does us in. We are willing to take any given system failure (along with the associated tickets, group calls, management updates, and post-mortems) and not only work doggedly to figure out what happened, but also repeat the same fix, time after time after time, when it crops up again later.
In fact, many of us (myself included) take it as a point of pride when we can say, “I’ve seen this problem before! I know JUST how to fix it!”
That’s the devil in our midst: the belief that manual inspection and intervention are the only options. That’s what keeps us working long hours, burning weekends and holidays, and generally putting the word “tireless” to shame.
The reality is that monitoring tools are improving all the time, keeping pace with advancements in enterprise architectures and application scalability.
And that’s what I like about the story this infographic speaks to. It’s not just that SolarWinds has a set of tools to help you solve a problem. It’s simpler than that: TOOLS EXIST to help you solve this issue. You don’t have to run commands manually, or write your own scripts, or figure out which parameters and counters are relevant. It’s been done.
So break out of the endless cycle of working longer and harder to keep things running.
Because you know what? I bet you’re tired and could use a break.
Improving Service Reliability with Unified Visibility into Servers and Applications