- The absence of points near the centerline
- The absence of points near the control limits
- The presence of points outside the control limits
- Other unnatural patterns (systematic or autocorrelated behavior, repetition, trends)
4 Statistical Process Control Rules That Detect Anomalies in Systems
June 26, 2013
Database
Statistical Process Control (SPC), the use of data to study the characteristics of a process so that we can make it behave the way we want, has been around since 1924, when Walter Shewhart invented it. It has been widely adopted in manufacturing, logistics, and other operationally focused parts of the business world, but it seems less well known in the world of systems administration and web operations. It’s part of the core curriculum in business schools around the world. That may not matter much to IT folks, but it strikes me as strange that it’s not more common in the monitoring world.
Etsy’s release of Skyline, which uses algorithms common to SPC, is a huge win for filtering noise out of metrics and threshold-based systems. Still, in computer systems improbable things happen more often than you’d predict, and SPC methods alone aren’t good enough at preventing false positives: they’re useful for filtering out noise, but many users will still see plenty of false positives. Many of the SPC formulas are written for a process with a stable mean, meaning they measure the output of an end product (e.g., a drill hole size or a defect rate), not dynamically changing metrics. The choice of metrics becomes particularly important when you apply SPC to IT monitoring, because a manufactured end product has no obvious analogue in the computer world.
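To make the stable-mean assumption concrete, here is a minimal Python sketch of the basic idea behind a control chart: estimate a centerline and control limits from a baseline sample, then flag points that fall outside the limits. The function names, the three-sigma width, and the response-time numbers are all illustrative assumptions, not taken from Skyline or any handbook. Using the plain sample standard deviation is also a simplification, since textbook control charts often estimate spread differently.

```python
from statistics import mean, stdev

def control_limits(baseline, sigmas=3.0):
    """Return (lower, center, upper) limits estimated from a baseline sample.

    Assumes the baseline comes from a process with a stable mean.
    """
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation (a simplification)
    return center - sigmas * spread, center, center + sigmas * spread

def points_outside(values, lower, upper):
    """Return the indexes of values that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical response times in milliseconds, invented for illustration.
baseline = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100]
lower, center, upper = control_limits(baseline)

new_values = [101, 99, 120, 100, 80]
flagged = points_outside(new_values, lower, upper)
print(flagged)  # [2, 4]
```

On a metric whose mean drifts over the day, limits computed this way go stale quickly, which is one source of the false positives described above.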
Nonetheless, in 1956 Western Electric published the “Statistical Quality Control Handbook,” which contains some very useful rules for detecting anomalies that may help those implementing SPC in IT systems.
The rules attempt to distinguish unnatural patterns from natural patterns based on several criteria: