I Can Do That, Dave: Exploring AIOps
The movies are filthy with examples of artificial intelligence. Some, like the first Terminator, are evil. Some, like the Star Wars droids, work for the good guys. And so many of them are flatly iconic—Blade Runner (both of them), 2001: A Space Odyssey, War Games, Westworld (the movie and the HBO series), Matrix[i] … the list keeps going and going.
We aren’t at the point yet where we can get R2 to talk to the Millennium Falcon (or the data center) and find out what’s wrong. But the prospect of having that kind of partner excites our imagination, and already we’re taking steps in that direction.
In this last installment of a three-part series, we’ll explore the budding area of artificial intelligence in IT ops—AIOps.
What Is AI Anyway?
If you remember, in part 2 of this series I defined automation as a list of instructions given to a device to be executed at regular time intervals or in response to a defined event. In contrast, artificial intelligence and its narrower subset, machine learning, take the list away. AI designs machines to carry out tasks in a “smart” fashion, reasoning out the answer to a problem for themselves. No more step-by-step instructions. Machine learning is one way of teaching the machine how to reason out its task, by giving the machine access to data and allowing it to learn from the data[ii].
As IT departments get tsunami-ed under the overwhelming amounts of data now being collected about network performance, it becomes easier and easier to see how AI would be helpful. One definition of AIOps even builds this ability to digest data right in:
AIOps refers to the use of artificial intelligence (AI) and machine learning to ingest and analyze large volumes of data from every corner of the IT environment, reducing its complexity by bringing data silos together with the means to filter them, detecting patterns, and clustering meaningful information for more efficient actioning[iii].
AIOps Use Cases: I Can Do That, Dave.
Just like automation, AIOps is particularly suited to some jobs, and there are others it just shouldn’t take on. Like automation, AI can tackle jobs in which there’s too much data for humans to process, help reduce downtime, and streamline IT operations. Unlike automation, AI can, we hope, respond proactively to changes in the network, rather than merely reacting to a defined stimulus.
Some broad areas in which AIOps is seen as having great potential include:
- Capacity planning—mapping workloads to the proper compute configuration based on application need
- Resource utilization—refining already in-use predictive scaling to reconfigure network infrastructure to handle anticipated use
- Storage management—adjusting storage capacity proactively
- Anomaly detection—detecting and analyzing patterns in network performance to identify the root cause
- Threat analysis—riffing on AIOps’ ability to detect anomalies, AIOps can stand watch over the network to guard against potential intrusions and malicious activity[iv]
Of these five major areas, anomaly detection and threat analysis have the greatest potential. Human ability to spot patterns degrades when there’s too much data, but AI handles terabytes of data like a fish in deep water. So future AIOps tools should be able to compensate for that human weakness.
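To make anomaly detection a little less abstract, here is a minimal sketch in Python. It is not how any particular AIOps product works, just one simple statistical approach (a rolling z-score): each new reading is compared against a baseline computed from the readings just before it, so no one has to hard-code a threshold. The function name `find_anomalies`, the `window` and `threshold` parameters, and the latency numbers are all made up for illustration.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=10, threshold=3.0):
    """Return the indices of samples that deviate strongly from the
    baseline formed by the `window` samples immediately before them.

    Illustrative only: the names and defaults here are assumptions,
    not taken from any real AIOps tool.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            # A perfectly flat history gives no scale to measure
            # deviation against, so this sketch skips the sample.
            continue
        z_score = abs(samples[i] - baseline) / spread
        if z_score > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical latency readings in milliseconds, with one obvious spike.
latency_ms = [20, 21, 19, 20, 22, 21, 20, 19, 21, 20, 21, 95, 20, 22]
print(find_anomalies(latency_ms))  # prints [11]
```

The baseline here comes from the data itself rather than from a rule someone wrote down ahead of time, which is the spirit of the distinction between automation and machine learning drawn earlier. Real tools would need to cope with seasonality, many correlated metrics, and far more data than a toy list like this one.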
Of course, AIOps is still in its infancy. And unlike automation, AIOps is simply not a DIY project for today’s IT department. But still, it’s great to look to the future, and dream of the day when R2 can tell you why your own Millennium Falcon—err, data center—can’t go to hyperspace[v].
That concludes this three-part series on ITOM, automation, and AIOps.
[v] Yes. I know Star Wars took place a long time ago in a galaxy far, far away. What kind of geek-girl do you think I am? Just roll with it.