IT is ever-changing. The underlying technology demands change, but even more important is the change of the role of information technology itself.
For a long time, the IT department existed to support the business, and it was merely a cost center.
But modern IT should be driving change and innovation, and it should create a competitive advantage for the organization as a whole.
Sure, it’s a process and a transition, and no matter what stage of that transition IT is in right now, the challenges differ.
One of the pitfalls of the transition is proving the value of the department. Take the five nines as an example: 99.999% availability, which leaves room for barely five minutes of downtime per year. When everything runs flawlessly, no one notices there are people doing their job at an exceptional level. But let downtime exceed eight hours in a year, and the grief starts to grow.
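The arithmetic behind those nines is worth spelling out, because the gap between "five nines" and "eight hours of downtime" is enormous. A quick sketch:

```python
# Downtime budget implied by an availability target (a quick sanity check).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_budget_minutes(availability: float) -> float:
    """Minutes of allowed downtime per year for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability)

print(f"five nines  (99.999%): {downtime_budget_minutes(0.99999):.2f} min/year")
print(f"three nines (99.9%):   {downtime_budget_minutes(0.999) / 60:.2f} hours/year")
```

Five nines allows roughly 5.26 minutes of downtime a year; eight-plus hours of downtime means the organization is down at the three-nines level, nearly a hundred times the budget.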
Therefore, it’s important to measure the efficiency of the department. When looking at managing infrastructure—physical or virtual—organizations have collected availability statistics for many years just to prove a point: it works, and we did our job.
Unfortunately, this is no longer sufficient. IT is a strategic part of the company and therefore a strategic resource, and efficiency needs to be measured differently.
The general rule of thumb is this: the bigger and older the organization, the more complex the IT infrastructure. There are loads of legacy systems IT can’t replace easily, and crucial infrastructure is scattered all over the place.
Collecting data from every single element is possible but not necessarily meaningful.
How does the CPU co-stop value of a host translate to the success of the business?
The tools used to manage infrastructure should show key performance indicators with statistical significance.
But unfortunately, the tools can’t know what’s significant for each organization.
If the business is an automotive supply chain, for example, the value is probably “goods produced per minute”—but for a travel agency, it’s “hotel rooms booked per minute.”
Proper management tools don’t come with a crystal ball to read these values, but they do offer open interfaces to pull data from various other systems and bring it into relation to other data points collected.
And this is where IT can show its real value. If this magic number goes down, it’s important not only to understand why but to know about it as soon as possible.
If the supporting data doesn’t show why there’s an issue, you’re probably collecting the wrong metrics.
An application is usually distributed across different layers and locations. The right management tool needs to be able to monitor the individual elements in the delivery chain and the connectivity between them—this is something people often forget.
As an example, employees use a web application and only interact with a browser. This can happen from the office or from home, and this is the only user interface they ever see. But most likely, the front end is running in a public cloud to increase reach and availability while the data is stored in a secured private cloud or even a local data center. On the “other side,” customers access a subset of the data from a different user interface, which is likely running in the cloud too, though not necessarily from the same provider.
Overall, this is quite a fragile construct, and a failure in one part of the chain will bring down the whole system—and probably the business with it. And though it’s true the expectation of failure should be part of every smart IT strategy, it’s also true that mean time to resolve is another significant performance indicator.
So how can a tool increase the value IT brings for the business?
Bad news first: yes, it’s necessary to collect loads of performance indicators, even if they don’t have much meaning from an isolated point of view. This also includes the CPU co-stop. But put this data together with other values, and they begin to make sense and can show the overall health and performance of the environment. Once more advanced relationships are built, let’s say between different applications speaking to each other, those values become crucial. Suddenly, they show the hotel room bookings per minute or whatever is important to the business.
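To make that concrete, here is a minimal sketch of relating an isolated infrastructure metric to a business KPI. The metric names and all the numbers are invented for illustration; real tools would correlate far more series, but even a plain Pearson correlation can surface the relationship:

```python
# Hypothetical sketch: relating an infrastructure metric to a business KPI.
# Both series are assumed to be sampled on the same one-minute intervals;
# the names and numbers below are invented for illustration.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

cpu_costop_ms = [2, 3, 2, 40, 55, 60, 4, 3]          # per-minute co-stop on a host
bookings_per_min = [50, 52, 49, 20, 12, 10, 48, 51]  # business KPI, same minutes

r = pearson(cpu_costop_ms, bookings_per_min)
print(f"correlation: {r:.2f}")  # strongly negative: co-stop spikes, bookings drop
```

In isolation the co-stop value says nothing about the business; next to the bookings series, it suddenly explains a revenue dip.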
It’s also important to collect key performance indicators for a longer time frame to take guesswork out of the equation and replace it with educated knowledge. Modern tools help with automated baselines, machine learning, and anomaly detection.
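A baseline doesn’t have to involve heavy machine learning to be useful. The sketch below flags samples that sit far outside a trailing window of recent history; the window size and threshold are invented defaults, where real tools learn these per metric:

```python
# A minimal sketch of an automated baseline with simple anomaly detection:
# flag samples more than 3 standard deviations from the trailing window's mean.
# Window size and threshold are invented; real tools tune these per metric.
from collections import deque
from statistics import mean, stdev

def detect_anomalies(samples, window=10, threshold=3.0):
    baseline = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(samples):
        if len(baseline) == window:
            mu, sigma = mean(baseline), stdev(baseline)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append(i)
        baseline.append(value)
    return anomalies

# A steady metric with one sudden spike at index 15:
series = [100, 101, 99, 100, 102, 98, 100, 101, 99, 100,
          100, 101, 99, 100, 101, 400, 100, 99, 101, 100]
print(detect_anomalies(series))  # → [15]
```

The point is the shape of the approach, not the math: the baseline is learned from the data itself, so nobody has to guess static thresholds for thousands of metrics.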
Such technologies are invaluable in a modern IT environment, as the number of data points is no longer manageable for a human.
The more fragmented the infrastructure, the more complex it is to normalize data from different sources. In many organizations, we still see an AS/400 (or one of its successors) as a workhorse, and as long as there are IT professionals available who can work with it, it has a reason to exist.
But management tools need to be able to pull data from such systems the same way they do from a modern multi-cloud environment.
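What "the same way" means in practice is normalization into a common schema, regardless of how each source formats its output. The sketch below uses invented stand-ins for a legacy fixed-width export and a cloud API payload:

```python
# A hedged sketch of normalizing metrics from heterogeneous sources into one
# schema. The source formats here are invented stand-ins for whatever a
# legacy system and a cloud monitoring API actually emit.
from dataclasses import dataclass

@dataclass
class Metric:
    source: str
    name: str
    value: float
    unit: str

def from_legacy_record(line: str) -> Metric:
    # e.g. a whitespace-separated export line: "AS400 CPUBUSY 000073"
    source, name, raw = line.split()
    return Metric(source=source, name=name.lower(), value=float(raw), unit="percent")

def from_cloud_payload(payload: dict) -> Metric:
    # e.g. a JSON document from a cloud provider's monitoring API
    return Metric(source=payload["provider"], name=payload["metric"],
                  value=payload["datapoint"], unit=payload["unit"])

metrics = [
    from_legacy_record("AS400 CPUBUSY 000073"),
    from_cloud_payload({"provider": "cloud-a", "metric": "cpubusy",
                        "datapoint": 41.0, "unit": "percent"}),
]
for m in metrics:
    print(f"{m.source}: {m.name} = {m.value} {m.unit}")
```

Once everything lands in one schema, the correlation and baselining described above can treat a thirty-year-old midrange system and a cloud service as equals.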
In IT monitoring, there’s no place for vendor lock-in.
On the opposite end of technological advancement, it’s necessary for the same tool to understand modern technologies like containers. If an application is delivered based on multiple dynamic containers, it’s no longer important for the IT pro to know which container is running right now or where it’s running. But management solutions need to understand the severity of the situation when n+1 redundancy is no longer met, and they should warn the responsible party.
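The alerting logic itself can be stated simply: not "which container died," but "does the service still have a spare?" A minimal sketch, with invented names and thresholds:

```python
# Hypothetical sketch: the tool need not care which containers run where,
# only whether the service still has n+1 redundancy, meaning one spare
# replica beyond the minimum needed to carry the load. Names are invented.

def redundancy_status(running: int, required: int) -> str:
    """Classify a service's replica count against an n+1 target."""
    if running < required:
        return "CRITICAL: below minimum capacity"
    if running == required:
        return "WARNING: n+1 redundancy lost, no spare left"
    return "OK"

# The service needs 3 replicas to carry the load; 4 means n+1 is satisfied.
print(redundancy_status(running=4, required=3))  # → OK
print(redundancy_status(running=3, required=3))  # WARNING: still up, no headroom
print(redundancy_status(running=2, required=3))  # CRITICAL: users are affected
```

The important design choice is the middle state: the application is still running at n, so a naive up/down check stays green, but the warning is exactly when the responsible party needs to act.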
There are loads of challenges in IT infrastructure monitoring, and it’s important for tools to advance at the same pace as the underlying technology. What works fine today might not be able to deal with the change tomorrow brings.
Considering the increased visibility of the IT department as a whole, it’s crucial to prove its integral value to the business.