Three Ways to Become Data-Centric
January 10, 2019
Database
The conservation of quantum information is a theory that information can neither be created nor destroyed. Stephen Hawking used this theory to explain how a black hole does not consume photons like a giant cosmic eraser. It is clear to me that neither Stephen Hawking, nor any quantum physicist, has ever worked in IT.
Outside the realm of quantum mechanics, in the physical world of corporate offices, information is generated, curated, and consumed at a faster pace with each passing year. The one similarity between the corporate world and the quantum realm is that this data is never destroyed.
We are now a nation, and a world, of data hoarders.
Thanks to popular practices such as DevOps, we are obsessed with telemetry and observability. System administrators are keen to collect as much diagnostic information as possible to help troubleshoot servers and applications when they fail. And the Internet of Things has a billion devices broadcasting data that is easily ingested into Azure and AWS.
All of this data hoarding is leading to a rapidly growing pile of ROT (Redundant, Outdated, Trivial information).
Stop the madness.
It’s time to shift our way of thinking about how we collect data. We need to become more data-centric and do less data-hoarding.
Becoming data-centric means that you define goals and problems to be solved BEFORE you collect or analyze data. Once these goals or problems are defined, you can begin the process of collecting the necessary data. You want to collect the right data to help you make informed decisions about what actions are necessary.
Here are three ways for you to get started on becoming more data-centric in your current role.
Start with the question you want answered. This doesn’t have to be a complicated question. Something as simple as, “How many times was this server rebooted?” is a fine question to ask. You could also ask, “How long does it take for a server to reboot?” These examples may seem like simple questions, but you may be surprised to find that your current data collections do not allow for an easy answer without a bit of data wrangling.
Have an end-goal statement in mind. Once you have your question(s) and you have settled on the correct data to be collected, you should think about the desired output. For example, perhaps you want to put the information into a simple slide deck, or maybe build a real-time dashboard in Power BI. Knowing the end goal may influence how you collect your data.
Learn to ask good questions. Questions should help to uncover facts, not opinions. Don’t let your opinions affect how you collect or analyze your data. It is important to understand that every question is based upon assumptions, and it’s up to you to decide if those assumptions are safe. An assumption is considered safe if it is something that can be measured. For example, your gut may tell you that server reboots are a result of O/S patches being applied too frequently. Instead of asking, “How frequently are patches applied?” a better question would be, “How many patches require a reboot?” You can then compare that number to the overall number of server reboots.
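To make the reboot questions above concrete, here is a minimal Python sketch of what answering them might look like. Everything in it is an assumption for illustration: the `EVENTS` records, the server name `web01`, and the event names are hypothetical and do not come from any real log format. The point is only that once the question is fixed, the data you need to collect becomes small and obvious.

```python
from datetime import datetime

# Hypothetical event records: (timestamp, server, event).
# The layout and event names are invented for this sketch.
EVENTS = [
    ("2019-01-02 03:00:00", "web01", "patch_reboot_required"),
    ("2019-01-02 03:05:00", "web01", "shutdown"),
    ("2019-01-02 03:07:30", "web01", "startup"),
    ("2019-01-05 14:20:00", "web01", "shutdown"),
    ("2019-01-05 14:21:00", "web01", "startup"),
    ("2019-01-08 03:00:00", "web01", "patch_no_reboot"),
]


def parse(ts):
    """Turn a 'YYYY-MM-DD HH:MM:SS' string into a datetime."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


def reboot_durations(events, server):
    """Pair each shutdown with the next startup and return the gaps in seconds."""
    durations = []
    shutdown_at = None
    # The timestamp format sorts correctly as plain text.
    for ts, host, event in sorted(events):
        if host != server:
            continue
        if event == "shutdown":
            shutdown_at = parse(ts)
        elif event == "startup" and shutdown_at is not None:
            durations.append((parse(ts) - shutdown_at).total_seconds())
            shutdown_at = None
    return durations


def patches_requiring_reboot(events, server):
    """Count patches flagged as needing a reboot."""
    return sum(
        1 for _, host, event in events
        if host == server and event == "patch_reboot_required"
    )


durations = reboot_durations(EVENTS, "web01")
reboot_count = len(durations)                      # How many times was it rebooted?
average_seconds = sum(durations) / reboot_count    # How long does a reboot take?
patch_reboots = patches_requiring_reboot(EVENTS, "web01")

print(f"Reboots: {reboot_count}")
print(f"Average reboot time: {average_seconds:.0f} seconds")
print(f"Patches requiring a reboot: {patch_reboots} of {reboot_count} reboots")
```

Notice that the sketch only works because shutdown and startup are recorded as separate, timestamped events, and because each patch records whether it needs a reboot. If your current collection lacks either of those, that is the data wrangling gap the first question is meant to expose.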