
Improving Data Quality Through DataOps

Consider this scenario. You pull up your bank’s website and open the screen showing your account balance. It takes 30 seconds to return the number. You’re a bit annoyed at the delay, and on reflection the balance seems a bit low, so you hit refresh in case you misread it. Forty-five seconds later, it comes back with a different number. You know you haven’t had any transactions in the last minute. Uh oh. Something’s not right. You call the bank to see what’s going on, and customer service admits the system has been running slowly since a recent upgrade. The rep sounds annoyed. You ask for your balance, and they give you a third number. From this point forward, you’re probably shopping for a new bank to move your money to.

The largest challenge for data engineering is ensuring the data an application presents to the end user is both accurate and delivered quickly enough to keep the application responsive. Returning inaccurate or inconsistent query results is possibly the fastest way to erode your end users’ trust in your applications, and that trust may never be fully restored. Data errors not only cause small-scale issues like the scenario above; they can also be larger in scope and cause unexpected application outages.

The performance of data delivery to the user is also a significant concern. Data continues to grow more complex, while end user expectations have changed dramatically. Legacy applications might have returned data slowly, and users were accustomed to that response time; modern application frameworks and mobile devices return data virtually instantly, and end users have grown accustomed to immediate gratification. A small delay in delivery can now cause the end user to perceive a problem, whereas in the past it might have been shrugged off.
While software development methodologies have improved their process and quality controls tremendously in the past decade, database engineering has not kept up. Many data integration and verification processes are manual and inconsistently implemented, if such processes exist at all. In other cases, the tooling designed to make a specific process more efficient strengthens the isolation between IT silos, which hurts collaboration. It’s time for your organization to adopt a modern methodology called “DataOps” to improve data quality throughout the entire data life cycle. The core principles of DataOps are designed to help organizations achieve several goals, including:
  • Solidify the data platform foundation
  • Provide reusable and repeatable components for data integration
  • Automate the validation process confirming data queries are delivering accurate data
  • Identify change and its impact
  • Monitor the end-to-end process for key metrics to help ensure the quality and speed of innovation and delivery are improving
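To make the automated-validation goal above concrete, here is a minimal sketch of the kind of check a DataOps pipeline might run after a data integration step. The table shape, column names (`account_id`, `balance`), and the `validate_load` function are all hypothetical illustrations, not part of any specific product:

```python
# Hypothetical sketch: automatically validate that data loaded into a
# target store still matches the source system it came from.

def validate_load(source_rows, target_rows, key="account_id", amount="balance"):
    """Return a list of human-readable discrepancies between source and target."""
    issues = []

    # 1. Row counts should match after the integration step.
    if len(source_rows) != len(target_rows):
        issues.append(f"row count mismatch: {len(source_rows)} vs {len(target_rows)}")

    # 2. Every source key should appear in the target with an identical value.
    target_by_key = {row[key]: row for row in target_rows}
    for row in source_rows:
        match = target_by_key.get(row[key])
        if match is None:
            issues.append(f"missing key {row[key]} in target")
        elif match[amount] != row[amount]:
            issues.append(
                f"value drift for {row[key]}: {row[amount]} != {match[amount]}"
            )

    return issues


# Illustrative data: account 2 drifted during the load.
source = [{"account_id": 1, "balance": 100.0}, {"account_id": 2, "balance": 250.0}]
target = [{"account_id": 1, "balance": 100.0}, {"account_id": 2, "balance": 245.0}]
print(validate_load(source, target))
```

Running a check like this on every pipeline execution, and failing the build when the issue list is non-empty, is one simple way to turn the “confirm queries deliver accurate data” principle into an automated gate rather than a manual spot check.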
DataOps provides fine-grained detail about application behavior immediately back to developers and testers at every stage of the data engineering process, as shown in Figure 1.

Figure 1: Data engineering workflow is improved with DataOps

DataOps is difficult to do without help, though, which is why specific tooling can be introduced at key steps to improve the feedback and quality of each process. For example, SolarWinds has a complete suite of products designed to make DataOps an almost frictionless upgrade to your environment.
  • SolarWinds Database Mapper can map and record the state of the data platform and notify administrators if a data platform component changes unexpectedly.
  • SolarWinds Task Factory can introduce reusable and predictable data integration frameworks to streamline the data integration processes, reducing the human error in manual tasks and guaranteeing expected results.
  • SolarWinds SQL Sentry monitors the performance of the SQL Server data platform and queries through the entire process to make sure performance is maintained.
Data quality is at the core of using these business-critical applications to improve your organization’s competitive edge. Embracing the key principles and tools of DataOps, and thereby modernizing your data engineering processes, is vital to helping your organization deliver more value and to letting your IT organization move at the speed of your business.
David Klee
David Klee is a Microsoft MVP, VMware vExpert, SQL Server performance enthusiast, and virtualization architect. His areas of expertise are virtualization and performance, data center…