Balancing Stability and Agility
“The price of reliability is the pursuit of the utmost simplicity.” C.A.R. Hoare, Turing Award lecture.
Software and computers in general are inherently dynamic and not of a state of stasis. The only way IT, servers, software, or any other thing that has 1s and 0s on it can be perfectly stable is if they exist in a vacuum. If we think about older systems that were offline, we frequently had higher levels of stability–the exchange for that is fewer updates, new features, and longer development and feedback cycles, which meant you could wait years for simple fixes to a relatively basic problem. One of the goals of IT management should always be to keep these forces in check–agility and stability.
AGILE’S EFFECT ON ENTERPRISE SOFTWARE
Widespread of adoption of Agile frameworks across development organizations has meant that even enterprise focused organizations like Microsoft have shortened release cycles on major products to (in some cases) less than a year, and if you are using cloud services, as short as a month. If you work in an organization that does a lot of custom development, you may be used to daily or even hourly builds of application software. This causes a couple of challenges for traditional IT organizations in supporting new releases of enterprise software like Windows or SQL Server, and also supporting developers in their organization who are employing continuous integration/continuous deployment (CI/CD) methodologies in their organizations.
HOW THIS CHANGES OPERATIONS
First, let’s talk about supporting new releases of enterprise software like operating systems and relational database management systems (RDBMS). I was recently speaking at a conference where I was asked, “How are large enterprises who have general patch management teams supposed to keep up with a monthly patch cycle for all products?” This was a hard answer to deliver, but since the rest of the world has changed, your processes need to be changed along with them. Just like you shifted from physical machines to virtual machines, you need be able to adjust your operations processes to deal with more frequent patching cycles. It’s not just about the new functionality that you are missing out on. The array and depth of security threats means software is patched more frequently than ever, and if you aren’t patching your systems, your systems are vulnerable to threats from both internal and external vectors.
HOW OPERATIONS CAN HELP DEV
While as an admin I still get nervous about pushing out patches on the first day, the important thing is to develop a process to apply updates in near real-time to dev/test environments, with automated error checking, to then relatively quickly move the same patches into QA and production environments. If you lack development environments, you can patch your lower priority systems first, before moving on to higher priority systems.
Supporting internal applications is a little bit of a different story. As your development teams move to more frequent build processes, you need to maintain infrastructure support for them. One angle for this can be to move to a container-based deployment model–the advantage there is that the developers become responsible for shipping the libraries and other OS requirements their new features require, as they are shipped with the application code. Whatever approach you take, you want to focus on automating your responses to errors that are generated by frequent deployments, and work with your development teams to do smaller releases that allow for easier isolation of errors.
The IT world (and the broader world in general) has shifted to a cycle of faster software releases and faster adoption of features. This all means IT operations has to move faster to support both vendor and internally developed applications, which can be a big shift for many legacy IT organizations. Automation, smart administration, and more frequent testing will be how you make this happen in your organization.