Having and Testing Backups is Key to Running Systems
The title of this post says it all. In IT, we’ve said for years you need to take and test backups regularly to ensure your systems are being backed up correctly. There’s an adage in IT: “The only good backup is a tested backup.”
Why Do You Need Backups?
I bring this up because I have people coming to our consulting firm after some catastrophic event has occurred, and the first thing I ask for are the backups. Then the conversation usually gets awkward as they try to explain why there are no backups available. The reasons usually run the gamut from “we forgot to set up backups” to “the backup system ran out of space” to “the backup system failed months ago, and no one fixed it.”
Whatever the reason, the result is the same: there are no backups, there’s a critical failure, and no one wants to explain to the boss why they can’t recover the system. The system is down, and the normal recovery options aren’t available for one reason or another. In these cases, when there’s no backup available, the question becomes, “How critical is this to your business staying operational?” If it’s a truly critical part of your infrastructure, then the backups should be just as critical to the infrastructure as the system is. Those backups need to be taken and then tested to ensure the backup solution meets the needs of the company (and that the backups are being taken).
When planning the backups for a key system, the business owners need to be involved in setting the backup and recovery policies; after all, they’re responsible for the data. In IT terms, this is the Recovery Point Objective (RPO) and the Recovery Time Objective (RTO). In layman’s terms, these are how much data can the organization afford to lose and how long it takes to bring the system back online. These numbers can be anything they need to be, but the smaller they are, the larger the financial cost will be in completing this request. If the business wants an RPO and RTO of 0, and the business is willing to pay for it, then it’s IT's job to complete the request even if we don’t agree with it. And that means running test restores of those backups frequently, perhaps very frequently, as we need to ensure the backups we’re taking of the system are working.
Why Is It Important to Test Backups?
Testing backups should be done no matter what kind of backups are being taken. If you think you can skip on test restores because the backup platform reports the backup was successful, then you’re failing at taking backups. One of the mantras of IT is “trust but verify.” We trust that the backup software package did the backup it says it was going to do, but we verify it did the backup by testing it. If there’s a problem where the backup can’t be restored, it’s much better to find out when doing a test restore of the backup than when you need to restore the production system. If you wait until the production system needs to be restored to find out the backup failed, there’s going to be a lot of explaining to do as to why the system can’t be restored and what sort of impact that might have on the business—including, potentially, the company going out of business.