Storage, Replication, and Backups
September 20, 2019
Database
This is the fourth post of my series on hyperconverged infrastructure (HCI) architectural design and decision-making. For my money, the differences between these diverse systems are largely a function of the storage involved in the design. On the compute side, these environments use x86 servers and a hypervisor to create a cluster of hosts that supports a virtual machine environment. Beyond some nuances in the hardware-based models, networking tends toward a similar approach in each. But the storage infrastructure is often a differentiator.
Back in 2008, LeftHand Networks (later acquired by HPE) introduced the concept of a virtual storage appliance. In this model, the storage resides within the ESX servers in the cluster and is aggregated into a virtual iSCSI SAN, with redundancy provided across the nodes. Should an ESX host crash, the VMs would restart on another host in the cluster as usual, and the storage would remain consistent and available throughout. By today’s standards it’s hardly inelegant, but it lacks some of the functionality of, for example, vSAN. VMware vSAN follows a similar model, but can also incorporate deduplication, compression, and hybrid or all-flash configurations. To me, vSAN running on vSAN-ready nodes, which is also a component of the Dell EMC VxRail product, is a modernized version of what LeftHand brought to the table some 11 years ago. It’s a great model: it eliminates the need for a company building a virtualized infrastructure to also purchase a more traditional SAN/NAS to connect to the virtualized environment, which makes it more cost-effective in both acquisition and ongoing management.
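To make the capacity side of that trade-off concrete, here is a minimal sketch (in Python, with purely illustrative numbers) of how the protection policy and data-reduction assumptions shape the usable capacity of a cluster built from internal drives. The 2x/1.33x/1.5x multipliers reflect the standard overheads of mirroring and erasure coding; the node count, drive capacity, and deduplication/compression ratio are hypothetical assumptions, not sizing guidance.

```python
# Rough capacity sketch for a server-based storage cluster (vSAN-style).
# All numbers here are illustrative assumptions, not sizing guidance.

RAW_TB_PER_NODE = 8      # hypothetical raw capacity per node
NODES = 4

# Raw space consumed per usable TB under common protection policies:
#   RAID-1 mirroring, FTT=1      -> 2.0x
#   RAID-5 erasure coding, FTT=1 -> ~1.33x
#   RAID-6 erasure coding, FTT=2 -> 1.5x
POLICIES = {"RAID-1 (FTT=1)": 2.0, "RAID-5 (FTT=1)": 1.33, "RAID-6 (FTT=2)": 1.5}

DEDUP_COMPRESSION_RATIO = 1.7   # assumed data-reduction ratio; workload dependent

raw_total = RAW_TB_PER_NODE * NODES
for policy, multiplier in POLICIES.items():
    usable = raw_total / multiplier
    effective = usable * DEDUP_COMPRESSION_RATIO
    print(f"{policy}: {usable:.1f} TB usable, ~{effective:.1f} TB effective after reduction")
```

The point of the sketch is simply that the same raw hardware yields very different effective capacity depending on policy and workload, which is why these knobs matter when comparing platforms.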
Other companies in the space have leveraged the server-based storage model. The two that spring most rapidly to mind are Nutanix and SimpliVity, which have built solutions as packaged, single-SKU boxes around a similar model. Of course, the way each environment is managed differs, but both support the goal of managing a virtual landscape, with some points of differentiation (Nutanix supports its own hypervisor, Acropolis, which nobody else does). From a hardware perspective, the concept of packaged equipment sized for a particular environment is practically the same: x86 servers run the hypervisor, with storage internal to each node of the cluster.
I’ve talked previously about scalability issues that may or may not affect end users, so I won’t go deeper into them here. Feel free to check out my earlier posts on how cluster scalability can cause consternation about growth.
But storage issues are still key, regardless of the platform you choose; I believe storage is one of only two or three issues of primary concern. Compression, deduplication, and how efficiently SSDs are incorporated matter, but there’s more. A major use case for HCI is the hub-and-spoke approach, in which the HCI clusters sit on the periphery and a more centralized data center acts as the hub. The key to backing up data in that model is storage-aware replication of all changed data from the remote sites back to the hub.
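As a conceptual illustration of that replication pattern, the sketch below models an edge site tracking which blocks changed between replication cycles and shipping only those deltas to the hub. Everything here (the EdgeSite class, the in-memory “hub,” and the block-level granularity) is a hypothetical stand-in for the storage-aware change tracking a real HCI platform provides inside its storage layer.

```python
# Conceptual sketch of changed-block (delta) replication from an edge HCI
# cluster to a central hub. Names and structures are hypothetical; a real
# platform tracks changes inside the storage layer, not in application code.

class EdgeSite:
    def __init__(self, name):
        self.name = name
        self.blocks = {}        # block_id -> data
        self.dirty = set()      # block_ids changed since the last replication

    def write(self, block_id, data):
        self.blocks[block_id] = data
        self.dirty.add(block_id)          # storage-aware change tracking

    def replicate_to(self, hub):
        # Ship only the blocks that changed since the last cycle.
        delta = {b: self.blocks[b] for b in self.dirty}
        hub.setdefault(self.name, {}).update(delta)
        self.dirty.clear()
        return len(delta)

hub = {}  # the central data center's copy, keyed by remote site
remote = EdgeSite("branch-01")

remote.write(1, b"invoice batch")
remote.write(2, b"vdi profile")
print(remote.replicate_to(hub))   # 2 blocks shipped on the first cycle

remote.write(2, b"vdi profile v2")
print(remote.replicate_to(hub))   # only 1 changed block shipped this cycle
```

The design point is that only the changed data crosses the WAN each cycle, which is what keeps remote-to-hub replication practical over constrained links.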
Many of the implementations I’ve been part of have used HCI for ROBO (remote office/branch office), VDI, or a key application, and those roles require a forward-thinking approach to backing up their datasets. If you, as the decision-maker, value that piece as well, look at how the new infrastructure would handle the data and replicate it (ideally with no performance impact on the system) so that all data is easily recoverable.
When I enter these conversations and the customer hasn’t concerned themselves with backup or security from the ground up, mistakes are already being made. I try to emphasize that this is likely the key consideration, and it needs to be addressed from the beginning.