Designing a Monitoring Solution: Where Are Your Eyes?
Contemporary monitoring solutions are no different from any other distributed application in your environment. Even a small-scale deployment can include multiple components: often, these include a Web server, an application server, and a database server. And as with any distributed application, you need to consider placement for each component. More specifically, when you are charged with designing a monitoring solution for a virtualized environment, you need to consider whether you’ll virtualize your monitoring solution.
When you design a monitoring solution, you need to consider more than just what you’re monitoring. You need to consider where you’re monitoring. And where you monitor depends on why you’re monitoring.
Monitoring for Performance
If your primary reason for deploying a monitoring solution is to collect information on the performance of your virtualization infrastructure and applications, consider placing your monitors within the environment. That is, virtualize the servers that will poll for data, and mix them in with your other VMs. In this scenario, the performance monitors will enjoy all of the benefits associated with a virtual machine hosted in a cluster (high availability, quick configuration changes, and rapid backup and recovery, to name a few). And depending on your virtual networking topology, the collectors may have much faster, lower-latency access to the systems they’re monitoring.
It’s worth noting that when you place your monitor within the virtual environment, your observed metrics will be affected by the monitor itself. (Observer effect, anyone?) In other words, the load created by your monitoring solution will inflate the measured utilization of your virtualization infrastructure. In most deployments this overhead is small relative to the total capacity of the cluster, so aside from the novelty, it can be safely ignored.
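To judge whether the monitor's own load really is negligible, you can compare it with the capacity of the cluster it runs in. The following is a minimal sketch of that arithmetic; the function names and the figures in the usage note are hypothetical and do not come from any particular monitoring product.

```python
def monitoring_overhead_pct(monitor_cpu_mhz: float, cluster_cpu_mhz: float) -> float:
    """Share of total cluster CPU capacity consumed by the monitoring VMs, in percent."""
    if cluster_cpu_mhz <= 0:
        raise ValueError("cluster capacity must be positive")
    return 100.0 * monitor_cpu_mhz / cluster_cpu_mhz


def adjusted_utilization_pct(observed_pct: float, overhead_pct: float) -> float:
    """Observed cluster utilization with the monitor's own share subtracted."""
    return max(0.0, observed_pct - overhead_pct)


if __name__ == "__main__":
    # Hypothetical figures: monitoring VMs using 2,000 MHz in a 400,000 MHz cluster.
    overhead = monitoring_overhead_pct(2_000, 400_000)
    print(f"monitoring overhead: {overhead:.2f}%")
    print(f"utilization without the monitor: {adjusted_utilization_pct(62.0, overhead):.2f}%")
```

With these assumed numbers the monitor accounts for half a percent of cluster CPU, which supports ignoring it. If the same calculation yields several percent in your environment, the overhead deserves a second look.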
Monitoring for Infrastructure Availability
When your primary goal of monitoring is to measure the availability of your infrastructure, place your pollers outside of the virtualized environment. The purpose of this topology is to isolate your monitoring solution from any problems within the virtualization infrastructure.
To use an extreme example, let’s say that your vSphere® hosts rely on a Fibre Channel fabric to connect to shared storage. You’ve got 500 VMs spread across 12 hosts, and your systems monitor is virtualized in the same cluster. A rogue SAN engineer botches a config update on your switches, and suddenly none of your vSphere hosts can connect to the datastores. The VMs are stunned, then go dark. Normally, alarms would bathe your operations dashboard in a sea of crimson, but the outage just took out your monitoring capability. Of course, your customers will serve as your backup monitoring solution.
In some cases, vSphere administrators choose to deploy a management cluster to address this condition. A management cluster is a cluster that hosts administrative VMs, and can be completely isolated (save for network access) from your production VMs. The gamble here is that any catastrophic outage that may interrupt your production environment will not affect your management cluster. In this case, your monitoring systems will continue to function as designed, detect the outage on the production side, and log the events for notification and escalation accordingly.
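As an illustration of what a poller placed outside the production cluster might do, here is a minimal sketch that tests TCP reachability of each host and reports the ones that do not answer. The host names, the choice of port 443, and the function name are assumptions made for the example; a real monitoring product would do considerably more.

```python
import socket
from typing import Callable, Iterable, List


def unreachable_hosts(
    hosts: Iterable[str],
    port: int = 443,  # assumed management port; adjust for your environment
    timeout: float = 2.0,
    connect: Callable = socket.create_connection,
) -> List[str]:
    """Return the hosts that did not accept a TCP connection on the given port."""
    down = []
    for host in hosts:
        try:
            conn = connect((host, port), timeout)
            conn.close()
        except OSError:
            # Covers refused connections, timeouts, and name-resolution failures.
            down.append(host)
    return down


if __name__ == "__main__":
    # Hypothetical host names.
    production_hosts = ["esx01.example.com", "esx02.example.com"]
    for host in unreachable_hosts(production_hosts):
        print(f"ALERT: {host} is unreachable from the management side")
```

Because this script runs outside the production cluster, it keeps reporting even when every production host is down, which is exactly the case a co-located monitor would miss. Note that a TCP check of this kind only shows that a host answers on the network; it would not by itself reveal a storage-path failure like the one in the example above.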
Monitoring for Service Availability
If your main objective is to monitor the services you provide to your users and customers, consider deploying a cloud-based monitoring solution. While these solutions are relatively new to the market, they fill a vital gap in traditional monitoring: measuring service availability from a user’s perspective.
For example, suppose you’re hosting several dozen websites that are globally available. You’ve got an internal systems monitoring tool in place, and your infrastructure dashboard is nothing but a forest of glowing green dots. But at your network edge, a firewall ruleset was put in place that blocked all of your Web traffic, and your sites are down. If you rely solely on your internal monitoring, you’ll be under the impression that all systems are healthy. And yet the sole purpose of your infrastructure, to serve content to your users via the Web, is broken. Good luck telling your CIO that your servers were up even though the sites were down.
A SaaS monitoring tool is blissfully unaware of the minutiae of your infrastructure. It’s functionally a user. It attempts to connect to a Web page; that attempt is either successful or not. SaaS monitoring provides a user-centric perspective into your services, not a technology-centric view into your infrastructure.
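The user's-eye check described here boils down to something very simple. The sketch below uses only Python's standard library and is meant to be run from a machine outside your network; the function names and the example URL are hypothetical, and a commercial SaaS monitor would add multiple probe locations, content checks, and alerting on top of this.

```python
import urllib.error
import urllib.request
from typing import Dict, Iterable


def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL can be fetched the way a user's browser would fetch it."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # urlopen raises HTTPError (a URLError subclass) for 4xx/5xx responses;
        # blocked traffic typically shows up as a timeout or refused connection;
        # a malformed URL raises ValueError.
        return False


def check_sites(urls: Iterable[str], timeout: float = 5.0) -> Dict[str, bool]:
    """Probe each site and map its URL to True (reachable) or False (not)."""
    return {url: probe(url, timeout) for url in urls}


if __name__ == "__main__":
    # Hypothetical site list.
    for url, ok in check_sites(["https://www.example.com/"]).items():
        print(f"{url}: {'UP' if ok else 'DOWN'}")
```

The probe knows nothing about servers, firewalls, or storage. If the edge firewall in the example blocks Web traffic, the request fails and the site is reported as down, regardless of how green the internal dashboard looks.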
Where Are Your Eyes?
It’s often said that a monitoring tool is the eyes of (and on) your operation; the placement of these eyes will determine what they’re able to see. Likewise, the placement of your monitoring solution will determine what it’s able to measure. Clearly and accurately define the primary role of your monitoring solution to determine the ideal topology.
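The three scenarios in this article can be summarized as a simple lookup from monitoring goal to placement. This is only a sketch of the article's own recommendations; the enum and function names are invented for the illustration.

```python
from enum import Enum


class Goal(Enum):
    PERFORMANCE = "performance"
    INFRASTRUCTURE_AVAILABILITY = "infrastructure availability"
    SERVICE_AVAILABILITY = "service availability"


# Placement suggested in each section of the article.
PLACEMENT = {
    Goal.PERFORMANCE: "inside the virtual environment, alongside the monitored VMs",
    Goal.INFRASTRUCTURE_AVAILABILITY: "outside the production environment, "
    "for example on an isolated management cluster",
    Goal.SERVICE_AVAILABILITY: "outside your network, for example a cloud-based (SaaS) monitor",
}


def recommended_placement(goal: Goal) -> str:
    """Return the monitor placement the article suggests for a primary goal."""
    return PLACEMENT[goal]


if __name__ == "__main__":
    for goal in Goal:
        print(f"{goal.value}: {recommended_placement(goal)}")
```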