Getting the On-Prem Part of Hybrid Cloud Right

In my previous articles, I’ve looked at the challenges and decisions people face when undertaking digital transformation and transitioning to a hybrid cloud model of IT operation. Whether it’s using public cloud infrastructure or changing operations to leverage containers and microservices, we all know that not everything can, or even should, move to the public cloud. There’s still a need to run applications in-house, yet we all still want the benefits of the public cloud in our own data centers. How do you go about addressing these private cloud needs?

Let’s take an example scenario. You’ve decided it’s time to upgrade the company’s virtual machine estate and roll out the latest and greatest version of your hypervisor software. You know that there are heaps of great new features in there that will make all the difference in your day-to-day operations, but that doesn’t mean anything to your board or to the finance director, whom you need to win over to get the green light to purchase. You need to present a proposal containing a set of measurables that show when they are going to see a return on the money you’re asking them to release. In the immortal words of Dennis Hopper in Speed, “…what do you do? What do you do?”

First, you need a plan. The good news is, you can basically reuse the same one time and again. Change a few names here and a few metrics there, and you’ll have a winner straight out of the Bill Belichick of IT’s playbook.

The framework you should build a plan around has roughly nine sections you must address.

  • First, you need to outline the Scenario and the problems you’re facing.
  • This leads you to the Solution you’re proposing.
  • Then the Steps needed to get to this proposed solution.
  • Next, you would outline the Benefits of this solution.
  • And any Risks that might arise while transitioning and once up and running.
  • You would then summarize the Alternatives, including the fact that doing nothing will only exacerbate the issue.
  • After that, you want to profile the costs and compare them with those of the previous system under Cost Comparisons, detailing as much as possible for each and highlighting the TCO (Total Cost of Ownership). You may think you can finish here, but two important parts follow.
  • Highlight the KPIs (Key Performance Indicators).
  • Finally, the Timeline to implement the solution.
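
To make the Cost Comparisons section concrete, here is a minimal Python sketch of a TCO comparison and break-even calculation. The `CostProfile` class, the `break_even_year` helper, and every figure in it are hypothetical, chosen only for illustration; a real comparison would pull its numbers from the finance ledger and would likely include more cost categories.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CostProfile:
    """Simplified cost model (hypothetical): one upfront spend plus a flat yearly cost."""
    upfront: float         # hardware, software, professional services
    annual_running: float  # support contracts, power, maintenance labour

    def tco(self, years: int) -> float:
        """Total cost of ownership over the given number of years."""
        return self.upfront + self.annual_running * years


def break_even_year(current: CostProfile, proposed: CostProfile,
                    max_years: int = 10) -> Optional[int]:
    """First whole year in which the proposed system's TCO is no higher
    than the current system's, or None if that never happens in range."""
    for year in range(1, max_years + 1):
        if proposed.tco(year) <= current.tco(year):
            return year
    return None


# Illustrative figures only, not taken from any real environment.
legacy = CostProfile(upfront=0, annual_running=120_000)
proposed = CostProfile(upfront=200_000, annual_running=50_000)

if __name__ == "__main__":
    for years in (1, 3, 5):
        print(years, legacy.tco(years), proposed.tco(years))
    print("Break-even year:", break_even_year(legacy, proposed))
```

A flat yearly cost is a deliberate simplification; support renewals and power prices rarely stay constant, so treat the break-even year as a starting point for discussion with finance rather than a promise.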

KPIs are a collection of measurable statements about the new piece of hardware, software, or even the whole system. You might say that you will reduce query time by five seconds during a normal day, or that you will reduce power consumption by 10 kWh per month. Each one has to be measurable and expressed as a value. You can’t say “it will be faster” or “it will be better,” as neither can be quantified. Your KPIs may also have a deadline or date associated with them, so you can definitively say whether there’s been a measured improvement or not.
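
One way to keep a KPI honest is to write it down as data: a baseline, a target, a unit, and a deadline. The sketch below assumes a hypothetical `Kpi` class and made-up numbers (a 12-second baseline reduced by five seconds to a 7-second target); it is only one possible shape for such a record.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Kpi:
    """A measurable target with a deadline (hypothetical structure)."""
    name: str
    baseline: float   # value measured before the change
    target: float     # value you are committing to
    unit: str
    deadline: date
    lower_is_better: bool = True

    def is_met(self, measured: float, measured_on: date) -> bool:
        """True only if the target was reached on or before the deadline."""
        if measured_on > self.deadline:
            return False
        if self.lower_is_better:
            return measured <= self.target
        return measured >= self.target


# Illustrative KPI: reduce average query time by five seconds.
query_time = Kpi(
    name="Average query time",
    baseline=12.0,
    target=7.0,
    unit="s",
    deadline=date(2025, 6, 30),
)

if __name__ == "__main__":
    print(query_time.is_met(6.5, date(2025, 5, 1)))
    print(query_time.is_met(8.0, date(2025, 5, 1)))
```

Recording the baseline alongside the target makes it easy to report the improvement itself, not just a pass or fail.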

Sometimes it can be hard, or near impossible, to pull some of the details in the plan together, but remember that the finance department will have any previous purchases of hardware, software, professional services, or contractors in a ledger. Hopefully you know the number of man-hours per month it takes to maintain the environment, along with the application downtime, the profit lost to downtime, and so on. Some figures you will simply have to estimate to the best of your knowledge. Moving forward, start tracking the figures that the board wants. The nuances of new hardware and software versions can be lost on them, so this shows a willingness to see things from their perspective.

Once you have a good understanding of the problem(s) you’re facing, you need to look at the possible solutions, which may mean a combination of demonstrations, proofs of concept (PoCs), evaluations, or try-and-buys to give you an insight into the technology available to solve the problem.

Next, it’s time to size up the solution. One of the hardest things to do when you adopt a new technology is to adequately size for the unexpected. Without properly understanding the requirements your solution needs to meet, how can you safely say the proposed new solution is going to fix the situation? I won’t go into how to size your solution as each vendor has a different method. The results are only as good as the figures you put in. You need to decide how many years you want to use the assets for, then figure out a rate of growth over that time. Don’t forget, updates and upgrades can affect the system during this timeframe, and you may need to take these into account.
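
As a rough illustration of the growth arithmetic (not of any vendor's sizing method), the sketch below compounds a yearly growth rate over the asset's planned lifetime and adds headroom for the unexpected. The function name, the 20% headroom default, and the example figures are all assumptions.

```python
def required_capacity(current: float, annual_growth: float, years: int,
                      headroom: float = 0.2) -> float:
    """Capacity needed at the end of the asset's life.

    current        capacity in use today (any unit: TB, VMs, users)
    annual_growth  expected growth per year as a fraction (0.2 means 20%)
    years          how long you intend to keep the asset
    headroom       extra margin for the unexpected, as a fraction
    """
    if years < 0:
        raise ValueError("years must not be negative")
    projected = current * (1 + annual_growth) ** years
    return projected * (1 + headroom)


if __name__ == "__main__":
    # Illustrative only: 100 TB today, 20% yearly growth, 5-year lifetime.
    print(round(required_capacity(100, 0.20, 5), 1), "TB")
```

The output is only as good as the growth rate you feed in, so it is worth running the calculation for a pessimistic and an optimistic rate as well.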

Another known problem in the service provider space arises when you size a solution for 1,000 users and move only 50 on to begin with. Sure enough, they get blazing speed and no contention. But as you approach 500 users, the original ones start to notice that tasks are taking longer than when they first started using the new system. They complain that they are no longer getting the 10x speed they had when they first moved, even though you spec’d the system for a 5x improvement and they’re still getting a 6-7x improvement over the legacy system. You want to avoid this pioneer syndrome, and that takes some form of quality of service from day one: a “training wheels protocol,” if you will.
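
The effect is easy to see with a little arithmetic. The sketch below assumes a very simplified model in which total capacity is shared equally among active users, and a hypothetical per-user cap set to the share each user would get at the designed user count. Real quality-of-service features differ by product; this only shows why a cap keeps the experience consistent.

```python
from typing import Optional


def per_user_throughput(total_capacity: float, active_users: int,
                        cap_per_user: Optional[float] = None) -> float:
    """Throughput each user sees under an equal-share model (an assumption).

    With no cap, early users enjoy far more than the system was specified
    to deliver; with a cap, everyone gets the designed share at most.
    """
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    fair_share = total_capacity / active_users
    if cap_per_user is None:
        return fair_share
    return min(fair_share, cap_per_user)


if __name__ == "__main__":
    capacity = 5000.0        # illustrative units of work per second
    designed_users = 1000
    cap = capacity / designed_users  # the share promised at full load

    for users in (50, 500, 1000):
        uncapped = per_user_throughput(capacity, users)
        capped = per_user_throughput(capacity, users, cap)
        print(users, uncapped, capped)
```

Without the cap, the first 50 users see twenty times the designed share and then watch it shrink; with the cap, they see the designed share from the first day to the last.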

Now that you’ve identified your possible white knight, it’s time to do some due diligence and verify that it will indeed work in your environment, that it’s currently supported, and so on, before going off half-cocked and purchasing something because you were wooed by the sales pitch or by one-time-only, end-of-quarter special pricing.

I think too many people purchase new equipment and software and rush to use the shiny new toy so quickly that they forget some of the most important steps: benchmarking and baselining. I refer to benchmarking as understanding how a system performs when a known amount of load is put upon it. For example, when I have 50 virtual machines running, or 100 concurrent database queries, what happens when I add another 50 or 100? I monitor this increase and record the changes, then keep adding load in known increments until the resources max out and adding any more degrades performance for the existing group.

Baselining is taking measurements once a system goes live and seeing what a normal day’s operation does to it. For example, you may have 500 people log on at 9 a.m. with peak load around 10:30. It then tails off until 2 p.m. when people start to come back from lunch, and spikes again around 5 p.m. as they clear their desks, finally dropping to a nominal load by 7 p.m. Only by having statistics on hand as to how everything in the system performs during this typical day can we make accurate measurements. Setting tolerances will help you decide whether there’s actually a problem and, if so, narrow the search down when a user opens a support ticket. This process of baselining and benchmarking will ultimately help you determine SLA (Service Level Agreement) response times and define items that are outside the system’s control.
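
Both ideas can be sketched in a few lines of Python. The first function looks through step-load benchmark results for the point where response time degrades beyond a chosen factor; the other two build an hourly baseline and flag readings outside a tolerance. The function names, the 1.5x degradation factor, the three-standard-deviation tolerance, and the sample data are all assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import Dict, List, Optional, Tuple


def saturation_point(steps: List[Tuple[int, float]],
                     max_degradation: float = 1.5) -> Optional[int]:
    """Benchmarking: steps are (load, response_time) pairs in increasing
    load order. Returns the first load at which response time exceeds the
    first step's response time by more than max_degradation, else None."""
    reference = steps[0][1]
    for load, response in steps:
        if response > reference * max_degradation:
            return load
    return None


def build_baseline(samples_by_hour: Dict[int, List[float]]) -> Dict[int, Tuple[float, float]]:
    """Baselining: mean and standard deviation of a metric for each hour.
    Each hour needs at least two samples for stdev to be defined."""
    return {hour: (mean(values), stdev(values))
            for hour, values in samples_by_hour.items()}


def outside_tolerance(baseline: Dict[int, Tuple[float, float]],
                      hour: int, value: float, k: float = 3.0) -> bool:
    """True if value is more than k standard deviations from that hour's mean."""
    average, deviation = baseline[hour]
    return abs(value - average) > k * deviation


if __name__ == "__main__":
    # Illustrative benchmark: VMs running vs. response time in seconds.
    steps = [(50, 1.0), (100, 1.1), (150, 1.3), (200, 2.4)]
    print("Saturates at load:", saturation_point(steps))

    # Illustrative baseline: logged-on users at 9 a.m. on three normal days.
    baseline = build_baseline({9: [480.0, 500.0, 520.0]})
    print(outside_tolerance(baseline, 9, 510.0))
    print(outside_tolerance(baseline, 9, 900.0))
```

A mean and standard deviation per hour is about the simplest baseline there is; monitoring tools typically offer richer models, but even this is enough to tell a normal 9 a.m. logon rush from a genuine problem.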

The point is, you’ll need a standard of measurement and documentation, and like any good high school science experiment, you’re probably going to need a hypothesis, method, results, conclusion, and evaluation. What I’m trying to say is that you need to understand what you are measuring and its effect on the environment. Yes, there are some variables everyone’s going to measure: CPU and memory utilization, network latency, and storage capacity growth. But your environment may require you to keep an eye on other variables, like web traffic or database queries, and telling a good value from a bad one is critical to managing it.

The system management tools you use should be tried and tested. If you cannot easily get the results you need from your current implementation, it may be time to look at something new. It may be something open-source, or it may be a paid solution with some professional services and training to maximize this new investment. As long as you’re monitoring and recording the statistics that matter to your environment, you should be in a great position to evaluate new hardware and software options.

You may have heard comments around the “cloud being a great place for speed and innovation,” which is truer now than ever, not least in the speed at which providers start to monitor, and possibly bill you for, your usage. I believe that to be a proper private cloud, or the on-premises part of a hybrid cloud, you need to be able to monitor detailed usage growth and have the potential to show or charge departments for their IT usage. By monitoring hybrid cloud metrics, you can make a better-informed decision about moving applications to the cloud. As with any expenditure, you should also look at adding functionality that starts to give you cloud-like abilities on-premises. Maybe begin by implementing a strategy to show chargeback to different departments or line-of-business application owners. Start making the move from keeping the lights on to innovation, and have IT lead the drive to a competitive advantage in your industry.
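
A first pass at showback can be as simple as splitting a shared platform cost in proportion to measured usage. The sketch below assumes a hypothetical `showback` function, a single usage metric, and invented department figures; real chargeback models usually blend several metrics and fixed charges.

```python
from typing import Dict


def showback(total_cost: float, usage_by_department: Dict[str, float]) -> Dict[str, float]:
    """Split a shared cost across departments in proportion to their usage.

    usage_by_department maps a department name to any consistent usage
    measure, for example VM-hours or TB stored over the billing period.
    """
    total_usage = sum(usage_by_department.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return {department: total_cost * used / total_usage
            for department, used in usage_by_department.items()}


if __name__ == "__main__":
    # Illustrative figures: monthly platform cost and VM-hours consumed.
    monthly_cost = 10_000.0
    vm_hours = {"finance": 200.0, "engineering": 600.0, "sales": 200.0}
    for department, share in showback(monthly_cost, vm_hours).items():
        print(f"{department}: {share:.2f}")
```

Starting with showback (reporting the numbers without billing anyone) lets departments get used to seeing their consumption before any money changes hands.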

Moving from a data center to a private cloud isn’t as simple as changing the name on the door. It takes time and planning: implementing new processes to achieve goals, providing some form of automation and elasticity back to the business, and finding ways to monitor and report trends. As with any problem, breaking it up into bite-size chunks gives you not only a sense of achievement, but also more control over the project going forward.

Whether you’re moving to or even from the cloud, the above process can be applied. You need to understand the current usage, baselines, costs, and predicted growth rates, as well as any SLAs that are in place, and then how all of this marries up with the transition to the new platform. It’s all well and good reaching for the New Kid on the Block when they come around telling you that their product will solve everything up to and possibly including “world hunger,” but let’s be realistic and make sure you’ve done your homework and have a plan of attack. Predetermined KPIs and deliverables may seem like shackles on the project, but they help keep you focused on the goal and on delivering results back to your board.

Does investment equal business value? Spending money on the new shiny toy from Vendor X to replace aging infrastructure doesn’t always mean you’re going to improve things. It’s about what business challenges you’re trying to solve. Once you have set your sights on a challenge, it’s about determining what success looks like for the project, what KPIs and milestones you’re going to set and measure, what tools you’ll use, and how you can prove the value back to the board, so that the next time you ask for money, it’s released a lot more easily and quickly.


Ruairi is a technical individual with over 14 years’ experience in the IT industry, and likes to think of himself as a “jack of all trades.” With experience in everything from networking and security to end-user compute and enterprise application deployments, as well as dabbling in the cloud, he’s happy discussing many different areas within modern IT architecture. Ruairi’s current main focus is on storage products and associated software, and on understanding their strengths and weaknesses. He currently enjoys working with the reseller community within the U.K. channel to make sure they have all the necessary technical skill sets to improve and differentiate their business.