Earlier this year our organization found itself in what I presume is a common situation: we have a VMware ESX 3.5[i] environment of three clusters, approximately 15 hosts and 200 guests, and the hardware is end of life. We're far behind in preparing hardware and software platform roadmaps, but we needed a way to refresh the hardware while leaving significant room for expansion once we introduce vSphere in the next six months or so.
To start, here were some of our driving factors and considerations:
- While we have multiple hardware manufacturers today, we now have standards to which we should adhere. We don't want to maintain more makes or models than is absolutely necessary.
- Our current environment is oversubscribed given our expected N+N fault tolerance: peak average utilization per farm is about 60% for both CPU and memory.
- We're in the middle of a network upgrade that will take everything from our core through the access layers to 10Gb.
- We have invested in blade infrastructure that remains viable and exudes operational efficiency
- We operate at a 300:1 logical server to admin ratio and believe it can be much higher
I set out to answer three things:
- What capacity do we demand today, blending average clusters and guests?
- What hard or soft limits or best-practices should be respected in engineering a replacement?
- Given our needs and possible hardware spend to satisfy a refresh, what is the average cost of a virtual machine? Cost must include infrastructure, server hardware, hypervisor, storage, and ancillary virtualization support products such as monitoring.
Answering the first question was relatively easy: I built a quick custom performance report in MOM that blends the average and peak CPU and memory utilization of our existing VMs.
I realize this alone would be an oversimplification of virtualization candidacy or demand; however, because we are blending over a wide variety of guests and I am comfortable with both our network and storage subsystems, I'm not particularly concerned with IO. I am similarly confident that our high-performance, high-demand systems are not yet virtual. The vast majority of our systems fit the mold of a single vCPU and 512-1024MB of memory. I would never advocate using this methodology for the virtualization of specific workloads, but to maintain our existing systems, this will be adequate.
Answering the second question was out of our depth, so we called on VMware and our local engineer to fill in the gaps. The two primary soft restrictions we placed on our calculations were to limit each physical host to 50% CPU at peak and to no more than 30 guests per host. Admittedly, these numbers should be revised, but for the sake of time, I'll allow you, the trusted reader, to adjust them to what is currently supported or recommended.
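To make the two soft limits concrete, here is a minimal sketch of how they might cap the number of guests on a host. This is not the spreadsheet's actual formula. The function names, the MHz-based CPU model, and the sample host and VM figures are all my own assumptions for illustration; only the 50% CPU ceiling, the 30-guest cap, and the rough count of 200 guests come from the text above.

```python
import math

def guests_per_host(host_cpu_mhz, host_mem_mb, vm_peak_cpu_mhz, vm_mem_mb,
                    cpu_ceiling=0.50, max_guests=30):
    """Guests one host can carry under the soft limits (hypothetical model).

    cpu_ceiling and max_guests default to the limits from the post:
    50% host CPU at peak and no more than 30 guests per host.
    """
    by_cpu = math.floor(host_cpu_mhz * cpu_ceiling / vm_peak_cpu_mhz)
    by_mem = math.floor(host_mem_mb / vm_mem_mb)
    # The tightest of the three constraints wins.
    return max(0, min(by_cpu, by_mem, max_guests))

def hosts_needed(total_guests, per_host, spare_hosts=0):
    """Hosts required for total_guests, plus any spares held for failover."""
    if per_host <= 0:
        raise ValueError("per_host must be positive")
    return math.ceil(total_guests / per_host) + spare_hosts

# Assumed example figures: a host with 21,280 MHz of CPU and 64GB of memory,
# and an "average" guest peaking at 300 MHz with 1024MB of memory.
per_host = guests_per_host(21280, 65536, 300, 1024)
print(per_host, hosts_needed(200, per_host))
```

With these assumed figures the 30-guest cap, not CPU or memory, is the binding constraint, which is the kind of wasted capacity worth watching for when you adjust the limits.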
This spreadsheet is a starting point for answering the final question. It was built for our environment, but modifying a few sheets or formulas should allow it to be tailored to anyone's environment, hardware configuration and prices, and compute capacity limits. System specifics were stripped and other elements made a bit more generic. Some quick reference notes:
- Blade RU were calculated as bladeRU=(Chassis RU/slots) * (slots per blade)
- Chassis column represents each model's share of the chassis infrastructure cost. Take the total cost of a chassis with no servers, divide it by the number of slots in the chassis, and multiply that by the number of slots the particular blade model occupies.
- Qty column is free for you to edit, to help you gauge the total capacity achieved and the capacity potentially wasted given the soft/hard limits
- Current VM usage sheet identifies model farm configurations and utilization
- For whatever reason, the stoplight conditional formatting never saves properly... I'm no Excel guru
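For readers who would rather see the blade RU and chassis calculations outside of Excel, here is a rough sketch. The two blade formulas follow the notes above; the per-VM roll-up is my own simplification rather than the spreadsheet's exact formula, and every number in the example (chassis size, prices, guest count) is made up for illustration.

```python
def blade_ru(chassis_ru, chassis_slots, slots_per_blade):
    """Rack units attributed to one blade: (chassis RU / slots) * slots per blade."""
    return chassis_ru / chassis_slots * slots_per_blade

def chassis_cost_share(empty_chassis_cost, chassis_slots, slots_per_blade):
    """One blade's share of the cost of a chassis with no servers in it."""
    return empty_chassis_cost / chassis_slots * slots_per_blade

def cost_per_vm(server_cost, chassis_share, other_costs, guests_per_host):
    """Assumed roll-up: all per-host costs divided by guests on that host.

    other_costs stands in for hypervisor, storage, and support products
    such as monitoring, expressed per host.
    """
    if guests_per_host <= 0:
        raise ValueError("guests_per_host must be positive")
    return (server_cost + chassis_share + other_costs) / guests_per_host

# Hypothetical 10U chassis with 16 slots, costing 16,000 empty.
print(blade_ru(10, 16, 1))              # single-slot blade
print(blade_ru(10, 16, 2))              # double-slot blade
share = chassis_cost_share(16000, 16, 1)
print(cost_per_vm(8000, share, 3000, 30))
```

Dividing by guests per host means the soft limits feed directly into the per-VM cost: a cheaper server that hits a limit sooner can easily end up more expensive per VM.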
The financials included in this document were based on HP's online pricing data many moons ago and should only be used as an illustration. I suggest consulting your select hardware vendors or VARs to price specific configurations.
I have to admit that I was VERY surprised at the results of the exercise (and at how frustrated I became with my beloved Excel). I would never have guessed the BL490s with solid state drives and 10Gb to the host would have provided us with the best per-VM cost. Did we go with those? No, but never should price alone dictate direction.
We went a very different direction because of some specific requirements that I have left out of this post for confidentiality reasons, but that does not negate the results illustrated in the spreadsheet. The document is not an exact blueprint, but if you have some intermediate Excel skills and know your way around virtualization requirements, it may give you a launching point for tailoring a document that will help you with a similar refresh or new deployment.
Up next: fixing our chargeback model, developing a subscription-based utility compute offering, self-service, and a monitoring overhaul.



