With Virtual Machines, GHz are King (or Queen)!
Posted 7/1/2010 at 11:32 AM by Jason Hall
Whether we like it or not (I’m not going to approach that political battle), virtualization is becoming a mainstay not only in our development environments but also in production. Whether or not you agree with virtualizing production servers, you are eventually going to have to manage and tune them, either at your current job or your next. I want to take this opportunity to explain and show the best way to analyze CPU utilization in a virtual environment. The examples shown here use VMware vSphere 4, but the concepts apply regardless of your virtual platform.
Historically, when we look at a physical machine, the simple metric of CPU % Used is a pretty good measure of how busy a SQL Server is. A server that shows 20% CPU utilization at one time and 80% at another is doing 4X as much work during the later period. In an operating system hosted in a VM, we lose the luxury of knowing exactly how much CPU horsepower we have at any given time. There are a few factors that contribute to this “grey area”:
- A VMware admin has the ability to set an upper limit on the amount of CPU that is available to your VM. They can also set a reservation to guarantee an amount of CPU to your VM. I will not see this limit in My Computer -> Properties, nor will I see any representation of it in Task Manager or perfmon. Let’s assume that throughout the morning I have 2 GHz available to my VM and am showing 20% CPU utilization. Later in the afternoon, my VMware admin needs to free up resources, so they put a 1 GHz cap on my available CPU. Now the exact same workload will show 40% CPU utilization. Nothing has changed on the OS or in my SQL Server workload, yet I am showing twice the CPU %. See where this gets confusing?
- Even if the VMware admin hasn’t set any resource cap on your VM’s available CPU, the ESX host could simply become overloaded. Let’s say an ESX host has 8 GHz of total processing power, and that host has 5 VMs running on it. Normally each VM uses about 1 GHz of processing power, but all of a sudden, each VM needs 2 GHz. Like fitting 10 pounds of feathers into a 5-pound bag, something has to give. What gives is that ESX has to dynamically scale down each VM’s available CPU to account for the increased workload. As a result, you may see 80% CPU utilization when looking at perfmon or Task Manager, but you have no idea what 100% actually represents.
- A virtual machine may not be tied to a single ESX host. For DR or performance reasons, a VMware administrator can move your VM from ESX host to ESX host without you knowing. These ESX hosts also need not offer the same performance as one another. You could be chugging along just fine during the morning with 4 GHz of processing power, and then in the afternoon be switched to an ESX host with 3 GHz of processing power. Not only would you not know that this occurred, but your VM’s CPU % would go up even though the workload is unchanged.
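To make the arithmetic in these scenarios concrete, here is a minimal Python sketch. The function name `guest_cpu_percent` and the 1.5 GHz workload in the migration case are hypothetical, chosen only for illustration. The limit case uses the numbers from the first bullet: 20% of 2 GHz is 0.4 GHz of real work.

```python
def guest_cpu_percent(used_ghz: float, available_ghz: float) -> float:
    """CPU % as the guest OS would report it: work done relative to the current ceiling."""
    if available_ghz <= 0:
        raise ValueError("available_ghz must be positive")
    # The guest can never report more than 100% of whatever it has been given.
    return min(100.0, 100.0 * used_ghz / available_ghz)


# Limit example: the same 0.4 GHz workload before and after a 1 GHz cap.
workload_ghz = 0.4
morning = guest_cpu_percent(workload_ghz, available_ghz=2.0)
afternoon = guest_cpu_percent(workload_ghz, available_ghz=1.0)

# Migration example: a hypothetical 1.5 GHz workload moved from a 4 GHz
# ceiling to a 3 GHz ceiling.
on_4ghz_host = guest_cpu_percent(1.5, available_ghz=4.0)
on_3ghz_host = guest_cpu_percent(1.5, available_ghz=3.0)

print(f"limit:     {morning:.0f}% -> {afternoon:.0f}%")
print(f"migration: {on_4ghz_host:.1f}% -> {on_3ghz_host:.1f}%")
```

In both cases the workload in GHz is identical; only the denominator moved.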
Because of this, it is absolutely critical that you not simply look at CPU % as a measure of how busy SQL Server is or how much CPU it is using. Percentages are always relative to a ceiling, and when that ceiling can move up or down at will (or whenever a VMware admin decides that your ceiling is too high), the percentage itself loses meaning. CPU % analyzed in conjunction with GHz used will allow you to paint the full picture of a VM’s CPU requirements. Unfortunately, this data is not available by looking purely at the OS. You will need metrics from the virtual layer as well. That data is readily available in the built-in VMware client tools (vSphere), but you’ll need either access to the ESX or vCenter instance to view them, or a good relationship with the VMware admin who can send them to you.
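As a rough illustration of pairing the two metrics, the sketch below converts guest-reported CPU % into GHz used once the ceiling is known. The `Sample` class, the `ghz_used` helper, and the sample values are assumptions for illustration. In practice the ceiling would have to come from the virtual layer, not from the guest OS.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    label: str
    cpu_percent: float    # as reported inside the guest (perfmon, Task Manager)
    available_ghz: float  # ceiling reported by the virtual layer (hypothetical values)


def ghz_used(sample: Sample) -> float:
    """Convert a relative percentage into an absolute amount of CPU consumed."""
    return sample.cpu_percent / 100.0 * sample.available_ghz


samples = [
    Sample("morning", cpu_percent=20.0, available_ghz=2.0),
    Sample("afternoon", cpu_percent=40.0, available_ghz=1.0),
]

for s in samples:
    print(f"{s.label}: {s.cpu_percent:.0f}% of {s.available_ghz:.1f} GHz "
          f"= {ghz_used(s):.2f} GHz used")
```

Viewed as percentages, the afternoon looks twice as busy as the morning; viewed as GHz used, the two samples describe the same workload.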

For a better way to have this information at your fingertips, stay tuned…

