Cluster-Aware Updating (CAU): Re-imagined by Real Life Situations

Posted in Windows Powershell, Windows Server | 22/06/2014 17:35

My name is Yusuf Ozturk. I'm a PowerShell MVP and a systems engineer on the Cloud team of a private bank, where we run the biggest Hyper-V production environment in Turkey. Managing a large virtualization environment can be a headache if you don't plan your growth well, or if you are growing more quickly than you expected. Applying Microsoft updates to your Hyper-V clusters can be problematic if you face the following issues:

1. You don't have enough memory
2. You don't have enough disk space

That means your cluster is over-committed.

These issues hurt even more in the following scenarios:

1. VMs with large memory (64 GB or more)
2. VMs with vHBA
3. VMs with Passthrough Disks

You need good planning if you have large VMs in an over-committed environment, because migrating those VMs won't be as easy as you think. Misconfigured SAN switches, WWN configurations and so on can also be a problem for VMs with vHBA, and drivers and firmware can cause issues for vHBA as well. In my experience, the best way to migrate a VM with vHBA is to keep it offline during the move. Otherwise you can lose disk connectivity, which could damage your system or even cost you data. So clustering VMs with vHBA can increase your uptime, but keeping them offline during migration also prevents data loss.

As for Passthrough (PT) disks, Hyper-V has always had some issues with them and doesn't handle them well. Microsoft doesn't recommend them anymore, and maybe they will drop support in vNext, who knows.

So in these scenarios, the current built-in Cluster-Aware Updating (CAU) has some problems:

1. CAU doesn't know which VMs have vHBA or PT disks, because the cluster handles them
2. CAU doesn't know about your memory shortage
3. CAU doesn't know about your disk space shortage
4. CAU doesn't know whether a VM has a large amount of memory

That's why you can see the cluster try to send a VM with 64 GB of memory to a Hyper-V host with 32 GB of free memory. The cluster only looks at free memory on the cluster nodes, not at how large each VM is. So I re-imagined CAU and wrote my own CAU process. Here is what I do in my scenario:

1. I can suspend VMs with vHBA and PT disks until the cluster node is back online. (This is my choice; you can always use live migration instead.)
2. I know which cluster node has the most free memory, but I also know which VM has the most memory. So I start with the VM with the largest memory footprint and migrate it to the best cluster node. After that I can start migrating small VMs to that host, because there is still some memory left for VMs with 2 GB, 4 GB and so on.
3. I don't leave migrations to cluster management. I do my own migrations, which gives me my own control mechanism and more flexibility. I can set a timeout for VM migrations, so if a VM cannot migrate in a timely fashion, I can migrate it offline.
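
The "largest VM first" ordering from step 2 can be tried out without touching a real cluster. The function below is only a minimal sketch of that idea, not the CAU script itself. The name `Get-MigrationPlan` and the `MemoryGB` / `FreeGB` properties are made up for illustration, and it assumes each VM simply goes to whichever node has the most free memory at that moment. A VM that fits nowhere is flagged instead of being moved.

```powershell
# Hypothetical helper: plans placements on plain objects, no cluster cmdlets involved.
function Get-MigrationPlan {
    param(
        [Parameter(Mandatory)] [object[]] $VMs,    # objects with Name and MemoryGB (assumed shape)
        [Parameter(Mandatory)] [object[]] $Nodes   # objects with Name and FreeGB (assumed shape)
    )

    # Track free memory per node in a local table so the caller's objects stay unchanged.
    $free = @{}
    foreach ($node in $Nodes) { $free[$node.Name] = [double]$node.FreeGB }

    # Largest VMs first: they are the hardest ones to place.
    foreach ($vm in ($VMs | Sort-Object -Property MemoryGB -Descending)) {
        # Node with the most free memory right now; ties are broken by node name.
        $best = $free.GetEnumerator() |
            Sort-Object -Property @{ Expression = 'Value'; Descending = $true },
                                  @{ Expression = 'Key';   Descending = $false } |
            Select-Object -First 1

        if ($best.Value -ge $vm.MemoryGB) {
            # Reserve the memory so the next VM sees the reduced figure.
            $free[$best.Key] -= $vm.MemoryGB
            [pscustomobject]@{ VM = $vm.Name; MemoryGB = $vm.MemoryGB; Target = $best.Key; Fits = $true }
        }
        else {
            # No node can hold this VM; leave the decision to the operator.
            [pscustomobject]@{ VM = $vm.Name; MemoryGB = $vm.MemoryGB; Target = $null; Fits = $false }
        }
    }
}

# Example data (invented names): one 64 GB VM and two small ones, two target nodes.
$vms = @(
    [pscustomobject]@{ Name = 'SQL01'; MemoryGB = 64 }
    [pscustomobject]@{ Name = 'WEB01'; MemoryGB = 4 }
    [pscustomobject]@{ Name = 'WEB02'; MemoryGB = 2 }
)
$nodes = @(
    [pscustomobject]@{ Name = 'HV01'; FreeGB = 32 }
    [pscustomobject]@{ Name = 'HV02'; FreeGB = 80 }
)

$plan = Get-MigrationPlan -VMs $vms -Nodes $nodes
$plan | Format-Table -AutoSize
```

With this sample data the 64 GB VM is planned first and lands on the node with 80 GB free, never on the one with 32 GB. In a real environment the two input lists would have to come from the cluster itself, and the plan would only be a starting point for the actual migrations.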

These are my own steps. You can always edit my script to fit your own requirements. Now let's see my new CAU script: