Tuesday, May 12, 2009

Two years with Integrity VMs

I've been running Integrity VMs for two years now. I started with HPVM 3.0, moved up to 3.5, then 4.0.

Here are the essentials of what has happened in that timeframe:
  • There has been more and more demand for HP-UX VMs from my users, as they can be installed very quickly.
  • The old "host consolidation" way of thinking is now dead and buried; instead of consolidating, let's just boot off a new VM! Hardware gets consolidated, but not operating systems. This requires management tools and procedures... as well as a lot more IP addresses.
  • A few production, but not mission-critical, systems have been installed in HPVMs since I migrated to 4.0.
  • Performance with 3.0 was subpar, but with 3.5 came AVIO and this helped a lot. AVIO rocks. The performance with it is excellent.
  • 4.0 introduced the new storage stack, with native multipathing and built-in APA.
  • I initially used a combination of the VxFS backend (slow) and the LVM backend (painful to manage), but switched mostly to raw devices for increased performance and to benefit from my SAN features such as cloning and snapshots.
  • I've had a few VM guest crashes. HP Support is good at troubleshooting the dumps quickly. Be sure to have a /var/adm/crash ready, or at least free space in /var, so that dumps can actually be saved.
  • Each release of HPVM seems to be rushed, as there is often an HPVM CORE patch available almost the same day the new revision is released to the general public. You have to search for it in the ITRC when you install the VM Host.
  • We've had a small RHEL deployment, and I asked an intern to install Red Hat Linux in an HPVM to evaluate it. It works, but we had to use an outdated version and it was deemed too exotic as a platform, so I didn't pursue this project and had it deployed on ProLiants and VMware VMs instead.
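
On the crash dump point above, a quick free-space check on each guest can catch the problem before a dump is lost. What follows is only a minimal Python sketch, not an HP tool: the helper name has_room_for_dump and the 8 GB threshold are illustrative assumptions, and the threshold should be sized to the guest's memory.

```python
import os
import shutil


def has_room_for_dump(path, required_bytes):
    """Return True if the filesystem holding `path` has at least
    `required_bytes` free (as reported by shutil.disk_usage)."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    # Hypothetical threshold: adjust to the memory size of the guest.
    required = 8 * 1024 ** 3

    # Check the dedicated crash directory first, then fall back to /var.
    for candidate in ("/var/adm/crash", "/var"):
        if os.path.isdir(candidate):
            ok = has_room_for_dump(candidate, required)
            status = "OK" if ok else "NOT ENOUGH SPACE"
            print(f"{candidate}: {status}")
            break
    else:
        print("Neither /var/adm/crash nor /var was found on this system.")
```

Run from cron on each guest, something like this could flag a full /var well before the next crash.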
Here is what I would like to see in the future:
  • NPIV support. VMware has had this since last year. This will make VMs truly transparent to the SAN administrator, and prevent potential mistakes on the VM host.
  • A better VM Manager. Frankly, the bottom line is that most SMH-based tools truly suck, and VM Manager is no exception. I've had so many problems with it that I rarely use it and have become proficient with the CLI instead.
  • Better integration with GlancePlus. The only way to have statistical data on VMs is to create an "Application" and view it from GlancePlus or Performance Manager. It works, but you have to remember to reconfigure the Performance Agent each time you create or delete a VM. It would be nice for this to be done automatically. hpvmsar is a start, but how about a simple "esxtop" clone?
  • Clustering of VMs with Online Guest Migration has to be easy to do. If it uses ServiceGuard, that is fine, but it has to be EASY. Virtual Center makes clustering VMs a two-minute job. I'd expect the same with HPVM.
  • How about an "HPVM cluster in a big box", with a fully configured blade chassis full of clustered bl860c's... all that with a lean HP-UX host distribution that is completely flashed on SSDs, similar to ESXi, and for which we don't have to do anything under the hood. That would be really cool.
That's it for now.