Thursday, October 30, 2008

Igniting Integrity VMs

For the last year, my VMs under IVM 3.0 and 3.5 were mostly installed one by one. But since I installed a huge IVM 4.0 server for more critical environments, I've started seriously using Ignite-UX to install VMs.

I was surprised: I think I can beat my Windows administrator colleague by deploying HP-UX VMs quicker than he can do Windows VMs under ESX. I counted 30 minutes from the initial hpvmcreate to the last boot.

The core media way
This one is simple, but installations are long. They will take at least 2 hours since installing from core media uses straight SD packages and they're slow to install.
1. Copy the core media of your HP-UX release to a depot. Take a recent one - it will have the VM libraries and the AVIO drivers as well. It's well documented in the Ignite-UX quick start guide.
2. Build an Ignite-UX boot helper using make_media_image. Don't burn it - just keep the ISO and transfer it to your VM Host. I prefer the boot helper because DHCP can't work across subnets, and network boot is more complex to set up than a boot helper. (Furthermore, all our subnets are managed by a Windows DHCP server, and I can't fudge it into booting Integrity servers, which don't work with PXE yet for HP-UX. Yuck.)
3. Configure your VM with AVIO if possible. Boot your VM with the boot helper, contact the Ignite-UX server and install from there.

The Golden Image way
This one is pretty fast, assuming you have a gigabit network.
1. Create a small VM to build your image - I aptly named it "genesis". You can install it using the above method.
2. Configure it to your taste.
3. Add the latest VM Guest package and AVIO drivers (they are available from the software depot)
4. Use make_sys_image to build your golden image, and set up your configuration files. It's well documented in the Ignite-UX documentation.

To deploy a VM, boot it with an .iso boot helper (see above), and ignite it with your Golden Image. Use AVIO for LAN and disk. It's so damn quick that I didn't even have time to finish my lunch when I tried it today.
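
To make the deployment step a bit more concrete, here is a minimal Python sketch that only builds the hpvmcreate argument list for a guest with AVIO LAN and disk plus the boot helper ISO attached as a virtual DVD. Nothing is executed. The guest name, vswitch name, device path and ISO path are hypothetical, and the resource strings follow the format as I recall it from hpvmcreate(1M), so treat them as assumptions and check the man page for your IVM version.

```python
def build_hpvmcreate_args(name, vcpus, memory, vswitch, disk, boot_iso):
    """Return the argument list for one hpvmcreate call (nothing is run).

    Assumption: resource strings use the
    devicetype:adaptertype:address:storagetype:device layout.
    """
    return [
        "hpvmcreate",
        "-P", name,                 # guest name
        "-O", "hpux",               # guest operating system type
        "-c", str(vcpus),           # number of virtual CPUs
        "-r", memory,               # guest memory, e.g. "4G"
        "-a", f"network:avio_lan::vswitch:{vswitch}",  # AVIO LAN
        "-a", f"disk:avio_stor::disk:{disk}",          # AVIO disk
        "-a", f"dvd:scsi::file:{boot_iso}",            # boot helper ISO
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    args = build_hpvmcreate_args(
        "web01", 2, "4G", "vsw0",
        "/dev/rdisk/disk10", "/var/opt/iso/boothelper.iso",
    )
    print(" ".join(args))
```

Keeping the command as a list makes it easy to review before handing it to a shell or to subprocess on the VM Host.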

Good luck

Wednesday, October 29, 2008

Quick review of Integrity VM 4.0

I've been a user of IVM since 3.0, and I'm about to finish putting in production a fairly big server that will host a bunch of VMs.

One of the big drawbacks of versions prior to 3.5 was the lack of a built-in MPIO. You had to either purchase the expensive SecurePath, or use PVLinks which forced you to use the LV backend. I used PVLinks, but the concept of having to manage VGs both inside my VMs, and one level upwards on the host, was complex. I wouldn't suggest it to anyone who is not familiar with LVM. On the upside, using VGs on the host can prevent mistakes since PVs are harder to corrupt than raw disks.

Furthermore, to benefit from network redundancy, APA had to be purchased separately, which also increased costs. So of course the big advantage of 4.0 is the 11iv3 host, which lets you use its built-in MPIO. Furthermore, the VSE-OE now includes APA for free (it was about time). So these two items are covered. And did I say that APA was now very easy to configure? I'm not fond of the System Management Homepage, but the APA configuration in it is now dead easy, and quick. Only a linkloop feature is missing.

The agile addressing still seems weird to me. It's not as simple as using SecurePath, but I'm catching on slowly. Actually finding the LUN of a device is a hard task, I'll have to rewrite for 11iv3 for this matter.

ESX administrators are used to managing files. They're easy to move around, and you can "see" them, which prevents mistakes. It's a similar paradigm to a DBA preferring files to raw devices. In this area, there is one improvement: AVIO is now supported with a file datastore. Even with a tuned VxFS, I found the file datastore to be slow when I did tests with 3.0 last year; you can be sure I'll try again this time.


Monday, October 27, 2008

Understanding all the RSP components

N.B. My updated diagram from December 2009 is here

This blog entry is updated regularly. Latest updates:

  • November 4th 2008
  • November 19th 2008
  • December 10th 2008
  • December 16th 2008
  • February 20th 2008

Having read (diagonally) over 1000 pages of documentation related to every component that RSP includes, here are my notes that might be of help. This is definitely not all accurate. When I find inconsistencies, I'll update this blog post.

The bottom line is that you no longer have a simple ISEE client running on your HP-UX host. It's now much more complex than this.

There's a bunch of "new" tools that will become part of your life. In fact these are "old" tools that have been available for years. They're now tightly welded together, run on a central server (CMS) instead of locally on each monitored host, and for the most part do not need to be configured independently, but it's important to understand what each one does.

SysFaultMgmt (System Fault Management) - runs on the HP-UX server
It's the "new generation" of EMS, which speaks WBEM. Using WBEM, it can be integrated easily in SMH (System Management Homepage) and SIM (Systems Insight Manager). SysFaultMgmt used to work in parallel with traditional EMS monitors, but since HP-UX 11iv3 March 2008, it seems to switch off EMS and replace it completely. EMS will eventually be EOL'd.

EVWeb - runs on the HP-UX server
A companion to SysFaultMgmt: a GUI that lets you query and manage WBEM subscriptions. There's also an evweb CLI, which will let you extract events and see their contents (they look similar to EMS's event.log file). The CLI has a man page and it's not hard to use. Be careful: I've played with evweb from SMH, sometimes it crashed, and it resulted in some evweb CGIs spinning endlessly, taking 100% CPU. The CLI is probably more robust.

System Insight Manager agent - runs on Proliants running VMware ESX and probably Windows as well

This agent includes a good-old System Management Homepage, along with hardware diagnostics agents. If the agents detect that something goes wrong, they are configured to send an SNMP trap to the CMS.

OSEM - runs on the CMS
OSEM is an agent that analyzes SNMP events that are sent to it. It filters them, and translates them to a human-readable form which can be sent by e-mail and/or to ISEE. By filtering, I mean that it will find out whether an SNMP trap sent by a device is actually an important one, and decide whether it's necessary to generate a service event for it.
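
As an illustration of that filter-then-translate idea (not of how OSEM is actually implemented), here is a minimal Python sketch. The Trap fields, the severity names and the set of severities that count as important are all hypothetical; the real rules OSEM applies are not something I have seen.

```python
from dataclasses import dataclass

# Hypothetical rule: only these severities deserve a service event.
ACTIONABLE = {"critical", "major"}


@dataclass
class Trap:
    source: str     # device that sent the trap
    severity: str   # hypothetical severity label
    message: str    # raw trap description


def to_service_event(trap):
    """Return a human-readable service event, or None if filtered out."""
    if trap.severity not in ACTIONABLE:
        return None
    return f"[{trap.severity.upper()}] {trap.source}: {trap.message}"


if __name__ == "__main__":
    traps = [
        Trap("esx01", "informational", "Fan speed changed"),
        Trap("esx01", "critical", "Power supply failed"),
    ]
    # Keep only the traps that survive the filter.
    events = [e for e in map(to_service_event, traps) if e is not None]
    for event in events:
        print(event)
```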

OSEM mostly supports systems that report their events using SNMP:

  • Proliant servers running Linux, Windows or VMware ESX.
  • Integrity Servers running Linux
  • SAN switches
  • MSA enclosures
  • Bladesystem chassis (simply configure the OA to send SNMP traps to the CMS)

WEBES - runs on the CMS
WEBES is an analyzer that, in a similar fashion to OSEM, processes events that are sent to it from these primary sources:

  • Event log on a Windows Server
  • WBEM subscriptions
  • Interactions with Command View to gather data for EVAs

From my understanding, it does not "translate" the WBEM events to a readable form as OSEM does, since the WBEM events already contain the information.

WEBES supports mostly:

  • Integrity servers running HP-UX, through WBEM subscriptions
  • EVAs by reading the event log on the Storage Management Server through ELMC, and by logging directly into Command View

Now there seem to be some places where WEBES and OSEM overlap each other, and I haven't understood yet to what extent these tools talk to each other. From the OSEM documentation, it seems that WEBES sends events to OSEM, and OSEM then manages the notification.

Why are there both OSEM and WEBES? I'm not sure, but it looks like OSEM has a Compaq history, while WEBES comes from Digital. ISEE in itself is HP. The tools have not been merged yet, are still actively developed, and will probably complement each other for a while.

ISEE - runs on the CMS
The new 5.x ISEE client is a new version of the 3.95 client, which is now integrated into SIM. Most of the configuration settings you used to put in the ISEE client are now configured there, from the Remote Support menu entry.

SIM - runs on the CMS
SIM is used to actually manage your servers, and WEBES and OSEM automatically sync their configuration with SIM. For instance, if you set yourself as the contact person for a server, the OSEM and/or WEBES configuration will be populated with what you put in SIM. So SIM is the only place where you actually need to do some manual configuration.

Basically, if you think that SIM takes care of handling events, you're wrong. It just _reports_ the events it receives directly and those it gathers from WEBES/OSEM. It also reports what ISEE does with the events. I don't know yet exactly how it gets the information from these agents. SIM doesn't send any events to ISEE; RSP and OSEM do. SIM also receives SNMP traps and subscribes to WBEM events, but since they are not filtered, it will only log them as "raw" events.
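
To keep the moving parts straight, here is a small Python sketch of the event flow as I currently understand it from the notes above. It is a mental model, not a description of the real interfaces: the source names are labels I made up, and the WEBES-to-OSEM hand-off is only what the OSEM documentation seems to say.

```python
# Which analyzer on the CMS handles each event source (hypothetical labels).
ANALYZER_FOR_SOURCE = {
    "snmp_trap": "OSEM",
    "wbem_subscription": "WEBES",
    "windows_event_log": "WEBES",
    "command_view": "WEBES",
}


def route(source):
    """Return the chain of components an event from `source` passes through."""
    analyzer = ANALYZER_FOR_SOURCE[source]
    # Assumption: WEBES hands its events to OSEM, which manages notification.
    chain = [analyzer] if analyzer == "OSEM" else [analyzer, "OSEM"]
    # SIM is deliberately absent: it reports on events but does not forward them.
    return chain + ["ISEE"]


if __name__ == "__main__":
    for source in ANALYZER_FOR_SOURCE:
        print(source, "->", " -> ".join(route(source)))
```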

That's what I understand out of this for now. Hope that helps.

Friday, October 10, 2008

Bl495: a perfect fit for virtualization

Bundled OEM "value-added" software that comes with a subpar digital camera or printer is usually not useful, bloated, proprietary and hard to uninstall. And RSP (see my previous post) makes this kind of software look rather elegant.

Yet despite my rant about some HP management tools, which are really not worth getting excited about, HP does design some pretty interesting hardware, such as the bl495.

I almost wet my pants when I saw these. Now that's the kind of blade I was waiting for -- lots of CPU, even more lots of RAM, two SSDs, and a small footprint. They're just perfect for running an ESX cluster. ESXi can be burned in using the HP USB key, but I'd still prefer ESX for now. Combine this with an EVA4400 and you're on for a helluva ride.

The only thing that's missing in my opinion is an additional two LAN ports, which are available on an optional mezzanine. The bl495s include two built-in 10GbE ports which have plenty of bandwidth, but it's complicated to isolate the Service Console, VMkernel and various vSwitches without using tagged VLANs (especially through a Virtual Connect). I prefer having different, physical interfaces for this, especially considering the fact that 10GbE is still too modern for our 1GbE Catalysts.

You can easily replace 6 or more full racks of less-critical Wintel servers with a 10U chassis full of these. With technologies like this, that can be done. Think about all the space you'll save, and let's not forget about cooling, SAN ports, LAN ports...

Way to go HP!