Tuesday, June 23, 2009

Quick review of OpenView Performance Manager 8.0

Version 8.0 of OpenView Performance Manager was released, what, two years ago now, and I never thought of upgrading since versions 5, 6 and 7 were basically identical from my perspective, and I expected the same from 8.0. So I didn't bother upgrading until I saw screenshots of 8.0 in slides that were released at the HPTF.

I missed a lot of things. Version 8.0 is better integrated, actually easier to use for newcomers, and the graphs you can make are much more readable. Gone is the old "web forms" interface: everything is now Java-based, and surprisingly, it works! While previous versions didn't let you easily print graphs from the Java interface, this one has a handy Print function that pops up a standard browser window with a nice-looking graph you can print, such as this one:


And take a look at the CPU gauge which has changed for the better:


There are some things that are less fun, though. For instance, in the previous version it was very easy to deselect metrics displayed in a graph simply by clicking on them in the legend; this no longer works. You have to edit the graph and remove them manually, which takes more effort. The old systemsCODA.txt no longer works either for quickly adding managed nodes: you need to import them using a tedious process, or edit an XML file directly, and restart OVPM each time you change it.

The graph designer's interface has also changed a lot, so users who tediously migrated from PerfView to OVPM <=7.0 will again have to change the way they work. Considering it's a complex process, that's too bad.

Overall, 8.0 is interesting if you're looking to make sexier graphs, but the changes in functionality might be a problem if you're used to the older versions. In my case, since I'm a casual user, I'm glad to have migrated to 8.0.

O.

Monday, June 22, 2009

Virtual Connect performance monitoring and profiling with HP-UX and Windows

I presented on this subject at HPTF2009, and the slides are out on the session scheduler. For those who did not attend the event, I have a copy of my presentation here: http://www.mayoxide.com/presentations/

If I had to do this all over again, you can be sure I would have presented on RSP this year. But abstracts are due in January, and it was too early back then to know if I would have enough material to make a presentation on RSP for the HPTF. Perhaps for 2010? Maybe, if I can go once more. I've attended the event every year since 2006 and I'm not sure my management will let me go again.

Saturday, June 20, 2009

Back from HPTF 2009

Once again, I've had the privilege of attending the HPTF and, once again, I was not disappointed.

Although I didn't see the special events, I attended all the technical sessions I could find, and the content was on par with previous years. But it shows that travel expenses at HP have been cut back; I sat through at least one session where the speaker was covering for some of his colleagues, and one where the speaker, an HP employee, wasn't as into the "I like HP" thing as he could have been.

Here are the highlights:

1. At the HP-UX panel, where customers have a chance to speak to a panel of HP-UX experts, there was a guy responsible for Systems Insight Manager. A lady went ballistic over RSP and the System Fault Management agents. The SIM guy replied that "the team developing the agents has come under lots of scrutiny recently", but he wasn't aware of all the problems (which I've had too, by the way). I felt bad for him, as RSP and the agents were not his products. As a matter of fact, I think the agents, which are very bad, aren't anyone's product at HP. Then two other attendees, and me, chipped in to vent ourselves about the ISEE-to-RSP migration process. The lady was very upset the whole time, and the moderator eventually put the discussion to a halt.

2. Speaking of RSP, I was able to meet its Product Manager face-to-face, an all-around nice guy who was glad to hear from a customer. I didn't spend too much time on my past issues, as they have been resolved, and instead gave some suggestions for improving the product. He should follow up on them. I was well aware that the worst offenders, the SysFaultMgmt agents the lady I mentioned earlier was pissed about, aren't his products, so I didn't spend too much time on them.

3. Connect organized activities named "Meet the experts", which were round-table discussions with select people within HP who cater to a particular product. I was able to meet Bdale, who's in charge of Open Source at HP; I was alone with him, and very surprised that no one else showed up. I didn't have much to say, actually, but I asked him if HP intended to eventually open up HP-UX. The answer was no, as there are too many proprietary technologies in it. Another "Meet the experts" session I attended was with Bruce and Bob, who are based in Fort Collins; at that one we were three users. We gave them some input on what we would like to see in the OS in the future. Hats off to Bruce and Bob, by the way. I've seen them show up at many conferences for the whole three days, and they're very involved with the user community.

4. The closing keynote with Dr. Michio Kaku was very interesting. Great choice. In my opinion, for a tech conference it's much better to have someone like him rather than a comedian. Before him we had to endure the usual corporate mumbo-jumbo from Intel, a Microsoft researcher and Brocade, and as usual it wasn't very good. It at least gave me time to follow up on my e-mails. That triple-dipping didn't leave much time for Dr. Kaku; he got impolitely shoved off the stage once his time was up, and that sucked.

5. Like last year, the CDA area wasn't very good: very few prototypes and little actual hardware. HP would be better off not doing it at all. But there were some CDA sessions, and although they didn't dive very deep, it was better than nothing.

6. Oh yes, and Brian Cox handed out HP-UX 25th anniversary champagne glasses at the HP-UX kickoff. For a geek like me, that was a great "show-up" present.

That's it for now. More to come.

Friday, June 12, 2009

Olivier's no-nonsense procedure to upgrade vPars using Golden Images

When the time comes to update servers, I'm not a proponent of update-ux. Why? Because all my data is stored outside of vg00 and my 50+ servers are 100% identical, so using Golden Images is actually the best way to update quickly and cleanly. Yet the only official way HP documents most upgrades seems to always revolve around update-ux, leaving me on my own to upgrade systems when starting from a Golden Image.

This blog post will be the first part of a series that should span well into 2010. I have almost everything to upgrade, from low-life, disposable VMs to a cutting-edge Metrocluster. VMs are pretty straightforward to do, so today I'll start with vPars.


How do you update vPars to 5.xx / 11.31 without using update-ux?

The new 5.xx monitor series supports mixed environments: you need at least one vPar running 11.31, while the others can stick to 11.23. Here is the quickest way I found to do this.

1. Run "vparstatus -p partition -v" on each of your original vPars, and record all the information (especially boot disks)

2. Plan in advance the vparcreate syntax you'll need to recreate them. For example:
# vparcreate -p bonyeune -a cpu::1 -a mem::4096 -a io:1.0.4 \
-a io:1.0.4.1.0.4.0.5.0.0.0.0.1:BOOT


3. Put your server in nPars mode, and reinstall your "master" vPar using your Golden Image. The "master" vPar is the one on which you will boot the new 5.x vpmon. Add package T1335DC, the product name of the 5.05 Virtual Partitions product, to your new server (see the sketch below). Once this is done, recreate all your vPars, including the master vPar, using vparcreate (don't forget the boot disks).
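The package installation itself is a plain swinstall. Something like this should do, assuming the bundle sits in a depot on your Ignite server; the server name and depot path here are made up:

# swinstall -x autoreboot=true -s igniteserver:/depots/vpars T1335DC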

4. Reboot the server in vPars mode, and launch the 5.x vpmon manually. Then try booting all your vPars one by one, starting with the master, which runs 11.31, then the others, which should still be at 11.23.
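From memory, launching the monitor by hand looks something like this at the console (prompts abbreviated, and the vPar name is just the one from the example above, so adapt it):

HPUX> boot /stand/vpmon
MON> vparload -p bonyeune

Once the master is up, the remaining vPars can also be started from it with vparboot(1m).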

5. You can then upgrade the remaining vPars at will. How do I do this now? Using vpardbprofile(1m), which was added in recent releases. With vpardbprofile, you can emulate EFI's dbprofile function, which lets you boot from a cross-subnet Ignite server easily.

Good luck

Wednesday, June 3, 2009

Hacking Ignite-UX's expert recovery mode to scrub disks offline

Introduction

SAN arrays can be scrubbed independently (remember dilx?), but internal disks are more complicated. Many suggestions I've seen in the ITRC forums consist of logging into HP-UX, then wiping the disks using "dd if=/dev/zero" or similar tools. From my experience, this is risky. I used to do this over 10 years ago when decommissioning workstations, and the operating system would eventually stop working while the disks were scrubbing, leaving no proof that they had indeed been wiped completely.

With a ProLiant, no problem. Just boot up a Linux live CD such as SystemRescueCD and it will come with a scrubber. Case closed. But with an Integrity server, it's more complicated, as there are no Linux live CDs available. Maybe with some elbow grease I could make one with CentOS, but I don't know Linux well enough to take on the challenge.


The Ignite-UX expert recovery mode

Ignite-UX comes with a rarely-used mode named "Expert Recovery" which puts many of the tools you need to recover an unbootable system on a RAM disk. Instead, we'll use the expert recovery shell to actually wipe out disks! While we could compile a statically-linked open source scrubbing tool such as diskscrub, to save time we'll bring in mediainit(1m) which, since the March 2009 release of 11.31, has a new scrubbing option. Since the man page does not precisely describe the algorithm used, I checked what mediainit actually writes, and from my findings, it follows the DoD 5220.22-M standard, which consists of writing one character, then its complement, then a random character. It should be enough for most people... but for classified stuff, nothing beats a drill press.


Steps

a/ Start by putting a working mediainit under /var/opt/ignite/scrub on your Ignite server. It has to come from a March 2009 or later release of 11.31, as that's the earliest to support scrubbing. You'll also need to add the libpthread library, since mediainit depends on it and it's not included in the expert recovery environment.

Example:
ignite-server# mkdir /var/opt/ignite/scrub
ignite-server# cp /usr/bin/mediainit /var/opt/ignite/scrub
ignite-server# cp /usr/lib/hpux32/libpthread.so.1 /var/opt/ignite/scrub

b/ Ignite the server you want to wipe out. Igniting itself is beyond the scope of this howto; I personally use dbprofile and lanboot at the EFI to do this (see the sketch below). If you are given the choice between igniting 11.23 or 11.31, choose 11.31. Of course, if you're on a K-class or some other older hardware that doesn't support 11.31, you're out of luck: stop here, and try to compile diskscrub to add it to your 11.23 or 11.11 Ignite server.
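For reference, the dbprofile/lanboot part looks roughly like this at the EFI shell; the profile name, IP addresses and boot file path below are made up, so adapt them to your network:

Shell> dbprofile -dn myignite -sip 192.168.1.10 -cip 192.168.1.50 -gip 192.168.1.1 -m 255.255.255.0 -b "/opt/ignite/boot/nbp.efi"
Shell> lanboot select -dn myignite

Once the server boots from the network, the AUTO file fetched from the Ignite server gives you the choice of OS: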

Obtaining size of AUTO (226 bytes)
Downloading file AUTO (226 bytes)
1. target OS is B.11.23 IA
2. target OS is B.11.31 IA
3. Exit Boot Loader

Choose an operating system to install that your hardware supports: 2

c/ When you get to the "Welcome to Ignite-UX" screen, choose "Run an Expert Recovery Shell". Configure your network and click OK. A RAM disk will be created, and some useful commands will be pulled from the Ignite server. You'll be presented with a menu, where you must choose "x. Exit to shell".

HP-UX NETWORK SYSTEM RECOVERY
MAIN MENU


s. Search for a file
b. Reboot
l. Load a file
r. Recover an unbootable HP-UX system
x. Exit to shell

This menu is for listing and loading the tools contained on the core media.
Once a tool is loaded, it may be run from the shell. Some tools require other
files to be present in order to successfully execute.

Select one of the above: x

Type 'menu' to return to the menu environment. Do not use 'exit'.

#

d/ mediainit and libpthread are missing from the environment, so they must be pulled from the Ignite server. To do this, we'll use tftp, which is the protocol used to download software from the Ignite server.

First get mediainit and put it in /usr/bin:

# cd /usr/bin
# tftp ignite_server_ip_address
tftp> get /var/opt/ignite/scrub/mediainit
Received 88405 bytes in 0.0 seconds
tftp> quit
# chmod 755 /usr/bin/mediainit

Then get libpthread.so.1 and put it in /usr/lib/hpux32:

# cd /usr/lib/hpux32
# tftp ignite_server_ip_address
tftp> get /var/opt/ignite/scrub/libpthread.so.1
Received 1521497 bytes in 0.6 seconds
tftp> quit

e/ You're done! You now have a working scrubber in the Expert Recovery Shell.

# /usr/bin/mediainit
usage: mediainit [-vrn] [-f fmt_optn] [-i interleave] [-p partition_size] pathname
usage: mediainit -S [-t scrub_count] [-c scrub_character] special_file


f/ The last step is to identify your disk devices under /dev/rdsk and wipe them using the -S option.
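If you need to enumerate the disks first, ioscan should do the trick, assuming it is available in the recovery environment:

# ioscan -fnC disk

Then wipe each device: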

# /usr/bin/mediainit -S /dev/rdsk/c0t0d0
WARNING: You have invoked the disk scrub option.
Using this option will completely destroy the data
on the specified disk. All the signals except SIGINT(ctrl-c)
will be disabled during disk scrub.
Are you SURE you want to proceed? (y/n) y

Disk scrub:PASS 1

Disk scrub:PASS 2

Disk scrub:PASS 3
...
mediainit: Disk scrubbing successful
#

With these default options, mediainit writes these hex characters in order: first 0x30, then 0x66, and finally 0xc6. 0x66 is presumably meant as the "complement" pass, while 0xc6 is the "random" character, which is actually hard-coded in mediainit. It's more interesting to use -S alone rather than with the -c (scrub character) and -t (number of passes) options, since those two options do not alternate between different characters and you must reinvoke mediainit manually to change them.
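If you want to double-check for yourself what ended up on the disk, dumping the first block is enough. The device name here is just an example, and od can be swapped for whatever hex dumper is available in the recovery shell:

# dd if=/dev/rdsk/c0t0d0 bs=1024 count=1 2>/dev/null | od -x | more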

Good luck

Tuesday, June 2, 2009

Migrating LIVE from one datastore to another in an Integrity VM

Did you know that it is possible to easily migrate data using LVM's pvmove in Integrity Virtual Machines running HP-UX? I tried it today, and not only does pvmove work well, it's almost as fast as if I were on physical hardware. For those who are familiar with ESX's Storage VMotion, this is as close as you can get to achieving similar results. Using pvmove is not as slick as an svmotion, since you need to do it directly on the guest. But it works, and that's good enough for me.

Why would you want to do this? Here are some examples:
  • Migrating devices from scsi to avio_stor, without rebooting the VM (assuming it has the avio drivers, of course)
  • Switching from one datastore to another. For example, moving from a flat file to a raw disk
  • Moving data from a RAID-5 volume to RAID-1
  • Migrating from one disk array to another
Here is a sample procedure to move data from an LV datastore to a raw disk datastore, without having to copy data manually or turn off your VM.

Add your new raw device to your VM:

vmhost# hpvmmodify -P myvm -a disk:avio_stor::disk:/dev/rdisk/disk39
vmhost# hpvmstatus -P myvm

[Storage Interface Details]
                                        Guest     Physical
Device  Adaptor    Bus Dev Ftn Tgt Lun Storage   Device
======= ========== === === === === === ========= =========================
disk    avio_stor    0   1   0   0   0 file      /ivm/myvm/disk1.vm
disk    avio_stor    0   1   0   3   0 lv        /dev/vg_myvm/rlv_vgdata
disk    avio_stor    0   1   0   4   0 disk      /dev/rdisk/disk39

In the VM itself, run an ioscan to discover the new device:

myvm# ioscan
myvm# insf -eC disk # insf required only on 11iv2
myvm# ioscan -kfnC disk

Class     I  H/W Path     Driver  S/W State  H/W Type  Description
=======================================================================
disk      0  0/0/1/0.0.0  sdisk   CLAIMED    DEVICE    HP Virtual FileDisk
             /dev/dsk/c0t0d0    /dev/dsk/c0t0d0s2   /dev/rdsk/c0t0d0    /dev/rdsk/c0t0d0s2
             /dev/dsk/c0t0d0s1  /dev/dsk/c0t0d0s3   /dev/rdsk/c0t0d0s1  /dev/rdsk/c0t0d0s3
disk      3  0/0/1/0.3.0  sdisk   CLAIMED    DEVICE    HP Virtual LvDisk
             /dev/dsk/c0t3d0    /dev/rdsk/c0t3d0
disk      4  0/0/1/0.4.0  sdisk   CLAIMED    DEVICE    HP Virtual Disk
             /dev/dsk/c0t4d0    /dev/rdsk/c0t4d0

See that new "Virtual Disk" device? Just pvcreate it and add it to the VG whose data you need to migrate. Here I'm using legacy devices since my VM runs 11iv2, but you can do this with agile devices if you wish.

myvm# pvcreate /dev/rdsk/c0t4d0
myvm# vgextend /dev/vgdata /dev/dsk/c0t4d0
myvm# vgdisplay vgdata

--- Physical volumes ---
PV Name /dev/dsk/c0t3d0
PV Status available
Total PE 1249
Free PE 0
Autoswitch On
Proactive Polling On

PV Name /dev/dsk/c0t4d0
PV Status available
Total PE 1279
Free PE 1279
Autoswitch On
Proactive Polling On

Then, use pvmove to move your physical extents from one datastore to the other, as you would on a physical server.

myvm# pvmove /dev/dsk/c0t3d0 /dev/dsk/c0t4d0
Transferring logical extents of logical volume "/dev/vgdata/lv_mydata"...

Here, LVM will be moving all your data LIVE from c0t3d0 (an LVM datastore) to c0t4d0 (a raw disk datastore). The process can take a while, since it deliberately goes slowly to prevent it from interfering with normal operations.

When it's over, remove the empty PV from the VG:

myvm# vgreduce vgdata /dev/dsk/c0t3d0
myvm# pvremove /dev/rdsk/c0t3d0
myvm# rmsf -H 0/0/1/0.3.0

Then, remove it from the VM host:

vmhost# hpvmmodify -P myvm -d disk:avio_stor::lv:/dev/vg_myvm/rlv_vgdata

And you're done. You've just migrated storage live in your VM.

Can the pvmove be done at the VM Host level?

Here we did the pvmove at the VM guest level, i.e. inside the VM guest. Can this be done on the VM Host itself? The answer is yes, but only if you're using LVM datastores. Just pvmove the PVs on the VM Host, and you can migrate data transparently.
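A sketch of the host-side variant, with made-up device files standing in for the PVs that back the datastore volume group on the VM Host:

vmhost# vgextend /dev/vg_myvm /dev/dsk/c4t0d2
vmhost# pvmove /dev/dsk/c4t0d1 /dev/dsk/c4t0d2
vmhost# vgreduce /dev/vg_myvm /dev/dsk/c4t0d1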

I did use LVM datastores for a while, but they're a pain in the butt to manage, as you have two levels of volume groups to consider: one on the host, then one on the guest. A lot of flexibility is also lost: you can't easily export and reimport the guest volume groups between virtual and physical servers, back and forth, when using anything except raw disks.

Yet there is one place where LVM datastores could be useful: to enable a Storage VMotion look-alike for VM guests that don't have a volume manager that can easily do this. I'm thinking here of VMs running Windows, for example.

Bye

Monday, June 1, 2009

R.I.P. ISEE

Today, ISEE closes for non-CS customers. I don't think I'll blog about RSP much from now on, since I've completed my migration and it now works.

I'm scheduled to meet the Product Manager for RSP in a few weeks at the HPTF, and I'll give him some of my comments, as there are still a few things I'd like to see changed to make RSP better.

I posted most of my experiences over the last few months, but here is a recap. A 100% successful RSP installation boils down to this:
  • If possible, start from scratch with a fresh CMS. SIM is easy to back up and restore, so you won't lose any data.
  • Don't be tempted to install anything else on the CMS.
  • Use RSP 5.20, as 5.10 has a dumbed-down Software Manager that doesn't tell you what it's doing.
  • If you have EVAs, reinstall the SMS from scratch too, with SmartStart or the ProLiant Support Pack, and ensure that it has only the software required to manage the EVA and link it to RSP
  • ProLiants running Windows and ESX are easy to configure; just use the SMH to send traps to the CMS.
  • As for HP-UX... well, while the CIM-related tools are not terrific, SFM itself is worse. I've had lots of problems with it. You've been warned.
Good luck