Thursday, February 19, 2009
Olivier's hot tips to monitor HP-UX servers with SIM and RSP
1. Configure WBEM on your server
SIM and WEBES subscribe to WBEM events on your server in order to receive events. The easy way to make this work is to put root's credentials in SIM's Global Protocol Settings. Whatever you do, don't add root's credentials anywhere. You should never have to hand out the root password to some slimy application unless you really know what you're doing. Create a dedicated WBEM user for this instead.
Add a user with "useradd", I named it hpwbem:
# useradd -u 505 -g users -s /bin/false -c "HP WBEM provider" -m -k /etc/skel hpwbem
Then use passwd to input a password of your choice.
Enable subscriptions for non-privileged users in the CIM server:
# cimconfig -s enableSubscriptionsForNonprivilegedUsers=true -p
# cimconfig -s enableNamespaceAuthorization=true -p
# cimserver -s
Then add rights for hpwbem to the CIM:
# cimauth -a -u hpwbem -n root/cimv2 -R -W
# cimauth -a -u hpwbem -n root/PG_InterOp -R -W
# cimauth -a -u hpwbem -n root/PG_Internal -R -W
# cimauth -a -u hpwbem -n root/cimv2/npar -R -W
# cimauth -a -u hpwbem -n root/cimv2/vpar -R -W
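The five cimauth calls above differ only in the namespace, so they can be generated in a loop. Here is a minimal sketch, assuming a POSIX shell. The function name print_cimauth_cmds is my own invention, and it only prints the commands so you can review them before running anything as root.

```shell
#!/bin/sh
# Hypothetical helper: print one cimauth command per namespace for a user.
# It only echoes the commands listed in this post; nothing is executed.
print_cimauth_cmds() {
    user=$1
    for ns in root/cimv2 root/PG_InterOp root/PG_Internal \
              root/cimv2/npar root/cimv2/vpar; do
        echo "cimauth -a -u $user -n $ns -R -W"
    done
}
```

Calling `print_cimauth_cmds hpwbem` prints the same five lines as above. Once they look right, pipe the output to `sh` on the server.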
Configure the hpwbem user, and its password, in SIM's Global Protocol Settings.
Now, have SIM subscribe to WBEM events for your server. It doesn't by default. On your CMS, type:
C:> mxwbemsub -a -n server_name
Once this is done, check on your server if you have SIM subscriptions by using evweb:
# evweb subscribe -L -b external
You should see three subscriptions named HPSIM_*.
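If you have more than a couple of servers, eyeballing the evweb listing gets old fast. The sketch below counts the subscriptions for you. It assumes evweb prints each subscription name on its own line, which I haven't verified for every version, and the function names count_subscriptions and check_subscriptions are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the lines on stdin that contain a given prefix.
# Assumption: each subscription name appears on its own line of the listing.
count_subscriptions() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c -e "$1" || true
}

# Read a listing on stdin and compare the count with the expected minimum.
check_subscriptions() {
    prefix=$1
    expected=$2
    found=$(count_subscriptions "$prefix")
    if [ "$found" -ge "$expected" ]; then
        echo "OK: $found $prefix subscription(s)"
    else
        echo "MISSING: expected $expected $prefix subscription(s), found $found"
        return 1
    fi
}
```

On the server this would be used as `evweb subscribe -L -b external | check_subscriptions HPSIM_ 3`.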
2. Configure your system properties in SIM
Get into the System Properties of your server in SIM, then confirm that a serial number and a product number have been discovered. Sometimes the PN is missing for Integrity servers, so add it manually. Just to be sure, also copy the SN and PN into the Customer-Entered serial number and product number fields in the Entitlement Information area. You'll be sorry if you don't do this. Next, set your Country code. If you don't do this, ISEE/RSP won't work. The other fields in the Entitlement Information area can normally be left blank.
Assign a site name, and at least a primary customer contact, to your server. It's important: otherwise I think no ticket will be generated by ISEE, since there will be nobody to contact.
3. Configure RSP entitlement
Go into the ISEE client (Remote Support Configuration and Services under the Options menu) and confirm that your server is entitled. If it isn't, you can try clicking on the entitlement icon to have it send a new entitlement request. As long as you're not entitled, RSP will not forward service calls to HP, so it's critical that you get this fixed. Be sure you set the system properties correctly as described in step 2.
4. Configure WEBES
Get into WEBES (localhost:7906) and confirm that your server is in the Managed Entities list. Of course, there's no search feature, so you'll probably have to check multiple pages in the Full List to find it. If your server appears, confirm that its system type is ManagedSystem - HPUX. If the server is of the wrong type, delete it, as it could stay that way for a while; better safe than sorry.
WEBES synchronizes its entity data with SIM, but it does this through telepathy or some other magic. I couldn't find out how it's done or whether it can be forced (and nobody replied to me in the forums to help me...). Restarting desta doesn't do the trick... the real trick is actually waiting, sometimes for a loooooong time, until your server appears as a managed entity. I suggest you wait until the next day.
Once your server is in WEBES, run evweb (see above) and confirm that there's a subscription named HPWEBES_*. You need to have one, else hardware events will not be caught by WEBES and forwarded to RSP...
5. Generate test events to confirm it actually works
Generate a test event with EMS:
# /etc/opt/resmon/lbin/send_test_event ia64_corehw
...then cross your fingers, hoping it will be reported. The following should happen:
a) the event will be shown in SIM, in the event tab of the server (this is what the SIM WBEM subscription is used for)
b) the event will be trapped by WEBES, and sent to ISEE (this is what the WEBES WBEM subscription is used for)
c) ISEE will send the event to HP, and you'll see in the server event log messages such as "A service incident has been reported" (this is what all the entitlement hassle is used for)
If you made it to step c, you're done. If it didn't work, go to step 1 and start again. I had to do this quite a few times. There's an old song in Quebec French named "refelemele". It basically means "doittomeagain". Chances are you'll be singing it for a few days.