Readers of my blog will know that I spent quite some time integrating HP-UX 11.23/11.31 with SIM and Remote Support on each and every one of my servers, including even the older, neglected test/QA servers no one usually cares about.
I'm sure some must have thought I was crazy to invest so much time in a feature that doesn't bring back much, because, they'll say, hardware doesn't break. That is partly true. Hardware doesn't break often, with the exception of disks, fans and power supplies, which tend to fail more often than anything that's purely transistor-based. So most monitoring effort should go towards devices that have a lot of these parts, which mostly means disk arrays.
There is, however, a hidden gem in using the Remote Support pack with HP-UX, and it's the monitoring of system panics.
That's right, panics! I don't hear the term as much as I used to in the old days, but the fact remains that panics still happen, and they can be the result of either a software bug or an untrapped hardware problem. With HPVM guests, I've had my share of panics, too.
Remote Support is a great help with panics. When a panic occurs, the monitoring agents will notice it once the server has rebooted, WEBES will gladly flag it as important, and an event will be logged at the response center. If the panic happens overnight or when the sysadmin is not there (and it WILL happen - most of us are in the office only a small fraction of the time), hours will be saved in the process, as someone will probably have already reached the system contact about the issue.
There is not yet a feature to send crash dump details to HP when the event is opened, so this must be done manually. But I wouldn't be surprised if it came in the future. Wouldn't it be great, for example, if crashinfo could be run automatically when the server reboots, with its output sent to the engineer? One can only hope, as it would reduce the response time even further.
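In the meantime, something along these lines could be scripted by hand. The following is only a rough sketch of the idea, not anything Remote Support provides: it assumes crash dumps are saved as numbered crash.N directories under /var/adm/crash, that crashinfo accepts the crash directory as its argument, and that a Python 3 interpreter is available on the server. The function names are made up for illustration.

```python
import os
import re
import subprocess

# Assumption: savecrash writes dumps to numbered directories such as
# /var/adm/crash/crash.0, /var/adm/crash/crash.1, and so on.
CRASH_ROOT = "/var/adm/crash"


def newest_crash_dir(root=CRASH_ROOT):
    """Return the highest-numbered crash.N directory under root, or None."""
    if not os.path.isdir(root):
        return None
    best_path = None
    best_number = -1
    for name in os.listdir(root):
        match = re.fullmatch(r"crash\.(\d+)", name)
        path = os.path.join(root, name)
        # Ignore anything that is not a crash.N directory.
        if match is None or not os.path.isdir(path):
            continue
        number = int(match.group(1))
        if number > best_number:
            best_number = number
            best_path = path
    return best_path


def summarize_crash(crash_dir, tool="crashinfo"):
    """Run the analysis tool on crash_dir and return whatever it prints.

    Assumption: the tool takes the crash directory as its only argument.
    """
    result = subprocess.run([tool, crash_dir], capture_output=True, text=True)
    return result.stdout
```

A startup script could call newest_crash_dir(), pass the result to summarize_crash() if it is not None, and attach the text to a mail for the support engineer. It would also need to remember which dump it last reported, so that the same panic is not sent again after every reboot.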