Tuesday, August 23, 2016

Running "MRPE" check_mk scripts asynchronously on Windows

I have a corner case on Windows where I need to execute classic Nagios NRPE scripts within check_mk, but in asynchronous mode. In certain circumstances, such as a network timeout, these scripts can take a long time to execute, so they cannot safely be run synchronously by the check_mk agent.

It's possible to have honest-to-goodness check_mk scripts execute asynchronously, using the async directive in check_mk.ini. I tried it, and it works. However, the agent does not support this for classic Nagios plugins.

So, I wrote a wrapper named mrpe_async_wrapper that does just that. It's not rocket science; the wrapper is simply a Windows batch file that:

  1. Creates a scheduled task (on its first run) that executes the check script at 5-minute intervals;
  2. Has the scheduled task call mrpe_async_wrapper to run the check script and save its output in a status file;
  3. Reports the contents of the status file, instead of executing the script, when it is run directly. This is quick, so you can run it every minute if you want, but the status it reports may be up to 5 minutes old.

This lets you run slow or unpredictable NRPE scripts from check_mk without fear. I've been running this for a few days and it seems to do the job for me.
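
For readers who want to see the caching idea in isolation, here is a minimal Python sketch of the same pattern. It is not the batch wrapper itself: the function names, the status file layout (exit code on the first line, output after it) and the staleness check are all assumptions made for illustration. The scheduled task would call refresh(), and the MRPE check would call report().

```python
import subprocess
import sys
import time
from pathlib import Path

UNKNOWN = 3  # Nagios exit code for "UNKNOWN"


def refresh(status_file: Path, command: list[str]) -> None:
    """Run the slow check and cache its exit code and output (scheduled-task side)."""
    result = subprocess.run(command, capture_output=True, text=True)
    tmp = status_file.with_suffix(".tmp")
    # Hypothetical layout: exit code on the first line, plugin output after it.
    tmp.write_text(f"{result.returncode}\n{result.stdout}")
    tmp.replace(status_file)  # swap in one step so a reader never sees a partial file


def report(status_file: Path, max_age: float = 600.0) -> tuple[int, str]:
    """Return the cached (exit code, output) without running the check (MRPE side)."""
    if not status_file.exists():
        return UNKNOWN, "UNKNOWN - no cached result yet"
    if time.time() - status_file.stat().st_mtime > max_age:
        # Assumption: a result older than max_age seconds is not worth trusting.
        return UNKNOWN, "UNKNOWN - cached result is stale"
    code, _, output = status_file.read_text().partition("\n")
    return int(code), output.strip()
```

The staleness check is an addition of this sketch: it turns a dead scheduled task into an UNKNOWN state instead of an old result that is reported forever.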

To configure it, simply add a directive to the [mrpe] section of check_mk.ini like this (on the same line):

check = check_gizmo C:\tools\mrpe_async_wrapper.bat check_gizmo C:\tools\check_gizmo.bat

This defines an MRPE check named "check_gizmo", which instructs the wrapper to create a scheduled task named "check_gizmo" that runs C:\tools\check_gizmo.bat asynchronously.

Here is the code for the wrapper:

Have fun.

Monday, June 27, 2016

Getting UFO2 failover status from an OSIsoft PI Interface


I'm currently deploying an OSIsoft PI Interface node at my workplace.

Being a "Systems" Administrator, and not a "PI" Administrator per se, I was looking for a way to get high-availability status directly from that interface node. My objective was to provide IT Operations with an easy-to-use procedure that answers the following question: which interface node is currently active and which one is currently in standby? It is useful for them to know the answer when scheduling maintenance such as Windows patches.

Unfortunately, there is no easy way to find out which of the two interfaces is currently active. I've looked everywhere in OSIsoft's KB and I guess nobody asked. :-)

Some information on UFO 

Many, if not all, PI interfaces are based on UniInt (Universal Interface). UniInt supports two failover levels named UFO (UniInt FailOver):

  • UFO phase 1 (UFO1) which is based on PI points
  • UFO phase 2 (UFO2) which uses a shared file located on a separate file server

Not all interfaces support UniInt failover; check your Interface documentation. Mine only supports UFO2.

You can look at the following KBs for more information:

UFO2 is preferred to UFO1, and KB00446 even mentions that UFO1 is deprecated. I see one major drawback with UFO1 that might explain this: if one node loses access to the PI Server, it cannot know the status of the other node. Using a shared file on a file server (a highly-available one, that is!) is deemed more reliable.

Finding what interface is active, the PI Admin Way

There seems to be one official way, the "PI Admin Way", which involves looking up points stored in the PI Server.

While my interface is UFO2, it seems to create PI points anyway. These points are created directly from ICU, and they all have "UFO2" in their names. It is therefore trivial to check their values from the PI SDK Utility tool. For example:

PRO TIP: It's also possible to find out these values at the command line using apisnap.

While this is sufficient from a PI admin perspective, it's not great from a systems administrator perspective. It's not an easy task for IT Operations to fire up that tool and query PI points, it cannot be automated in a script (unless you use apisnap), and it will not work at all if the nodes cannot talk to the PI server. It is thus preferable to give them a simple command to run.

Finding what interface is active, the born-again Sysadmin Way

It was a simple task to somewhat reverse-engineer the binary UFO2 .dat file created by the interface and write a simple program to extract basic data. I've named it readdat.

C:\tools>readdat \\myfileserver\myfile.dat

Active Node (0 = None, 1 = Node 1 is primary, 2 = Node 2 is primary)
Active ID: 1

Device Status (0 = Good, 99 = OFF, any value in between results in a failover)
Node 1: 0
Node 2: 0

Works well enough for me. Readdat.exe can then be wrapped in a batch file or a PowerShell script to make it easier to use.

As a bonus, you can run it like this:
C:\tools>readdat \\myfileserver\myfile.dat -activeid

This will set ERRORLEVEL to the ID number.
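
ERRORLEVEL is simply the exit code of the process, so a script can read it without parsing any text. The following Python sketch shows one way to do that. The function names and the label wording are assumptions of this example; only the -activeid switch and the meaning of 0, 1 and 2 come from the output above.

```python
import subprocess

# Meaning of the active ID, as printed by readdat itself.
ACTIVE_LABELS = {
    0: "None",
    1: "Node 1 is primary",
    2: "Node 2 is primary",
}


def describe_active_id(code: int) -> str:
    """Translate the exit code returned by 'readdat ... -activeid' into text."""
    return ACTIVE_LABELS.get(code, f"unexpected exit code {code}")


def query_active_id(readdat_exe: str, dat_file: str) -> int:
    """Run readdat with -activeid and return its exit code (the ERRORLEVEL)."""
    return subprocess.run([readdat_exe, dat_file, "-activeid"]).returncode
```

A call such as describe_active_id(query_active_id(r"C:\tools\readdat.exe", r"\\myfileserver\myfile.dat")) would then give Operations a one-line answer.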

The source code for readdat is here:

Here is also a Win32 executable:

Good luck!

Monday, September 14, 2015

Configuring vsftpd to support proxy FTP

I've had to deal with a legacy application that is hard coded to use proxy ftp sessions. These are initiated by using the "proxy" command in a stock ftp client.

It was giving us trouble with vsftpd refusing to transfer files when using "proxy get" to initiate a passive session between the vsftpd server and another server.

What is a proxy FTP? In a nutshell, a proxy session lets you open a connection to a second FTP server, so that you can transfer files between both servers instead of between the primary server and your FTP client.

The ftp(1) man page documents what "proxy" does. It is important to read it and understand what happens when you use this:
     proxy ftp-command
                 Execute an ftp command on a secondary control connection.
                 This command allows simultaneous connection to two remote ftp
                 servers for transferring files between the two servers.  The
                 first proxy command should be an open, to establish the sec-
                 ondary control connection.  Enter the command "proxy ?" to
                 see other ftp commands executable on the secondary connec-
                 tion.  The following commands behave differently when pref-
                 aced by proxy: open will not define new macros during the
                 auto-login process, close will not erase existing macro defi-
                 nitions, get and mget transfer files from the host on the
                 primary control connection to the host on the secondary con-
                 trol connection, and put, mput, and append transfer files
                 from the host on the secondary control connection to the host
                 on the primary control connection.  Third party file trans-
                 fers depend upon support of the ftp protocol PASV command by
                 the server on the secondary control connection.
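
To make the mechanics concrete, here is a minimal Python sketch of what a "proxy get" amounts to at the protocol level, using the standard ftplib module. It is an illustration under assumptions, not what the stock ftp client literally runs: the function names are made up, both connections are assumed to be already logged in, and error handling is left out. The key point is that the address returned by the secondary server's PASV reply is handed to the primary server in a PORT command, so the primary is told to connect to a host that is not its client.

```python
import re
from ftplib import FTP


def pasv_reply_to_port_command(reply: str) -> str:
    """Turn a '227 ... (h1,h2,h3,h4,p1,p2)' reply into the matching PORT command."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"not a PASV reply: {reply!r}")
    return "PORT " + ",".join(match.groups())


def proxy_get(primary: FTP, secondary: FTP, remote_name: str, local_name: str) -> None:
    """Copy remote_name from the primary server straight to the secondary server."""
    primary.voidcmd("TYPE I")
    secondary.voidcmd("TYPE I")
    # The secondary server listens and tells us where.
    reply = secondary.sendcmd("PASV")
    # The primary server is told to connect there instead of back to the client.
    primary.voidcmd(pasv_reply_to_port_command(reply))
    primary.sendcmd(f"RETR {remote_name}")   # primary starts sending
    secondary.sendcmd(f"STOR {local_name}")  # secondary starts receiving
    primary.voidresp()    # wait for the transfer-complete reply on each side
    secondary.voidresp()
```

The PORT step is the one a server handling the primary connection may refuse, since the address it names does not belong to the client that issued it.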

So how does this impact vsftpd when using it to handle the primary control connection?

The first thing that might happen is that if you issue a proxy get, it might fail with the following message:
500 Illegal PORT command

This is fixed by adding the following parameter to vsftpd.conf:
What this parameter does is authorize vsftpd to open a data connection with the proxy server, instead of limiting it to connections between vsftpd and the FTP client.

Then, you might get:
500 OOPS: vsf_sysutil_bind

This happens because the vsftpd process is trying to bind to port 20 on the server's IP address. By stracing the process, I found out that this does not work because the vsftpd process that handles communication with clients is unprivileged. This privilege separation is by design. The workaround I found is to add this to vsftpd.conf:

This makes vsftpd bind to another port (I didn't even check which one) but it works. By default it is set to "NO", but it is left at "YES" in the example configuration file, which is why it was there in the first place.

Good luck

Monday, July 20, 2015

Updating a Magellan Triton 500's GPS chip firmware

I've been recently trying to restart using a circa-2008 GPS I own, a Magellan Triton 500. Back when I purchased it, it was so frustrating to use that I gave up. It's time for a rematch.

There used to be an English forum with lots of information on these, but it closed some years ago. Whatever was in this forum is lost forever (and no, the web archive didn't save the posts, only the thread subjects).

There are still tidbits of info scattered here and there, however. Many are in Russian and German though, which requires running them through a translator, with mixed results. I might put together a comprehensive page on this blog in the future, in case they go down too.

In the meantime, here is the best hack available for this unit. I found an interesting post on a German board that explains how to flash an unofficial driver for the SiRFstar III GPS chip. It updates the chip's software from GSW 3.2.4 to 3.5.0 and increases the unit's sensitivity considerably. The details are here:

Here is how to do this.

1. Download this file here:
P.S. maps4me offers many maps for the Triton for a one-time download fee. I suggest you check it out.

2. Extract the zip file.

3. Turn OFF your GPS. (This is important: if it's already turned on when you plug it in, the GPS driver will fail to install.)

4. Run MgnFwUpd.exe

5. NOW plug in your GPS, turn it on, and run the update. It takes at least 30 minutes to complete.

6. Tada! The about -> version page should show GSW3.5.0 for SiRF.

I've tested this on Windows 8.1 and it still works, even though the software was probably designed for XP.

For advanced users: If you don't want SiRF 3.5.0, there is a way to update from 3.2.4 to 3.2.5 using official Magellan code. It requires downloading the latest firmware update to 1.95 from Magellan, running their update, finding out the temporary directory where it extracts its data, then modifying MgnFWUpd.xml to uncomment the line mgnFWGpsChipUpdate version. I tried it but didn't find 3.2.5 to be very useful.

Monday, September 22, 2014

Change control killed the sysadmin star

Yesterday, I watched Office Space for the first time in probably 10 years. I can't believe how relevant this movie still is today. What's funniest is that one of the centerpieces of this movie, TPS reports, is the type of report I have to file sometimes. Even if the movie's underlying themes have not aged much, there is one thing that Mike Judge would have to consider if he had to direct a reboot (pun intended) of Office Space 2015: change control.

I could go on and on about change control but I won't. However, I can leave you with this song:

Red tape came and broke your heart
We can't approve you've gone too far
Change control killed the sysadmin star

Thursday, October 3, 2013

Running a Matrox G450x4 MMS under Windows 7

I was recently tasked with a small challenge. Given that we have a fair number of circa-2005, quad-screen workstations running Windows XP for which we know the clock is ticking, is it possible to upgrade them to Windows 7 even if they have ancient Matrox graphics cards?

The answer is yes, with some limitations. Matrox doesn't have a clear stance on Win7 support for the G450 series. Their latest driver is supposed to support Win7 SP1, but its installer fails without even a hint of what is going on.

By searching for and trying various older drivers, I found out that the WHQL drivers do not support the G450, but the non-WHQL ones do. To get these drivers, you have to go to the "archived support drivers" area and scroll down to the latest non-WHQL driver you can find for your platform. In my case, it was version 211_00_183. The driver installs and the graphics card works. Case closed.

Of course, by using non-WHQL drivers, you might be asked by Microsoft to remove these drivers if you run into problems and ask for support.

For the curious, these workstations were reduced to being quad-screen ICA clients a long time ago, so I don't expect any performance impact from moving them to Windows 7. If we move on with this scenario, we'll be saving the company some money by extending the life of this equipment for a few more years.

Wednesday, May 8, 2013

Launching the Performance Monitor from the command line or script

On a Windows 2008 R2 server, I needed to launch the Performance Monitor with a built-in live report. It is aimed at support personnel, and I don't want them to have to start it and add counters manually each time.

Man, that task proved to be more complex than I expected.

Here is what I've found in the last few days:

Solution 1: use IE
From the Performance Monitor, the only option offered to save a custom report is to save it to an .HTML file which can then be launched from IE. That is clunky, as it has some ActiveX code and you need to acknowledge running it. Furthermore, when you load up that HTML page, you first have to press the "play" icon to start the data collection, which is another useless step that I don't want support guys to have to do.

Solution 2: use Typeperf
There is a nice utility named "typeperf.exe" that can be used to dump specific counters to the console. It works, but for some odd reason, it can ONLY write CSV to the screen. If you specify another format, it insists on dumping to a file. In essence it is a good quick-and-dirty tool for the console but not a terrific all-around solution.

Solution 3: use Perfmon in standalone mode (WE HAVE A WINNER!)
You can launch a standalone Perfmon using "perfmon /sys". This lets you add counters and, look at the magic, the standalone panel offers the possibility of saving that report in a .PerfmonCfg file. To load the file, simply double-click it (or use "start meh.PerfMonCfg" within a batch file) and it will bring up a good old Perfmon report on the screen. That, in my opinion, is the best way to achieve my goal.

Sorry for the lack of details, but that should give you an idea.