The Born-again Sysadmin: September 2008

Tuesday, September 23, 2008

Redirecting a chroot-jailed /dev/log to a dumb syslogd

HP-UX's stock syslogd doesn't support multiple input streams, so when chrooting applications that write to /dev/log (namely an SFTP server), you're pretty much stuck with no possibility of logging anything in syslog.

I considered for a short time installing syslog-ng and connecting /chroot/dev/log to /dev/log but that seemed overkill.

That's until I found out that this works perfectly to connect one fifo to another:


while true; do cat /chroot/dev/log > /dev/log; done

Wow. That's an easy workaround. It doesn't consume much either: cat blocks until something writes to the chrooted fifo, so the loop only iterates when a line is logged.
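For illustration, here is a minimal sketch of how that one-liner could be hardened a little. It is not the content of log_redirector.sh; the function names relay_once and relay_forever are made up for this example, and it assumes the source path is a named pipe. The main difference from the bare loop is that it stops instead of spinning if the source is missing or is not a fifo.

```shell
#!/bin/sh
# Sketch only: relay_once and relay_forever are hypothetical names,
# not taken from log_redirector.sh.

# Copy whatever is written to the fifo $1 into $2.
# Returns when the writer closes its end of $1.
relay_once() {
    src=$1
    dst=$2
    # On a regular file cat would hit EOF immediately and the
    # outer loop would spin, so refuse anything that is not a fifo.
    if [ ! -p "$src" ]; then
        echo "relay: $src is not a fifo" >&2
        return 1
    fi
    cat "$src" > "$dst"
}

# Keep relaying until relay_once fails.
relay_forever() {
    while relay_once "$1" "$2"; do
        :
    done
}
```

It would be started with something like `relay_forever /chroot/dev/log /dev/log &`, using the same two paths as the one-liner above.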

So I wrote a nicer wrapper around this line (97 lines, to be exact), and published it here:

http://www.mayoxide.com/toolbox/log_redirector.sh

Bye

Thursday, September 18, 2008

Cable management - a whitepaper from HP

The art of cable management is a trade that seems to be learned from "father to son" and there are few documents out there that actually show how to do it.

The following whitepaper is a start, but it has few pictures:
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01085208/c01085208.pdf?jumpid=reg_R1002_USEN

Wednesday, September 10, 2008

Paper: Understanding routing in multi-homed HP-UX environments

Multi-homed HP-UX servers, especially ServiceGuard nodes, present a challenge in terms of routing, which is further exacerbated by the lack of documentation on the subject.

This paper tries to explain how to configure multi-homed servers to enhance the routing of IP packets and prevent asymmetric routing.

Note that I call this paper a "graypaper". I do not work for HP, nor do I have any internal knowledge of HP-UX. The information in this paper has been determined by looking at the output of tcpdump and by reading publicly accessible documentation and posts in the ITRC forums. This document is provided "as is" without any warranty.

Click here to read this paper.

Friday, September 5, 2008

mail loops haunting me again

This week, a major mail loop slowed down our company's mail servers. The cause was one of my servers that, in 12 hours, managed to send over 125K e-mails. Counting all the bounces that came back, the total number of e-mails must have been around 250K.

Here were the ingredients:

1. All our servers have a local sendmail daemon active. This is a requirement for our applications that speak SMTP to localhost:25. From a security standpoint, I had IP Filter filtering port 25 so I didn't modify the default sendmail configuration too much as I wanted it to remain as standard as possible.

2. After a few months, I forgot about point #1, of course. For a long time, I was under the impression that we had no sendmails listening at all.

3. Last week, we stopped IP Filter on one of the servers which was having some networking problems, and since it's a mission-critical one, I didn't have the guts to restart it. So this basically made the SMTP server active to the outside world.

The 3 ingredients were in place for a mail loop. Here's how it happened:

1. Thursday, I killed a process on the server, and an e-mail was generated with a missing process alert. The e-mail was sent to root.

2. All mails destined to root are redirected, through /etc/mail/aliases, to an MS Exchange mailing list that includes all the system administrators.

3. One of our administrators, let's say John Doe, had been on leave for a while, and his mailbox was full.

4. The mail bounced back with a message stating that John Doe's mailbox was full. Its return address was root@server.

5. Since the server had sendmail, and its port was unfiltered, it picked up the mail and tried to deliver it to root.

6. Back to step #2, 150000 times.

Now that loop lasted for a while until I got back to work.
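A loop like this shows up quickly if you count deliveries per recipient in the mail log. The following is only a sketch: count_recipients is a made-up name, and it assumes sendmail-style log lines where the recipient appears as a whitespace-delimited `to=address,` field. Check the format of your own log before relying on it.

```shell
#!/bin/sh
# Sketch only: count_recipients is a hypothetical helper.
# Assumes each delivery line carries a "to=address," field.

# Print "count recipient" for every to= field in file $1,
# busiest recipient first.
count_recipients() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^to=/) {
                    addr = substr($i, 4)      # drop the "to=" prefix
                    gsub(/[<>,]/, "", addr)   # drop brackets and trailing comma
                    seen[addr]++
                }
            }
        }
        END { for (a in seen) print seen[a], a }
    ' "$1" | sort -rn
}
```

Running it against the mail log and looking at the first few lines would have shown root receiving tens of thousands of messages.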

To prevent this in the future:

1. I spent some time making sendmail "send only". The HP-UX sendmail.cf generator, gen_cf, sucks big time but I found out that by setting send_only and modifying /etc/rc.config.d/mailservs, gen_cf adds the correct DaemonOptions to restrict sendmail to listening on 127.0.0.1. So even if IP Filter is stopped, at least any bounce will be refused by the server.

2. IP Filter should also be restarted ASAP.

3. I also redirected postmaster and MAILER-DAEMON to /dev/null (they are sent to root by default) so that if steps 1 and 2 are not followed, at least these addresses won't participate in the loop.

4. I checked how sendmail could be throttled to limit the number of e-mails sent in a specific time period. There are macros available for this, but I'd rather not deviate too much from the default settings.

5. I also think the Exchange administrators should reconsider the "let's send a bounced mail each time a mailbox is full" strategy. I know nothing of Exchange but I strongly believe this can be throttled. If an account has, say, 10 bounces a second, this feature should be automatically deactivated.
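Since forgetting that sendmail was listening is what made this possible, the "send only" change in point 1 is worth checking from time to time. Here is a rough sketch that reads `netstat -an` output and tells whether port 25 is bound to loopback only. The function names are invented for this example, and it assumes the usual column layout (protocol, two queue columns, local address, foreign address, state), with the port separated from the address by either a dot or a colon.

```shell
#!/bin/sh
# Sketch only: smtp_listeners and smtp_loopback_only are hypothetical
# helpers. Both read `netstat -an` output on standard input.

# Print the local address of every TCP socket listening on port 25.
smtp_listeners() {
    awk '$1 ~ /^tcp/ && $NF == "LISTEN" && $4 ~ /[.:]25$/ { print $4 }'
}

# Succeed only if there is at least one port 25 listener and
# every one of them is bound to 127.0.0.1.
smtp_loopback_only() {
    smtp_listeners | awk '
        { n++ }
        $0 !~ /^127\.0\.0\.1[.:]25$/ { bad++ }
        END {
            if (n > 0 && bad == 0) exit 0
            exit 1
        }
    '
}
```

A cron job could then run `netstat -an | smtp_loopback_only || echo "port 25 is open to the outside"` and mail the result somewhere that isn't root.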

As a side note, having support from the manufacturer is important to me. So don't tell me to install postfix or qmail. I don't want to. If I die, quit or go on a hell of a long vacation, I expect any less experienced admin to be able to call HP directly and be supported. That's why I'm relying on the subsystems that are included with HP-UX (sendmail, apache, tomcat, wu-ftpd, etc.) and not the open-source ones. Yes, they're outdated and yes, they're not necessarily the best of breed, but they work. Furthermore, any security patch is issued by HP, so I don't need to take care of that either.

Preventing asymmetric routing under multi-homed HP-UX hosts

I've been having problems with asymmetric routing for a week now, and found some interesting tidbits on the routing algorithm of HP-UX. Most of this was done with experimentation and lots of sniffing with tcpdump. There are so few documents on this subject that I'm working on what I call a graypaper. I should post it eventually. Serviceguard nodes are especially prone to this because many of them are multi-homed.

In the meantime, if you experience some asymmetric routing, send me an e-mail. There are some interesting ndd and route settings that can be tweaked to circumvent it. You need 11iv2 or later, or 11iv2 with TOUR 2.4.