Writing

Software, technology, sysadmin war stories, and more.

Tuesday, August 23, 2011

Support issue complexity metric meets Apache's VirtualHost

Support ticket complexity is one of those things that all of the bean counters want to quantify and then use to do Other Stuff. Usually it's just about bar charts and dumb things like that which make them look useful. Trouble is, a ticket's complexity is easily a function of both the issue at hand and who you have working on it. Here's one example.

I came back from lunch one night with a couple of friends. There were two or three techs staring at a laptop, trying to make sense of some problem. Someone said I should go take a look, so I did. On screen and over a shoulder, I was able to read something resembling this:

<VirtualHost 72.3.x.y>
ServerName www.something.or.other
ServerAlias www
CustomLog /path/to/some_log combined
</VirtualHost>

The guys who were working that ticket had a 'tail -f' running on that CustomLog file, but nothing was going into it. They'd reload a page in their browser and nothing would show up in that file. Instead, their web hit would land in the default log file. They were annoyed and confused and wanted to know why.

Now, it's hard to really get a feel for a situation when all you have is one screenful of text left over from a prior command being run (their editor, in this case) and what they've told you. It's even harder when you can't just reach over and start typing. But still, I had an idea.

"Hey, run ifconfig for me." They did, and I had my answer right there. I told them what was up. They were happy to have it solved, and yet were also annoyed that I could just nail it without ever touching the keyboard.

All of the interfaces which had scrolled by had RFC 1918 addresses -- 192.168.x.y type stuff. Without even looking at the customer's config in our provisioning system, I could tell from that alone that they were behind a firewall with NAT enabled.

Now flash back to that VirtualHost snippet from before. It's set up with a 72.3.x.y IP address -- one of the external addresses allocated to that customer. That means Apache will only use that block for hits which arrive at that particular address. Trouble is, thanks to the NAT, that will never happen: the firewall rewrites the destination to an inside address before the packet ever reaches the server, so as far as Apache can tell, every connection comes in on 192.168.x.y.
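To make that concrete, here is a rough Python sketch of address-based virtual host selection. It is a simplified model for illustration, not Apache's actual matching code. The post elides the real addresses, so `OUTSIDE_IP` and `INSIDE_IP` are hypothetical stand-ins taken from documentation and private ranges, and `VHost`, `pick_vhost`, and `log_for` are made-up names.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-ins: the real addresses are elided in the post.
OUTSIDE_IP = "203.0.113.10"   # plays the role of 72.3.x.y
INSIDE_IP = "192.168.0.10"    # plays the role of 192.168.x.y

WILDCARDS = {"*", "_default_"}


@dataclass
class VHost:
    address: str   # the argument given to <VirtualHost ...>
    log: str       # where hits matching this vhost get logged


def pick_vhost(vhosts: List[VHost], local_addr: str) -> Optional[VHost]:
    """Pick a vhost for a connection that arrived on local_addr.

    Exact address matches win; wildcard vhosts are the fallback.
    """
    for vh in vhosts:
        if vh.address == local_addr:
            return vh
    for vh in vhosts:
        if vh.address in WILDCARDS:
            return vh
    return None  # nothing matched: the main server handles the hit


def log_for(vhosts: List[VHost], local_addr: str,
            default_log: str = "default_log") -> str:
    """Return the log file a hit on local_addr would end up in."""
    vh = pick_vhost(vhosts, local_addr)
    return vh.log if vh else default_log


# Behind the NAT, the server only ever sees connections on INSIDE_IP.
print(log_for([VHost(OUTSIDE_IP, "some_log")], INSIDE_IP))  # default_log
print(log_for([VHost(INSIDE_IP, "some_log")], INSIDE_IP))   # some_log
print(log_for([VHost("*", "some_log")], INSIDE_IP))         # some_log
```

The first call is the situation from the ticket: the vhost is bound to the outside address, the connection arrives on the inside one, and the hit lands in the default log.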

Long story short, those virtual hosts should have been configured with the inside IPs -- 192.168.x.y -- if any at all. There were also things like _default_ and * which might have worked as well, assuming everything else was in order. Basically, they chose the one option which would look correct and not actually work.

I found out later just how long they had spent looking at it, and it wasn't good. Having three techs stare at something for over an hour makes it sound pretty complicated. Having another come in from lunch and knock it out just like that makes it sound like nothing at all.

Of course, none of this should have mattered. There should have been something which would have caught such misconfigurations. I later proposed and then went on to build such a thing to show that it could be done, but it never went anywhere. That too is a story for another time.
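For a sense of what such a check might look like, here is a minimal Python sketch. It is not the tool mentioned above, just one way the idea could be approached. The function names are hypothetical, the addresses are documentation and private-range stand-ins, and the list of local addresses is passed in by hand rather than read from the machine. It only handles literal IPv4 addresses and wildcards, so a hostname used as a VirtualHost argument would be flagged too.

```python
import re
from typing import Iterable, List

# Matches an opening <VirtualHost ...> tag and captures its address list.
VHOST_RE = re.compile(r"^\s*<VirtualHost\s+([^>]+)>",
                      re.IGNORECASE | re.MULTILINE)
WILDCARDS = {"*", "_default_"}


def strip_port(addr: str) -> str:
    """Drop a trailing :port from an IPv4 address or wildcard (no IPv6)."""
    host, sep, port = addr.rpartition(":")
    return host if sep and (port.isdigit() or port == "*") else addr


def unreachable_vhosts(config_text: str,
                       local_addrs: Iterable[str]) -> List[str]:
    """List VirtualHost addresses that are not configured on this machine."""
    local = set(local_addrs)
    bad = []
    for match in VHOST_RE.finditer(config_text):
        for addr in match.group(1).split():
            host = strip_port(addr)
            if host not in WILDCARDS and host not in local:
                bad.append(host)
    return bad


# Hypothetical config: one outside address, one inside address, one wildcard.
config = """
<VirtualHost 203.0.113.10>
ServerName www.example.com
</VirtualHost>
<VirtualHost 192.168.0.10:80>
ServerName www.example.org
</VirtualHost>
<VirtualHost *:80>
ServerName www.example.net
</VirtualHost>
"""

print(unreachable_vhosts(config, ["127.0.0.1", "192.168.0.10"]))
# ['203.0.113.10']
```

Only the block bound to the outside address gets flagged, which is exactly the kind of thing that would have saved those three techs their hour.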