Writing

Software, technology, sysadmin war stories, and more.

Friday, March 9, 2012

This customer needed a much longer extension cord

Working in tech support introduces you to a whole bunch of problems. This is what you expect. After all, when things are going properly, you're not going to hear about it. It's only when something really bad happens that it comes up on your radar.

Not all of these problems were technical, and a fair number of them didn't even happen as a result of something the customer did. Frequently, they would come about when some overzealous sales monkey "set up us the bomb" as we used to say. These things would always blow up during second shift when the responsible party wasn't on site. We had to clean up their messes.

One night, a customer opened a ticket to ask us how to set up "privatenet" between his servers. This is what we called an additional network running on the secondary NICs of a customer's servers. This sort of thing existed so that you could split up web serving and database machines without running up your bandwidth counter. Bandwidth was charged on the primary port (which could see the outside world), and was ignored on the privatenet ports.

So when this customer asked to set it up, it seemed a little anomalous, but I looked into it anyway. Normally, new servers just go online with all of their IPs set up already, so there's nothing magic needed to get the privatenet going. I looked at this guy's setup. He had two servers, all right, but there was a big problem.

They were about 1500 miles apart.

Going across the Internet to talk to your database server fails in multiple ways. First of all, there is the potential for exposing your database server to the outside world if it isn't set up properly. Nobody but you should be talking to that thing. Next, there's latency, since all of those requests have to get to a far-away machine and then come back.

Third, there is the matter of privacy. If those MySQL connections are running in the clear as usual, anyone in the middle could see what your site's backend traffic looks like. This could leak a bunch of personally identifiable information (PII) about your customers. Obviously, this is not what you want!

Finally, there's bandwidth. All of your database queries will generate outgoing traffic from your web server as they head to your database server, and then the responses will generate outgoing traffic from the database server on their way back. You wind up paying for every byte. You could do this, but why? This is also why "just run OpenVPN between the two machines" would have been a poor solution.
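To put rough numbers on the latency and bandwidth points, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function names are made up, the signal speed is the usual "about two thirds of the speed of light in fiber" rule of thumb, and the query counts and byte sizes in the usage note are hypothetical, since I never measured this customer's traffic. Only the 1500 miles comes from the story.

```python
MILES_TO_KM = 1.609344
# Assumption: signals in optical fiber travel at roughly 2/3 the speed of
# light in a vacuum, or about 200,000 km per second.
FIBER_KM_PER_SEC = 200_000


def min_round_trip_ms(distance_miles: float) -> float:
    """Best-case round trip over a straight fiber run, in milliseconds.

    This is only a lower bound. Real routes are longer than the
    straight-line distance, and routers, queues, and the database itself
    all add more delay on top.
    """
    one_way_sec = distance_miles * MILES_TO_KM / FIBER_KM_PER_SEC
    return 2 * one_way_sec * 1000


def page_delay_ms(distance_miles: float, sequential_queries: int) -> float:
    """Minimum extra delay for a page that runs its queries one after another."""
    return min_round_trip_ms(distance_miles) * sequential_queries


def billed_bytes(request_bytes: int, response_bytes: int, queries: int) -> tuple:
    """Outgoing bytes counted on each server's public port.

    Returns (web_server_out, db_server_out, total). Requests leave the web
    server and responses leave the database server, so both sides get billed.
    """
    web_out = request_bytes * queries
    db_out = response_bytes * queries
    return web_out, db_out, web_out + db_out
```

With these assumptions, `min_round_trip_ms(1500)` comes out to a little over 24 ms, so a hypothetical page that runs 20 queries back to back spends close to half a second just waiting on the wire. Likewise, `billed_bytes(200, 5000, 1000)` shows that a thousand small queries with 5 KB responses put about 5.2 MB on the bandwidth counters, all of which would have been free on privatenet.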

I told the customer that privatenet between his servers would be impossible. It was only supported between devices in the same location, and only if they were in certain racks depending on where they lived. I posted this as an update to his ticket and sent it off.

A few minutes later, he called in, and wanted to talk to me. He was not happy. I sighed and took his call, and found out what had happened.

The sales guy had "popped" his new server in the brand new data center in Texas, while his original server was in Virginia. Normally this would not be a problem if you were trying to migrate from one to the other, but this guy was trying to split up web and db duties. In that situation, you want two boxes in the same location with quick privatenet between them.

We had screwed up and needed to fix it post-haste. I told my boss what was going on, and we called our second shift account manager to come in and help out. He was the only person who could "pop" a new machine in the right location (Virginia), and he wasn't working that night. He recognized the nature of the problem and drove in anyway.

One of us called the data center and warned them that we would be sending a build order for a new machine as soon as we could. They decided to pull the parts ahead of the official "new contract" notification and got right on building it.

I kept an eye on things. The new machine went online at 12:02 AM. This particular drama had started at 6:47 PM the night before with that original "how do I set up privatenet?" ticket. We had managed to contain it to just our shift and hopefully didn't suffer too much of a hit to our reputation with that particular customer.

When it's late at night and someone else has left you a nice steaming pile, all you can do is sweep it away and hope that their day-shift masters deal with them later.