Software, technology, sysadmin war stories, and more. Feed
Thursday, May 23, 2013

As ugly as possible and as fast as possible

I'm an enabler. I've done things which let incredibly broken software projects proceed by whipping up stopgap solutions given basically zero notice. I thought I was doing the right thing back then, but in the long run, the project's fundamental flaws brought it down.

Roll back a few years. I was working as a "systems developer", which basically meant I was there to write programs which did stuff that other people needed. If the support teams wanted something to automate their SSL certificate signing requests, I whipped one up. If they needed a better way to grovel around in tickets to find search terms, I'd build one.

None of this was intended to go into production. Everything I did was intended to be a demonstration of what could be done and also to prove that a use case existed. Then, the "real" developers in the company (in a completely different department) should have taken the cue from that and integrated it into the ticketing and provisioning systems.

One night, my manager came to me with a request. It seemed the company was about to throw the switch on their latest crazy project. They had purchased this third-party system to hold on to customer and account info. Up to this point, it had been stored in a system which had been built right there at the company. This system was just a big web app, and it basically worked from any recent browser with JavaScript enabled. This new thing only worked in Windows.

Instead of having all of support switch to Windows, they just rigged it so the ticketing system could "reach into" this new system to get customer/account information. Before, all tickets and customer data lived in the same database. You could do a JOIN to get customer data along with a ticket, for instance.

Once this changed, customer data lookups were now handled by a RPC call out to the other system. The ticketing system would get ticket data from its own database, then it would fire off a RPC to the (Windows-based) new system, and then it would squish it all together to make the results page.

They were about to roll this out, but they had cut a bunch of corners to make it happen. They threw out entire features to let it "ship" on time. They probably think they didn't matter. They were wrong.

One of the features was something we'll call "super search". It was the sort of thing you'd use when a customer called in and didn't know their account number. This form had a whole bunch of input fields, and you could put in a little bit of information to look for possible matches. You could do partial matches on the account name, customer name (that is, the actual human on the phone), server name, and several other fields.

Let's say I called in and for some reason didn't know my account number. They could have plugged in "rachelbythebay" and probably would have found my account since it would match part of my server name.

Well, during this switchover, there would be no "super search". Support techs who got a phone call from someone who didn't know their account number would be stuck. They'd go to try it, and it would just fail with some kind of cryptic error.

This is when it showed up in my life. My manager asked me to do something about this. Basically, could I grab a copy of the old database before they throw the switch? Could I then build some kind of web frontend which would let people search it? Oh, and they're throwing the switch tomorrow morning, so can you get right on it?

I said the data would get stale the instant I took a snapshot, and it would only get worse as time went by. The longer we had to use it, the more detached from reality it would get. Also, it wouldn't be fast. I wouldn't optimize it for run time. Instead, I'd optimize it for development time. She said that was just fine and asked if I could deliver something right away.

I actually agreed and went to it. The first thing I did was to write a query which would dump out the entire customer database. It purposely "denormalized" all of the data. The way it usually worked was that you'd have an account ID, and that would map to a row in a table with the account data. Then there was another table which mapped account IDs to actual people, each with their own "person ID". Then, there was a table with nothing but people.

I wanted all of this data in one place and didn't care about the duplication or keeping it clean, so I did the sort of query which gives a bunch of repeated stuff in the output. It looked like this:

account 1 | joe bob's varmint removal | joe bob
account 1 | joe bob's varmint removal | billy bob
account 2 | rachel by the bay, inc.   | rachel
account 3 | south park studios llc    | stan marsh
account 3 | south park studios llc    | eric cartman

I got something like a million rows from this. It was everything in the database fully blown up every which way. It's like doing a SELECT * with multiple tables involved. I didn't really care. Now I had all of it and didn't have to worry about trying to map the schema across. I just created a new table in my own database which would have all of the fields, and then wrote something to insert all of the stuff from that dump into it.

INSERT INTO stopgap ( account_id, account_name, contact_name, ... )
VALUES ( 1, "joe bob's varmint removal", "joe bob", ... )

Now I had a huge table full of data, and it was just a matter of providing some way to search it. This part was easy. The frontend was just a static web page with a giant FORM which fired off the data via POST to a dumb little CGI program. That program took the non-empty components of the query and generated a bunch of SELECTs using LIKE and %. Yep, it was ugly. Very ugly. I practically optimized for ugly.

As ugly as it was, it worked, and I was able to deliver it in a couple of hours. The next morning, it was there for anyone who needed it. It took a few seconds to run, but it was better than nothing at all.

Did anything special happen as a result of this? Nope. Nobody cared.