Writing

Software, technology, sysadmin war stories, and more.

Monday, September 17, 2012

Attacking a build warning when nobody else will

How about another coding story from the world of corporations?

I transferred to a team which had a continuous build and unit tests which were both horribly maintained. Essentially, they weren't maintained. I know this because the green/red status indicator was red the whole time... only none of them knew it, because they didn't bother to look.

Now, back on my old project, I had installed a little browser extension to help keep tabs on my build and tests. It would connect to the build and test servers to fetch their status, and would show as either green or red. It was easy to know when something broke, and since I always had a browser running, I'd never miss it.

When I changed projects, I re-pointed that extension at the continuous build for the new project, and that's when I found out just how bad it really could be. That indicator went red and stayed there. There was much to be done before it would ever become useful.

One of many cleanups I wound up doing involved porting some Python code. There was something or other in part of our tree which used Django to run a small web service. Every time our "build everything" target ran, it would also go through that web service and attempt to roll it into a package. It would succeed, but not without throwing a warning about deprecation. Apparently we were using version X, support for that was going away soon, and we needed to upgrade to version Y.

I asked about this and found out from the nominal team lead that he had seen it, knew about it, and may have even tried to do something about it. He actually told me at some point that he intended to just let it stay there as long as possible, possibly because nothing was "actually broken" yet.

At first, I let this go since there were other bigger (and smellier) cleanup fish to fry, but still it sat and persisted. Nobody else seemed to care, or really even notice. Finally, one day, I decided to see what could be done about it. I didn't know much about Django, and never was much of a Python person, but that wasn't going to stop me from learning about the situation. For all I knew, it could have just been a matter of editing a couple of fields to say "use version Y" and that would be it.

Most of it was in fact just a bunch of mechanical version flipping. I'd go in and find something that referenced one version and change it to the new one. There were a couple of small interface changes which needed to be handled, but again, all of this was no big deal. There was a document which explained what had changed from one version to the next, so it really just came down to making it happen.

Finally I got down to one lingering trouble spot. It's the spot which makes me wonder if this is why nobody else bothered to go through with it before. This newer version of Django we would now be using was UTF-8 clean. Whereas the old one would accept arbitrary sequences of bytes and didn't really care about what they looked like, this one was strict about compliance.
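The difference between the two behaviors can be sketched in plain Python, with no Django involved. This is only an illustration of lenient versus strict handling, not Django's actual code path. The sample data is made up, and the byte 0xA3 is an assumption picked for the example: it is the pound sign in Latin-1, and on its own it is not valid UTF-8.

```python
# Illustrative only: plain Python, not Django's real internals.
raw = b"price: \xa3 5"  # hypothetical data; 0xA3 is the pound sign in Latin-1


def lenient(data: bytes) -> bytes:
    """Pass bytes through untouched, the way a byte-agnostic layer would."""
    return data


def strict(data: bytes) -> str:
    """Insist on valid UTF-8; raises UnicodeDecodeError otherwise."""
    return data.decode("utf-8")


passed_through = lenient(raw)  # fine: nobody looked at the bytes

try:
    strict(raw)
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False  # the strict path rejects the very same data
```

The same input sails through one path and blows up in the other, which is roughly what an upgrade from a lenient version to a strict one would look like from the outside.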

Upon switching to the new version, it started blowing up when serving one key page. It took me some time to find out what was causing it, but eventually I found the culprit. There was a page which had a <script> block in it, and then inside of that block, there was some Django template magic to "#include" a chunk of JavaScript from a file. That file was not UTF-8 clean, and it was bringing down the whole operation.

I verified this by commenting out the include directive. It wound up disabling the script, but at least it didn't crash, and it proved what the real problem was. Upon inspecting the file, I found it was yet another fork of sorttable, and Django hated it because it contained a character that wasn't encoded as valid UTF-8 -- I believe it was the pound sign (£, as in GBP).
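Hunting for that kind of byte by eye is tedious. A small helper along these lines could point at the first offender. It is a sketch, not what was actually used at the time, and the function names are made up for this example. It relies on the `start` and `end` attributes that Python attaches to `UnicodeDecodeError`.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, offending_bytes) for the first invalid UTF-8
    sequence in data, or None if the whole thing decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start and err.end bracket the bytes the decoder choked on.
        return err.start, data[err.start:err.end]
    return None


def check_file(path: str):
    """Same check for a file on disk, read in binary mode so that
    nothing tries to decode it before we do."""
    with open(path, "rb") as f:
        return find_invalid_utf8(f.read())
```

Calling `check_file` on the suspect JavaScript file would give back a byte offset to jump to in an editor, or None if the file is clean.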

I wasn't about to go down the rabbit hole of trying to edit that file and not break something else, and so I went another route. There was absolutely no reason for a freestanding chunk of JS to exist as an inline script block. It could be served up as its own URL endpoint just like any other resource. So, I flipped the web page around to effectively just do a "SCRIPT SRC=/sorttable.js", and then added a new endpoint to the web app to serve that same file as /sorttable.js.

It all boiled down to the fact that Django was perfectly happy to serve an opaque blob at a given location, just like it would for an image or MP3 or whatever else. It just couldn't take it as a template, because then it actually had to "think about" (parse) the contents. By removing the file from an "include" context, I got it away from that path.
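The "opaque blob" idea can be sketched without any framework at all. What follows is a bare WSGI callable, not the Django code from this story, and the names `load_blob` and `make_app` are invented for the example. The point is only that the file is read as bytes and handed back as bytes, so nothing ever tries to decode it.

```python
# Framework-neutral sketch: a bare WSGI app, not the actual Django change.

def load_blob(path: str) -> bytes:
    """Read a file in binary mode: no decoding, no parsing."""
    with open(path, "rb") as f:
        return f.read()


def make_app(blob: bytes):
    """Build a WSGI app that serves blob, byte for byte, at /sorttable.js."""
    def app(environ, start_response):
        if environ.get("PATH_INFO") == "/sorttable.js":
            start_response("200 OK", [
                ("Content-Type", "application/javascript"),
                ("Content-Length", str(len(blob))),
            ])
            return [blob]  # the bytes go out exactly as they came in
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    return app
```

Something like `make_app(load_blob("sorttable.js"))` could be handed to any WSGI server, and the page would then pull the script in by URL instead of pasting its contents through the template engine.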

That was the last hurdle, so I sent it off for review, got it approved, and checked it in. It went out to production and that was the end of that.

It took me, the Python avoider and newcomer to the team, to clean up the mess. Why it had been allowed to sit and rot like that, I may never know.