Writing

Software, technology, sysadmin war stories, and more.
Tuesday, August 27, 2013

A roaming fixer encounters some strange code

What sort of life does a roaming fixer have? It might involve a whole bunch of reverse-engineering. Here's a bit of a brain-dump on how that work starts.

Let's say you have a big complicated service. It has a bunch of moving parts, all made by different people. The parts come in all sorts of flavors: new, old, stable, wobbly, complicated, simple. The entire system probably "works" for the most part, but like so many things, there might be ways to improve it. Such improvements could save programmer time, operations engineer time, or user time and frustration, for instance. There might be money to be saved if you recover a bunch of CPU time or disk space and don't have to grow your cluster this quarter.

One of these improvement efforts winds up with you being pointed in the general direction of some code. Naturally, you've never seen it before. Really, nobody's looked at it in quite some time. It's been there all along: functional, but not really stellar. Maybe it used to work well but was based on some assumptions which are no longer valid, and in this new world it doesn't perform at the same level.

You encounter the code. It's context-free and comment-free. It has no tests. There are no design notes, and there is no documentation. Now what?

There has to be some way to crack the ice. One approach would be to grab onto an interface and try to run the code. Writing a very dumb and simple unit test would be a start. Assuming it's a class, pick a public function that looks promising. Maybe there's a "set" function. Hit it and see what happens. Then maybe poke "save" and see if it'll dump to disk. You get the general idea.
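A minimal sketch of that first poke might look like this. It's in Python to keep it short, and the Foo class here is a made-up stand-in (including its JSON-on-disk behavior) so the example runs on its own. In real life you'd import the mystery class instead and let the assertions record whatever it actually does.

```python
import json
import os
import tempfile
import unittest


class Foo:
    """Hypothetical stand-in for the mystery class. Not the real thing."""

    def __init__(self):
        self._values = {}

    def set(self, key, value):
        self._values[key] = value

    def save(self, path):
        # Assumption for this sketch: "save" dumps the state to disk as JSON.
        with open(path, "w") as f:
            json.dump(self._values, f)


class PokeFooTest(unittest.TestCase):
    def test_set_then_save(self):
        foo = Foo()
        foo.set("name", "xyzzy")  # hit "set" and see what happens

        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "foo.out")
            foo.save(path)  # poke "save" and see if it dumps to disk

            # Write down what was observed, so it becomes a regression check.
            self.assertTrue(os.path.exists(path))
            with open(path) as f:
                self.assertEqual(json.load(f), {"name": "xyzzy"})
```

Saved as a file, this runs with "python -m unittest". The point isn't to prove the code correct, just to pin down what it does today before anything changes.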

Maybe that works, or maybe not. When I get stuck at this point, I usually start working backwards from the code to try to come up with some pseudocode. Basically, I write the pre-code notes I would have created way back when I first wrote it... if I had written it. I essentially turn the code into something between English and the code's own language, dropping bits of the syntax which aren't really useful.

Here's a really stupidly simple example.

foo.h:

Foo::checkWOD()
  - poke backend for word list
    - if this fails, try the cache
      - if not found in the cache, yell and return
  - flip through list
    - if name == 'xyzzy': scream('word of the day') and return true
    otherwise return false

As you can see, it's a mix of C++-ish syntax, python-ish "ifs" and single-quoted strings, and some plain old bullet point action going on. Even though there's only really 6 lines of description up there, the actual implementation might be 30 or 40 lines of code depending on how it was written and organized.
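To give a feel for that expansion, here's one guess at what sits behind those notes, written in Python rather than C++ to keep it compact. Everything beyond the notes is an assumption: the backend callable, the dict-like cache, the "word_list" key, the BackendError type, and using a logger for "yell" and "scream". The notes also don't say what gets returned after yelling, so this sketch picks False.

```python
import logging

log = logging.getLogger("foo")


class BackendError(Exception):
    """Hypothetical error for when the backend can't be reached."""


class Foo:
    def __init__(self, backend, cache):
        # Both collaborators are assumptions for this sketch:
        # backend is a callable returning a list of names,
        # cache is a dict-like object.
        self._backend = backend
        self._cache = cache

    def check_wod(self):
        # poke backend for word list
        try:
            words = self._backend()
        except BackendError:
            # if this fails, try the cache
            words = self._cache.get("word_list")
            if words is None:
                # if not found in the cache, yell and return
                # (return value is a guess; the notes don't say)
                log.error("no word list from backend or cache")
                return False

        # flip through list
        for name in words:
            if name == "xyzzy":
                log.warning("word of the day")  # scream(...)
                return True

        # otherwise return false
        return False
```

Each bullet from the notes shows up as a comment, which is a handy way to check that the notes and the code still agree.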

If I had written this in the first place, something like this probably existed at some point. It might have been on paper, in a text file, on a marker board, or just in my head for a while, but it was there. Here, I had to work backwards to get this sort of thing.

What next? Well, who talks to this code? What sort of hooks do they have into this code? In other words, where are the "seams" between the code I'm targeting and every one of its customers? Are they clean or messy? Is it going to be easy to do a replacement with a new implementation, or will the call sites themselves need to be reworked too?

That might be another list of notes, starting with the filenames found in a grep, and expanding with details of what those bits of code do, and what parts of the target code are accessed. Sometimes you get lucky and can find things which aren't even used any more -- dead code! This means it doesn't need to be ported to the new scheme.
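If plain grep output gets unwieldy, a tiny script can produce the skeleton of that list. This is only a sketch: the function names and the default file suffixes are my own picks, and it does a whole-word text match, nothing smarter.

```python
import os
import re


def find_call_sites(root, symbol, suffixes=(".h", ".cc", ".cpp")):
    """Return {path: [(line_number, line_text), ...]} for whole-word hits."""
    pattern = re.compile(r"\b" + re.escape(symbol) + r"\b")
    suffixes = tuple(suffixes)
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(suffixes):
                continue
            path = os.path.join(dirpath, filename)
            # errors="replace" keeps odd bytes in old files from stopping the scan
            with open(path, errors="replace") as f:
                for number, line in enumerate(f, start=1):
                    if pattern.search(line):
                        hits.setdefault(path, []).append((number, line.strip()))
    return hits


def print_notes(hits):
    """Print one heading per file with a bullet per hit, ready for annotating."""
    for path in sorted(hits):
        print(path)
        for number, line in hits[path]:
            print(f"  - line {number}: {line}")
```

Something like print_notes(find_call_sites("src", "Foo")) gives a file-by-file outline, where "src" is a placeholder for wherever the tree lives. A file that used to show up and no longer does is a hint that dead code may be lurking.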

Sometimes, people do sneaky things. Stuff like pointers and references makes it possible to hand your object to other code, which can then call you from weird places. It might even do this under a name which doesn't resemble the name you grepped for. This expands your search parameters. Now you have to look for references to Foo, but also references to TheBar. Yeah, it happens.
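One crude way to widen the net is to scan for the places where a new name gets attached to the old one. The sketch below is Python looking at C++-style source text, and the three patterns are just the obvious cases I'd try first. It will miss plenty (auto, templates, smart pointers, macros), so treat the output as leads, not as a complete list.

```python
import re


def find_aliases(source, symbol):
    """Return sorted names that look like aliases of symbol in C++-ish text."""
    sym = re.escape(symbol)
    patterns = [
        rf"\busing\s+(\w+)\s*=\s*{sym}\b",   # using TheBar = Foo;
        rf"\btypedef\s+{sym}\s+(\w+)\s*;",    # typedef Foo TheBar;
        rf"\b{sym}\s*[*&]\s*(\w+)",           # Foo* bar  or  Foo& bar
    ]
    names = set()
    for pattern in patterns:
        names.update(re.findall(pattern, source))
    return sorted(names)
```

Each name it returns becomes another search term, and the process repeats until no new names turn up.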

There's more, of course, but this is how it starts.

This is the kind of stuff which keeps me busy.