
Saturday, November 10, 2012

Writing tests first can be useful in some situations

I like to think of some practices as "being able to reach backwards in time" to make things happen before you get to a certain point. In software development, I think that having a strong testing regime can be one of those things. Not every task calls for that kind of rigor, but those which do can get some interesting results from it.

In my days of wrangling test automation infrastructure, I wound up writing most of a system which was supposed to use as much of the company's Secret Sauce as possible instead of re-inventing things. The previous project had a serious case of NIH (not-invented-here syndrome). Now, NIH at this company was common, but it was always aimed at the outside world. It was rare to find it happening to something created internally, particularly when it was fundamental plumbing stuff being maintained by a dedicated team.

One thing I had to create during all of this was a way to select variants of the Linux kernel for testing. It needed to allow choosing things based on the presence or absence of "tags" which would be applied to different trees. That way, you could have a "mainline" tag, a "burnin" tag, a "HEAD" tag, and so on. Those three tags would probably be mutually exclusive, but there could be others mixed in: "i386", "x86_64", "secretneutronaccelerator", and so on.

I came up with a scheme which would allow all of this to happen. Each kernel tree would have any number of tags. Then, any given test could have a filter specification with a few parameters: "all", "relation", "include", and "exclude". Here's how it worked.

If "all" was true, then your list started with the set of all known kernels. You might use this if you wanted to test everything. In this mode, "relation" and "include" were ignored since they had no meaning. You could set "exclude" and remove kernels with certain tags. This would give you the set of all kernels minus anything matching those tags you specified. In practice, this would be used as a "test everything except for these two or three weird ones" handler.

You could also set "all" to false, and then your list would start with nothing in it. You'd have to explicitly add kernels to the list by using "include" statements to match tags. This is also where the "relation" came in. It could be set to "AND" or "OR" for the appropriate boolean logic. With this, you could scoop up kernels which matched any of the tags you had listed, or only those which happened to have all of them. In this mode, "exclude" was ignored.
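To make this concrete, here's a rough sketch of the logic in Python. The names are hypothetical stand-ins (the real thing lived inside an internal system), but the behavior follows the rules above:

def filter_kernels(kernels, use_all, relation=None, include=(), exclude=()):
    # kernels maps each kernel's name to its set of tags.
    if use_all:
        # Start from everything and drop anything carrying an excluded
        # tag. "relation" and "include" are ignored in this mode.
        return {name for name, tags in kernels.items()
                if not tags & set(exclude)}
    # Start from nothing. "exclude" is ignored in this mode.
    wanted = set(include)
    if relation == "OR":
        # Keep kernels matching at least one of the included tags.
        return {name for name, tags in kernels.items() if tags & wanted}
    if relation == "AND":
        # Keep kernels carrying every one of the included tags.
        return {name for name, tags in kernels.items() if wanted <= tags}
    raise ValueError("relation must be AND or OR when use_all is false")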

Given this design, I decided to test it on my whiteboard. I created six theoretical kernels and gave them a mix of tags. It looked something like this:
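k1: tag1, tag2
k2: tag1
k3: tag1, tag2
k4: tag1, tag3
k5: tag3
k6: tag2

(That's one tag assignment consistent with the expected matches below; the exact mix matters less than making sure every mode gets exercised.)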

Then I had a series of configurations and the expected matches (an "X" means that parameter is ignored in that mode):

all: true,  rel: X,   inc: X,          exc: tag2        = k2, k4, k5
all: true,  rel: X,   inc: X,          exc: tag2, tag3  = k2
all: false, rel: OR,  inc: tag1, tag2, exc: X           = k1, k2, k3, k4, k6
all: false, rel: AND, inc: tag1, tag2, exc: X           = k1, k3
all: false, rel: AND, inc: tag4,       exc: X           = (empty set)
all: true,  rel: X,   inc: X,          exc: (empty set) = k1, k2, k3, k4, k5, k6

[Image: whiteboard filter scribbles]

Given all of this, it was a simple matter to translate the table mechanically into a series of tests. The kernels would be initialized the same way every time, and then I'd make a series of calls to the filter code, one for each configuration above. Each test would then verify that the output set was equal to what I had predicted in my table.
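Continuing the hypothetical Python sketch from above, those rows turn into tests almost verbatim:

KERNELS = {
    "k1": {"tag1", "tag2"},
    "k2": {"tag1"},
    "k3": {"tag1", "tag2"},
    "k4": {"tag1", "tag3"},
    "k5": {"tag3"},
    "k6": {"tag2"},
}

def test_exclude_one_tag():
    # all: true, exc: tag2 = k2, k4, k5
    assert filter_kernels(KERNELS, use_all=True,
                          exclude=["tag2"]) == {"k2", "k4", "k5"}

def test_include_or():
    # all: false, rel: OR, inc: tag1, tag2 = k1, k2, k3, k4, k6
    assert filter_kernels(KERNELS, use_all=False, relation="OR",
                          include=["tag1", "tag2"]) == {"k1", "k2", "k3", "k4", "k6"}

def test_include_and_no_match():
    # all: false, rel: AND, inc: tag4 = (empty set)
    assert filter_kernels(KERNELS, use_all=False, relation="AND",
                          include=["tag4"]) == set()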

With that in place, it was then a simple matter to go in and write the code which did all of the work. Also, actually creating some use cases gave me the opportunity to play around and see what a sensible API might look like. Sometimes it's hard to tell exactly how that should work until you see what the callers will look like. Test cases are those first callers.

In any case, knowing I wanted to have high test coverage for this project meant writing a bunch of tests. Knowing I'd have to test things forced me to come up with test cases. It also subtly steered my decisions toward things which were easier to test instead of those which might have required a lot of rework later to add the right kind of testing hooks.

This is the ultimate result: long before I get my test coverage numbers back, I'm already designing things to plug together in a way that lets me test them. The alternative would be to barrel on ahead and try to figure it out later. Of course, by then, the code is in use, it's potentially fragile, you're afraid to change it, and so on.

Time and resources aren't always such that this sort of thing will work. When it's possible, I recommend it. When it can't be done, try not to stress about it too much. It would be silly to think that one tool can solve all problems in all situations.