Writing

Software, technology, sysadmin war stories, and more.

Saturday, November 19, 2011

Multithreaded bug finding, and an idea: temporal fuzzing

A couple of years ago, I had an idea about a way to mess with programs to look for design or implementation flaws pertaining to parallelism. I haven't had a chance (or really, a reason) to try it in recent times, so I figured I'd share it here. The alternative would be sitting on it, and that's just evil. There's enough of that going on in the software world already.

My idea is simple enough. I call it temporal fuzzing.

What's that, you say? Well, fuzzing is what you call it when you lob all kinds of ridiculous data at your program to see what happens. If it has some unhandled corner cases, you might find interesting bugs that way before the evil hax0rs do.

Temporal fuzzing, then, would be injecting a bunch of variable-length delays into your program. You do this to take a program which usually behaves the same way every time and make it purposely behave differently.

Now, again, I haven't actually tried doing this, but it would effectively look like this:

usleep(a);
do_something();
usleep(b);
do_something_else();
usleep(c);

All of those "usleep" calls would be injected by the temporal fuzzer. Then the values of a, b, and c would be varied on every pass.
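To make that a little more concrete, here's a rough sketch of what the injected side of things might look like. Everything here is hypothetical, including the names: the fuzzer would give each injected call site its own slot in a delay table, and the harness would re-randomize that table between passes.

```c
#include <stdlib.h>

/* One entry per injected call site, in microseconds.  These are the
 * "a", "b", and "c" from the snippet above. */
static unsigned fuzz_delay_us[3];

/* Hypothetical injection point: each rewritten call site would expand
 * to something like  usleep(FUZZ_DELAY(0));  before the real call. */
#define FUZZ_DELAY(site) (fuzz_delay_us[(site)])

/* Pick a fresh combination of delays for the next pass.  Seeding from
 * a known value means a failing pass can be replayed exactly. */
static void randomize_delays(unsigned seed, unsigned max_us)
{
    srand(seed);
    for (int i = 0; i < 3; i++)
        fuzz_delay_us[i] = (unsigned)rand() % max_us;
}
```

Seeding the random generator is what makes this more than just flailing around: once a combination breaks something, you can regenerate the exact same delays and watch it break again.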

The idea here is to purposely trigger the worst case scenario. Instead of having to wait for that .001% case where your test chokes due to some bizarre scheduling thing between threads, why not force it to happen?
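Here's a contrived example of the kind of bug this would dig up: a classic lost update between two threads. The delays are deliberately huge (hundreds of milliseconds) so the outcome is all but deterministic for the sake of illustration; a real fuzzer would be sweeping much smaller values to find the window.

```c
#include <pthread.h>
#include <unistd.h>

static volatile int shared = 0;
static unsigned injected_delay_us; /* set by the "fuzzer" per pass */

/* Read, pause, write back.  The usleep() stands in for a delay the
 * temporal fuzzer would splice into the read-modify-write window. */
static void *increment(void *arg)
{
    int tmp = shared;            /* read */
    usleep(injected_delay_us);   /* injected delay widens the window */
    shared = tmp + 1;            /* write back a possibly-stale value */
    return NULL;
}

static void *overwrite(void *arg)
{
    usleep(100 * 1000);          /* give the other thread a head start */
    shared = 100;
    return NULL;
}

/* Run one pass with a given injected delay; return the final value. */
int run_pass(unsigned delay_us)
{
    pthread_t a, b;
    shared = 0;
    injected_delay_us = delay_us;
    pthread_create(&a, NULL, increment, NULL);
    pthread_create(&b, NULL, overwrite, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared;
}
```

With no injected delay, the increment finishes long before the overwrite and everything looks fine. Inject most of a second into the window and the stale write lands last, clobbering the other thread's update: the .001% schedule, on demand.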

There's another interesting point to all of this. It's the kind of stupid grunt work which is really annoying for humans to do, but is totally trivial for a computer to do. All it has to do is keep running your test suite with different combinations until something blows up. Then it just reports what it did and where it went wrong, and you can check it out and try to learn from it later on.
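That driver loop is about as dumb as it sounds, which is the point. A sketch, with a stand-in for the real test suite (here it just pretends the bug only surfaces with the schedule seed 7 produces):

```c
#include <stdio.h>

/* Stand-in for one pass of the real test suite: the fuzzer would run
 * the actual tests with delays derived from `seed`.  Returns 0 on
 * success, nonzero when something blows up. */
static int run_test_pass(unsigned seed)
{
    return seed == 7 ? 1 : 0;
}

/* Keep rerunning with fresh delay combinations.  On a failure, report
 * the seed so the exact schedule can be replayed later; return it. */
int fuzz_until_failure(unsigned max_passes)
{
    for (unsigned seed = 1; seed <= max_passes; seed++) {
        if (run_test_pass(seed) != 0) {
            fprintf(stderr, "pass %u failed; replay with seed %u\n",
                    seed, seed);
            return (int)seed;
        }
    }
    return 0; /* nothing blew up in max_passes tries */
}
```

The failing seed is the whole report: feed it back in and you get the same delays, the same schedule, and (with any luck) the same explosion, ready for a human to look at later.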

Sure, there are tools like helgrind (part of valgrind), and they rock, but they do tend to assume a certain amount of interactive use. They also might not cover every scenario and/or use case which something like this might expose.

If nothing else, this sort of trick would give you ammo to use against obstinate developers who might claim that the output from helgrind "proves nothing because it'll never happen". You can turn right around and say, look, right here we have a scheduling scenario where it does happen, so suck it up and get it right.

So there it is. Hopefully this idea will go somewhere.


December 1, 2011: This post has an update.