Saturday, April 14, 2012

Making C++ coverage builds really easy

I've been working on improving my C++ build tool. The last time I wrote about it, things were functional but decidedly early. Input and output files were commingled, and a "make clean" was necessary before doing "git add" or your repository would be polluted with binaries.

Clearly, this was suboptimal. Fortunately, it was only temporary. I've since changed it to use a separate output path. As long as I was doing that, I decided to make it have several output paths. It's my start towards honoring zero, one, infinity.

Why would you want multiple output paths? That's easy. Let's say you have your normal default builds. They don't do anything special. You don't add debugging symbols to them with -g, but you also don't strip those symbols with -s. It's just the same boring stuff you'd get by calling g++ with basically no extra flags.

But hey, sometimes you're in a pickle and you really need those extra symbols to make your gdb or valgrind session easier. That's when you want a debug build with -g added. You also want to be sure that you don't mix and match things here. That is, if you want a debug build, you probably want all of your objects and binaries built that way, not just the ones which changed recently.

Here's what I'm talking about. Take some random project. Do "make". You get a default build, right? Now tell it to do a debug build. Maybe that means "CFLAGS=-g make", or something like that. Does anything happen? I bet nothing happens, because as far as make is concerned, all of the targets have been built.

So let's say you touch main.cc so it gets rebuilt. Make rebuilds that object and anything else which might depend on it and re-links everything into a new binary. Now you run it and trigger your error, and ... no symbols! What gives?

Well, what happened is that your problem happened down in a module which wasn't recompiled with -g this time around, since it had been compiled previously. Now you have to "make clean" and start all over. You also have to do this every time you flip back and forth.
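
Here's a minimal sketch of that timestamp problem which doesn't even need a compiler. It assumes a POSIX shell and make are installed. The one-line Makefile is invented for the demo: its "compile" step just writes the current value of CFLAGS into main.o. The function name demo_stale_flags is made up too.

```shell
#!/bin/sh
# Demo: make decides what to rebuild from timestamps alone.
# Changing the flags does not make an existing object look out of date.

demo_stale_flags() {
  workdir=$(mktemp -d) || return 1
  (
    cd "$workdir" || exit 1
    unset CFLAGS

    # Stand-in source file; its contents don't matter here.
    : > main.cc

    # "target: prereq ; recipe" on one line avoids needing a literal tab.
    # The fake compile step records which flags were in effect.
    printf '%s\n' 'main.o: main.cc ; @echo "flags=[$(CFLAGS)]" > $@' > Makefile

    make > /dev/null              # the "default" build
    CFLAGS=-g make > /dev/null    # ask for debug; main.o already looks current

    # Show which flags main.o was really "built" with.
    cat main.o
  )
  rm -rf "$workdir"
}
```

Calling demo_stale_flags prints flags=[] rather than flags=[-g]: the second make sees that main.o is not older than main.cc and does nothing, so the debug flag never lands.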

Now, let's say your project isn't quite that flexible. Maybe you have to 'configure --enable-debug' to switch that on. Now you're looking at an even longer slog through recompiling every time you want to flip back and forth.

I suppose it is possible to build a project with Makefiles which allows for default, debug, and other sorts of builds and keeps all of their objects and binaries separate without forcing you to re-run configure. I'm not sure I've ever encountered one, though. It seems like it would be a lot of work.

Over here in my world, it's not much work at all. Take a look:

~$ mkdir depot
~$ cd depot
~/depot$ touch .depot.root
~/depot$ mkdir -p src/core
~/depot$ cat > src/core/main.cc
#include <stdio.h>
 
int main() {
  printf("hello from main\n");
  return 0;
}
~/depot$

Notice things have changed a bit from last time. Now there's this top-level directory which has "src", and all of the source code lives under there somewhere. I've added a dumb little example program in there.

Now to build it in the usual way:

~/depot$ dep_cli core/main
I0414 193055 25070 build/deptracker.cc:131] Analysis of all deps done
I0414 193055 25070 build/dep.cc:297] cd src && g++ -Wall -Werror -I. -c core/main.cc -o ../bin/core/main.o
I0414 193056 25070 build/deptracker.cc:199] g++ -Wall -Werror bin/core/main.o -o bin/core/main
-rwxr-xr-x 1 bb users 8003 Apr 14 19:30 bin/core/main
~/depot$ bin/core/main
hello from main
~/depot$

That's it. Not too bad, right? Incidentally, dep_cli is the name of the command line interface to this thing for the moment, in case you're wondering. It doesn't have a good name yet.

Anyway, now let's say I want a debug build. No problem.

~/depot$ dep_cli -c debug core/main
I0414 193339 25097 build/deptracker.cc:131] Analysis of all deps done
I0414 193339 25097 build/dep.cc:297] cd src && g++ -Wall -Werror -g -I. -c core/main.cc -o ../debug/core/main.o
I0414 193339 25097 build/deptracker.cc:199] g++ -Wall -Werror debug/core/main.o -o debug/core/main
-rwxr-xr-x 1 bb users 8611 Apr 14 19:33 debug/core/main
~/depot$ debug/core/main
hello from main
~/depot$

Again, it's no big deal. This one is slightly bigger since it has a full set of debugging symbols courtesy of g++ and its -g switch.

There's also an "opt" build which works just like debug, only it passes "-s" to strip the symbol table out of the binary. It makes for a smaller binary which is harder to analyze with tools like 'strings'.
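
The idea behind the separate output paths can be sketched as a small lookup from build mode to extra flags and output directory. This is not how dep_cli is implemented (it's a C++ program, and its internals aren't shown here). The function names mode_flags, mode_outdir, and compile_cmd are hypothetical. Only the default and debug modes are covered, since those are the two whose command lines appear in the logs above.

```shell
#!/bin/sh
# Hypothetical sketch: each build mode gets its own flags and its own
# output tree, so objects from different modes can never be mixed.

# Extra compiler flags for a mode.
mode_flags() {
  case "$1" in
    default) echo "" ;;
    debug)   echo "-g" ;;
    *)       echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Output directory for a mode ("bin" and "debug" match the logs above).
mode_outdir() {
  case "$1" in
    default) echo "bin" ;;
    debug)   echo "debug" ;;
    *)       echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Print the compile command for a target such as core/main.
compile_cmd() {
  cc_mode=$1
  cc_target=$2
  cc_flags=$(mode_flags "$cc_mode") || return 1
  cc_outdir=$(mode_outdir "$cc_mode") || return 1
  # ${cc_flags:+ $cc_flags} adds a leading space only when flags are non-empty.
  echo "cd src && g++ -Wall -Werror${cc_flags:+ $cc_flags} -I. -c $cc_target.cc -o ../$cc_outdir/$cc_target.o"
}
```

With these definitions, compile_cmd debug core/main prints the same compile line as the debug log above. Because the mode picks the directory, flipping between modes never reuses an object built with the wrong flags.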

Finally, there's one which really makes this worthwhile: coverage builds. This switches on coverage mode at both compile time and link time so that the compiler, linker, and runtime code will spit out instrumentation about which lines of code are run and which are not.

To demonstrate this, first, I'll change the demo code around so we have something slightly more interesting. This time around, main.cc will have a function call, and that function will include a branch which is not taken.

~/depot$ cat src/core/main.cc
#include <stdio.h>
 
static void do_stuff(int i) {
  if (i == 0) {
    printf("hello from main\n");
  } else {
    printf("goodbye\n");
  }
}
 
int main() {
  do_stuff(0);
  return 0;
}

Now I run a little helper script which does all of the magic. First, it builds the target in coverage mode, then it uses lcov to set things up. After that, it runs the program. This makes the coverage code which was linked in emit statistics on which lines were visited. Next, my script runs lcov a few more times to pull everything together.

Finally, it runs genhtml for pretty output. This is what it looks like:

[Image: top-level output from lcov/genhtml]

Not too bad, right? Selecting the core directory gives about the same results, which is to be expected since there is just the one file:

[Image: output from lcov/genhtml for core/]

Finally, picking main.cc, we see the line-by-line analysis:

[Image: output for core/main.cc]

As you can see, there is a line of code which was never run. If we were trying to maximize coverage with our tests, it might be time to write one which deliberately calls that function with a nonzero value.
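
The helper script itself isn't shown here, but the sequence it follows (coverage build, lcov setup, run the program, more lcov, then genhtml) might look roughly like this when done by hand. Treat it as a sketch under assumptions: the function names, the DRY_RUN switch, and the single scratch directory are all made up for illustration, and the exact lcov invocations the real script uses are not known. It relies on g++'s --coverage flag plus the stock lcov and genhtml tools.

```shell
#!/bin/sh
# Sketch of a by-hand coverage run using g++ --coverage, lcov, and genhtml.
# Set DRY_RUN=1 to print each command instead of running it.

run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# usage: coverage_report <source file> <scratch dir>
coverage_report() {
  cov_src=$1
  cov_out=$2

  run mkdir -p "$cov_out" || return 1

  # --coverage instruments the object and links in the runtime support.
  run g++ -Wall -Werror --coverage -c "$cov_src" -o "$cov_out/main.o" || return 1
  run g++ --coverage "$cov_out/main.o" -o "$cov_out/main" || return 1

  # Reset counters left over from any earlier run.
  run lcov --directory "$cov_out" --zerocounters || return 1

  # Baseline capture, so code that never runs still shows up as 0 hits.
  run lcov --capture --initial --directory "$cov_out" \
      --output-file "$cov_out/base.info" || return 1

  # Run the program; the instrumented binary writes its counts on exit.
  run "$cov_out/main" || return 1

  # Capture the counts and merge them with the baseline.
  run lcov --capture --directory "$cov_out" \
      --output-file "$cov_out/test.info" || return 1
  run lcov --add-tracefile "$cov_out/base.info" \
      --add-tracefile "$cov_out/test.info" \
      --output-file "$cov_out/total.info" || return 1

  # Render the HTML report.
  run genhtml "$cov_out/total.info" --output-directory "$cov_out/html"
}
```

Something like DRY_RUN=1 coverage_report src/core/main.cc cov prints the plan without touching anything. Dropping DRY_RUN runs it for real, provided g++, lcov, and genhtml are installed.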

You should see what it looks like when I run it on a much bigger project with bunches of subdirectories, libraries, and header files. It's quite a thing to see.

If you write a lot of C++ code, you should check this out. Let me know if you are interested.

There's life after Makefiles.


April 22, 2012: This post has an update.