Writing

Software, technology, sysadmin war stories, and more.
Wednesday, March 28, 2012

A better build system

Makefiles are really starting to bug me.

Most of the stuff I do when writing code for my projects follows the same patterns over and over. The code is laid out a certain way, the files are named a certain way, #includes flow in a particular manner, and pieces are glued together. Source compiles into objects, objects link into binaries, and sometimes external libraries join the fray if something interesting is going on.

This is what it looks like. Let's say I'm going to create a new project. For the sake of this example, it's going to have two supporting libraries and something with a core which kicks it all off.

~$ mkdir depot
~$ cd depot
~/depot$ mkdir lib1
~/depot$ mkdir lib2
~/depot$ mkdir core

Obviously, we need some actual code. After whipping up some sample files, my tree looks like this:

~/depot$ find -type f
./lib1/lib1.h
./lib1/lib1.cc
./lib2/lib2.h
./lib2/lib2.cc
./core/main.cc

As for the contents, they are simply piddly things, just enough to show that they exist and can run:

~/depot$ for i in `find -type f`; do echo "--- $i"; cat $i; echo; done
--- ./lib1/lib1.h
class Lib1 {
 public:
  Lib1();
  ~Lib1();
 
  void Run();
};
 
--- ./lib1/lib1.cc
#include "lib1/lib1.h"
#include <stdio.h>
 
Lib1::Lib1() {}
Lib1::~Lib1() {}
 
void Lib1::Run() {
  printf("lib1 is running\n");
}
 
--- ./lib2/lib2.h
class Lib2 {
 public:
  Lib2();
  ~Lib2();
 
  void Run();
};
 
--- ./lib2/lib2.cc
#include "lib2/lib2.h"
#include <stdio.h>
 
Lib2::Lib2() {}
Lib2::~Lib2() {}
 
void Lib2::Run() {
  printf("lib2 is running\n");
}
 
--- ./core/main.cc
#include <stdio.h>
 
#include "lib1/lib1.h"
#include "lib2/lib2.h"
 
int main() {
  printf("core is running\n");
 
  Lib1 lib1;
  lib1.Run();
 
  Lib2 lib2;
  lib2.Run();
 
  printf("core is done\n");
  return 0;
}

Given all of this, I can roll all of those .cc files together into a single binary and run it:

~/depot$ g++ -Wall -I . -o core/main */*.cc
~/depot$ core/main
core is running
lib1 is running
lib2 is running
core is done
~/depot$ 

Okay, so I know it works, but that's no way to build stuff. I could write a shell script to do this for me, and maybe that would work for a while. Eventually I'd have to write a Makefile. Odds are, I'd have to write a series of them just to keep track of all of this stuff.

Now I have lib1/Makefile, lib2/Makefile and core/Makefile. They're pretty similar: they all turn .cc and .h files into .o files. Their targets are also the same, so this winds up being shrunk down into a common file which is then included.

Obviously, these three Makefiles mean nothing if they aren't called by something else, so there's a fourth Makefile at the top level to hook them in at the right times. It has to take the commands you give it (build all binaries, build all tests, clean up, ...) and recursively pass them down to the Makefiles in those subdirectories.

This is about the point when you start wondering if that 15-year-old paper was actually right after all.

Time passes, and now lib1 has a test suite called lib1_test. So now, lib1.o is used by lib1_test and core/main. This is no big deal.

More time passes, and now lib1 has a dependency on MySQL. It now needs a special -I flag when compiling. It's actually a flag which changes from machine to machine. You have Slackware machines and Red Hat machines, and they use slightly different values for everything.

Worse, lib1.o now has a link-time dependency on libmysqlclient which has to bubble up to its users. The -L and -l flags vary from system to system, and they have to find their way into the linking of lib1_test and core/main. If you forget them, you get the nasty spew about symbols which weren't found and your build dies.

You might try to get clever by making a massive CFLAGS and LDFLAGS and gluing everything you want into it. Now you have these *huge* gcc lines with dozens of extra -I, -L, and -l flags. This makes the compiler search all sorts of paths for no good reason when trying to fetch an #include, and it also sends the linker down a bunch of dead ends while looking for libraries.

Worse still, it adds a bunch of dependencies to your project that never existed before. lib2 doesn't use MySQL, but now lib2_test has a dependency on libmysqlclient just because you put "-lmysqlclient" in your global LDFLAGS? How is that a good thing?

So maybe you stop doing that, and now you track your LDFLAGS on a per-Makefile level. There's still some unnecessary cruft going on, but at least you've limited the bleeding to a per-module level.

Some time after that, you write a third library and call it lib3. It needs to be linked into core, so now you have to make some changes.

First, in core/main.cc: #include "lib3/lib3.h"

Second, in core/Makefile, you add lib3/lib3.o to the list of dependencies for the "main" target.

Did you catch that? We just put the same information in two places. If you forget the #include, any reference to lib3 will probably fail to compile. If you forget the Makefile change, the missing symbols will trip you up at link time. This is annoying.

Later, it turns out that lib3 was a dead end and needs to be removed from core. It'll still sit in the tree, but it won't be shipped for now. Someone pulls the #include out of core/main.cc and goes on with life.

They forgot the Makefile change, so lib3 continues to be linked into main. Your binary now takes longer to link than it should, is now bigger than it has to be, and it is carting around code which might do who knows what. Recall that merely linking a C++ object into a binary can make it do stuff as documented in my spooky C++ action at a distance post.
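To make that concrete, here is a minimal sketch of the effect. The names Lib3Startup, g_lib3_startup and SideEffectCount are made up for illustration and do not come from any real lib3. The only point is that a namespace-scope object's constructor runs before main() when its object file is linked in directly, whether or not anything still #includes the header.

```cpp
#include <cstdio>

// Hypothetical stand-in for something living in lib3/lib3.cc.
namespace {

int g_side_effects = 0;

struct Lib3Startup {
  Lib3Startup() {
    // Runs during static initialization, before main() is entered.
    ++g_side_effects;
    std::printf("lib3 startup code ran\n");
  }
};

// Nothing has to reference this object for its constructor to run,
// as long as this object file is part of the link.
Lib3Startup g_lib3_startup;

}  // namespace

// Lets a caller observe that the startup code already ran.
int SideEffectCount() { return g_side_effects; }
```

Nothing in main() has to mention lib3 for that constructor to fire, which is why a stale Makefile entry costs more than wasted link time.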

Someone comes up with a great idea: let's build stuff for the Mac! Now you get to make all of this stuff work on yet another system with its own CFLAGS and LDFLAGS for each of the additional libraries. You also get to find out where things "just worked" on your Linux boxes, and now you need to inject even more magic flags just to make it compile or link on your Mac. You thought just saying "-lfoo" was sufficient... and it was, on your two Linux-flavored setups. On the Mac, it comes in via MacPorts, so it's not in the default linker library path! Got you!

...

...

...

This ... is ... madness.

...

I've been through this too many times, and I'm tired of it. It's time for something different. The common case should be easy. Adding an external library as a dependency should be simple.

If this sounds familiar, it's because I described it back in January. Back then, I had no pressing need to get it running, so it just stayed on the back burner. I've since had a desire to build my existing stuff on a Mac, and dealing with all of these ridiculous Makefiles drove me into working on it again.

So let's go back to our original tree here:

~/depot$ find
.
./lib1
./lib1/lib1.h
./lib1/lib1.cc
./lib2
./lib2/lib2.h
./lib2/lib2.cc
./core
./core/main.cc

Now I have this tool which understands .cc and .h files, and knows that an #include "foo/bar.h" means you either want a standalone header file, or a library called foo/bar. It can tell by seeing if it's just a lonely header file, or if there's a matching "foo/bar.cc" in there. If it finds both, then you have an actual target which needs to be compiled, and then the resulting object needs to be linked later on.

So let's see what this tool has to say about my source tree. It's called "bb" because I don't have a better name for it yet.

~/depot$ bb core/main
F0327 215724 10114 dep_cli.cc:33] .depot.root not found in pwd - cd to depot path first
Aborted
~/depot$ 

Okay, it's confused. It doesn't know that it's sitting in the root of a source tree for a project, but I can help it out. Let's create that file and try again.

~/depot$ touch .depot.root
~/depot$ bb core/main
W0327 215803 10124 hints.cc:43] Unable to open: /home/bb/.build
I0327 215803 10124 deptracker.cc:105] Analysis of all deps done
I0327 215803 10124 dep.cc:168] g++ -Wall -Werror -I/home/bb/depot -c /home/bb/depot/lib1/lib1.cc -o /home/bb/depot/lib1/lib1.o
I0327 215803 10124 dep.cc:168] g++ -Wall -Werror -I/home/bb/depot -c /home/bb/depot/lib2/lib2.cc -o /home/bb/depot/lib2/lib2.o
I0327 215803 10124 dep.cc:168] g++ -Wall -Werror -I/home/bb/depot -c /home/bb/depot/core/main.cc -o /home/bb/depot/core/main.o
I0327 215803 10124 deptracker.cc:167] g++ -Wall -Werror /home/bb/depot/core/main.o /home/bb/depot/lib1/lib1.o /home/bb/depot/lib2/lib2.o -o /home/bb/depot/core/main

Okay, it complained about some weird missing file, but it looks like it did something! Did we get output?

~/depot$ find
.
./lib1
./lib1/lib1.h
./lib1/lib1.cc
./lib1/lib1.o
./lib2
./lib2/lib2.h
./lib2/lib2.cc
./lib2/lib2.o
./core
./core/main.cc
./core/main.o
./core/main
./.depot.root

We did! Does it run?

~/depot$ core/main
core is running
lib1 is running
lib2 is running
core is done
~/depot$ 

It runs like a charm!

So okay, now let's make life a little more interesting. lib1 is going to add that MySQL dependency... for real. So now lib1/lib1.cc looks like this:

#include "lib1/lib1.h"
#include <stdio.h>
#include <mysql/mysql.h>
 
Lib1::Lib1() {}
Lib1::~Lib1() {}
 
void Lib1::Run() {
  MYSQL* mysql_ = mysql_init(NULL);
 
  printf("lib1 is running\n");
 
  mysql_close(mysql_);
  mysql_library_end();
}

What happens when I try to build now? It'll probably be unhappy.

~/depot$ bb core/main
W0327 220234 10234 hints.cc:43] Unable to open: /home/bb/.build
I0327 220234 10234 deptracker.cc:105] Analysis of all deps done
I0327 220234 10234 dep.cc:168] g++ -Wall -Werror -I/home/bb/depot -c /home/bb/depot/lib1/lib1.cc -o /home/bb/depot/lib1/lib1.o
I0327 220234 10234 dep.cc:168] g++ -Wall -Werror -I/home/bb/depot -c /home/bb/depot/lib2/lib2.cc -o /home/bb/depot/lib2/lib2.o
I0327 220234 10234 dep.cc:168] g++ -Wall -Werror -I/home/bb/depot -c /home/bb/depot/core/main.cc -o /home/bb/depot/core/main.o
I0327 220234 10234 deptracker.cc:167] g++ -Wall -Werror /home/bb/depot/core/main.o /home/bb/depot/lib1/lib1.o /home/bb/depot/lib2/lib2.o -o /home/bb/depot/core/main
/home/bb/depot/lib1/lib1.o: In function `Lib1::Run()':
lib1.cc:(.text+0x26): undefined reference to `mysql_init'
lib1.cc:(.text+0x40): undefined reference to `mysql_close'
lib1.cc:(.text+0x45): undefined reference to `mysql_server_end'
collect2: ld returned 1 exit status

Oh dear. Yep, those are the dreaded missing symbols. We need a rule which says "any time you use mysql/mysql.h on this box, you need to remember two specific LDFLAGS at link time". This goes in the .build file in my home directory.

~/depot$ cat > ~/.build
system mysql/mysql.h ldflags -L/usr/lib64/mysql
system mysql/mysql.h ldflags -lmysqlclient
(^D here)

Right, so, let's try this again.

~/depot$ bb core/main
I0327 220356 10263 deptracker.cc:105] Analysis of all deps done
I0327 220356 10263 dep.cc:168] g++ -Wall -Werror -I/home/bb/depot -c /home/bb/depot/lib1/lib1.cc -o /home/bb/depot/lib1/lib1.o
I0327 220356 10263 dep.cc:168] g++ -Wall -Werror -I/home/bb/depot -c /home/bb/depot/lib2/lib2.cc -o /home/bb/depot/lib2/lib2.o
I0327 220356 10263 dep.cc:168] g++ -Wall -Werror -I/home/bb/depot -c /home/bb/depot/core/main.cc -o /home/bb/depot/core/main.o
I0327 220356 10263 deptracker.cc:167] g++ -Wall -Werror -L/usr/lib64/mysql -lmysqlclient /home/bb/depot/core/main.o /home/bb/depot/lib1/lib1.o /home/bb/depot/lib2/lib2.o -o /home/bb/depot/core/main
~/depot$ core/main
core is running
lib1 is running
lib2 is running
core is done

That's it. I can now go and add that same #include to any other project I build as this user on this machine and it will automatically pick up the flags.

When I take the source code over to my Mac which has a totally different style of install courtesy of MacPorts, I need to use flags which reflect the new locations:

system mysql/mysql.h cflags -I/opt/local/include/mysql5
system mysql/mysql.h ldflags -L/opt/local/lib/mysql5/mysql
system mysql/mysql.h ldflags -lmysqlclient

Aha! See that! You need to augment the compile-time flags on a Mac. On Linux, we don't even have to think about that, since it turns out to be in the search path for header files.
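The hint lines shown so far all share one shape: the word system, a header path, either cflags or ldflags, and a single flag. A reader for that shape might look something like the sketch below. It assumes one flag per line and nothing beyond what the examples show. Hints, HeaderFlags and AddHintLine are hypothetical names, and the real .build format may well accept more than this.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Flags attached to one system header, split by build phase.
struct HeaderFlags {
  std::vector<std::string> cflags;   // used when compiling
  std::vector<std::string> ldflags;  // used when linking
};

// Keyed by the header path as it appears in the #include line.
using Hints = std::map<std::string, HeaderFlags>;

// Parses one line such as:
//   system mysql/mysql.h ldflags -lmysqlclient
// Returns false (and leaves hints untouched) if the line does not match.
bool AddHintLine(const std::string& line, Hints* hints) {
  std::istringstream in(line);
  std::string scope, header, kind, flag;
  if (!(in >> scope >> header >> kind >> flag)) return false;
  if (scope != "system") return false;
  if (kind == "cflags") {
    (*hints)[header].cflags.push_back(flag);
  } else if (kind == "ldflags") {
    (*hints)[header].ldflags.push_back(flag);
  } else {
    return false;
  }
  return true;
}
```

Keying the table by header path is what lets the Linux and Mac files differ freely: the source only ever says #include <mysql/mysql.h>, and each machine supplies its own answers.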

In the Makefile world, you'd now need to go in and add a MYSQL_CFLAGS and make sure that gets set properly and gets added to every place where you compile that header. Don't forget, if you include that from your own foo.h, and then your foo.h gets included somewhere else, now you have to propagate that flag up to that target as well!
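By contrast, a tool that already knows who includes what can do that propagation itself. The sketch below is one assumed way to do it, not a description of how bb works: it walks a hand-built include graph depth-first and gathers the flags attached to every header reachable from a target. IncludeGraph, FlagTable and CollectFlags are illustrative names.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// file -> the files it directly #includes
using IncludeGraph = std::map<std::string, std::vector<std::string>>;
// header -> extra compiler flags needed by anything that pulls it in
using FlagTable = std::map<std::string, std::vector<std::string>>;

// Depth-first walk from `file`, appending the flags of every header
// reached along the way. `seen` prevents repeats and include cycles.
void CollectFlags(const std::string& file, const IncludeGraph& graph,
                  const FlagTable& flags, std::set<std::string>* seen,
                  std::vector<std::string>* out) {
  if (!seen->insert(file).second) return;  // already visited

  const auto f = flags.find(file);
  if (f != flags.end()) {
    out->insert(out->end(), f->second.begin(), f->second.end());
  }

  const auto g = graph.find(file);
  if (g == graph.end()) return;  // leaf: includes nothing we track
  for (const std::string& next : g->second) {
    CollectFlags(next, graph, flags, seen, out);
  }
}
```

If core/main.cc includes foo/foo.h and foo/foo.h includes mysql/mysql.h, a walk starting at core/main.cc picks up the MySQL flag without anyone having to copy it into a third place.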

Now, I'm not claiming that this will handle all build situations. That would be incredibly foolish. If your source code resembles this environment or can be rearranged to match, then it might be helpful to you. Otherwise, you're out of luck... at least, for now.

I'd love to share this with some beta testers. If you're interested, send me a note and let me know!


April 14, 2012: This post has an update.