If you're writing code in C++, odds are good that you are maintaining state in two places, and you probably get it out of sync more often than not. What are the two places? One of them is the top of your .h and .cc files:
#include "base/logging.h" #include "net/http_client.h"
If you're using make, then the other place is in a Makefile somewhere:
my_project: my_project.o base/logging.o net/http_client.o
This isn't great. When it gets out of sync, you end up with broken builds when stuff is missing, and bloated binaries when too much stuff has been left in. If you're using cmake or anything like that, you're in the same situation. You're just throwing a slightly different syntax at the problem.
Also, don't forget about external libraries. Are you using protobuf? How about libcurl? mysqlclient? libpq from Postgres? GNU Radio? All of those have requirements: extra compiler flags and extra linker flags. You need to carry those flags all the way up the dependency chain any time one of those libraries is referenced in a project. Do you want to edit a Makefile (or whatever) every time you hook in a library that hooks in some external dependency? No way!
Your code already has all of the dependencies expressed right there in it. See those #include directives? Those unambiguously state "I need this header and/or library target in order to work". It's been sitting there all this time. You just have to start using it for your own benefit.
If you want something like this to work, you have to commit to a certain amount of consistency in your code base. You might have to throw out a few really nasty hacks that you've done in the past. It's entirely likely that most people are fully unwilling or unable to do this, and so they will continue to suffer. That's on them.
Here's what you need to do in order to have success with this sort of build approach.
Any time you #include something inside the tree, do it with "", and always spell out the full relative path to it inside the tree. It's always #include "net/http_client.h". It's never just #include "http_client.h", even if you're in the same directory.
That's not so bad, right? You are probably doing this already.
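Spelled out, it looks like this. I'm reusing the file names from the top of this post; the actual contents don't matter:

// in net/http_client.cc
#include "net/http_client.h"   // yes: full path from the tree root, even for our own header
#include "base/logging.h"      // yes: full path into another directory

// #include "http_client.h"    // no: a bare name breaks the include-to-target mapping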
This one is simpler. Any time you have a dependency outside the project, it's a system-level #include with <angle brackets>. Then make sure you always use the same path for the same targets. If it's #include <mysql/mysql.h> in one place, then it should be that anywhere else which also uses that external library.
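For example, and this is hypothetical but it's exactly the kind of thing that drifts in real trees, pick one spelling for MySQL and use it everywhere:

// anywhere in the tree that talks to MySQL:
#include <mysql/mysql.h>    // always this...

// #include <mysql.h>       // ...never this, even if some stray -I flag makes it resolve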
base/foo.cc and base/foo.h compile into base/foo.o. lib/bar.cc and lib/bar.h compile into lib/bar.o. You will never compile more than one .cc file into a single .o file. That's what linking is for.
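Here's a tiny, entirely hypothetical example of that pairing:

// lib/bar.h
#pragma once
int bar_count();

// lib/bar.cc -- together with lib/bar.h, this becomes lib/bar.o and nothing else
#include "lib/bar.h"
int bar_count() { return 3; }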
base/foo.cc and/or base/foo.h can #include lib/bar.h, but in that case, lib/bar.cc and/or lib/bar.h can never #include base/foo.h, because that would cause a cycle. In other words, no loops are permitted.
This also has the nice side-effect of making you do the right thing when you design your code. If bar can't refer back to foo, then you can't make awful implementations which are hard to figure out six months later when nobody remembers how this thing worked.
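Continuing the hypothetical foo/bar example, the arrow only points one way:

// base/foo.cc -- may depend "downward" on lib/bar...
#include "base/foo.h"   // assume base/foo.h declares foo_total()
#include "lib/bar.h"

int foo_total() { return bar_count() + 1; }

// ...but lib/bar.cc and lib/bar.h must never #include "base/foo.h".
// That would be a loop, and loops are out.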
An internal target is what you're referencing when you do an #include with a relative path "in quotes like this". If you #include "lib/bar.h", you have expressed a dependency on a target called lib/bar.
A header target is one that only has a .h file present. It is not compiled, but it is scanned to see if it has additional dependencies (#includes), same as any other internal target.
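A header target could be as simple as this (hypothetical file, and note there's no matching .cc anywhere):

// base/limits.h -- header-only: nothing to compile, but its #includes still get scanned
#pragma once
inline constexpr int kMaxRetries = 3;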
A library target has both a .cc and a .h file present. It is compiled into a .o file. Both the .cc and .h files are scanned for additional dependencies (#includes).
A binary target has a .cc file present and also has "int main(" somewhere in it. It is first compiled into a .o file just like a library target, and it is also scanned for dependencies (#includes). Then once all of its dependencies are satisfied, that .o file is linked (along with any other library .o files) into a binary. The link stage also uses any ldflags picked up along the way from any dependencies.
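A binary target might look like this, leaning on the hypothetical lib/bar from earlier:

// tools/counter.cc -- a binary target, since it has "int main(" in it.
// It compiles to tools/counter.o, then links with lib/bar.o (and anything
// lib/bar itself pulled in) to produce the tools/counter binary.
#include <cstdio>

#include "lib/bar.h"

int main() {
  printf("bar_count() = %d\n", bar_count());
  return 0;
}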
A system target is what you're referencing when you do an #include with a path <in angle brackets like this>. You might do #include <mysql/mysql.h> or #include <string> to express this need.
Most system libraries need a little help for you to use them. This typically means augmenting the include directories searched by the compiler, and the library directories searched by the linker. The easiest way to handle this is to have your build tool notice when someone adds a system dependency, then look up the flags it needs.
One approach is to just literally specify the flags for a given target, like this:
system_header { name: "atomic" ldflag: "-latomic" }
You can do this with more complicated targets, but it can be annoying to keep straight. Instead, you should use something like pkg-config to get whatever values were bundled in by whoever built the library in the first place, like this:
system_header { name: "jansson.h" pkg_config_name: "jansson" }
When the build tool encounters #include <jansson.h>, it'll run pkg-config --cflags jansson to figure out how to compile something which depends on it. Later, it'll also run pkg-config --libs jansson to get the -L and/or -l flags required to link it into a binary.
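A target which depends on it might look something like this. The file name is hypothetical, but the jansson calls are the library's real API:

// report/dump.cc -- depends on the jansson system target
#include <jansson.h>

#include <cstdio>
#include <cstdlib>

int main() {
  json_t *obj = json_object();
  json_object_set_new(obj, "status", json_string("ok"));

  char *text = json_dumps(obj, 0);   // caller owns the returned string
  printf("%s\n", text);

  free(text);
  json_decref(obj);
  return 0;
}

Compiling that file picks up whatever pkg-config --cflags jansson reports, and linking the resulting binary picks up pkg-config --libs jansson.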
If you're wondering "why do we need the config stanza if we have pkg-config", that's because pkg-config won't tell us which #include targets map onto which pkg-config library names. If there was a way for pkg-config to say "when you see #include <jansson.h>, ask me about jansson", then you wouldn't need this!
If any pkg-config (or equivalent tool) maintainer types see this and decide to put in the mapping from (what-to-#include) to (what-to-ask-about), that would be amazing and would solve SO many problems.
A future version of the build tool may come "pre-loaded" with mappings from #include <whatever> to pkg-config names for popular libraries and other common targets. Ideally, very little would be needed in any sort of config file.
Any compiler or linker flags picked up while processing a system target "attach" to whatever local target referenced it. Then, if that local target is referenced somewhere else, the compile and/or linker flags also travel upward in the tree. This way, if target B depends on jansson and picks up some special flags, those flags will be there when target A (which depends on B) is compiled. They will also be there when A is pulled into my_project and linked into a binary.
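In code, that chain might look like this. These targets are hypothetical, named to match the A and B in the paragraph above:

// b/parse.h -- "target B": depends on the jansson system target,
// so jansson's compile and link flags attach to b/parse
#pragma once
#include <jansson.h>
json_t *parse_blob(const char *text);

// a/loader.cc -- "target A": depends on b/parse, so those jansson flags
// travel up and are present when a/loader compiles and, later, when it links
#include "b/parse.h"

json_t *load_from_string(const char *text) { return parse_blob(text); }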
Any given set of system_header { ... } directives should hold up on any machine running the same version of the same Linux distribution or OS. You might need to tweak them slightly if the pkg-config data on a given system is wrong. pkg-config hints for some libraries frequently leave out things that they actually need. (GNU Radio, I'm looking at you.)
This approach has been shown to work across multiple Linux distributions, BSD flavors, and architectures with minimal changes to the system_header directives.
Actual portability issues come down to what you use in your code. If you use epoll(), don't expect it to work on non-Linux systems. If you try to access more than about 3 GB of memory, don't expect it to work on a 32 bit system. This is all about what you do in your code and has nothing to do with the build system itself.
Not necessarily. I built something to do this a long time ago and have been using it ever since. Other people heard about it and did the same thing. It's not that hard to do.
Want to see what it looks like? Check out some recordings from 2013 in which I use the build tool without calling attention to it. Notice there are no build scripts, and there's no build language. You just write code and tell it to build. If it's a binary target, you get a binary out and can run it right away. Also, once you teach it how to handle an external system dependency, it can deal with it anywhere else.
Yeah, so, people who get into code-building projects because they have some nerd need to create yet another specialized language for expressing how to build their code are never going to "get" this approach. Let them be. They'll be creating yet another half-assed incomplete implementation of half of Common Lisp and will be very happy with that. (Meanwhile, they'll still be using some terrible build system with a different DSL to actually get some amount of "real work" done when their bosses yell at them for screwing around on the job.)
Those of us who just want to write C++ code and turn it into usable programs with a minimum of fuss will be off doing that instead of spending cycles screwing with a build system and a build language.
Yup. Way back at the end of 2012, I released both a Linux x86_64 ELF binary and a Mac ... uh, whatever... binary to let people play with a version of my build tool. Nobody actually downloaded or used it, and it's been almost 10 years, so I removed it. (Download some random binary off the Internet and run it? Sounds legit, right?)
It seems better to just rig up a page like this to guilt people into trying to do something about it on their own. This is that page, and now you know.
Well then, since you put it that way, we should talk. Hit the contact link at the bottom of the page and let's figure something out.
You can send comments, questions, or whatever via my contact form.