Writing

Software, technology, sysadmin war stories, and more.

Wednesday, April 4, 2012

My C++ style which everyone will probably hate

There's a meme about C++ which says that everyone picks and chooses their own 20% or so of the language to use and ignores the rest. I don't know if this truly applies to every C++ programmer or project, but I know that I definitely avoid certain parts of the language. Just because it's there, I don't necessarily want to use it.

This also extends down to more "meta" issues, like how the code is arranged and how it all hooks together. Yet again, this is a matter of choosing a subset from the huge number of ways in which you could fit together all of the pieces. I choose to ignore some techniques for my own reasons.

All of this leads to a coding style, or, if you will, a "manifesto" of how things should be written and how they should be laid out. I've been doing this for a while now, so when I wrote something which would automatically read through source code to compile and link things without Makefiles, it was easy to just support that case. The flip side of that decision is that non-conforming code trees just aren't supported by my little build tool. That's fine by me, since they can still use make, autoconf, automake, and whatever else just like they always have.

So, with all of that said, and with the realization that this will probably generate a bunch of holy war comments if it reaches the right circles, here's what I do with C++. Most of these were adapted from other places. I'm not claiming to have invented this stuff, in other words, and if you've worked where I've worked, then yes, it will seem very familiar.

All C++ code is in either .cc or .h files and no other extensions.

Any chunk of code I write which talks to another chunk of code I wrote will live somewhere within the same greater "depot". Basically, where I used to have a bunch of directories like ~/prog/foo, ~/prog/bar and so on, I now have ~/prog/depot/foo and ~/prog/depot/bar. This makes ~/prog/depot the top of the tree, and this is where git starts.

Includes are always relative to that tree. Even though foo.cc and foo.h are in the same directory (and must be, in fact), foo.cc will #include "foo/foo.h". This means there is no ambiguity about which file you might want. Includes which reference other parts of the tree must use "...", not <...>, which is reserved for system headers.

Including a .cc file is right out. I figured that denying it here will keep people (including myself) from being lazy and trying it. It also has caught a few dumb mistakes I made while pulling older code into this scheme.

Speaking of #includes, you have to include what you use, and then only that, but you also have to be smart about using it. In particular, including too much stuff in a header file is a bad idea, since it tends to drag too much junk around. Let's say you start writing a class and put this in the .h file:

class HoneyBadger {
 public:
  HoneyBadger();
  virtual ~HoneyBadger();
 
  virtual void EatSnake();
 
 private:
  Claws* claws_;
};

Obviously, you have to do something about that mention of "Claws". Instead of pulling in the whole definition of that class, you can just put a forward declaration, "class Claws;", up top and keep that file relatively lightweight. That works because the header only holds a pointer to Claws, and the compiler doesn't need the full definition for that. Eventually, you'll have to pull in the definition of that class, but that can happen in your .cc file when you actually use it.
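Here is a rough sketch of how that split could look. It squeezes what would normally be three files into one listing so it compiles by itself. The contents of Claws (Sharpen, sharpen_count), the snakes_eaten() accessor, and the hb/ file names in the comments are invented for illustration and are not part of the original class.

```cpp
// --- what would live in the HoneyBadger header (say, hb/honeybadger.h) ---
class Claws;  // forward declaration: enough for a pointer member

class HoneyBadger {
 public:
  HoneyBadger();
  virtual ~HoneyBadger();

  virtual void EatSnake();
  int snakes_eaten() const { return snakes_eaten_; }  // hypothetical accessor

 private:
  Claws* claws_;
  int snakes_eaten_;  // hypothetical counter, only here to observe behavior
};

// --- what would live in a hypothetical hb/claws.h ---
class Claws {
 public:
  void Sharpen() { ++sharpen_count_; }
  int sharpen_count() const { return sharpen_count_; }

 private:
  int sharpen_count_ = 0;
};

// --- what would live in hb/honeybadger.cc, which includes both headers ---
HoneyBadger::HoneyBadger() : claws_(new Claws), snakes_eaten_(0) {}

HoneyBadger::~HoneyBadger() { delete claws_; }

void HoneyBadger::EatSnake() {
  claws_->Sharpen();  // calling a member needs the full definition of Claws
  ++snakes_eaten_;
}
```

Anything that includes only the HoneyBadger header never sees the body of Claws, so a change to Claws forces a recompile of honeybadger.cc but not of every file that merely uses a HoneyBadger.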

Circular dependencies are not supported. If foo/foo.h includes bar/bar.h which includes foo/foo.h, it's game over. Find some other less mind-bending way to do whatever it is you are doing. Just because you can do something doesn't mean I want to put up with it.
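The post doesn't say whether the build tool notices cycles or just falls over, so treat this as one possible check rather than a description of what the tool does. It is a plain depth-first search over a hypothetical in-memory include graph. IncludeGraph, Mark, VisitFindsCycle, and HasIncludeCycle are names invented for this sketch.

```cpp
#include <map>
#include <string>
#include <vector>

// Header path -> headers it includes directly (hypothetical representation).
using IncludeGraph = std::map<std::string, std::vector<std::string>>;

enum class Mark { kUnvisited, kInProgress, kDone };

// Depth-first search. Reaching a header that is still "in progress" means the
// includes led back around to a file we are already in the middle of scanning.
bool VisitFindsCycle(const IncludeGraph& graph, const std::string& header,
                     std::map<std::string, Mark>* marks) {
  // operator[] value-initializes new entries, which gives Mark::kUnvisited.
  Mark& mark = (*marks)[header];
  if (mark == Mark::kInProgress) return true;
  if (mark == Mark::kDone) return false;

  mark = Mark::kInProgress;
  const IncludeGraph::const_iterator it = graph.find(header);
  if (it != graph.end()) {
    for (const std::string& next : it->second) {
      if (VisitFindsCycle(graph, next, marks)) return true;
    }
  }
  mark = Mark::kDone;
  return false;
}

// Returns true if any header in the graph can reach itself through includes.
bool HasIncludeCycle(const IncludeGraph& graph) {
  std::map<std::string, Mark> marks;
  for (const auto& entry : graph) {
    if (VisitFindsCycle(graph, entry.first, &marks)) return true;
  }
  return false;
}
```

With foo/foo.h listing bar/bar.h and bar/bar.h listing foo/foo.h, HasIncludeCycle returns true. Drop the second edge and it returns false.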

My build tool is not the preprocessor, so it doesn't care about your #ifdefs and other magic. In fact, those things are evil and shouldn't be used outside of the include guards for headers:

#ifndef HB_HONEYBADGER_H_
#define HB_HONEYBADGER_H_
 
// ... blah blah blah ...
 
#endif  // HB_HONEYBADGER_H_

If there's some kind of platform-specific stuff you have to do, then get it over with, bury it deeply, and provide a uniform interface to everyone else. You must not leak this craziness out to the rest of the world. If your interface can settle down and graduate to become a system-level library, that's even better.
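One way to picture "bury it deeply": callers see a couple of ordinary functions, and the only platform check in the whole program sits inside one implementation file. The post doesn't say whether the hiding should be done with the preprocessor or with separate per-platform .cc files, so the #if below is an assumption. The names sys::PathSeparator and sys::JoinPath and the sys/path.h layout are made up for illustration, and the header and .cc parts are shown in one listing so it compiles by itself.

```cpp
#include <string>

// --- what would live in a hypothetical header, sys/path.h ---
namespace sys {
char PathSeparator();
std::string JoinPath(const std::string& dir, const std::string& file);
}  // namespace sys

// --- what would live in sys/path.cc: the one place that knows the platform ---
namespace sys {

char PathSeparator() {
#if defined(_WIN32)
  return '\\';
#else
  return '/';
#endif
}

// Joins a directory and a file name with exactly one separator between them.
std::string JoinPath(const std::string& dir, const std::string& file) {
  if (dir.empty()) return file;
  if (dir.back() == PathSeparator()) return dir + file;
  return dir + PathSeparator() + file;
}

}  // namespace sys
```

Code elsewhere in the tree calls sys::JoinPath("foo", "foo.h") and never needs to know which branch was compiled in.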

With this build tool, an #include "..." is a statement that you have a dependency on that target. Including foo/foo.h causes that file to be scanned, and any of its includes are further scanned in that manner. It also creates a check for a matching file foo/foo.cc. If both of them exist, it is categorized as a library and will be compiled into a single object file. That object file will then be used when linking occurs.

So, if core.cc includes foo/foo.h which includes bar/bar.h, then you will see compilation runs to create core.o, foo/foo.o, and bar/bar.o. Any "helper" compile-time flags (like -I) encountered while chasing these includes will be used at higher levels.

If core.cc happens to be the top of the tree by virtue of having a main() in it, then these three objects will be linked into a binary. Libraries and other link-time flags which were encountered earlier during the include-chasing stage will be used at this point.
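A rough sketch of that include-chasing, following the core.cc walk-through above. This is not the actual tool: it works on a fake in-memory tree instead of real files, only recognizes lines that begin exactly with #include ", ignores compile and link flags entirely, and every name in it (Tree, QuotedIncludes, WithExtension, Chase, ObjectsFor) is invented. It also assumes that a matching .cc file gets scanned for its own includes, which the description implies but doesn't spell out.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// A fake source tree: path relative to the depot root -> file contents.
using Tree = std::map<std::string, std::string>;

// Returns the paths named by #include "..." lines. <...> includes are skipped.
std::vector<std::string> QuotedIncludes(const std::string& source) {
  std::vector<std::string> found;
  std::istringstream lines(source);
  std::string line;
  const std::string kPrefix = "#include \"";
  while (std::getline(lines, line)) {
    if (line.compare(0, kPrefix.size(), kPrefix) != 0) continue;
    const std::string::size_type end = line.find('"', kPrefix.size());
    if (end == std::string::npos) continue;
    found.push_back(line.substr(kPrefix.size(), end - kPrefix.size()));
  }
  return found;
}

bool EndsWith(const std::string& s, const std::string& suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Swaps the extension, e.g. "foo/foo.h" -> "foo/foo.cc".
std::string WithExtension(const std::string& path, const std::string& ext) {
  return path.substr(0, path.rfind('.')) + ext;
}

// Scans `path`, then every header it includes, then each header's matching
// .cc if one exists. Every .cc reached this way contributes one object file.
void Chase(const Tree& tree, const std::string& path,
           std::set<std::string>* seen, std::set<std::string>* objects) {
  if (!seen->insert(path).second) return;  // already scanned
  const Tree::const_iterator it = tree.find(path);
  if (it == tree.end()) return;  // not in the tree, nothing to scan

  if (EndsWith(path, ".cc")) objects->insert(WithExtension(path, ".o"));
  for (const std::string& header : QuotedIncludes(it->second)) {
    Chase(tree, header, seen, objects);
    Chase(tree, WithExtension(header, ".cc"), seen, objects);
  }
}

// Returns the object files needed to build the target rooted at `top_cc`.
std::set<std::string> ObjectsFor(const Tree& tree, const std::string& top_cc) {
  std::set<std::string> seen;
  std::set<std::string> objects;
  Chase(tree, top_cc, &seen, &objects);
  return objects;
}
```

Given a tree where core.cc includes foo/foo.h and foo/foo.h includes bar/bar.h, with foo/foo.cc and bar/bar.cc both present, ObjectsFor(tree, "core.cc") yields core.o, foo/foo.o, and bar/bar.o. A header with no matching .cc contributes nothing to the list.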

Why bother with this? Well, that part is easy. It's the results!

Yesterday, I wrote some new code for one of my projects. It was spread out across a bunch of files, since there were several classes involved. I just used #include as usual and went on with life. While doing this, I was always able to "bb my/target" and have it just work. At no point did I have to wrangle Makefiles, build scripts, or anything else of the sort.

I also didn't have to mess with my .build hint files since every system library in my tree is already accounted for. I can now safely drag in the mysql client or libcurl or libmicrohttpd or whatever when I need it, and my build tool will just figure it out.

The only sad thing about this is that I took so long to finally try it. This is approximately what I wrote in my own notes in November 2003:

I think it would be interesting if lib.h could be included in such a way that also means "and I also need lib.o, thank you very much". That way, any time you linked server.o into something, it would know that lib.o also has to come along for the ride.

I could probably stay busy for several years just fleshing out some of these ideas from years gone by. Given the way things tend to go in circles, with nothing new ever happening in some places, even a nine-year-old idea can still be useful.