Writing

Software, technology, sysadmin war stories, and more.

Monday, December 17, 2012

My C++ style and creating wrappers for low-level things

I had a friend ask me for more details about my "C++ style". I'm not one to disappoint someone who is clearly interested in what I do, so here it is.

My big thing is trying to wall off the low level C library stuff so that it doesn't leak into other things I try to do in my programs. For example, I find that a lot of my programs need to take a string and write the contents to a file. Some of them do this several times before the file is closed.

You can't just pass a std::string to a C function like fwrite() since it doesn't know what to do with it. It really just wants to see a pointer to the data, the member size, the number of members, and a FILE* pointer. This means you have to take that string and pull out the c_str() (pointer to a C-style string) and length() data to give it to fwrite(). Of course, this also means you need a FILE*, and you get those by way of fopen().
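As a rough illustration of that hand-off, here is a minimal sketch. The helper name write_string is made up for this example, and it assumes the caller has already obtained a FILE* from fopen().

```cpp
#include <stdio.h>
#include <string>

// Hypothetical helper (not from the original code): writes the whole
// string to an already-open FILE*. Returns true only if fwrite()
// reports that every byte was accepted.
bool write_string(FILE* fp, const std::string& data) {
  // Member size is 1 and the member count is the string length,
  // so fwrite()'s return value is simply the number of bytes written.
  size_t written = fwrite(data.c_str(), 1, data.length(), fp);
  return written == data.length();
}
```

Even this tiny function has to know about FILE*, c_str(), and length(), which is exactly the sort of detail that tends to spread through calling code if nothing contains it.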

This means now I have another thing to worry about: that FILE pointer. It has to be opened and stored somewhere until I'm done and finally closed. If I forget to call fclose() on it, my program will leak that file descriptor and some amount of associated housekeeping memory. With that kind of flaw, a long-running program would eventually die upon hitting the ulimit on open fds -- often 1024 by default on Linux.

Now, I could do like I did in the old days, and just explicitly call fopen, fwrite, and fclose every time I need to do this in my code. But that means I have to make sure I get it right every time and don't forget anything important. If there are multiple ways out of the code which does the writing, I have to make sure every one of them hits fclose(). It's annoying.

So, instead of worrying about that, I wrap all of that stuff up in a class. It creates the FILE* at Open() time, and hangs onto it until Close() is called. If you destroy an instance of this class with it still open, it yells at you since you didn't clean up after yourself, but it still closes the file to avoid a leak.
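A minimal sketch of such a class might look like the following. The real implementation isn't shown in this post, so everything here is an assumption based on the description above: the member names, the "w" open mode, and the wording of the complaint on stderr.

```cpp
#include <stdio.h>
#include <string>

// Hypothetical sketch of a FileWriter-style wrapper; the internals are
// guesses based on the behavior described in the text.
class FileWriter {
 public:
  explicit FileWriter(const std::string& path) : path_(path), fp_(nullptr) {}

  ~FileWriter() {
    if (fp_ != nullptr) {
      // The caller forgot Close(): complain, but still avoid the leak.
      fprintf(stderr, "FileWriter: %s destroyed while still open\n",
              path_.c_str());
      fclose(fp_);
    }
  }

  bool Open() {
    if (fp_ != nullptr) return false;  // already open
    fp_ = fopen(path_.c_str(), "w");
    return fp_ != nullptr;
  }

  bool WriteString(const std::string& data) {
    if (fp_ == nullptr) return false;  // not open
    return fwrite(data.c_str(), 1, data.length(), fp_) == data.length();
  }

  bool Close() {
    if (fp_ == nullptr) return false;  // not open
    int rc = fclose(fp_);
    fp_ = nullptr;  // the FILE* is unusable after fclose(), even on failure
    return rc == 0;
  }

 private:
  // Non-copyable: two copies must never fclose() the same FILE*.
  FileWriter(const FileWriter&);
  FileWriter& operator=(const FileWriter&);

  std::string path_;
  FILE* fp_;
};
```

The destructor is the safety net: any early return in the calling code still ends with the file being closed, just with a warning attached.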

In practice, it looks like this:

bool do_something() {
  FileWriter page("/path/to/page.html");
  
  if (!page.Open()) {
    // yell about it
    return false;
  }
 
  if (!page.WriteString(...)) {
    // yep, yell about it
    return false;
  }
 
  // ... and so on ...
 
  if (!page.Close()) {
    // you guessed it
    return false;
  }
 
  return true;
}

I didn't mention this earlier, but this code actually opens the file with a temporary extension and renames it into place at Close(). This way, it becomes an atomic update in the event it's overwriting an existing file. This also means that if it should bomb out while it's running, it won't hurt the existing file. It also has the interesting side effect that failed runs will leave the temp file behind for debugging purposes so that I can see what did get written before it bailed out.
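The write-to-temp-then-rename idea can be sketched as a single function. This is not the actual FileWriter code: the function name and the ".tmp" suffix are assumptions made for the example, and it relies on rename() replacing the destination atomically, which is what POSIX systems provide.

```cpp
#include <stdio.h>
#include <string>

// Hypothetical helper showing the temp-file-plus-rename pattern.
// The ".tmp" suffix is an assumption; the text does not name the extension.
bool write_file_atomically(const std::string& path, const std::string& data) {
  const std::string temp_path = path + ".tmp";

  FILE* fp = fopen(temp_path.c_str(), "w");
  if (fp == nullptr) {
    return false;
  }

  bool ok = fwrite(data.c_str(), 1, data.length(), fp) == data.length();

  // fclose() flushes buffered data, so it can fail too (a full disk,
  // for example). Always call it so the FILE* is released.
  if (fclose(fp) != 0) {
    ok = false;
  }

  if (!ok) {
    // The temp file is deliberately left behind for debugging.
    return false;
  }

  // Only a complete temp file ever replaces the existing one.
  return rename(temp_path.c_str(), path.c_str()) == 0;
}
```

Readers of the destination path see either the old contents or the new ones, never a half-written file. This sketch does not try to guarantee durability across a power loss, which would take more work.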

FileWriter itself is where all of the lowercase C functions with short names are called. It pulls in the headers like stdio.h to get things done. It reaches into those C++ strings with .c_str() and .length() to get the values it needs. It is my translation layer between the charmed world of do_something and the stark reality of the C library underneath all of this.

It's possible to go overboard in terms of library functions. Setting out to make a bunch of "just in case" wrappers is pointless. I tend to use the raw C functions directly in a new chunk of code until the right sort of API makes itself known to me. Then, once that becomes apparent, I can make a small helper class which behaves that way and calls all of the low-level stuff for me.

I should also mention that this sort of disconnect makes testing much easier. It's far easier to write a bunch of tests which use FileWriter directly to put it through its paces than it would be to attempt the same thing via one of FileWriter's users. Given that testing a file writing class usually involves creating actual temp files somewhere, I find it best to keep it on a short leash.

This separation between actual disk I/O and client programs can sometimes present an interesting opportunity to test those programs in a bunch of wild simulated environments. In one of the few places where I'll actually use inheritance, I can create a mock version of something like FileWriter and have some other code use that instead. That way, I can cause all sorts of nasty failures at different spots which would normally be hard to trigger when actually talking to the disk.

For instance, let's say you want to see what happens when your calling program calls Close() and it fails. Given that fclose() itself can fail, you'd better come up with a plan for handling it! The trouble is trying to create a genuine fclose failure on demand. It's just a pain to make that sort of thing work. So, instead of doing that, you use a MockFileWriter and have it return false when its Close() is called. Then stand back and watch the fireworks.
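Here is one way that could look, as a sketch rather than the real thing. It assumes FileWriter exposes virtual Open(), WriteString(), and Close() methods, which the post doesn't actually confirm. The FileWriter below is only a stand-in interface so the example is self-contained, and save_page is a made-up caller.

```cpp
#include <string>

// Stand-in for the real class: only the virtual interface matters here.
// (Assumption: the real FileWriter's methods are virtual.)
class FileWriter {
 public:
  virtual ~FileWriter() {}
  virtual bool Open() = 0;
  virtual bool WriteString(const std::string& data) = 0;
  virtual bool Close() = 0;
};

// Hypothetical mock: no disk I/O at all. Each call returns whatever
// the test told it to return, and records what happened.
class MockFileWriter : public FileWriter {
 public:
  MockFileWriter()
      : open_result(true), write_result(true), close_result(true),
        close_calls(0) {}

  virtual bool Open() { return open_result; }

  virtual bool WriteString(const std::string& data) {
    written += data;  // remember what the caller tried to write
    return write_result;
  }

  virtual bool Close() {
    ++close_calls;
    return close_result;
  }

  bool open_result;
  bool write_result;
  bool close_result;
  int close_calls;
  std::string written;
};

// Hypothetical calling code under test, shaped like do_something().
bool save_page(FileWriter* page, const std::string& html) {
  if (!page->Open()) return false;
  if (!page->WriteString(html)) return false;
  if (!page->Close()) return false;
  return true;
}
```

A test then sets close_result to false on a MockFileWriter, hands it to save_page(), and checks that the failure is reported instead of swallowed.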

Obviously, if you've done your homework in the calling code, it will do something reasonable and life will go on. One more branch will be shown to have worked in a test scenario, and your coverage numbers will inch up just a little higher.

I'm not saying this is the best way to go, and it's definitely not the only way to design something like this, but it's the one I'm using right now. If you ever wondered what makes some of my code tick, this should shed a tiny bit more light on things.