Writing

Software, technology, sysadmin war stories, and more.
Wednesday, January 1, 2014

Inflicting a worst-case scenario on my own binary tree code

I have unintentionally written ticking time bombs in my code. Hopefully, I don't do it nearly as much any more, but the old ones remain. One of them bit me earlier this week.

About 12 years ago, I sat down and wrote a program to manage my personal diary. It's not the sort of thing I ever shared with the world, and it was just enough to take a bunch of flat text files and render them into somewhat-nicer HTML. It was later the inspiration for the "publog" software which generates these posts, the Atom feed, and yes, even the books.

The original software, however, never changed. Aside from some cosmetic changes over the years and adaptations to varying path schemes when I moved it to a Mac, the core has remained. It was written in C with very different techniques than I might use in C++ today.

One of the things I did when writing that program was to cache information about entries. Instead of having the program rewrite every input file into an output file on every pass, it would instead stat() the input file and compare its mtime to the cached mtime. If it was the same, nothing had changed and so nothing needed to be done.

On disk, the format was simple enough. It was something like this:

2014/01/01 1388621917 "thing 1" "thing 2" "thing 3"

This was then stored in my own hand-rolled binary tree implementation, keyed off the first blob: "2014/01/01" in this case. I figured this would make for speedy lookups later on when it came time to check the data for an arbitrary record.

So, I wrote it this way in 2002, and it's been fine all this time, and then it broke. It just started segfaulting a couple of days ago. I decided to feed it to gdb just to get some idea of where it was blowing up and was rewarded with quite a mess.

The backtrace showed several thousand calls to one function with a name like add_to_tree(). That plus the segfault makes it pretty clear: I was blowing up the stack with excessive recursion. What was going on?

To understand this problem I first had to get back into the groove of my old code. The best way to do that was just to read it "in order", starting from "int main" and following branches, taking notes along the way. Remember, this thing is 12 years old, and I had forgotten all of the finer points of how it worked.

Eventually I had a sort of narrative about how it was supposed to work, and things started fitting into place. That's when I realized what happened: I had created a pathological case for my binary tree.

Recall that I have a cache, and I write it to disk between runs. It gets written to disk using an in-order traversal, so I get things like this in that file:

2013/12/01 ...
2013/12/02 ...
2013/12/03 ...

You get the idea. When this gets loaded in, my binary tree code just sees it as being "greater than" the root node, and then that node's right child, and then that node's right child, and so on.

Yep, I turned my binary tree into a linked list.

This held up for years and years until I finally gave it the entry which broke the camel's back, and it simply could not recurse any more given the available stack space. Then it just died.

This would also explain why it had been getting slower over the years, but it happened so gradually that I never really noticed. That, plus upgrading my workstation hardware multiple times, made it disappear into the noise.

If my binary tree had been balanced, this wouldn't have happened.

Rather than trying to hack up one of those, I just pulled this crusty old code into my current tree of C++ projects and introduced it to a little thing called the STL. It's still really horrible code, but I've managed to chop out the nastiest bits which were keeping it from working. Maybe one of these days I'll go back and bring the rest of it up to spec.

Of course, that may be another 12 years from now. See you in 2026?