Writing

Software, technology, sysadmin war stories, and more.

Friday, January 22, 2021

Lurking bugs behind infrequently-executed code paths

As a result of being cooped up in the house for this extended period, I've been doing a lot of random tidying on my computers. I have years and years of accumulated cruft lined up along with a few things which are vintage and neat. The trick is finding out which is which and throwing out the useless stuff.

Part of this has involved digging through the debris left over from old projects. One of those projects was from around 2003 or 2004, when I rigged up a really ugly parallel port connection to an old Commodore 1541 floppy drive. This was not one of my originals, but rather a drive I had purchased explicitly for the purpose of "saving" my old disks.

It's probably a good thing that I did it then, back when the disks were "only" 15 or so years old, instead of waiting until much later. I doubt many of them would still be readable now that another 15 years have passed.

Anyway, as I went digging through an "unsorted" directory the other night, I came across a pair of disk images from a truly ancient project I had done once upon a time. While I don't remember my motivations any more, at some point I decided I needed to write a bulletin board system (BBS) so people could connect to my computer to send and receive messages and trade files. There were a bunch of existing systems, but for some reason I had gotten this notion in my head that I needed to write one.

A kindly sysop of a local board had given me a copy of his hand-rolled program and I had used it as the inspiration for my own thing. These disk images represented the "program" and "data" disks from that project, circa 1989.

Upon finding that, I decided it was time to fire it up and see how much of it worked. Would it handle the year 2021? Did all of the code make it over intact? Just how much of that thing did I manage to finish? How many dead ends would there be for the different options on the "main menu", advertising a feature that never got written?

I logged in, and eventually found my way to something called "graffiti". The idea was that people could write a short message here once in a while, and it would be displayed to the next caller to the board. That caller could then leave something and so on. I don't remember why we had both this and the typical "sub-boards" (message areas) on the same systems, but there it was.

I told it to display all that it had, and sure enough, there were ancient test messages from a much younger me, plus stuff from about a year later expressing amazement that "so much time had passed". This is when it occurred to me: did the code exist to "write" a new graffiti post? Did I ever do that, or were the existing entries just crap I had done by hand by appending to the file?

Obviously, I had to check, and wrote something dumb out. After hitting ENTER (well, RETURN, given it's an [emulated] C-64, but whatever), it gave me a bunch of options: abort, edit, list, save, view. So, wait, list *and* view? I had to see what this was about. I hit L. It asked me for a line number, so I gave it one.

That's when I was rewarded with this:

?SYNTAX ERROR IN LINE 6500.

With that, the program died, and would go no further. Clearly, this needed to be investigated. What was wrong? The line of the code looked like this:

6500 P$=STR$(EN)"+".":GOSUB 1720

See that first "? Yeah, that shouldn't be there. What that line is supposed to do is build a prompt from the line number to edit (EN) followed by a period, and then send it to the display/caller (the subroutine at line 1720). In other words, the assignment should have been P$=STR$(EN)+"." and nothing more exotic than that.

What it did in practice was crash.

What really sucks is that I made this mistake back when I originally wrote the program, and then never noticed. I guess I never bothered to test this branch of the code, and it survived in its little magnetic time capsule all the way through to the present day. Even later snapshots on that disk have the same error in them.

It looks like I wrote this bug in 1989. It just bit me in 2021.

Just think about that: it's a syntax error which hides in your code until you finally execute a certain branch. That's unthinkable today, right? Haven't we all grown to the point that no program ships with this kind of stuff in it? Don't all of our systems have various "sealing" type steps applied to them before they leave development and enter production?

The sad news is that this, like many other failures of the time, is still trivial to do. You can bury all kinds of terrible things in barely-used paths and they will just hang out, waiting for their fifteen minutes of fame, even if it takes thirty years to happen.
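To make that concrete, here's a rough modern analog in Python. It's only a sketch, and none of it comes from the original BBS: handle_command, try_command and the misspelled line_numbr are all made up for illustration. Python does reject outright syntax errors when it compiles a file, so the stand-in here is a misspelled variable name, which is only looked up when the line actually runs.

```python
def handle_command(key, line_number=1):
    """Dispatch one keypress from a hypothetical editor menu."""
    if key == "S":
        return "saved"
    if key == "V":
        return "viewing"
    if key == "L":
        # Latent bug: 'line_numbr' is a typo for 'line_number'.
        # Names are resolved at run time, so this sits here quietly
        # until somebody finally presses L.
        return str(line_numbr) + "."
    return "unknown command"


def try_command(key):
    """Run a command and report a crash instead of propagating it."""
    try:
        return handle_command(key, 10)
    except NameError as exc:
        return "crashed: " + type(exc).__name__


if __name__ == "__main__":
    # The well-worn paths work fine...
    print(try_command("S"))
    print(try_command("V"))
    # ...and the one nobody ever exercised blows up.
    print(try_command("L"))
```

The file imports and runs without complaint as long as nobody picks L. A linter, or a test that actually takes that branch, would flag it, but only if someone bothers to run one.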

Of course, THE ONE doesn't make mistakes. But the rest of us do.