Writing

Software, technology, sysadmin war stories, and more.

Wednesday, March 7, 2012

Fighting boredom at a sysadmin job by writing a binary grabber

I used to get really bored during the day at my school district sysadmin gig. For the most part, my systems just sat there and worked without complaint, so "business as usual" meant things like "answer user e-mails really quickly and blow their minds". When even that failed to produce adequate work, the next item was "look at Slashdot". If that was down or otherwise out of fresh stories, then I had to find other sources of amusement. Sometimes, that meant writing programs.

One afternoon, I decided to take advantage of an NNTP feed that we had available by virtue of being hooked up with a certain provider. I knew all about using tin or whatever to bring in uuencoded files, but it was becoming a pain as things got bigger. Even MP3 files would typically be split across multiple posts.

While it was possible to manually save those in tin and then decode all of them, it was annoying. I wound up writing my own little uudecode helper which would automatically step through the different pieces, but all of that tagging and saving mess in my actual newsreader was just wrong. Worse still, if a part was missing, I usually wouldn't find out until it was too late. I needed to use a scratchpad to keep track of the parts as I found them, and ... yeah. What a mess.

On this afternoon, I decided to write something which would first pull in the list of subjects and cache it locally so I could experiment without annoying the server admins. Then I wrote something else which pulled in all of that data and made sense of it. It turned out to be surprisingly simple.

The typical notation would be a subject like "foobar.mp3 (1/3)", where both the current part number and total number of parts would be included. Sometimes it was (parens) and other times it was [brackets], but it was simple enough to handle. I decided to make my program chop off that part and interpret it while keeping track of the actual message ID on the server.
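For illustration, that chopping step might look something like this in Python. It is a sketch, not the original program (whose language isn't stated here), and the names `PART_RE` and `split_subject` are made up for the example.

```python
import re

# Trailing part marker such as "(1/3)" or "[1/3]".
# Note: this loose pattern would also accept a mismatched pair like "(1/3]".
PART_RE = re.compile(r"\s*[\(\[](\d+)/(\d+)[\)\]]\s*$")


def split_subject(subject):
    """Return (name, part, total), or None if there is no part marker."""
    m = PART_RE.search(subject)
    if m is None:
        return None
    name = subject[:m.start()].strip()
    return name, int(m.group(1)), int(m.group(2))


print(split_subject("foobar.mp3 (1/3)"))
print(split_subject("foobar.mp3 [2/3]"))
print(split_subject("Re: what's for lunch?"))
```

Subjects without a marker come back as `None`, which is what lets ordinary discussion posts fall out of the list later.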

By doing this, I managed to arrive at a list of files instead of a list of posts like an ordinary newsreader would have. I added some logic to make it ignore anything which was incomplete by checking to see if I had all of the parts. For the above example, that means finding 1/3, 2/3 and 3/3 before declaring victory. It also had the nice side effect of hiding non-binary posts since they rarely matched that format.
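The "incomplete filter" can be sketched roughly as follows, assuming each post has already been reduced to a (name, part, total, message ID) tuple. The function name and the tuple layout are assumptions for the example, not details from the original program.

```python
from collections import defaultdict


def complete_files(posts):
    """posts: iterable of (name, part, total, message_id) tuples.

    Returns {name: [message_id, ...]} in part order, but only for files
    where every part from 1 through total was seen.
    """
    parts = defaultdict(dict)  # (name, total) -> {part: message_id}
    for name, part, total, message_id in posts:
        # Keep the first copy if a part was posted more than once.
        parts[(name, total)].setdefault(part, message_id)

    result = {}
    for (name, total), found in parts.items():
        wanted = range(1, total + 1)
        if total > 0 and all(n in found for n in wanted):
            result[name] = [found[n] for n in wanted]
    return result


posts = [
    ("foobar.mp3", 2, 3, "<b@example>"),
    ("foobar.mp3", 1, 3, "<a@example>"),
    ("baz.mp3", 1, 2, "<d@example>"),  # part 2/2 never shows up
    ("foobar.mp3", 3, 3, "<c@example>"),
]
print(complete_files(posts))
```

Here `baz.mp3` is dropped because only one of its two parts is present, while `foobar.mp3` survives with its message IDs sorted into part order.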

Even with the "incomplete filter" in place excluding a fair number of files, there would always be plenty of interesting things to see. My next step was to render all of this in a nice list. I did all of this with ncurses and built a list which fit in whatever size terminal you had at the time.

The up and down arrow keys served to move a highlight bar or scroll the whole list, while the right and left arrows would scroll the current line horizontally. In this way, you could see the title even if it was too long for the current window. In addition, if you just idled on a too-long line, it would automatically scroll it all the way to the right, pause, and then reverse and scroll all the way back to the left. Then it would pause at that end before starting over. I mostly did this to show off since it was rarely necessary to read the full line to know what you'd be getting.
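The idle "scroll right, pause, scroll back, pause" behavior boils down to computing a horizontal offset from a tick counter. Here is one way to sketch it in Python, leaving the ncurses drawing out entirely. The function name, the `pause` parameter and the tick-based timing are all assumptions for illustration.

```python
def marquee_offset(tick, text_len, width, pause=5):
    """Index of the first visible character at a given timer tick."""
    max_off = text_len - width
    if max_off <= 0:
        return 0  # the line fits, so there is nothing to scroll

    cycle = 2 * (max_off + pause)
    t = tick % cycle
    if t < max_off:
        return t  # scrolling toward the right end
    if t < max_off + pause:
        return max_off  # holding at the right end
    if t < 2 * max_off + pause:
        return max_off - (t - max_off - pause)  # scrolling back to the left
    return 0  # holding at the left end


title = "a_really_long_file_name.mp3"
for tick in range(6):
    off = marquee_offset(tick, len(title), width=10, pause=2)
    print(title[off:off + 10])
```

A redraw loop would bump `tick` on a timer whenever the highlight bar sits still, and reset it to zero when the user moves to another line.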

After that, I added file tagging and post fetching. Pressing the space bar while on a file would tag it so that all of the associated posts would later be downloaded. I never got around to making this run in a background process, so it would all happen at once later when I was done looking at the list.
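Tagging is little more than a set of file names plus a way to turn that set into a list of posts to fetch once browsing is done. A small Python sketch, with hypothetical names and assuming a mapping from file name to message IDs in part order:

```python
def toggle_tag(tagged, name):
    """Flip the tag on one file, as the space bar would."""
    if name in tagged:
        tagged.remove(name)
    else:
        tagged.add(name)


def download_queue(tagged, files):
    """Flatten the tagged files into one ordered list of message IDs.

    files maps a file name to its message IDs in part order.
    """
    queue = []
    for name in sorted(tagged):
        queue.extend(files[name])
    return queue


files = {
    "a.mp3": ["<1@example>", "<2@example>"],
    "b.mp3": ["<3@example>"],
}
tagged = set()
toggle_tag(tagged, "a.mp3")
toggle_tag(tagged, "b.mp3")
toggle_tag(tagged, "b.mp3")  # second press removes the tag again
print(download_queue(tagged, files))
```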

Later improvements included intelligent caching of the subject data on disk so that it would only have to pick up the latest posts from upstream. This meant no redundant transfer and faster startup times, in addition to less load for our NNTP server.
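One plausible shape for that incremental cache is to remember the highest article number seen so far and only ask the server for anything newer. The sketch below is an assumption about how it could work, not a description of the original code. `fetch_subjects` is a hypothetical callable standing in for the real NNTP request, and the JSON file layout is invented for the example.

```python
import json
import os


def refresh_cache(path, fetch_subjects, server_last):
    """Update the on-disk subject cache and return it.

    fetch_subjects(first, last) is a hypothetical callable returning a
    list of (article_number, subject, message_id) for that range.
    server_last is the newest article number the server reports.
    """
    cache = {"high": 0, "posts": []}
    if os.path.exists(path):
        with open(path) as f:
            cache = json.load(f)

    if server_last > cache["high"]:
        # Only the range we have not seen yet goes over the wire.
        # Simplification: a cold start begins at article 1, and posts
        # that have expired on the server are never pruned.
        new = fetch_subjects(cache["high"] + 1, server_last)
        cache["posts"].extend(list(p) for p in new)
        cache["high"] = server_last
        with open(path, "w") as f:
            json.dump(cache, f)
    return cache
```

Calling it a second time with the same `server_last` never touches the network, which is where the faster startup comes from.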

I should mention that all of the tracks I found through there later drove a bunch of CD purchases. Years later, when I switched to using a Mac as my full-time workstation and general amusement system, I declared musical bankruptcy and restarted by importing everything from my CDs. I filled a few tracks in from the iTunes Music Store too. As a result, all of those "acquisitions" from the old days are now gone.

Oh, and finally, while I was having to come up with things to do, the "network engineers" were undoubtedly running around putting out fires on their side of the house. They were always "busy", and I determined that was probably not a good thing. If you're supposed to be running a stable service and you're always busy, what's really going on there?