Software, technology, sysadmin war stories, and more.
Monday, September 5, 2011

Why programming language selection matters

I've had people ask me what languages I use to get things done. My answer is usually something like "it depends, but I have my favorites". For a significant amount of time, it's worked out something like this: I might use a shell script to do some dumb one-off thing or prove a point, but anything more complicated graduates to C (or now, C++). Special situations call for their own solutions, but otherwise, it's shell or C or C++.

So then they ask how I know when to switch from one to the other. I have a decent example from around 1997, at my school district sysadmin job. The district library folks had a crusty old inventory system which was DOS based and had no clue about networks. They wanted to put it online somehow and make it searchable.

Even though there was no network support, one of the more clever folks over there managed to get it to "print" its entire inventory to a file, and then got that file to me. I just had to split it, do some basic parsing, and turn it into a usable format for web-based searches. It looked like a one-off shell script job: loop through the input, 'cut' a few fields, echo them in a certain order, repeat. I wrote one and set it to work.

The shell script worked, but it was horribly slow. Every line of input meant starting fresh cut and grep processes (and everything else), and when multiplied by the length of the file, it was clear it was going to take far too long. This was on our brand new 200 MHz Pentium Pro box, which was a big deal in 1997.

I left it running and started rewriting my text-mangling stuff in C. I never really liked C for text handling, but I needed the relative speed. I managed to write it, test it, and then run it for real, and it finished before the shell script did. It has become my go-to example of why language choice can matter, and why a shell loop that spawns a handful of subprocesses for every line of input can be the wrong choice.
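For a sense of what that kind of rewrite looks like, here is a minimal sketch. It is not the original program, and the details are assumptions made for illustration: tab-separated input with title, author, and call number in that order, a '|' separated output, and the helper names split_fields and convert_inventory. The point is only that the whole file is handled inside one process, with no cut or grep started per line.

```c
#include <stdio.h>
#include <string.h>

#define MAX_FIELDS 16

/* Split `line` in place on `delim` and store a pointer to each field in
 * `fields`. Returns the number of fields found (at most `max_fields`).
 * Unlike strtok(), empty fields are kept. If the line has more than
 * `max_fields` fields, the last slot holds the unsplit remainder. */
static int split_fields(char *line, char delim, char **fields, int max_fields)
{
    int count = 0;
    char *start = line;

    /* Drop the trailing newline (and a DOS-style carriage return). */
    line[strcspn(line, "\r\n")] = '\0';

    while (count < max_fields) {
        fields[count++] = start;
        char *sep = strchr(start, delim);
        if (sep == NULL)
            break;
        *sep = '\0';        /* terminate this field */
        start = sep + 1;    /* next field begins after the delimiter */
    }
    return count;
}

/* Read records from `in` and write them to `out` in a new field order.
 * Returns the number of records written. Lines longer than the buffer
 * are not handled specially in this sketch. */
static long convert_inventory(FILE *in, FILE *out)
{
    char line[1024];
    char *fields[MAX_FIELDS];
    long written = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        int n = split_fields(line, '\t', fields, MAX_FIELDS);
        if (n < 3)
            continue;       /* skip short or malformed lines */

        /* Assumed layout: title, author, call number.
         * Emitted as: call number|author|title */
        fprintf(out, "%s|%s|%s\n", fields[2], fields[1], fields[0]);
        written++;
    }
    return written;
}
```

Calling convert_inventory(stdin, stdout) from main() turns it into a filter that can be fed the dumped inventory file. The shell version does the same logical work, but pays for a fork and exec of each helper on every line, and that is where the time goes.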