Software, technology, sysadmin war stories, and more.
Wednesday, July 25, 2012

Programming language versioning bothers me

There was a point when some people I knew in the web hosting biz were trying to stand up a little consulting gig inside the larger company. They had a name, a contract, and the ability to go off and work on special jobs, and were trying to grow things. They started taking on bigger customers and were slowly making things happen. Of course, a bunch of stupid politics happened (like Project Darkness and Umbrellagate and some others, too), and it was killed. It wasn't shut down explicitly at first, but when you fire the two primary people behind a project, that effectively does the same thing.

During the short interval in which it existed, they wanted to write something to handle migrations. This is where some new customer is already running their stuff at some other provider and wants to come over to us. They might have been running on H-Sphere, or Plesk, or CPanel or something else entirely, and they would frequently ask us for help in moving their sites and customers over.

Invariably, the guys doing this consulting work had to do it by hand, and that's why they started talking about writing tools to make it happen. The one thing which always perplexed me was their choice of languages. They wanted to do it in Python. That by itself didn't really raise any flags with me at first, but then I got to hear them grumble about certain things over dinner one night.

Apparently, there can be huge differences between versions. Keep in mind that you have people migrating from all sorts of different source systems with all different operating systems. Depending on what the customer had at their old place, it was entirely possible they'd have an ancient version of Python installed -- if any at all. This was far enough back to where having it around wasn't a given. Old systems were valid migration sources, after all.

Worse yet were those systems which had multiple versions installed. Some of them had one version which would run as /usr/bin/python but had another one installed so things like up2date (or whatever) could run. The guys were lamenting the fact that they needed version X because it gave them features Y and Z, but they couldn't count on it being there all the time.
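To make the "can't count on version X" problem concrete, here is a minimal sketch of the kind of check a tool could run before doing anything else. The minimum version and the function name are made up for illustration; the post never says which version or features the guys actually needed.

```python
import sys

# Hypothetical minimum version; the post only ever calls it "version X".
MINIMUM_VERSION = (3, 6)


def version_ok(minimum=MINIMUM_VERSION, current=None):
    """Return True if the interpreter version is at least `minimum`."""
    if current is None:
        current = sys.version_info
    # Compare major and minor only, so (3, 6, 1) satisfies (3, 6).
    return tuple(current[:2]) >= tuple(minimum[:2])
```

A script would call version_ok() first thing and bail out with a readable message if it returns False. Of course, that only tells you the box is too old. It doesn't migrate anything.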

This really blew my mind. In my sheltered little world of writing programs in C, I simply did not worry about versions of programming languages. I mean, here it was, 2005 or so, and C had always "just been C" as long as I had been using it -- about 10 years by that point. Sure, there had been some craziness in the past, and tools like gcc still had options like -ansi, but who ever tripped over that now?

The entire notion of "I need a feature which only came into being in the past year or so" just never popped up in my life. I mean, sure, that might happen with some dumb library, but we were talking about migrating data here. We just needed to do a whole bunch of file I/O while converting data. The odds of needing some magic library were low.

Years later, I still think back to that meal. I mean, if you have a "living language" that keeps growing new things, you actually have a double-edged sword. On one hand, there's always something shiny and new coming down the pipe. On the other, you can't rely on it being there.

This raises a new existential question: if a certain language feature isn't widely available yet, does it even really exist? After all, if you can't guarantee it will be there for every system where you could run, wouldn't it be a pretty bad idea to build your program around it? You'd better resort to whatever the old way happened to be.
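Here's a rough sketch of what "resorting to the old way" tends to look like in practice. It uses str.removeprefix, which showed up in Python 3.9, purely as a stand-in for a newer feature. The function name is hypothetical and has nothing to do with the tools those guys were planning.

```python
def strip_prefix(text, prefix):
    """Remove `prefix` from the front of `text` if it is there."""
    if hasattr(str, "removeprefix"):
        # Newer interpreters: str.removeprefix was added in Python 3.9.
        return text.removeprefix(prefix)
    # The old way, which works on older interpreters too.
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    return text
```

Notice that the fallback branch does the whole job by itself. Once you've written it, the shiny new call isn't buying you much.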

Now you're left to the lowest common denominator in the name of compatibility. Have you really accomplished anything? What's the point of having new features if you can't actually use them?

All of this would be different if you could just compile this stuff to a binary format which wouldn't care about what happened to be installed, but of course, that's usually not the case with so many of these "fast changer" languages. They are nothing without their interpreters, and therein lies the basis for this versioning mayhem.

Oh, further making matters interesting is the prospect of not being able to detect a missing feature until it's too late. Here's a scenario. Someone who's a real whiz in your language of choice knows all of the newest tricks and gets assigned to add some feature to your program. It's something to handle a tricky corner case which comes up maybe once every 100 migrations.

Life goes along well. Bunches of migrations happen. Then, one day, a migration runs on a machine with an older version of the interpreter installed, and it just so happens to also need the corner case handling code. Some "if" branch wakes up for the first time in a while, and barrels straight into the brand new feature. The interpreter sees this, has no idea what to do, and blows up. Why did it fail? Well, your new person used a language feature which is only a year old, and it doesn't exist on that system. Oops.

Congratulations. You now have a piece of code which has run hundreds of times on as many systems and has never had any trouble, but it falls apart here because of some local implementation detail.
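One way to drag that failure forward in time is to probe for everything the program relies on at startup, instead of waiting for the rare branch to trip over it. What follows is only a sketch under assumptions: the helper name is made up, and the list of required features would have to be maintained by hand.

```python
def missing_features(required):
    """Return the names of required attributes this interpreter lacks.

    `required` is a list of (owner, attribute_name) pairs, where owner
    is a module or a type.
    """
    missing = []
    for owner, attr in required:
        if not hasattr(owner, attr):
            missing.append("%s.%s" % (owner.__name__, attr))
    return missing
```

A tool might call this with something like [(str, "removeprefix")] and refuse to start if the result isn't empty. It only covers missing functions and methods, and it's only as good as the list someone remembered to update.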

Can this happen in any language? Sure. Play with enough deep, dark stuff and I'm sure you can unleash run-time equivalents of nasal demons anywhere.

The only real difference is how deeply you have to dig to find those demons. With some languages, they're 60 feet down a well and locked inside a steel vault. You have to really work to set them free.

Meanwhile, some languages practically invite them to dinner.

In the end, all of this was moot. Those two guys were fired and the ones who remained at the company found other places to be and other things to do. The whole idea of having that internal consulting team evaporated. There was no longer any need to build migration tools.