
Wednesday, September 12, 2012

Runtime requirements and messy dependency management

I find it interesting to see where programming languages have been taken. A bunch of them seem to find their way into niches completely disconnected from whatever the official marketing may have been. This turns up in weird places when you start hearing how certain backend operations really work, and the twisted things which are done to keep them going.

Years ago, I was talking to a friend who was in town on some kind of visit. He was telling me the finer points of what made their backend run, including the colocation center where it was hosted and what it was written in. They were running their own boxes and I figured it might be interesting if they thought about coming over to the web hosting company where I worked.

It turned out their backend software was written in a programming language once marketed as "write once, run anywhere". Upon hearing this, I asked how many platforms it ran on. The answer was one. This confused me. They were willing to pay the price for adopting that particular language, but weren't using the actual feature which you got in return? That seemed to be the situation.

I figured, okay, maybe that "one platform" just meant Linux. I was wrong. Not only did it mean Linux, but it meant a specific flavor of it. I forget which one they were running, but let's say it was RHEL. Even then, it was a specific version of it, and not just any of them - I think 2.1, 3, and 4 machines were common at the time.

On top of that, they needed a specific install of Tomcat and the JRE. If any of those parameters were changed, there was a good chance it would not run. This just blew my mind. They were stuck with all of the warts of running Java, but didn't get to experience the obvious benefit from having an intermediate bytecode.

I never got to ask how it had been selected. I suspect they went into it with the best of intentions and only wound up becoming constrained once they realized what production life was really like.

I'm always a little suspicious of web hosting environments which have such specific requirements, particularly if they involve things which are not the stock versions you get with whatever OS you happen to install. Every change is just another thing you have to track and own yourself for as long as that particular tree stays in production for your project.

My friend wasn't one of these people, but there were folks who would come to us at the web hosting gig, request a RHEL ES3 machine, and then demand we install Apache 1.3 on it. That was a nontrivial affair, since there was a whole stack of stuff on the machine which tied into the web server, and it had all been built around Apache 2.0. I later found out that many of these requests came from people who had modules which would not run in the 2.0 world.

I would love to see people grab onto the concept of having relatively low dependencies on the base system without then going whole-hog into the world of virtualization. It seems to me that there can be a middle ground.

Right now, it seems like everyone who's running compiled software winds up with hidden dependencies based on whatever might already be installed on their machines. Unless they are very careful about these things, it's possible to get a dependency on something which does not normally get installed. This makes everything run just fine until you migrate to another system, and then it bombs.

Some would say the answer is to have one "golden system" image, virtualize the whole thing, and run dozens of copies of it. I'm not really a fan of that approach. The last thing I want is a multiplication factor which means that instead of just N systems, I now have N physical machines multiplied by the M virtual images running on top of them.

It seems like this sort of thing could be detected by using a carefully-constructed chroot jail to see just what a binary needs to run. Then you explicitly patch everything into it until it stops complaining. Once you get tired of doing this, you work your way back through the build process and start paying attention at the compile/link stages. If you needed to say "-lgnuradio-core" at link time, for example, you're going to need to make sure those same libraries find their way into your runtime environment.
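
A rough sketch of that first pass, assuming you're willing to trust ldd as a starting point (the jail path and binary name below are just placeholders):

  #!/usr/bin/env python3
  # Rough sketch (Python 3): ask ldd which libraries a binary resolves,
  # then copy them into a jail directory, preserving their paths.
  # The jail location and binary are whatever you pass on the command line.
  import os
  import shutil
  import subprocess
  import sys

  def shared_libs(binary):
      """Return the resolved library paths from ldd's output."""
      out = subprocess.check_output(["ldd", binary]).decode()
      libs = []
      for line in out.splitlines():
          parts = line.split()
          if "=>" in parts and parts[parts.index("=>") + 1].startswith("/"):
              libs.append(parts[parts.index("=>") + 1])
          elif parts and parts[0].startswith("/"):
              # the dynamic loader itself, e.g. /lib64/ld-linux-x86-64.so.2
              libs.append(parts[0])
      return libs

  def populate_jail(binary, jail):
      """Copy the binary and everything it links against into the jail."""
      for path in [binary] + shared_libs(binary):
          dest = os.path.join(jail, path.lstrip("/"))
          os.makedirs(os.path.dirname(dest), exist_ok=True)
          shutil.copy2(path, dest)

  if __name__ == "__main__":
      # e.g. ./trace_deps.py /usr/local/bin/myapp /var/jails/myapp
      populate_jail(sys.argv[1], sys.argv[2])

Of course, ldd only knows about the libraries named at link time. Anything pulled in with dlopen(), plus config files and /dev nodes, still has to be found the hard way: run the thing inside the jail and watch what it complains about.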

At this point, some other people start writing "recipes", such that when program X with the gnuradio-core dependency gets installed, something else runs and does an "rpm -i" (or installpkg, or whatever) on their custom gnuradio-core.package.foobar file. Trouble is, now you end up with machines which have a crazy mishmash of stuff installed everywhere. If you're not careful, you could wind up in the Linux equivalent of DLL hell. You might even have to make lists of things which can't both run on the same machine because they'll stomp on each other. I also worry about the problems inherent in maintaining this in two places: one where you build it and another where you run it. How do you handle versioning?

I'd rather see it happen like this: the compile/linking knowledge is used to build bundles for each thing you want to run. It should contain everything which is needed to let that software behave properly. The only external dependencies should be those things which you can find everywhere, like the Linux/glibc ABI. Relying on the presence (or absence) of anything else is just going to bring you pain later on.
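
Here's a sketch of what that bundling step might look like, assuming a made-up bin/ and lib/ layout and leaving out the glibc pieces you'd expect any host to provide:

  #!/usr/bin/env python3
  # Sketch of the bundling step (Python 3): pack the binary and its
  # non-glibc libraries into a tarball. "myapp" and the bin/ + lib/
  # layout are made up for illustration, not any standard format.
  import os
  import shutil
  import tarfile

  # Pieces we expect every host to provide as part of the base ABI.
  BASE_LIBS = ("libc.so", "libm.so", "libdl.so", "libpthread.so",
               "librt.so", "ld-linux")

  def is_base_lib(path):
      name = os.path.basename(path)
      return any(name.startswith(prefix) for prefix in BASE_LIBS)

  def build_bundle(binary, libs, stage="bundle", out="myapp-bundle.tar.gz"):
      """Stage the binary under bin/ and its libraries under lib/, then tar it up."""
      os.makedirs(os.path.join(stage, "bin"), exist_ok=True)
      os.makedirs(os.path.join(stage, "lib"), exist_ok=True)
      shutil.copy2(binary, os.path.join(stage, "bin"))
      for lib in libs:
          if not is_base_lib(lib):
              shutil.copy2(lib, os.path.join(stage, "lib"))
      with tarfile.open(out, "w:gz") as tar:
          tar.add(stage, arcname="myapp")
      return out

The libs list could come straight from your build system's link line, or from something like the ldd sketch earlier. Which libraries count as "base" is a judgment call; the shorter that list, the fewer surprises later.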

When it's time to actually run one of these things, you drop that package into a controlled space and start the binary. It can then fulfill its linking requirements entirely from that space and can run in a nice little container. In a world where static linking isn't always possible, this is about the best you can hope for.
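
The launcher for such a bundle could stay tiny. A minimal sketch, assuming the same bundle layout as above:

  #!/usr/bin/env python3
  # Minimal launcher sketch (Python 3): run the bundled binary so the
  # dynamic linker resolves libraries from the bundle's lib/ first.
  # The layout matches the bundle sketch above; nothing here is standard.
  import os
  import sys

  def run_bundled(bundle_dir, prog, args):
      env = dict(os.environ)
      env["LD_LIBRARY_PATH"] = os.path.join(bundle_dir, "lib")
      binary = os.path.join(bundle_dir, "bin", prog)
      os.execve(binary, [prog] + list(args), env)   # replaces this process

  if __name__ == "__main__":
      # e.g. ./launch.py /opt/bundles/myapp myapp --port 8080
      run_bundled(sys.argv[1], sys.argv[2], sys.argv[3:])

Pointing LD_LIBRARY_PATH at the bundle (or baking an $ORIGIN-relative rpath in at link time) keeps the dynamic linker from wandering off into whatever happens to be installed on the host. A stricter version could chroot() into the bundle or add proper containment, and none of this covers config files, data directories, or the rest of what real software drags around.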

I don't have a solution for this particular monster just yet. The day when I can say "hey you, go run this on there" and not have to worry about what other stuff may or may not already be present will be a very happy one indeed.