
Monday, March 26, 2012

The worst kinds of RPC mechanisms

Let's say you work at a big company which has a bunch of programmers working on interesting bits of infrastructure. Maybe they've written storage systems, multi-machine cluster management stuff, job dispatch and control, and of course some kind of RPC mechanism to let all of these pieces talk to each other. Then your own project needs some way to communicate between two parts.

What do you do? You write your own RPC mechanism. Of course.

But, given that you're doing it in Python and you have an affinity for ssh, why not lean on those tools again? After all, an RPC is nothing more than a bundle of data with a method name and maybe some parameters. That could just be a dict! And how do you get a dict from one place to the other? Why, you pickle it, of course!

So now you have this serialized blob of data and you need to get it to the other process. You have an ssh connection open to the other host where it's running, so you can just fire it down that link. The other process will read it on stdin, and it can reply on stdout with yet another pickle. Then you pull that in, unpickle it, and there's your dict in reply. Yay!
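For the sake of illustration, here is roughly what that scheme boils down to. This is a minimal sketch and not their actual code: the function names, the "add" method, and the dict layout are all made up, and in-memory buffers stand in for the stdin and stdout of the ssh link so it can run on one machine.

```python
import io
import pickle

def send_message(stream, message):
    """Write one pickled dict to a binary stream (stand-in for the ssh pipe)."""
    pickle.dump(message, stream)
    stream.flush()

def read_message(stream):
    """Read one pickled dict back off a binary stream."""
    return pickle.load(stream)

def handle(request):
    """Hypothetical remote side: look up the method name and call it."""
    methods = {"add": lambda a, b: a + b}
    func = methods[request["method"]]
    return {"result": func(*request["params"])}

# Caller -> remote process (this would be the remote stdin).
to_server = io.BytesIO()
send_message(to_server, {"method": "add", "params": [2, 3]})
to_server.seek(0)

# Remote process -> caller (this would be the remote stdout).
to_client = io.BytesIO()
send_message(to_client, handle(read_message(to_server)))
to_client.seek(0)

reply = read_message(to_client)
print(reply)  # {'result': 5}
```

Note that nothing in there frames or labels the messages. Whatever bytes show up on the stream are assumed to be a pickle.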

What a mess. I ran into this when something was misbehaving and I tried to add some debugging prints to one of the processes. Little did I know, its stdout was actually being read by something else over the network. After I made my change, all of the runs started coming back with some arcane "pickle failure" message. All I had done was add a print, so the only way that could break a pickle is if ... oh no.
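The failure is easy to reproduce in miniature. The sketch below assumes the reader simply unpickles whatever arrives, which matches the symptom I saw, though the reply contents and the debug text here are invented.

```python
import io
import pickle

reply = {"status": "ok", "value": 42}
clean = pickle.dumps(reply)

# What the reader gets once a debugging print lands on stdout first.
polluted = b"debug: request received\n" + clean

def try_unpickle(data):
    """Return (ok, value or error text) for one attempt to read a reply."""
    try:
        return True, pickle.load(io.BytesIO(data))
    except Exception as exc:  # exact exception type depends on the stray bytes
        return False, f"{type(exc).__name__}: {exc}"

clean_ok, clean_value = try_unpickle(clean)
polluted_ok, polluted_error = try_unpickle(polluted)

print(clean_ok, clean_value)
print(polluted_ok, polluted_error)
```

The unpickler treats the first byte of the debug text as a pickle opcode and falls over right away. In a setup like this, sending the debug output to stderr instead would keep it out of the pickle stream, assuming nothing is parsing stderr too.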

That's how I discovered their little RPC mechanism. It had slipped in at some earlier point in the project, perhaps before I even joined.

After I figured out how these guys liked to roll, I determined that "pickle" is a good indicator that something bad is about to go down. If they stir that into their design, chances are that it's a real stinker.

Another place used to pass state around by stashing a base64-encoded pickle in a hidden INPUT field in their forms. I used to wonder just how many interesting bits you could set just by tweaking that and feeding them one of your own design. Maybe you could take over the machine!
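That worry is not far-fetched: the Python documentation itself warns that unpickling untrusted data can execute arbitrary code. Here is a deliberately harmless sketch of the idea. The helper names and the form state are hypothetical, and the forged payload only evaluates a bit of arithmetic.

```python
import base64
import pickle

def encode_state(state):
    """What the form would embed in its hidden field (hypothetical helper)."""
    return base64.b64encode(pickle.dumps(state)).decode("ascii")

def decode_state(field_value):
    """What the server would do with the field when the form comes back."""
    return pickle.loads(base64.b64decode(field_value))

class Forged:
    """Client-built object: unpickling it calls eval() on a string we chose."""
    def __reduce__(self):
        return (eval, ("6 * 7",))  # harmless here; could be any expression

honest = encode_state({"step": 2, "user": "alice"})
forged = encode_state(Forged())

honest_state = decode_state(honest)
forged_state = decode_state(forged)

print(honest_state)  # {'step': 2, 'user': 'alice'}
print(forged_state)  # 42
```

The server expected a dict of form state and instead ran a call picked by whoever edited the field. Swap the arithmetic for something less friendly and you get the picture.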

Then there's the project which was written using XML-RPC for network communications because their language had a convenient library for it, but none of the other supported languages at the company spoke it. Nobody had written a library to speak XML-RPC for those other languages because the company had a perfectly good RPC mechanism of its own already. This particular project was named after a porno film. Really.

You know you have a really bad case of NIH when you even apply it to the established baseline libraries from inside your own organization!