Software, technology, sysadmin war stories, and more.
Saturday, August 13, 2011

Kiss o' death: plan ahead or suffer later

Wikipedia has a great article about NTP server misuse and abuse which should be required reading for anyone creating a protocol. In particular, anyone who is setting up a situation where the clients might ever be out of their control should read it and really take it to heart. It might make the difference between a controllable situation and something which quickly gets out of hand.

I'm sure a lot of people will tune out, thinking, oh, gee, I'll never write a protocol. I'm just some web person. Wrong. If you're doing any sort of client-side stuff, like in JavaScript or similar, and it phones home, you have created a protocol whether you like it or not. That means you should pay attention to this.

Let's say you have a bit of scripting which polls your web server every 10 seconds to look for an update to something. Maybe you haven't quite gotten around to writing the whole long-poll thing. Or maybe you have, and that's the fallback interval. Either way, you have a situation where you might have a client system hitting you regularly for a very long time. Multiply that by a lot of clients and it could get interesting.

Normally, this is what you want. But what if something goes wrong? You need a way to stop the flood at the source rather than trying to hack around it on your server. For that, you need both the equivalent of a "kiss o' death" packet as described in the article and client-side support for it.

It could be rather simple. Perhaps you do some kind of AJAX call-out to get an update from the server. If you have access to the raw HTTP return codes in your implementation, you might rig it so that you treat some of them in a special way. That way, if you need to start shedding load, you can start returning that code and trigger the special-case code path. That code would then disable polling, or throttle it back, or something else intended to stop the burning.
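Here's a rough sketch of what that special-case path might look like. Everything named here is my own assumption for illustration: picking 429 as the "stop" code, honoring a numeric Retry-After header as the throttle hint, and the names HALT_STATUS, nextPollDelay and pollOnce. Use whatever code and names fit your setup.

```javascript
// Assumption: 429 (Too Many Requests) is the status reserved as the kill switch.
const HALT_STATUS = 429;
// The normal 10-second polling interval, in milliseconds.
const DEFAULT_INTERVAL_MS = 10000;

// Decide what to do after a response.
// Returns the delay before the next poll in ms, or null to stop polling.
function nextPollDelay(status, retryAfterSeconds) {
  if (status === HALT_STATUS) {
    // A usable hint from the server means "throttle back"; no hint means "stop".
    if (Number.isFinite(retryAfterSeconds) && retryAfterSeconds > 0) {
      return retryAfterSeconds * 1000;
    }
    return null;
  }
  return DEFAULT_INTERVAL_MS;
}

// Hypothetical polling step built on fetch(); handleUpdate is the caller's callback.
async function pollOnce(url, handleUpdate) {
  let delay = DEFAULT_INTERVAL_MS;
  try {
    const response = await fetch(url);
    // A missing or non-numeric Retry-After becomes 0 or NaN, which means "stop".
    const retryAfter = Number(response.headers.get("Retry-After"));
    delay = nextPollDelay(response.status, retryAfter);
    if (response.ok) {
      handleUpdate(await response.json());
    }
  } catch (err) {
    // Network or parse failure: keep the normal interval.
  }
  if (delay !== null) {
    setTimeout(() => pollOnce(url, handleUpdate), delay);
  }
}
```

The decision lives in its own little function so you can test the kill switch without a server. That's the part you really want to be sure about.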

If raw HTTP twiddling is unavailable for whatever reason, don't despair. Let's say you're shooting JSON around, and normally you define "data" to be equal to "stuff". When you need to shed load, the server could instead send something called "halt". Then, in your data handler, you'd do something like this:

function updateHandler(data) {
  if (data.halt) {
    auto_poll = 0;  // Slow it down (or stop it).
  }
  // Do your normal stuff here.
}

That's it.
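The other half is whatever schedules the next poll, since it has to actually look at that variable. A minimal sketch, assuming auto_poll holds the interval in milliseconds and that zero means "stop" (scheduleNextPoll is a made-up helper name):

```javascript
// Assumption: auto_poll is the polling interval in ms, and 0 means "stop".
let auto_poll = 10000;

function updateHandler(data) {
  if (data.halt) {
    auto_poll = 0;  // The server asked us to stop.
    return;
  }
  // Do your normal stuff with data here.
}

// Hypothetical helper: schedule the next poll only while polling is enabled.
// Returns the timer handle, or null when the kill switch has been thrown.
function scheduleNextPoll(pollFn) {
  if (auto_poll <= 0) {
    return null;
  }
  return setTimeout(pollFn, auto_poll);
}
```

Once a response carrying "halt" comes through, scheduleNextPoll quietly does nothing, and that client stays quiet until the page is reloaded.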

You may never need this, but if you do, you'll be so relieved when you realize it's out there in all of your client code, able to be switched on remotely. Just don't forget it if that day ever arrives.

Final note: sure, you could also try to write your own exponential backoff routine to hopefully make this less of a problem. That's cool and all, but appreciate that you might not get it right, and then you'd really need this kind of remote kill switch. Imagine what would happen if your throttle went the wrong way in some corner case, or wrapped around to a negative delay, or something stupid like that.
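If you do write the backoff anyway, clamping the result at both ends guards against exactly those corner cases. This is only a sketch: the 10-minute ceiling, the exponent cap, and the names backoffDelay, BASE_DELAY_MS and MAX_DELAY_MS are my assumptions, not anything standard.

```javascript
// Base interval: the 10-second poll from the earlier example, in ms.
const BASE_DELAY_MS = 10000;
// Assumption: never wait longer than 10 minutes between polls.
const MAX_DELAY_MS = 10 * 60 * 1000;

// Delay before the next poll after `failures` consecutive failures.
function backoffDelay(failures) {
  // Anything that isn't a positive integer counts as zero failures.
  const n = Number.isInteger(failures) && failures > 0 ? failures : 0;
  // Cap the exponent so the multiplication stays a modest finite number.
  const exponent = Math.min(n, 16);
  const delay = BASE_DELAY_MS * 2 ** exponent;
  // Clamp both ends: never below the base, never above the ceiling.
  return Math.min(Math.max(delay, BASE_DELAY_MS), MAX_DELAY_MS);
}
```

Garbage input like a negative count or NaN falls back to the base delay, so the worst case is polling at the normal rate rather than in a tight loop. The remote kill switch still belongs in there alongside it.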

Put it this way: are you feeling lucky?