Software, technology, sysadmin war stories, and more.
Sunday, June 26, 2011

Growing web applications from scratch

This post is the result of a friend's request for notes on how to rig up a stateful web application. There are many ways to do that, and this is just one of them. I specifically refrain from calling it the best, from saying that all others need to just go away, or from making any other comparison of that type.

So you want to create a web application. By that, I mean something which is somewhat dynamic, and has users poking at it and changing things. It probably remembers stuff for a non-trivial amount of time, like seconds or minutes. In other words, you're not just throwing static HTML at them, and you're actually accepting data back from them. You're probably serving that data back to them and a few other people, too.

You could try to make this happen by writing a bunch of CGI programs which wake up and talk to a database. One of them would probably be the reader, supplying JSON or maybe even fully populated web pages with that content included. Others would then need to be created to write back to this database. It's not much of a stretch, and it's not too scary. People have been doing this sort of thing for years.

The problem is that it's ridiculously heavyweight for something simple like a chat application, or a silly buzzword game for people watching their company's TGIF show. Also, any special logic you have for processing that data has to happen on the way to the database, or on the way back out. Now, remember that since you're a CGI program, you're going to start and stop for every hit. That's a lot of churn. Ouch.

Well, at this point, you could turn to something that keeps your program running. Maybe you could go suffer the kind of self-abuse which comes from FastCGI. Or you could try something like mod_perl or mod_python or go all the way to Django. Or you could go for something even more ShinyThing compliant and dive into Rails, which means now you've committed to Ruby.

What if you really like your usual way of writing daemons and other persistent standalone server processes and don't want to deal with someone's framework, let alone the language flavor of the month? Isn't there some way to keep all of this web cruft to a minimum and still get things done?

Naturally, there is. First, imagine your web application as three parts. There is the client side stuff, running in a web browser. It's probably a whole bunch of Javascript, with or without jQuery or whatever else you use to stay sane. There's also your usual HTML for actual content, and CSS to make it look nice and round off all of those corners.

Then there's a server process written in your favorite language, just hanging out somewhere listening to a port, waiting for incoming connections. Your approach should be simple: accept a connection, parse a request, do something with it, and emit a response. You'll stay running the whole time, so if you want short-term state for something, just keep it in memory. My list of buzzwords works this way.
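Here's a minimal sketch of that loop in Python, just to make the shape concrete. Everything specific in it is my assumption for illustration: the one-JSON-object-per-line wire format, the "add" and "list" operations, the STATE dictionary, and port 9090. It is not how the buzzword list is actually built.

```python
import json
import socketserver

# Short-term state lives in process memory for as long as the server runs.
# The "buzzwords" list is a hypothetical example of such state.
STATE = {"buzzwords": []}


def handle_request(request) -> dict:
    """Do something with one parsed request and return a response."""
    if not isinstance(request, dict):
        return {"ok": False, "error": "bad request"}
    op = request.get("op")
    if op == "add" and "word" in request:
        STATE["buzzwords"].append(request["word"])
        return {"ok": True, "count": len(STATE["buzzwords"])}
    if op == "list":
        return {"ok": True, "words": list(STATE["buzzwords"])}
    return {"ok": False, "error": "unknown op"}


class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        # One request per connection: read a line, parse it, respond.
        line = self.rfile.readline()
        try:
            request = json.loads(line)
        except ValueError:
            response = {"ok": False, "error": "bad request"}
        else:
            response = handle_request(request)
        self.wfile.write(json.dumps(response).encode("utf-8") + b"\n")
        # Returning from handle() closes the connection.


def serve(host="127.0.0.1", port=9090):
    """Accept connections one at a time until the process is killed."""
    with socketserver.TCPServer((host, port), Handler) as server:
        server.serve_forever()
```

Calling serve() starts the listener. Since it handles one connection at a time, there's no locking around STATE. A threaded or forking server would need more care.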

Finally, you have to connect those clients to your server. Your clients generate HTTP requests: GETs and POSTs. I imagine your server probably won't stoop to such craziness and will instead prefer something a little more sane. Try Facebook's Thrift or Google's Protocol Buffers. They're both open source. You can invent your own RPC mechanism or go find something which just needs a Stub. By the way, your users will never see this, so this can be just about anything.
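If you do roll your own, the wire format can be tiny. This sketch assumes one possible framing: a 4-byte big-endian length followed by a JSON payload. The function names are made up for illustration, and newline-delimited JSON would do just as well for small messages.

```python
import io
import json
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian payload length


def encode_frame(message: dict) -> bytes:
    """Serialize one message as a length prefix plus a JSON payload."""
    payload = json.dumps(message).encode("utf-8")
    return HEADER.pack(len(payload)) + payload


def read_frame(stream) -> dict:
    """Read exactly one framed message from a buffered binary stream."""
    header = stream.read(HEADER.size)
    if len(header) < HEADER.size:
        raise EOFError("connection closed before a full header arrived")
    (length,) = HEADER.unpack(header)
    payload = stream.read(length)
    if len(payload) < length:
        raise EOFError("connection closed mid-message")
    return json.loads(payload)
```

On a real socket, read_frame expects something like sock.makefile("rb"), where read(n) keeps reading until it has n bytes or hits end of stream.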

Let's put it all together: a user clicks on something in your page. Your Javascript does a jQuery $.ajax() call (or whatever) to hit some URL on your web server. That maps onto some dumb little CGI program which turns the metadata (client IP, cookies) and the actual request (GET query string or POST stream) into a proper request for your RPC mechanism, and fires it off at your server.
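That dumb little CGI program might look something like this sketch. It assumes things this post does not specify: form-encoded POST bodies, a backend on 127.0.0.1:9090 that speaks one JSON object per line, and a request layout with a "meta" field for the client IP and cookies. Treat all of those names as placeholders.

```python
import json
import os
import socket
import sys
from urllib.parse import parse_qs


def build_rpc_request(environ, body: bytes) -> dict:
    """Fold CGI metadata and the raw request into one RPC request."""
    if environ.get("REQUEST_METHOD", "GET") == "POST":
        raw = body.decode("utf-8")  # assumes a form-encoded POST body
    else:
        raw = environ.get("QUERY_STRING", "")
    # parse_qs returns a list per field; keep only the first value.
    request = {key: values[0] for key, values in parse_qs(raw).items()}
    # Set "meta" last so a user-supplied field cannot spoof it.
    request["meta"] = {
        "client_ip": environ.get("REMOTE_ADDR", ""),
        "cookies": environ.get("HTTP_COOKIE", ""),
    }
    return request


def forward(request: dict, host="127.0.0.1", port=9090) -> bytes:
    """Send one JSON line to the backend and return its one-line reply."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(json.dumps(request).encode("utf-8") + b"\n")
        with conn.makefile("rb") as reply:
            return reply.readline()


def main():
    length = int(os.environ.get("CONTENT_LENGTH") or 0)
    body = sys.stdin.buffer.read(length) if length else b""
    try:
        reply = forward(build_rpc_request(os.environ, body))
    except OSError:
        reply = b'{"ok": false, "error": "backend unavailable"}\n'
    # Pass the server's JSON straight back to the browser, as-is.
    sys.stdout.write("Content-Type: application/json\r\n\r\n")
    sys.stdout.flush()
    sys.stdout.buffer.write(reply)
```

A real CGI script would end with a call to main(). Keeping the translation in build_rpc_request means you can test it without a web server in the way.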

Your server receives that request, chews on it, and generates a response if appropriate. The Javascript in the browser is probably going to expect JSON, so you could just emit JSON from your server and let your little CGI pipe pass it back across as-is. Once you're done, close that socket to drop that connection, and go service some other client. That's it.

Once all of this works, it becomes simple to add a new feature. First, you write an RPC handler in your server to accept the new request, and make it do whatever you need. Then you change your Javascript client code to make something that collects some input from your user, invokes that RPC handler (indirectly, via your CGI "pipe"), and then consumes its reply and does something useful with it.
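On the server side, "write an RPC handler" can be as small as registering one more function. This is a sketch under my own assumptions: the HANDLERS table, the rpc decorator, and the "add" and "count" operations are all hypothetical names, and your RPC library may well provide its own dispatch.

```python
HANDLERS = {}


def rpc(name):
    """Register a function as the handler for the RPC called `name`."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register


@rpc("add")
def add_word(state, request):
    state.setdefault("buzzwords", []).append(request["word"])
    return {"ok": True}


# The new feature: one more handler, and nothing else on the server changes.
@rpc("count")
def count_words(state, request):
    return {"ok": True, "count": len(state.get("buzzwords", []))}


def dispatch(state, request):
    """Look up the handler named by the request and run it."""
    handler = HANDLERS.get(request.get("op"))
    if handler is None:
        return {"ok": False, "error": "unknown op"}
    return handler(state, request)
```

The accept-parse-respond loop only ever calls dispatch, so it never has to change when a feature is added.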

After getting the hang of this, new web services lose much of that feeling of staring up at a mountain, wondering how to get up there. The "how" part is gone, and now you just need to figure out a "what".