Writing

Software, technology, sysadmin war stories, and more.
Wednesday, October 26, 2011

Build your own feed reader

It seems that Google Reader is the next thing which will be scooped up by the "everything must be social" maw that is Plus. It comes as no surprise to me, since both Reader and Buzz have been set to die on the same chopping block this whole year. It's just taking forever to happen.

Now it seems like lots of people are freaking out about the pending loss of a service they dearly enjoy. I feel their pain, although for a different reason. I've been trying to get away from all things Google for the past six months, and this has been a tough one. I wanted to keep up with my feeds without logging in.

However, I've been lazy about it. I've been logging in, catching up, and then logging back out. It's had the side effect of reducing my polls to once every couple of days instead of several times a day. This was bothersome but not enough motivation to get moving on a fix.

It took this announcement of impending G+ integration to get me moving. I started working on a solution for myself. After a couple of days I had it to the point where I no longer needed to log in to Reader. It's utterly basic and devoid of features, but it does the job for me.

I'm writing this post to encourage others to do the same. It's easier than you think. I started by snagging my entire list of feeds from the export feature. Then I picked one URL from it and pulled it in with wget to take a peek. Anyone can do the same.
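As a rough sketch of that first step, the snippet below pulls the feed URLs out of an exported subscription list. It assumes the export is an OPML file whose outline elements carry a double-quoted xmlUrl attribute, and it uses a quick regex scan rather than a real XML parse. The function name extract_feed_urls is made up for this illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collect the value of every xmlUrl="..." attribute found in the text.
// Assumption: the export is OPML and attribute values are double-quoted.
// This is a shortcut, not an XML parser: entities such as &amp; inside
// a URL are returned exactly as they appear in the file.
std::vector<std::string> extract_feed_urls(const std::string& opml) {
    static const std::regex attr(R"rx(xmlUrl="([^"]*)")rx");
    std::vector<std::string> urls;
    for (std::sregex_iterator it(opml.begin(), opml.end(), attr), end;
         it != end; ++it) {
        urls.push_back((*it)[1].str());  // capture group 1 is the URL
    }
    return urls;
}
```

Any one of the returned URLs can then be handed to wget for a first look at what a feed actually contains.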

Not surprisingly, both Atom and RSS are XML. I'm not exactly a fan, but you have to go with what you have. libxml2 is basically installed everywhere now, so I turned to its header files under /usr/include and its actual documentation/examples to get some idea of how to navigate things. It'll break your brain if you are like me and have never wrangled XML before, but you'll live.

A bit of wrangling with libxml2 calls eventually yielded something that would read in a feed and spit out a nice STL data structure full of posts. From there it was no big deal to make it emit very simple HTML output. All you need is the title, the link, and the date. The rest, even the content, is unnecessary for a first cut.
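The output side can be sketched in a few lines. The version below is an illustration under assumptions, not the reader's actual code: Post, html_escape, and render_html are hypothetical names, the date is kept as whatever string the feed supplied, and the page is just an unordered list of links.

```cpp
#include <string>
#include <vector>

// Hypothetical record holding only the three fields a first cut needs.
struct Post {
    std::string title;
    std::string link;
    std::string date;  // the feed's own date string, left unparsed
};

// Escape the characters that are special in HTML text and attributes.
std::string html_escape(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            case '"': out += "&quot;"; break;
            default:  out += c;
        }
    }
    return out;
}

// Emit one <li> per post: a linked title followed by the date.
std::string render_html(const std::vector<Post>& posts) {
    std::string html = "<ul>\n";
    for (const Post& p : posts) {
        html += "<li><a href=\"" + html_escape(p.link) + "\">" +
                html_escape(p.title) + "</a> (" + html_escape(p.date) +
                ")</li>\n";
    }
    html += "</ul>\n";
    return html;
}
```

Escaping matters even in a throwaway reader, since titles routinely contain ampersands and angle brackets that would otherwise break the page.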

What about actually fetching the contents? Well, that's a matter of calling libcurl. I was in a groove of trying very hard to not reinvent any wheels, so calling upon another library which is bound to already be installed sounded good to me. I must pass on my compliments to the libcurl devs for the whole "easy" API. It gets the job done and Just Works.

Given all of this, the pieces start to come together pretty quickly. libcurl gets the data, libxml2 lets me navigate it, and some custom glue turns it into a reasonable structure which can then be used to build an output page.

Anyone can do this. It puts you back in control of your own feed reading experience.

Consider it an excellent use of an afternoon.