
Friday, March 2, 2012

Another half-baked idea for life in a SOPA world

Ordinary people might stand to learn a thing or two from the malware of the world if certain restrictions are added to the Internet. Some of these zombie machines have been running software designed to evade detection, and some of that software includes mechanisms intended to keep its evil payload communicating with its controller despite efforts to shut it down.

Here's the progression of my thoughts on this. Back around 2003, I had an idea and scribbled it down into my own private diary. It was yet another idea which was unlikely to improve the world, so I didn't share it or experiment with it. It basically involved hiding covert instructions in DNS.

Around that time, systems would get compromised and sometimes a listening port would be added. The botnet controllers would then connect to those ports to command their machines. This didn't always work too well: those connections could be detected and filtered, and they tended to draw attention to themselves. Zombie machines needed to lie low until they were needed.

The next thing I heard about was how a bunch of ISPs had to work together to pull a dozen or so IP addresses offline in a short timeframe to prevent the latest set of zombies from doing something bad. Apparently they had determined that these hosts were all "phoning home" to a hard-coded set of IP addresses, and by interrupting that, they could shut down the botnet.

Apparently a later version of this changed to actual hostnames and/or domain names, but then some quick work managed to seize those domains before much damage could be done. Again, the "good guys" got there ahead of some hard-coded schedule.

I looked at this and figured that hard-coded addresses and domain names were just too obvious. Anything which is explicitly listed will be tracked down and squashed. Instead, I'd have some kind of rule for generating candidate domain names. Perhaps it would make up strings within some range of lengths, with digits in some positions and letters in others. Then it might add a dash or two just to make it interesting.

When I wanted to deliver a command to my horde of machines, I'd just register a domain somewhere with fake information which matched the rule. Eventually, my zombie machines would discover it and read their instructions from my DNS entries for that domain. If done properly, the rule would be able to expand to a huge number of domains so that it would be impossible to register them all in advance.
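Just to make that concrete, here's a rough sketch in Python of what such a rule might look like. The hashing, the name lengths, and the ".com" ending are all invented for illustration; the only real point is that anything which knows the rule and the date can derive the same candidate list without a single hard-coded name.

import datetime
import hashlib
import string

def candidate_domains(day, count=50):
    """Generate a deterministic list of candidate domains for a given day."""
    letters = string.ascii_lowercase
    digits = string.digits
    domains = []
    for i in range(count):
        seed = f"{day.isoformat()}-{i}".encode()
        h = hashlib.sha256(seed).hexdigest()
        # Letters in some positions, digits in others...
        name = []
        for j, c in enumerate(h[:12]):
            if j % 4 == 3:
                name.append(digits[int(c, 16) % 10])
            else:
                name.append(letters[int(c, 16) % 26])
        # ... plus a dash, just to make it interesting.
        name.insert(6, "-")
        domains.append("".join(name) + ".com")
    return domains

if __name__ == "__main__":
    for domain in candidate_domains(datetime.date.today(), count=5):
        print(domain)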

Obviously, here in 2012, this is old news, but in 2003, I thought it was novel and wrote it down. Two years later, in December 2005, I learned about the "Sober" worm. It seems to have used a date-based algorithm to do exactly that. I considered that another successful prediction and left it at that.

Now you hear about places which analyze their recursive DNS server logs to see what clients are looking for. If they start making queries for certain patterns, it might trip an alarm somewhere. Then that system can be kicked off the network and checked for an infection. This kind of detection is usually noticed by ordinary users when it flags a false positive.
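To sketch what that log analysis might look like (using the invented naming pattern from the earlier sketch, and a log format I'm making up here), it really comes down to counting suspicious lookups per client. Real detectors lean on fancier signals like query entropy and NXDOMAIN rates, but the shape is the same.

import re
from collections import Counter

# Assumed log format: "<client-ip> <queried-name>", one query per line.
# The regex matches the invented naming rule from the earlier sketch.
SUSPICIOUS = re.compile(r"^[a-z]{3}\d[a-z]{2}-[a-z]\d[a-z]{3}\d\.com$")

def flag_clients(log_lines, threshold=10):
    """Count rule-matching queries per client and flag the noisy ones."""
    hits = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) != 2:
            continue
        client, qname = fields
        if SUSPICIOUS.match(qname.lower()):
            hits[client] += 1
    return [client for client, count in hits.items() if count >= threshold]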

What generates a false positive, you might ask? Well, things like Chrome make random DNS queries at times to see what the network looks like. If you get the same answers for a bunch of random strings, odds are you're behind some kind of screwy DNS server which is hijacking what would otherwise be an NXDOMAIN. If your machine has been flagged but checked out clean, that might be why.
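The underlying check is simple enough to sketch: resolve a few names that should not exist, and if they all come back with an answer (especially the same answer), something upstream is rewriting NXDOMAIN. This is not Chrome's actual probe, just the idea behind it.

import random
import socket
import string

def nxdomain_hijack_suspected(tries=3):
    """Resolve a few names that almost certainly don't exist. Honest
    resolvers return NXDOMAIN; a hijacking one answers anyway."""
    answers = set()
    for _ in range(tries):
        name = "".join(random.choice(string.ascii_lowercase)
                       for _ in range(16)) + ".com"
        try:
            answers.add(socket.gethostbyname(name))
        except socket.gaierror:
            return False  # at least one honest "no such domain"
    return len(answers) == 1  # same bogus answer every single time

if __name__ == "__main__":
    print("NXDOMAIN hijacking suspected:", nxdomain_hijack_suspected())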

Recent events with different governments seizing domain names inspired me to start thinking about this kind of thing again. This time, it's not about a hypothetical botnet which needs to avoid detection. Instead, it would be something mundane, like living in a future where reddit.com had been yanked but I still really wanted to see some pictures of cats or ponies that day.

I started thinking about crazy schemes like new.net, which claimed to invent a bunch of new top-level domains. All it really did was stick a ".new.net" on the end of certain queries, so that "foo.bar" actually wound up resolving as "foo.bar.new.net".

By extension, you could pick a bunch of domains which would be "helpers" in the event certain sites ever went down. Then, instead of trying to resolve reddit.com, you'd actually try to resolve reddit.com.(domain1) and reddit.com.(domain2) and so on.
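In code, that's barely more than a loop, and it's essentially the same trick new.net pulled with its ".new.net" suffix. The helper names here are obviously placeholders.

import socket

# Placeholder helper domains; in practice these would be whatever the
# participants had agreed on ahead of time.
HELPERS = ["helper-one.example", "helper-two.example"]

def resolve_with_helpers(name):
    """Try the name as-is, then retry it under each helper suffix."""
    for candidate in [name] + [f"{name}.{helper}" for helper in HELPERS]:
        try:
            return candidate, socket.gethostbyname(candidate)
        except socket.gaierror:
            continue
    return None

# resolve_with_helpers("reddit.com") would try reddit.com, then
# reddit.com.helper-one.example, then reddit.com.helper-two.example.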

Of course, once this had gone on for a while, those helper domains themselves would probably be targeted for takedowns. Once they were all gone, you'd be right back to square one: unable to get to your favorite web site.

So then, to tie this all together, the thought occurred to me: what if the helper domains were unpredictable? Take the kind of random craziness which was demonstrated in the zombie/malware code and use it instead to dodge external attacks on the DNS. Once you find a working helper domain, you're back in business.
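Gluing the two earlier sketches together gives you the whole half-baked idea in about a dozen lines: derive today's helper candidates with the same kind of rule, then walk them until one of them resolves the name you actually wanted. This assumes the candidate_domains() function from the earlier sketch is available.

import datetime
import socket

def find_working_helper(blocked_name):
    """Walk today's generated helper candidates (candidate_domains() from
    the earlier sketch) until the blocked name resolves under one of them."""
    today = datetime.date.today()
    for helper in candidate_domains(today, count=50):
        candidate = f"{blocked_name}.{helper}"
        try:
            return candidate, socket.gethostbyname(candidate)
        except socket.gaierror:
            continue
    return None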

Since this is a half-baked idea, it's going to be full of holes. One obvious problem is identifying the real site if a bunch of them seem to work at the same time and all give different answers. How would you know who to trust with your login and password in this world? You probably wouldn't be able to solve this one trivially.

The best solution I have been able to think of so far is to start getting people in the habit of attaching some kind of strong and yet unique signature to things they wish to have identified as belonging together. Imagine if something happened and you could no longer load up rachelbythebay.com/w for your daily dose. Then you found out that my writing had popped up at another site. How could you trust it?

Well, if both the old posts and the new posts were signed by the same entity (like a PGP key), it might help. You still could never be totally sure what was going on, but it would be better than nothing. If this actually panned out, then it really wouldn't matter what URL I happened to use to post my updates, as long as you could eventually find it. Once you found it, you'd know it was probably the same author behind the posts, since the crypto signatures would match up.
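Here's a sketch of the checking side, leaning on the stock gpg command line rather than anything exotic: verify a detached signature, pull out the signing key's fingerprint, and compare it against one you already trust. The file names and the fingerprint are placeholders.

import subprocess

def signed_by(data_file, sig_file):
    """Verify a detached signature with gpg and return the signing key's
    fingerprint, or None if verification fails."""
    proc = subprocess.run(
        ["gpg", "--status-fd", "1", "--verify", sig_file, data_file],
        capture_output=True, text=True)
    for line in proc.stdout.splitlines():
        fields = line.split()
        # On success, gpg emits "[GNUPG:] VALIDSIG <fingerprint> ..."
        if len(fields) >= 3 and fields[:2] == ["[GNUPG:]", "VALIDSIG"]:
            return fields[2]
    return None

KNOWN_FINGERPRINT = "..."  # the author's previously published fingerprint

def probably_same_author(data_file, sig_file):
    """Two posts with matching fingerprints were signed by the same key,
    no matter which URL they were fetched from."""
    return signed_by(data_file, sig_file) == KNOWN_FINGERPRINT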

At that point, URLs might not be as important. You could have roaming bots which go looking for content and make a note of who signed it. Then, when you want to find content by someone, instead of loading up their URL, you ask your swarm of bots what they found with that signature. Where they found it wouldn't really matter.
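The "swarm of bots" part reduces to a pretty small data structure. Whenever a crawler verifies something (say, with the signed_by() sketch above), it files the URL under that fingerprint, and lookups then go by signer instead of by location.

from collections import defaultdict

class SignerIndex:
    """Toy index keyed by signing-key fingerprint instead of by URL."""

    def __init__(self):
        self._by_signer = defaultdict(set)

    def record(self, fingerprint, url):
        """Called by a crawler after it has verified a signature."""
        self._by_signer[fingerprint].add(url)

    def locations_for(self, fingerprint):
        """Everywhere this signer's content has been spotted so far."""
        return sorted(self._by_signer[fingerprint])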

This kind of focus on content and authors might create a level of abstraction on top of URLs, just as DNS adds abstraction on top of IP addresses, and IP addresses add abstraction on top of things like Ethernet addresses.

Still, at this time, there's no need for it. Until it becomes absolutely necessary to use this kind of craziness to actually get things done, nobody will bother. I don't blame them, either. Without a use case, it's just a bunch of architecture astronauting.