Writing

Software, technology, sysadmin war stories, and more.

Monday, April 1, 2013

Embedding CR and LF to really mess up IRC

Way back in the '90s, IRC was basically the only place to be if you wanted to chat with other people online. Sure, there were probably other weird little enclaves here and there, but it was pretty much the default. Communities came and went, and lots of wars happened. People would fight over "nicks", channels, and anything else you can imagine.

Sometimes, there were bugs in the actual server code, and they could be used to really make life miserable for other people. I saw this one happen back in April of 1996 and never really understood it at the time. Now, I can look back and start making sense of the whole thing.

A rumor started going around that someone had found a bug in the "channel key" code. This was something you could set on your channel which would bar access to people unless they specified that same key when they attempted to join. Without that key, they'd only get an error message. I figured it was the usual "embedded \0" C madness and started playing around with it, but never really got anywhere with that.

Some time later, a few of us found the actual patch which had been going around and tried to learn from it. There was one interesting part in the code which was in the neighborhood. Check it out and see if this makes you go "hmm":

		u_char  *s;
 
		for (s = (u_char *)*parv; *s; s++)
			*s &= 0x7f;

In this case, "*parv" is the key supplied by the user right after it's been through a sanitizer which removes things like spaces and other unsavory characters. In case C isn't your thing, I'll explain what this loop is doing.

This loop just starts at the first character of the key and looks at each one until it hits the end (\0). While it's doing this, it replaces the current character with one which has the high-order bit cleared. It's doing a bitwise AND with 0x7f, which is just 01111111 in binary, so something like 10000001 would turn into 00000001. No big deal, right?

Well, actually, it was. Imagine what would happen if you passed it a string like this: "foo(141)(138)blah", where (141) and (138) are characters with those literal values. First it would pass through the sanitizer above this block, and it wouldn't find any spaces or carriage returns or anything like that, so it would then drop into this for loop. In this loop, it would strip the high bit off the 141 and 138, turning them into 13 (141-128) and 10 (138-128), respectively.
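If you want to watch the arithmetic happen, here's a minimal sketch of that same loop pulled out into a function. The name strip_high_bits() is mine, made up for illustration, and it takes a plain unsigned char pointer instead of the server's parv and u_char, so treat it as a stand-in and not the real ircd code.

```c
/*
 * Hypothetical stand-in for the loop in the server code: clear the
 * high-order bit (bit 7) of every byte in a NUL-terminated string,
 * modifying it in place.
 */
static void strip_high_bits(unsigned char *s)
{
	for (; *s; s++)
		*s &= 0x7f;	/* 141 becomes 13, 138 becomes 10 */
}
```

Feed it the bytes { 'f', 'o', 'o', 141, 138, 'b', 'l', 'a', 'h', 0 } and positions 3 and 4 come back as 13 and 10, while the plain ASCII letters are left alone.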

What's a 13 and a 10? Oh, just a carriage return and a line feed. So now you have this string which is "foo\r\nblah". What happens next is the server will push this mode string out to all of the connections where this channel exists. That means all of the locally-connected clients (people), and all of the servers too. It looks something like this:

nick!user@host.name MODE #channel +k foo
blah

Got that? It's an ASCII protocol, so it uses chars like CR and LF to split up commands. You just managed to escape from the MODE line and onto your own command line. It's the IRC equivalent of blasting 2600 Hz down the line back when that meant something. You now have control of the content in there.
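To see it from the receiving end, here's a rough sketch of how a line-oriented reader might chop up an incoming buffer. This is not the actual ircd parser. split_lines() is a hypothetical helper I wrote for this example, and it assumes any run of CR or LF characters ends a line.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical line splitter, not taken from any real IRC server.
 * Cuts buf in place at CR/LF, stores a pointer to the start of each
 * non-empty line in lines[], and returns how many were found (at
 * most max).
 */
static size_t split_lines(char *buf, char **lines, size_t max)
{
	size_t n = 0;
	char *p = buf;

	while (*p && n < max) {
		p += strspn(p, "\r\n");		/* skip separators */
		if (!*p)
			break;
		lines[n++] = p;			/* start of a line */
		p += strcspn(p, "\r\n");	/* find its end */
		if (*p)
			*p++ = '\0';		/* terminate it */
	}
	return n;
}
```

Hand it "nick!user@host.name MODE #channel +k foo\r\nblah\r\n" and it reports two lines. As far as the reader is concerned, "blah" is a command of its own and has nothing to do with the MODE line before it.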

What can you do at this point? Almost anything. Maybe you want to kill someone's connection, so you send "foo(141)(138)KILL victim :ha ha", and it goes out like this:

nick!user@host.name MODE #channel +k foo
KILL victim :ha ha

Lots of bad stuff started happening. Not surprisingly, a bunch of people tore down their servers for immediate patching and restarted a few minutes later.

What was funny is that the patch added a warning which would go to the server console, and it would say if someone was trying to abuse the "+k bug". Then, if you were dumb enough to try this on a patched server, the operators could see you and ban you for being annoying.

It was pretty crazy. It's also another example of what can happen when you use in-band signaling and have ambiguity between your field or record separators and the actual data which is provided by users. It screws up protocols like this, it makes life interesting for the web (with all of the angle-bracket and ampersand escaping we have to do), and it throws monkey wrenches into Apache log parsers.
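One general defense is to check the data in the form it will actually have when it hits the wire, meaning after every transformation has already been applied. Here's a minimal sketch of that idea for a channel key. To be clear, this is not the patch that went around back then, and key_is_safe() and its rules (printable, non-space ASCII only, after masking) are assumptions I picked for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical check, not the real ircd fix: accept a key only if
 * every byte, AFTER clearing the high bit, is a printable non-space
 * ASCII character (0x21 through 0x7e). Empty keys are rejected too.
 */
static bool key_is_safe(const unsigned char *key)
{
	if (*key == '\0')
		return false;

	for (; *key; key++) {
		unsigned char c = *key & 0x7f;	/* what the wire will see */

		if (c <= ' ' || c == 0x7f)
			return false;		/* control char, space, or DEL */
	}
	return true;
}
```

With this, "secret" passes, but a key containing the bytes 141 and 138 is rejected because they turn into CR and LF once the mask is applied. The ordering is the whole point: sanitizing first and transforming second is exactly what let the bad bytes through.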

For more on this fun little problem, check out the original post from 17 (!) years ago.