Writing

Software, technology, sysadmin war stories, and more.

Thursday, July 26, 2012

Technicalities versus intent

Remember the days when PGP was published in a book to get around export restrictions? The idea was to put it in a hardback volume which could then be shipped to some other location, at which point the covers could be removed and it could be scanned back in. After a round of OCR processing and perhaps some manual adjustments to correct for scanning errors, you'd have usable source code again.

In that vein, here's a half-baked idea of mine from many years ago which is intended to make a point. At the time, I was thinking of alternate ways to look at data encoding. The core of my idea is simple enough: open a file, read it as a series of 32-bit ints, and print each one as a human-readable decimal number, like 2487982721. Obviously, this is going to be a whole bunch of numbers.
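As a rough sketch of that first step, here is one way it might look in Python. The function name is made up for illustration, and two choices are assumptions the original idea never pins down: the ints are unsigned, 32 bits wide and big-endian, and a short tail is padded with zero bytes. A real scheme would also have to record the original length somewhere, since the padding is otherwise indistinguishable from data.

```python
import struct


def bytes_to_ints(data: bytes) -> list[int]:
    """Split raw bytes into unsigned 32-bit integers.

    Assumptions (not from the original idea): big-endian byte order,
    and zero-byte padding when the length is not a multiple of 4.
    """
    padding = (-len(data)) % 4
    padded = data + b"\x00" * padding
    count = len(padded) // 4
    # ">" selects big-endian, "I" is an unsigned 32-bit integer.
    return list(struct.unpack(f">{count}I", padded))


if __name__ == "__main__":
    # Four bytes that happen to spell out the number used in the text.
    sample = bytes([0x94, 0x4B, 0x9A, 0x81])
    for value in bytes_to_ints(sample):
        print(value)
```

Picking the byte order explicitly matters here: the same four bytes read as little-endian would give a completely different number.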

Next, take each value in turn and feed it to 'number' from the bsdgames collection. This will give you something like this:

two billion. four hundred eighty-seven million. nine hundred eighty-two thousand. seven hundred twenty-one.
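If you don't have bsdgames handy, the spelling-out step is easy enough to approximate. The following is a minimal stand-in, not the actual 'number' program: the function names are hypothetical, it only handles values that fit in 32 unsigned bits, and it imitates the one-line layout shown above, with a period after each group of three digits.

```python
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
SCALES = ["", "thousand", "million", "billion"]


def group_to_words(n: int) -> str:
    """Spell out a value from 1 to 999."""
    parts = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        parts.append(f"{ONES[hundreds]} hundred")
    if rest >= 20:
        tens, ones = divmod(rest, 10)
        parts.append(TENS[tens] + (f"-{ONES[ones]}" if ones else ""))
    elif rest:
        parts.append(ONES[rest])
    return " ".join(parts)


def int_to_words(value: int) -> str:
    """Spell out an unsigned 32-bit value, one sentence per 3-digit group."""
    if not 0 <= value < 2**32:
        raise ValueError("expected an unsigned 32-bit value")
    if value == 0:
        return "zero."
    groups = []
    scale = 0
    while value:
        value, group = divmod(value, 1000)
        if group:  # skip all-zero groups entirely
            words = group_to_words(group)
            if SCALES[scale]:
                words += " " + SCALES[scale]
            groups.append(words + ".")
        scale += 1
    # Groups were collected least significant first.
    return " ".join(reversed(groups))


if __name__ == "__main__":
    print(int_to_words(2487982721))
```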

Repeat until you run out of data. Now you have a whole bunch of really dull English text which just rattles off numbers over and over. Save this to another file. You could then turn this into a book just like PGP.

To get back to your original binary file, you have to map those words back to numbers, and then write the ints back to disk. Assuming you don't have any ambiguities in your encoding (like byte order?) then you might just get the same file back out.
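Going the other way might look something like this. Again, it is only a sketch under assumptions: it expects text in exactly the format shown earlier (lowercase words, a period after each three-digit group, one number per call), the names are invented for illustration, and it writes the ints back as big-endian, which only reproduces the original file if the encoder read them the same way.

```python
import struct

ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()

WORD_VALUES = {word: i for i, word in enumerate(ONES)}
WORD_VALUES.update({word: 10 * (i + 2) for i, word in enumerate(TENS)})
SCALES = {"thousand": 10**3, "million": 10**6, "billion": 10**9}


def words_to_int(text: str) -> int:
    """Turn one spelled-out number back into an integer."""
    total = 0
    for sentence in text.split("."):
        group = 0
        scale = 1
        # "twenty-one" is treated as the two words "twenty" and "one".
        for word in sentence.replace("-", " ").split():
            if word == "hundred":
                group *= 100
            elif word in SCALES:
                scale = SCALES[word]
            else:
                group += WORD_VALUES[word]
        total += group * scale
    return total


def ints_to_bytes(values: list[int]) -> bytes:
    """Pack unsigned 32-bit values; big-endian is an assumption."""
    return struct.pack(f">{len(values)}I", *values)


if __name__ == "__main__":
    line = ("two billion. four hundred eighty-seven million. "
            "nine hundred eighty-two thousand. seven hundred twenty-one.")
    print(ints_to_bytes([words_to_int(line)]).hex())
```

An unknown word raises a KeyError, which is probably what you want after a round of OCR: a loud failure beats silently writing the wrong bytes.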

Oh, but that's complicated. So, you switch to bar codes, but oh no! That's too easy! Now you're obviously sending data around, and you're infringing on so-and-so's copyright! So okay, change the encoding around. People have been sued over a handful of notes in songs which sound somewhat familiar.

Let's go with that and say that a bar code now encodes notes directly rather than being binary WAV information, and oh hey, it just turns out that all current bar codes can be interpreted by this new scheme. Do some modulo magic and map the individual digits to A-G, maybe. Whatever.
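For the sake of argument, the "modulo magic" could be as dumb as the following. Everything here is invented for illustration: the mapping of digit modulo 7 onto the letters A through G, the function name, and the sample digit string, which is not meant to be any real product's code.

```python
NOTES = "ABCDEFG"


def barcode_to_notes(barcode: str) -> str:
    """Map each decimal digit to a note name via digit modulo 7.

    Anything that is not an ASCII digit (spaces, dashes) is ignored.
    """
    return " ".join(
        NOTES[int(ch) % len(NOTES)]
        for ch in barcode
        if ch in "0123456789"
    )


if __name__ == "__main__":
    # A made-up 12-digit string standing in for a bar code.
    print(barcode_to_notes("012345678905"))
```

With ten digits and seven notes, 7, 8 and 9 wrap around to A, B and C, so the mapping is lossy, which hardly matters for the argument being made.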

Then, you can go and point out that many products out there in the world currently encode copyrighted music! Hey! This can of Coke plays the melody to Funkytown! What do we do now? Do we ban those Coke cans because they have the data which is owned by someone else? No? Then why was the book bad when this isn't? Bits are bits! The horse is out of the barn! Hack the planet! Hack the planet!!

If this sounds familiar, then perhaps you have spent too much time around people who are all about the technology and don't consider intent. It doesn't matter how you infringe on something. If some adversary can convince the right people that you intended to distribute their song or movie, look out. You're probably in trouble.