Software, technology, sysadmin war stories, and more.
Monday, July 18, 2011

Another student copies my code, and I declare war

Last year, I found myself having to take some programming classes in order to fulfill requirements for a degree. It wasn't a big deal, and in fact it felt a little like cheating since my own life experience paved the way for understanding all of our assignments. These classes were a little unusual in that, besides the usual thing where you'd submit directly to our instructor, we also had these things called discussion boards. That's where it got weird.

Think of something like reddit, or Hacker News, or even Slashdot, but with your fellow students uploading code. Half of our assignments worked this way. We were also expected to download and analyze the code of a few other students for each assignment like this and give useful feedback. Having worked in a place with a culture of code reviews, this was no big deal.

I got into something of a pattern. I would do the work and upload it, and wait for others to show up. Usually nobody else had posted yet. Eventually, they'd start filtering in, and I'd bring them in, check them out, comment, and move on. I could usually keep up with the inflow, especially as things got into the higher classes and the number of students shrank.

One day, something bizarre happened. I downloaded some source, and it looked familiar. It looked too familiar. It was almost as if someone had adopted all of my little nuances and other wacky style things which I brought with me from my years of doing whatever. At first, I thought maybe someone was just a really big fan of my style and was learning from it, and emulating it really well, but I kept noticing things that did not make sense.

Finally, I decided to take myself out of the equation and let the computer tell me if it was an homage or if it was just a plain rip-off. I started by grabbing a sample source file for the same class from my submission and this other person's submission. Then I flushed both files through perl (another case where I only use it to run regexes) to remove things like DOS line endings, empty lines, and all spaces. This would squish things down into a morass of characters, but it would also remove trivial fuzzing by this other person.

Then I used grep to remove anything before an "import" line, knowing the header text would certainly be completely different. Now with those two output files in place, I ran diff against them.

There was no output from diff. Both files were identical.

Just to be sure (and to make a more interesting typescript output for later use in my report), I ran a 'sum' tool against both. md5sum, sha1sum, whatever, that sort of thing. Identical. Every time.
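For anyone who wants to try the same trick, here is a rough sketch of the idea in Python rather than my original perl + grep + diff pipeline. The function names and the two sample strings are made up for illustration, and it assumes the header ends at the first line starting with "import". It strips the header, normalizes line endings, removes all whitespace and empty lines, and then compares both the text and a SHA-1 of it.

```python
import hashlib
import re


def normalize(source: str) -> str:
    """Drop the header, DOS line endings, all whitespace, and empty lines."""
    lines = source.replace("\r\n", "\n").split("\n")
    # Assumption: everything before the first "import" line is header text.
    for i, line in enumerate(lines):
        if line.lstrip().startswith("import"):
            lines = lines[i:]
            break
    # Remove every whitespace character within each line.
    squished = (re.sub(r"\s+", "", line) for line in lines)
    # Keep only the lines that still have something in them.
    return "\n".join(s for s in squished if s)


def fingerprint(source: str) -> str:
    """SHA-1 of the normalized text, standing in for the 'sum' tools."""
    return hashlib.sha1(normalize(source).encode("utf-8")).hexdigest()


# Hypothetical samples: same code, different header, spacing, line endings.
mine = "// Author: me\nimport java.net.*;\n\nclass A {\n    int x = 1;\n}\n"
theirs = (
    "// Author: someone else\r\n// Date: later\r\n"
    "import java.net.*;\r\nclass A {\r\n  int x=1;\r\n}\r\n"
)

print(normalize(mine) == normalize(theirs))
print(fingerprint(mine) == fingerprint(theirs))
```

If both comparisons come back true after this much squishing, the only differences between the files were the header and the whitespace, which is exactly the kind of trivial fuzzing this is meant to see through.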

This guy had the gall to rip off the code of the most eagle-eyed person in the whole class, and then submit it to that same session. He could have just taken the class over and used it then if he was that incompetent. Or he could have saved it for the next assignment, which built on this one and went straight to our teacher only. But nope, he posted in a way where I could see it, and see it I did.

True to my nature, I called him on it publicly. The offense had been committed in public, so I exposed it in the same venue. Naturally, he had some excuse, like how there's only one way to write this kind of stuff, and all sorts of other ridiculous things which don't hold water. Instead of owning up to it, he dug the hole deeper. At this point, it was game on. I gave him both barrels of evidence.

I finally zapped him by saying, hey, you didn't change the IP address consistently everywhere. He, like many people, ran an RFC 1918 netblock on his network, and I did not. I had something completely different going on to match my own home network, and he did a lousy job of flipping the numbers around. What he created was something which wouldn't function on either network. It was clear he had just edited the code and resubmitted it without ever running it.
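That kind of half-finished edit is easy to spot mechanically, too. Here is a small sketch, again in Python and again not what I actually ran at the time: it pulls every IPv4 literal out of a source file and sorts them into RFC 1918 addresses and everything else. The function names and the sample addresses are invented for illustration. A file with both kinds in it deserves a closer look.

```python
import ipaddress
import re

# The three private blocks defined by RFC 1918.
RFC1918 = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

# Loose match for dotted quads; ipaddress does the real validation.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def classify_addresses(source: str) -> dict:
    """Sort the IPv4 literals in source into RFC 1918 and everything else."""
    found = {"rfc1918": set(), "other": set()}
    for text in IPV4.findall(source):
        try:
            addr = ipaddress.IPv4Address(text)
        except ValueError:
            continue  # looked like an address but isn't one, e.g. 999.1.1.1
        key = "rfc1918" if any(addr in net for net in RFC1918) else "other"
        found[key].add(text)
    return found


def is_mixed(source: str) -> bool:
    """True when a file uses both private and non-private addresses."""
    found = classify_addresses(source)
    return bool(found["rfc1918"]) and bool(found["other"])


# Hypothetical snippet with one address changed and one left behind.
sample = 'host = "192.168.1.10"\ngateway = "198.51.100.1"\n'
print(classify_addresses(sample))
print(is_mixed(sample))
```

A mixed result is not proof of anything by itself, since plenty of honest code talks to both kinds of address. It is just a quick pointer to the lines worth reading by hand.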

I reported all of this to our instructor, and the student vanished from the course. What really irked me is that the instructor didn't catch it herself, and apparently didn't have anything even remotely close to my dumb perl + grep + diff cheat detector hack which took all of 30 seconds to do.

Maybe she was a faker, too.