Writing

Software, technology, sysadmin war stories, and more.
Wednesday, January 9, 2013

The magic backup on a floppy disk

Writing about monitoring systems the other day reminded me of a particularly nasty hack I once learned about. It was the sort of thing I might have set up in a pinch a long time ago, but thankfully never did. This particular hack was not a one-off and it wasn't kept in a cave. It was actually turned into a product.

Think way back to the '90s. You lease a machine at a glorified colocation center. It's just a generic minitower whitebox with the cheapest components available. It is literally anything they can get which will run the right operating systems and won't catch fire too often. Put simply, the hardware is junk.

This wasn't a big deal for most people since super-cheap hardware means you can keep a huge stack of replacement parts on hand. When a power supply or motherboard blows up, you can just rip the machine down, swap it out and get it back online in just a couple of minutes. Sure, the site is down for the duration, but if it doesn't happen that often, who cares?

Many customers were okay with this, but they were still nervous about losing their hard drives. That's where the evil hack comes in. It was called "bounce backup", and it involved unnatural things which go "bump" in the middle of the night.

Systems with this "bounce backup" scheme would have a second hard drive installed. Then some special software would be installed on the usual primary disk, and a floppy drive would be physically added to the machine. (Normally these machines didn't have or need removable media.) The floppy had just enough of an "OS" to start a clone from one drive to the other.

I think it used "dd" or something else that was also the wrong way to do it, but it's been many years since someone told me about this and I may have gotten that particular detail wrong. The point is that the machine would reboot into a weird little floppy-based environment and just throw all of the data from hda onto hdb (or hdc, or whatever). It was dead to the world while this was going on.
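For anyone who hasn't seen this kind of clone up close, here is a minimal sketch of what a "dd"-style copy amounts to: read fixed-size blocks from one device and write them to the other, with no idea what a filesystem is. This is an illustration in Python, not the actual tool from the floppy. The function name, the block size and the image file names are all made up for the example.

```python
import os
import tempfile


def clone_device(src_path, dst_path, block_size=64 * 1024):
    """Copy src to dst block by block, the way a raw clone would.

    Returns the number of bytes copied. The paths are hypothetical: on a
    real box they would be device nodes, here they can be plain files.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # end of the source
                break
            dst.write(block)
            copied += len(block)
        # Push everything to the disk before the caller reboots the machine.
        dst.flush()
        os.fsync(dst.fileno())
    return copied


if __name__ == "__main__":
    # Stand-ins for the two drives, so this can run anywhere.
    workdir = tempfile.mkdtemp()
    primary = os.path.join(workdir, "primary.img")
    secondary = os.path.join(workdir, "secondary.img")
    with open(primary, "wb") as f:
        f.write(os.urandom(1024 * 1024))
    print(clone_device(primary, secondary), "bytes cloned")
```

Note that a copy like this moves every block whether it holds data or not, so the time it takes depends on the size of the drive rather than on how much is stored on it.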

Remember that I started this post with a mention of monitoring systems. Here's why. Obviously, if the machine is rebooting into a closed environment where it's off the network and all of its services are down, it's going to set off the monitoring alarms. That is... it will, unless the machine itself can somehow turn off monitoring when it "goes down for bounce", as they used to call it.
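To make the masking idea concrete, here is a rough sketch of a monitoring host that lets a machine silence its own alarms. I don't know how the real system did it, so everything here is an assumption: the class, the method names, and especially the expiry time on the mask, which I added so a machine that never comes back eventually sets off an alarm anyway.

```python
import time


class Monitor:
    """Toy monitoring host that supports self-service masking (hypothetical)."""

    def __init__(self):
        # host name -> timestamp until which alerts are suppressed
        self._masked_until = {}

    def mask(self, host, seconds, now=None):
        """Host says: stop worrying about me for this many seconds."""
        now = time.time() if now is None else now
        self._masked_until[host] = now + seconds

    def unmask(self, host):
        """Host says: I'm back, watch me again."""
        self._masked_until.pop(host, None)

    def should_alert(self, host, is_up, now=None):
        """Alert only if the host is down and not inside a mask window."""
        now = time.time() if now is None else now
        if is_up:
            return False
        return now >= self._masked_until.get(host, 0.0)


if __name__ == "__main__":
    mon = Monitor()
    mon.mask("web42", seconds=3600)
    print(mon.should_alert("web42", is_up=False))  # masked, so no alert
    mon.unmask("web42")
    print(mon.should_alert("web42", is_up=False))  # unmasked and down
```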

So here's the whole picture. There's this cron job on a Linux box. When it runs, it fires off some kind of signal to the monitoring host which says "stop worrying about me, I'm checking out". Then it twiddles something on the floppy to make it "take over" and reboots. The machine comes back up off the floppy and runs the clone from one drive to the other. Then it twiddles the floppy again so it won't take control and reboots again.

A bit later, it comes back up off the normal hard drive-based install and goes back to normal. I'm not sure exactly how they re-enabled monitoring. It's possible they added some kind of init script to the boot sequence to always make it proclaim its return to the world of the living when this "product" was added to a machine.
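The whole cycle is easier to follow as a little simulation. This is only a model of the sequence described above, not the real scripts: the class, the flag on the floppy and the log messages are invented for illustration, and the "re-enable monitoring at boot" step is my guess at how it worked, as noted.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """A leased machine with a bounce floppy (all names hypothetical)."""

    floppy_takes_over: bool = False  # the thing the cron job "twiddles"
    monitored: bool = True
    log: list = field(default_factory=list)

    def boot(self):
        if self.floppy_takes_over:
            # Floppy environment: off the network, every service down.
            self.log.append("boot: floppy")
            self.log.append("clone: primary -> secondary")
            # Twiddle the floppy back so the next boot uses the hard drive.
            self.floppy_takes_over = False
            self.log.append("floppy: disarmed")
            self.boot()  # second reboot
        else:
            self.log.append("boot: hard drive")
            # Guessed init script: announce the return to the living.
            if not self.monitored:
                self.monitored = True
                self.log.append("monitoring: re-enabled")


def nightly_bounce(box):
    """What the cron job does, in order."""
    box.monitored = False
    box.log.append("monitoring: masked")
    box.floppy_takes_over = True
    box.log.append("floppy: armed")
    box.boot()  # first reboot
    return box.log


if __name__ == "__main__":
    for step in nightly_bounce(Box()):
        print(step)
```

Laid out like this, the fragile part is easy to spot: the machine is only watched again if every step after "monitoring: masked" actually happens.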

Years later, the monitoring system was changed out, and the new one didn't have this particular feature. Those machines still running "bounce" would no longer be masked out of monitoring when the time came. I think those customers were "encouraged" to upgrade to newer equipment and some other kind of backup system.

I do know that many customers never got that far. Years later, I would still encounter crusty old machines with crazy hard drive setups, wild cron jobs, and yes, the occasional dusty floppy disk.

I imagine this seemed like a good idea to someone at some point. All I can figure is that a whole bunch of alcohol was involved.