Software, technology, sysadmin war stories, and more.
Tuesday, April 23, 2013

Apple's Time Machine hates me again

Oh dear. It seems I have run into yet another fundamental issue in some of Apple's "plumbing". It's clear this is something they just can't get right.

First, some background. For my backups, one of the things I use is called a Time Capsule. It's a short plastic box with a hard drive in it which sits on my network and gets super hot. It's basically their official solution for "network attached storage". Then, there's some software on my machines called Time Machine, and it's supposed to wake up from time to time and automatically push updates to that box.

When it works, it works. The problem, just like with so many other Apple things, is what happens when it breaks. It's obviously unable to recover by itself and requires the kind of manual hand-holding and evil hacks which should not exist in this situation.

The latest failure mode is that it will just stop backing up. There's no warning that anything has gone wrong. The little "circular arrow around a clock" icon on the top bar looks the same. However, if you click on it, you'll soon realize something is up. Instead of saying something like "Last backup: (some time today)", it says "Last backup: April 15, 2013".

Yes, that's what mine is saying right now. It hasn't backed up in over a week, and yet it didn't feel the need to tell me about this. This is for a system which is supposed to run hourly, assuming the machine is on my home network and is turned on (it's a laptop). If you figure 24 hours in a day, and 8 days since it last ran, that's *192 missed backups* with not so much as a peep from this thing. It's inexcusable!
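For the record, that arithmetic is trivial to script, which is what makes the silence so annoying. Here is a minimal sketch, assuming you already have the last backup time and the current time as Unix timestamps in seconds. The function name is made up for illustration, and how you obtain those timestamps is left out on purpose.

```shell
# Hypothetical helper: estimate how many hourly runs were skipped.
# Both arguments are Unix timestamps in seconds (an assumption of this sketch).
missed_hourly_backups() {
    last_backup=$1
    now=$2
    # One expected run per 3600 seconds, rounded down.
    echo $(( (now - last_backup) / 3600 ))
}

# Eight days between the last run and "now", expressed in seconds.
missed_hourly_backups 0 $(( 8 * 24 * 3600 ))   # prints 192
```

This is the worst case, since a laptop which is asleep or away from the home network would not have run all of those anyway.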

This actually happened to me once before, a couple of months back. The fix is to roll up your sleeves and play sysadmin on a box which claims to take care of everything for you. First, you need to click around and make sure the top level of the Time Capsule is mounted. This usually just means bringing it up in the Finder. Once it shows up as mounted under /Volumes, you're ready for the next step.

Now you have to find the ".sparsebundle" which is associated with the machine in question. It's just the big blob which is the encrypted filesystem. First, it has to be attached... without mounting it.

hdiutil attach -nomount -readwrite foobar.sparsebundle

It should spit out a bunch of /dev/disk* lines in response. One of them should be "Apple_HFSX", and that's the one which now needs to be fscked. Yeah, the same thing which you probably haven't run by hand in years on your Linux box, because those systems have figured themselves out by now. You get to run it by hand. It's going to take forever, and it'll turn your laptop into a nasty little space heater, but that's how it goes.

fsck_hfs -rf /dev/disk(whatever)

This will take hours, even if you have a machine which is otherwise idle and has a gigabit link back to the Time Capsule box. If the machine is busy or has a slower link in between, I imagine it might take days. If you're lucky, it will do whatever needs to be done, and will just dump you back out to the prompt.

If everything went well, then you can use "hdiutil detach" on that same /dev/disk device to detach the image and then switch your normal backups back on. It might just "grab on" and start backing up. Since it's been stuck for a long time, this means many more hours of having the machine bake itself while it pushes data over to the network disk.
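The whole attach, fsck, detach dance could be wrapped up roughly like this. It's a sketch under a few assumptions, not a tested tool: the function name is hypothetical, the Time Capsule share is assumed to be mounted already, the first line hdiutil prints is assumed to be the whole-disk device, and nothing here deals with any passphrase prompt for the encrypted image.

```shell
# Hypothetical wrapper around the manual steps described above.
# $1 is the path to the .sparsebundle on the already-mounted share.
repair_sparsebundle() {
    bundle=$1

    # Attach without mounting and keep the /dev/disk* lines it prints.
    attach_output=$(hdiutil attach -nomount -readwrite "$bundle") || return 1

    # The Apple_HFSX line names the device which needs to be checked.
    hfs_dev=$(printf '%s\n' "$attach_output" | awk '/Apple_HFSX/ { print $1; exit }')
    # Assumption: the first line is the whole-disk device, used for the detach.
    whole_dev=$(printf '%s\n' "$attach_output" | awk 'NR == 1 { print $1 }')

    if [ -z "$hfs_dev" ]; then
        echo "no Apple_HFSX device found" >&2
        hdiutil detach "$whole_dev"
        return 1
    fi

    # This is the slow part: hours, possibly much longer.
    fsck_hfs -rf "$hfs_dev"
    status=$?

    # Detach whether or not fsck succeeded, then report fsck's result.
    hdiutil detach "$whole_dev"
    return $status
}

# Usage, with the same placeholder name as above:
#   repair_sparsebundle foobar.sparsebundle
```

Detaching even when fsck fails is a deliberate choice here, so the image is not left attached behind your back.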

What I can't figure out is why TM apparently won't do this "attach + fsck" thing by itself. It's obviously what gets things working again, and it's supposed to be in charge of that sparsebundle file, so what's the deal? It would be even better if the TC itself did all of it locally to offload the work from my laptop, but that would imply having a disk drive with actual brains. That's been out of style since Commodore went out of business.

Macs don't remove the sysadmin duties from your life. They just hide them and hope you don't notice.