Wednesday, May 15, 2024

SSD death, tricky read-only filesystems, and systemd magic?

Oh, yesterday was a barrel of laughs. I've said a lot that I hate hardware, and it's pretty clear that hardware hates me right back.

I have this old 2012-ish Mac Mini which has long since stopped getting OS updates from Apple. It's been through a lot. I upgraded the memory on it at some point, and maybe four years ago I bought one of those "HDD to SSD" kits from one of the usual Mac rejuvenation places. Both of those moves gave it a lot of life, but it's nothing compared to the flexibility I got by moving to Debian.

Then a couple of weeks ago, the SSD decided to start going stupid on me. This manifested as smartd logging some complaint and then also barking about not having any way to send mail. What can I say - it's 2024 and I don't run SMTP stuff any more. It looked like this:

Apr 29 07:52:23 mini smartd[1140]: Device: /dev/sda [SAT], 1 Currently unreadable (pending) sectors
Apr 29 07:52:23 mini smartd[1140]: Sending warning via /usr/share/smartmontools/smartd-runner to root ...
Apr 29 07:52:23 mini smartd[1140]: Warning via /usr/share/smartmontools/smartd-runner to root produced unexpected output (183 bytes) to STDOUT/STDERR:
Apr 29 07:52:23 mini smartd[1140]: /etc/smartmontools/run.d/10mail:
Apr 29 07:52:23 mini smartd[1140]: Your system does not have /usr/bin/mail.  Install the mailx or mailutils package

Based on the "(pending)" thing, I figured maybe it would eventually reallocate itself and go back to a normal and quiet happy place. I ran some backups and then took a few days to visit family. When I got back, it was still happening, so I went to the store and picked up a new SSD, knowing full well that replacing it was going to suck.
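
For what it's worth, watching that attribute directly is easy enough. Something like this will show the pending and reallocated counters; /dev/sda is just where the drive happened to land on my box:

# Dump the SMART attribute table and pull out the interesting counters.
smartctl -A /dev/sda | grep -Ei 'pending|reallocat'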

Thus began the multi-hour process of migrating the data from the failing drive to the new one across a temporary USB-SATA rig that was super slow. Even though I was using tar (and not dd, thank you very much), it still managed to tickle the wrong parts of the old drive, and it eventually freaked out. ext4 dutifully failed into read-only mode, and the copy continued.
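
The copy itself was nothing exotic, incidentally. Something along these lines does the trick; the mount point for the new SSD is made up for the sake of the example:

# New SSD hanging off the USB-SATA adapter, already partitioned,
# formatted, and mounted somewhere (the path here is hypothetical).
mount /dev/sdb2 /mnt/newssd

# Stream the live root filesystem across, keeping permissions and
# ownership; --one-file-system stops tar from wandering into /proc,
# /sys, /dev and friends.
tar --one-file-system -C / -cf - . | tar -C /mnt/newssd -xpf -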

I was actually okay with this because it meant I didn't have to go to any lengths to freeze everything on the box. Now nothing would change during the copy, so that's great! Only, well, it exposed a neat little problem: Debian's smartmontools can't send a notification if it's pointed at a disk that just made the filesystem fail into read-only mode.

Yes, really, check this out.

May 14 20:04:47 mini smartd[1993]: Sending warning via /usr/share/smartmontools/smartd-runner to root ...
May 14 20:04:47 mini smartd[1993]: Warning via /usr/share/smartmontools/smartd-runner to root produced unexpected output (92 bytes) to STDOUT/STDERR:
May 14 20:04:47 mini smartd[1993]: mktemp: failed to create file via template ‘/tmp/tmp.XXXXXXXXXX’: Read-only file system
May 14 20:04:47 mini smartd[1993]: Warning via /usr/share/smartmontools/smartd-runner to root: failed (32-bit/8-bit exit status: 256/1)

There it is last night attempting to warn me that things are still bad (and in fact have gotten worse) ... and failing miserably. What's going on here? It comes from what they have in that smartd-runner script. Clearly, they meant well, but it has some issues in certain corner cases.

This is the entirety of that script:

#!/bin/bash -e

tmp=$(mktemp)
cat >$tmp

run-parts --report --lsbsysinit --arg=$tmp --arg="$1" \
    --arg="$2" --arg="$3" -- /etc/smartmontools/run.d

rm -f $tmp

Notice run-parts. It's an interesting little tool which lets you run a bunch of things that don't have to know about each other. This lets you drop stuff into the /etc/smartmontools/run.d directory and get notifications without having to modify anything else. When you have a bunch of potential sources for customizations, a ".d" directory can be rather helpful.
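
To make that concrete, a notifier is nothing more than an executable dropped into that directory. Here's a completely made-up example; the only real contract is that run-parts hands every script the arguments that smartd-runner set up above:

#!/bin/sh
# Hypothetical /etc/smartmontools/run.d/20logger
# $1 is the temp file holding the full warning text captured by
# smartd-runner; $2..$4 are passed through from smartd_warning.sh.
logger -t smartd-warning -f "$1"

One small gotcha: run-parts with --lsbsysinit is picky about file names, so something like a trailing ".sh" will get a script silently skipped.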

But there's a catch: smartd (well, smartd_warning.sh) pipes a giant multi-line message into that handler's stdin when it invokes it. The handler obviously can't consume stdin more than once, so it first socks the message away in a temporary file and then hands that file off to the individual notifier items in the run.d path. That way, they all get a fresh copy of it.

Unfortunately, mktemp requires opening a file for writing, and it tends to use a real disk-based filesystem (i.e., whatever's behind /tmp) to do its thing. It *could* be repointed somewhere else with either -p or TMPDIR in the environment (/dev/shm? /run/something?), but it's not.
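
If it were to grow some robustness here, I imagine it would look something like this - and to be clear, this is just my sketch, not what Debian actually ships:

#!/bin/bash -e

# Same idea as the stock script, but try a RAM-backed filesystem first
# so a read-only root can't take out the notification path.
tmp=$(mktemp -p /dev/shm 2>/dev/null) || tmp=$(mktemp)
cat >"$tmp"

run-parts --report --lsbsysinit --arg="$tmp" --arg="$1" \
    --arg="$2" --arg="$3" -- /etc/smartmontools/run.d

rm -f "$tmp"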

This is another one of those "oh yeah" or "hidden gotcha" type things. Sometimes, the unhappy path on a system is *really* toxic. Things you take for granted (like writing a file) won't work. If you're supposed to operate in that situation and still succeed, it might take some extra work.

As for the machine, it's fine now. And hey, now I have yet another device I can plug in any time I want to make smartd start doing stuff. That's useful, right?

...

One random side note: you might be wondering how I have messages from the systemd journal complaining that it couldn't write to the disk. I had been shipping this stuff off to another system as it happened, so it's all in my notes, but the lines above came straight out of journalctl on the machine just now, and that's what hit me while writing this. Now I'm wondering how I have them, too!

Honestly, I have no idea how this happened. Clearly, I have some learning to do here. How do you have a read-only filesystem that still manages to accept appends to the systemd journal? Where the hell does that thing live?

The box has /, /boot, /boot/efi, and swap. / (dm-1) is the one that went read-only. The journals live in /var/log/journal, which is just part of /.
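
Confirming that is a one-liner, by the way; findmnt will map a path back to the filesystem it lives on:

# Which mount does the journal directory actually belong to? On this
# box the answer is just /.
findmnt --target /var/log/journal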

If a tree falls in a forest and nobody's around...

...

Late update: yeah, okay, I missed something here. I'm obviously looking at the new SSD on the machine now, right? That SSD got a copy of whatever was readable from the old one, which turned out to be the entire system... *including* the systemd journal files.

Those changes weren't managing to get flushed to the old disk with the now-RO filesystem, but they were apparently hanging out in buffers and were available for reading... or something? That makes sense, right?

So, any time I copied something from the failing drive, I was scooping up whatever it could read from that filesystem. The telling part is that while these journals do cover the several hours it took to copy all of the stuff through that USB 2->SATA connection, they don't include the system shutdown. Clearly, that happened *after* the last copy ran. Obviously.

Now, if those journal entries had made it onto the original disk, then it would mean that I have a big hole in my understanding of what "read-only filesystem" means even after years of doing this. That'd be weird, right?

Just to be really sure before sending off this update, I broke out the failing SSD and hooked it up to that adapter again, then went through the incantations to mount it, and sure enough:

-rw-r-----+ 1 root systemd-timesync 16777216 May 14 17:06 system.journal

The last entry in that log is this:

May 14 17:06:38 mini kernel: ata1: EH complete
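
If you ever want to poke at a copied journal like that yourself, journalctl can be aimed at the directory directly; the mount point here is made up:

# Read the journal files from the old drive's copy of /var/log/journal
# instead of the running system's own journal, and show the last entry.
journalctl --directory=/mnt/oldssd/var/log/journal -n 1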

There we go. Not so spooky after all.