Writing

Software, technology, sysadmin war stories, and more.

Tuesday, March 22, 2022

Some people don't deserve access to the machine room

A long time ago, I tried sorting the people you encounter in tech support into three major types: the ones who investigate problems and come up with solutions, the ones who apply those solutions when problems come up, and the ones who think they're investigators but are actually terrible and screw things up on the customer's machine. Pretty soon, that machine is full of awful crap, all because this person really did not understand what was going on and just barreled on ahead anyway.

It's this "let's just run around doing crazy stuff that is totally not indicated by ANY kind of data" thing that brings me to today's story.

I was eating dinner. My phone started ringing, and I let it go to voicemail. Whatever it was could wait. Later, I checked on things, and sure enough, it was the new "webmaster" from a gig where I was the sysadmin. His message started out in a most remarkable fashion: "both box1 and box2 have crashed", continued with some more blather, and then added that he had rebooted box2 (with the reset button, that is).

box1 was the mail server for the whole operation. box2 was one of the web servers. Either of those being down would be a big deal. Still, I didn't buy it. These things didn't just die like that, particularly at the same time.

I calmly walked back to my home office, flipped over to my "work" session on box1 that always stayed logged in, and hit ENTER. It was alive. I called him, and he asked "what about box2?", so I loaded up a web page on that box in Mozilla (yeah, this was a long time ago...) and it was fine. I poked box3 for good measure. It was also fine.

Later on, I got an interesting e-mail.

Sorry to have bothered you about that so-called outage. There's something strange going on with network 1. While I was getting an IP address on network 2, I asked (name) for a fixed IP for my workstation on network 1. He gave me x.x.x.16. It connects to box4 and box5, but not with box1 and box2. I used x.x.x.1 as the gateway. I went back to dhcp and everything is reachable again. I'll ask (name) about it tomorrow.

Yes, this person did something to his workstation that ended up changing its IP address, and upon trying to connect outward from it, failed to reach some of my machines. Because he was physically present and I wasn't, he got to one of them and pushed its reset button on the front panel. The box got shot in the head for no good reason. Poor little box.

What happened is that he had been given the "poisoned" IP address. Once upon a time, (name) had set up this network scanning thing that kept filling up my logs with crap. I asked (name) to knock it off, since I actually watched those logs and tended to act on the signal they provided. Crapflooding my logs made it harder to find the actual crazy stuff like infected client systems (and worse).

So, after (name) failed to leave my machines alone, I just filtered out his scanner on box1 and box2. He could shoot packets at me all day long and they'd just drop into the bit bucket. It's like, cool, cool, have fun with that. I bet you don't even notice. I had set up the filter nearly two years earlier, according to my notes in the firewall config file.

# (date two years in the past): idiot box that won't stop scanning me
DROP * x.x.x.16 * * *

When this "webmaster" got hold of the IP address, the two machines that had existed back when the scanning happened were still dropping it with a very simple and stupid iptables rule. Any machine that had been "born" after the scanning stopped never got the rule because there was nothing to drop, and that's why box3, box4 and box5 didn't act the same way.
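To make the asymmetry concrete, here is a toy model of the situation in Python. It is only a sketch: 192.0.2.16 is a placeholder from the documentation range standing in for the anonymized x.x.x.16, and `DROP_LISTS` and `looks_dead_from` are hypothetical names invented for illustration, not anything from the real firewall config.

```python
import ipaddress

# Placeholder standing in for the anonymized x.x.x.16 (assumption).
SCANNER_IP = ipaddress.ip_address("192.0.2.16")

# Hypothetical per-host drop lists. Only the two older machines
# ever received the rule; the newer ones were built after the
# scanning had stopped, so there was nothing to drop.
DROP_LISTS = {
    "box1": {SCANNER_IP},
    "box2": {SCANNER_IP},
    "box3": set(),
    "box4": set(),
    "box5": set(),
}


def looks_dead_from(source: str) -> list[str]:
    """Return the hosts that would silently drop packets from `source`."""
    addr = ipaddress.ip_address(source)
    return sorted(host for host, dropped in DROP_LISTS.items() if addr in dropped)
```

From the poisoned address, `looks_dead_from("192.0.2.16")` names only box1 and box2. From any other address the list is empty, which matches everything being reachable again after the switch back to DHCP.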

Let's see. You have physical access to a bunch of servers that are not yours. You do something to your workstation. The next thing you know, you can't reach some of those servers from that workstation. Do you undo what you did to your workstation? No. Do you find another machine? No. Do you ask someone else to also try hitting it? No. Do you try to hop into a machine that IS responding, and then try to poke one of the "dead" machines from it? No. Do you notice the difference between a host that truly is down and one that is just dropping your packets, i.e. ICMP host-unreachables from the router versus... you know, nothing? No.
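That last check can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not a real diagnostic tool: `classify_probe` and its result labels are made up for this example, and it only looks at a single TCP connection attempt.

```python
import errno
import socket


def classify_probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connect and describe what came back."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"  # something answered: the box is alive
    except ConnectionRefusedError:
        return "refused"  # an active rejection: alive, port closed
    except socket.timeout:
        # Nothing came back at all. That is consistent with a
        # firewall dropping the packets, but a timeout alone
        # does not prove it.
        return "silent"
    except OSError as exc:
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "unreachable"  # the network reported no path to the host
        return "error"  # anything else, such as a name lookup failure
```

Running the same probe from a second machine is the step that matters here. If it comes back "silent" from one workstation and "open" from another, the problem is the workstation or a filter in between, not the server.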

What do you do? You let yourself into the server room and start pushing front-panel reset buttons thinking it'll do something useful.

It takes a certain kind of individual to go and do things like that.