Writing

Software, technology, sysadmin war stories, and more.
Wednesday, July 24, 2013

Sniffing for rogue unmanaged switches

In the past week or so, I ran into an interesting sysadmin-type question online: how do you detect unmanaged switches on your network? They don't have IP addresses or any other way to be contacted directly. Imagine your typical "desk" switch: it might have four or five ports, a power plug, and a couple of happy little LEDs. There is no serial console since it is just a dumb little embedded device.

This seemed like a potentially interesting problem to solve. Obviously, you could just brute-force the matter by picking one of your own (managed) switches and tracing out from each port to the target system. If you encounter another device along the way, then you have your proof.

Clearly, this sort of approach does not scale. It might be good if you suspect a certain person and just need to find out, but it would be ridiculous to try to do this in any real quantity. There just aren't enough interns in the world to support this kind of manual monkey work.

You could also try to monitor the port for traffic from too many MAC addresses. This also takes a fair amount of work, assuming your switch even supports the kind of "mirroring" required to get a copy of traffic on that port. If your switch supports port security and MAC filtering, you might be able to catch it that way, but again, that requires upkeep, since people move around and NICs change. There has to be another way.

After pondering this one for a while, I had an idea: link state. Given that I have control of the switch infrastructure in the area to be scanned, that should mean that I can also administratively enable and disable ports in software. This might mean jumping through a web or ssh (or even telnet, gag) interface, or it might just be an SNMP message. The point is, I'd be able to temporarily turn off a port.
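Bouncing a port like this can often be scripted. Here's a minimal sketch using Net-SNMP's snmpset command, assuming the switch speaks SNMPv2c with a writable community; the hostname, community string, and ifIndex below are placeholders, not real values:

```python
# Sketch: bounce a switch port by setting IF-MIB::ifAdminStatus
# (up=1, down=2) via Net-SNMP's snmpset CLI. The switch address,
# community string, and ifIndex are placeholders for illustration.
import subprocess
import time

def admin_status_cmd(switch, community, if_index, up):
    """Build the snmpset argv to set IF-MIB::ifAdminStatus on one port."""
    return [
        "snmpset", "-v2c", "-c", community, switch,
        f"IF-MIB::ifAdminStatus.{if_index}", "i", "1" if up else "2",
    ]

def bounce_port(switch, community, if_index, down_seconds=30):
    """Disable the port, wait a bit, then re-enable it."""
    subprocess.run(admin_status_cmd(switch, community, if_index, up=False),
                   check=True)
    time.sleep(down_seconds)
    subprocess.run(admin_status_cmd(switch, community, if_index, up=True),
                   check=True)

if __name__ == "__main__":
    # Placeholder switch name, community, and ifIndex.
    bounce_port("switch1.example.com", "private", 12, down_seconds=30)
```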

Ideally, this would make the port look just like it had been unplugged from the switch. If the end station is in fact directly connected to that line, it should see the link go down. How can you find out? That's easy. If this is a corporate environment, you should also have access to the machine in question. Turn the port back on and go look in the syslog.

I unplugged the cable from one of my test machines to demonstrate this:

Jul 24 19:07:06 edu dhcpcd[1809]: eth0: carrier lost
Jul 24 19:07:06 edu kernel: [ 199.930160] forcedeth 0000:00:05.0: eth0: link down
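Matching log lines like these against the window when you bounced the port is easy to script. A small sketch, assuming classic "Mon DD HH:MM:SS host ..." syslog timestamps and Linux-style carrier/link messages like the ones above (syslog omits the year, so it has to be supplied):

```python
# Sketch: find syslog lines showing a link-state change inside the
# window when the switch port was bounced. Assumes syslog-style
# timestamps and Linux "carrier lost" / "link down" style messages.
from datetime import datetime

MARKERS = ("carrier lost", "link down", "carrier acquired", "link up")

def _stamp(line, year):
    # First three fields look like "Jul 24 19:07:06"; the year is assumed.
    return datetime.strptime(f"{year} {' '.join(line.split()[:3])}",
                             "%Y %b %d %H:%M:%S")

def link_events_in_window(lines, start, end, year=2013):
    """Return syslog lines with link-state markers timestamped in [start, end]."""
    return [line for line in lines
            if any(m in line for m in MARKERS)
            and start <= _stamp(line, year) <= end]
```

If the lines logged on the machine line up with the times you flipped the port, that's your answer.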

Now, not all operating systems or NICs will log this sort of event, but quite a few will. If the machine logs link state changes which correspond to your frobbing of the port admin state, it's a pretty good guess that they are directly connected. Here's why.

Imagine the situation where someone has a rogue switch plugged into the drop in their office and then has a whole bunch of machines plugged into that. When you turn off the port on your end, that makes the "uplink" port on their switch drop out. However, all of the machines in that office still see a link since they are plugged into the local switch. Sure, they're off the rest of the network from a logical perspective, but at the physical layer, they can still see something out there.

This might fail if the unauthorized switch is somehow set to drop the link state on its "inside" ports when its "uplink" port also drops. This might sound weird, but I used to have a couple of fiber transceivers which had a feature called "MissingLink". When the fiber link went down, they'd purposely drop off their twisted-pair ports so the switch could see there was a problem. This allowed you to use port state in the switch as an alert condition even though the fiber wasn't directly connected.

[Image: fiber transceiver]

So let's say for some reason this doesn't work. Maybe you can't get to the syslog on the box. Did you notice the other thing which happened when the port went down on my test box? My DHCP client started freaking out. This is what happened after it came back up:

Jul 24 19:07:17 edu dhcpcd[1809]: eth0: carrier acquired
Jul 24 19:07:17 edu kernel: [ 210.778050] forcedeth 0000:00:05.0: eth0: link up
Jul 24 19:07:17 edu dhcpcd[1809]: eth0: rebinding lease of x.x.x.x

It basically went right back out on the network to make sure it had a healthy DHCP lease. You should find that many systems will do this after they've had an "out of network experience". So, even if you can't use the syslog, you can still use your own DHCP server logs to see if that machine starts asking for a new lease after you bounce the port. If it doesn't, odds are it didn't see the port state change, and that means something else was in the way.
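Scanning the DHCP server's log for this is simple enough. A sketch, assuming ISC dhcpd-style syslog lines; the MAC address and exact log format are assumptions to adjust for your own server:

```python
# Sketch: scan ISC dhcpd-style log lines for the target machine
# asking for a lease again after the port bounce. The log format
# and MAC address are assumptions; adjust for your DHCP server.
from datetime import datetime

def _stamp(line, year=2013):
    # Syslog-style "Jul 24 19:07:17 ..." prefix; the year is assumed.
    return datetime.strptime(f"{year} {' '.join(line.split()[:3])}",
                             "%Y %b %d %H:%M:%S")

def lease_requests_after(lines, mac, after):
    """Lines where `mac` sent a DHCPDISCOVER/DHCPREQUEST at or after `after`."""
    mac = mac.lower()
    return [line for line in lines
            if mac in line.lower()
            and ("DHCPREQUEST" in line or "DHCPDISCOVER" in line)
            and _stamp(line) >= after]
```

An empty result right after the bounce is the suspicious case: the machine never noticed the link drop.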

Here's another possibility. Maybe you can get into the machine, but not as root, and the DHCP thing isn't working for you for whatever reason. Maybe it has a static IP address. You suspect it's on an unmanaged switch with some other machines but you can't be sure. I'd set up something to run a broadcast ping or fire off a UDP datagram to the broadcast address or anything else I might be able to do as a mere user. Then I'd start it running in a loop: run once, sleep 5 seconds, run again, and so on. With this running in the background, I'd pull down the switch port for 30 seconds or so.

That should let the broadcast generator thing run 5 or 6 times while disconnected from my network. When I put the port back online, I should be able to log back in (or resume my existing session... TCP can survive such breaks usually) and look at what it captured. If that machine logged traffic from any other system in the meantime, it's obviously on some other switch!
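The probe loop could look something like this. It's a sketch under a few assumptions: the port number and timings are arbitrary, and since the probe is bound to the same port it broadcasts on, it will likely hear its own datagram, so you'd filter your own address out of the results. The analysis step is a pure function you'd run on the captured data afterwards:

```python
# Sketch: as an unprivileged user on the suspect machine, broadcast a
# UDP datagram every few seconds and record anything that answers.
# Port number and timings are arbitrary choices for illustration.
import socket
import time

def peers_seen_during(observations, start, end):
    """Given (timestamp, peer_ip) pairs, return peers heard in [start, end]."""
    return {ip for (ts, ip) in observations if start <= ts <= end}

def probe_loop(port=50000, interval=5, rounds=12):
    """Broadcast, listen briefly for replies, and log who answered when."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    s.bind(("", port))
    s.settimeout(1.0)
    seen = []
    for _ in range(rounds):
        s.sendto(b"anyone out there?", ("255.255.255.255", port))
        try:
            while True:
                _, (ip, _) = s.recvfrom(1500)
                # Note: this will include our own datagram; filter
                # the local address out when analyzing the results.
                seen.append((time.time(), ip))
        except socket.timeout:
            pass
        time.sleep(interval)
    return seen
```

Run `probe_loop` in the background, bounce the port, then feed the captured list and the outage window to `peers_seen_during`. Any peer in that set was reachable while your switch port was down.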

Other things to watch would be the ARP cache. If it manages to grow while kicked off the main network, then something is out there with it. This one would probably need a much longer period of downtime to be sure everything else expires, but it should work with no special permissions on the machine.
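A sketch of that check, assuming a Linux box where `ip neigh show` is available to ordinary users: snapshot the cache before the bounce and again during the outage, and diff the two. Any entry that appeared while the port was down came from something on the near side of the link.

```python
# Sketch: diff ARP cache snapshots taken from "ip neigh show" output
# (Linux iproute2; requires no special privileges).
import subprocess

def parse_neigh(text):
    """Map IP -> MAC from `ip neigh show` output lines that carry a lladdr."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if "lladdr" in fields:
            table[fields[0]] = fields[fields.index("lladdr") + 1]
    return table

def new_entries(before, during):
    """ARP entries present in `during` but not in `before`."""
    return {ip: mac for ip, mac in during.items() if ip not in before}

def snapshot():
    out = subprocess.run(["ip", "neigh", "show"],
                         capture_output=True, text=True).stdout
    return parse_neigh(out)
```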

I hope I never need to use anything like this, but it's a fun thought experiment at any rate.