Writing

Software, technology, sysadmin war stories, and more.

Monday, May 7, 2012

Spying on support techs and catching product failures

While working on some details for another post, I stumbled across a part of my homebrew ticket indexing system which hadn't been documented yet. It gave me more of a signal about the problems people were having, both in the moment and over the longer term.

That part was query logging. In essence, any time someone did a search for something, either from their browser or with the little CLI helper tool (which really just called wget or curl), I'd find out about it. It left a little entry behind in the logs which had their query term and IP address.
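For illustration, here is a minimal sketch of pulling the query term and IP address out of one of those log entries. The details are assumptions, since the actual log layout isn't shown here: it presumes an Apache-style access log, a hypothetical `/search` endpoint, and a hypothetical `q` parameter.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed Apache-style access log line; the real format may well have differed.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?:GET|POST) (?P<url>\S+)[^"]*"'
)

def parse_search_hit(line):
    """Return (ip, query_term) for a search request, or None for anything else."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    parts = urlsplit(m.group("url"))
    if parts.path != "/search":              # hypothetical search endpoint
        return None
    terms = parse_qs(parts.query).get("q")   # hypothetical parameter name
    if not terms:
        return None
    return m.group("ip"), terms[0]
```

Since both a browser and wget or curl end up making the same sort of HTTP request, one parser like this would cover both ways of searching.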

Since I had access to the ticketing system's reporting database, I could query the "logins" table to see who had connected to that system from that address in recent times. The DHCP leases didn't turn over that often, so I had a pretty good idea who it was.
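A rough sketch of that lookup follows. Only the "logins" table name comes from the description above; the column names (`username`, `ip_address`, `login_time`), the use of SQLite, and the sample rows are all made up for the example.

```python
import sqlite3

def recent_users_for_ip(conn, ip, since):
    """Return usernames seen logging in from `ip` at or after `since`, newest first."""
    rows = conn.execute(
        "SELECT username, MAX(login_time) AS last_seen "
        "FROM logins WHERE ip_address = ? AND login_time >= ? "
        "GROUP BY username ORDER BY last_seen DESC",
        (ip, since),
    ).fetchall()
    return [username for username, _ in rows]

# Tiny in-memory stand-in for the reporting database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (username TEXT, ip_address TEXT, login_time TEXT)")
conn.executemany(
    "INSERT INTO logins VALUES (?, ?, ?)",
    [
        ("alice", "10.1.2.3", "2012-05-01 09:00:00"),
        ("alice", "10.1.2.3", "2012-05-07 08:55:00"),
        ("bob", "10.1.2.3", "2012-03-15 14:00:00"),   # older login from a previous lease
        ("carol", "10.1.2.9", "2012-05-07 09:10:00"),
    ],
)
print(recent_users_for_ip(conn, "10.1.2.3", "2012-05-01 00:00:00"))
```

Limiting the search to recent logins is what makes the slow DHCP turnover useful: if only one name comes back for an address, that's very likely the person who ran the query.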

So, more often than not, I'd be sitting there at my desk, working on something or other, and I'd see a hit scroll by. I could spot the term and then find out who it was and pop open a chat window.

"Such and such, huh? Don't forget about XYZ."

"Are you spying on me again?"

Of course, being support techs, they knew about tailing log files and that I had a weird way of hearing about what sorts of issues were going on at the time. Whatever I didn't get from a web log or another chat session, I'd frequently pick up just by listening to people talk. There were some definite benefits to being on the support floor, even though it did resemble a big cattle feed lot after we moo-ved.

In any event, I could also get a feel for what sorts of problems kept coming up. If one tech kept querying the same thing over and over, they probably had some issue remembering the fix. If they kept querying a bunch of stuff over and over, then it might be cause for concern. You have to assume a certain amount of ignorance when they first start, but if they don't learn over time, then something might be wrong.
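The "same thing over and over" check can be sketched as a simple count of (tech, term) pairs. The function name and the threshold of three repeats are arbitrary choices for the example, not anything from the original system.

```python
from collections import Counter

def repeat_lookups(hits, min_repeats=3):
    """hits: iterable of (tech, term) pairs.

    Return {tech: [terms that tech searched at least min_repeats times]}.
    """
    # Normalize case and surrounding whitespace so trivial variations count together.
    counts = Counter((tech, term.strip().lower()) for tech, term in hits)
    repeats = {}
    for (tech, term), n in counts.items():
        if n >= min_repeats:
            repeats.setdefault(tech, []).append(term)
    return repeats
```

A tech with one term in their list probably just can't remember that fix. A tech with a long list is the second, more worrying case.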

Another thing I could see was whether certain queries came up more than others, and whether they suddenly appeared out of nowhere. This was good for providing data when product management dropped a new stinker on the rest of the company to meet their rollout quota for a quarter. If your ticket quantity shoots up seemingly out of nowhere and you're not making more money to compensate, something is very wrong. That sort of garbage needs to be stopped cold.
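One way to spot a term that "suddenly appeared out of nowhere" is to compare today's counts against a daily average from some earlier stretch of days. This is only a sketch: the function name and both thresholds (`min_hits`, `ratio`) are invented for the example and would need tuning against real traffic.

```python
from collections import Counter

def sudden_terms(baseline_days, today, min_hits=5, ratio=3.0):
    """Flag terms that are unusually common today.

    baseline_days: list of lists of query terms, one inner list per earlier day.
    today: list of query terms seen today.
    Returns (term, hits_today, baseline_daily_average) tuples, busiest first.
    """
    baseline = Counter(term for day in baseline_days for term in day)
    ndays = max(len(baseline_days), 1)   # avoid dividing by zero with no history
    flagged = []
    for term, n in Counter(today).most_common():
        daily_avg = baseline[term] / ndays   # 0.0 for a term never seen before
        if n >= min_hits and n >= ratio * daily_avg:
            flagged.append((term, n, daily_avg))
    return flagged
```

A term with no history at all has a baseline of zero, so it gets flagged as soon as it crosses `min_hits`, which is exactly the "new product just landed" case.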

The whole "buckets of money" thing is amazing. As long as you can make your department look profitable, it doesn't matter how badly you screw over some other department. Pretty soon you have a bunch of bitter department heads stabbing each other in the back every time they can.

This is progress?