Software, technology, sysadmin war stories, and more.
Monday, September 12, 2011

Support tickets and the slack metric

Have you ever really looked at what tech support people do during their shift? For some of them, it might surprise you. When you find out that nobody who matters seems to care, it might even enrage you.

Following the development of a ticket scoring system for support issues, I started thinking about ways to see who was really working and who wasn't. I came up with a simple idea: for any given moment in the day, you should have something going on.

The more time you spend with a ticket assigned to you in an active state, the more you're helping out in general, or at least that was my theory. Someone with no active tickets assigned to them is probably doing less for the team than someone who has one and is chewing on it.

I decided to build something which would show this, and started calling it the "slack metric". You'd be able to see when someone grabbed a ticket, and when it entered various statuses. You'd also be able to see them move on to the next ticket and how much idle time there was in between. That idle time is what I called the "slack time", and that's how you get the "slack metric".
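To make the idea concrete, here is a minimal Python sketch of how slack time could be computed. It is not the original tool: the interval representation, the minute-offset units, and the name `slack_minutes` are all assumptions made for illustration.

```python
def slack_minutes(shift_start, shift_end, busy_intervals):
    """Return the minutes in [shift_start, shift_end) with no active ticket.

    All values are minute offsets. Each interval is (start, end) with the
    end exclusive, and intervals may overlap (several tickets at once).
    """
    slack = 0
    cursor = shift_start  # first minute not yet accounted for
    for start, end in sorted(busy_intervals):
        # Clamp the interval to the shift so outside work is ignored.
        start = min(max(start, shift_start), shift_end)
        end = min(max(end, shift_start), shift_end)
        if start > cursor:
            slack += start - cursor  # gap with nothing assigned
        cursor = max(cursor, end)
    if cursor < shift_end:
        slack += shift_end - cursor  # idle tail of the shift
    return slack


# Hypothetical hour: two overlapping tickets, a gap, then a third ticket.
example = [(5, 20), (15, 30), (40, 50)]
print(slack_minutes(0, 60, example))  # 25
```

In this made-up hour, the idle stretches are minutes 0-5, 30-40, and 50-60, which add up to 25 minutes of slack.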

The question was how to actually visualize it. I wound up creating a huge table where every ticket got its own row, and every minute of your shift had its own column. Where a ticket (row) and minute (column) intersected, there would be a colored character which represented both the ticket status and whether it was assigned to you or not at that time.

Part of a row might look like this:


You had to know what the colors meant in the context of the ticketing system to make any sense of this. The first span was before the ticket even started existing. It had no status and thus no color. Then it was created, and it was "New" -- that really bright green color. It was also not assigned to the person being analyzed, so it was just a dash. Three of them means it was in this state for between 3 and 4 minutes (seconds are rounded off).

Next, that tech grabbed the ticket, which automatically marked it "Unsolved" -- a slightly more pale green. Since this ticket was now assigned to the person being analyzed, the character used was a "Y". Three of these in a row means it was "Unsolved" and assigned for about three minutes. This is probably when the work happened.

After that, it goes grey. This corresponds to a status called "confirm solved". In other words, the tech thinks it's done, but the customer has 48 hours to respond and say "nope, it's not" to keep it open, or to say "yep it is" and mark it as done from their side. This customer must have been keeping close tabs on this ticket, since it only stayed in that status for 1 minute.

Once the customer marked it as done, it showed up as "closed with feedback" -- that yellow color. It stayed there for four minutes until someone noticed it and popped it over to "solved", which is the final grey.

So let's review here: the customer makes a request. It sits idle three minutes, and then work begins. The work takes at most three more minutes, and that includes the time to grab the ticket, do the work, and respond to the customer. The customer notices that about a minute later, and then marks it done, and four minutes after that, some tech acknowledges that and it goes away forever.

That's about as fast as it can get -- and yes, this is from a real ticket from many years ago. I didn't mess around when working support. Now imagine a bunch of these rows all on the same screen. You can see what happened with a given ticket by reading across the row. You can also see what that person was doing at a given time by reading down the column.
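For anyone who wants to play with the idea, here is a rough Python sketch of how one such row could be rendered. It is a guess at the mechanics, not the original tool. The event format, the function name `render_row`, and the one-letter status codes (standing in for the colors, which plain text can't show) are all assumptions.

```python
# Status names come from the post. The one-letter codes replace the colors
# and are an assumption made so the output works as plain text.
STATUS_CODES = {
    "New": "N",
    "Unsolved": "U",
    "Confirm solved": "C",
    "Closed with feedback": "F",
    "Solved": "S",
}


def render_row(events, tech, minutes):
    """Return (marks, statuses): two strings with one character per minute.

    events is a list of (minute, status, assignee) tuples sorted by minute.
    Each event stays in effect until the next one.
    """
    marks, statuses = [], []
    current = None  # stays None until the ticket has been created
    index = 0
    for minute in range(minutes):
        # Advance to the latest event at or before this minute.
        while index < len(events) and events[index][0] <= minute:
            current = events[index]
            index += 1
        if current is None:
            marks.append(" ")
            statuses.append(" ")
        else:
            _, status, assignee = current
            marks.append("Y" if assignee == tech else "-")
            statuses.append(STATUS_CODES[status])
    return "".join(marks), "".join(statuses)


# The ticket walked through above. The minute offsets, and the assumption
# that it stays assigned to "me" once grabbed, are illustrative only.
ticket = [
    (2, "New", None),
    (5, "Unsolved", "me"),
    (8, "Confirm solved", "me"),
    (9, "Closed with feedback", "me"),
    (13, "Solved", "me"),
]
marks, statuses = render_row(ticket, "me", 16)
print(repr(marks))     # '  ---YYYYYYYYYYY'
print(repr(statuses))  # '  NNNUUUCFFFFSSS'
```

Stacking the `marks` strings for several tickets gives the table: one string is a row, and the same index across all of the strings is a column.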

Someone who's staying on top of things will actually pop over to other tickets while an earlier one is in some dormant state. Maybe you need clarification on something, so you ask a question and set it to "require feedback". Then you have a choice: you can be a slacker and sit there and hope they reply, or you can just switch gears, grab another ticket, and keep on going. This tool showed the difference quite well.

Why did I write this? Well, it's simple. I used to notice things that did not look like the ticketing system up on other people's monitors. I couldn't be totally sure (since I was stuck at my desk actually working -- what a concept), but it definitely did not look like work to me! One day in particular, this one individual was really blatant about this, and so I ran this "slack metric" tool on him the next day.

In the nine hours he was there (9 AM to 6 PM) he touched four tickets. I'm not talking about closing tickets, which means actually working on them and resolving them. Oh no. He touched them. That means they were assigned to him for some amount of time, but whatever he did was not sufficient to fix them.

Normally you close some number of tickets (say, N) and touch some greater number of tickets (say, 2N) in a shift, assuming you're actually working. However, if you're just pretending to work, you can just occasionally grab a ticket, throw it somewhere else, and just sit there doing nothing the rest of the time.
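The difference between touching and closing is easy to count if you have a log of who did what. Here is a small Python sketch, assuming a hypothetical event log of `(ticket_id, tech, action)` tuples where the action is either "assign" or "close". The real ticketing system's data surely looked different.

```python
def touched_and_closed(events, tech):
    """Count distinct tickets a tech touched and how many of those they closed.

    events is an iterable of (ticket_id, tech, action) tuples, where action
    is "assign" or "close" (a made-up format for this sketch).
    """
    touched, closed = set(), set()
    for ticket_id, who, action in events:
        if who != tech:
            continue
        if action == "assign":
            touched.add(ticket_id)
        elif action == "close":
            closed.add(ticket_id)
    touched |= closed  # closing a ticket counts as touching it
    return len(touched), len(closed)


# Hypothetical log: alice works her tickets, bob only grabs his.
log = [
    (101, "alice", "assign"),
    (101, "alice", "close"),
    (102, "alice", "assign"),
    (103, "bob", "assign"),
    (104, "bob", "assign"),
]
print(touched_and_closed(log, "alice"))  # (2, 1)
print(touched_and_closed(log, "bob"))    # (2, 0)
```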

This guy touched four. He closed zero. None. 0. In nine hours. Basically, he did no work in the whole shift.

There was plenty of work to be done, too. Someone else on his "team" came in at 7, left around 4:30 or so, and touched around 17 tickets that same day. When I showed these numbers to that coworker, he was understandably angry. It also explained his confusion, namely "why was I so busy yesterday?", and other similar rants.

This is how life went. The bad people got away with murder, and the good people burned out and moved on. If you let a cycle like this run for long enough, pretty soon all you have is bad people.

Some would say this has already happened.