Writing

Software, technology, sysadmin war stories, and more.

Wednesday, March 27, 2013

"Put down the crack pipes" doesn't go so well

I've received some requests for elaboration on what kind of stuff can stand in the way of a woman in tech. This is probably because of last week's post where I called back to my "cat food factory" as an analogy for what happens after you get into this industry.

I will attempt to relate some of these things. I can only speak about the things which have happened to me, on the assumption they happen to others. However, you should actually talk to other women before attempting to apply these stories to everyone.

...

If you perform an interview for a technical position, you run the risk of being assumed to be someone in HR. I did nearly 100 interviews, mostly over the phone, over a 4.5-year period at the big G, and on multiple occasions had a candidate question my position. When this happens the first time, you sort of cock your head sideways and think, wait, did I really hear that? Did he really ask that? Maybe they ask that of everyone.

So, then, afterward, I'd check in with some of the guys around me. These were both teammates and people who just happened to sit nearby. I trusted they would give me honest answers to my questions, particularly if I didn't seem to have an agenda or otherwise wasn't "on a roll" when I brought it up.

I'd say, hey, did you ever have a candidate ask if you were an engineer? Or, did you ever have a candidate ask if you were in SRE? Or, did they ever think this wasn't going to be a technical call?

None of them ever had that happen to them.

...

For my first two years there, I wore a pager and was on-call for a team which supported a pair of services and later split to supporting just one of the two. The service I kept up had a whole bunch of personally-identifiable information (PII) about users in it, and so it had been constructed with an extra degree of paranoia.

It ran on its own machines, had them rolled up into its own "laser-mounted-on-the-head" clusters, ran its own storage cells using the multi-disk machines, and then ran its own database on top of those storage cells. All of that software came from other teams. We built, configured, and maintained our own instances of those things in order to actually run our service on top.

This was all done in the name of security, since there was a time when merely becoming an engineer would otherwise grant you full access to the kingdom. This particular service was isolated in numerous ways to keep that data safe. It meant duplicating a bunch of things which were normally run by their own dedicated teams. It also meant that I got to touch the "full stack" and not just my little app on top of it.

In subsequent years, I left that behind and went on to a development environment where we were testing kernels and other things. There were some people on the larger team who wanted to use some of these storage cells to keep copies of their test data, because it was rather large. They also decided the amount of quota required (which came out of some budget somewhere) was excessive, and so they wanted to do something very strange.

Normally, the storage cells wrote things as "r=3", that is, for every chunk of data you write to it, there are at least two other copies somewhere else in that cell. If the count drops below that for whatever reason, the surviving replicas spring to life and start "cloning" the chunk to other machines in the cell to bring it back up to that level.

This means if you obtained 1000 GB of quota, you really could only store about 333 GB of user data, since the other two copies consumed the rest of it. These guys saw this as wasteful and so decided they were going to purposely set their data to "r=2".
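The arithmetic here is easy to check. Below is a minimal sketch, assuming quota is charged for every replica of every byte, as described above. The function name `usable_gb` is made up for illustration and is not part of any real storage tool.

```python
def usable_gb(quota_gb, replication):
    """User data that fits in a quota when every byte is stored `replication` times.

    Assumption: the quota counts raw bytes across all replicas.
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return quota_gb / replication


for r in (3, 2):
    print(f"r={r}: {usable_gb(1000, r):.0f} GB usable out of 1000 GB of quota")
# r=3: 333 GB usable out of 1000 GB of quota
# r=2: 500 GB usable out of 1000 GB of quota
```

Going from r=3 to r=2 buys roughly 167 GB more per 1000 GB of quota, which is the savings these guys were chasing.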

I told them this was a horrible idea. The system was not designed to be used that way. In fact, if you stored your data that way, and something happened to it, the people who ran the storage service would not help you. They actually had a big warning on their team web site which basically said "we will not attempt to recover missing chunks which are less than r=3".
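To make the risk concrete, here is a toy sketch of the "cloning" behavior described earlier. It is only an illustration under simple assumptions: a chunk is tracked as a set of machine names, and a repair copies it from any surviving replica to randomly chosen spare machines. The name `repair_chunk` and the machine names are hypothetical, and the real storage software certainly did far more than this.

```python
import random


def repair_chunk(replica_hosts, live_hosts, target=3, rng=random):
    """Return the hosts holding a chunk after trying to restore `target` copies.

    Hypothetical model: cloning needs at least one surviving replica.
    """
    live = set(live_hosts)
    surviving = set(replica_hosts) & live
    if not surviving:
        # Every copy was on a dead machine, so there is nothing to clone from.
        raise RuntimeError("no surviving replica to clone from")
    missing = target - len(surviving)
    if missing <= 0:
        return surviving
    spares = sorted(live - surviving)
    # Clone onto as many spare machines as needed (or as many as exist).
    new_homes = rng.sample(spares, min(missing, len(spares)))
    return surviving | set(new_homes)


cell = {"m1", "m2", "m3", "m4", "m5"}
after = repair_chunk({"m1", "m2", "m3"}, cell - {"m2"})
print(len(after))  # 3
```

With r=3, losing one machine still leaves two copies to clone from. With r=2, the same failure leaves exactly one, and a second failure before the repair finishes means the chunk is gone.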

I further told them that if space was an issue and if their data was immutable, there was something you could do to mark the file as "archived" which would do some magic encoding to it. It would no longer consume 3x the space after that happened, but it would still have the same availability. It wouldn't quite shrink things down to merely 2x, but it was a good middle ground.

They didn't listen to me.

Some time passed, and I found out they had been talking to some guys somewhere else in the company. Those guys told them the same thing: r=2 is a bad idea, so use the magic archival encoding setting and the cell will give you back some of that space.

That's when they came back and said "okay, we're going to use the archival setting".

I fumed. I had told them this days before. I even qualified it and said that I had run the storage cell software in question and knew how it performed. Heck, I even built some of them from scratch on bare machines once in a while, and that involved a whole bunch of grungy stuff most of the maintenance-type SREs never do!

They didn't believe me. They had to hear it from someone else.

Dismiss it as just a fluke if you want. Say it's not because of who I am if you want. It still happened. And it happens. And I hear it from other women in the industry, too: an idea or concept from their mouths is ignored until it's repeated by someone else who happens to be a guy. Meanwhile, an idea from a guy in the same environment is at least considered the first time out.

Obviously there are counterexamples. I'm just saying that this does happen from time to time. Does it happen to you?

...

Not being heard is not always one of these "online interactions are weird" things. Sometimes it literally means not being heard, as in, the vibrations in the air which we call sound.

It's a Monday morning in August of a few years back, and that means it's time for the weekly production meeting. There are a bunch of people sitting around a big table. It's maybe ten feet long and five or six feet across. The room is barely bigger than that, with just enough room to fit the table, chairs, and space for the door to swing, plus screens for the projectors at the far end.

It's not a big room, in other words.

Everyone is seated, and some topic is raised. One of the guys is saying something about how type 2 traffic is lossy over the backbone. Another one is talking about how it doesn't matter. A third mentions maybe there could be a workaround.

I notice this: hey, wait, type 2 traffic? We don't use type 2 traffic.

I go "uh... wait." Nothing happens.

They keep talking. They're going more quickly now, and are a bit louder than usual. One of them has come up with something they can do about this so-called packet loss for type 2 traffic, and the other one doesn't like it. Never mind the fact that we don't use type 2. I need to stop them before they start making decisions for something that won't even *DO* anything.

"Guys? Hold on there." I probably leaned in and gestured with an open hand (palm up, thumb out, like "what?") while doing this. It should have been obvious that I had something to add.

Nothing. They're still going.

Yap, yap, yap, back and forth. Oh, we can do this, no that sucks, no, we'll do that instead. But then we have to do this. Yes, their hacks are going to lead to bigger problems. This has to be stopped.

I'm practically waving my arms around now. Still, nothing. WTF?

They're still going.

It's been a crappy couple of weeks for me. I've been having a miserable time at this job, with the on-call and treatment by the others on my team in general. I'm over it. I raise my voice and speak up, making sure they notice.

"PUT DOWN THE CRACK PIPES!"

Okay, that worked. They're now all looking directly at me, and they're shocked.

"We don't even use type 2 traffic for replication, so none of this matters. We're at type 1 because we are in the serving path for a type 1 service and have to be as reliable as they are. The only time we use anything else is for the metadata, which travels in the swamp, and that's because of that bug I tracked down and reported, and they're sending us a fix in the next release."

They're not happy.

There's an awkward silence.

Finally, the boss or someone else says something like "okay, moving on..." and they go on to the next agenda item. Nobody makes eye contact with me.

I've stopped yet another half-assed idea from going into production, but at what cost? I find that out immediately after the meeting when my boss pulls me aside.

"This is materially affecting your perf score."

Translation: the number which is used for calculating my raises, bonuses, stock options, promotions, and the like, is lower than it otherwise would be because of the way I dealt with the rest of the team.

So, basically, my options are: "be ignored and have bad things happen", or "get noticed and have bad things happen". The middle ground, if there is one, is impossibly thin. The slightest misstep will put you onto one or the other, and then you're in trouble.


April 14, 2013: This post has an update.