Tuesday, April 24, 2012

I reject your unnecessary naming conventions

We seem to have a vocabulary problem in this field. By that, I mean there's a tendency to create new terms for things when a simple descriptive phrase would do. Worse still, these new terms tend to leak out into other contexts and create situations where not knowing a term can be used to "filter" people.

Imagine my story about genuine questions, but instead of it being some strange "John Elway" reference in a calc class, it was some computer-related thing which talked about "memoization". You might also consider that it sounds a lot like memorization, and the ignorant listener might not even realize there's a difference.

I maintain that some of these things serve to set people apart. When there's a term which is opaque until it is explicitly defined by the holder of that knowledge, it can be used to create a knowledge gradient. You must then bow down at the altar and admit your relative inferiority before you can be allowed to drink from the fountain of knowledge.

The flip side of this is that people who like this situation will then go and amass large numbers of arcane terms just so they can wield more of them than other people. This is ridiculous. This is about communicating effectively, not trying to "catch 'em all" like some kind of Pokemon game! Deliberately communicating badly is just stupid.

So, instead of doing that, how about giving things reasonable labels based on relatively common words in your language?

Here's an example -- I wrote about my weather station monitoring project last year. It did a few things which were apparently novel for the field at the time.

First, instead of making you wait for data to arrive, it ran continuously in the background and kept copies of whatever it received from the sensors. When a client program checked in, it just handed that data over as quickly as possible.
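In rough terms, it looked something like this little Python sketch. This is not the actual code from the project -- the sensor names and the read_sensor() stub are made up purely for illustration:

import random
import threading
import time

latest = {}                    # most recent reading from each sensor
lock = threading.Lock()

def read_sensor(name):
    # Stand-in for the real hardware read; just returns a fake number.
    return random.uniform(0.0, 100.0)

def poll_sensors():
    # Background loop: keep grabbing readings so nobody has to wait for them.
    while True:
        for sensor in ("outside_temp", "humidity", "wind_speed"):
            reading = read_sensor(sensor)
            with lock:
                latest[sensor] = reading
        time.sleep(5)

def handle_client_checkin():
    # A client checked in: hand over whatever we already have, right away.
    with lock:
        return dict(latest)

threading.Thread(target=poll_sensors, daemon=True).start()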

Second, when it was time to render an image for something like temperature, it would try to cache the result. Basically, if you asked it to generate the image for the current day, it would build it from scratch every time, since the data was still subject to change.

However, if you requested a prior day, it would still build it from scratch, but then it would save the final output image to a file on disk. The logic was simple enough: the past is not going to change.

If, at some point in the future, you or someone else requested that same day, it would just serve that file from disk. This meant it didn't have to go through the relatively expensive procedure of pulling all of the data points out of the database, determining the bounds, graphing all of the numbers, and then building an image from it.
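Squint a little and it comes down to this sort of thing -- again, a Python sketch rather than the real code, with a made-up cache directory and a fake renderer standing in for the expensive graphing work:

import os
from datetime import date

CACHE_DIR = "wx-graph-cache"   # made-up location for the cached images

def render_day_image(day):
    # Stand-in for the expensive part: pulling the data points out of the
    # database, determining the bounds, graphing the numbers, and building
    # the actual image.
    return ("fake image for %s" % day.isoformat()).encode()

def get_day_image(day):
    os.makedirs(CACHE_DIR, exist_ok=True)
    cache_path = os.path.join(CACHE_DIR, day.isoformat() + ".png")

    # A finished day can't change, so a cached copy is always valid.
    if day < date.today() and os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return f.read()

    # Otherwise, build it from scratch.
    image = render_day_image(day)

    # Only save the result once the day is over and the data is final.
    if day < date.today():
        with open(cache_path, "wb") as f:
            f.write(image)

    return image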

Using the cache was far faster and less resource-intensive. It was a simple tradeoff: spend a little disk space (to store the images) instead of the computing power needed to build them from scratch every time.

So let's see. I had a set of previously processed inputs and avoided performing calculations on them a second time. Doesn't that sound familiar?

It might be different if a given term means something so specific that nothing else could possibly describe it. But be honest -- when is that actually the case?