Writing

Feed Software, technology, sysadmin war stories, and more.

Sunday, August 19, 2012

Bumming instructions and optimizing circuits

One thing I've noticed while honing my skills in a given field is that sometimes I remember the good things which have been said about other people who have already been doing it. I hope this applies to others, too. For instance, if you read Hackers (1984) and remembered the parts where the folks at MIT were trying to squeeze down their program by "bumming" instructions, you might try to do that when it matters in your own life later on.

I've tried to keep things like that in mind when designing stuff, particularly when playing around in some assembly language. Most of that happened way back in school, since these days I tend to use things a few steps above raw assembly.

For example, let's say you have a lab where you are expected to run some kind of loop and then jump out after some number of iterations have occurred. Let's further say that you don't actually use the loop counter at all. You just want to do 10 calls to something and then move on.

The direct way to do this would be to just assign a zero to the location (probably a register), then do the work, increment the counter, and then compare it to 10. If it didn't equal 10, go back to the beginning. Otherwise, fall through. Assembly languages vary, but it's not too much of a stretch to think of "load", "increment", "compare" and "jump if not equal" in a few of them.

But what if you could make one of those instructions disappear without affecting how your program worked? It would now run that loop a tiny bit faster. For this, you just have to pay attention to what's actually going on when you run an instruction. If your processor has a "zero flag", then maybe you want to run that counter backwards. Instead of starting from 0, start from 10. Then, after every pass, decrement it, and immediately check the zero flag. If it's set, you're done. Otherwise, jump back.

In this scenario, you're doing "load", "decrement", and "jump if not zero". One of your instructions vanished. As long as you're not using that loop counter, or not using it in a way that actually cares about their order, then you're done.

You have now swapped a bit of ease-of-understanding for a few cycles. Hopefully, anyone who comes along later will recognize what is going on and will appreciate it. This is no different than being "clever" in ordinary programming languages now. Tread carefully.

In my case, it was a one-off lab and it was a way to show the prof that I actually "got" what was going on, and yes, I should get a good grade. The code saw no further use.

In a similar class, perhaps with the same prof, we had to wire up a traffic light controller. I've already written about the loopiness which happened when we had the wrong counter part in our circuit. In this same lab, I also had a chance to optimize the circuit. Here's how it worked.

The overall logic of my traffic signal was that certain conditions would "let the clock pulse through" to the counter which stored the current state. We only had four states, so this was pretty simple. First, I wrote out rules like this:

Now that I had my conditions, I just needed my signals. For each of the terms in parentheses up there, I had to figure out where to get all of those things. Things like "red #1" and "red #2" were just a matter of making sure the counter was "00" or "11", respectively, whereas the vehicle trigger was just a line arriving from our pushbutton. "10 seconds elapsed" was taken by making sure bits 3 and 1 (8 + 2 = 10) were set on our secondary counter.

All of these were ORed to get a single "increment" line, and then that was ANDed with the clock pulse. This meant it could only pass through to the counter when the clock pulse was present, and it effectively debounced the inputs for me.

So now I had the overall design and started laying out the chips which would do all of the work. I set it up in the "obvious" way with no funny stuff going on. When it was time to map it over to real parts, I started to see a bunch of inefficiencies.

Our parts kit had a bunch of ICs which did more than just one gate per chip. There might be 2, 3, 4, or even 8 gates on a given chip depending on what it was doing. So, let's say I needed an AND which took 3 inputs. I'd put down one of those chips, and, oh hey, that's a dual 3-input AND chip, so now I have another one just sitting there idle.

Now I have another signal which needs a 2-input AND to happen. At this point, I could just go drop another chip on the board to do that, or I could just cheat with one of the 3-inputs. How do you use a 3-input AND with only two signals? Easy. Tie one of the inputs high, so it's always 1. That effectively makes it a 2-input AND.

Likewise, if you have a three-input OR and only have two signals, tie one of them low so it's always 0. Then it will only act on the two real signals you have. I suppose you could also "double up" an input signal in one of these cases, but that seems unnecessarily complex.

Suddenly, I started seeing opportunities all over the place. Maybe I had a NAND open but needed an AND. Okay, so let's just use the NAND which is available and then flip its output through one of the many inverters which are otherwise just sitting there! I get the same result and save space on the board.

How did I even know to try this sort of thing? Easy. I had read about the master of such things at some point in the past and saw how it was appreciated by others. That gave me the inspiration to give it a shot when an opportunity presented itself.

Nobody ever told me how these people bummed instructions or optimized chips off the board. I had to figure that out myself. Still, just knowing that something was possible was all that it took to get things going.

Tell kids about a "what" and I bet they can figure out a "how" when the moment arrives. They're more clever than you might think.