Writing

Software, technology, sysadmin war stories, and more.

Thursday, June 20, 2013

Subtractive framework creation makes more sense

Does anyone really start out on a project by saying "I want to build something huge and crufty that doesn't serve any real purpose"? I don't think so. I think it just looks that way after seeing enough failed projects come and go over the years. I'm thinking about the notion of frameworks in particular, but this might apply to other sufficiently large software systems as well.

Consider this situation. You're in a "green field" environment: there is no code already written to do what you want to do, or if there is, you can't bring it along for whatever reason. Maybe the licensing isn't compatible, or it's in a language which can't be used any more. You get the idea. You know you're going to wind up making a whole bunch of utilities along the way.

So what happens next? Do you whip out your crystal ball and attempt to predict everything which will be needed? Does that lead to a whole load of functions which then becomes the to-do list? Are there tons of feature requests or sticky notes put up somewhere to get all of them written?

How about the structure? Even though none of this code exists yet, is there a whole hierarchy being built out somewhere? Is someone already arranging places for this stuff to "hang" on the tree even though there isn't anything to show for it at this point?

It seems like you could refer to this approach of framework building as "addition": you keep identifying things which "should be useful" and start trying to build them. You may actually succeed at building a fair number of them, and now you have this whole pile of tools which may or may not be useful.

This is the point where actual work towards solving the problem starts, and now it's a matter of mapping that work onto the pre-existing framework. I sure hope you had a good "read" on that crystal ball and nailed all of the utility requirements even without actual code use cases, since otherwise your utilities are going to need to be fixed. This may change their behaviors significantly. Whole assumptions about who worries about what may have to be cast out and replaced. What does this do to your wonderful schema now?

I guess if you're some kind of never-fail wizard type, then you might be able to sit down and just slap all of this together and never have to go back to change something. I don't know anyone like that. I don't know if that kind of person can even exist.

My suspicion is that this could only happen if an existing project was being literally rewritten piece by piece. Since the architecture was the same, just with a different language used for the implementation, in theory, the structure might be the same. Such projects do exist, but really, how often do you do a straight-across port without changing something?

Wouldn't it make more sense for framework creation to be a subtractive process? First go and build your program which accomplishes some goal. Then build something to make a second goal happen. Then do a third. Now take two or three steps back and look at what you've created.

What kind of common stuff is there, repeated across the three programs? Call it F. What sort of specialized things exist for the three goals? Call them X, Y, and Z.

F would be those elements which make sense on their own. They'd have to remain useful even after being translated to a more generic form in order to suit the needs of multiple calling programs.
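To make the F versus X, Y, and Z split a little more concrete, here is a minimal sketch. Everything in it is made up for illustration: Join, ScannerLine, LogLine, and PathFor are hypothetical names, and the three "programs" are shrunk down to one function each. The point is only that the shared piece gets pulled out after three callers turned out to need it, and it takes its separator as a parameter so it stays useful to all of them.

```cpp
#include <string>
#include <vector>

// F: the common piece. It was pulled out only after three callers were
// seen doing the same thing, and made generic via the separator argument.
std::string Join(const std::vector<std::string>& parts,
                 const std::string& sep) {
  std::string out;
  for (size_t i = 0; i < parts.size(); ++i) {
    if (i > 0) out += sep;
    out += parts[i];
  }
  return out;
}

// X: hypothetical first program, which wants comma-separated fields.
std::string ScannerLine(const std::vector<std::string>& fields) {
  return Join(fields, ",");
}

// Y: hypothetical second program, which wants a tagged log line.
std::string LogLine(const std::vector<std::string>& words) {
  return "[log] " + Join(words, " ");
}

// Z: hypothetical third program, which wants a slash-separated path.
std::string PathFor(const std::vector<std::string>& segments) {
  return "/" + Join(segments, "/");
}
```

The prefix handling stays with each caller, because only that caller cares about it. If Join had been written up front, it would have been anyone's guess whether the separator, the prefix, or something else entirely needed to be configurable.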

I've been down this subtractive road many times. Initially, when it came to doing JSON, I used to just bodge it together and build the strings myself. Yes, really. Then I got paranoid about encodings and wrote something to handle escaping evil characters. That lasted for a while, but then the prospect of releasing fred to the world came along and got me thinking about less crufty implementations. I started using a helper library called jansson to handle my storage and JSON encoding needs.
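For a sense of what the hand-built stage looks like, here is a rough sketch of string escaping plus string concatenation. This is not the code from my tree: EscapeJson and MakePair are hypothetical names, and the sketch only covers the quote, the backslash, and ASCII control characters. It assumes the input is already valid UTF-8 and passes everything else through untouched.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: escape a string so it can sit inside a JSON
// string literal. Bytes >= 0x20 other than '"' and '\\' pass through.
std::string EscapeJson(const std::string& in) {
  std::string out;
  out.reserve(in.size());
  for (unsigned char c : in) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\b': out += "\\b";  break;
      case '\f': out += "\\f";  break;
      case '\n': out += "\\n";  break;
      case '\r': out += "\\r";  break;
      case '\t': out += "\\t";  break;
      default:
        if (c < 0x20) {
          // Remaining control characters become \u00XX.
          char buf[7];
          std::snprintf(buf, sizeof(buf), "\\u%04x",
                        static_cast<unsigned int>(c));
          out += buf;
        } else {
          out += static_cast<char>(c);
        }
    }
  }
  return out;
}

// The "bodged together" approach: build the JSON text by hand.
std::string MakePair(const std::string& key, const std::string& value) {
  return "{\"" + EscapeJson(key) + "\": \"" + EscapeJson(value) + "\"}";
}
```

It works until you need nesting, numbers, or arrays, and that is about when letting a library own the storage and the encoding starts to look attractive.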

Initially, this was just part of fred, and so it lived in that directory. I think it was three separate classes spread across three sets of files: one for the base-level JSON storage and types, another for arrays, and a third for objects. fred used this for a while and it worked okay.

Eventually I found myself working on a project for a client which also did JSON. Now I had three different uses of JSON in my tree: my scanner which was still manually generating JSON, fred with its jansson wrapper, and now this new thing for my client. I finally had enough things approaching JSON in their own ways to give me a better sense of the problem space, and could come up with an API that made sense.

I copied the fred code to a new place and subtracted the fred-specific stuff. A few things were adjusted to make them suitably generic, and then fred, the scanner, and the new program were changed to use it.
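The shape of that subtraction can be sketched roughly as follows. This is an illustration under loud assumptions, not the real json.h: JsonObject, Set, Dump, and ItemJson are hypothetical names, the class only stores string values, and escaping is left out to keep it short. The generic class knows nothing about any caller, and the caller-specific knowledge (which keys exist) moves out to the program that cares.

```cpp
#include <map>
#include <string>

// Generic part: what is left after removing caller-specific code.
// It knows about keys and string values and nothing else.
// Assumption: keys and values need no escaping in this sketch.
class JsonObject {
 public:
  void Set(const std::string& key, const std::string& value) {
    fields_[key] = value;
  }

  // Serialize with keys in sorted order (std::map iteration order).
  std::string Dump() const {
    std::string out = "{";
    bool first = true;
    for (const auto& kv : fields_) {
      if (!first) out += ", ";
      first = false;
      out += "\"" + kv.first + "\": \"" + kv.second + "\"";
    }
    return out + "}";
  }

 private:
  std::map<std::string, std::string> fields_;
};

// Caller-specific part: lives in the calling program, not the library.
// The "title" and "url" keys are invented for this example.
std::string ItemJson(const std::string& title, const std::string& url) {
  JsonObject obj;
  obj.Set("title", title);
  obj.Set("url", url);
  return obj.Dump();
}
```

A second or third program would write its own equivalent of ItemJson with its own keys, and the class would not have to change.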

Now, after these iterations, I have a json.cc and json.h which have everything I need to handle my current use cases. I don't have to have special handlers for fred or the scanner or the third project. How could I possibly have coded this from scratch when there was no way to know what was coming?

This code, the closest thing to a "framework" I have, had to evolve from other code which works and is in active use. Trying to come at it from the other side probably would have resulted in some time-wasting monstrosity which would have to be thrown out anyway.

Finally, doesn't this seem to fit best with an "Agile" mindset? Code what you need when you need it, and reconsider everything regularly.

Otherwise, aren't you throwing darts at a dart board which might not even be there? How can you possibly see through the fog of time?