ASCII protocol buffers as config files
While I don't go on the Orange Site any more, I still make enough trips through the larger space of similar sites to get some idea of what people are talking about. Last week, the topic of interest seemed to be YAML and how evil it is. I can't argue with that. Every time I've crossed paths with it, I've been irritated by both it and whoever decided to use it for their stuff.
The discussions invariably start talking about alternatives, and frequently end up on JSON. This is unfortunate.
I've mentioned this before in passing, but have never given it a whole post. Today, it graduates to having one.
The topic is: ASCII-form protocol buffers used as config files.
This was a tip given to me something like 17 years ago when I was "on the inside", and it's turned out very well. Protocol buffers have a canonical ASCII representation, and it accepts comments, too! You get the benefits of not having to write a lexer or parser combined with a system in which everything is explicitly specified, right down to the data types.
Here's an example of such a file:
    # **** Contains auth data: must be thermo:thermo 0660 or better ****
    db_conninfo: "host=localhost dbname=foo user=xyz_role password= ...

    # barn (hardwired)
    server_info {
      host: "172.25.161.10"
      port: "18099"
    }

    # barn (backup wireless on IoS network)
    server_info {
      host: "172.25.225.10"
      port: "18099"
    }

    # office (broken 20230825)
    # server_info {
    #   host: "172.25.161.17"
    #   port: "18099"
    # }

    sensor_location {
      name: "loft"
      model: "Acurite-Tower"
      id: "1563"
      channel: "A"
    }

    sensor_location {
      name: "entry"
      model: "Acurite-Tower"
      id: "2375"
      channel: "B"
    }
There. That's not terrible, right? It has a bunch of common stuff that gets repeated as needed for my different servers and sensors. There's also a string that gets handed to Postgres to connect to the database. And yes, notice the comments everywhere.
Over in protobuf-land, this is what the .proto file looks like for that config format:
    syntax = "proto2";

    package thermo;

    message LoggerConfig {
      message ServerInfo {
        required string host = 1;  // 192.168.31.67
        required string port = 2;  // 18099
      }

      message SensorLocation {
        required string name = 1;     // room
        required string model = 2;    // Acurite-Tower
        required string id = 3;       // 1015
        required string channel = 4;  // C
      }

      required string db_conninfo = 1;
      repeated ServerInfo server_info = 2;
      repeated SensorLocation sensor_location = 3;
    }
There's one important bit here: I'm using "required" since this is a config file format and NOT something that will be passed around over the network. It lets me cheat on the field presence checks, and this is the one case where it's acceptable to me.
If you're using protobuf for anything that gets handed around to something else (RPC, files that get written by the program, ...), whether across space *or time* (i.e., future instances of yourself), use optional and explicitly test for the presence of fields you need in your own code. You have been warned.
How does the program use it? First, it reads the entire config file into a single string. Then it creates a LoggerConfig (the outermost message) and tells the TextFormat flavor of protobuf to ParseFromString into that new message. If that returns true, then we're in business.
I can now do things like hand config.db_conninfo() to Postgres or iterate over config.server_info() or config.sensor_location() to figure out who to talk to and what sensors to care about.
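For the curious, the first step (slurping the file into one string) can be sketched with nothing but the standard library. This is a minimal sketch and not the actual code from my logger: the helper name ReadFileToString is made up for illustration, and it only covers the file read, since the parse step needs the header that protoc generates from the .proto above.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper (name invented for this sketch): reads the whole file
// at `path` into `*contents`. Returns false if the file cannot be opened.
bool ReadFileToString(const std::string& path, std::string* contents) {
  std::ifstream in(path, std::ios::in | std::ios::binary);
  if (!in) {
    return false;
  }
  std::ostringstream buffer;
  buffer << in.rdbuf();  // copy everything from the file into the buffer
  *contents = buffer.str();
  return true;
}
```

From there, with the generated header included (its name depends on how your build runs protoc), the rest would be roughly: declare a thermo::LoggerConfig config, then check the result of google::protobuf::TextFormat::ParseFromString(contents, &config). As far as I know, that call returns false on a syntax error and also when a required field is missing, which is exactly the "cheat" that required buys you in a config file.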
Is it perfect? Definitely not. It's software, which means it will never truly stop sucking, like all other software. It's a dependency that will now follow you, your code, and your binaries around like an albatross. It's yet another shared library that has to be installed wherever you want to run.
But, hey, if you're already paying the price of using protobuf in your projects for some other reason, then why not use it for config storage, too?