Writing

Feed Software, technology, sysadmin war stories, and more.

Saturday, December 18, 2021

Log libraries and the tendency to open holes in things

Logging stuff is in the news this week. Specifically, people with Java and log4j somewhere in their lives are having a miserable time. I figured I'd re-tell a story about terrible things done by way of a bunch of feature-rich libraries and dubious interconnections. I've mentioned this in passing before, but this time I've tried to be much more specific about how it works.

This is a tale that's happened at multiple distinct companies, and I have to imagine it's happening (or *is happening*) at several more. Note that while I'm picking on the G* stuff here since it's open source and thereby more likely to resonate with readers, reimplementations of these libraries under other names have also introduced the same problems to those companies.

Part 1: the flags.

There is a C++ library called GFLAGS. The "G" implies it came from Google, and that's good, because it did. It works by letting you call a DEFINE_xxxx macro in the top of your code, regardless of how deep in the code it's buried. This then becomes a global (!) flag/argument.

DEFINE_bool(enable_foo, false, "Enable the foo feature");
 
...
 
void my_func() {
  if (FLAGS_enable_foo) {
    do_something();
  }
}

Besides being accessible from anywhere in the scope where you did the DEFINE, you can also do an equivalent DECLARE_xxxx and then make it available in your scope. You can do this from a completely separate compilation unit, even, as long as it eventually gets linked with something else that did that DEFINE for you.

DECLARE_bool(enable_foo);
 
...
 
void other_func() {
  // Turns out we need foo enabled to do our stuff.
  FLAGS_enable_foo = true;
  // ...
}

If you're wondering how some code that's buried so deeply in the program can get to see what's in the commandline args, that's easy. There's usually an InitCompanyName() function that you call from inside your main() where you pass it argc and argv, so it gets to consume all of that tasty config data.

int main(int argc, char** argv) {
  InitCompany(argc, argv);  // populates FLAGS_ of all kinds
  // ...
}

So, when you call that program with --enable_foo, it flips that bool to true. Or, when something makes other_func() run, it also flips that bool to true. Anything can read or write these things, so keep that in mind as you follow the rest of the story.

Part 2: the logs.

Next up is GLOG, and as you may have derived from the name, it's a C++ library from Google that does logging. When it's being used, it frequently shows up like this:

  LOG(INFO) << "Something happened, value=" << value;

GLOG is no small thing, and it has tons of features. It can log to stderr, sure, but it can also log directly to a file. How does this work? Oh, that's easy: it uses GFLAGS, naturally, and it has a flag for that! Set it to log to a file and it totally will log to that file.

One of its many features is "log to email". Yep, you can set up an e-mail address, and any time it has something to say, it'll kick off /usr/lib/sendmail and deliver the log message that way. You'll find a flag for turning on that feature, and another for setting the e-mail address.

Oh, but what if your tool to send mails isn't at that exact path? No problem! There's another flag which will let you point at something else. You can have it use /usr/local/my_company/mail/sender instead.

Part 3: the status wrappers.

The company might have a small HTTP or RPC server that automatically gets baked into most network server software. It comes up on a given port and it hands out status information about the program. It might do things like export the values of counters so that something else can fetch them and store them, so yet another tool can be used to monitor and/or graph them.

It might even export the program's internal config values, like whatever it read from a config file. This way you can see what it is actually using to make its decisions instead of trying to guess what the state is like inside the program.

Invariably, someone wires this up to GFLAGS, so now you can see what *all* of those global config flags are set to. This is pretty interesting, since now you can get some idea of what args are being used with a given instance of your server software.

In my experience, this sort of thing appears both via a simple HTTP URL and over some kind of company-specific RPC mechanism, such that you can fire off a GET /something (if HTTP) or a GetFlags (if RPC) type request and get back all of the flags and their settings.

Can you guess what happens next?

Part 4: the status setters.

Someone eventually decides that it would be an amazing feature if those HTTP and RPC interfaces could also be used to *change* the settings of those GFLAGS. Imagine that! You're running something, and you forget to set verbose mode, and you really, really need to debug it without restarting it.

Have no fear! Hit the right HTTP endpoint and just _set_ verbose to 1! Or, equivalently, poke it with the right RPC with the variable name of "verbose" and a new value of "1".

Either way, somewhere inside the code, something is reaching into the global set of GFLAGS and is setting the value to whatever you just handed it. This becomes effective immediately and then there's your verbose debugging info, all without restarting a thing.

You might think this is awesome, but it's usually not, all because of what comes after this.

Part 5: the snarky engineers.

Sooner or later, someone puts all of this together in their head and realizes they now have a giant hole that they can exploit for their own purposes. The chained logic goes like this:

  1. You go searching around the company's code base for "exec()" or "system()" or similar and turn up the logging code with its email feature.
  2. You discover this logging code will execute an external program when it's set to "log to email".
  3. If only the external program was something I controlled...
  4. ... and if only I could enable the "log to email" mode...
  5. Hmm, these are all controlled by GFLAGS.
  6. Wait a minute, they export GFLAGS through a public HTTP and/or RPC interface!
  7. Does that interface let me CHANGE values? IT DOES!
  8. So can I say mail_program = /my/evil/path and then flip log_to_email to TRUE? I CAN!
  9. Can I then get it to log something to run the logging path? Sure! Just poke it in such a way that it calls LOG for some reason. If they love INFO logs, this should be trivial.
  10. And... okay, I own this thing.

Around this point, the snarky engineers realize that every single program at $COMPANY uses this logging suite and so might be exposed this way. It turns out the ones which don't listen to the network are not vulnerable, but anything running that HTTP status server or RPC subsystem *is*, and wow, is that a long list.

This is how a bunch of small bits of cleverness can multiply into a raging dumpster fire.

Incidentally, if you can't make it run a subprocess but can flip arbitrary flags as a completely unauthenticated requester, you can still have all kinds of evil fun with most programs. Plenty of vulnerabilities exist which amount to "we have a debug flag which disables password checks that's only used by developers which is never set in prod, but someone used the flag-flipper to enable it in prod, and, well..."

You get the idea. Runtime flag flipping and variable setting is scary.