Subprocesses are generally bad news
Yesterday, I wrote about a confluence of well-meaning features which can create a serious disaster. It's what happens when you have a flag library, a means to change those flags by way of unauthenticated networks, and programs that run subprocesses. I glossed over a bunch of possible tangents in order to focus on that particular hat trick.
Today, however, is about the specific case of running subprocesses. In short, if you can make your system work without them, then do so! "Shelling out" to something else is a fantastic way to open incredibly bad holes in your system and potentially give the entire game away to whoever happens to find it first.
There are too many ways this can go wrong. I will attempt to describe a few of them.
First off, system(). This is a C library call which looks mighty tempting. It lets you run processes just by constructing a string. They run by way of a shell, and so respect PATH. This means you don't even need to worry about exactly where the binary lives. You can just system("git commit -m update") and it'll work well enough to make you think it's safe and good to leave like that forever.
The biggest problem here is that yes, it runs via a shell. Shells love to take all kinds of flexible input and do marvelous things with it. With that in mind, what if I can get you to put certain magic characters into the command? Going by the git example from above, maybe instead of saying "update" as the commit reason, I can get you to say something I define. What if I tell you the commit reason is this?
foo ; touch /tmp/0wned;
What's going to happen? It's going to call out to this, isn't it?
git commit -m foo ; touch /tmp/0wned;
Congratulations, you just ran my payload.
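To watch this happen without touching a real repository, here's a small Python sketch. It swaps git for echo (a substitution made purely so the demo is harmless), builds the command by string concatenation, and hands it to a shell. The function name and the "payload ran" marker are made up for illustration, and it assumes a POSIX-style /bin/sh is available.

```python
import subprocess

def run_commit_via_shell(reason):
    # Build the command by pasting the caller's text into a string,
    # the same pattern as system("git commit -m ..."). "echo git"
    # stands in for the real git so nothing is actually committed.
    command = "echo git commit -m " + reason
    # shell=True hands the whole string to /bin/sh for parsing.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout

print(run_commit_via_shell("update"))
print(run_commit_via_shell("foo ; echo payload ran;"))
```

The second call prints two lines: one from the intended command, and one from the command smuggled in after the semicolon.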
"I'll just escape it" is a typical and yet unacceptable response to this. Odds are good that this will get screwed up, and someone will find yet another clever way to inject some command that would benefit them.
Maybe you use popen() instead. Assuming we're talking about the C library version here, guess what? You are also going through a shell, and are just as vulnerable. If you build up a command line and allow me to send you arguments, I can probably make you run my payload.
Python has variants on this. os.system() and os.popen() go through a shell, just like their C namesakes. The usual popen-alike, subprocess.Popen and the helpers built on it, doesn't run through a shell by default, but it will if you explicitly enable it with shell=True. This means it might be vulnerable to this kind of injection if someone switched that feature on.
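For contrast, here's roughly what the no-shell path looks like in Python. This is a sketch, not a recommendation to go wrap everything: the child is just the Python interpreter echoing back its arguments (a stand-in chosen for the demo), and the function name is hypothetical.

```python
import subprocess
import sys

# A tiny child program that just reports the arguments it received.
SHOW_ARGS = "import sys; print(sys.argv[1:])"

def run_without_shell(reason):
    # Passing a list means no shell parses the text; each element
    # becomes exactly one argv entry in the child.
    result = subprocess.run(
        [sys.executable, "-c", SHOW_ARGS, "commit", "-m", reason],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

print(run_without_shell("foo ; touch /tmp/0wned;"))
```

The semicolons arrive as ordinary characters inside a single argument, so nothing extra gets run. Note that this only removes the shell's parsing. The target program still sees whatever text the attacker supplied.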
This is the point where you start grepping through your code base for calls to system() and popen() and whatever flag Python uses to send it through a shell (shell=True, for the subprocess module). Maybe you'll find something and can close the holes before someone else drives a truck through them. Maybe they already have.
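If plain grep isn't handy, a first pass at that search might look something like this. The patterns are a rough starting point and will both miss things and flag harmless lines. The function name and the sample text are invented for the example.

```python
import re

# Rough patterns, not a complete audit: C's system()/popen(),
# Python's os.system()/os.popen(), and subprocess's shell=True.
RISKY = re.compile(r"\b(?:system|popen)\s*\(|shell\s*=\s*True")

def find_shell_calls(source_text):
    """Return (line_number, line) pairs that match a risky pattern."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        if RISKY.search(line):
            hits.append((number, line.strip()))
    return hits

sample = '''
system("git commit -m update");
subprocess.run(cmd, shell=True)
subprocess.run(["git", "status"])
'''
for number, line in find_shell_calls(sample):
    print(number, line)
```

Every hit still needs a human to look at it and decide whether attacker-controlled text can reach that call.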
Something like execve() is somewhat better, since you get to specify the target executable separately from all of the arguments, and there's no shell involved. It actually replaces your process with the new thing, so you usually fork() first, and then you have to figure out all of the stdin/out/err business, potentially closing outstanding fds, not running afoul of "things to not run after fork and before exec", and all of that fun stuff. It also means you control the environment of the new process, and can exclude a bunch of nasty stuff that might otherwise be provided by an attacker.
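Here's a minimal sketch of that fork-then-execve dance, written in Python since its os module exposes the same calls. It assumes a Unix-like system with /usr/bin/env present and Python 3.9 or later (for os.waitstatus_to_exitcode). The function name and ATTACKER_VAR are hypothetical, and a real version would need more care around stdin, stderr, and error reporting.

```python
import os

def run_with_clean_env(path, argv, env):
    """Fork, then execve(path, argv, env) in the child.

    Returns (exit_code, stdout_bytes).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: do as little as possible between fork and exec.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)   # child's stdout goes to the pipe
            os.close(write_fd)
            # No shell, no PATH lookup: exact binary, exact argv, exact env.
            os.execve(path, argv, env)
        finally:
            os._exit(127)          # only reached if something above failed
    # Parent: read everything the child wrote, then reap it.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        output = pipe.read()
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status), output

os.environ["ATTACKER_VAR"] = "surprise"   # pretend something hostile set this
code, output = run_with_clean_env(
    "/usr/bin/env", ["env"], {"PATH": "/usr/bin:/bin"}
)
print(code, output)
```

Since env with no arguments just prints its environment, the output shows what the child actually inherited: the one PATH entry that was handed over, and not ATTACKER_VAR from the parent.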
Still, if you can swing it, the only way to win this game might be to not play at all. If the program you're running is just calling into a library, maybe you can call that library yourself. Then there's no possibility of command injection, and no forking and execing to get right. You just call into it and call it done.
Obviously, this isn't always an option. It's not like you can call into some magic library that'll do what a call to gcc or g++ would otherwise do. There are many more examples of programs which don't exist in terms of user-facing libraries and will only deal with you through their human-facing CLI tools. In those realms, all you can do is wrap the programs and hope for the best.
One thing to consider is the possibility of explicit separation of responsibilities. Maybe you're stuck having to run some other program for some reason. You have no choice in the matter. Well, nothing says you necessarily have to do it inside the same process, right?
To abuse the earlier mention of gcc, one really bizarre approach would be to have something sitting there waiting for the call over loopback (or a Unix domain socket). It would accept a few well-specified parameterized requests, run gcc in a really sparse environment (chroot, "container", that sort of thing), and kick back the results. It could run as a totally different account, too, so if something truly bad happened, it wouldn't have access to much else.
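The "well-specified parameterized requests" part could be sketched like this. Everything here is an assumption made for illustration: the request shape, the allowed optimization levels, the /usr/bin/gcc path, and the function name. It only builds the argv; the socket handling, the jail, and actually running the compiler are left out.

```python
import re

# Hypothetical request shape: {"source": "main.c", "opt": "O2"}.
SOURCE_NAME = re.compile(r"[A-Za-z0-9_]+\.c")
ALLOWED_OPT = {"O0", "O1", "O2"}

def build_gcc_argv(request):
    """Turn a validated request into a fixed-shape argv, or raise ValueError."""
    source = request.get("source", "")
    opt = request.get("opt", "O0")
    # fullmatch rejects paths, leading dashes, spaces, and shell characters.
    if not isinstance(source, str) or not SOURCE_NAME.fullmatch(source):
        raise ValueError("bad source name")
    if not isinstance(opt, str) or opt not in ALLOWED_OPT:
        raise ValueError("bad optimization level")
    # Caller-supplied text only ever fills these narrowly checked slots.
    output_name = source[:-2] + ".o"
    return ["/usr/bin/gcc", "-" + opt, "-c", source, "-o", output_name]

print(build_gcc_argv({"source": "main.c", "opt": "O2"}))
```

The idea is that the requester never gets to send a command line, only a couple of values that either pass a strict check or get the whole request thrown out.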
A cheeky name for it might be "RCE as a service".
Of course, now you have a dispatch problem, and a bunch of lurking processes waiting to be discovered and mistreated, and the matter of authenticating/authorizing requests to those dispatchers, and so on. You might wind up trading one set of problems for another.
In any case, take a few things away from this.
First, subprocesses are a code smell. Avoid them whenever possible.
Second, popen() and system() are really hard to use safely.
Third, if you don't need to allow modifying the environment and/or arguments handed to a subprocess, don't let them anywhere near user-supplied data! Make those things run from hard-coded constant strings or something to that effect. Don't substitute, concatenate or otherwise "taint" the future program's argv[] with data from a potential attacker.
Finally, if you can, wall this stuff off in some sort of jail.
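Going back to the third point, one way to keep argv[] made of constants is to let callers pick from a fixed table instead of supplying text. This is only a sketch. The action names, the /usr/bin/git path, and the function name are all hypothetical.

```python
# Hypothetical action table: callers pick a key, never the command text.
COMMANDS = {
    "status": ("/usr/bin/git", "status", "--short"),
    "log": ("/usr/bin/git", "log", "--oneline", "-n", "10"),
}

def argv_for(action):
    """Return a copy of the hard-coded argv for a known action."""
    try:
        return list(COMMANDS[action])
    except (KeyError, TypeError):
        # Unknown or malformed keys never turn into a command line.
        raise ValueError("unknown action") from None

print(argv_for("status"))
```

Nothing the caller sends ends up in the argument list. The worst they can do is ask for an action that doesn't exist.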