Make it harder to screw up and make unsafe things obvious
Right up front I want to say something: I am not a PHP programmer. I've had to support the runtime for people, and I've modified existing code in the course of getting other things to run, but I've never willingly written new code in it. There are probably things I'm missing here.
With that said, onto the rant.
Last weekend, I was trolling the /new page on Hacker News looking for some inspiration between projects, and found someone who had posted a link to a personal web site. That web site told a story about creating a web app which acted like a guest book but would also attempt to collect some info about why the person was visiting. The idea was to have it run on a tablet in your business to "capture" people in real life.
There was a link to see the source code for this project, so I clicked through and started sniffing around to see how it worked. I wanted to get some idea of where this person was coming from and what sort of techniques they used or possibly avoided in the course of creating it. What I found was not encouraging.
In the first file I checked, I found something like this:
$query = "SELECT * FROM table WHERE `name` = '$_POST[name]' ...
I won't even get into the whole thing about "SELECT *" versus explicitly listing columns and why you don't want to get your DBAs angry at you. This is more about how the query was being constructed. It's a string which then gets handed to MySQL and as a result has the ability to do a great many things.
Since it's being constructed that way, the user-supplied input (in this case the contents of a POST) gets a free ride right into that string. This can open the door to all kinds of badness in the form of SQL injection attacks. It's old, old news, just like how using certain techniques in other languages will open you to buffer overflows and attacks in that realm.
It's almost boring because it's that old. Yet, it persists.
This got me thinking about the human side of this equation. Here we have someone who is trying to make something work, and has created a vector for a well-known attack. I assume this happens because string construction looks like the most direct route, and the countless examples which build queries that way do nothing to discourage it. There are always better ways (like parameterization), but it's going to take something special to make the world take notice.
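For anyone who hasn't seen parameterization, here is a minimal sketch of what it looks like with PDO. To keep it self-contained I'm assuming an in-memory SQLite database in place of the MySQL server from the original, and the "guests" table and its columns are made up for illustration. Treat it as one possible shape, not the fix for that particular project.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL, and the "guests"
// table and its columns are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE guests (name TEXT, reason TEXT)');
$pdo->exec("INSERT INTO guests (name, reason) VALUES ('alice', 'demo')");

// Stand-in for $_POST['name'], carrying a classic injection attempt.
$name = "nobody' OR '1'='1";

// The SQL text is fixed; the value travels separately as a bound parameter.
$stmt = $pdo->prepare('SELECT name, reason FROM guests WHERE name = :name');
$stmt->execute([':name' => $name]);
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

// The hostile string is compared as plain data, so nothing matches.
echo count($rows), " row(s) matched\n";
```

The point is that the user's text never becomes part of the query string, so there's nothing for it to break out of.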
My thoughts turned to the idea of "taint". This is where you have the ability to flag a variable as containing user (and thus, untrusted) input. It's a latch, meaning once you set this flag on a variable, it can't be cleared. It also "rubs off", so if you use the data from a tainted variable A to populate variable B, B is also now tainted.
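To make the latch and the "rubs off" behavior concrete, here's a rough userland sketch. The Value class and its method names are entirely hypothetical; stock PHP has nothing like this built in, and a real implementation would live inside the interpreter rather than in a wrapper.

```php
<?php
// Hypothetical value wrapper, invented for illustration only.
final class Value
{
    private string $text;
    private bool $tainted;

    private function __construct(string $text, bool $tainted)
    {
        $this->text = $text;
        $this->tainted = $tainted;
    }

    public static function trusted(string $text): self
    {
        return new self($text, false);
    }

    public static function fromUser(string $text): self
    {
        return new self($text, true);
    }

    // The result is tainted if either input was: the flag "rubs off".
    public function concat(Value $other): self
    {
        return new self($this->text . $other->text, $this->tainted || $other->tainted);
    }

    public function isTainted(): bool
    {
        return $this->tainted;
    }

    public function text(): string
    {
        return $this->text;
    }

    // Deliberately no method to clear the flag: it is a latch.
}

$prefix   = Value::trusted('Hello, ');
$name     = Value::fromUser('Mallory'); // stand-in for $_POST['name']
$greeting = $prefix->concat($name);

var_dump($prefix->isTainted());   // bool(false)
var_dump($greeting->isTainted()); // bool(true)
```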
I went digging around and found that while PHP apparently has some third-party extensions to add taint, it does not do it natively. If I'm wrong, write in and tell me, but nothing I found suggests that the version typically encountered by someone in a random LAMP environment will have this configured and running full-time.
This seems to be the root of the problem. I want to see them do something bold. Put in full-time taint checking. Make functions which run SQL queries holler if someone passes in a tainted variable. Make it so you have to set some kind of special "yes, I accept that this will open a gaping hole" flag per source file if you want to disable it. Make it fail to run properly if you don't do it right or turn it off.
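Here's roughly what "holler at the sink" could look like from the caller's side. Everything in this sketch is hypothetical: the UserInput marker class, the run_query function, and the loudly named escape hatch are all invented, and it skips taint propagation entirely to stay short. A real version would be enforced by the runtime, not by a polite wrapper.

```php
<?php
// All names here are invented for illustration; none of this is part of PHP.
final class UserInput
{
    public string $text;

    public function __construct(string $text)
    {
        $this->text = $text;
    }
}

final class TaintedQueryException extends RuntimeException
{
}

// Sink: accepts trusted SQL text (plain strings) and rejects tainted values.
function run_query($sql): string
{
    if ($sql instanceof UserInput) {
        throw new TaintedQueryException(
            'Refusing to run a query built from user input; bind it as a parameter instead.'
        );
    }
    return "would run: $sql"; // a real sink would hand this to the database
}

// The loud, greppable escape hatch for people who insist.
function run_query_YES_I_ACCEPT_THE_GAPING_HOLE($sql): string
{
    return 'would run: ' . ($sql instanceof UserInput ? $sql->text : $sql);
}

$fromPost = new UserInput("x' OR '1'='1"); // stand-in for $_POST['name']

try {
    run_query($fromPost);
    echo "query ran\n";
} catch (TaintedQueryException $e) {
    echo 'blocked: ', $e->getMessage(), "\n";
}
```

The error message doubles as the documentation pointer: the person who hits it gets told what to do instead.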
Then you can turn people loose on it. They'll still try to write their potentially-vulnerable code, but this time, it won't work. Then, instead of thinking they're done, they will now have a new problem to solve, and in chasing down the problem, might encounter some useful documentation which says "please don't build your queries this way".
Of course, there's also the possibility that someone will just drop the "shut up and leave me alone" setting into all of their source files, but that's okay too, since it makes code auditing really simple. All you have to do is make it eminently greppable, and then someone like me can just grep the whole tree for it upon encountering a new project. If it shows up, you know the mindset of the author and can decide whether it's worth your while to proceed.
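The audit step is simple enough to sketch. The marker string below is a placeholder I made up, since the real flag name would be whatever the runtime picked, and I'm assuming only files ending in .php matter.

```php
<?php
// Hypothetical marker; the real flag name would be whatever the runtime chose.
const OPT_OUT_MARKER = 'YES_I_ACCEPT_THE_GAPING_HOLE';

// Returns the sorted paths of all .php files under $root containing the marker.
function find_opt_outs(string $root, string $marker = OPT_OUT_MARKER): array
{
    $hits = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile() && $file->getExtension() === 'php') {
            $contents = file_get_contents($file->getPathname());
            if ($contents !== false && strpos($contents, $marker) !== false) {
                $hits[] = $file->getPathname();
            }
        }
    }
    sort($hits);
    return $hits;
}

// Usage: php find_opt_outs.php /path/to/project
if (isset($argv[1])) {
    foreach (find_opt_outs($argv[1]) as $path) {
        echo $path, "\n";
    }
}
```

An empty result means nobody reached for the escape hatch, at least not under that name.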
Obviously, this would have no effect for the legions of machines which are running older builds of PHP before this vaporware always-on taint thing was released. To address that, allow me to tilt at another windmill and come up with yet another vaporware concept.
There should be a test program which purposely does bad things with tainted data. It would be the sort of thing which the new hypothetical taint-always-on PHP version would reject right away. You would use this to verify if the runtime environment on someone's host was actually enforcing taint or not.
Once this existed, you could ask a new PHP programmer to run this program and post the results. If it says something like "taint check not present in this interpreter", then you know there's no point in grepping the code for the tell-tale "go away" string, since the odds they ran into the roadblock are slim.
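Since the always-on checker doesn't exist, the closest I can sketch is a probe which looks for the third-party "taint" extension instead. I'm assuming it registers under the name taint and exposes a taint.enable setting; check that against the extension's own documentation before trusting it. A proper probe would go further and actually attempt something dumb with tainted data to see whether the runtime objects, which this one does not do.

```php
<?php
// Sketch of a probe. It only detects the third-party "taint" extension
// (an assumption about what to look for); it does not exercise enforcement.
function taint_probe(bool $extensionLoaded, $enableSetting): string
{
    if (!$extensionLoaded) {
        return 'taint check not present in this interpreter';
    }
    // ini_get() returns false for unknown settings, otherwise a string.
    $enabled = in_array(strtolower((string) $enableSetting), ['1', 'on', 'true', 'yes'], true);
    if (!$enabled) {
        return 'taint extension loaded but not enabled';
    }
    return 'taint check present and enabled';
}

echo 'PHP ', PHP_VERSION, ': ',
    taint_probe(extension_loaded('taint'), ini_get('taint.enable')), "\n";
```

The single line it prints is what you'd ask someone to paste into their post.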
Basically, I'm coming at this from an angle of trying to be a helpful web forum participant without trying to get too involved with what could be a big bag of hurt. Having a few more tools available would reduce some of the round-trips required to know where someone is and how much help they might need.
So, to review:
1. They need to run the taint checker and include the output. If they haven't run it, they will be asked to do so before anything else will happen.
2. If the taint checker is shown to be running on their host, then a grep for the "disable taint check" setting will be run. If it's there, it might be better to ignore this person since they're purposely aiming a loaded gun at their feet.
3. At this point, the checker has been shown to work (or they're lying, I suppose) and they haven't intentionally disabled it in their code (or they're hiding it really well), and you can start looking for more interesting things to audit.
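Those three steps boil down to a small decision function. This is just a sketch: the function name, its inputs, and the verdict strings are all invented, and it assumes the probe reports a missing checker with the words "not present".

```php
<?php
// Hypothetical helper mirroring the review steps; all strings are invented.
// $probeOutput is null when the person has not posted any checker output.
function triage(?string $probeOutput, bool $optOutFound): string
{
    if ($probeOutput === null) {
        return 'ask them to run the taint checker first';
    }
    if (strpos($probeOutput, 'not present') !== false) {
        return 'no enforcement on this host, so the opt-out grep tells you nothing';
    }
    if ($optOutFound) {
        return 'taint check deliberately disabled, consider walking away';
    }
    return 'checker works and is not disabled, go audit the interesting parts';
}

echo triage('taint check present and enabled', false), "\n";
```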
Things like this would make it a lot easier to be helpful and not mean.
Finally, nothing I said is unique to PHP or this program. This could apply to anything potentially unsafe in code. Lots of languages might benefit from something like this.
Your language might call it "use strict" or "-Wall" or something else entirely, but establishing that you've already passed that hurdle tends to make other people treat you more seriously when you go asking for help.