http://qs321.pair.com?node_id=177187


in reply to Untainting safely. (b0iler proofing?)

It depends entirely on the application. We can't say what "unsafe" means without knowing the context. If you're talking about shell metacharacters, it depends on which shell you're using (and which shell the user will be using), and the question is largely moot if you use the multiple-argument (list) form of calls like system and exec, which don't do any shell expansion anyway. If you're talking about unsafe text in HTML, we have things like HTML::Entities.
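As a rough illustration of the list form, here is a minimal sketch. The input string is made up for the example, and the child program is just the running perl ($^X) echoing back its first argument, so nothing here reflects any particular application.

```perl
use strict;
use warnings;

# Hypothetical untrusted value, full of shell metacharacters.
my $untrusted = 'report.txt; echo INJECTED `id` $(id) | cat';

# Demo child program: the running perl ($^X) prints the one argument
# it was given. A real program would go here instead.
my @cmd = ($^X, '-e', 'print "got: $ARGV[0]\n"', $untrusted);

# List form: each element is handed to the program as a separate
# argument, with no shell in between, so the metacharacters stay
# literal text.
my $status = system(@cmd);
die "child failed: $?" if $status != 0;
```

The child receives the whole string as a single argument. Compare that with the one-argument form, system("cat $untrusted"), where the string would be handed to a shell to parse.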

Basically, identify what you're going to be doing with the data, and then figure out how you're going to ensure that this untrusted data is safe.

And no matter how you approach it, don't think of your algorithm as being built to remove bad things. Build it to permit safe things. If this means doing a tr/a-zA-Z0-9_-//cd, which deletes every character outside that set, then that's what you have to do.
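A minimal sketch of the "permit safe things" approach, assuming the permitted set is letters, digits, underscore and hyphen. The input string and the permitted() helper are made up for the example; pick the set that fits what you are actually doing with the data.

```perl
use strict;
use warnings;

# Hypothetical untrusted input.
my $input = "user_name-42; rm -rf /\n";

# Option 1: delete every character that is NOT in the permitted set.
# /c complements the character list, /d deletes what matches.
(my $cleaned = $input) =~ tr/a-zA-Z0-9_-//cd;

# Option 2 (hypothetical helper): accept the value only if all of it
# consists of permitted characters, and reject it otherwise.
sub permitted {
    my ($value) = @_;
    return $value =~ /\A([a-zA-Z0-9_-]+)\z/ ? $1 : undef;
}

print "cleaned: $cleaned\n";
print "rejected\n" unless defined permitted($input);
```

Note that under taint mode (-T) the tr form changes the string but does not clear its taint flag. Perl only untaints data taken from a regex capture, as in the second option, so that is the form to use when the goal is actually to untaint.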