
Controlling Inputted Paths in a CGI Script

by ajt (Prior)
on Oct 31, 2001 at 18:05 UTC (#122341)

ajt has asked for the wisdom of the Perl Monks concerning the following question:

Hi, a while ago I asked for advice on a simple site mapping tool (see Web Site Mapping Tool). I got a few useful messages, many thanks, and so went ahead and built it. It works great, and I'm happy with it, except that I know it's got at least one big security hole!

I'm working within a secured intranet, so I don't expect many hacking attempts, but you can never tell, plus I can't be sure that someone else won't use the script within the company in a less secure environment - who ever reads warnings and comments when they are in a hurry?

The problem lies around the way that the script works out which directory tree to map. Here is a summary of how it works:

  • An HTML page containing an XML tag that refers to the script is requested by the browser. The templating system calls the script, passing it some parameters to let it know where it was called from.
  • The script starts, works out which virtual host it was called from, and looks up that host's config file. If it can't find one, e.g. for a fake virtual host, then it dies.
  • After this it works out where it was called from (via the parameters) and dies if the start of the physical path does not match the one it's allowed to use.
  • It passes the directory off to File::Find, and away it goes.

The obvious problem with this is that if you pass by hand a path such as /allowed/../../notallowed, where the approved path is /allowed/, it gets through, and File::Find then traverses a space it's not supposed to (directory permissions aside).
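
To make the hole concrete, here is a minimal sketch of a start-of-path check of the kind described above. The real script isn't shown in this thread, so the sub name naive_is_allowed and the sample paths are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the script's check: accept any path that
# merely *starts with* the approved directory.
sub naive_is_allowed {
    my ($path, $allowed) = @_;
    return index($path, $allowed) == 0 ? 1 : 0;
}

my $allowed = '/allowed/';
for my $path ('/allowed/docs', '/allowed/../../notallowed', '/elsewhere') {
    printf "%-28s %s\n", $path,
        naive_is_allowed($path, $allowed) ? 'accepted' : 'rejected';
}
```

The second path is accepted because the comparison only looks at the text of the path, not at where it ends up once the ".." parts are followed.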

I've done the obvious of obliterating any "..", but I know that there are many more ways to bypass this. Taint and detainting won't help directly either.

CHROOT isn't an option on my NT box, and annoyingly NT permissions don't prevent much either - as currently set. On a Linux box (next phase) I should be able to use CHROOT and file permissions to control the script a bit better, but I'd like the script to be more robust by default.

QUESTION: How do I take in a path from the outside, and verify that it's safe to pass to File::Find?

As ever, humble thanks in advance.

Replies are listed 'Best First'.
Re: Controlling Inputted Paths in a CGI Script
by hatter (Pilgrim) on Oct 31, 2001 at 19:10 UTC
    I think the general rule with anything is to explicitly permit what you think is acceptable, rather than deny what you can think of that shouldn't be there. So, if you think the path should contain letters and shouldn't contain % or $, then don't test against [^%$] but instead look for [a-z0-9]. That'll not only stop people putting in NT variables, but also disallow all sorts of weird character escape methods.

    From what you've already said, you can assume your environment is fairly sanely set up, so maybe specify that the first character after a directory separator must be alphanumeric, which will stop people using blah/., blah/../, blah/.somethingsomeonehid/, etc.

    There are probably lots of other examples of things your environment wouldn't have, so to begin with, put in something exceedingly restrictive like /\A([0-9a-z -]+\\)*[a-z0-9 -]+\z/i (anchored at both ends, so the whole path has to match), then come up with several real, acceptable paths and make sure they'd work. If not, loosen your definition by as little as possible to permit them to work.
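
    As a rough sketch of that approach, with the pattern anchored so that the whole string must match: the sub name looks_safe and the sample paths are invented for the example, and the character set would need adjusting to whatever your real paths contain.

```perl
use strict;
use warnings;

# Restrictive whitelist: backslash-separated components made only of
# letters, digits, spaces and hyphens. \A and \z anchor the pattern so
# the *whole* string has to match, not just some part of it.
my $SAFE_PATH = qr/\A(?:[0-9a-z -]+\\)*[0-9a-z -]+\z/i;

sub looks_safe {
    my ($path) = @_;
    return $path =~ $SAFE_PATH ? 1 : 0;
}

# Made-up sample paths: acceptable ones first, then some that must fail.
for my $path ('docs\\site map\\2001', 'docs',
              'docs\\..\\secret', '%TEMP%\\x', 'docs\\.hidden') {
    printf "%-22s %s\n", $path, looks_safe($path) ? 'ok' : 'rejected';
}
```

    Because dots are not in the permitted set at all, "..", "." and hidden-file names are rejected without needing a special case.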

    the hatter

Re: Controlling Inputted Paths in a CGI Script
by monkfish (Pilgrim) on Oct 31, 2001 at 18:50 UTC
    If you want to accept all valid paths and file names and avoid anything unsafe you'd need to do something more complicated like split on the / and check each element individually.
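
    A minimal sketch of that per-element check might look like the following. The sub name check_components and the particular rules applied to each element are assumptions; which characters count as unsafe depends on your platform and setup.

```perl
use strict;
use warnings;

# Check a slash-separated path one component at a time.
# Returns the list of components if all pass, or an empty list (undef
# in scalar context) as soon as one fails.
sub check_components {
    my ($path) = @_;
    # Drop empty parts produced by a leading "/" or by "//".
    my @parts = grep { length } split m{/}, $path;
    for my $part (@parts) {
        return if $part eq '.' or $part eq '..';   # no current/parent directory tricks
        return if $part =~ /[\0\\:]/;              # no NUL, backslash or colon
        return if $part =~ /\A\s|\s\z/;            # no leading or trailing whitespace
    }
    return @parts;
}

for my $path ('/allowed/docs/img', '/allowed/../../notallowed') {
    my @parts = check_components($path);
    print "$path: ", (@parts ? "ok (@parts)" : 'rejected'), "\n";
}
```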

    However, if you are willing to say "I don't care about all legal file names; my files will be limited to alphanumerics, underscore, space, dash, slash and dot" (which seems reasonable), then remove everything else and collapse multiple dots into one.

    $file =~ s@[^\w/. -]@@g;   # strip anything outside the permitted set
    $file =~ s/\.+/./g;        # collapse runs of dots, so ".." becomes "."

    -monkfish (The Fishy Monk)
Re: Controlling Inputted Paths in a CGI Script
by dmmiller2k (Chaplain) on Oct 31, 2001 at 21:00 UTC
    I've done the obvious of obliterating any "..", but I know that there are many more ways to bypass this.

    You could:

    1. Save the current directory
    2. chdir() to the directory in question
    3. get the new current directory
    4. chdir() back to the saved directory
    5. return the "new current directory" from step 3
    solving at least one problem.

    From that point, you may have to brute-force search the resulting pathname (e.g., split() on '/', examine each component, etc.)
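
    The five steps could be sketched roughly as below, using getcwd from the core Cwd module. The sub names resolve_dir and is_under are made up, and the prefix comparison assumes forward-slash separators in what getcwd returns.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Resolve a directory to the path the operating system reports after
# actually changing into it. Returns undef if the directory is unusable.
sub resolve_dir {
    my ($dir) = @_;
    my $saved = getcwd();                               # 1. save the current directory
    chdir $dir or return;                               # 2. chdir() to the directory in question
    my $resolved = getcwd();                            # 3. get the new current directory
    chdir $saved or die "Cannot return to $saved: $!";  # 4. chdir() back
    return $resolved;                                   # 5. return the resolved path
}

# True if $dir, once resolved, is $base itself or somewhere below it.
sub is_under {
    my ($dir, $base) = @_;
    my $real_dir  = resolve_dir($dir)  or return 0;
    my $real_base = resolve_dir($base) or return 0;
    return 1 if $real_dir eq $real_base;
    (my $prefix = $real_base) =~ s{/?\z}{/};            # ensure exactly one trailing slash
    return index($real_dir, $prefix) == 0 ? 1 : 0;
}
```

    Since the directory is really visited, ".." and symbolic links are both resolved by the operating system rather than by string manipulation. Cwd's abs_path does a similar resolution in a single call.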

    Perhaps not that helpful ... sorry.


    You can give a man a fish and feed him for a day ...
    Or, you can teach him to fish and feed him for a lifetime
Re: Controlling Inputted Paths in a CGI Script
by tfrayner (Curate) on Oct 31, 2001 at 22:40 UTC
    You may want to check out File::Spec, which has a set of methods that may be used to sanitize a given path or filename, if I recall correctly...

    Update: Having just peeked at the docs, it looks like the $convertedpath = File::Spec->abs2rel($path, $base) method might help, where $path is the path to be checked and $base is the highest level you want to allow access to within the filesystem. E.g. constructions such as $path = "/allowed/../../notallowed" should return "../../notallowed" if $base eq "/allowed". You can then check for a match with /^\.\./ and discard dubious paths.

    Update2: Now that I stop to think about it, you could then use the converse function, File::Spec->rel2abs($convertedpath,$base) to get a properly qualified, cleaned up path ( i.e. no /../../). Note that a physical check on the filesystem is not (usually, depending on OS) done during these operations.

    Update^3: In what could perhaps be viewed as irony (or perhaps just plain inconvenience), it looks like you may have to move to the Linux phase anyway to implement this idea, as it doesn't look like the File::Spec::Win32 module supports these methods (File::Spec::Unix does, of course).
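
    Here is a hedged sketch of the abs2rel/rel2abs idea. It calls File::Spec::Unix directly so that it behaves the same on any platform (my choice, not something from the docs), the sub name confine is invented, and both arguments are assumed to be absolute paths. One addition to the description above: the sketch rejects ".." in any position, because a path such as /allowed/x/../../notallowed comes back as "x/../../notallowed", which does not start with ".." but still climbs out.

```perl
use strict;
use warnings;
use File::Spec::Unix;

# Return a cleaned absolute path if $path stays inside $base, else undef.
# Purely textual: nothing here looks at the real filesystem or symlinks.
sub confine {
    my ($path, $base) = @_;
    my $rel = File::Spec::Unix->abs2rel($path, $base);
    # Reject ".." anywhere, not just at the front.
    return if grep { $_ eq '..' } File::Spec::Unix->splitdir($rel);
    return File::Spec::Unix->rel2abs($rel, $base);
}

for my $path ('/allowed/docs/img', '/allowed/../../notallowed',
              '/allowed/x/../../notallowed') {
    my $clean = confine($path, '/allowed');
    print "$path: ", (defined $clean ? $clean : 'rejected'), "\n";
}
```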

