PerlMonks  

Re^3: Using relative paths with taint mode

by haj (Vicar)
on Jun 20, 2021 at 10:29 UTC (id://11134060)


in reply to Re^2: Using relative paths with taint mode
in thread Using relative paths with taint mode

This is exactly the consideration I wanted you to make. Perl doesn't know that your script is supposed to be called via the web, but you do. Your reasoning about server admins is OK: they would not need to exploit your use of $Bin to do harm.

I'd change something which isn't related to taint mode, though: In your setup with use lib $Bin;, you have your libraries within the cgi-bin path. This is unhygienic since your libraries are now exposed to attacks from the web. At least you need to consider what happens if someone points his browser to http://your.stuff/cgi-bin/Site/HTML.pm.

In a typical CPAN-like setup you have two different directories for scripts and libraries, so you'd usually end up with use lib "$RealBin/../lib";. This would allow you to install that stuff "somewhere" and then symlink to the script (and only to the script) from your cgi-bin directory. That way, only the script's URL is exposed, and $RealBin will resolve the symlink and find the installation directory with the libraries for you. The web server might need a directive to allow following symlinks for this to work.
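
A minimal sketch of that layout under taint mode, assuming (as discussed earlier in this thread) that the path FindBin reports is tainted and has to be untainted before use lib can safely use it. The variable name $lib_dir and the character whitelist are illustrative choices, not something from the original setup; adjust the pattern to the paths you actually expect.

```perl
# Run under taint mode (-T on the shebang line, as elsewhere in this thread).
use strict;
use warnings;
use FindBin qw($RealBin);

my $lib_dir;    # hypothetical name, filled in at compile time below
BEGIN {
    # Capturing through a regex untaints the value. The whitelist is
    # deliberately narrow: word characters, dots, slashes and hyphens.
    ($lib_dir) = "$RealBin/../lib" =~ m{\A([\w./-]+)\z}
        or die "Unexpected characters in library path\n";
}
use lib $lib_dir;

# Modules below the installation's lib directory can now be loaded
# with a plain "use".
```

The BEGIN block is needed because use lib runs at compile time, so the untainted path has to exist before that statement is compiled.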

Replies are listed 'Best First'.
Re^4: Using relative paths with taint mode
by Bod (Parson) on Jun 20, 2021 at 13:08 UTC
    At least you need to consider what happens if someone points his browser to http://your.stuff/cgi-bin/Site/HTML.pm

    I exclude access to all the subdirectories of cgi-bin with a .htaccess file.

      I exclude access to all the subdirectories of cgi-bin with a .htaccess file.

      Forget to create that file or forget to enable .htaccess files and you are screwed.

      Moving libraries (and configuration) out of the webserver's document root and outside any aliased directory not only completely avoids those errors, but is also faster. The webserver does not have to parse .htaccess files. Quoting https://httpd.apache.org/docs/current/howto/htaccess.html#when (Apache v2.4 at time of writing):

      In general, you should only use .htaccess files when you don't have access to the main server configuration file. There is, for example, a common misconception that user authentication should always be done in .htaccess files, and, in more recent years, another misconception that mod_rewrite directives must go in .htaccess files. This is simply not the case. You can put user authentication configurations in the main server configuration, and this is, in fact, the preferred way to do things. Likewise, mod_rewrite directives work better, in many respects, in the main server configuration.

      [...]

      However, in general, use of .htaccess files should be avoided when possible. Any configuration that you would consider putting in a .htaccess file, can just as effectively be made in a <Directory> section in your main server configuration file.

      There are two main reasons to avoid the use of .htaccess files.

      The first of these is performance. When AllowOverride is set to allow the use of .htaccess files, httpd will look in every directory for .htaccess files. Thus, permitting .htaccess files causes a performance hit, whether or not you actually even use them! Also, the .htaccess file is loaded every time a document is requested.

      Further note that httpd must look for .htaccess files in all higher-level directories, in order to have a full complement of directives that it must apply. (See section on how directives are applied.) Thus, if a file is requested out of a directory /www/htdocs/example, httpd must look for the following files:

      • /.htaccess
      • /www/.htaccess
      • /www/htdocs/.htaccess
      • /www/htdocs/example/.htaccess

      And so, for each file access out of that directory, there are 4 additional file-system accesses, even if none of those files are present. [...]

      [...]

      In the case of RewriteRule directives, in .htaccess context these regular expressions must be re-compiled with every request to the directory, whereas in main server configuration context they are compiled once and cached. Additionally, the rules themselves are more complicated, as one must work around the restrictions that come with per-directory context and mod_rewrite. [...]
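
      (Not part of the quoted documentation.) The directory walk described in the quote can be illustrated with a few lines of Perl. This is only a sketch that reproduces the list from the example above; it is not how httpd implements the lookup, and the function name htaccess_candidates is made up for this illustration.

```perl
use strict;
use warnings;

# Return the .htaccess paths that would be looked up for a directory,
# from the filesystem root down to the directory itself.
sub htaccess_candidates {
    my ($dir) = @_;
    my @parts      = grep { length } split m{/}, $dir;
    my @candidates = ('/.htaccess');
    my $path       = '';
    for my $part (@parts) {
        $path .= "/$part";
        push @candidates, "$path/.htaccess";
    }
    return @candidates;
}

# Prints the four paths listed in the quoted example, one per line.
print "$_\n" for htaccess_candidates('/www/htdocs/example');
```

      Every additional directory level adds one more lookup per request, which is where the cost comes from.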

      Update:

      I usually place only a minimal CGI below document root or in an aliased directory, and have all remaining code and configuration elsewhere, inaccessible to HTTP clients. Something like this (untested):

      #!/usr/bin/perl -T
      use strict;
      use warnings;
      use CGI::Carp qw( fatalsToBrowser ); # <-- not always
      use lib '/opt/my-app/lib';
      use My::App;
      My::App->run();

      All remaining code is in My::App or loaded by My::App. A welcome side-effect is that almost all errors occur in modules loaded after CGI::Carp is active, and so I get reasonable error messages in the browser during development.
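
      For completeness, a purely hypothetical My::App that such a stub could load, assuming it lives at /opt/my-app/lib/My/App.pm. The response method and the plain-text reply are invented for illustration; a real application would do its actual work here.

```perl
package My::App;
use strict;
use warnings;

# Build a complete CGI response: header block, blank line, body.
sub response {
    my ($class) = @_;
    return "Content-Type: text/plain; charset=utf-8\r\n\r\n"
         . "Hello from My::App\n";
}

# Entry point called by the minimal CGI script.
sub run {
    my ($class) = @_;
    print $class->response;
    return 1;
}

1;
```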

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

        Just found some numbers on .htaccess by nginx:

        https://www.nginx.com/resources/wiki/start/topics/examples/likeapache-htaccess/

        This may look like comparing Apache vs. nginx, but it is not. It is comparing .htaccess enabled (the Apache case) vs. .htaccess disabled / not implemented (the nginx case). Apache with AllowOverride none should be the same as nginx.

        To explain: Each FS stat and each FS read is "expensive" and should be avoided where possible for a fast webserver. .htaccess forces more FS stats and more FS reads, and so the numbers for .htaccess enabled (the Apache case) are way higher than for .htaccess disabled / not implemented (the nginx case).

        Alexander

        In general, you should only use .htaccess files when you don't have access to the main server configuration file.

        This is on shared hosting so I don't have access to the server configuration file!
