Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Do I have to untaint all user input in a form?

by bradcathey (Prior)
on Nov 14, 2003 at 03:04 UTC ( [id://306991]=perlquestion: print w/replies, xml ) Need Help??

bradcathey has asked for the wisdom of the Perl Monks concerning the following question:

Fellow more-XP'd monks,
In my quest to relearn Perl-a-la-PM, I'm trying to grasp the art of form validation for untainting purposes. My questions here are not about my regexes (I'm open to suggestions, but I do most of the heavy lifting in...(gulp)... Javascript), but want to re-validate in my .pl to keep Taint happy.

My Questions:
1. For the purposes of untainting, do I have to validate every user input value, even though they are all going into a database? I.e., is it overkill? Or is it just good practice? (I know I won't get any errors unless I try to open, eval, etc.)

2. Is there a more efficient way to code my validation calls and subs? (I know there are validation modules out there, like CGI::Validate, but I couldn't find anything specific enough. But hey, I'm open!)

3. I realize these are baby-steps, but foundational. So, am I "getting it?"
#!/usr/bin/perl -wT print "Content-type: text/plain\n\n"; use lib "/Library/WebServer/CGI-Executables/"; #Mac :) use CGI::Carp qw(fatalsToBrowser); use strict; use Validate; use Data::Dumper; use CGI qw(:standard); new CGI; Validate::val_alpha(param('name')); my $name = $Validate::val; Validate::val_date(param('date')); my $date = $Validate::val; Validate::val_phone(param('phone')); my $phone = $Validate::val; Validate::val_email(param('email')); my $email = $Validate::val; print Dumper($name, $date, $phone, $email); #testing only #eventually write to the database __END__
Code for Validate.pm:
package Validate; # use package to declare a module our $val; sub val_alpha { $val = shift; if ($val =~ /^([A-Za-z \-]*)$/) { $val = "$1"; } else { &error_page; } } sub val_phone { $val = shift; if ($val =~ /^[\(]?(\d{3})[\)\.\-]?(\d{3})[\)\.\-]?(\d{4})$/) { $val = "$1-$2-$3"; } else { &error_page; } } sub val_date { $val = shift; if ($val =~ /^(\d{2})-(\d{2})-(\d{4})$/) { $val = "$1-$2-$3"; } else { &error_page; } } sub val_email { $val = shift; if ($val =~ /^([\w\.\-]{3,})@([\w\.\-]{3,})\.([A-Z]{2,3})$/i) { $val = "$1\@$2\.$3"; } else { &error_page; } } sub error_page { print "HTML error page prints here\n"; exit; } 1;

Thanks!

—Brad
"A little yeast leavens the whole dough."

Replies are listed 'Best First'.
Re: Do I have to untaint all user input in a form?
by sauoq (Abbot) on Nov 14, 2003 at 03:21 UTC
    1. For the purposes of untainting, do I have to validate every user input value, even though they are all going into a database? I.e., is it overkill? Or is it just good practice? (I know I won't get any errors unless I try to open, eval, etc.)

    When validating untrusted user input, there's not much that could be called "overkill".

    For starters, see tilly's recent node: Use placeholders. For SECURITY! Consider also whether, the data will ever be used for anything else. It might just be going into a database right now, but what are you going to do with it next year? Better safe than sorry, right?

    As for your validation routines, the one that stands out as being insufficient is the email validation routine. Consider using something a bit more robust, like CGI::Untaint::email perhaps.

    -sauoq
    "My two cents aren't worth a dime.";
    
      Thanks sauoq for the reminder. Great node referral. My next step is to add the code writing to the database using placeholders. That's after I figure out how to write a validator that actually untaint my values (see Zaxo below).

      —Brad
      "A little yeast leavens the whole dough."
Re: Do I have to untaint all user input in a form?
by Zaxo (Archbishop) on Nov 14, 2003 at 03:36 UTC

    To add to sauoq's advice on #1, if you elect to put still-tainted data into a database, make it one which supports tainted data from select. You can untaint now, or when you need it. I like now, bucause I might forget later, and somebody else might not suspect.

    On #2, your Validate.pm doesn't untaint anything you can use. You're shifting the argument into a lexical package global and untainting that. The argument variable remains tainted. You can pass your variables by reference to fix that, or else use the (\$) prototype.

    After Compline,
    Zaxo

      your Validate.pm doesn't untaint anything you can use.
      Now, that's sobering, Zaxo. All that work and it wasn't doing what I intended, My quest grows longers as I now have to figure out what you meant by:
      You can pass your variables by reference to fix that, or else use the (\$) prototype.
      I'm hoping chromatic's code will shed some light.

      —Brad
      "A little yeast leavens the whole dough."

        A second look shows me that, the way you use Validate.pm, you are getting untainted data by immediately using the global $Validate::var. That is an awkward design which demands you call the validation/untaint function each time you need the variable (since $var may have changed in the interim). The solution is to write your functions to validate and untaint the variable you hand them. Here is how to write your val_alpha() function that way (untested),

        # val_alpha validates as [:alpha:], spaces, and hyphens. # Usage: val_alpha \$foo [, \$bar] sub val_alpha { for (@_) { if ($$_ =~ /^([A-Za-z -]*)$/) { $$_ = $1; } else { error_page() } } 1; }
        That should validate and untaint for all time the variables you hand it.

        After Compline,
        Zaxo

Re: Do I have to untaint all user input in a form?
by chromatic (Archbishop) on Nov 14, 2003 at 04:07 UTC

    How about something a little friendlier for your validation routines?

    package Validate; sub word { my ($class, $tainted) = @_; return $1 if $tainted =~ /^([-A-Za-z -]+)$/; error_page(); }

    You could call it more clearly with:

    my $name = Validate->word( param('name' ) );
Re: Do I have to untaint all user input in a form?
by Roger (Parson) on Nov 14, 2003 at 03:31 UTC
    There was a recent node 304934 on Perl CGI scripts with tainting. Personally I think tainting is not really necessary for simple Perl scripts, -w should be enough. Besides, adding -T option in the #! line will cause mod_perl to complain, unless you follow what inman suggested in the discussion to modify your Apache configuration. Besides, you could test the CGI script from the command-line with 'perl -T script.pl' anyway, without the -T option in your #! line.

    tilly has a recent node 306983 on place holders and data validation. I think you might be interested in it.

Re: Do I have to untaint all user input in a form?
by dws (Chancellor) on Nov 14, 2003 at 06:40 UTC
    1. For the purposes of untainting, do I have to validate every user input value, even though they are all going into a database?

    It depends on your schema. If you have text fields that can legitimately contain any old pile of bits, then you can get by without validation. For all other types of fields, bitter experience says to assume the worst. That means validating.

      I strongly agree with the validating bit. It does not depend on your schema.

      It is important to realise that when a form is posted to a server, there is no schema information sent with it. For instance, what you think may be a radio button or check box (and thus an expected string value) can be anything when it is posted from somewhere else.

      A contrived example (which I unfortunately have seen happening in the real world):

      <FORM METHOD=post ACTION=/doit.pl> <INPUT NAME=action TYPE=checkbox VALUE="INSERT INTO table VALUES (1,2, +3)"> <INPUT TYPE=submit> </FORM>

      may coerce you into thinking that the value on the server of the field "action" is either the INSERT statement or nothing.

      However, anybody in the world with access to Perl and LWP.pm, might easily set up a request to "/doit.pl":

      use LWP::UserAgent; my $ua = LWP::UserAgent->new; my $req = HTTP::Request->new(POST => 'http://your.server.com/doit.pl') +; $req->content(q{action=DELETE%20FROM%20table}); $ua->request($req); #kaboom, your table is gone
      So, even if you thought you could trust the value of a checkbox, you can't. You always need to check values coming in from the form. Always.

      Liz

Re: Do I have to untaint all user input in a form?
by hardburn (Abbot) on Nov 14, 2003 at 15:06 UTC

    I do most of the heavy lifting in...(gulp)... Javascript

    If the client has JavaScript shut off, what do you do? JS validation should only be used to save a server request that would be rejected anyway. It doesn't end your responsibility of having to do validation server-side.

    do I have to validate every user input value

    Taint mode won't force you to do so, but it's a good idea.

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    : () { :|:& };:

    Note: All code is untested, unless otherwise stated

      Thanks hardburn for that reminder.

      The reason I "gulped" originally, was because I know the bias around the monastery against JS. And I'm starting to agree with a lot of it, including your comment:
      JS validation should only be used to save a server request that would be rejected anyway.
      Which is precisely the way I intend to use it (now that I catching the security mojo here at PMs). Until I started coming to the monastery, I never met anyone who turned off javascript. And for those who don't, which is probably the average surfer—good or bad, client-side validation can very quickly check values, and limit the server side checks, which take more time and have to refresh screens, etc.

      What I'm attempting to do is provide a fast pre-validation for those who want it (or haven't disabled JS), and then the more critical, security conscience validation on the server side.

      All that to say, your point is well-taken.

      —Brad
      "A little yeast leavens the whole dough."
        The only problem i have with JS validation is that it is going to produce redundant code. You see that JS validation is no substitute for server-side validation, as tools such as LWP and crew make it very easy to bypass your front-end "security". As long as you don't mind maintaining both the JS and server validation, then god-speed to you, but i still think that JS validation is waste of my time and my client's money. ;)

        UPDATE:
        I can admit that what sauoq says is good - if you do have some expensive submission that will benefit from being validated on the client side first, well ... go for it. Some redundancy isn't so bad, especially when you compare the chore of keeping it consistant to ... washing dishes. :D It just ain't so bad after all. ;)

        jeffa

        L-LL-L--L-LL-L--L-LL-L--
        -R--R-RR-R--R-RR-R--R-RR
        B--B--B--B--B--B--B--B--
        H---H---H---H---H---H---
        (the triplet paradiddle with high-hat)
        
Re: Do I have to untaint all user input in a form?
by Notromda (Pilgrim) on Nov 17, 2003 at 01:23 UTC
    Look up Data::FormValidator on CPAN. It really is worth the trouble to learn; Once you know how to use it, data validation becomes a snap.
Re: Do I have to untaint all user input in a form?
by jryan (Vicar) on Nov 19, 2003 at 19:03 UTC

    Not really a comment on your tainting, but more on your design. What you really want to do here is call the sub associated with each param, right? Well, we can make this easier with a dispatch table! Change the top of Validate.pm to:

    package Validate; use Exporter::Dispatch;

    Now, in your main code, you can change your validation code to:

    my $validate = create_dptable Validate; my %params; # store params in a hash. Use symrefs to obtain original +behaivor. # (i.e. $$_ = ...) foreach (params()) { $params{$_} = exists $validate->{"val_$_") ? $validate->{"val_$_"}->(param($_)) : $validate->{'error_page'}->($_) }

    Simple, elegant, and will scale if you ever add more parameters. All you will have to do is add the validation routine, and it will get called automatically, without any change needed in the original code. Better yet, if any bogus parameters sneak in, your error routine will get called.

    Update: Oh, I didn't notice at first; val_alpha's name would need to be changed to val_name to be called correctly.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://306991]
Approved by sauoq
Front-paged by bart
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others learning in the Monastery: (2)
As of 2024-04-25 06:32 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found