bradcathey has asked for the wisdom of the Perl Monks concerning the following question:
Fellow more-XP'd monks,
In my quest to relearn Perl-a-la-PM, I'm trying to grasp the art of form validation for untainting purposes. My questions here are not about my regexes (I'm open to suggestions, but I do most of the heavy lifting in... (gulp)... Javascript), but want to re-validate in my .pl to keep Taint happy.
My Questions:
1. For the purposes of untainting, do I have to validate every user input value, even though they are all going into a database? I.e., is it overkill? Or is it just good practice? (I know I won't get any errors unless I try to open, eval, etc.)
2. Is there a more efficient way to code my validation calls and subs? (I know there are validation modules out there, like CGI::Validate, but I couldn't find anything specific enough. But hey, I'm open!)
3. I realize these are baby-steps, but foundational. So, am I "getting it?"
#!/usr/bin/perl -wT
print "Content-type: text/plain\n\n";
use lib "/Library/WebServer/CGI-Executables/"; #Mac :)
use CGI::Carp qw(fatalsToBrowser);
use strict;
use Validate;
use Data::Dumper;
use CGI qw(:standard);
new CGI;
Validate::val_alpha(param('name'));
my $name = $Validate::val;
Validate::val_date(param('date'));
my $date = $Validate::val;
Validate::val_phone(param('phone'));
my $phone = $Validate::val;
Validate::val_email(param('email'));
my $email = $Validate::val;
print Dumper($name, $date, $phone, $email); #testing only
#eventually write to the database
__END__
Code for Validate.pm:
package Validate; # use package to declare a module
our $val;
sub val_alpha {
$val = shift;
if ($val =~ /^([A-Za-z \-]*)$/) {
$val = "$1";
} else {
&error_page;
}
}
sub val_phone {
$val = shift;
if ($val =~ /^[\(]?(\d{3})[\)\.\-]?(\d{3})[\)\.\-]?(\d{4})$/) {
$val = "$1-$2-$3";
} else {
&error_page;
}
}
sub val_date {
$val = shift;
if ($val =~ /^(\d{2})-(\d{2})-(\d{4})$/) {
$val = "$1-$2-$3";
} else {
&error_page;
}
}
sub val_email {
$val = shift;
if ($val =~ /^([\w\.\-]{3,})@([\w\.\-]{3,})\.([A-Z]{2,3})$/i) {
$val = "$1\@$2\.$3";
} else {
&error_page;
}
}
sub error_page {
print "HTML error page prints here\n";
exit;
}
1;
Thanks!
—Brad "A little yeast leavens the whole dough."
Re: Do I have to untaint all user input in a form?
by sauoq (Abbot) on Nov 14, 2003 at 03:21 UTC
|
1. For the purposes of untainting, do I have to validate every user input value, even though they are all going into a database? I.e., is it overkill? Or is it just good practice? (I know I won't get any errors unless I try to open, eval, etc.)
When validating untrusted user input, there's not much that could be called "overkill".
For starters, see tilly's recent node: Use placeholders. For SECURITY! Consider also whether, the data will ever be used for anything else. It might just be going into a database right now, but what are you going to do with it next year? Better safe than sorry, right?
As for your validation routines, the one that stands out as being insufficient is the email validation routine. Consider using something a bit more robust, like CGI::Untaint::email perhaps.
-sauoq
"My two cents aren't worth a dime.";
| [reply] |
|
| [reply] |
Re: Do I have to untaint all user input in a form?
by Zaxo (Archbishop) on Nov 14, 2003 at 03:36 UTC
|
To add to sauoq's advice on #1, if you elect to put still-tainted data into a database, make it one which supports tainted data from select. You can untaint now, or when you need it. I like now, bucause I might forget later, and somebody else might not suspect.
On #2, your Validate.pm doesn't untaint anything you can use. You're shifting the argument into a lexical package global and untainting that. The argument variable remains tainted. You can pass your variables by reference to fix that, or else use the (\$) prototype.
| [reply] |
|
| [reply] |
|
A second look shows me that, the way you use Validate.pm, you are getting untainted data by immediately using the global $Validate::var. That is an awkward design which demands you call the validation/untaint function each time you need the variable (since $var may have changed in the interim). The solution is to write your functions to validate and untaint the variable you hand them. Here is how to write your val_alpha() function that way (untested),
# val_alpha validates as [:alpha:], spaces, and hyphens.
# Usage: val_alpha \$foo [, \$bar]
sub val_alpha {
for (@_) {
if ($$_ =~ /^([A-Za-z -]*)$/) {
$$_ = $1;
}
else {
error_page()
}
}
1;
}
That should validate and untaint for all time the variables you hand it.
| [reply] [d/l] |
|
Re: Do I have to untaint all user input in a form?
by chromatic (Archbishop) on Nov 14, 2003 at 04:07 UTC
|
package Validate;
sub word
{
my ($class, $tainted) = @_;
return $1 if $tainted =~ /^([-A-Za-z -]+)$/;
error_page();
}
You could call it more clearly with:
my $name = Validate->word( param('name' ) ); | [reply] [d/l] [select] |
Re: Do I have to untaint all user input in a form?
by Roger (Parson) on Nov 14, 2003 at 03:31 UTC
|
There was a recent node 304934 on Perl CGI scripts with tainting. Personally I think tainting is not really necessary for simple Perl scripts, -w should be enough. Besides, adding -T option in the #! line will cause mod_perl to complain, unless you follow what inman suggested in the discussion to modify your Apache configuration. Besides, you could test the CGI script from the command-line with 'perl -T script.pl' anyway, without the -T option in your #! line.
tilly has a recent node 306983 on place holders and data validation. I think you might be interested in it.
| [reply] |
Re: Do I have to untaint all user input in a form?
by dws (Chancellor) on Nov 14, 2003 at 06:40 UTC
|
1. For the purposes of untainting, do I have to validate every user input value, even though they are all going into a database?
It depends on your schema. If you have text fields that can legitimately contain any old pile of bits, then you can get by without validation. For all other types of fields, bitter experience says to assume the worst. That means validating.
| [reply] |
|
I strongly agree with the validating bit. It does not depend on your schema.
It is important to realise that when a form is posted to a server, there is no schema information sent with it. For instance, what you think may be a radio button or check box (and thus an expected string value) can be anything when it is posted from somewhere else.
A contrived example (which I unfortunately have seen happening in the real world):
<FORM METHOD=post ACTION=/doit.pl>
<INPUT NAME=action TYPE=checkbox VALUE="INSERT INTO table VALUES (1,2,
+3)">
<INPUT TYPE=submit>
</FORM>
may coerce you into thinking that the value on the server of the field "action" is either the INSERT statement or nothing.
However, anybody in the world with access to Perl and LWP.pm, might easily set up a request to "/doit.pl":
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
my $req = HTTP::Request->new(POST => 'http://your.server.com/doit.pl')
+;
$req->content(q{action=DELETE%20FROM%20table});
$ua->request($req); #kaboom, your table is gone
So, even if you thought you could trust the value of a checkbox, you can't. You always need to check values coming in from the form. Always.
Liz | [reply] [d/l] [select] |
Re: Do I have to untaint all user input in a form?
by hardburn (Abbot) on Nov 14, 2003 at 15:06 UTC
|
I do most of the heavy lifting in...(gulp)... Javascript
If the client has JavaScript shut off, what do you do? JS validation should only be used to save a server request that would be rejected anyway. It doesn't end your responsibility of having to do validation server-side.
do I have to validate every user input value
Taint mode won't force you to do so, but it's a good idea.
---- I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
-- Schemer
: () { :|:& };:
Note: All code is untested, unless otherwise stated
| [reply] [d/l] |
|
| [reply] |
|
The only problem i have with JS validation is that it is
going to produce redundant code. You see that JS validation
is no substitute for server-side validation, as tools such
as LWP and crew make it very easy to bypass your
front-end "security". As long as you don't mind maintaining
both the JS and server validation, then god-speed to you,
but i still think that JS validation is waste of my time and
my client's money. ;)
UPDATE: I can admit that what sauoq says is good -
if you do have some expensive submission that will benefit
from being validated on the client side first, well ...
go for it. Some redundancy isn't so bad, especially when
you compare the chore of keeping it consistant to ...
washing dishes. :D It just ain't so bad after all. ;)
| [reply] |
|
|
|
Re: Do I have to untaint all user input in a form?
by Notromda (Pilgrim) on Nov 17, 2003 at 01:23 UTC
|
Look up Data::FormValidator on CPAN. It really is worth the trouble to learn; Once you know how to use it, data validation becomes a snap. | [reply] |
Re: Do I have to untaint all user input in a form?
by jryan (Vicar) on Nov 19, 2003 at 19:03 UTC
|
Not really a comment on your tainting, but more on your design. What you really want to do here is call the sub associated with each param, right? Well, we can make this easier with a dispatch table! Change the top of Validate.pm to:
package Validate;
use Exporter::Dispatch;
Now, in your main code, you can change your validation code to:
my $validate = create_dptable Validate;
my %params; # store params in a hash. Use symrefs to obtain original
+behaivor.
# (i.e. $$_ = ...)
foreach (params()) {
$params{$_} = exists $validate->{"val_$_")
? $validate->{"val_$_"}->(param($_))
: $validate->{'error_page'}->($_)
}
Simple, elegant, and will scale if you ever add more parameters. All you will have to do is add the validation routine, and it will get called automatically, without any change needed in the original code. Better yet, if any bogus parameters sneak in, your error routine will get called.
Update: Oh, I didn't notice at first; val_alpha's name would need to be changed to val_name to be called correctly.
| [reply] [d/l] [select] |
|
|