More efficient reading of forms in CGI

bradcathey has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: More efficient reading of forms in CGI by Corion (Patriarch) on Sep 30, 2003 at 13:57 UTC
It also makes it easier for the potential intruder to wreak havoc with your script. Never ever use user input to determine a variable name! Consider what happens when a malicious abuser of your script submits a field name of, for example, `mailhost`, `dbname` or maybe simply `adminmail` - this method would directly overwrite the variable of that name, and it's just a matter of determination until the abuser finds out a combination that works for their purposes. For the bare minimum, you should check that each parameter is in your scripts list of known good parameters, and you most definitively must use taint mode (`#!/usr/bin/perl -wT` as the first line of the CGI script) to alert you of more pitfalls when handling insecure data. My approach would be to either use CGIs feature of importing the parameters into their own namespace or simply use the `param()` function where appropriate. Wildcarding the field names will give you much pain in the future. `perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider ($c = $d->accept())->get_request(); $c->send_response( new #in the HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web` [download]	[reply] [d/l]
Re: More efficient reading of forms in CGI by hardburn (Abbot) on Sep 30, 2003 at 14:05 UTC
Your solution might be fast, but it also allows anyone to clobber the value of any variable in the package's namespace. Depending on how you code, this may or may not be a problem, but me thinks it's better to avoid the problem entirely. Instead, explicitly put all the names of the fields you allow in an array. My arg-parsing routines usually look something like this: `{ my @FIELDS = qw( foo bar baz ); sub set_params { use CGI qw(:standard); map { $_ => param($_) \|\| '' } @FIELDS; } } # Other subroutines here { my %params = set_params(); # Call other subroutines here }` [download] It's much safer, and a change in the fields only means a change in the `@FIELDS` array. I use a similar form for object constructors: `{ my @FIELDS = qw( foo bar baz ); sub new { my $class = shift; my $in = shift \|\| { }; # Passed in a hashref my %self = map { $_ => $in->{$_} \|\| '' } @FIELDS; bless \%self, $class; } }` [download] ---- I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident. -- Schemer Note: All code is untested, unless otherwise stated	[reply] [d/l] [select]
Re: More efficient reading of forms in CGI by bear0053 (Hermit) on Sep 30, 2003 at 13:51 UTC
you could also use the following code snippet to retrieve all the field name and value pairs that are passed in from the form into a hash: `my %pairs = $cgi->Vars();` [download] Now it a field called "name" was in your form and it contains the value 'bob' you could do: `my $name = $pairs{name}; now $name will contain 'bob'` [download]	[reply] [d/l] [select]
Re: Re: More efficient reading of forms in CGI by bassplayer (Monsignor) on Sep 30, 2003 at 14:41 UTC
This method also elegantly handles input fields with multiple values, such as checkboxes and pulldown menus (multiple select). If the name parameter that bear0053 describes is from a set of checkboxes, and the names Bob and Joe were selected, `$pairs{name}` would contain 'Bob' and 'Joe' separated by NULL character ( `\0` ). Very easy to parse, if needed. I think that your way will squash all but the first value (not very nice if you're Joe). You will need CGI version 2.53 or higher to use `Vars()`. bassplayer	[reply] [d/l] [select]
Re: More efficient reading of forms in CGI by jonadab (Parson) on Sep 30, 2003 at 14:21 UTC
any pitfalls or better ways of doing this? Yes, two of them. First, this code won't run under strict, and second, it allows malformed input to set any global variable it wants, which could do unforseen things with your script. You can solve both of these problems in either of two ways. The first is to use a single hash, instead of many globals. This is what I do: `foreach my $field ( @params ) { $input{$field} = param( $field ); }` [download] This requires the rest of your script to use the hash values (like `$input{foo}` instead of just `$foo`), but it's straightforward and works. The other way is to expressly list what variables the user is permitted to set, and ignore any others. In order to get it to run under strict, you also have to add a line to allow the symbolic references: `foreach my $field qw(foo bar baz quux wibble) { no strict refs; $$field = param( $field ); }` [download] This still requires you to list all the variables, but you don't have to have a separate assignment line for each, and you don't have to change the rest of the script to use a hash. `$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/` [download]	[reply] [d/l] [select]
Re: More efficient reading of forms in CGI by bean (Monk) on Sep 30, 2003 at 14:38 UTC
Others have already raised the security issues with this - I'll just add that I believe it's better not to play shell games with your variables. When the variable is in $q->param('fieldname'), you know exactly where it came from - $fieldname tells you nothing of the variable's origin. Moving variables around confuses things - at some point you may get the same theoretical variable from a trusted source (e.g. the database), and you'll probably rename that to $fieldname too. Keeping track of which variables came from where, can you trust them, are they up-to-date, etc., becomes a new set of problems you wouldn't have if you hadn't destroyed their original contexts. Update Here's an example from the quality system of a medical equipment division of a Fortune 500 company. The authors of the system decided to put all of the variables from the form submission into a $formVals hash. So far so good. Then they had the genius idea to put the values they got from the database into the same hash - that way they could just assign the $formVals hash to the template and the submitted values would go in there if form validation failed - otherwise the database values would go there. Needless to say this became my nightmare as the maintainer of the system. It's indicative of a broken mental model of client-server programming - they made all kinds of other errors, from trusting form values to be current (allowing the clobbering of new data with old) to updating the screen with new records that hadn't been saved to the database (it looked like they had been saved, but they were only preserved in the form).	[reply]
Re: More efficient reading of forms in CGI by dws (Chancellor) on Sep 30, 2003 at 15:24 UTC
So, fellow monks, any pitfalls or better ways of doing this? Yes. You're now at the whim of whoever forges a form. Leave it the way it was. You gain little by your change, except saving yourself a bit of typing, and you lose control over your namespace. Be explicit with what you expect in the form, and you'll make it easier on yourself when you pick the code up 6 months from now to make a change.	[reply]
Re: More efficient reading of forms in CGI with CGI::Lite by troy (Initiate) on Sep 30, 2003 at 17:31 UTC
This is one of the reasons why I like CGI::Lite. Besides CGI::Lite being very speedy, it puts all of the variables into a hashref (or hash if you like) so that you can use it in any context that you would use your variables in. Start with this: `use CGI::Lite; my $cgi = new CGI::Lite;` [download] Then, you can do either one of the following, depending on your preference: `$form = $cgi->parse_form_data; %form = $cgi->parse_form_data; $form = $cgi->parse_form_data ('GET', 'HEAD' or 'POST');` [download] Quick, easy, and all in a neat little hash, ready for you to use in statements like thise: `print "blabla bla bla $form->{entry} blabla<br>\n";` [download] Yes, I know that you could write something to put your CGI.pm params into a hash, but then I couldn't write a plug for CGI::Lite, could I? If you're not using some of the big features of CGI.pm, like html generation, you're better off with CGI::Lite. Either way, like others have said here, get your incoming variables out of the main namespace.	[reply] [d/l] [select]
Re: Re: More efficient reading of forms in CGI with CGI::Lite by bradcathey (Prior) on Sep 30, 2003 at 18:10 UTC
In 25 words or less, what exactly is "namespace?" From what I can tell, it is the area inside my script where I assign the variable names and what is in them. Anyone care to expound?	[reply]
Re: Re: Re: More efficient reading of forms in CGI with CGI::Lite by bart (Canon) on Sep 30, 2003 at 19:57 UTC
In 25 words or less, what exactly is "namespace?" Check out the perlfunc entry for package. Hmm... that's 7 words. I'll just give you a final example: `#!/usr/bin,/perl -w $x = "MAIN"; # in default package, "main" package Foo; $x = 123; # in package "Foo", so this is $Foo::x package Bar; $x = 456; # in package "Bar", so this is $Bar::x package main; # back to normal print "\$$x = '$x'\n"; print "\$main::$x = '$main::x'\n"; print "\$Foo::$x = '$Foo::x'\n"; print "\$Bar::$x = '$Bar::x'\n";` [download] This prints: $MAIN = 'MAIN' $main::MAIN = 'MAIN' $Foo::MAIN = '123' $Bar::MAIN = '456' In summary: it's a way to have global variables of the same name which nevertheless don't trample on each other. That way you don't have to worry if you happen to use the same name for a variable as used in some obscure module. Commonly modules have their own package, and the standard convention is that it's the same as the module name. use relies entirely on that convention for it's import mechanism — which is a way to make variables and subs visible unqualified across package boundaries, by making aliases to the same variables in the different packages. Thus, if you import $Foo::x into Bar, then $Bar::x will point to the same variable as $Foo::x does. Any modification of one will be reflected in a change in the other. You're probably familiar with that effect, but you likely didn't realise how simple the underlying implementation really is. Because it is.	[reply] [d/l]
Re: More efficient reading of forms in CGI by bradcathey (Prior) on Sep 30, 2003 at 16:56 UTC
I asked for pitfalls and had them pointed out. I knew there might be problems, but couldn't see them. Thanks monks. And now I have some interesting code to try out tonight. Blessings.	[reply]