
Automatic Generation of Form Handling Code

by Ovid (Cardinal)
on Jun 06, 2001 at 21:21 UTC ( #86310=perlquestion )

Ovid has asked for the wisdom of the Perl Monks concerning the following question:

I asked this in the chatterbox but somehow things got a bit "confused", so I'm providing a better explanation. Note that this is not a "do my job for me" post. I am planning on writing this (and posting it here), but if it's already written, I'd love to know!

Recently, I've been handed a mock-up of a huge Web-based application. Many of the forms have 40 or more elements in them. What I have been looking for is a script that will read in HTML forms and automatically generate a code skeleton that will:

  • Populate scalars or arrays based on the form structure.
  • Generate some basic taint-checking routines (perhaps even have it automatically use the Untaint module, but it's not standard).
  • Automatically have strict, warnings, and taint checking added to the top of the code to enforce better coding practices.

In short, I'd like something that will take the following HTML form and create a Perl skeleton for it:

    <form action='' method=post enctype='multipart/form-data'>
    <input type='hidden' name=somename value="asdf">
    <input type=text name=name value=Ovid size="30" maxsize="30">
    <br /> <br>
    <input type="checkbox" name="group1" value="1" checked /> box 1 group 1 <br>
    <input type="checkbox" name="group1" value="2"> box 2 group 1 <br>
    <input type="password" name="pass"> Password
    </form>

The HTML above is deliberately formatted poorly because I'd prefer a robust solution. A code template generated from this would resemble the following:

    #!/usr/bin/perl -wT
    use strict;
    use CGI;
    my $q = CGI->new;

    # read in form data
    my $_somename = $q->param( 'somename' ); # hidden
    my $_name     = $q->param( 'name' );     # text
    my @_group1   = $q->param( 'group1' );   # checkbox
    my $_pass     = $q->param( 'pass' );     # password

    # untaint the data
    my ( $somename ) = ( $_somename =~ /^(asdf)$/ );
    my ( $name )     = ( $_name     =~ /^(Ovid)$/ );
    my @group1;
    ( $group1[$_] ) = ( $_group1[$_] =~ /^(1|2)$/ ) foreach ( 0 .. $#_group1 );
    my ( $pass )     = ( $_pass     =~ /^(\w+)$/ );

Note that taint checking is based upon the values already present in the form with a default of \w+ if no value attributes are present in the HTML. Also, it would automatically change the scalar to an array for multi-valued elements (the checkbox group).
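The rules just described (a regex built from the value attributes, \w+ as the fallback, an array when a name occurs more than once) might be sketched roughly as follows. This is only an illustration under stated assumptions: the sub names form_fields and skeleton are made up for this example, the regex scan is a stand-in for a real HTML parser (HTML::Parser or HTML::TokeParser would be more robust), it only looks at <input> tags, and the array untainting uses map rather than the foreach shown above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect <input> fields in document order.
# A regex scan is NOT a robust HTML parser; it breaks on '>' inside values.
sub form_fields {
    my ($html) = @_;
    my ( @order, %field );
    while ( $html =~ /<input\b([^>]*)>/gi ) {
        my $tag = $1;
        my %attr;
        # attr=value where value is "double", 'single', or unquoted
        while ( $tag =~ /([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>\/]+))/g ) {
            $attr{ lc $1 } = defined $2 ? $2 : defined $3 ? $3 : $4;
        }
        my $name = $attr{name};
        next unless defined $name && $name =~ /^\w+$/;   # must be usable as a Perl identifier
        push @order, $name unless $field{$name};
        my $f = $field{$name} ||= {
            type   => lc( $attr{type} || 'text' ),
            count  => 0,
            values => [],
        };
        $f->{count}++;
        push @{ $f->{values} }, $attr{value}
            if defined $attr{value} && length $attr{value};
    }
    return ( \@order, \%field );
}

# Hypothetical helper: turn the field list into the text of a CGI skeleton.
sub skeleton {
    my ($html) = @_;
    my ( $order, $field ) = form_fields($html);
    my ( @read, @untaint );
    for my $name (@$order) {
        my $f = $field->{$name};
        my %seen;
        my @alts = map { quotemeta($_) } grep { !$seen{$_}++ } @{ $f->{values} };
        my $re = @alts ? join( '|', @alts ) : '\w+';     # default when no value attributes
        if ( $f->{count} > 1 ) {                         # repeated name: use an array
            push @read,    "my \@_$name = \$q->param( '$name' ); # $f->{type}";
            push @untaint, "my \@$name = map { /^($re)\$/ ? \$1 : () } \@_$name;";
        }
        else {
            push @read,    "my \$_$name = \$q->param( '$name' ); # $f->{type}";
            push @untaint, "my ( \$$name ) = ( \$_$name =~ /^($re)\$/ );";
        }
    }
    return join "\n",
        '#!/usr/bin/perl -wT', 'use strict;', 'use CGI;', 'my $q = CGI->new;', '',
        '# read in form data', @read, '',
        '# untaint the data',  @untaint, '';
}

my $html = <<'HTML';
<form action='' method=post enctype='multipart/form-data'>
<input type='hidden' name=somename value="asdf">
<input type=text name=name value=Ovid size="30" maxsize="30">
<input type="checkbox" name="group1" value="1" checked /> box 1 group 1 <br>
<input type="checkbox" name="group1" value="2"> box 2 group 1 <br>
<input type="password" name="pass"> Password </form>
HTML

my $code = skeleton($html);
print $code;
```

Select lists, textareas, and radio groups would need their own handling, and the map form silently drops values that fail the pattern instead of leaving undef in place, which may or may not be what a given form needs.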

If something like this exists (okay, merlyn, which of your columns did I miss? :), please let me know. If it doesn't exist, advice welcome.

I think the benefits of such a script are obvious:

  • Faster development time.
  • Greater accuracy (never miss another form element!)
  • Taint checking automatically very restrictive.
  • Pretend to spend 5 hours writing a form-handling routine when you're really playing Quake.


Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Replies are listed 'Best First'.
Re: Automatic Generation of Form Handling Code
by Cirollo (Friar) on Jun 06, 2001 at 21:43 UTC
    Funny, I was contemplating something along those lines earlier this morning...

    To populate variables based on the form data structure, CGI::State might be a good place to start; it loads CGI parameters into a multidimensional hash. Of course, you would have to change the form element names in your HTML to conform with the module.

      CGI::State looks interesting. However, the time that it takes to go through all of the forms and update the field names might just negate the benefits. I can always get the form data in a hash like so:

      use CGI qw/:standard/;
      my %formdata = map { $_ => [ param( $_ ) ] } param;

      That's easy. However, I really dislike doing that as it makes it much easier to miss untainting/validating a particular variable (IMHO). Also, I explicitly like to see the checkbox groups represented as arrays and individual elements represented as scalars. It's more obvious to me how to handle them. Hmm... am I just being foolish? I guess I really can't see the difference between populating a bunch of scalars and populating a hash with all of the data aside from the fact that single value form elements are now represented as a one element array reference, removing the clear visual distinction between arrays and scalars:

      # The following is clearer for me:
      my $first_name = $in_name;
      foreach ( @in_colors ) {
          # do something
      }

      # This is less clear:
      my $first_name = $formdata{ 'in_name' }[0];
      foreach ( @{ $formdata{ 'in_colors' } } ) {
          # do something
      }

      Is that a matter of style over substance, or is this something that could actually be an issue (particularly with maintenance?).

      I'm beginning to think this should have been in meditations instead.



        Lately, most of the code I write for CGIs follows the same basic design pattern:

        • Receive the values with
        • Validate and detaint the parameters with HTML::FormValidator.
        • <shameless plug>Build a multi-dimensional hash with CGI::State</shameless plug>
        • If there is an error, or missing field, use HTML::FillInForm and HTML::Template to re-fill in the form with the submitted data and print an error.
        • Do some work, usually with DBI, etc
        • Use HTML::Template to display a page to the user, either prompting for more info or displaying the results of the work.

        (I hear that Apache::Pagekit encapsulates a lot of this into a single framework, but I am not convinced that the platform is stable enough for my needs yet.)

        The great thing about HTML::FormValidator is that you can set up "validation profiles". By this I mean you can build a set of rules for a newsletter subscription, or an order submission, for example. The design of the module allows it to share these pre-made profiles with many other scripts. Once you write a profile for what I call a "web object", you can reuse it over and over for the same type of data.

        If your 40-odd scripts are asking for similar types of data, using this module might be a good way to factor out all the validation and detainting code.
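To make the "validation profile" idea concrete, here is a hand-rolled sketch in plain Perl. It deliberately does not use HTML::FormValidator's own profile format or API; the %profiles layout, the validate sub, and the field names are all assumptions made up for illustration. The point is only that one named set of rules can be shared by every script that accepts the same kind of data.

```perl
use strict;
use warnings;

# Hypothetical profile layout: each field maps to a regex whose first
# capture is the cleaned (and, under -T, untainted) value.
my %profiles = (
    newsletter => {
        required => { email => qr/^([\w.+-]+\@[\w-]+(?:\.[\w-]+)+)$/ },
        optional => { name  => qr/^([\w ]{1,30})$/ },
    },
);

# Check a hash of submitted values against one profile.
# Returns references to: valid fields, missing required fields, invalid fields.
sub validate {
    my ( $profile, $input ) = @_;
    my ( %valid, @missing, @invalid );
    for my $kind (qw( required optional )) {
        my $rules = $profile->{$kind} || {};
        for my $field ( sort keys %$rules ) {
            my $raw = $input->{$field};
            if ( !defined $raw || $raw eq '' ) {
                push @missing, $field if $kind eq 'required';
                next;
            }
            if ( my ($clean) = $raw =~ $rules->{$field} ) {
                $valid{$field} = $clean;
            }
            else {
                push @invalid, $field;
            }
        }
    }
    return ( \%valid, \@missing, \@invalid );
}

my ( $valid, $missing, $invalid ) = validate(
    $profiles{newsletter},
    { email => 'ovid@example.com', name => '<script>' },
);
print "valid: @{[ sort keys %$valid ]}\n";
print "missing: @$missing\n";
print "invalid: @$invalid\n";
```

The profile hash could live in its own module so that several CGI scripts load the same rules; the real module offers considerably more than this sketch.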

        Update: Sorry, I see that you have 40 elements per form, not scripts. Either way, this module *will* save you enough time, so you can concentrate on Quake more =)

Re: Automatic Generation of Form Handling Code
by traveler (Parson) on Jun 06, 2001 at 23:12 UTC
    How about this as a starting point if you have to write it:
    1. Convert the HTML to XML (e.g. with Tidy)
    2. Select the fields in which you are interested
    3. Use perl to process the xml and generate the code, probably with something based on XSLT
    You might also want to check out CGI::XMLForm, which I have wanted to play with for some time.
Re: Automatic Generation of Form Handling Code
by shotgunefx (Parson) on Jun 07, 2001 at 10:09 UTC
    You might be interested in this post by merlyn. I'm working in some similar directions and what his program does now is take an html doc and spit out the code for it. Might be worth looking into. It has a few form handling issues right now but it's probably 95% of what you want.

    If the forms are coming from in-house non-programmers, maybe you could add a non-standard attribute to the form elements to tell you what type of data each one holds. Then modify the parser to recognize this extra key and generate the code to handle it. Perhaps a required attribute as well. Then strip them out.

    This way you can easily spit out a list of your required fields as well.
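A rough sketch of that idea, assuming two invented attributes, x-type and x-required (neither is standard HTML, and the sub name extract_hints is hypothetical). It harvests the hints from each <input> tag, strips them so the served HTML stays clean, and returns the list of required fields. As with any regex scan over HTML, it is only meant to show the shape of the approach, not to replace a real parser.

```perl
use strict;
use warnings;

# Harvest hypothetical x-type / x-required hints from <input> tags and
# return the HTML with those hints removed.
sub extract_hints {
    my ($html) = @_;
    my ( @required, %type );
    ( my $clean = $html ) =~ s{(<input\b[^>]*>)}{
        my $tag = $1;
        my ($name) = $tag =~ /\bname\s*=\s*["']?([\w-]+)/i;
        if ( defined $name ) {
            $type{$name} = $1 if $tag =~ /\bx-type\s*=\s*["']?([\w-]+)/i;
            push @required, $name if $tag =~ /\bx-required\b/i;
        }
        # drop the hint attributes, with or without a value
        $tag =~ s/\s+x-(?:type|required)(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+))?//gi;
        $tag;
    }gei;
    return ( $clean, \@required, \%type );
}

my $form = <<'HTML';
<input type=text name=email x-type="email" x-required>
<input type=text name=nick x-type=word>
HTML

my ( $clean, $required, $type ) = extract_hints($form);
print $clean;
print "required: @$required\n";
print "$_ => $type->{$_}\n" for sort keys %$type;
```

The %type hash could then select which untainting pattern the generated skeleton uses for each field, instead of deriving it from the value attributes.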


    "To be civilized is to deny one's nature."
