Testing legacy CGI scripts using the "main() unless caller()" technique

davebaker has asked for the wisdom of the Perl Monks concerning the following question:

After reading the excellent Perl Testing: A Developer's Notebook by Ian Langworth and chromatic, I decided to implement a strategy described therein (pp. 163-166), to do some first-time testing on the many subroutines that are in a slew of cgi scripts (some of which are many hundreds of lines long). Most of them have many subroutines. One approach would be to put some or all of the subroutines into new modules, which makes for much easier testing of those subroutines. But the other strategy is to wrap the operational guts of the cgi script in a new "sub main() { # guts here }" that stops just before the subroutines are defined (which are always at the end of my cgi scripts). Then a testing script using Test::More can do a

require_ok('path-to-cgi-script.cgi');

... followed on the next line by a call to one of its subroutines, e.g.,

ok( add_two_numbers(2,3) == 5, "2 plus 3 yields 5");

... even though "add_two_numbers" isn't being exported by a module and is just a subroutine defined somewhere in the cgi script. Because the guts are embedded in a new "main()" subroutine, they don't actually get run (e.g., it doesn't generate the usual web page to STDOUT) when the testing script is run despite a require_ok() in the testing script, due to my adding a magic line to the top of the cgi script, as prescribed in the Perl Testing book:

   main() unless caller();
[download]

caller() returns the caller's package name when invoked in scalar context. There always will be a caller when the testing script does a "require_ok('path-to-cgi-script.cgi');" because, of course, the cgi script is being called by the testing script. Unless the testing script has done something fancy in the way of giving itself a package name, caller() returns the string 'main' in scalar context. In boolean context, it returns "true" (if the script is being called).

Now here's the interesting thing -- the book says it's necessary to have the script being tested (a cgi script, in my example) end with the statement "1;" so that the require_ok() in the testing script will always work (unless, of course, require_ok can't find the file or encounters a compilation error in the script). (Actually, the example in the book uses require() rather than require_ok(), but it's the same idea, that the script being called needs a "1;" at the end.)

The first cgi script I've modified for testing does NOT in fact have a "1;" at the end, and does not have any other reason to be returning "true" as the value of its last expression. It's just the guts (now embedded in main()) and a bunch of subroutines. But require_ok() passes! So what's the deal? The hunt began!

To make a long story short, it seems to me that, whenever require_ok() is run in a testing script, there is by necessity a caller from the perspective of the cgi script being tested, and what happens when the "main() unless caller();" line is encountered is that the "caller()" part/expression of the "main() unless caller();" statement is evaluated to "true", which turns out to be the value of the last expression evaluated by require_ok() -- the remaining lines of the tested script are just more subroutines that aren't being invoked. So now I think I know why require_ok() always gets a "true" value even without any final "1;", if a "main() unless caller();" statement is used at the "top" of the script being tested.

Did I find a "mistake" in the wonderful Perl Testing book?

I know that adding a "1;" to the end of the script is no big deal, but it rattled me that the book was so definite in this regard while my results differed.

And, to make it more interesting (but consistent with my conclusion), what happens if the first line is written with this syntax instead?

   main() if (! caller());
[download]

Answer: the require_ok() in the test script FAILS this time, unless there's a "1;" at the end of the script being tested, despite the logic in the foregoing alternative first line being identical to the logic of the "main() unless caller();" statement. The reason, I think: the boolean test of "(! caller())" will return "false" any time there is a caller (i.e., whenever the script is being tested). So require_ok() gets no "true" love from that line. Then it continues down the tested script slurping up the lines and looking for an expression that returns true, but when it finds no such expression after passing over the last of the subroutines (none of which are invoked), it causes the require_ok() test to fail -- there was no "true" value returned from the file, which is required (ahem) for success by require_ok().

(Edited in an effort to improve clarity and brevity.)

Comment on Testing legacy CGI scripts using the "main() unless caller()" technique Select or Download Code

Replies are listed 'Best First'.
Re: Testing legacy CGI scripts using the "main() unless caller()" technique by choroba (Cardinal) on Jul 20, 2022 at 21:48 UTC
You never know that the expression in `unless caller()` is the last evaluated one. There can be many more expressions scattered among the sub definitions, setting lexical variables or whatever. Putting a true value at the very end of the file means it will always behave correctly, regardless of the changes in the unless versus if ! line, or anywhere else. BTW, You might notice my pm files usually end with the following line: `__PACKAGE__` [download] It's a true value (I don't use 0.pm), but more importantly, there's no semicolon after it. If you try adding another sub or anything else below it, you'll get a syntax error. (I do the same thing with `return`). Your taste might be different. `map{substr$_->[0],$_->[1]\|\|0,1}[\\|\|{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^ARGV,3]`	[reply] [d/l] [select]
Re: Testing legacy CGI scripts using the "main() unless caller()" technique by haukex (Archbishop) on Jul 20, 2022 at 20:53 UTC
As far as I can tell, your analysis is correct. Did I find a "mistake" in the wonderful Perl Testing book? I unfortunately don't have the book so I can't check the exact wording, but you did write the following, to which I added some emphasis: the book says it's necessary to have the script being tested (a cgi script, in my example) end with the statement "1;" so that the `require_ok()` in the testing script will always work That sounds to me like maybe what the book means is "to play it safe, always end the script with `1;`, in case someone tries to get clever and writes `if !caller` instead of `unless caller`" `:-)`	[reply] [d/l] [select]
Re: Testing legacy CGI scripts using the "main() unless caller()" technique by etj (Deacon) on Jul 20, 2022 at 20:15 UTC
I couldn't see any indication above that you've written some small test scripts with various combinations of the components mentioned here, done a `require` on each of them, and seen what the results of each were?	[reply] [d/l]
Re^2: Testing legacy CGI scripts using the "main() unless caller()" technique by davebaker (Pilgrim) on Jul 20, 2022 at 20:55 UTC
Here's the script being tested: #!/opt/perl524 use strict; use warnings; main() unless caller(); sub main { use CGI; my $cgi = CGI->new; if ( !$cgi->param ) { print $cgi->header, my_form( 'action_url' => $cgi->url() ); exit; } my $num1 = $cgi->param('num1') \|\| ''; my $num2 = $cgi->param('num2') \|\| ''; # Next, add the value of the incoming parameters and display a result +(not shown here) # my $sum = add_two_numbers( $num1, $num2 ); # Then print a result to the web browser and display a fresh form unde +r the result (not shown here) exit; } ## end sub main sub add_two_numbers { my ( $num1, $num2 ) = @_; return $num1 + $num2; } sub my_form { my %params = @_; my $action_url = $params{'action_url'} \|\| ''; die "No URL specified for the 'action' attribute of the form, can' +t continue" if ( !$action_url ); my $form = qq\|<html> <head> <title>Add Two Numbers</title> </head> <body> <h1>Add Two Numbers</h1> <form action="$action_url" method="post"> <p>Enter some first number: <input name="num1"></p> <p>Enter some second number: <input name="num2"></p> <p><input type="submit" value="Add Now"></p> </form> \|; } ## end sub my_form [download] Here's the test script: `#!/opt/perl524 use strict; use warnings; use Test::More ('no_plan'); my $SCRIPT_LOCAL_LOCATION = 'C:\Users\davel\Documents\scripts\add_two_numbers_modified_as_ma +in.cgi'; require_ok($SCRIPT_LOCAL_LOCATION) or exit; ok( add_two_numbers( 3, 17 ) == 20, "3 plus 17 produces 20" ); done_testing();` [download] Here are the results of running the test script: `ok 1 - require 'C:\Users\davel\Documents\scripts\add_two_numbers_modif +ied_as_main.cgi'; ok 2 - 3 plus 17 produces 20 1..2` [download] But when I change the first line of the script being tested, so as to be the following: #!/opt/perl524 use strict; use warnings; main() if ( !caller() ); sub main { use CGI; my $cgi = CGI->new; if ( !$cgi->param ) { print $cgi->header, my_form( 'action_url' => $cgi->url() ); exit; } my $num1 = $cgi->param('num1') \|\| ''; my $num2 = $cgi->param('num2') \|\| ''; # Next, add the value of the incoming parameters and display a result +(not shown here) # my $sum = add_two_numbers( $num1, $num2 ); # Then print a result to the web browser and display a fresh form unde +r the result (not shown here) exit; } ## end sub main sub add_two_numbers { my ( $num1, $num2 ) = @_; return $num1 + $num2; } sub my_form { my %params = @_; my $action_url = $params{'action_url'} \|\| ''; die "No URL specified for the 'action' attribute of the form, can' +t continue" if ( !$action_url ); my $form = qq\|<html> <head> <title>Add Two Numbers</title> </head> <body> <h1>Add Two Numbers</h1> <form action="$action_url" method="post"> <p>Enter some first number: <input name="num1"></p> <p>Enter some second number: <input name="num2"></p> <p><input type="submit" value="Add Now"></p> </form> \|; } ## end sub my_form [download] Then when I run the identical test script, I get: not ok 1 - require 'C:\Users\davel\Documents\scripts\add_two_numbers_m +odified_as_main.cgi'; # Failed test 'require 'C:\Users\davel\Documents\scripts\add_two_num +bers_modified_as_main.cgi';' # at C:\Users\davel\Documents\scripts\add_two_numbers_modified_as_ma +in.cgi.t line 11. # Tried to require ''C:\Users\davel\Documents\scripts\add_two_numb +ers_modified_as_main.cgi''. # Error: C:\Users\davel\Documents\scripts\add_two_numbers_modifie +d_as_main.cgi did not return a true value at (eval 8) line 2. 1..1 # Looks like you failed 1 test of 1. [download]	[reply] [d/l] [select]