Preprocessor Pranks

cmilfo has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

A friend of mine and I were having a conversation about the -P command line switch. We both have a C background and were talking about preprocessor commands in Perl. The -P switch runs the program through the C preprocessor first, so Perl code can use #ifdef, #if, etc. (Anyone using -P for the first time may want to double-check all their comments, since weird things can happen.) Here is an example (contents of my test.pl):

my $test = 1;

#ifdef UNDEF
   $test = 0;
#endif

print "\$test => $test\n";

Since I didn't put a #define UNDEF at the top of the code, the print statement prints $test => 1 when run with perl -P test.pl. Now, most of you are probably saying, "yeah, so what." Here's the debate. -P does the above preprocessing before compilation. This allows code to be added or ignored based on a #define at the top of the program. In an effort to make the code safer and less system dependent, I suggested using use constant UNDEF => 0;. Here is the same code with the use constant pragma:

use constant UNDEF => 0;

my $test = 1;

if (UNDEF) {
   $test = 0;
}

print "\$test => $test\n";

Since the use constant pragma allows definition of an unchangeable value, if(), unless(), etc. should be optimized out by the Top-Down Optimizer (constant folding), as long as the condition makes no reference to a dynamic variable, function, etc. (for example, &dothis if UNDEF == $maychange; could not be folded).

We reasoned that the preprocessor may be faster because the excluded code would never get to the compilation phase. We aren't so sure, however, because firing up the preprocessor could be more costly than compiling a clearly foldable constant such as &dothis if UNDEF;.

My questions for you, oh wise Monks, are as follows:

1) Which is faster, preprocessor (-P) or use constant pragma?
2) How did you find this out (better yet, aside from reading the internals of the perl interpreter -- I am not ready, is there a way to profile the compilation stage and where can I go to read/learn about such things)?
3) Once the optimizer strips out unnecessary code, at any time in recompilation (such as when an eval() gets compiled at run-time), does this folded code affect the compilation time?

(And you thought I was going to start with, "What's your quest?") :)

As always, thank you. I've enjoyed the PerlMonks since I stumbled upon it a little over a year ago. The community amazes me over and over.

Cheers!
Casey


Replies are listed 'Best First'.
Re: Preprocessor Pranks by abstracts (Hermit) on May 05, 2002 at 07:56 UTC
Which is faster, preprocessor (-P) or use constant pragma?

They both are the same.

How did you find this out (better yet, aside from reading the internals of the perl interpreter -- I am not ready, is there a way to profile the compilation stage and where can I go to read/learn about such things)?

You can use the B::Graph module to see how perl constructed the execution tree for the code you've written. The only difference between the two is that when you use constant, the constant needs to be initialized at the top of the parse tree, while the preprocessed code has no such step.

Once the optimizer strips out unnecessary code, at any time in recompilation (such as when an eval() gets compiled at run-time), does this folded code affect the compilation time?

I believe once a portion of code is discarded, it is discarded forever. In the example above, since you cannot change the value of the constant UNDEF, there is really no point in keeping the stuff inside the if statement.

Here is how I tested this:

#!/usr/bin/perl -P -MO=Graph,-text
use constant UNDEF => 0;
my $test = 1;
if(UNDEF){
    $test = 'Hello World';
}
print "\$test => $test\n";

#!/usr/bin/perl -P -MO=Graph,-text
my $test = 1;
#if 0
$test = 'Hello World';
#endif
print "\$test => $test\n";

You can run these 2 programs: `./prog.pl | grep Hello` to see if the word "Hello" appears anywhere in the output. Also, if you use a variable instead of a constant (as in `my $UNDEF = 0;`), you'll see that the statement will not be folded and that "Hello" will appear in the output.

Hope this answers your questions.

PS: `-MO=Graph,-text` will show you an ASCII drawing of the tree, while `-MO=Graph,-dot` will output a dot file that you can view if you have `dotty`, part of the Graphviz package.
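For readers without B::Graph installed, the same check can be approximated with B::Deparse, which ships with perl. This is only a sketch of the idea and swaps in a different backend from the one used above; the names `$undef_var`, `$with_constant`, and `$with_variable` are made up for illustration. It deparses two anonymous subs and looks for the string "Hello" in the regenerated source. Note also that the -P switch was removed in Perl 5.12, so only the use constant side can be reproduced on a current perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use B::Deparse;

use constant UNDEF => 0;   # compile-time constant, as in the question
my $undef_var = 0;         # ordinary variable for contrast (hypothetical name)

# Branch guarded by a constant: the compiler is free to fold it away.
my $with_constant = sub {
    my $test = 1;
    if (UNDEF) { $test = 'Hello World'; }
    return $test;
};

# Branch guarded by a variable: it has to be kept until run time.
my $with_variable = sub {
    my $test = 1;
    if ($undef_var) { $test = 'Hello World'; }
    return $test;
};

# coderef2text regenerates Perl source from the compiled op tree,
# so anything the compiler discarded will be missing from the text.
my $deparser      = B::Deparse->new;
my $constant_text = $deparser->coderef2text($with_constant);
my $variable_text = $deparser->coderef2text($with_variable);

print "constant version:\n$constant_text\n\n";
print "variable version:\n$variable_text\n\n";

printf "constant version mentions 'Hello': %s\n",
    ($constant_text =~ /Hello/ ? 'yes' : 'no');
printf "variable version mentions 'Hello': %s\n",
    ($variable_text =~ /Hello/ ? 'yes' : 'no');
```

If the folding described above happens, the first report should say "no" and the second "yes", which matches the grep test on the B::Graph output.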
Re: Re: Preprocessor Pranks by ariels (Curate) on May 05, 2002 at 08:38 UTC
You show that both programs compile to the same thing. However, compilation time is a definite issue for a scripting language. If filtering with `cpp` compiles twice as quickly as using a `constant`, then it is faster for short-lived programs.

To determine run time of a program, use one of the UN*X `time` commands (either the shell builtin or `/usr/bin/time` or `/usr/bin/timex`). This should give equivalent results to the suggestion given above for using `Benchmark`.

However, almost surely it does not matter: for very many Perl programs, the compile time is dominated by the run time. Exceptions may include some CGI programs (and hence e.g. `mod_perl`, which seeks to reduce the number of times compilation time is paid).
Re: Preprocessor Pranks by ariels (Curate) on May 05, 2002 at 13:15 UTC
Well, here's the direct way. (samtregar suggests a more Perlish way, which will do much the same, in a possibly nicer way.)

I created 2 files, `cond-cpp` and `cond-prl`, which contain your code. Neither ran `-w` or `use strict`; at least the latter will probably reduce the difference in execution time even further (`cond-cpp`, however, runs the `-P` switch, for obvious reasons).

<hal4 143 [15:07] ~/Perl/Test >time csh -c 'repeat 500 ./cond-prl >/dev/null'
0:17.75 elapsed, 14.410+3.200 user+system (99.2%), 0 (0+0) mem (avg. shared+unshared stack), 152908 faults, 0/0 I/O
<hal4 144 [15:08] ~/Perl/Test >time csh -c 'repeat 500 ./cond-cpp >/dev/null'
0:16.55 elapsed, 11.630+7.200 user+system (113.7%), 0 (0+0) mem (avg. shared+unshared stack), 431514 faults, 0/0 I/O

We see that `cond-cpp` finishes first, even though it uses more CPU time to do so (to explain this, note that "113.7%" CPU utilization means we've got more than one processor). Load average was kept close to 0 (but this was not enforced in any way). Note also that the preprocessor method requires more system time -- it's starting a new process every time!

On a slower uniprocessor machine, we get

<sylvie 113 [16:10] ~/Perl/Test >time csh -c 'repeat 500 ./cond-prl >/dev/null'
0:35.42 elapsed, 25.740+6.140 user+system (90.0%), 0 (0+0) mem (avg. shared+unshared stack), 154646 faults, 0/0 I/O
<sylvie 114 [16:10] ~/Perl/Test >time csh -c 'repeat 500 ./cond-cpp >/dev/null'
0:39.19 elapsed, 16.490+18.530 user+system (89.3%), 0 (0+0) mem (avg. shared+unshared stack), 462331 faults, 0/0 I/O

Here too the CPP method appears to have a slight edge in user time, although with only one processor its extra system time makes the elapsed time longer (39.19 vs. 35.42 seconds).

Explanations? Only thing I can think of is "CPP is faster than Perl at doing this". Doubtless the real `perl` hackers out there can explain this.

Importance? Probably almost nil, unless you're starting a great many very short-lived processes.
Re: Preprocessor Pranks by Zaxo (Archbishop) on May 05, 2002 at 08:42 UTC
Benchmarking eval of strings, since you want to include compile time:

#!/usr/bin/perl -w
use strict;
my $foo = <<'BAR';
my $test = 1;
#ifdef UNDEF
   $test = 0;
#endif
print "\$test => $test\n";
BAR
my $bar = <<'BAZ';
use constant UNDEF => 0;
my $test = 1;
if (UNDEF) {
   $test = 0;
}
print "\$test => $test\n";
BAZ
use Benchmark;

=pod

this is still wrong

timethese (100000, {
    cpp => sub { eval "perl -P -e '$foo'";},
    prl => sub { eval "perl -e '$bar'";}
} );

=cut

timethese (1000, {
    cpp => sub { system "perl -P -e '$foo'";},
    prl => sub { system "perl -e '$bar'";}
} );

Results are:

=pod
$ perl bmp.pl
Benchmark: timing 1000 iterations of cpp, prl...
       cpp: 88 wallclock secs ( 0.08 usr  0.54 sys + 52.56 cusr 29.98 csys = 83.16 CPU) @ 1612.90/s (n=1000)
       prl: 49 wallclock secs ( 0.09 usr  0.56 sys + 33.78 cusr 12.19 csys = 46.62 CPU) @ 1538.46/s (n=1000)
=cut

Instead of forking a new interpreter each time, we can change the tests to:

timethese (10000, {
    cpp => sub { eval "$foo"; },
    prl => sub { eval "$bar"; }});

=pod
$ perl -P bmp.pl
Benchmark: timing 10000 iterations of cpp, prl...
       cpp:  1 wallclock secs ( 0.50 usr +  0.02 sys =  0.52 CPU) @ 19230.77/s (n=10000)
       prl:  2 wallclock secs ( 2.40 usr +  0.03 sys =  2.43 CPU) @ 4115.23/s (n=10000)
=cut

~~Pretty much a wash, at least for your code.~~

Update: Corrected code, was only firing up an interpreter before.

U²: Still had it wrong, showing two ways of doing it with two very different results. It looks like the interpreter with preprocessor is slower to start, but compilation is indeed faster by a pretty wide margin. Your mileage will vary. ++ariels for catching my blunders so kindly.

After Compline,
Zaxo
Re: Re: Preprocessor Pranks by ariels (Curate) on May 05, 2002 at 13:45 UTC
Comparing the results of this Benchmark with my benchmark just below yields a startling discrepancy. My computers can barely manage 25 iterations a second, whereas Zaxo's seem to finish 15_000 iterations a second easily!

Actually, there's still a problem with the code above: it's measuring how long it takes to `eval "perl -P -e '$foo'"` (and something similar for `$bar`). But that's just string concatenation, with a useless eval thrown in (it fails; that's not legal Perl code!). You need actually to run the command line, say by `s/eval/system/g;`.
Re: Preprocessor Pranks by samtregar (Abbot) on May 05, 2002 at 06:38 UTC
The answer to #2 is to use the Benchmark module. You'll have to use system() or backticks to execute perl in a subprocess, but aside from that the task should be quite straightforward.

I think you should take advantage of this opportunity to learn to use Benchmark and answer #1 (and possibly #3) yourself. Post your results here if/when you do.

-sam
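As a starting point, here is a minimal sketch of that approach, not a definitive benchmark. It makes several assumptions: a Unix-like system (it uses the list form of a piped open), the hypothetical helper name `run_in_subprocess`, and a hand-written "stripped" program standing in for what the preprocessor would leave behind, since the -P switch itself was removed in Perl 5.12 and cannot be exercised on a current perl. The iteration count of 50 is arbitrary.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Program text using a foldable constant, as in the question.
my $constant_src = <<'END_SRC';
use constant UNDEF => 0;
my $test = 1;
if (UNDEF) { $test = 0; }
print "\$test => $test\n";
END_SRC

# What a preprocessor would leave behind when UNDEF is not defined.
# Assumption: written by hand here, no -P involved.
my $stripped_src = <<'END_SRC';
my $test = 1;
print "\$test => $test\n";
END_SRC

# Run a program text in a fresh interpreter and return its output.
# $^X is the perl running this script; the list form of open passes
# the program text directly, so no shell quoting is needed.
sub run_in_subprocess {
    my ($src) = @_;
    open(my $child, '-|', $^X, '-e', $src)
        or die "cannot start $^X: $!";
    my $output = do { local $/; <$child> };
    close($child) or die "child perl failed: status $?";
    return $output;
}

# Each iteration pays for interpreter startup, compilation, and the run,
# which is the cost that matters for short-lived programs.
my $results = timethese(50, {
    constant => sub { run_in_subprocess($constant_src) },
    stripped => sub { run_in_subprocess($stripped_src) },
});
cmpthese($results);
```

Because every iteration starts a new process, the numbers mostly reflect interpreter startup; any difference between the two entries gives a rough idea of what loading constant.pm and folding the branch cost on a given machine.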