Code that writes code

I was reading through some suggestions on good coding practice when I came across the following:

Write Code That Writes Code

Code generators increase your productivity and help avoid duplication.

This makes sense, but I have had limited experience with it. For example, I once stumbled across the following bit of code:

if ($main::stateIn{'sibling'}) {
    no strict 'refs';
    &{$main::stateIn{'sibling'}}();
}

This code allowed the programmers to simply name a "sibling" in a hidden field of a Web page and have the target CGI script call the subroutine of that name. It also opened up some significant security issues, since anyone who edited the hidden field could call any subroutine by name, and it was replicated throughout many scripts. Since there were quite a number of these scripts, and most of them had quite a number of subroutines that could be called, recoding them by hand was likely to be tedious and error-prone.

To deal with this, I wrote the following program (which relies on my knowledge of our programming practices -- it's not likely to be portable):

#!C:/perl/bin/perl.exe -w
use strict;

open INV, "<$ARGV[0]" or die $!;
open OUT, ">out.txt" or die $!;

print OUT "SWITCH: {\n    if ( defined \$sibling ) {\n";
while ( <INV> ) {
    if ( /^\s*sub\s+([a-zA-Z][^\s{]+)/ ) {
        next if $1 eq 'AUTOLOAD' or $1 eq 'main';
        print OUT "        if ( \$sibling eq '$1' ) { &$1; last SWITCH };\n";
    }
}
print OUT "\n\n        &main;\n        last SWITCH;\n    }\n";

close INV;
close OUT;

Essentially, every time I was converting a program, I would run this snippet and it would produce an output file similar to the following:

SWITCH: {
    if ( defined $sibling ) {
        if ( $sibling eq 'uFMProducts' ) { &uFMProducts; last SWITCH };
        if ( $sibling eq 'catalogDealerProducts' ) { &catalogDealerProducts; last SWITCH };
        if ( $sibling eq 'prodByCatalog' ) { &prodByCatalog; last SWITCH };
        if ( $sibling eq 'catalogProductView' ) { &catalogProductView; last SWITCH };
        if ( $sibling eq 'catalogProductUpdate' ) { &catalogProductUpdate; last SWITCH };
        if ( $sibling eq 'dealerProducts' ) { &dealerProducts; last SWITCH };
        if ( $sibling eq 'categoryDetail' ) { &categoryDetail; last SWITCH };
        if ( $sibling eq 'prodByCategory' ) { &prodByCategory; last SWITCH };
        .
        .
        .

        &main;
        last SWITCH;
    }

Much better. Although I still had some manual clean-up to do, pruning out subs that shouldn't be called directly, I had reduced a long, tedious task to a couple of minutes of work and closed a significant security hole.

Aside from a few other limited examples, this is the extent of my "code from code". I am curious as to how other Monks have used "code from code" and what examples they might be able to provide. I have limited experience in this area and am excited about the possibility of learning more.

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.


Replies are listed 'Best First'.
Re: Code that writes code by chromatic (Archbishop) on Dec 22, 2000 at 23:41 UTC
One of the things I've been experimenting with for both Jellybean and the Everything Engine is the idea of turning an eval()ed script/section into an anonymous subroutine. The approach is a little different for each.

With Everything, there are pages with embedded code, embedded HTML, and calls to other nodes with embedded code. In the normal Engine, there's a regex that grabs token-delimited sections out and parses them accordingly, eventually finding and calling eval() on the embedded text. A normal superdoc may end up calling eval() a dozen or more times. That can get expensive.

I have some experimental patches that pull out embedded code sections and construct an anonymous subroutine out of the whole document. It's compiled, once, and stored in a cache. (Luckily, the cache already had code to update the document text if someone edits it -- I got that for free!) All subsequent calls, while the node containing the embedded code is still in the cache, only have the overhead of a method call.

The tricky part is dealing with all of the different types of stuff. Embedded code is supposed to execute as a single unit (one entry point, returns a simple string), so I have to wrap it in eval blocks. (That's eval BLOCK not eval STRING.)

Within Jellybean, we don't allow embedded code. We have the beginnings of a scripting system where you build a method from named snippets. For example, you could open a form, display a message, use a textfield to get a parameter, show a submit button, and end a form with a recipe like the following:

startform hello textfield submit closeform

Jellybean uses these as keys into a hash. The values are strings that can be concatenated into an anonymous subroutine as well. A simple `my $sub_ref = eval $sub_code;` will do what we want. (Yes, there's error checking.) All of a sudden, we have a new method. It doesn't have to be parsed each time. It's as fast as a built-in method.

Users can't execute arbitrary code, because someone has to make it available for them to use in this method. This is a powerful technique -- with not a few pitfalls -- but it's certainly an elegant solution for a few situations.
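
For readers who want to see the snippet idea in miniature, here is a rough sketch. It is not Jellybean's actual code: the `%snippet` contents, the `build_method` name, and the `name` argument are all made up for illustration. It only shows the general shape of concatenating named fragments into one string and compiling it once with eval.

```perl
use strict;
use warnings;

# Hypothetical snippet library: each value is a fragment of Perl source.
# Only fragments listed here can end up in a generated method.
my %snippet = (
    startform => q{ $out .= "<form method=\"post\">\n"; },
    hello     => q{ $out .= "Hello, $args{name}!\n"; },
    textfield => q{ $out .= "<input type=\"text\" name=\"answer\">\n"; },
    submit    => q{ $out .= "<input type=\"submit\">\n"; },
    closeform => q{ $out .= "</form>\n"; },
);

# Turn a recipe (whitespace-separated snippet names) into a code ref.
sub build_method {
    my ($recipe) = @_;
    my $body = '';
    for my $name (split ' ', $recipe) {
        # Unknown names are rejected before any code is generated.
        die "unknown snippet '$name'\n" unless exists $snippet{$name};
        $body .= $snippet{$name} . "\n";
    }
    my $sub_code = "sub { my %args = \@_; my \$out = '';\n$body return \$out; }";
    my $sub_ref  = eval $sub_code;      # compiled once, here
    die "generated code failed to compile: $@" if $@;
    return $sub_ref;
}

my $method = build_method('startform hello textfield submit closeform');
print $method->(name => 'Monk');
```

After `build_method` returns, each call to `$method` is an ordinary subroutine call; the recipe string is never parsed again.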
Re: Code that writes code by chipmunk (Parson) on Dec 22, 2000 at 22:46 UTC
I have a script that generates code which is evalled later on in the script. I had several reasons for this approach.

First, the generated code has several variable regexes, which may change during the life of the script, but not during one execution of the evalled code. For example:

# code to match the stress regex
$eval .= <<"    EOT";
    next unless /\$stress[$i]/o;
    EOT

Second, I'm using conditionals to generate different code, depending on various options:

if (defined $stress[$i]) {
    # code to match the stress regex
    $eval .= <<"    EOT";
    next unless /\$stress[$i]/o;
    EOT
}

And third, I'm using loops to generate repeated blocks of code:

for (my $i=0; $i<=$#entry; ++$i) {
    # ...
    if (defined $stress[$i]) {
        # code to match the stress regex
        $eval .= <<"        EOT";
        next unless /\$stress[$i]/o;
        EOT
    }
}

Generating code and then executing it with eval makes the code more efficient; the various conditionals are executed once overall, rather than once per line of input; and the regexes are only compiled once per execution, rather than once per match.

I also have an option that prints the generated code before it is evalled, to aid in debugging.
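
A self-contained variation on the same idea might look like the sketch below. It is not the script described above: `build_filter`, the row layout, the sample `@stress` patterns, and the `$debug` flag are assumptions for illustration, and the patterns are pasted into the generated source as literals (so they must not contain a `/`) rather than referenced with `/o`.

```perl
use strict;
use warnings;

# Hypothetical per-column patterns; undef means "no constraint on this column".
my @stress = ('^a', undef, 'ing$');
my $debug  = 0;   # set to 1 to print the generated code before compiling it

sub build_filter {
    my (@patterns) = @_;
    my $code = "sub {\n    my \@keep;\n    for my \$row (\@_) {\n";
    for my $i (0 .. $#patterns) {
        # This test runs once, at generation time, not once per row.
        next unless defined $patterns[$i];
        # The pattern text becomes a literal regex in the generated source.
        $code .= "        next unless \$row->[$i] =~ /$patterns[$i]/;\n";
    }
    $code .= "        push \@keep, \$row;\n    }\n    return \@keep;\n}\n";
    print $code if $debug;
    my $filter = eval $code;
    die "generated code failed to compile: $@" if $@;
    return $filter;
}

my $filter = build_filter(@stress);
my @rows   = (
    [ 'apple',   'x', 'running' ],
    [ 'banana',  'y', 'walking' ],
    [ 'avocado', 'z', 'ripe'    ],
);
my @kept = $filter->(@rows);
print join(' ', @$_), "\n" for @kept;
```

The generated sub contains only the checks that apply, so the inner loop never asks whether a column has a pattern.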
Re: Code that writes code by mirod (Canon) on Dec 23, 2000 at 00:13 UTC
I really like generating code on the fly. It is fairly easy if you watch your back(slashes):

1. generate a string with the code for a subroutine (that's the fun part)
2. generate an anonymous subroutine from it: `$sub= eval "sub { $sub_in_a_string }";`
3. use the subroutine: `$sub->( @args);`

et voila!

Here is a stupid example:

#!/bin/perl -w
use strict;

# very simple rules: s="<string>" or v=<number>
my %rules=( 's="toto"' => sub { print "toto!\n"; },
            'v=2'      => sub { print "waouh! 2!\n"; },
          );

# values to test
my @pairs=( toto => 2, toto => 1, tata => 2, tutu => 1);

my @rules= make_rules( %rules);                 # create the rules

while (@pairs) {
    my $string= shift @pairs;
    my $value=  shift @pairs;
    check_rules( $string, $value);              # check all rules
    print "\n";
}

sub check_rules {
    my( $string, $value)= @_;
    foreach my $rule (@rules) {
        my( $you_like, $do_it)= @{$rule};
        $do_it->() if( $you_like->( $string, $value));   # run routine if rule applies
    }
}

sub make_rules {
    my %rules= @_;
    my @rules;
    foreach my $exp (keys %rules) {
        my $sub_str= ' my( $s, $v)= @_;' ;               # get the arguments
        if( $exp=~ m/^s=(".*?")$/) {                     # s="<string>"
            # watch for the \, $s is a variable in the sub
            # while $1 is a variable in make_rules and a constant in the sub
            $sub_str .= "return 1 if( \$s eq $1);" ;
        }
        elsif( $exp=~ m/^v=(\d+)$/) {                    # v=<number>
            $sub_str .= "return 1 if( \$v == $1);" ;     # watch for the \
        }
        else {
            die "syntax error in rule $exp";
        }
        my $sub= eval "sub { $sub_str }";                # create the anonymous sub
        push @rules, [$sub, $rules{$exp}];               # store it with the sub to run
    }
    return @rules;
}

You can see an example in Ugly XML processing looking for a pure XML solution: I create a hairy regexp in `make_wrapper`, which is used in `wrap` -- calling the wrapper in wrap should really be written `$wrapper{$tag}->( @_);`
Re (tilly) 1: Code that writes code by tilly (Archbishop) on Dec 23, 2000 at 04:06 UTC
I have done this, and recommend it with warnings.

The main problem is that you are creating a layer of indirection between you and the problem. This makes it harder to figure out how you should do the problem. But when you get the solution, the solutions tend to be better.

There are many basic approaches that I have used. Each has advantages and disadvantages.

* You can write a little macro language which is turned into code and evalled. Damian Conway is fond of this approach. See, for a random example, Class::Contract. Generally this takes a lot of work to do, and involves creating a macro language. But the result can be very powerful. (Usually I just use some subset of Perl as my macro language and use do or eval to parse and interpret it...hrm...)
* You may have boiler-plate code to insert in appropriate places. See, for instance, AbstractClass. Outside of very rigid problems, this approach makes symbolic references look downright sane.
* Your use of text manipulations to make code in one form turn into another is common, particularly for mass edits, obfuscating code, and so on and so forth.
* Turning data structures into code and back is often very useful. See Data::Dumper. (I have used similar techniques to freeze data structures in one language and re-instantiate them in another.)
* Often a large and complex script can be maintained as an auto-generated thing from a number of small ones through some sort of template. Perl's Configure shell script is maintained this way. Take a look at the make utility for some ideas on how to do that.

Another key insight that may help is that a lot of functional techniques do the same thing as automatic code generation in a more controlled way. For instance, in a similar situation to yours I might use a hash of subs. You can build up a complex regular expression using qr//. Turning "templates" of functions into real functions can be done with closures.

You may want to take another gander at Why I like functional programming while thinking about how much it looks like having code that writes code.

Hopefully this should give you some ideas for how to use automatic code generation, and a few places to look for examples of it being done. :-)
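
To make the functional alternatives a little more concrete, here is a minimal sketch of a hash of subs, a qr// pattern, and a closure used as a function template. The handler bodies, `run_sibling`, `main_page`, and `make_matcher` are invented for the example; only the sub names `prodByCatalog` and `categoryDetail` are borrowed from the generated output in the original post.

```perl
use strict;
use warnings;

# Hypothetical handlers standing in for the real CGI subroutines.
sub prodByCatalog  { return "catalog listing" }
sub categoryDetail { return "category detail" }
sub main_page      { return "main page" }

# Hash of subs: only names listed here can ever be called.
my %dispatch = (
    prodByCatalog  => \&prodByCatalog,
    categoryDetail => \&categoryDetail,
);

# Look up the requested name; anything unknown falls back to the default.
sub run_sibling {
    my ($sibling) = @_;
    my $handler = defined $sibling ? $dispatch{$sibling} : undef;
    return $handler ? $handler->() : main_page();
}

# Closure as a function "template": one generator, many specialised subs.
sub make_matcher {
    my ($word) = @_;
    my $re = qr/\b\Q$word\E\b/i;    # compiled once, captured by the closure
    return sub { return $_[0] =~ $re ? 1 : 0 };
}

my $mentions_perl = make_matcher('perl');

print run_sibling('prodByCatalog'), "\n";
print run_sibling('system'), "\n";          # not in the table, so default runs
print $mentions_perl->('I love Perl'), "\n";
```

Compared with a generated SWITCH block, the table needs no regeneration when a sub is added; you add one line to `%dispatch`.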
Re: Code that writes code by extremely (Priest) on Dec 23, 2000 at 16:18 UTC
/me remembers why they invented LISP. =) An entire language dedicated to being able to do that...

--
$you = new YOU;
honk() if $you->love(perl)
Re: Code that writes code by cat2014 (Monk) on Dec 23, 2000 at 19:58 UTC
I've yet to write code that writes Perl code, but about 70% of my programming involves parsing our HTML code and Berkeley DBs and writing (non-Perl) test scripts based on the content. Perl is really, really, really good at this. I'm basically just setting up a bunch of rules, then parsing the content and substituting the relevant pieces into my test programs (hooray for regexen & s///!).
Re: Code that writes code by jima (Vicar) on Dec 27, 2000 at 19:43 UTC
Excellent topic!

When I first saw this node, the first thing that came to my mind was compilers. Sure, they compile your programs to executables, but as anyone who's taken a compiler construction class knows, they can also work by generating code for a virtual machine (often some kind of stack machine), and then passing the generated code to a previously written implementation of the VM on the target machine (usually provided by the instructor, so you can concentrate on worrying about lexers and grammars and whatnot).

I used this approach in a Perl module implementation of The Dada Engine, which is a simple BNF-like grammar for writing out random text (example here). Instead of just writing a module that simply executes the grammar, the Perl port was written so that it translates the grammar to stack machine instructions. When these instructions are written out to an external file, along with the Perl implementation of that machine, you have a small and fast-running standalone script that outputs your random text.
