
Thanks for this nifty piece of code.

I have been playing around and made your code into a module, certainly one which pollutes the caller's namespace, since all entity and attribute functions have to be exported. I don't know how to keep the clear syntax and have it OO at the same time. Hmm...

But apart from this, I suggest some changes in the rules of your game:

To add an element called name, we call the function NAME and pass it a function that will construct its attributes and children.
A dash in attribute names must be replaced by an underscore. Similar means must be provided for any character that is not allowed in Perl subroutine names but is present in attribute names (e.g. xml:lang).

The first change improves legibility and avoids clashes with Perl core functions (e.g. tr/// vs. <tr>). The second is necessary per DTD.
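
To illustrate the two rules, here is a minimal sketch of the name mapping. The helper names elem_sub_name and attr_sub_name are hypothetical, not part of the original code, and replacing every character outside [A-Za-z0-9_] with an underscore is just one assumed way to handle cases like xml:lang:

```perl
use strict;
use warnings;

# Hypothetical helpers (not in the original module): derive Perl sub
# names from XML element and attribute names.

# Rule 1: the element <name> is built by the function NAME.
sub elem_sub_name {
    my ($elem) = @_;
    return uc $elem;
}

# Rule 2: characters that cannot appear in a sub name become '_'.
sub attr_sub_name {
    my ($attr) = @_;
    (my $name = $attr) =~ s/[^A-Za-z0-9_]/_/g;   # '-', ':' and the like
    return $name;
}

printf "%-12s => %s\n", $_, elem_sub_name($_) for qw(tr p table);
printf "%-12s => %s\n", $_, attr_sub_name($_) for qw(http-equiv xml:lang);
```

One thing to watch with this assumption: two attribute names that differ only in such characters would map to the same sub name.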

Next would be completion of the entities and attributes array, and syntax checking - XML::DTDParser provides a simple way to do this.

use LWP::Simple;
use XML::DTDParser;
my $dtdfile = "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd";
# or any other, like xhtml1-frameset.dtd
my $dtd = get($dtdfile);
defined $dtd or die "could not fetch $dtdfile\n";
$dtd =~ s/.*=== Imported Names =+-->//s; # avoid 'die' in XML::DTDParser
my $DTD = ParseDTD $dtd;
my $elems = [ map {uc($_) } keys %$DTD ];
my %s;
my $attrs = [
    grep { s/-/_/g; ! $s{$_}++ }
    map { keys %{$DTD->{$_}->{'attributes'}} } keys %$DTD
];
define_vocabulary($elems,$attrs);

Oh yes. About eval vs. symbol table: I don't see much difference here. Recompiled every time? No, each entity/attribute sub is created once via eval, and that's it. What shows up via perl -MO=Deparse is that every block passed as an argument to a subroutine prototyped as & gets its own code reference at each call of the sub if it occurs in a different context, but that's regardless of the use of eval. All blocks in calls to p in this snippet

            # it's just Perl, so we can mix in other code
            for (2..5) {
                p { text "Plus paragraph number $_." }
            }

will have the same CODE reference, but the next call to p will have its own.
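
A small sketch to check this, assuming a stand-in p that simply hands back the code reference it was given (the real p builds an element instead):

```perl
use strict;
use warnings;

# Stand-in for the real p(): just return the block's code reference.
sub p(&) { return $_[0] }

my @refs;
for (2..5) {
    # Same place in the source; the block closes over no my-variable.
    push @refs, p { "Plus paragraph number $_." };
}

# A block at a different place in the source.
my $other = p { "A different block." };

# Code references compare by address when used as numbers.
printf "loop blocks share one ref:  %s\n",
    (grep { $_ != $refs[0] } @refs) ? "no" : "yes";
printf "next block has its own ref: %s\n",
    $other != $refs[0] ? "yes" : "no";
```

If a block closes over a my-variable it is a different story: each evaluation then yields a fresh closure.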

After all, eval is not evil. The hairy monster is -- perl, the father of perl's eval. What eval is used for is up to the programmer, and if people use eval to compile insecure code - then I guess the surrounding code isn't any better.

--shmem

Update

Hmm, this doesn't seem to be a minimal-cost interface, for the reasons stated above: each block gets its own reference, which doesn't get destroyed or re-used. Wrap render_doc { }; into a sub, call it 10000 times, and you'll end up with +192 MB of memory...

Update

The problem seems to be the anonymous $render_fn which doesn't get deallocated. Changing the block to

sub render_via_xml_writer {
    my $doc = shift;
    my $writer = XML::Writer->new(@_);  # extra args go to ->new()
    # my $render_fn;
    # $render_fn = sub {
    sub render_fn {
        my $frag = shift;
        my ($elem, $attrs, $children) = @$frag;
        $writer->startTag( $elem, map {@$_} @$attrs );
        for (@$children) {
            # ref() ? $render_fn->($_) : $writer->characters($_);
            ref() ? render_fn($_) : $writer->characters($_);
        }
        $writer->endTag($elem);
    }
    # $render_fn->($doc);
    render_fn($doc);
    $writer->end();
}

solves this issue. Sub in a sub? Yes, render_fn has to see the my() variables of render_via_xml_writer.
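
A caveat and an alternative sketch: a named sub declared inside another sub shares the outer my() variables only during the first call of the outer sub (perl -w reports 'Variable "$writer" will not stay shared'), so later calls to render_via_xml_writer would still write to the first $writer. Assuming the leak comes from the closure referring to itself through $render_fn, another way is to keep the anonymous sub and break that cycle by hand. TinyWriter below is a hypothetical stand-in for XML::Writer (it ignores attributes) so that the sketch runs without CPAN modules; render_doc_tree is likewise a made-up name:

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in for XML::Writer: collects output in a string
    # and counts how many writer objects have been destroyed.
    package TinyWriter;
    our $destroyed = 0;
    sub new        { my $class = shift; return bless { out => '' }, $class }
    sub startTag   { my ($self, $elem) = @_; $self->{out} .= "<$elem>" }
    sub characters { my ($self, $text) = @_; $self->{out} .= $text }
    sub endTag     { my ($self, $elem) = @_; $self->{out} .= "</$elem>" }
    sub output     { return $_[0]{out} }
    sub DESTROY    { $destroyed++ }
}

sub render_doc_tree {
    my ($doc) = @_;
    my $writer = TinyWriter->new;

    my $render_fn;
    $render_fn = sub {
        my ($elem, $attrs, $children) = @{ $_[0] };
        $writer->startTag($elem);            # attributes ignored in this sketch
        for my $child (@$children) {
            ref $child ? $render_fn->($child) : $writer->characters($child);
        }
        $writer->endTag($elem);
    };
    $render_fn->($doc);

    # Break the cycle: the closure holds $render_fn, which holds the closure.
    undef $render_fn;

    return $writer->output;
}
```

With the undef in place, the closure and the writer it captured can be freed when the sub returns; Scalar::Util's weaken on a second reference would be another way to get the same effect.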

In reply to Re: Embedding a mini-language for XML construction into Perl by shmem
in thread Embedding a mini-language for XML construction into Perl by tmoertel


