Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
In XSLT vs Templating?, I asked fellow monks for their opinion on using either a templating solution, like Template Toolkit 2, for the processing of output from a DB or similar prior to page display, or using XML output along with XSLT to work up the page display. Both have benefits and disadvantages, and no real forerunner came out of that discussion per se, beyond giving everyone else food for thought.

Now, I've been playing with XSLT over the last few days, and I thought that I could use a quick test to bookmark which methods might be better than others. Thus, I took 6 different ways one could work up output from a database that typically would be displayed as a web page, and timed the results to see which is the most efficient.

The database is Postgres, and I am using the monk locations as compiled by jcwren to fill it. The application that I do is to call the following query on the db:
SELECT id, name, xp, lat, long FROM monks ORDER BY lat LIMIT 25
The output then must be compiled into a table, as demonstrated below:

116014pjf827- 37.6725144.843333
...and so forth.

The 6 methods that I tried were:

  • Strict use of DBI and the builtin print command
  • Use of DBI and CGI's generation functions, after massaging the data into a form usable for these calls
  • Use of DBI and TT2 with no extra modules to process the data (Do note that with all the TT2-based methods, I used the non-XS variable cache, partially because the XS one appears to be broken in the current version)
  • Use of XML::Generator::DBI, TT2, and TT2's XML::Simple plugin to parse the XML. Note that because of the way XML::Simple read in the data, the order of the data was not explicitly kept, and a sort statement was required in the template file (and even then, it sorted on text, not numbers)
  • Use of XML::Generator::DBI, TT2, and TT2's XML::XPath plugin to parse the XML
  • Use of XML::Generator::DBI and XML::LibXSLT to transform the XML. I do realize that there are other XML/XSLT engines out there, and strictly speaking, this method employs non-pure Perl parts (as it piggybacks on GNOME's libxslt library), but this is still not an unreasonable test to perform.
Note in the code below that I tried to prep each method as much as possible before running the tests, including prepping the TT2 variable cache, creating the DBI handle, creating the XML parser, etc. The only thing that each subroutine should do is execute the SQL statement, process the data, and print it out to a target (here being /dev/null since I didn't want to flood my fs with a hugh file of the same text over and over).

In addition to the code, I've included the various templates that I used for the TT2 and XSLT methods.

The results from timing 50 runs a piece of each method are below (Box is Linux, perl 5.6.1, 200Mhz with 128Megs Ram):

Benchmark: timing 50 iterations of DBI and CGI, DBI and Print, DBI and + TT2, XML and TT2/Simple, XML and TT2/XPath, XML and XSLT... DBI and CGI: 5 wallclock secs ( 2.02 usr + 0.02 sys = 2.04 CPU) @ 2 +4.51/s (n=50) DBI and Print: 2 wallclock secs ( 0.23 usr + 0.02 sys = 0.25 CPU) @ + 200.00/s (n=50) (warning: too few iterations for a reliable count) DBI and TT2: 13 wallclock secs (10.91 usr + 0.07 sys = 10.98 CPU) @ +4.55/s (n=50) XML and TT2/Simple: 56 wallclock secs (43.87 usr + 0.23 sys = 44.10 C +PU) @ 1.13/s (n=50) XML and TT2/XPath: 154 wallclock secs (144.17 usr + 0.73 sys = 144.90 + CPU) @ 0.35/s (n=50) XML and XSLT: 27 wallclock secs (24.23 usr + 0.11 sys = 24.34 CPU) @ + 2.05/s (n=50) XML and TT2/XPath 0.345/s -- -70% -83% -92% -99% -100% XML and TT2/Simple 1.13/s 229% -- -45% -75% -95% -99% XML and XSLT 2.05/s 495% 81% -- -55% -92% -99% DBI and TT2 4.55/s 1220% 302% 122% -- -81% -98% DBI and CGI 24.5/s 7003% 2062% 1093% 438% -- -88% DBI and Print 200/s 57860% 17540% 9636% 4292% 716% --
To no surpise, straightforward solutions overwhelm those that do extra things (like generating XML). Of course, when you look at the code, and think of separation of presentation and content, this is only a CPU efficiency, and not a development efficiency.

Typically, the XML solutions fared worse than the straight DBI solutions. Of course, there is probably overhead in the XML::Generator::DBI methods, however, given the fair distance between the two XML/TT2 solutions and XML/XSLT, compared with the DBI solutions, this isn't probably very large. Instead, probably much of the overhead for the XML/TT2 solutions is the fact that they are using pure perl libraries to tackle the XML, and that means a lot of inching forward regexs. With the XML/XSLT solution, we have a precompiled library, and while there's probaby inch-by-inch regexes there as well, the lack of time needed to recompile the code is very significant.

What I do find surprising and yet satisfactory is that performace-wise, the XML/XSLT solution isn't terribly off from the best solution in terms of bother CPU and development efficiency, the DBI/TT2 solution. Yes, we are looking at what appears to be between 200-250% increase in time per cycle for the XSLT solution, but not insurmountable. For the purposes that I would be considering XML for (non-commerical, low traffic site), that would certainly acceptable.

Certainly, this is a simple test, but I think it does show that any solution of content generation that tried to separate content from the logic is going to run into CPU bottlenecks, obviously. And here, in the decision to go with a template solution vs XML, I think both are still up in the air, though templating has the edge in CPU usage.

However, I do read about many benefits of XSLT. For example, the process that I transform the data here is as follows:

XSLT Transform DBI --> XML -----------------> XHTML
which is about as easy as you can get. But because transforms can go to any space, not just XHTML, it's possible to insert special steps along the way to get where you need. For example, if I was going to allow to customization in which data was to be present (a 'fast' view vs a 'full' view) I could simply add a few more steps to this process to get to that:
CGI-->userinfo -----| | Data Pruning XSLT DBI--> DataXML -----+--> XML ------------------> XML2 | XHTML Transform | | V XHTML
whereby the data pruning transformation would take the various <monk> elements, and using info from the userinfo, convert those into a new <tablerow> tag; at this point, regardless of the customization of the user, we have several <tablerow>s in the XML. Then we simply plug those into the XHTML transformer, taking the <tablerow>s into alternating colored <tr>s. Other customizations can be done here as well, and furthermore, other levels of transforms can be easily chained to make a complex but easily servicable web page. While you can coerce TT2 to do a similar functionality, it's just not as simple as chaining XSLT transformations.

So the end result of this study is that both Templating and XSLT have their place. If CPU efficiency is absolutely critical, it's much better to go with TT2 (or any other template solution). On the other hand, if you want to have a lot more flexibility in the transformations, using XML/XSLT may be the right way, but as pointed out in recent threads, this is still new to a lot of people and may take a lot of getting used to.

And now for the code: - The actual program
#!/usr/bin/perl -w use strict; use DBI; use CGI qw/-no_xhtml :standard/; use XML::Generator::DBI; use XML::Handler::YAWriter; use XML::LibXML; use XML::LibXSLT; use Template; use Data::Dumper; use Benchmark qw( cmpthese ); $Template::Config::STASH = 'Template::Stash'; my $dbh = DBI->connect( "dbi:Pg:dbname=monksdb", "", "" ) or die $DBI::errstr; my $query = "SELECT id, name, xp, lat, long FROM monks ORDER BY lat LI +MIT 25"; my $sth = $dbh->prepare( $query ) or die $DBI::errstr; my $ya = XML::Handler::YAWriter->new( AsString => 1 ); my $generator = XML::Generator::DBI->new( Handler => $ya, dbh => $dbh, RowElement => "monk" ); my $tt2 = Template->new; my $tt2_nonXML = "template1.tt2"; my $tt2_XML = "template2.tt2"; my $tt2_XPath = "template3.tt2"; my $parser = new XML::LibXML; my $xslt = new XML::LibXSLT; my $sheet = "xslt_sheet.xsl"; my $slt = $parser->parse_file( $sheet ); my $stylesheet = $xslt->parse_stylesheet( $slt ); open FILE, ">/dev/null" or die "Cannot write out: $!"; my $target = \*FILE; cmpthese( 50, { "DBI and Print" => \&generate_from_straight_dbi_and_print, "DBI and CGI" => \&generate_from_straight_dbi_and_cgi, "DBI and TT2" => \&generate_from_straight_dbi_and_tt2, "XML and TT2/Simple" => \&generate_from_xml_and_tt2_and_xmlsimp +le, "XML and TT2/XPath" => \&generate_from_xml_and_tt2_and_xpath, "XML and XSLT" => \&generate_from_xml_and_xslt } ); close FILE; # Here, we use straight DBI calls and print calls to mark up # the table sub generate_from_straight_dbi_and_print { # my $target = shift; $sth->execute() or die $DBI::errstr; print $target "Content-Type: text/html\n\n"; print $target "<html><body><table>\n"; my $colorrow = 0; while ( my ( $id, $name, $xp, $lat, $long ) = $sth->fetchrow_array() + ) { $colorrow = !$colorrow; my $color = ( $colorrow ) ? "#FFFFFF" : "#D0D0FF"; print $target <<ROW; <tr> <td bgcolor="$color">$id</td> <td bgcolor="$color">$name</td> <td bgcolor="$color">$xp</td> <td bgcolor="$color">$lat</td> <td bgcolor="$color">$long</td> </tr> ROW ; } print $target "</table></body></html>"; } # Here, we group the results as to make it easier for CGI # to print out (avoiding large HERE docs...) sub generate_from_straight_dbi_and_cgi { # my $target = shift; $sth->execute() or die $DBI::errstr; my @data; while ( my @row = $sth->fetchrow_array() ) { push @data, \@row; } my $colorrow = 0; print $target header('text/html'), start_html, table( map { $colorrow = !$colorrow; my $color = ( $colorrow ) ? "#FFFFFF" : "#D0D0FF"; Tr( td( {-bgcolor=>$color}, $_ ) ) } @data ), end_html; } # Here, we pass the results to Template Toolkit for printing sub generate_from_straight_dbi_and_tt2 { # my $target = shift; $sth->execute() or die $DBI::errstr; my @data; while ( my @row = $sth->fetchrow_array() ) { push @data, \@row; } print $target header; $tt2->process( $tt2_nonXML, { monks => \@data }, $target ) or die $tt2->error(),"\n"; } # Use TT2 again, but now pass it XML and use the XPath module # for parsing sub generate_from_xml_and_tt2_and_xmlsimple { # my $target = shift; my $xml = $generator->execute( $query ); print $target header; $tt2->process( $tt2_XML, { results => $xml }, $target ) or die $tt2->error(), "\n"; } # Use TT2 again, but now pass it XML and use the XPath module # for parsing sub generate_from_xml_and_tt2_and_xpath { # my $target = shift; my $xml = $generator->execute( $query ); print $target header; $tt2->process( $tt2_XPath, { results => $xml }, $target ) or die $tt2->error(), "\n"; } # Use LibXML/LibXSLT to parse the results sub generate_from_xml_and_xslt { # my $target = shift; my $xml = $generator->execute( $query ); print $target header; my $source = $parser->parse_string( $xml ); my $results = $stylesheet->transform( $source ); print $target $stylesheet->output_string( $results ); }
template1.tt2 - The straight-forward TT2 Template
[% colorrow = 0 %] <html> <body> <table> [% FOREACH monkinfo = monks %] <tr> [% colorrow = !colorrow %] [% IF colorrow %] [% color = "#FFFFFF" %] [% ELSE %] [% color = "#D0D0FF" %] [% END %] [% FOREACH item = monkinfo %] <td bgcolor="[% color %]"> [% item %] </td> [% END %] </tr> [% END %] </table> </body> </html>
template2.tt2 - The TT2/XML::Simple Template
[% USE xml = XML.Simple( results ) %] [% xml %] [% colorrow = 0 %] <html> <body> <table> [% orderedmonks = %] [% FOREACH monkinfo = orderedmonks %] <tr> [% colorrow = !colorrow %] [% IF colorrow %] [% color = "#FFFFFF" %] [% ELSE %] [% color = "#D0D0FF" %] [% END %] <td bgcolor="[% color %]"> [%$ %] </td> <td bgcolor="[% color %]"> [% monkinfo %] </td> <td bgcolor="[% color %]"> [%$monkinfo.xp %] </td> <td bgcolor="[% color %]"> [%$ %] </td> <td bgcolor="[% color %]"> [%$monkinfo.long %] </td> </tr> [% END %] </table> </body> </html>
template3.tt2 - The TT2/XPath Template
[% USE xpath= XML.XPath( results ) %] [% colorrow = 0 %] <html> <body> <table> [% FOREACH monk = xpath.findnodes('/database/select/monk') %] <tr> [% colorrow = !colorrow %] [% IF colorrow %] [% color = "#FFFFFF" %] [% ELSE %] [% color = "#D0D0FF" %] [% END %] <td bgcolor="[% color %]"> [% xpath.find('id',monk) %] </td> <td bgcolor="[% color %]"> [% xpath.find('name',monk) %] <td> <td bgcolor="[% color %]"> [% xpath.find('xp',monk) %] </td> <td bgcolor="[% color %]"> [% xpath.find('lat',monk) %] </td> <td bgcolor="[% color %]"> [% xpath.find('long',monk) %] </td> </tr> [% END %] </table> </body> </html>
xslt_sheet.xsl - The XSLT Tranform
<xsl:stylesheet version = '1.0' xmlns:xsl=''> <xsl:template match="/database/select"> <table> <xsl:for-each select="//monk"> <tr> <td> <xsl:if test="position() mod 2 = 0"> <xsl:attribute name="bgcolor">#D0D0FF</xsl:attribute> </xsl:if> <xsl:if test="position() mod 2 = 1"> <xsl:attribute name="bgcolor">#FFFFFF</xsl:attribute> </xsl:if> <xsl:value-of select="id"/> </td> <td> <xsl:if test="position() mod 2 = 0"> <xsl:attribute name="bgcolor">#D0D0FF</xsl:attribute> </xsl:if> <xsl:if test="position() mod 2 = 1"> <xsl:attribute name="bgcolor">#FFFFFF</xsl:attribute> </xsl:if> <xsl:value-of select="name"/> </td> <td> <xsl:if test="position() mod 2 = 0"> <xsl:attribute name="bgcolor">#D0D0FF</xsl:attribute> </xsl:if> <xsl:if test="position() mod 2 = 1"> <xsl:attribute name="bgcolor">#FFFFFF</xsl:attribute> </xsl:if> <xsl:value-of select="xp"/> </td> <td> <xsl:if test="position() mod 2 = 0"> <xsl:attribute name="bgcolor">#D0D0FF</xsl:attribute> </xsl:if> <xsl:if test="position() mod 2 = 1"> <xsl:attribute name="bgcolor">#FFFFFF</xsl:attribute> </xsl:if> <xsl:value-of select="lat"/> </td> <td> <xsl:if test="position() mod 2 = 0"> <xsl:attribute name="bgcolor">#D0D0FF</xsl:attribute> </xsl:if> <xsl:if test="position() mod 2 = 1"> <xsl:attribute name="bgcolor">#FFFFFF</xsl:attribute> </xsl:if> <xsl:value-of select="long"/> </td> </tr> </xsl:for-each> </table> </xsl:template> </xsl:stylesheet>

Dr. Michael K. Neylon - || "You've left the lens cap of your mind on again, Pinky" - The Brain
"I can see my house from here!"
It's not what you know, but knowing how to find it if you don't know that's important

In reply to XSLT vs Templating, Part 2 by Masem

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others chanting in the Monastery: (6)
    As of 2021-01-19 14:55 GMT
    Find Nodes?
      Voting Booth?