Template Toolkit 2 Optimizing

Now that I've worked a lot with the Template Toolkit 2 on a website with a backend database, I was contemplating how flexible it is about where you put data manipulation prior to user display: in the Perl code, in the template code, or a combination of both. I decided to write some quick code to see where bottlenecks in using Template Toolkit 2 might occur, and while the results might not be that surprising to long-time Perl hackers, they do help to clarify how one should use TT2 to its maximum benefit.

I'll put the code at the end of this, but the basics of what I did were as follows:

Generate n data items, each consisting of a userid and an array of random values. When we display an item, we want to show the userid, the list of values, and the average of the values.
Method 1 - Create a new TT2 object for each item, compute the join and the average in Perl, then send each item to the template. This is expected to be very slow: TT2 caches compiled templates, and recreating the TT2 object for every item throws that cache away.
Method 2 - Create the TT2 object only once, then follow the steps above. Watch the caching in action!
Method 3 - Instead of sending one item at a time to the template file, send the entire array to TT2 in a single call. The average and the join are computed with TT2 functions.
Method 4 - Like Method 3, but the average is precalculated in Perl and added to each data item before the entire array is sent to TT2. The join is still done in TT2.
Method 5 - Like Method 4, but the join is also done in Perl before the entire array is sent to TT2.

Note that 4 and 5 were done last since they actually modify the hash elements and this might cause some problems with Methods 1-3.

I ran this with n=10, 100, 1000, and 10000 items on a 200 MHz Pentium with 128 MB of RAM, running Linux 2.2.17 and Perl 5.6 with minimal other processes, and here are the CPU times I got for each, in seconds:

CPU seconds   n=10   n=100   n=1000   n=10000
Method 1      1.52    5.65    45.35    419.53
Method 2      0.08    0.54     4.90     45.05
Method 3      0.39    2.49    21.93    209.96
Method 4      0.10    0.40     3.31     32.30
Method 5      0.09    0.34     2.88     26.70

The results are not all that surprising. As stated above, Method 1 should be very slow, since TT2 caches the template files as well as any startup work, and all of that is lost when the TT2 object is recreated. However, comparing 2 and 5 also shows that there are significant startup costs for each call to $tt->process, so the more of your data you can send to a template file in one call, the better. TT2 does not appear to do a good job of handling loops written in its own language, since Method 3 is rather poor, or at least not 'short' loops like the one used to calculate the average of the numbers; this could also be related to math performance. Surprisingly, another 'loop'-like operation, join, appears to perform much better, since the performance penalty between 4 and 5 is small; most likely TT2 is using the built-in Perl join, which is optimized to be speedy. But there is still some penalty for doing the join in the template, possibly due to simple parsing.
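To put rough numbers on those comparisons, here is a small sketch that computes ratios from the n=10000 column of the table. The helper names (ratio, per_call_overhead_ms) are made up for illustration, and the per-call figure is only an estimate: Methods 2 and 5 also use different templates, so the difference is not purely the cost of calling process.

```perl
use strict;
use warnings;

# CPU seconds at n = 10000, copied from the table above.
my %secs = ( 1 => 419.53, 2 => 45.05, 3 => 209.96, 4 => 32.30, 5 => 26.70 );

# How many times slower method $slow is than method $fast.
sub ratio {
    my ( $slow, $fast ) = @_;
    return $secs{$slow} / $secs{$fast};
}

# Rough extra cost per process() call, in milliseconds:
# Method 2 makes $n calls where Method 5 makes one.
sub per_call_overhead_ms {
    my ($n) = @_;
    return 1000 * ( $secs{2} - $secs{5} ) / ( $n - 1 );
}

printf "New TT2 object per item vs. one object (1 vs 2): %.1fx\n", ratio( 1, 2 );
printf "Average in TT2 vs. average in Perl (3 vs 4):     %.1fx\n", ratio( 3, 4 );
printf "Join in TT2 vs. join in Perl (4 vs 5):           %.1fx\n", ratio( 4, 5 );
printf "Estimated overhead per process() call:           %.2f ms\n",
    per_call_overhead_ms(10000);
```

On these figures, recreating the TT2 object costs roughly 9x, averaging inside the template roughly 6.5x, and joining inside the template only about 1.2x.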

So, my conclusions from this test are as follows: if you are going to be sending large amounts of data through a TT2 template, you should:

Collect as much of it as possible into a single array, and let a TT2 FOREACH loop over the data; you gain the benefit of caching and avoid the startup costs of repeated calls.
Do as much of the data processing as possible in Perl rather than within TT2. TT2 can do a good amount of primitive data handling (like string operations) without too much cost, but math and other operations are best done in Perl.
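As a minimal sketch of the second point, the preprocessing for Method 5 can be pulled into one helper that runs before the single call to TT2. The name prepare_items is hypothetical, and the helper assumes every item carries a non-empty array reference under 'datum', as in the test script at the end of this post. List::Util is a core module; it is used here in place of the hand-written summing loop.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Add 'average' and 'values' to each item in place, mirroring Method 5.
# Assumes each item has a non-empty array ref under the key 'datum'.
sub prepare_items {
    my ($items) = @_;
    for my $item (@$items) {
        my $values = $item->{datum};
        $item->{average} = sum(@$values) / @$values;
        $item->{values}  = join ',', @$values;
    }
    return $items;
}

# Two small hand-made items standing in for the random test data.
my @data = (
    { id => 0, datum => [ 2,  4, 6 ] },
    { id => 1, datum => [ 10, 20 ] },
);

prepare_items( \@data );

# Same layout as the method_5 template: id - values ==> average
printf "%s - %s ==> %s\n", @{$_}{qw(id values average)} for @data;
```

After this step the whole array can be handed to TT2 in one call, as { data => \@data }, and the template only has to print fields.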

From a practical usage point of view, I would also suggest trying to 'objectify' your data as much as possible, so that you don't have to rewrite your Perl when you decide to rewrite the template. For example, if you have a list of names, stored as last name, first name, and middle initial, it's easier to send these as a hash (e.g. last => $last, first => $first) than to merge them into a string ahead of time ("$first $middle $last"), as you can then decide in the template whether you want to display the whole name, just the last name, or some combination thereof. Not necessarily the best example, but you can see the possibilities.
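A rough sketch of the name example, assuming the Template module is installed. The sample names, the render helper, and the two inline templates are invented for illustration; the templates are passed to process as scalar references rather than file names so the example fits in one file.

```perl
use strict;
use warnings;
use Template;

# Names sent as separate fields, not pre-merged into one string.
my @people = (
    { first => 'Jane', middle => 'Q', last => 'Public' },
    { first => 'John', middle => 'X', last => 'Doe' },
);

# Two presentations of the same data; only the template text differs.
my $full_name  = "[% FOREACH p = people %][% p.first %] [% p.middle %]. [% p.last %]\n[% END %]";
my $last_first = "[% FOREACH p = people %][% p.last %], [% p.first %]\n[% END %]";

my $tt = Template->new;

# Process an inline template against the same @people list.
sub render {
    my ($template_text) = @_;
    my $out = '';
    $tt->process( \$template_text, { people => \@people }, \$out )
        or die $tt->error;
    return $out;
}

print render($full_name);
print render($last_first);
```

Switching from the full name to "last, first" touches only the template text; the Perl side that builds @people stays the same.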

Please note that I believe I've chosen good test conditions to explore this and implemented them as well as I could. If you think there are other cases or a better way to do this, please let me know, and I'll see what I can do. But as with nearly everything, YMMV - if TT2 performance is that vital to your website, make sure you test it yourself and see what conditions optimize it for you.

Here's the code as promised: first, the perl script:

#! /usr/bin/perl -wT

use strict;

use Benchmark;
use Template;

# Generate some 'data'

my @data;
for (my $i = 0 ; $i < 10 ; $i++ ) {
  my %hash;
  $hash{ 'id' } = $i;
  my @values;
  my $num = 5 + rand 20;
  for (my $j = 0 ; $j < $num; $j++ ) {
    push @values, rand 10000;
  }
  $hash{ 'datum' } = \@values;
  push @data, \%hash;
}

my $string;

# Method 1: Regenerate the template every time (duh, should be slow...)
timethis( 1, sub
          {
              foreach my $datum ( @data ) {
                  my $tt1 = Template->new;
                  # get the average...
                  my $average = 0;
                  foreach my $j ( @{ $datum->{ 'datum' } } ) {
                      $average += $j;
                  }
                  $average = $average / @{ $datum->{ 'datum' }};
                  $tt1->process( 'method_1', {
                      id => $datum->{ 'id' },
                      values => join(',', @{$datum->{ 'datum' }} ),
                      average => $average
                      }, \$string );
              }
          });

# Method 2: No template regeneration...
timethis( 1, sub
          {
              my $tt2 = Template->new;
              foreach my $datum ( @data ) {
                  # get the average...
                  my $average = 0;
                  foreach my $j ( @{ $datum->{ 'datum' } } ) {
                      $average += $j;
                  }
                  $average = $average / @{ $datum->{ 'datum' }};
                  $tt2->process( 'method_1', {
                      id => $datum->{ 'id' },
                      values => join(',', @{$datum->{ 'datum' }} ),
                      average => $average
                      }, \$string );
              }
          });

# Method 3: Let Template Toolkit handle some functions
timethis( 1, sub
          {
              my $tt3 = Template->new;
              $tt3->process( 'method_3', {
                      data => \@data
                      }, \$string );
          });

# Method 4: Doing some processing in perl before sending to Template
timethis( 1, sub
          {
              my $tt4 = Template->new;
              foreach my $datum ( @data ) {
                  # get the average...
                  my $average = 0;
                  foreach my $j ( @{ $datum->{ 'datum' } } ) {
                      $average += $j;
                  }
                  $datum->{ 'average' } = $average / @{ $datum->{ 'datum' }};
              }
              $tt4->process( 'method_4', {
                      data => \@data
                      }, \$string );
          });

# Method 5: Doing all processing in perl before sending to Template
timethis( 1, sub
          {
              my $tt5 = Template->new;
              foreach my $datum ( @data ) {
                  # get the average...
                  my $average = 0;
                  foreach my $j ( @{ $datum->{ 'datum' } } ) {
                      $average += $j;
                  }
                  $datum->{ 'average' } = $average / @{ $datum->{ 'datum' }};
                  $datum->{ 'values' } = join( ',', @{ $datum->{ 'datum' }} );
              }
              $tt5->process( 'method_5', {
                      data => \@data
                      }, \$string );
          });

method_1

[% id %] - [% values %] ==> [% average %]

method_3

[% FOREACH n = data %]
  [% average = 0 %]
  [% FOREACH j = n.datum %]
    [% average = average + j %]
  [% END %]
  [% average = average / n.datum.size %]
[% n.id %] - [% n.datum.join(',') %] ==> [% average %]
[% END %]

method_4

[% FOREACH n = data %]
[% n.id %] - [% n.datum.join(',') %] ==> [% n.average %]
[% END %]

method_5

[% FOREACH n = data %]
[% n.id %] - [% n.values %] ==> [% n.average %]
[% END %]

Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain



P is for Practical
	PerlMonks