Profiling/Benchmarking web applications

by jryan (Vicar)
on Aug 24, 2004 at 19:03 UTC

jryan has asked for the wisdom of the Perl Monks concerning the following question:

I'm currently working on a web application at work using Perl, DBI, and the Template Toolkit. It's very nice, and it's near completion, but the performance isn't as fast as I'd like. I was wondering if anyone knew of a code profiler (kinda like Devel::DProf) that could be attached to a web application so that I can find the slower sections of my code. Thanks in advance.


Replies are listed 'Best First'.
Re: Profiling/Benchmarking web applications
by gmax (Abbot) on Aug 24, 2004 at 21:21 UTC

    Use DBI::ProfileDumper::Apache.

    Quoting from the docs:

    DBI::ProfileDumper::Apache - capture DBI profiling data from Apache/mod_perl

    SYNOPSIS

    Add this line to your httpd.conf:

    PerlSetEnv DBI_PROFILE DBI::ProfileDumper::Apache

    Then restart your server. Access the code you wish to test using a web browser, then shutdown your server. This will create a set of dbi.prof.* files in your Apache log directory. Get a profiling report with dbiprof:

    dbiprof /usr/local/apache/logs/dbi.prof.*

    When you're ready to perform another profiling run, delete the old files

    rm /usr/local/apache/logs/dbi.prof.*

    and start again.

    Additionally, if you wish to fine-tune your profiling, see Speeding up the DBI for some examples of how to include profiling code (using DBI::Profile) in your application.
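    For instance, here is a minimal (untested) sketch of switching profiling on from within the application itself; the connection details and table are placeholders, not from the original post:

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:mysql:test', 'user', 'password',
                           { RaiseError => 1 });

    # 6 = 2 (profile by statement) + 4 (profile by method), the same
    # values the DBI_PROFILE environment variable accepts.
    $dbh->{Profile} = 6;

    my $sth = $dbh->prepare('SELECT id, name FROM users WHERE id = ?');
    $sth->execute(42);
    $sth->fetchall_arrayref;

    # DBI prints the accumulated profile to STDERR when the handle is
    # destroyed (or at END time).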

    
Re: Profiling/Benchmarking web applications
by dragonchild (Archbishop) on Aug 24, 2004 at 19:44 UTC
    I would start with the usual suspects.
    • How are you using DBI? Are you following the best practices as outlined in its POD? (A sketch of two of them follows this list.)
    • Are you creating and destroying lots and lots of nested data structures? This can be expensive.
    • Are you using mod_perl? If you are, are you taking advantage of its more advanced features?
    • Is your schema normalized? Do you have the correct indices on it?
    • Is your server sized correctly for your application? If your database lives on the same machine as your webserver, and that machine has one 1GHz CPU and a single 10k RPM hard drive, there's not much improvement to be had from the code alone: you're going to be I/O-bound no matter which way you cut it. (Adding a CPU, interestingly, can help. Adding striped disks is better. Moving the database to another machine is best.)
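    On the first point, here is a minimal sketch (untested; the connection details and schema are hypothetical) of two of those practices, cached statement handles and placeholders:

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:mysql:app', 'user', 'password',
                           { RaiseError => 1 });

    # prepare_cached() reuses the compiled statement handle across
    # calls, which pays off under a persistent environment like mod_perl.
    my $sth = $dbh->prepare_cached(
        'SELECT title FROM reports WHERE author_id = ?'
    );

    # Placeholders let the database reuse its query plan and avoid
    # quoting bugs.
    my $author_id = 42;    # hypothetical
    $sth->execute($author_id);
    while (my ($title) = $sth->fetchrow_array) {
        print "$title\n";
    }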

    I would suspect that if you examined the above items, you would get a 50% or higher speed improvement.

    An example - I came on board at my current company to speed up some reports. The first thing I looked at was the performance of the SQL. By reorganizing the schema, I took the time spent in the database from 243 seconds to 3 seconds. Not a single other thing changed.

    Then, I looked at the presentation layer. Converting from Oracle's Application Server to mod_perl + CGI::Application + PDF::Template took the report presentation from 30 to 2 seconds.

    So, just by examining the architecture, the reports went from just under 5 minutes to around 5 seconds.

    After that, I went ahead and ticked off every item in the checklist I listed above. The webapp now does about 100x what it used to do in about a tenth of the time.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

    I shouldn't have to say this, but any code, unless otherwise stated, is untested

      Thanks for the tips, but the bottleneck turned out to be Template Toolkit. It's taking up an unbelievable 60% of the total invocation time. I'm actually pretty shocked; I knew TT was pretty heavyweight, but I'm not even embedding Perl code within the templates! I'm totally stumped.

        You don't mention whether your application runs under CGI or a persistent framework such as mod_perl, FastCGI or PersistentPerl.

        If you care about performance, I assume you're avoiding CGI so you can reuse database connections and other things that take some time to initialise. Template Toolkit objects are such things: calling Template->new() for each request will make your application run more slowly.

        Template Toolkit is used on plenty of high volume Web sites: it's certainly possible to have it run quickly. Unless your application is very simple, TT shouldn't take up 60% of the application's run time.

        Thanks for the tips, but the bottleneck turned out to be Template Toolkit. It's taking up an unbelievable 60% of the total invocation time. I'm actually pretty shocked

        That's high. Are you:

        • Only creating the Template instance once (I'm assuming you're using mod_perl)?
        • Caching compiled templates (take a look at the COMPILE_EXT and COMPILE_DIR configuration options)?
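        To illustrate both points, a minimal sketch (untested; paths and template names are assumptions, not from the original post):

        use strict;
        use warnings;
        use Template;

        # Created once per process (e.g. in a mod_perl startup file),
        # not once per request.
        my $tt = Template->new({
            INCLUDE_PATH => '/var/www/templates',
            COMPILE_EXT  => '.ttc',          # cache compiled templates...
            COMPILE_DIR  => '/tmp/tt_cache', # ...in a separate directory
        }) or die Template->error;

        # Each request then reuses the same $tt object:
        my $output;
        $tt->process('page.tt', { user => 'somebody' }, \$output)
            or die $tt->error;
        print $output;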

        If you're not using some of the more advanced features of TT, perhaps you can switch to a lighter-weight templating engine? We use HTML::Template, and it does the job for us. Obviously it has far fewer features, and I can't guarantee that it's faster, either. Perhaps you can determine which features you MUST have in your templating system, install all the engines that meet your needs, and compare their performance? If you do, report your findings, since I'm sure others would be interested in the results.

        --
        edan

        The other replies have addressed some of the reactions I have regarding TT taking 60% of the response time. I have a few further questions.
        • 60% of what? Is it 10 seconds? 1 minute? 1 second?
        • Are you using mod_perl? That one single item will often cut 50% of your response time. And, it's a change that's transparent to the CGI scripts. [1]
        • Are you using a ton of nested BLOCK directives?
        • Are you doing things like
          [% foo = $bar.baz %] [% baz = qux.$foo %]

          That does a lot of eval work behind the scenes, which can be expensive; see the sketch after this list.

        • How deep are your nested loops? Nested loops don't scale linearly, in any templating system. HTML::Template, which is arguably the most efficient commonly used templating system, has serious performance problems with loops nested three or more levels deep.
        • Can you compile and/or cache the output from some of your templates?
        • Are you making calls to the database in your templates using the DBI plugin?
        [1] Assuming, of course, you used sane coding standards. Persistence can be a bitch if you're converting the first CGI script you ever wrote to run under MP::Registry.
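        To make the dynamic-key point concrete, here is an untested sketch (the data is made up) comparing static and dynamic access:

        use strict;
        use warnings;
        use Template;

        my $tt = Template->new or die Template->error;

        # Static access: the key 'baz' is known at compile time.
        my $static  = '[% bar.baz %]';

        # Dynamic access: $foo must be resolved on every pass, which is
        # the extra eval work described above.
        my $dynamic = '[% foo = "baz"; bar.$foo %]';

        my $vars = { bar => { baz => 'hello' } };
        for my $tmpl ($static, $dynamic) {
            my $out;
            $tt->process(\$tmpl, $vars, \$out) or die $tt->error;
            print "$out\n";    # both print "hello"
        }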


      Are you suggesting optimizing non-profiled code? While experience usually helps you identify bottlenecks with mere eye-grep, you'd better not encourage others to do the same.

      One of the strongest laws of performance tuning is: never even think of optimizing before profiling. Otherwise you'll spend your time on code that may not affect performance at all, and skip over the things that are considered fast but in fact aren't. You probably know about the gotchas with Cache::SharedMemoryCache or HTML::Template's global_vars. Weren't they surprises? They're just examples of how important profiling is (and how rarely it is actually carried out).

        You are absolutely correct - nothing is a substitute for good profiling, and everyone should become familiar with ways of profiling their application. (The same goes for testing, too.)

        However, there are certain constructs which are known to be performance hogs. For example, using $sth->fetchall_arrayref({}); is practically the slowest way to get data from a database, when compared with other methods. Another is H::T's global_vars, as you mentioned. These don't need to be profiled because they are known performance hits.
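        To make the contrast concrete, here is a sketch (untested; table and connection details are hypothetical) of the slow style next to a faster one:

        use strict;
        use warnings;
        use DBI;

        my $dbh = DBI->connect('dbi:mysql:app', 'user', 'password',
                               { RaiseError => 1 });
        my $sth = $dbh->prepare('SELECT id, name FROM users');
        $sth->execute;

        # Slow: allocates a fresh hashref for every row.
        # my $rows = $sth->fetchall_arrayref({});

        # Faster: bind Perl variables once; fetch() reuses them per row.
        my ($id, $name);
        $sth->bind_columns(\$id, \$name);
        while ($sth->fetch) {
            # work with $id and $name here
        }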

        I would suggest merging our two approaches. Setting up a good profiling scenario can be time-consuming. In my experience, tackling the usual suspects has almost always provided enough improvement without needing to do a full profiling of a webapp.

        However, after hitting the usual suspects, profiling is most definitely the way to go. And, frankly, I'm not surprised that output generation is expensive. But, I suspect that most webapps are getting bogged down in other areas.


Re: Profiling/Benchmarking web applications
by cosimo (Hermit) on Aug 24, 2004 at 21:58 UTC

    For quick and dirty profiling, I usually just add a -d:DProf switch to the first line of my web scripts:

    #!/usr/bin/perl -d:DProf
    # my usual cgi script
    # ...

    After CGI execution, you will find the usual 'tmon.out' file in the same folder as your CGI script.

    It might be a good idea to run your httpd server in single-process mode (-X for Apache).
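    For reference, a typical session might look like this (the paths are assumptions; dprofpp is the report tool that ships with Devel::DProf):

    httpd -X &
    # request the CGI once through a browser, then stop the server
    dprofpp /usr/local/apache/cgi-bin/tmon.out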

      Ah, thanks! I can't believe I didn't think of doing that. I'm always so used to invoking it from the command line. I feel pretty stupid now!

      The -X option is a good one. Another is to make tmon.out a FIFO, then have a Perl script or something read from the pipe and write the output out under a new name.
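      A rough sketch of such a reader (untested; the path and naming scheme are assumptions):

      #!/usr/bin/perl
      use strict;
      use warnings;

      # tmon.out was first replaced with a named pipe: mkfifo tmon.out
      my $fifo = '/usr/local/apache/cgi-bin/tmon.out';
      my $n    = 0;

      while (1) {
          # open() blocks until the profiler starts writing a dump
          open my $in, '<', $fifo or die "open $fifo: $!";
          my $dump = do { local $/; <$in> };    # slurp one complete dump
          close $in;
          next unless defined $dump and length $dump;

          # save each run under a unique name so nothing is overwritten
          my $name = sprintf 'tmon.%03d.out', ++$n;
          open my $out, '>', $name or die "open $name: $!";
          print {$out} $dump;
          close $out;
      }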

        Could someone explain how this would help? Writing to a file, or writing to a FIFO which is read by a script that writes to a file, basically just puts the output into a file. What am I missing?

Re: Profiling/Benchmarking web applications
by perrin (Chancellor) on Aug 24, 2004 at 21:14 UTC
    You can use Devel::DProf or Devel::Profiler with web apps. Do a quick search for Devel::DProf with CGI or mod_perl and come back if you don't find something clear enough.
Re: Profiling/Benchmarking web applications
by diotalevi (Canon) on Aug 24, 2004 at 19:42 UTC
    I know of nothing ready-made. Perhaps you could call your code as a CGI and have a wrapper script invoke your perl script under the profiler, then shuffle the tmon.out data out somewhere convenient.
Re: Profiling/Benchmarking web applications
by kappa (Chaplain) on Aug 25, 2004 at 17:19 UTC

    I routinely use this little script to emulate webapps running under Apache::Registry. It evals another perl source file as a subroutine and then calls it many times. This is exactly what Apache::Registry does to all scripts for which it is configured to be a PerlHandler.

    #! /usr/bin/perl -w
    use strict;
    use File::Slurp;
    use lib '/site/perl';

    @ARGV >= 2 or die "Args: <perl program> <rep count> ...\n";

    # Wrap the target script in a subroutine, just as Apache::Registry does.
    my $victim = read_file(shift @ARGV);
    eval " sub victim { " . $victim . " } ";

    # Silence the script's output so only the profile matters.
    open STDOUT, ">", "/dev/null";
    open STDERR, ">", "/dev/null";

    victim() for 1 .. shift @ARGV;

    This approach works best with CGI.pm-enabled scripts, so I can pass query params on the command line. I use the above script (called profile.pl) this way:

    perl -d:Profile tools/profile.pl mail.cgi 500 mode=mailbox mbox=INBOX page=4 sort=vSUBJ

    So I run 500 iterations of one of my most intensive code paths under Devel::Profile (alas, Devel::DProf breaks Unicode::String in a very weird way). Try it.

    Btw, I use HTML::Template and it needs lots of CPU time. I was very surprised to find my most annoying bottleneck in output page generation.
