http://qs321.pair.com?node_id=101786

We're currently working on an application that is having some performance issues. Some of these issues are significant enough that they're killing the application and simply analyzing the error, access, and event logs has let us know where to turn. Now that we've gotten many of the serious issues out of the way, we still need to improve its performance, but since the application is no longer crashing, no one knew where to look in the logs.

Meetings were getting ugly. Everyone would have some sort of theory as to what was slowing things down.

"It's foo that's killing performance."
"No, it's bar I tell you.

I decided to avoid the arguments and simply start gathering evidence for whatever was going on. I did a little research and discovered Devel::Dprof. I ran it against a program that analyzes Web pages and here's the output I (after running it through dprofpp and asking for the running time of the 15 longest running subs).

Total Elapsed Time = -0.00541 Seconds User Time = 0.299186 Seconds Exclusive Times %Time ExclSec CumulS #Calls sec/call Csec/c Name 26.7 0.080 0.169 5 0.0160 0.0338 main::BEGIN 13.3 0.040 0.079 6 0.0067 0.0132 IO::File::BEGIN 10.0 0.030 0.040 3 0.0100 0.0133 HTML::TokeParser::BEG +IN 6.68 0.020 0.030 7 0.0029 0.0043 vars::BEGIN 6.68 0.020 0.030 4 0.0050 0.0074 IO::Seekable::BEGIN 6.68 0.020 0.050 3 0.0067 0.0166 CGI::BEGIN 3.34 0.010 0.010 1 0.0100 0.0100 Exporter::export 3.34 0.010 0.010 1 0.0100 0.0100 warnings::BEGIN 3.34 0.010 0.010 3 0.0033 0.0033 File::Spec::Win32::BE +GIN 3.34 0.010 0.010 6 0.0017 0.0016 IO::Handle::BEGIN 0.00 0.000 0.020 2 0.0000 0.0100 CGI::Util::BEGIN 0.00 0.000 0.010 20 0.0000 0.0005 Exporter::import 0.00 0.000 -0.000 2 0.0000 - warnings::register::i +mport 0.00 0.000 -0.000 4 0.0000 - warnings::register::m +kMask 0.00 0.000 -0.000 1 0.0000 - Exporter::heavy_expor +t

After reviewing results, playing with the code, and rerunning Devel::Dprof several times, I discovered chunks of old debugging code and plenty of code that could be subtly tweaked for a performance boost. In less than a quarter hour I had the following:

Total Elapsed Time = -0.00433 Seconds User Time = 0.191344 Seconds Exclusive Times %Time ExclSec CumulS #Calls sec/call Csec/c Name 26.1 0.050 0.060 3 0.0167 0.0199 HTML::TokeParser::BEG +IN 15.6 0.030 0.079 6 0.0050 0.0132 IO::File::BEGIN 10.4 0.020 0.030 4 0.0050 0.0074 IO::Seekable::BEGIN 5.23 0.010 0.010 16 0.0006 0.0006 Exporter::import 5.23 0.010 0.010 1 0.0100 0.0100 IO::bootstrap 5.23 0.010 0.010 28 0.0004 0.0004 main::update_element_ +hash 5.23 0.010 0.010 6 0.0017 0.0017 vars::BEGIN 5.23 0.010 0.009 74 0.0001 0.0001 HTML::TokeParser::get +_tag 5.23 0.010 0.010 11 0.0009 0.0009 main::untainting_rege +xes 5.23 0.010 0.020 3 0.0033 0.0066 File::Spec::Win32::BE +GIN 5.23 0.010 0.070 1 0.0100 0.0697 main::BEGIN 0.00 0.000 -0.000 1 0.0000 - warnings::BEGIN 0.00 0.000 -0.000 1 0.0000 - warnings::register::i +mport 0.00 0.000 -0.000 2 0.0000 - warnings::register::m +kMask 0.00 0.000 -0.000 2 0.0000 - HTML::Parser::BEGIN

This wasn't any of our production code (that's next), but according to elapsed User Time, performance was boosted over 30% -- I ran the before and after results several times to verify the gains. Interestingly, this was code that I had written. I knew it well and I knew that there was little, if anything, that I could optimize.

Boy, was I wrong.

Check out Devel::Dprof next time you have performance issues. It's well worth learning.

Cheers,
Ovid

Vote for paco!

Join the Perlmonks Setiathome Group or just click on the the link and check out our stats.