Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation
 
PerlMonks  

(Aighearach) Re: Efficient Perl Programming

by Aighearach (Initiate)
on Nov 10, 2000 at 18:14 UTC ( [id://40941]=note: print w/replies, xml ) Need Help??


in reply to Efficient Perl Programming

What I would advise if you want to get your program running faster, is to use Devel::OpProf. This ties right into the perlfaq 3 question, "How do I make my Perl program run faster?" The advice given is to improve your algorithm. Okay, but how to do that? One of the best ways to compare different ways of doing things is to look at the resources they take up. Benchmark.pm will tell you how fast a code fragment runs, but that information isn't very portable. On a machine with lots of RAM, you might be developing programs that are very fast, but will generally lag on other people's machines. Devel::OpProf lets you identify these bottlenecks, even when they aren't slowing you down.

As an example, lets consider that we want to make a list of 1000 elements, with each element set to 1.

#!/usr/bin/perl -w use strict; use Devel::OpProf qw( profile print_stats zero_stats ); use Benchmark qw( timethese ); profile(1); print "*** one() ***\n"; my @one = one(); print_stats(); zero_stats(); print "\n*** two() ***\n"; my @two = two(); print_stats(); zero_stats(); print "\n*** three() ***\n"; my @three = three(); print_stats(); zero_stats(); print "\n*** four() ***\n"; my @four = four(); print_stats(); zero_stats(); timethese( 0, { test_one => 'one()', test_two => 'two()', test_three => 'three()', test_four => 'four()' } ); sub one { my @list; for ( my $i = 0; $i < 1000; $i++ ) { $list[$i] = 1; } return @list; } sub two { my @list; for ( 0..999 ) { $list[$_] = 1; } return @list; } sub three { my @list = map { 1 } (1..1000); return @list; } sub four { my @list; @list[0..999] = (1) x 1000; return @list; }
On my machine, this outputs:
*** one() ***
private variable         3002
constant item            2003
next statement           1009
private array            1004
numeric lt (<)           1001
logical and (&&)         1001
scalar assignment        1001
preincrement (++)        1000
iteration finalizer      1000
array element            1000
pushmark                 8
glob value               4
subroutine entry         3
conditional expression   1
list assignment          1
block entry              1
array dereference        1
loop entry               1
loop exit                1
return                   1
print                    1

*** two() ***
next statement           2009
private array            1004
constant item            1003
foreach loop iterator    1001
logical and (&&)         1001
array element            1000
scalar variable          1000
iteration finalizer      1000
scalar assignment        1000
pushmark                 9
glob value               5
subroutine entry         3
block entry              1
list assignment          1
conditional expression   1
array dereference        1
foreach loop entry       1
loop exit                1
return                   1
print                    1

*** three() ***
null operation           1002
constant item            1002
map iterator             1000
block                    1000
pushmark                 11
next statement           8
private array            4
glob value               4
subroutine entry         3
array dereference        2
list assignment          2
map                      1
block entry              1
conditional expression   1
return                   1
print                    1

*** four() ***
pushmark                 12
next statement           9
private array            5
glob value               4
constant item            4
subroutine entry         3
array dereference        2
list assignment          2
repeat (x)               1
conditional expression   1
null operation           1
array slice              1
block entry              1
return                   1
print                    1
Benchmark: running test_four, test_one, test_three, test_two, each for at least 3 CPU seconds...
  test_one:  3 wallclock secs ( 3.16 usr +  0.00 sys =  3.16 CPU) @ 274.68/s (n=868)
  test_two:  3 wallclock secs ( 3.25 usr +  0.00 sys =  3.25 CPU) @ 359.08/s (n=1167)
test_three:  3 wallclock secs ( 3.01 usr +  0.00 sys =  3.01 CPU) @ 408.64/s (n=1230)
 test_four:  3 wallclock secs ( 3.27 usr +  0.01 sys =  3.28 CPU) @ 501.83/s (n=1646)

So, while Benchmark could tell us which is faster, it doesn't really tell us why. Devel::OpProf shows us that it is because of the number of temporary variables, &&'s, assignments, etc.

(while not really relevant to the point of this post, I should point out that probably a better idea than the slice in #4 is to just my @list = (1) x 1000;, because it is more readable. But, you are welcome to profile them to see if there is an algorithmical difference... ;)

Paris Sinclair    |    4a75737420416e6f74686572
pariss@efn.org    |    205065726c204861636b6572
http://sinclairinternetwork.com

Replies are listed 'Best First'.
RE: (Aighearach) Re: Efficient Perl Programming
by Fastolfe (Vicar) on Nov 10, 2000 at 19:05 UTC
    I was a little more interested in finding documentation (perldoc-style if nothing else), or a niche waiting to be filled by some useful documentation, on general optimization tips (e.g. use !/\S/ instead of /^\s*$/ and why). This information might still be useful to others though. Thanks.
      Here's one rule you might like: It is almost always a mistake to use the non-greedy regex quantifiers like *? and +?.

      Typical code looks like this:

      $churchill = qq{"If I were your husband," he said, "I should drink + it."}; while ($churchill =~ /"(.*?)"/g) { print $1, "\n"; }

      The purpose of the (.*?) is to capture a quoted part of the string in $churchill. Beginners usually try (.*) which doesn't work, because it captures everything from the first quotation mark to the last, including the non-quoted part in between. So then they ask how to fix this and are advised to use (.*?) instead. This does work, but it's much slower than it needs to be. A faster solution is:

      while ($churchill =~ /"([^"]*)"/g) { print $1, "\n"; }

      This says that what you're interested in is a quote character, followed by a sequence of characters that are not quotes ([^"]) followed by another quote. The description of what you want is more exact, and it enables Perl to do the matching more efficiently.

      So a good rule of thumb is to avoid .*? wherever possible and to use something like [^"]* instead when you can.

        I'm going to ignore the attitude for the moment. Perhaps you misunderstand what I was requesting.

        How do I work with Perl references? is to perldoc perlref as What are some approaches I can take while writing code to ensure it is efficient? is to _______________

        Lots of others have pointed out some very good books and other various online resources, which is basically what I was looking for. Yes, I see that your Devel::OpProf module can be useful in post-development optimization, but I was looking for something more along the lines of education. I apologize if that was unclear, but I don't appreciate the smart-alec attitude.

        A reply falls below the community's threshold of quality. You may see it by logging in.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://40941]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others romping around the Monastery: (4)
As of 2024-04-18 20:07 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found