bliako has asked for the wisdom of the Perl Monks concerning the following question:
Hello lockdown ones,
Every time I declare a variable I believe a flop dies in my CPU. Probably because I fell into the C cauldron in my formative years. But now, and in Perl, the paradigm is to "lexicalise" variables within the innermost scope. For example, for:
my $x;
for $x (1,2,3){ print "x=$x\n" }
perlcritic a.pl says Loop iterator is not lexical at line 6, column 1. See page 108 of PBP. (Severity: 5)
But this pacifies perlcritic:
for my $x (1,2,3){ print "x=$x\n" }
But it wakes in me primordial fears: will that variable be created 3 times and decrease performance? (I accept the lexical form for the sake of readability and, perhaps, stability and not introducing subtle bugs.) Does anyone know the difference in performance between the two scripts? Even if it is tiny!
bw, bliako
Re: declaring lexical variables in shortest scope: performance?
by tobyink (Canon) on Mar 31, 2020 at 10:54 UTC
As well as what others have said, remember that Perl does have optimizations in place for common idioms. For example
my @things = ( ... ); # some list, whatever
@things = sort @things;
You may think that sort gets passed @things, then returns the sorted list of things, and that gets assigned back to the @things array as a list assignment. But you'd be wrong. Perl notices that you're sorting an array and assigning it back to itself, and uses an optimized code path that doesn't involve having to build a new list and do list assignment; it does an in-place sort.
Common idioms do get optimized for when possible, so there are benefits to sticking with them.
With for my $var (...) {...}, Perl knows that $var won't be leaking outside the body of the loop, so can at least potentially optimize based on that.
Re: declaring lexical variables in shortest scope: performance?
by haukex (Archbishop) on Mar 31, 2020 at 10:38 UTC
At least on my 5.28, the difference is negligible:
use warnings;
use strict;
use Benchmark 'cmpthese';
cmpthese(-2, {
    predecl => sub {
        my $y;
        my $x;
        for $x (1,2,3) { $y+=$x }
    },
    lexical => sub {
        my $y;
        for my $x (1,2,3) { $y+=$x }
    },
});
__END__
             Rate predecl lexical
predecl 9175035/s      --     -1%
lexical 9275893/s      1%      --
"But wakes in me primordial fears"
Yes, I know the feeling well. But my philosophy has become: first, code so that it works, avoiding only the really obvious performance mistakes (like scanning an array instead of using a hash and the like). Then, if it's fast enough for your purposes, you're done. But if you want to optimize, remember that optimization is a science: measure the performance, identify the hotspots, benchmark the alternatives, modify the code accordingly, measure the difference in performance, and repeat until the performance becomes good enough for your purposes.
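To make the "obvious mistake" above concrete, here is a minimal sketch that benchmarks a linear array scan against a hash lookup. The data set, the sub names (scan_array, lookup_hash) and the key being searched for are all made up for illustration; the numbers will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark 'cmpthese';

# Hypothetical data: 1000 keys, and we look up one near the end.
my @items   = map { "item$_" } 1 .. 1000;
my %is_item = map { $_ => 1 } @items;
my $wanted  = 'item900';

# Linear scan: stops at the first match, but the cost grows with the list.
sub scan_array {
    for my $item (@items) {
        return 1 if $item eq $wanted;
    }
    return 0;
}

# Hash lookup: cost is roughly independent of the number of keys.
sub lookup_hash {
    return exists $is_item{$wanted} ? 1 : 0;
}

# Make sure both approaches agree before timing them.
scan_array() == lookup_hash() or die 'Error in implementation';

cmpthese(-1, {
    scan => \&scan_array,
    hash => \&lookup_hash,
});
```

The correctness check before cmpthese follows the "first make it work, then measure" order described above.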
Thanks for the comparison script. Indeed, the lexical version is a bit faster (probably because of the internal optimisations tobyink mentioned), and choroba confirms it. But when I declare a lexical variable inside the loop, i.e. for my $x (1,2,3) {my $z=12; $y+=$x }, in predecl, it's 50% slower :(
I'll keep in mind what you said: optimisation is a science.
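A sketch of how that last observation could be pulled apart, assuming the slowdown comes from two things at once: the extra assignment on every iteration and the declaration itself. The sub names (my_inside, assign_only, baseline) are invented for this example, and no particular result is claimed.

```perl
use strict;
use warnings;
use Benchmark 'cmpthese';

# $z declared and assigned inside the loop body, as in the variant above.
sub my_inside {
    my $y = 0;
    for my $x (1, 2, 3) { my $z = 12; $y += $x }
    return $y;
}

# $z declared once outside the loop, but still assigned on every iteration.
sub assign_only {
    my $y = 0;
    my $z;
    for my $x (1, 2, 3) { $z = 12; $y += $x }
    return $y;
}

# No $z at all: the baseline from the original benchmark.
sub baseline {
    my $y = 0;
    for my $x (1, 2, 3) { $y += $x }
    return $y;
}

# All three must compute the same sum before the timings mean anything.
my_inside() == 6 && assign_only() == 6 && baseline() == 6
    or die 'Error in implementation';

cmpthese(-1, {
    my_inside   => \&my_inside,
    assign_only => \&assign_only,
    baseline    => \&baseline,
});
```

Comparing my_inside with assign_only, rather than with baseline, is what isolates the cost of the declaration from the cost of the assignment.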
Maybe it is negligible, but that is a Benchmark pitfall: the sub-call overhead drowns out what you're measuring. Inline the sub contents as code strings instead of using sub calls.
Re: declaring lexical variables in shortest scope: performance?
by choroba (Cardinal) on Mar 31, 2020 at 10:40 UTC
As usual, when you are interested in performance, benchmark or profile.
#! /usr/bin/perl
use warnings;
use strict;
use Benchmark qw{ cmpthese };
sub outer {
    my $s = 0;
    my $x;
    for $x (1 .. 5) {
        $s += $x;
    }
    $s
}
sub inner {
    my $s = 0;
    for my $x (1 .. 5) {
        $s += $x;
    }
    $s
}
outer() == inner() or die 'Error in implementation';
cmpthese(-2, {
    outer => 'outer()',
    inner => 'inner()',
});
The result on my machine shows inner is about 6% faster. Such a small difference is insignificant and usually has no impact on real-world performance.
Why is that? Remember that for localises its variable when it's not lexical, i.e. it has to store its previous value before entering the loop and restore it at the loop's end. It seems to take a bit more time than just creating a fresh new lexical variable.
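A minimal sketch of that save-and-restore behaviour, using a made-up starting value for $x. It only shows the semantics, not the cost:

```perl
use strict;
use warnings;

my $x = 'before';    # pre-declared variable with an arbitrary value
my @seen;

# No "my" here: the loop reuses the existing $x, but only for its duration.
for $x (1, 2, 3) {
    push @seen, $x;
}

print "seen: @seen\n";           # prints "seen: 1 2 3"
print "after the loop: $x\n";    # prints "after the loop: before"
```

So even with a pre-declared variable, the value seen inside the loop does not survive it; the old value has to be put back when the loop ends.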
map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Benchmark pitfall: the sub-call overhead drowns out what you're measuring. Inline the sub contents as code strings instead of using sub calls.
use Benchmark 'cmpthese';
cmpthese(-2, {
    predecl => '
        my $y;
        my $x;
        for $x (1..10000) { $y+=$x }
    ',
    lexical => '
        my $y;
        for my $x (1..10000) { $y+=$x }
    ',
});
__END__
          Rate lexical predecl
lexical 3996/s      --     -0%
predecl 4001/s      0%      --
vs
use Benchmark 'cmpthese';
cmpthese(-2, {
    predecl => sub {
        my $y;
        my $x;
        for $x (1..10000) { $y+=$x }
    },
    lexical => sub {
        my $y;
        for my $x (1..10000) { $y+=$x }
    },
});
          Rate predecl lexical
predecl 4011/s      --     -0%
lexical 4015/s      0%      --
Which is more or less what haukex and choroba demonstrated.
Re: declaring lexical variables in shortest scope: performance?
by LanX (Saint) on Mar 31, 2020 at 11:38 UTC
In general:
- Declaration happens at compile time, so its cost should be negligible.
- Assignment should be the same. (Or even faster°)
- The only overhead could be destruction or resetting at end of scope.
Your example with a for loop is a bit unfortunate, because aliasing is complicating things considerably.
Together with various optimizations, performance gains or losses are likely to be unpredictable, especially between different versions of Perl.
So benchmark it, though I don't think it's worth the effort.
°) BTW, for many years it was common knowledge that private variables are faster; I'm not sure whether access to package variables has been optimized in the meantime. ...
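For readers unfamiliar with the aliasing mentioned above, a short sketch with an arbitrary three-element array: the loop variable is not a copy but another name for each element in turn.

```perl
use strict;
use warnings;

my @values = (1, 2, 3);

for my $v (@values) {
    $v *= 10;    # changes the array element itself, not a copy
}

print "@values\n";    # prints "10 20 30"
```

This is why a foreach loop is not simply "declare a variable and assign to it N times", and why timing it says little about ordinary declarations.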
Re: declaring lexical variables in shortest scope: performance?
by GrandFather (Saint) on Mar 31, 2020 at 20:51 UTC
In C/C++ the equivalent outer/inner code would perform identically. The compiler allocates space for all the local variables that might be needed on the stack on entry to the sub, usually by effectively incrementing the stack pointer. The "overhead" to create space for the local variables is typically the execution time of one processor instruction on each entry to the sub.
Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re: declaring lexical variables in shortest scope: performance? (on Code Optimization and Performance References)
by eyepopslikeamosquito (Archbishop) on Apr 01, 2020 at 07:56 UTC
Don't diddle code to make it faster -- find a better algorithm
-- The Elements of Programming Style
Don’t Optimize Code -- Benchmark It
-- from Ten Essential Development Practices by Damian Conway
It's important to be realistic: most people don't care about program performance most of the time
-- The Computer Language Benchmarks Game
The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times;
premature optimization is the root of all evil (or at least most of it) in programming
-- Donald Knuth
Don’t pessimize prematurely.
All other things being equal, notably code complexity and readability, certain efficient design patterns and coding idioms should just flow naturally
from your fingertips and are no harder to write than the pessimized alternatives. This is not premature optimization; it is avoiding gratuitous pessimization.
-- Andrei Alexandrescu and Herb Sutter
Rule 1: Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you have proven that's where the bottleneck is.
Rule 2: Measure. Don't tune for speed until you've measured, and even then don't unless one part of the code overwhelms the rest.
Rule 3: Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know that n is frequently going to be big, don't get fancy. (Even if n does get big, use Rule 2 first.)
Rule 4: Fancy algorithms are buggier than simple ones, and they're much harder to implement. Use simple algorithms as well as simple data structures.
Rule 5: Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.
Note: Pike's rules 1 and 2 restate Tony Hoare's "Premature optimization is the root of all evil". Ken Thompson rephrased Pike's rules 3 and 4 as "When in doubt, use brute force". Rules 3 and 4 are instances of KISS. Rule 5 was stated by Fred Brooks in The Mythical Man-Month and is often shortened to "write stupid code that uses smart objects" (see also data structures vs code).
-- Rob Pike
Without good design, good algorithms, and complete understanding of the
program's operation, your carefully optimized code will amount to one of
mankind's least fruitful creations -- a fast slow program.
-- Michael Abrash
A couple of related general guidelines from On Coding Standards and Code Reviews:
- Correctness, simplicity and clarity come first. Avoid unnecessary cleverness. If you must rely on cleverness, encapsulate and comment it.
- Don't optimize prematurely. Benchmark before you optimize. Comment why you are optimizing.
On Interfaces and APIs cautions that library interfaces are very difficult to change once they become widely used - a fundamentally
inefficient interface cannot be easily fixed later by optimizing.
So it is not "premature optimization" to consider efficiency when designing public library interfaces.
See Also
These experiences convinced me of "don't assume, measure" and especially "find a better algorithm"!
High Performance and Parallel Computing References
- VTune - Intel VTune, part of the Intel oneAPI Base Toolkit
- Intel oneAPI - a unified API used across different computing accelerator (coprocessor) architectures, including GPUs, AI accelerators and field-programmable gate arrays
Multi-threading:
- Re: Rosetta Code: Long List is Long - JudySL code by marioroy (2023) - JudySL array implementation
- Solving the Long List is Long challenge, finally? by marioroy (2023) - Perl versions in replies use Sort::Packed, Tie::Hash::DBD, DB_File, IPC::MMA, Crypt::xxHash, Tokyo Cabinet, Kyoto Cabinet, Tkrzw sharding, Tkrzw llil4tkh, Tkrzw llil4tkh2
PDL and Array Processing References
See: Re^2: Organizational Culture (Part II): Meta Process (BioPerl/PDL/AI/Embedded/Data Science References)
Updated: Added Donald Knuth premature optimization quote and Alexandrescu/Sutter premature pessimization quotes and Rob Pike quotes. Mentioned efficiency of interfaces. Added more references.
Re: declaring lexical variables in shortest scope: performance?
by vr (Curate) on Mar 31, 2020 at 17:53 UTC
Isn't it the case that a lexical loop iterator doesn't create a "lexical pad"? If a block, which creates a scope, can be written so that it doesn't have to create a "pad", then it's surely faster, no?
use strict;
use warnings;
use Benchmark 'cmpthese';
cmpthese -2, {
    1 => sub {
        my ( $s, $x, $y ) = 0;
        # for ( 1 .. 1e6 ) { # doesn't matter, if written so
        for my $i ( 1 .. 1e6 ) {
            $x = rand; # suppose intermediate
            $y = rand; # variables are required
            $s += $x + $y;
        }
        $s
    },
    2 => sub {
        my ( $s, $i ) = 0;
        # for $i ( 1 .. 1e6 ) { # doesn't matter, neither
        for ( 1 .. 1e6 ) {
            my $x = rand;
            my $y = rand;
            $s += $x + $y;
        }
        $s
    },
};
__END__
      Rate    2    1
2 6.69/s   -- -42%
1 11.6/s  74%   --
Re: declaring lexical variables in shortest scope: performance?
by dsheroh (Monsignor) on Apr 01, 2020 at 07:46 UTC
Perl is not, never has been, and almost certainly never will be a "high performance" language. If you're really that concerned about saving every flop, your first move should not be to question where to put my, your first move should be to switch to a different language.
As long as you avoid the big mistakes like using inefficient algorithms, optimizing for correct operation and readability is generally going to be more than sufficient. Micro-optimizing things like the placement of variable declarations is almost never worth the effort, regardless of language, because the time you spend doing the optimization will generally be many, many orders of magnitude larger than the handful of microseconds that will actually be saved by the code speedup.