in reply to No garbage collection for my-variables
Maybe it's time for the fabled use less to allow this memory-for-speed optimisation to be disabled?
That said, most of the routines for which this could become a significant problem, things like your examples of encode and decode that take a string and return it modified in some way, ought to be written to use the pass-by-reference aliasing effects of @_ anyway. That would make this 'problem' go away.
Of course, an orthodoxy has grown up around this place that pass-by-reference and side-effects are somehow bad karma, that directly accessing @_ is premature optimisation, and that modifying your arguments is bad because it is action at a distance that can surprise the caller.
But, as long as subroutines are documented as modifying their argument(s), it really does make the most sense in many cases. The caller knows what subsequent use it will make of the arguments it passes you, and if it needs them to be preserved, it can make copies as and when it needs to. That makes more sense than every subroutine copying every parameter, every time, 'just in case'.
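As a minimal sketch of the idea (trim_in_place is a hypothetical helper invented for illustration, not taken from any module), a routine can write through the alias in $_[0], and a caller that needs the original makes its own copy at the call site:

```perl
use strict;
use warnings;

# Hypothetical helper: trims leading/trailing whitespace in place by
# writing through the alias $_[0] instead of copying the argument.
sub trim_in_place {
    $_[0] =~ s/^\s+|\s+$//g;
    return;    # nothing returned; the caller's variable was changed
}

my $line = "   some text \n";
trim_in_place($line);
print "[$line]\n";

# A caller that needs the original copies first, at the call site.
my $orig = "  keep me  ";
trim_in_place( my $copy = $orig );
print "[$orig] [$copy]\n";
```

Only the caller that wants a copy pays for one; every other call modifies the existing string.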
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: No garbage collection for my-variables
by kyle (Abbot) on Sep 16, 2008 at 19:11 UTC
In addition to moritz's excellent point that a function that modifies its arguments then could not be called with a literal, I'd also point out that a lot of Perl programmers probably don't know that @_ is full of aliases. I'd been programming in Perl off and on for over ten years before I came to the Monastery and learned that @_ is aliases. I've asked about this feature in interviews I've conducted, and the prospects out there have always been surprised at this feature. Documentation helps, of course, but someone who doesn't know this is possible could spend an awful lot of time debugging before discovering this (as you say) action at a distance.
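A small sketch of both points, assuming a hypothetical sub named shout: the caller's variable changes through the alias, and a call with a literal is trapped here with eval because, on the perls I would expect most people to run, $_[0] is then an alias to a read-only constant (some older threaded builds reportedly copied the literal instead, so treat the failure as likely rather than guaranteed):

```perl
use strict;
use warnings;

# Hypothetical sub that modifies its first argument through @_.
sub shout { $_[0] = uc $_[0]; return }

my $name = 'kyle';
shout($name);
print "$name\n";    # the caller's variable has been upper-cased

# With a literal, $_[0] aliases a constant; the assignment usually dies.
my $ok = eval { shout('literal'); 1 };
print "literal call failed: $@" unless $ok;
```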
Thumbs up on the use less, however.
|
Done right, you can have both (see Re^3: No garbage collection for my-variables). That way, the unaware are not caught out, but when the facility is needed it is available.
It's the same mechanism that sort uses for in-place sorting in 5.10. I've thought about patching List::Util::shuffle() in the same way.
|
>perl580\bin\perl -MO=Concise -e"@a = sort @a" 2>&1 | find "sort"
7 <@> sort lK ->8
>perl588\bin\perl -MO=Concise -e"@a = sort @a" 2>&1 | find "sort"
7 <@> sort lK/INPLACE ->8
>perl5100\bin\perl -MO=Concise -e"@a = sort @a" 2>&1 | find "sort"
7 <@> sort lK/INPLACE ->8
I don't have 5.8.1 to 5.8.7, so let's consult the perldeltas.
perl584delta:
In place sort optimised (eg @a = sort @a)
But it was buggy in 5.8.4. perl585delta:
The in-place sort optimisation introduced in 5.8.4 had a bug. For example, in code such as @a = sort ($b, @a), the result would omit the value $b. This is now fixed.
|
% perl5.10.0 -lwe '@a = (2,1); sort @a; print @a'
Useless use of sort in void context at -e line 1.
21
As I understand it, perl (5.10) will detect that in @a = sort @a, the destination array is the same as the source array, so it uses a more efficient algorithm (but it's still in list context).
Re^2: No garbage collection for my-variables
by moritz (Cardinal) on Sep 16, 2008 at 18:55 UTC
There's a much more perlish reason not to modify the arguments of a sub by default. If you don't, you can write stuff like this:
other_function(decode('latin-1', 'string_literal'));
# and if you want to change a variable
$var = decode('latin-1', $var);
On the other hand, if you do change the arguments of the sub, the first one requires another variable, which is a real kludge (visually, at least):
do {
    my $var = 'string_literal';
    decode('latin-1', $var);
    other_function($var);
};
# and the other one
decode('latin-1', $var);
|
I think that you've overplayed the case. Using a do block instead of an anonymous block makes it look more complicated than it is.
Even wrapping a local var in a bare block is rarely necessary. Most code is nested at some level in an if or while or other loop block or subroutine body.
On the rare occasions that it is at the top level of a program or module, if you really want it to be garbage collected, undef is better (in that it will actually achieve something) anyway.
Even the use of a constant is emphasising the rare case. Mostly, data is read in from external sources and is in a variable already, so:
while( my $var = <$fh> ) {
mutate( $var );
use( $var );
}
is hardly onerous, but even that can be avoided. Thanks to perl's context sensitivity, you can have the best of both worlds. For the simple case, subroutines behave as pass-through pass-by-value, but when the need arises to minimise memory allocation and copying, calling them in a void context does the right thing:
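#! perl -slw
use strict;
sub mutates {
    # Non-void context: work on a copy so the caller's argument is untouched
    # (a bare \shift would still reference the caller's variable).
    # Void context: work directly on the alias in @_.
    my $ref = defined wantarray ? \( my $copy = shift ) : \$_[ 0 ];
    $$ref =~ s[(?<=\b[^ ])([^ ]+)(?=[^ ]\b)][scalar reverse $1]ge;
    return $$ref if defined wantarray;
    return;
}
sub doSomething {
    print shift;
}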
doSomething( mutates( 'antidisestablishmentarismania' ) );
my $var = 'The quick brown fox jumps over the lazy dog';
mutates( $var );
doSomething( $var );
__END__
c:\test>junk
ainamsiratnemhsilbatsesiditna
The qciuk bworn fox jpmus oevr the lzay dog
|
sub do_stuff {
...
do_other_stuff($variable);
# remove that debugging statement, and do_other_stuff
# will behave very differently if do_stuff is not
# called in void context
print "still here\n";
}
Admittedly that's a fairly artificial situation and won't show up in real code very often, but if it does it's very nasty to debug.
Designing interfaces around performance optimizations and memory management oddities just doesn't seem right to me.
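To make the pitfall concrete, here is a small sketch (all sub names are hypothetical stand-ins for the ones above). The last statement of a sub is evaluated in the sub's own calling context, so whether a trailing statement exists changes what wantarray reports inside the inner call:

```perl
use strict;
use warnings;

my @seen;    # records the context each inner call observed

# Hypothetical stand-in for do_other_stuff: notes its calling context.
sub do_other_stuff {
    push @seen, !defined(wantarray) ? 'void' : wantarray ? 'list' : 'scalar';
    return;
}

# The call is the last statement, so it inherits the caller's context.
sub do_stuff_bare { do_other_stuff( $_[0] ) }

# A trailing statement puts the same call into void context.
sub do_stuff_debug { do_other_stuff( $_[0] ); print "still here\n" }

my $var = 'x';
my $r1  = do_stuff_bare($var);     # inner call sees scalar context
my $r2  = do_stuff_debug($var);    # inner call sees void context
print "@seen\n";
```

With a context-sensitive do_other_stuff, the first form would return a modified copy and the second would modify $var in place, even though the call itself is written identically.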
|
Agreed a thousand times over. If I had a penny for every time I'd been forced to write tedious and ugly code because chomp modifies its argument instead of returning the chomped version, I'd have several pennies.
|
Are
chomp( my $var = <$fh> );
and
chomp( my $dst = $src );
really more tedious and uglier than
my $var = chomp( scalar ( <$fh> ) );
and
my $dst = chomp( $src );
?
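For reference, a short sketch of what the built-in does today, plus a non-mutating wrapper for those who prefer the second style. chomped is a hypothetical helper name, not a core function:

```perl
use strict;
use warnings;

# The built-in chomp returns the number of characters removed,
# not the chomped string, and modifies its argument in place.
my $src   = "line\n";
my $count = chomp( my $dst = $src );
print "$count [$dst]\n";

# Hypothetical non-mutating helper for callers who want a return value.
sub chomped {
    my ($s) = @_;    # copy, so the caller's value is untouched
    chomp $s;
    return $s;
}

print '[', chomped("from a literal\n"), "]\n";
```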
Re^2: No garbage collection for my-variables
by betterworld (Curate) on Sep 16, 2008 at 18:51 UTC
Maybe it's time for the fabled use less
Good point. Maybe there just isn't a way for perl to detect how a particular variable could be optimized, but it would be possible if the user could decide.
things like your examples of encode and decode that take a string and return it modified in some way, ought to be written to use the pass-by-reference aliasing effects of @_ anyway.
Unfortunately I don't think it's realistic to demand that all modules be written this way. In the case of Encode, I'd rather use the module than my own memory-conserving code; and it's not convenient to change the module's source code. (I would probably even have to change it if "use less" worked, because it's lexically scoped afaik.)
(However, I could encode the text line by line as Joost suggested.)