Re: references best practice

The first form returns a list, each item of which is copied. The other two forms copy a single scalar, a reference to an array. I would expect these two to have similar performance.

Suspicion, however, is no substitute for benchmarking, so I ran the following script.

use warnings;
use strict;

use Benchmark qw(cmpthese);

sub prepData1 {
    my @data = ( map { log $_ } 1 .. 100 );
    return @data;
}

sub runPrep1 {
    my @result = prepData1();
}

sub prepData2 {
    my $refData = shift;
    $refData = [ map { log $_ } 1 .. 100 ];
}

sub runPrep2 {
    my @result;
    prepData2( \@result );
}

sub prepData3 {
    my @data = ( map { log $_ } 1 .. 100 );
    return \@data;
}

sub runPrep3 {
    my $result = prepData3();
}

sub prepData4 {
    my $data = [ map { log $_ } 1 .. 100 ];
    return $data;
}

sub runPrep4 {
    my $result = prepData4();
}

cmpthese(
    -5, # Run each function for at least 5 seconds
    {
        array_out      => \&runPrep1,
        array_ref_in   => \&runPrep2,
        array_ref_out1 => \&runPrep3,
        array_ref_out2 => \&runPrep4,
    }
);
__END__
                  Rate      array_out array_ref_out1   array_ref_in array_ref_out2
array_out      11270/s             --           -30%           -43%           -44%
array_ref_out1 16137/s            43%             --           -19%           -20%
array_ref_in   19873/s            76%            23%             --            -1%
array_ref_out2 20101/s            78%            25%             1%             --

I added a fourth style, since it is the form I prefer: the data is put directly into an array reference. Once again benchmarking proves me wrong ;-) That's why I always benchmark rather than guess.

Update 1: I prefer my idiom because I create a single container of related values, but I can't pass the container itself back, since a function returns a flat list of values. To retain the relationship between them I use an array reference, so that the caller gets back a group of related values. If, for some reason, I need to extend the function to return a second group of data, I can pass back two references, thereby maintaining the logical grouping of the data.
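
The grouping argument can be sketched as follows. This is only an illustration: prepBoth and its two data sets are hypothetical and not part of the benchmark above.

```perl
use strict;
use warnings;

# Hypothetical function returning two related groups of values.
# Each group travels as one array reference, so the caller can tell
# where the first group ends and the second begins.
sub prepBoth {
    my $logs    = [ map { log $_ } 1 .. 5 ];
    my $squares = [ map { $_ * $_ } 1 .. 5 ];
    return ( $logs, $squares );
}

my ( $logs, $squares ) = prepBoth();
printf "%d logs, %d squares\n", scalar @$logs, scalar @$squares;
```

Had prepBoth returned the two arrays as plain lists, the caller would have received one flat list of ten values with no boundary between the groups.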

Update 2: As Haarg kindly pointed out, I got the second case wrong: I was creating a new anonymous array and assigning its reference to my local copy of the array ref that was passed in, so the caller's array was never filled. Updating the code and rerunning the benchmark, I get:

sub prepData2 {
    my $refData = shift;
    @$refData = ( map { log $_ } 1 .. 100 );
}
__END__
                  Rate      array_out array_ref_out1   array_ref_in array_ref_out2
array_out      11801/s             --           -28%           -28%           -39%
array_ref_out1 16406/s            39%             --            -1%           -15%
array_ref_in   16496/s            40%             1%             --           -14%
array_ref_out2 19238/s            63%            17%            17%             --

Looks like my original guess was correct, which just shows that if benchmarking gives an unexpected result you should check both your assumptions and your benchmark.
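
The mistake corrected in Update 2 is easy to reproduce in isolation. The sketch below uses two hypothetical subs, fillByRebinding and fillByDeref, to contrast assigning to the local copy of a reference with assigning through it.

```perl
use strict;
use warnings;

# Rebinds the sub's local copy of the reference to a new anonymous
# array; the caller's array is left untouched.
sub fillByRebinding {
    my $ref = shift;
    $ref = [ 1, 2, 3 ];
    return;
}

# Assigns through the reference, so the caller's array is filled.
sub fillByDeref {
    my $ref = shift;
    @$ref = ( 1, 2, 3 );
    return;
}

my ( @a, @b );
fillByRebinding( \@a );
fillByDeref( \@b );
printf "rebinding: %d elements, deref: %d elements\n", scalar @a, scalar @b;
# prints "rebinding: 0 elements, deref: 3 elements"
```

A cheap check like this on the caller's data before timing anything would have caught the faulty array_ref_in case.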


Re^2: references best practice by amarquis (Curate) on Apr 25, 2008 at 18:47 UTC
Benchmarking doesn't prove you wrong; it proves that something might be slower in a given case. If I had to choose between a marginal speed increase and an idiom I am comfortable with, I'd say the latter is correct (until I have proof that speed is a problem and profiling tells me this is the bottleneck).
Re^2: references best practice by Haarg (Priest) on Apr 25, 2008 at 22:51 UTC
Your second example isn't doing what you intend. The array outside of the sub isn't modified. The first line of prepData2 sets $refData to the incoming reference. The second replaces that reference with a different one, leaving the contents of the first reference unchanged. You'd want something more like:

sub prepData2 {
    my $refData = shift;
    @$refData = map { log $_ } 1 .. 100;
}

With that change, the benchmark changes somewhat, making it almost identical to array_ref_out1 in my tests.

