array pre-allocation trick

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I've been trying to see if pre-allocating a array would make a script more efficient. So far my timing tests have not been conclusive. Should the following work?

my $preAllocDummy = 5000;

# Fake call in order to pre-allocate the array.
example();

# Real call.
example();

sub example {
   # Trying a subtle pre-allocation hack.  Supposedly the space
   # allocated for @array will be preserved for later calls.
   my @array;
   if ($preAllocDummy) {
      print "Trying to preallocate to max size $preAllocDummy\n";
      $#array = $preAllocDummy - 1;
      $preAllocDummy = 0;
      return;
   }

   # Push stuff on the array here.
}
[download]

Comment on array pre-allocation trick Download Code

Replies are listed 'Best First'.
Re: array pre-allocation trick by rir (Vicar) on Oct 31, 2002 at 02:04 UTC
You have a good idea. The details are tripping you up. Your code on the first call: creates @array, allocates @array, returns destroying @array as it goes out of scope On future calls it just creates @array and starts using it. So your tests should not show any advantage. Move the "my @array;" outside of the function so that it survives from call to call: `{ my $initialized; my @array; sub example { unless ( $initialized) { $#array = $size_needed; $initialized++; } # fill the array } }` [download] Update follows As soon as I submitted the above I realize that may not be what you want. You may be making this way harder than you need to. If you need a new @array on each call. Just preallocate it each time: `sub example { my @array; $#array = $size_needed; # fill array -- this is where you may save time }` [download]	[reply] [d/l] [select]
Re: Re: array pre-allocation trick by strat (Canon) on Oct 31, 2002 at 11:08 UTC
As far as I know, memory-preallocation with arrays generally doesn't improve speed very much. But it often may reduce memory fragmentation... Best regards, perl -e "s>>F>e=>y)\martinF)stronat)=>print,print v8.8.8.32.11.32"	[reply]
Re: array pre-allocation trick by BrowserUk (Patriarch) on Oct 31, 2002 at 02:25 UTC
~~A nice way of keeping things together when pre-allocating arrays is to use the fact that my is a function and do the declaration and preallocation in the same statement like this.~~ Update: I swear it worked 10 minutes ago when I tested it.But I tried it in a perl shell, and first time I tried it I did `$#{\@a} = 50; print scalar @a;` which printed 51. Great! Then I thought "Ah! Should have used my", so I recalled the line, and added the my, and again it printed 51.........so I posted. My little perl shell has its uses, but ...... Thanks to sauoq who brought me to book. ~~$#{\my @array} = 100;~~ </Update> The idea can also be extended to pre-allocating hash buckets like this `keys my %hash = 2**8;` [download] Though this is tougher to verify as if you immediatly evaluate the hash in a scalar context `print scalar %hash;` you get 0, which is disconcerting. However, if you then add one key to the hash `%hash= (a=>1); print scalar %hash;` you get `1/256` showing that it did work but simply fails to report the fact until it has at least one key. Nah! Your thinking of Simon Templar, originally played by Roger Moore and later by Ian Ogilvy	[reply] [d/l] [select]
Re: Re: array pre-allocation trick by sauoq (Abbot) on Oct 31, 2002 at 02:43 UTC
Are you sure that trick with array pre-allocating does what you think it does? My short experiment suggests otherwise: `$ perl -le '$#{\my @array}=100;print scalar @array' 0 $ perl -le 'my @array;$#array=100;print scalar @array' 101` [download] Using strict clears things up... `$ perl -le 'use strict;$#{\my @array}=100;print scalar @array' Global symbol "@array" requires explicit package name at -e line 1. Execution of -e aborted due to compilation errors.` [download] The `my` lexically scopes `@array` to the block which you then use as a reference. -sauoq "My two cents aren't worth a dime.";	[reply] [d/l] [select]
Re: array pre-allocation trick by BrowserUk (Patriarch) on Oct 31, 2002 at 07:25 UTC
The benefits of pre-assignment seem to be marginal at best. However, the method by which you assign your values to the array is considerable more important. Unfortunately, the best method is not a simple choice either. ~~If you are filling an array with essentially static data, then assignment to an array slice~~ seems ~~to be optimal,~~ but if you need to generate your data in a loop, building a list prior to slice assignment suddenly becomes your worst option. In this case, indexed assignment in a for loop wins hands down with push beating slice assignment (pre-allocated or no) into last place. This appears to be consistant for a wide range of array size from 100, to 100_000 elements. The results below for a arrays of 10_000 elements seem typical. Of course, this is only part of the story. If your source of data for the array is a very fast process as typified my use of rand below and your arrays large or the re-filling frequent, then the differences could be significant. On the other hand, if the source of the data was an expensive process (RDMS query or similar), then the cost of the assignment becomes insignificant relative to that process and the assignment method should be chosen as that which most closely fits the way in which you aquire the data to put in the array. If you are receiving the data en-bloc as a list or array then using a slice assignment will probably be a win. If you are getting the values 1 at a time, then indexed assignment is probably best though some algorithms, such as where you don't know how much data you are going to receive, then the simplicity and scalability of push is the natural choice. One final thing regarding your sample code. You show the array being declared and filled within a subroutine but you don't indicate how you are intending to return this to the calling code. If efficiency is in anyway important, ensuring that you return a reference to the array and not return the array as a list will probably save you more time than any difference you might achieve through pre-allocation or assignment method combined. Update: Bah! Humbug. Don't Read more... (4 kB) Update: See Abigail-II's following post for the real gen on pre-allocation and assignment method. My benchmark was garbage it seems. Nah! Your thinking of Simon Templar, originally played by Roger Moore and later by Ian Ogilvy	[reply] [d/l]
Re: array pre-allocation trick by Abigail-II (Bishop) on Oct 31, 2002 at 11:10 UTC
Your benchmark is utterly flawed. You aren't measuring what you think you are measuring. The arrays `@empty` and `@prealloc` are lexical. This means, they are known in the file, and only in the file. They cannot be accessed from `Benchmark.pm`. After all, you are passing in a string, not a coderef. The `@empty` and `@prealloc` in the passed strings will be evalled (in the main package) and will hence be package variables (as they aren't `my`'ed). `@::empty` just grows and grows. Meaning you need the allocate more and more memory. Of course it will be slower! It will be enlightening to print the sizes of `@::empty` and `@::prealloc` after the benchmark was run. You aren't measuring the influence of the pre-allocation at all. Except for the very first run of `"assign"`, the `@::empty` array will contain at least `$::size` elements. `(q{data} x $::size)` returns a single string, even in list context. This explains why `"slice"` appears to be much faster with strings than with numerals. The slice shouldn't be a win as you state, because that forces Perl to build two relatively large lists. The following benchmark suggest that preallocating is actually a tad bit slower, and that using slices loses. Assigning is a little faster than pushing. #!/usr/bin/perl use strict; use warnings; use Benchmark qw /cmpthese timethese/; our $size = 100_000; cmpthese timethese (-10 => { push => 'my @arr; push @arr => $_ for 0 .. $::size - 1', assign => 'my @arr; $arr [$_] = $_ for 0 .. $::size - 1', assignpre => 'my @arr; $#arr = $::size - 1; $arr [$_] = $_ for 0 .. $::size - 1', slice => 'my @arr; @arr [0 .. $::size - 1] = (0 .. $::size - +1)', slicepre => 'my @arr; $#arr = $::size - 1; @arr [0 .. $::size - 1] = (0 .. $::size - +1)', } => 'none'); __END__ Rate slicepre slice push assignpre assign slicepre 3.82/s -- -1% -37% -40% -41% slice 3.87/s 1% -- -36% -39% -40% push 6.08/s 59% 57% -- -4% -5% assignpre 6.36/s 66% 64% 5% -- -1% assign 6.44/s 68% 66% 6% 1% -- [download] Abigail	[reply] [d/l] [select]
Re: Re: array pre-allocation trick by Anonymous Monk on Oct 31, 2002 at 16:41 UTC
Thanks everyone for the help. I tried the benchmark Abigail-II did on my machine, with a map thrown in for good measure, and got slightly different results -- strangely, the simple push won! #!/usr/bin/perl use strict; use warnings; use Benchmark qw /cmpthese timethese/; our $size = 100_000; cmpthese timethese (-10 => { push => 'my @arr; push @arr => $_ for 0 .. $::size - 1', map => 'my @arr; @arr = map {$_} (0 .. $::size - 1)', assign => 'my @arr; $arr [$_] = $_ for 0 .. $::size - 1', assignpre => 'my @arr; $#arr = $::size - 1; $arr [$_] = $_ for 0 .. $::size - 1', slice => 'my @arr; @arr [0 .. $::size - 1] = (0 .. $::size - +1)', slicepre => 'my @arr; $#arr = $::size - 1; @arr [0 .. $::size - 1] = (0 .. $::size - +1)', } => 'none'); __END__ Rate map slicepre slice assignpre assign + push map 7.68/s -- -34% -36% -45% -47% + -48% slicepre 11.7/s 52% -- -3% -17% -19% + -21% slice 12.1/s 57% 3% -- -14% -17% + -18% assignpre 14.1/s 83% 20% 16% -- -3% + -5% assign 14.5/s 89% 24% 20% 3% -- + -2% push 14.7/s 92% 26% 22% 5% 2% + -- [download] Here is what I was running: `This is perl, v5.6.1 built for i386-linux Linux cchan.acsys.com 2.4.8-26mdk #1 Sun Sep 23 17:06:39 CEST 2001 i68 +6 unknown` [download]	[reply] [d/l] [select]
Re: array pre-allocation trick by Abigail-II (Bishop) on Nov 01, 2002 at 10:59 UTC
Re: Re: Re: array pre-allocation trick by BrowserUk (Patriarch) on Oct 31, 2002 at 16:56 UTC
Re: Re: Re: Re: array pre-allocation trick by Anonymous Monk on Nov 02, 2002 at 00:24 UTC
Some notes below your chosen depth have not been shown here
Short note by Aristotle (Chancellor) on Oct 31, 2002 at 20:00 UTC
Re: array pre-allocation trick by Anonymous Monk on Oct 31, 2002 at 02:49 UTC
Thank you for your ideas. I realize that I could keep the variable in an outer scope, but what I was trying to do was preallocate space without having it filled up with undefs, so I could use push on the array. My example code should have included a loop like the following inside the subroutine after the "Push stuff on the array here" comment: `for (1..5000) { push @array,$_; }` [download]	[reply] [d/l]
Re: Re: array pre-allocation trick by rir (Vicar) on Oct 31, 2002 at 04:41 UTC
I apologize, I didn't read that comment strictly enough. No, push is going to grow your array. Try something like: `for ( 0..4999) { $array[$_] = $_; }` [download] Although if you really are really tight, you may want to test if a while or C-style loop is faster than building the list of integers.	[reply] [d/l]
Re^2: array pre-allocation trick by Aristotle (Chancellor) on Oct 31, 2002 at 19:22 UTC
Sounds like you want to have a look at the map function, actually. `my @array = map { do_stuff_with($_); $_ } 1 .. 5000;` Makeshifts last the longest.	[reply] [d/l]

Back to Seekers of Perl Wisdom