http://qs321.pair.com?node_id=11139338

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

What is the most efficient way to allocate memory for a global array?
Initially the array holds at most 40 - 41 KB, but it grows beyond that as more elements are pushed onto it.
thanks in advance

Replies are listed 'Best First'.
Re: Best way to allocate some memory first
by Fletch (Bishop) on Dec 02, 2021 at 22:52 UTC

    You could do something like assigning to the last element ($array[39_999] = undef;), but if you're so worried about efficiency that handling 40k items is an issue, perhaps perl isn't the best implementation choice. If you gave a Short, Self-Contained, Correct Example that might help with more specific suggestions.
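
    A quick sketch (mine, not part of the reply above) of the two pre-extension idioms; either one leaves the array with 40,000 slots before any real data is stored:

    use strict;
    use warnings;

    my @array;
    $#array = 39_999;            # pre-extend: the highest index is now 39999
    print scalar(@array), "\n";  # 40000

    my @other;
    $other[39_999] = undef;      # same effect via assigning to the last element
    print scalar(@other), "\n";  # 40000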

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

        I found a test script from 20 years ago here Re: array pre-allocation trick (top of thread array pre-allocation trick) and tweaked it to add a version that explicitly assigns to the last item (with "last" in the name). The machine is of course faster so the absolute rates are much different, but the relative numbers are roughly similar.

        #!/usr/bin/perl

        use strict;
        use warnings;

        use Benchmark qw /cmpthese timethese/;

        our $size = 100_000;

        cmpthese timethese(
            -10 => {
                push       => 'my @arr; push @arr => $_ for 0 .. $::size - 1',
                assign     => 'my @arr; $arr [$_] = $_ for 0 .. $::size - 1',
                assignpre  => 'my @arr; $#arr = $::size - 1; $arr [$_] = $_ for 0 .. $::size - 1',
                assignlast => 'my @arr; $arr[ $::size - 1 ] = undef; $arr [$_] = $_ for 0 .. $::size - 1',
                slice      => 'my @arr; @arr [0 .. $::size - 1] = (0 .. $::size - 1)',
                slicepre   => 'my @arr; $#arr = $::size - 1; @arr [0 .. $::size - 1] = (0 .. $::size - 1)',
                slicelast  => 'my @arr; $arr[ $::size - 1 ] = undef; @arr [0 .. $::size - 1] = (0 .. $::size - 1)',
            } => 'none'
        );

        __END__
                     Rate slicepre slice slicelast assignpre assign assignlast push
        slicepre    124/s       --  -10%      -11%      -21%   -27%       -28% -35%
        slice       138/s      11%    --       -0%      -12%   -19%       -20% -28%
        slicelast   138/s      12%    0%        --      -11%   -18%       -20% -28%
        assignpre   156/s      26%   13%       13%        --    -8%        -9% -19%
        assign      169/s      37%   23%       22%        8%     --        -2% -12%
        assignlast  172/s      39%   25%       24%       10%     2%         -- -10%
        push        192/s      55%   39%       39%       23%    13%        11%   --

        ## Same comparisons node 209382 on my machine (maybe bit more apples to apples)
                    Rate slicepre slice assignpre assign push
        slicepre   123/s       --  -10%      -21%   -25% -36%
        slice      137/s      11%    --      -12%   -17% -29%
        assignpre  156/s      27%   14%        --    -5% -19%
        assign     165/s      34%   20%        6%     -- -14%
        push       192/s      56%   40%       23%    17%   --

        The cake is a lie.
        The cake is a lie.
        The cake is a lie.

Re: Best way to allocate some memory first
by BillKSmith (Monsignor) on Dec 03, 2021 at 02:16 UTC
    You probably should not even think about array allocation until you know that your program is too slow and a profiler such as Devel::NYTProf shows that this is a major source of the problem.
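    For reference, a minimal sketch of what such a profiling run might look like (the commands ship with Devel::NYTProf; "your_script.pl" is a placeholder):

    perl -d:NYTProf your_script.pl   # run under the profiler; writes nytprof.out
    nytprofhtml                      # turn nytprof.out into a browsable HTML report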
    Bill
Re: Best way to allocate some memory first
by choroba (Cardinal) on Dec 02, 2021 at 23:11 UTC
    Crossposted to StackOverflow. It's considered polite to inform about crossposting so that people not attending both sites don't waste their efforts solving a problem already solved at the other end of the internets.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: Best way to allocate some memory first
by davido (Cardinal) on Dec 03, 2021 at 16:15 UTC

    Can you demonstrate code that is taking too long because the array hasn't been pre-allocated? Seeing that would help in making suggestions on how to mitigate the problem.


    Dave

      I actually think you would be hard pressed to even demonstrate an advantage in pre-allocating the array.

      Update: The short answer is even with five million elements, and growing the non-pre-allocated array one element at a time, pre-allocating is a little SLOWER. What follows is my testing methodology and findings.

      If I build up an array by pre-allocating, then adding elements one by one within the range of indexes that was preallocated, and then build up another array just by adding elements one by one, I can compare elapsed times. When I do, the non-preallocation method seems to win every time, presumably because there's one less statement to execute. Here's an example that attempts to isolate just the parts that involve preallocation and building up of the array's elements:

      #!/usr/bin/env perl

      use strict;
      use warnings;
      use Time::HiRes qw(time);
      use Devel::Size qw(size);

      my $array_size = 5_000_000;

      my @array0;
      my @array1;

      print "Testing preallocation.\n";
      my $t0 = time();
      $#array0 = $array_size - 1;
      for (my $i = 0; $i < $array_size; $i++) {
          $array0[$i] = $i;
      }
      my $t0_delta = time() - $t0;
      print scalar(@array0), " elements. Last element value is $array0[-1]\n";
      printf "Array size is %-0.2f megabytes\n", size(\@array0)/1024/1024;
      printf "With preallocation: %-0.4f seconds\n", $t0_delta;

      print "\nTesting without preallocation.\n";
      my $t1 = time();
      for (my $j = 0; $j < $array_size; $j++) {
          $array1[$j] = $j;
      }
      my $t1_delta = time() - $t1;
      print scalar(@array1), " elements. Last element value is $array1[-1]\n";
      printf "Array size is %-0.2f megabytes\n", size(\@array1)/1024/1024;
      printf "Without preallocation: %-0.4f seconds\n", $t1_delta;

      On my system the output is:

      Testing preallocation.
      5000000 elements. Last element value is 4999999
      Array size is 38.15 megabytes
      With preallocation: 0.3005 seconds

      Testing without preallocation.
      5000000 elements. Last element value is 4999999
      Array size is 44.55 megabytes
      Without preallocation: 0.2775 seconds

      As you can see, not preallocating is actually faster. One might wonder why I didn't use push. The reason is that push was adding elements after the end of the preallocated array, skipping past the slots that had already been preallocated. That raises one other reason not to preallocate (aside from it not actually being faster): it may trick you.
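
      Here's a tiny sketch (not part of the benchmark above, and the sizes are arbitrary) of the push behaviour being described:

      use strict;
      use warnings;

      my @arr;
      $#arr = 4;                 # pre-allocate indexes 0..4 (all undef)
      push @arr, 'x';            # push appends AFTER the pre-extended part
      print scalar(@arr), "\n";  # 6 -- 'x' landed at index 5, not index 0
      print defined $arr[0] ? "defined\n" : "undef\n";   # undef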

      It's interesting that the non-preallocated array ends up consuming more memory than the preallocated one. Both observations can be explained by looking at why preallocation isn't significantly faster than building the array up incrementally.

      It would be grossly inefficient to grow the memory for an array one element at a time: if a new contiguous chunk had to be allocated for every new element, the contents of the previous chunk would need to be copied into the new one each time. When Perl allocates space for an array, it leaves room for the array to grow, precisely to avoid this. I don't know the exact growth plan it uses, but suppose Perl initially leaves enough room for 50 elements. When those 50 are used, Perl allocates a chunk large enough for 100 elements and copies the array into it; when that fills, it allocates enough space for 200, and so on. Those numbers are made up, but the point is that Perl almost always allocates more space than the array currently needs.

      When you pre-allocate, the growth algorithm never applies: if you request a million elements, the actual array size in memory is a million elements, not 1.2 million (with headroom to grow without further allocations). So in terms of memory consumption the pre-allocation method was a little more efficient. But in terms of speed, the growth algorithm is already good enough that the time spent growing an array is inconsequential compared to the time it takes to interpret and execute the statement that preallocates the array.
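
      If you want to watch the growth headroom, here is a small sketch of my own (not from the post) using the core module Devel::Peek; Dump() writes an "SV = PVAV" header to STDERR whose FILL field is the highest used index and whose MAX field is the highest allocated index:

      use strict;
      use warnings;
      use Devel::Peek qw(Dump);

      # An array grown by pushing: compare FILL (used) with MAX (allocated)
      # to see how much headroom perl left after its last reallocation.
      my @grown;
      push @grown, $_ for 1 .. 1000;
      Dump(\@grown);

      # A pre-extended array: MAX reflects the size that was asked for.
      my @presized;
      $#presized = 999;
      Dump(\@presized);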

      So the pre-allocation micro-optimization, even for an array of five million elements, is useless if your goal is to improve speed. Perl's array growth algorithm is similar to that of a C++ std::vector container, and has similar time-complexity characteristics: appending elements at the end, letting the array grow as needed, is O(1) amortized time.

      If speed is an issue, profile your code with Devel::Profile and solve the hotspots. Array allocation isn't going to be one of them.

      I modified the code above to remove "extra" stuff and to put the guts into a main() subroutine so that I could profile that subroutine using Devel::NYTProf. The profile shows that the loop over the pre-allocated array takes 601 milliseconds and the loop over the non-preallocated array takes 603 milliseconds. However, the preallocation step itself takes 6.69 milliseconds, so the preallocated version (allocating, then iterating) takes a total of about 608ms, versus 603ms for the non-preallocated one. This supports my earlier assertion: the act of pre-allocating actually consumes more wallclock time interpreting and processing that line of code than just letting Perl allocate as needed.

      # spent 1.30s (1.30+49µs) within main::main which was called:
      #    once (1.30s+49µs) by main::RUNTIME at line 33
                                            sub main {
       8  1     600ns                           my $array_size = 5_000_000;
       9
      10  1     200ns                           my @array0;
      11                                        my @array1;
      12
      13  1      12µs   1      6µs              my $t0 = time();
                                                # spent 6µs making 1 call to Time::HiRes::time
      14  1    6.69ms                           $#array0 = $array_size-1;
      15  1     601ms                           for (my $i = 0; $i < $array_size; $i++) {
      16                                            $array0[$i] = $i;
      17                                        }
      18  1      21µs   1      5µs              my $t0_delta = time() - $t0;
                                                # spent 5µs making 1 call to Time::HiRes::time
      19
      20  1       2µs   1    300ns              my $t1 = time();
                                                # spent 300ns making 1 call to Time::HiRes::time
      21  1     603ms                           for (my $j = 0; $j < $array_size; $j++) {
      22                                            $array1[$j] = $j;
      23                                        }
      24
      25  1      16µs   1      6µs              my $t1_delta = time() - $t1;
                                                # spent 6µs making 1 call to Time::HiRes::time
      26
      27  1      26µs   1     18µs              print scalar(@array0), " elements. Last element value is $array0[-1]\n";
                                                # spent 18µs making 1 call to main::CORE:print
      28  1      12µs   1      9µs              printf "With preallocation: %-0.4f seconds\n", $t0_delta;
                                                # spent 9µs making 1 call to main::CORE:prtf
      29  1       4µs   1      2µs              print scalar(@array1), " elements. Last element value is $array1[-1]\n";
                                                # spent 2µs making 1 call to main::CORE:print
      30  1    89.3ms   1      3µs              printf "Without preallocation: %-0.4f seconds\n", $t1_delta;
                                                # spent 3µs making 1 call to main::CORE:prtf
      31                                        }
      32
      33  1       7µs   1    1.30s              main();
                                                # spent 1.30s making 1 call to main::main

      You can do this too: See the documentation for Devel::NYTProf.


      Dave

Re: Best way to allocate some memory first
by LanX (Saint) on Dec 02, 2021 at 22:53 UTC
Re: Best way to allocate some memory first
by Marshall (Canon) on Dec 03, 2021 at 22:11 UTC
    I would not consider 41 KB as "large".

    Some years ago I spent literally days benchmarking and working with pre-sizing hash structures. Perl starts off with 8 hash "buckets". Every time the hash doubles in size (8, 16, 32, 64 buckets, etc.), Perl has to re-visit every item in the hash to find its new "hash index" (an entry in the hash table, which is a pointer to a linked list).

    My conclusion was that pre-sizing a hash table to hold say 100K hash keys didn't matter at all in the scheme of things. I don't remember all of the initial "bucket array" sizes that I experimented with.
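
    For completeness, a minimal sketch (mine, not Marshall's benchmark) of the pre-sizing being described; assigning to keys() as an lvalue asks perl to allocate the hash buckets up front, so the 8, 16, 32, ... doubling never has to happen while the hash is being filled:

    use strict;
    use warnings;

    my %hash;
    keys(%hash) = 100_000;                   # request bucket space for ~100K keys up front

    $hash{"key$_"} = $_ for 1 .. 100_000;
    print scalar(keys %hash), " keys stored\n";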

    A major factor in my analysis was that the time spent actually using the hash completely dwarfed the time spent creating it in the first place. I took out the pre-sizing because it made no perceptible difference to the application.

    Perl itself will call low-level C functions to do its internal data structure management. These are "very fast" functions. Yes, I can write an ASM function that will beat C memcpy(), but so what? (memcpy() is good C code: it uses byte operations to align onto a memory boundary and then proceeds from there. The compiler, except at extreme optimization levels, cannot recognize that there is a specialized ASM instruction that is much faster at these block memory operations.) Again, so what?

    Anyway, I am very skeptical that "pre-sizing a 41K Perl array" will make any difference at all in terms of overall application performance.

    Update: I don't think that this will achieve the intended purpose, because undef is a value. Assigning it will mess things up later when you try to decide how many elements are in the array.
    I don't know what Perl "magic" is involved here. But I would guess that if, when your program initializes, you use a loop to set every element to 1 and then another loop to set every element back to undef, you will wind up with an array with memory allocated for each element. You can't even blink as fast as this will happen. Benchmark the heck out of this and let us know what happens.
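
    A small sketch of my own (hypothetical sizes) of the "undef is a value" point: the high-index assignment extends the array, so any element count based on scalar(@array) now includes all the empty slots:

    use strict;
    use warnings;

    my @array;
    $array[40_000] = undef;            # the suggested pre-extension
    print scalar(@array), "\n";        # 40001 -- not 0; the empty slots all count
    print defined $array[0] ? "defined\n" : "undef\n";   # undef, like every other slot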