Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Re^3: Could we save the memory occupied by "undef" in an array?

by perrin (Chancellor)
on Nov 23, 2008 at 18:24 UTC ( [id://725456]=note: print w/replies, xml ) Need Help??


in reply to Re^2: Could we save the memory occupied by "undef" in an array?
in thread Could we save the memory occupied by "undef" in an array?

It could mark a range as empty without actually filling in each bucket.

I just tried it. Setting element 10000 of an array takes 0.35 MB more RAM than setting element 0 on perl 5.8.8 for OS X on Intel. So, there's a cost, although not a large one.

  • Comment on Re^3: Could we save the memory occupied by "undef" in an array?

Replies are listed 'Best First'.
Re^4: Could we save the memory occupied by "undef" in an array?
by JavaFan (Canon) on Nov 23, 2008 at 21:30 UTC
    It could mark a range as empty without actually filling in each bucket.
    Yes, and then you'd need a data structure to keep track of which ranges are empty. Wait, I got it! We'd use a hash to keep track which elements in the array are 'valid'! But then, you might as well use a hash to begin with.

    Furthermore, it doesn't save any memory. Even if you don't initialize in the first 1000 elements if you do:

    my @array; $array[1000] = 1;
    you still need to allocate space for 1001 pointers. Initialized or not.
    I just tried it. Setting element 10000 of an array takes 0.35 MB more RAM than setting element 0 on perl 5.8.8 for OS X on Intel. So, there's a cost, although not a large one.
    That's about 36 bytes/element. More that I expected. An undefined value takes 20 bytes in Perl. A pointer takes 4 bytes (32-bit platform). So, even if undefined values aren't shared, I'd expect less.

    However, I cannot recreate this:

    $ perl -MDevel::Size=:all -wE '$a[1] = 1; say total_size \@a' 148 $ perl -MDevel::Size=:all -wE '$a[10000] = 1; say total_size \@a' 40136
    which is a difference of about 40000 bytes, aka 10000 pointers. Replacing Devel::total_size with <c>`grep VmRSS /proc/$$/status` confirms the difference of 40000 bytes.
      There are many ways to implement an array in C. You don't have to allocate pointers immediately for every bucket. It's irrelevant though, since Perl's implementation clearly does something that causes empty buckets to take up space.

      40000 bytes is close enough to the 0.35 MB I saw that inaccuracies in measurement (I glanced at Apple's system monitor to check it) cover it. Looks like 5.10 has the same cost as 5.8.8.

        There are many ways to implement an array in C.
        Really? And for decades, I assumed there was just one way: just line up the elements of the array in consecutive memory locations.

        And that's how Perl implements arrays as well. A sequence of pointers to SVs in consecutive memory locations. Each pointer taking 4 (32-bit platform) or 8 (64-bit platform) bytes.

        You don't have to allocate pointers immediately for every bucket.
        Not quite sure what you mean by 'allocate pointers'. But even if you don't initialize the array elements (which in the Perl cause would mean, not fill the slot with a pointer to an SV - but note that since Perl doesn't keep track of which pointers are valid and which aren't, it has to), you still have to allocate memory for it. To get to $a[10000] fast, Perl will use the pointer found 40000 (80000 on 64-bit platforms) from the start of the array.

        40,000 is about 1/10th of 0.35 MB (358,400)


        Perl reduces RSI - it saves typing

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://725456]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others cooling their heels in the Monastery: (4)
As of 2024-04-18 18:41 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found