Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

Re^4: Could we save the memory occupied by "undef" in an array?

by JavaFan (Canon)
on Nov 23, 2008 at 21:30 UTC ( [id://725473]=note: print w/replies, xml ) Need Help??


in reply to Re^3: Could we save the memory occupied by "undef" in an array?
in thread Could we save the memory occupied by "undef" in an array?

It could mark a range as empty without actually filling in each bucket.
Yes, and then you'd need a data structure to keep track of which ranges are empty. Wait, I got it! We'd use a hash to keep track which elements in the array are 'valid'! But then, you might as well use a hash to begin with.

Furthermore, it doesn't save any memory. Even if you don't initialize in the first 1000 elements if you do:

my @array; $array[1000] = 1;
you still need to allocate space for 1001 pointers. Initialized or not.
I just tried it. Setting element 10000 of an array takes 0.35 MB more RAM than setting element 0 on perl 5.8.8 for OS X on Intel. So, there's a cost, although not a large one.
That's about 36 bytes/element. More that I expected. An undefined value takes 20 bytes in Perl. A pointer takes 4 bytes (32-bit platform). So, even if undefined values aren't shared, I'd expect less.

However, I cannot recreate this:

$ perl -MDevel::Size=:all -wE '$a[1] = 1; say total_size \@a' 148 $ perl -MDevel::Size=:all -wE '$a[10000] = 1; say total_size \@a' 40136
which is a difference of about 40000 bytes, aka 10000 pointers. Replacing Devel::total_size with <c>`grep VmRSS /proc/$$/status` confirms the difference of 40000 bytes.

Replies are listed 'Best First'.
Re^5: Could we save the memory occupied by "undef" in an array?
by perrin (Chancellor) on Nov 24, 2008 at 01:13 UTC
    There are many ways to implement an array in C. You don't have to allocate pointers immediately for every bucket. It's irrelevant though, since Perl's implementation clearly does something that causes empty buckets to take up space.

    40000 bytes is close enough to the 0.35 MB I saw that inaccuracies in measurement (I glanced at Apple's system monitor to check it) cover it. Looks like 5.10 has the same cost as 5.8.8.

      There are many ways to implement an array in C.
      Really? And for decades, I assumed there was just one way: just line up the elements of the array in consecutive memory locations.

      And that's how Perl implements arrays as well. A sequence of pointers to SVs in consecutive memory locations. Each pointer taking 4 (32-bit platform) or 8 (64-bit platform) bytes.

      You don't have to allocate pointers immediately for every bucket.
      Not quite sure what you mean by 'allocate pointers'. But even if you don't initialize the array elements (which in the Perl cause would mean, not fill the slot with a pointer to an SV - but note that since Perl doesn't keep track of which pointers are valid and which aren't, it has to), you still have to allocate memory for it. To get to $a[10000] fast, Perl will use the pointer found 40000 (80000 on 64-bit platforms) from the start of the array.
        Trading some speed and adding some complexity to an array implementation to save memory on sparse arrays is a pretty straightforward concept. It's obviously not how Perl does it, based on the report of ps. No big surprise, since Perl normally favors speed over memory.

      40,000 is about 1/10th of 0.35 MB (358,400)


      Perl reduces RSI - it saves typing
        Oh, right, off by one decimal place. But ps also shows a difference of ~400KB, so it's consistent with Apple's tool (which probably uses the same APIs). This is RSS when the program is in memory, with a sleep(10) keeping it alive.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://725473]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others avoiding work at the Monastery: (4)
As of 2024-04-24 12:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found