Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling
 
PerlMonks  

Re^2: Could we save the memory occupied by "undef" in an array?

by lightoverhead (Pilgrim)
on Nov 23, 2008 at 03:58 UTC ( [id://725385]=note: print w/replies, xml ) Need Help??


in reply to Re: Could we save the memory occupied by "undef" in an array?
in thread Could we save the memory occupied by "undef" in an array?

Thank you for your reply, the reason I would like to use array is as you said it is fast, however at the expense of eating memory. Also, for these undef elements, I may want to fill in something later.

So,what if I can only use array to implement, and Is there a method or package that can squash the memory occupied by "undef" in an array and return the the positions back when I try to fill something into the position that was occupied by a undef, sort of like autovivification?

  • Comment on Re^2: Could we save the memory occupied by "undef" in an array?

Replies are listed 'Best First'.
Re^3: Could we save the memory occupied by "undef" in an array?
by ikegami (Patriarch) on Nov 23, 2008 at 04:26 UTC

    Before you complain about the speed of a hash, benchmark it. The difference between a hash and an array shouldn't be much. I doubt a sparse array would be any faster.

    Your program is already heavily invested in the performance of hashes. Most objects use hashes as their underlying data structure. Every reference to a package variable results in multiple hash lookups (I think). And every call to a subroutine results in multiple hash lookups (I think).

    Is there a method or package that can squash the memory occupied by "undef" in an array and return the the positions back when I try to fill something into the position that was occupied by a undef, sort of like autovivification?

    The benefit of arrays come from knowing exactly where to look into memory for a given element knowing only a base address and the element index. That becomes impossible if some of the elements are missing.

    Now, there could be something more efficient data structure than a hash for your needs, but accessing it would require calling a subroutine/method. I believe that results in a hash lookup to find the code pointer associated with the subroutine name, so you're worse off than you started.

    On the plus side, arrays use the most efficient of two kinds of undef by default if I read what follows correctly. Instead of creating a new undef SV, uninitialized array elements refer to a global undef constant. That means only sizeof(void*) bytes are wasted per unused slot instead of sizeof(void*) + sizeof(SV).

    >perl -MDevel::Peek -e"$a[2]=undef; $a[3]=3; Dump \@a" SV = RV(0x183c518) at 0x22609c REFCNT = 1 FLAGS = (TEMP,ROK) RV = 0x226db0 SV = PVAV(0x22bfcc) at 0x226db0 REFCNT = 2 FLAGS = () IV = 0 NV = 0 ARRAY = 0x1833f9c FILL = 3 MAX = 3 ARYLEN = 0x0 FLAGS = (REAL) Elt No. 0 # Not allocated Elt No. 1 # Not allocated Elt No. 2 # Allocated but undef SV = NULL(0x0) at 0x225f7c REFCNT = 1 FLAGS = () Elt No. 3 # Allocated and defined SV = IV(0x18287b4) at 0x22606c REFCNT = 1 FLAGS = (IOK,pIOK) IV = 3
      Your program is already heavily invested in the performance of hashes. Most objects use hashes as their underlying data structure. Every reference to a package variable results in multiple hash lookups (I think). And every call to a subroutine results in multiple hash lookups (I think).
      Well, then, think again. :)

      To the first approximation, only packages use hashes for their internal structure, and most package symbol lookups happen only at compile time. (Any stored GV* is post-hash-lookup.) Apart from symbolic references (which are discouraged), the only major operation that historically used hash lookups implicitly at run time was method calls, and I think even that has been heavily optimized these days when the method name is known in advance.

      There are other implicit uses of hashes, but these typically come into play only when you start using higher Unicodes characters in patterns, and even there the internals typically try to avoid using the hash in the common cases.

      In any event, normal sub calls don't use any implicit hashes. They're already too slow as it is...

        Thanks for the correction. I can't figure out what lead me to think function lookups were made at run-time.
      This looks interesting

      Would you mind if I request you to explain that?

      Could you please explain the Dumper output if you have some free time?

      Thanks

        Could you please explain the Dumper output if you have some free time?

        By "I think" and "if I read what follows correctly", I meant to communicate that I'm not clear on the details relevant to this conversation. I've already described the relevant details beyond my understanding.

        The only other relevant details might be FILL = 3 (4 elements are visible) and MAX = 3 (4 elements have been allocated). Perl sometimes allocate a larger array than it needs to speed up future push statements. It also has the option to speed up pop statements by not resizing the allocated block.

        >perl -MDevel::Peek -le"@a=qw(a b c); Dump(\@a, 1); pop @a; Dump(\@a, +1)" SV = RV(0x183c550) at 0x2260b4 REFCNT = 1 FLAGS = (TEMP,ROK) RV = 0x226dbc SV = PVAV(0x22bfcc) at 0x226dbc REFCNT = 2 FLAGS = () IV = 0 NV = 0 ARRAY = 0x1833fc4 FILL = 2 # 0..2 in use MAX = 3 # 0..3 allocated ARYLEN = 0x0 FLAGS = (REAL) SV = RV(0x183c550) at 0x2260a8 REFCNT = 1 FLAGS = (TEMP,ROK) RV = 0x226dbc SV = PVAV(0x22bfcc) at 0x226dbc REFCNT = 2 FLAGS = () IV = 0 NV = 0 ARRAY = 0x1833fc4 FILL = 1 # 0..1 in use MAX = 3 # 0..3 allocated ARYLEN = 0x0 FLAGS = (REAL)

        Would you mind if I request you to explain that?

        Did you ask the same question twice, or is this a different "that"?

Re^3: Could we save the memory occupied by "undef" in an array?
by JavaFan (Canon) on Nov 23, 2008 at 12:19 UTC
    So,what if I can only use array to implement, and Is there a method or package that can squash the memory occupied by "undef" in an array and return the the positions back when I try to fill something into the position that was occupied by a undef, sort of like autovivification?
    Why won't a hash do if you don't want to spend the memory? Why do you think someone will have written a module that basically does what a hash does?

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://725385]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others pondering the Monastery: (3)
As of 2024-04-25 23:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found