PerlMonks  

Devel::Size reports different size after hash access

by Cristoforo (Curate)
on Oct 25, 2016 at 19:40 UTC ( [id://1174724]=perlquestion )

Cristoforo has asked for the wisdom of the Perl Monks concerning the following question:

While working on a problem, I came across something I couldn't explain.

When I asked for the size of a newly created hash, I got one size. But after accessing the hash, I got a size about 38% larger.

#!/usr/bin/perl
use strict;
use warnings;
use Devel::Size 'total_size';

my $s = 'AAAAAAAAAAAAAAA';
my %hash = map {$s++ => 1} 1 .. 1000;

print total_size(\%hash). ' ' . keys(%hash) . "\n";

open my $fh, '>', 'j1.txt' or die $!;
for my $key (keys %hash) {
    print $fh "$key $hash{$key}\n";
}

print total_size(\%hash). ' ' . keys(%hash) . "\n";

The result of running this code was:

105248 1000
145288 1000

I'm using perl version 5.014 and the version of Devel::Size is .08.

Replies are listed 'Best First'.
Re: Devel::Size reports different size after hash access
by BrowserUk (Patriarch) on Oct 25, 2016 at 20:25 UTC

    The problem is that you are using the numeric values of the hash in a string context: print $fh "$key $hash{$key}\n"; Perl therefore upgrades what were IVs (internal integers) into scalars that also carry a PV (internal string), caching the stringified result in the expectation that you might use it in a string context again.

    As an IV, each value requires 24 bytes, but once converted to a string and stored in a PV, each value requires ~56 bytes.
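
    A minimal sketch of that caching, using only the core B module instead of Devel::Size. The helper name count_string_slots is hypothetical, and the output is built in memory rather than written to a file. It tests the SVp_POK flag on the assumption that newer perls set only the private string flag when an integer is stringified.

```perl
use strict;
use warnings;
use B ();

# Count hash values whose scalar currently carries a cached string (PV) slot.
# count_string_slots is a hypothetical helper, not part of any module.
sub count_string_slots {
    my ($href) = @_;
    return scalar grep {
        B::svref_2object(\$_)->FLAGS & B::SVp_POK
    } values %$href;
}

my $s    = 'AAAAAAAAAAAAAAA';
my %hash = map { ($s++ => 1) } 1 .. 1000;

my $before = count_string_slots(\%hash);

# Interpolating each value forces a string conversion, which perl caches
# inside the stored scalar.
my $out = '';
$out .= "$_ $hash{$_}\n" for keys %hash;

my $after = count_string_slots(\%hash);
print "values with a cached string: before=$before after=$after\n";
```

    None of the values should carry a string before the loop, and all 1000 should afterwards, which is where the extra bytes reported by total_size come from.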

    You can avoid the conversion being cached by making perl convert a temporary value to a string for printing like this:

    printf $fh "$key %d\n", 0+$hash{$key};

    That will avoid the memory growth.
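
    A small sketch to check this without Devel::Size, again using the core B module. The helper has_string_slot and the key 'k' are hypothetical. It formats with sprintf rather than printing to a file so the result can be compared directly.

```perl
use strict;
use warnings;
use B ();

# True if the scalar behind $ref carries a cached string (PV) slot.
# has_string_slot is a hypothetical helper, not part of any module.
sub has_string_slot {
    my ($ref) = @_;
    return (B::svref_2object($ref)->FLAGS & B::SVp_POK) ? 1 : 0;
}

my %hash = (k => 1);

# 0 + $hash{k} yields a temporary; only that temporary is formatted,
# so the stored value stays a plain integer.
my $line          = sprintf "%s %d\n", 'k', 0 + $hash{k};
my $after_numeric = has_string_slot(\$hash{k});

# Plain interpolation stringifies the stored scalar itself.
my $same_line    = "k $hash{k}\n";
my $after_interp = has_string_slot(\$hash{k});

print "after sprintf: $after_numeric, after interpolation: $after_interp\n";
```

    Both approaches produce the same text; only the interpolated form leaves a string behind in the hash value. Passing the key through %s instead of interpolating it into the format is a defensive choice in case a key ever contains a % character.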


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Devel::Size reports different size after hash access
by GrandFather (Saint) on Oct 25, 2016 at 20:20 UTC

    You store a number then access it as a string so Perl caches the stringised version of the number along with the number. Try quoting the value so it's stored as a string instead of as a number to check (works for me - size is stable).
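
    One way to run that check without Devel::Size is to look at the internal scalar type, which the core B module exposes as the class of the object returned by svref_2object. This is only a sketch; the helper sv_class and the key 'k' are hypothetical.

```perl
use strict;
use warnings;
use B ();

# Name of the B:: class for the scalar behind $ref, e.g. B::IV or B::PV.
# sv_class is a hypothetical helper, not part of any module.
sub sv_class {
    my ($ref) = @_;
    return ref B::svref_2object($ref);
}

my %as_number = (k => 1);      # stored as a number
my %as_string = (k => '1');    # quoted, so stored as a string

my $num_before = sv_class(\$as_number{k});
my $str_before = sv_class(\$as_string{k});

# Use both values in string context.
my $text = "number: $as_number{k}, string: $as_string{k}";

my $num_after = sv_class(\$as_number{k});
my $str_after = sv_class(\$as_string{k});

print "number: $num_before -> $num_after\n";
print "string: $str_before -> $str_after\n";
```

    The quoted value should stay a B::PV throughout, while the unquoted one is upgraded from B::IV to B::PVIV.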

    Premature optimization is the root of all job security
Re: Devel::Size reports different size after hash access
by stevieb (Canon) on Oct 25, 2016 at 20:19 UTC

    I am not experienced enough to know most of the inner details, but it's fair to say that perl does a lot of work internally when the items are first accessed. If you change to Devel::Peek, you can see how the structure changes after access:

    use strict;
    use warnings;
    use Devel::Peek qw(Dump);

    my $s = 'AAAAAAAAAAAAAAA';
    my %hash = map {$s++ => 1} 1 .. 2;

    print "before:\n";
    Dump \%hash;

    open my $fh, '>', 'j1.txt' or die $!;
    for my $key (keys %hash) {
        print $fh "$key $hash{$key}\n";
    }

    print "after:\n";
    Dump \%hash;

    Output:

    before:
    SV = IV(0x1e5cca8) at 0x1e5ccb8
      REFCNT = 1
      FLAGS = (TEMP,ROK)
      RV = 0x1e88420
      SV = PVHV(0x1e63ef0) at 0x1e88420
        REFCNT = 2
        FLAGS = (PADMY,SHAREKEYS)
        ARRAY = 0x1ed7880  (0:6, 1:2)
        hash quality = 125.0%
        KEYS = 2
        FILL = 2
        MAX = 7
        Elt "AAAAAAAAAAAAAAA" HASH = 0xa39c9065
        SV = IV(0x1e7a9a8) at 0x1e7a9b8
          REFCNT = 1
          FLAGS = (IOK,pIOK)
          IV = 1
        Elt "AAAAAAAAAAAAAAB" HASH = 0x7747aa6e
        SV = IV(0x1e7a990) at 0x1e7a9a0
          REFCNT = 1
          FLAGS = (IOK,pIOK)
          IV = 1
    after:
    SV = IV(0x1e883e0) at 0x1e883f0
      REFCNT = 1
      FLAGS = (TEMP,ROK)
      RV = 0x1e88420
      SV = PVHV(0x1e63ef0) at 0x1e88420
        REFCNT = 2
        FLAGS = (PADMY,OOK,SHAREKEYS)
        ARRAY = 0x1e75110  (0:6, 1:2)
        hash quality = 125.0%
        KEYS = 2
        FILL = 2
        MAX = 7
        RITER = -1
        EITER = 0x0
        RAND = 0x69863a90
        Elt "AAAAAAAAAAAAAAA" HASH = 0xa39c9065
        SV = PVIV(0x1e7f860) at 0x1e7a9b8
          REFCNT = 1
          FLAGS = (IOK,POK,pIOK,pPOK)
          IV = 1
          PV = 0x1ed7810 "1"\0
          CUR = 1
          LEN = 16
        Elt "AAAAAAAAAAAAAAB" HASH = 0x7747aa6e
        SV = PVIV(0x1e7f878) at 0x1e7a9a0
          REFCNT = 1
          FLAGS = (IOK,POK,pIOK,pPOK)
          IV = 1
          PV = 0x1e98b90 "1"\0
          CUR = 1
          LEN = 16

    Here's a brief on memory usage.

      It's not the "first access" that does it, it's the "give it to me different". Stored as a number and accessed as a string in the OP's case.
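
      A rough illustration of "give it to me different", in both directions, using the core B module. The helper sv_class and the hash keys are hypothetical. Reading a value in the form it was stored leaves it alone; asking for the other form is what triggers the upgrade.

```perl
use strict;
use warnings;
use B ();

# Name of the B:: class for the scalar behind $ref, e.g. B::IV or B::PVIV.
# sv_class is a hypothetical helper, not part of any module.
sub sv_class {
    my ($ref) = @_;
    return ref B::svref_2object($ref);
}

my %h = (num => 1, str => '42');

# Stored as a number, read as a number: nothing changes.
my $sum                = $h{num} + 1;
my $after_numeric_read = sv_class(\$h{num});

# Stored as a number, read as a string: the string is cached.
my $text              = "value: $h{num}";
my $after_string_read = sv_class(\$h{num});

# Stored as a string, read as a number: the integer is cached.
my $n                 = $h{str} + 0;
my $str_after_numeric = sv_class(\$h{str});

print "num after numeric read: $after_numeric_read\n";
print "num after string read:  $after_string_read\n";
print "str after numeric read: $str_after_numeric\n";
```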

      Premature optimization is the root of all job security
      stevieb

      Thanks for the nice link to Mastering Perl. It was helpful and I learned from all the replies given here. In a thousand years, I don't think I would have caught the distinction between using the hash value as a string vs. a number.

Re: Devel::Size reports different size after hash access
by Cristoforo (Curate) on Oct 25, 2016 at 22:05 UTC
    Thanks to all who gave me valuable information. The distinction between accessing a number as a string and accessing it as a number never crossed my mind. I will examine stevieb's reply more when I get back home.
