PerlMonks  

Garbage Collection on Hash delete

by netoli (Initiate)
on Jan 12, 2003 at 12:23 UTC [id://226251]

netoli has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am using a lot of hash structures and would be interested to know whether it makes a difference to the GC how a hash is cleared/deleted.

1. Approach: assign an empty list

    %hash = ();

Question: will the GC recognize this and free the memory?

2. Approach (is this necessary?)

    foreach $key (keys %hash) { delete $hash{$key}; }

This works but is longer.
So what is the recommended approach?

Thanks for the wisdom
netoli

Replies are listed 'Best First'.
Re: Garbage Collection on Hash delete
by BrowserUk (Patriarch) on Jan 12, 2003 at 12:47 UTC

    From perlfunc:delete

    The following (inefficiently) deletes all the values of %HASH:

        foreach $key (keys %HASH) { delete $HASH{$key}; }
        foreach $index (0 .. $#ARRAY) { delete $ARRAY[$index]; }

    And so do these:

        delete @HASH{keys %HASH};
        delete @ARRAY[0 .. $#ARRAY];

    But both of these are slower than just assigning the empty list or undefining %HASH or @ARRAY:

        %HASH = ();   # completely empty %HASH
        undef %HASH;  # forget %HASH ever existed
        @ARRAY = ();  # completely empty @ARRAY
        undef @ARRAY; # forget @ARRAY ever existed
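    The relative costs can be checked with the core Benchmark module. A minimal sketch (the 1000-key hash size and iteration count are arbitrary; relative numbers will vary by perl version):

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Each sub copies a 1000-key template first, so every
# iteration clears a freshly filled hash.
my %template = map { $_ => 1 } 1 .. 1000;

cmpthese( 500, {
    delete_each  => sub { my %h = %template; delete $h{$_} for keys %h },
    delete_slice => sub { my %h = %template; delete @h{ keys %h } },
    assign_empty => sub { my %h = %template; %h = () },
    undef_hash   => sub { my %h = %template; undef %h },
} );
```

    All four subs leave the hash completely empty; they differ only in how much work perl does to get there.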

    Examine what is said, not who speaks.

    The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re: Garbage Collection on Hash delete
by Zaxo (Archbishop) on Jan 12, 2003 at 12:32 UTC

    Assignment, %hash = (); is quicker and neater. It may be neater yet to make the hash lexical and just let it go out of scope:

    {
        my %hash = ( foo => 'bar' );
        # do stuff
    }
    I generally use delete only when I know I want what remains of the hash later on.

    After Compline,
    Zaxo

      Thanks. This is a neat approach you showed.
      In my case I have a hash with a lot of entries which is used as an object, and I want to reuse the hash object afterwards.
      So I cannot use your approach.
      Any hints ?
      netoli
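      For the reuse case described above, one option (a sketch; the My::Cache class and flush method are made-up names for illustration) is to empty the hash through its reference, so the same underlying hash survives and can be refilled:

```perl
use strict;
use warnings;

# Hypothetical cache object built on a hash; the class and
# method names are illustrative only.
package My::Cache;

sub new { my $class = shift; bless { data => {} }, $class }
sub set { my ( $self, $k, $v ) = @_; $self->{data}{$k} = $v }
sub get { my ( $self, $k ) = @_; $self->{data}{$k} }

# Empty the underlying hash in place rather than replacing it,
# so any other references to it remain valid.
sub flush { my $self = shift; %{ $self->{data} } = () }

package main;

my $cache = My::Cache->new;
$cache->set( foo => 42 );
$cache->flush;                 # hash is empty, object reusable
$cache->set( bar => 1 );
```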
Re: Garbage Collection on Hash delete
by Coruscate (Sexton) on Jan 12, 2003 at 12:39 UTC

    If you are following good programming guidelines, then you don't even have to worry about this. What do I define as 'good programming guidelines' in regards to this case? Put simply: the use of subroutines. If you split up your scripts into several subroutines, as any well-behaved programmer would do, then your structures will not exist for long, as perl's garbage collection will take care of everything on its own.

    Just minimize/vaporize your use of global package variables and you'll be happier than an ape with a bunch of bananas (I was thinking along the lines of 'more fun than a barrel of monkeys', with 'happy' instead of 'fun').

    Update: As bart mentions in his reply, the actual area to focus on is scope rather than subroutines. This is what I was implying by the use of well-structured scripts with well-scripted subroutines. Subroutines themselves define a scope, so that once execution of a subroutine is complete, the scope is exited as well. IMO, subroutines are simply the best way to enforce scope, as opposed to using many { my $a = "hi"; } type blocks. As an additional comment, packages themselves are another great way of implementing scoping. If you've got a lot of code you reuse all the time, create a package out of it. A thank you goes out to bart for further clarifying.

      Actually, the focus should be on variables' scope, not on the subroutines themselves. Scope in Perl can be tuned much finer than that.
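      As a sketch of that finer tuning (the summarize routine here is hypothetical), a bare block releases a lexical as soon as execution leaves the block, well before the enclosing sub returns:

```perl
use strict;
use warnings;

sub summarize {
    my @lines = @_;

    my %count;                  # scoped to the whole sub
    $count{$_}++ for @lines;

    {
        # A bare block gives an even narrower scope: %seen
        # becomes reclaimable as soon as the block is left.
        my %seen;
        $seen{$_} = 1 for @lines;
        print scalar( keys %seen ), " unique lines\n";
    }
    # %seen no longer exists here.

    return \%count;
}

summarize(qw(a b a));    # prints "2 unique lines"
```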
Re: Garbage Collection on Hash delete
by CountZero (Bishop) on Jan 12, 2003 at 12:58 UTC

    Everything of course depends on what you mean by free the memory.

    Perl will not release any memory back to the operating system as long as the script runs.

    Of course if you are careful in using variables (esp. hashes and arrays) and "release" them one way or another as soon as they have served their purpose, Perl will not have to needlessly allocate additional memory for itself.

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

      Perl will release memory back to your C runtime library. Depending on what that is, memory may well get released back to the system. This has worked for years on VMS and MacOS, and reasonably recent versions of glibc will also release memory in some circumstances back to the system.

        Well, I have to work within a Windows environment and use pre-compiled versions of Perl, so I don't know if my version of Perl (ActiveState) is bright enough to release memory back to the pool.

        CountZero

        "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

      CountZero got the point:
      => Perl will not release any memory back to the operating system as long as the script runs.
      My script should run forever, and that's why I cannot just terminate the program or leave a block.
      Will Perl release the memory if delete is used?
      netoli
        So long as you don't use excessive amounts of memory, it shouldn't be a problem. Any way you clear the hash, the memory will be returned to Perl's pool. Even though Perl doesn't give any of its pool back to the OS, it will recycle memory from the pool for later use in the script. So if you accumulate a lot of data in the hash, the Perl process will grow by (say) 1MB, and not shrink again even if you clear the hash. However, if you fill the hash again, the Perl process will not grow again - it will simply reuse the memory it had allocated before.

        Makeshifts last the longest.

Re: Garbage Collection on Hash delete
by MarkM (Curate) on Jan 12, 2003 at 13:00 UTC

    Perl will reclaim memory for later use regardless of which way you dereference the data structures.

    If you are looking to see the memory usage drop, you may be disappointed to find that Perl does not always release memory back to the system. It does this on the assumption that you may use the memory again later.

    In terms of the 'recommended' approach, it is really your preference, although I would strongly suggest you avoid the "delete each key" approach, as it is not very efficient.

    Try out the suggestions made by other people, and see which one you feel comfortable with. Stick with it for a while. Experiment.

      Thanks all for the discussion.
      I checked it out. If you just assign an empty list it will release the memory.
      netoli
Re: Garbage Collection on Hash delete
by Elian (Parson) on Jan 13, 2003 at 18:24 UTC
    Perl will, as people have noted, generally clean up memory as need be, and you don't usually have to worry about it. As has also been noted, occasionally you may want to do a
    undef %foo;
    if %foo has had a lot of elements in it. This is because perl does some caching and, while all the elements inside %foo are freed, the data structures perl has allocated for %foo itself will not normally be freed, as perl assumes that if you stuffed a half-zillion things in it once that you'll do it again. undeffing the array/hash/scalar forces perl to dump all the cache for the variable, which can be a significant amount of memory if you've put a million elements into the hash at one point.
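    In code, the distinction looks like this (a sketch; how much memory the retained bucket array holds is an internal detail that varies across perl versions):

```perl
use strict;
use warnings;

my %big = map { $_ => 1 } 1 .. 100_000;

%big = ();     # all elements are freed, but the hash keeps its
               # (now large) internal bucket array, on the theory
               # that you will fill it to a similar size again

undef %big;    # also frees the bucket array itself, returning
               # that memory to perl's pool
```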
Re: Garbage Collection on Hash delete
by rdfield (Priest) on Jan 13, 2003 at 10:54 UTC
    A word of warning: when using deeply nested HoH type structures with hundreds of key values per hash, the memory de-allocation process can take an inordinate amount of time (40 minutes in one case I had recently).

    rdfield

      Did that happen with perl5.6 or perl5.6.1?
      Some months ago I had the same problems with perl 5.6 and 5.6.1 and decided to downgrade to perl 5.005_03 (the behaviour was the same under Solaris, Linux and Win2k (=AS 522)); RAM usage decreased from 800 MB to 500 MB, and runtime from about 1:20 to 0:30. The data structure was a hash of hashes of arrays (a simplified LDAP structure).

      But I'd like to run some tests about the behaviour of perl5.8...

      Best regards,
      perl -e "s>>*F>e=>y)\*martinF)stronat)=>print,print v8.8.8.32.11.32"

        My results were when using AS623 (5.6.0) on a 2GHz P4 with 512MB RAM. The data structure was a HoHoHoHoHoHoA (a total of approx. 2.4M elements over ~600,000 arrays), but sharing the RAM with an Oracle database (about 150MB resident in memory). Watching the data structure being destroyed, I saw 140,000+ page faults. I'm looking to move the software to a Linux box and Perl 5.8, put the DB on another box, and see how long things take.

        rdfield

      This is due entirely to a glibc bug in versions of glibc before 2.3. It had horrible performance when dealing with lots of small memory deallocations, and it will kill perl's performance in some cases. The easy solution is to upgrade your glibc if you can, which doesn't require any action for perl as it's a dynamically linked library, or build perl with perl's malloc if you can't.
        TVM. I experienced the problem with W2K, but I'll try it with a handy Linux box and get back to you.

        rdfield

Node Type: perlquestion [id://226251]
Approved by rob_au