
Re^2: No garbage collection for my-variables

by kyle (Abbot)
on Sep 15, 2008 at 20:47 UTC


in reply to Re: No garbage collection for my-variables
in thread No garbage collection for my-variables

I don't see the current behaviour changing until someone completes a perl with a garbage collector instead of the current refcounting scheme.

The OP is saying that you can allocate a large string, let the variable go out of scope, and the memory is not freed and not reused. The memory allocated to the variable "sticks" to it even if you never use it again. (If I have this wrong, betterworld, please correct me.)

I don't see what garbage collection has to do with this. The strings in question don't have any references to them, so the reference counter shouldn't have any problem knowing that they're not in use.

I don't know what method perl uses to grow strings. The general method I recall from my CS classes was to double the size of a string when it grows out of its buffer and halve it when it shrinks to less than a quarter of the buffer size. Maybe someone more familiar with the internals can shed some light on why that wouldn't be a good design choice for Perl.
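The textbook strategy described above can be sketched as a toy model that tracks only a length and a capacity. This is an illustration of the general technique, not a description of what perl does; the class name `GrowableBuffer` and the `min_capacity` parameter are made up for the example, and the shrink step halves repeatedly rather than once.

```python
class GrowableBuffer:
    """Toy model of a string buffer that tracks only length and capacity."""

    def __init__(self, min_capacity=16):
        self.min_capacity = min_capacity  # hypothetical floor for the buffer
        self.capacity = min_capacity
        self.length = 0

    def resize(self, new_length):
        """Set the logical length, adjusting capacity by doubling/halving."""
        if new_length < 0:
            raise ValueError("length must be non-negative")
        # Grow: double until the new length fits.
        while new_length > self.capacity:
            self.capacity *= 2
        # Shrink: halve while the contents use less than a quarter of the buffer.
        while (self.capacity > self.min_capacity
               and new_length < self.capacity // 4):
            self.capacity //= 2
        self.length = new_length
        return self.capacity
```

Because the buffer shrinks only below a quarter of its capacity, a string whose length hovers around one boundary does not trigger a reallocation on every change.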

Re^3: No garbage collection for my-variables
by Joost (Canon) on Sep 15, 2008 at 20:55 UTC
    I don't see what garbage collection has to do with this. The strings in question don't have any references to them, so the reference counter shouldn't have any problem knowing that they're not in use.
    Reference counting has everything to do with it, since it means that the only time perl can free the memory is when the last reference to the scalar goes out of scope, all without knowing whether that scalar is ever going to be reused.

    That means it either has to keep it there always, or free it always (or do some kind of heuristic, which should usually mean keep it, since allocating memory is expensive, and if you're using a large string now, chances are, you'll be using a large string again some time soon).

    What perl currently cannot do is free "old, unused" scalars when it's running out of memory. It has to decide when the scalar is going out of scope. Allocating and freeing each scalar every time that happens would probably slow down the interpreter a lot.
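The trade-off between the three options (always keep, always free, or a heuristic) can be made concrete with a small simulation. This is only a model under stated assumptions: the function `simulate_calls`, the policy names, and the `threshold` parameter are invented for illustration and do not correspond to anything in the perl source.

```python
def simulate_calls(sizes, policy, threshold=1 << 20):
    """Return (allocations, bytes_retained_at_end) for a sequence of calls.

    Each call needs a buffer of sizes[i] bytes for one lexical variable.
    """
    if policy not in ("keep", "free", "threshold"):
        raise ValueError(f"unknown policy: {policy}")
    allocations = 0
    retained = 0  # capacity still attached to the variable between calls
    for size in sizes:
        if retained < size:
            allocations += 1  # must allocate (or grow) the buffer
            retained = size
        # Scope exit: the only point where the decision can be made.
        if policy == "free":
            retained = 0
        elif policy == "threshold" and retained > threshold:
            retained = 0  # hypothetical heuristic: drop only large buffers
    return allocations, retained
```

With four calls that each need 500 bytes, "keep" allocates once and retains 500 bytes afterwards, while "free" allocates four times and retains nothing. That is the cost being weighed at scope exit.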

      That means it either has to keep it there always, or free it always (or do some kind of heuristic, which should usually mean keep it, since allocating memory is expensive, and if you're using a large string now, chances are, you'll be using a large string again some time soon).

      Especially for a large string, I wonder how significant the gain from avoiding the deallocation really is, compared with the cost of filling the buffer with the string and working with it.

      Of course you're right: I shouldn't copy 500MB strings around too much, however I chose such a drastic length to make the effect clear. Even if the strings were smaller, I'd say I could use 1MB of memory for better things than storing 32 long-forgotten scalars of 32kB each (or even smaller ones). I am not really good at making up realistic scenarios, but I'm interested to know: Would you have anticipated perl's behaviour if you had just seen my code samples above?

      I'm glad that kyle seems to agree that it would be nice if perl dealt better with these unused scalars. Besides, it obviously doesn't reuse the buffer if a subroutine calls itself recursively... but I haven't tested the memory consumption for this case yet.

      Oh, I think I see what you're saying now. When the last reference goes out of scope is the only time it gets to make a decision about whether to deallocate the memory used for the variable. We don't necessarily know then whether the variable will be used again or not, or what for, so it's not a very good time to make that decision.

      I'm not sure I'm convinced that deallocating would be a bad thing. Obviously, it depends on what the program is doing. I'd be tempted to take some kind of heuristic approach, but even then I'd want to do some testing to find where the cost/benefits are.

        But it wouldn't be terribly hard to implement some improvements here.

        Perl could keep a LRU of still-allocated but unused "large" SVs and periodically free ones that haven't been re-used in the last period of time.

        Note that what might look like the easiest "fix", freeing SVs for lexical variables if they are above a certain size, could have serious drawbacks, at least on some systems. Having done something like this on Win32, I can say it can be a great way for heap fragmentation to cause your process to run out of virtual memory. Perhaps most other systems have smarter malloc()s and so aren't susceptible, but I'm not certain of that.

        It also might be tricky to pick the proper parameters for what constitutes "large" and what the right minimum duration should be before the large allocation is declared "unlikely to be re-used" and free()d.
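The LRU idea above could look roughly like the following. It is a minimal sketch, not a proposal for perl's internals: the class `UnusedBufferPool` and the parameters `large_bytes` and `max_idle_seconds` are hypothetical stand-ins for the two tuning knobs mentioned, and timestamps are passed in explicitly (and assumed to be non-decreasing) so the behaviour is easy to follow.

```python
from collections import OrderedDict


class UnusedBufferPool:
    """Tracks large buffers that are still allocated but currently unused."""

    def __init__(self, large_bytes, max_idle_seconds):
        self.large_bytes = large_bytes
        self.max_idle_seconds = max_idle_seconds
        self._idle = OrderedDict()  # name -> (size, time it became unused)

    def release(self, name, size, now):
        """Called at scope exit; small buffers are not tracked at all."""
        if size >= self.large_bytes:
            self._idle[name] = (size, now)
            self._idle.move_to_end(name)  # keep entries ordered by idle time

    def reuse(self, name):
        """Called when the variable is used again; its buffer is kept."""
        return self._idle.pop(name, None) is not None

    def sweep(self, now):
        """Free buffers idle for at least the limit; return bytes freed."""
        freed = 0
        while self._idle:
            name, (size, since) = next(iter(self._idle.items()))
            if now - since < self.max_idle_seconds:
                break  # oldest entry is still fresh, so the rest are too
            del self._idle[name]
            freed += size
        return freed
```

A periodic `sweep` only touches the oldest entries, so its cost stays small when most large buffers are being reused. Choosing good values for the two parameters is exactly the tricky part noted above.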

        - tye        

Re^3: No garbage collection for my-variables
by ikegami (Pope) on Sep 16, 2008 at 08:03 UTC

    The strings in question don't have any references to them

    Not true. The pad that refers to them when the function is being executed still refers to them when the function isn't being executed.

    It could be changed to be true, so this nit pick is not relevant to the conversation.
