Re: On timely destruction?

by jepri (Parson)
on Aug 28, 2002 at 03:05 UTC ( [id://193343] )


in reply to On timely destruction?

Could someone please explain why counting references is so bad? I have heard many people bag out Perl's GC but I don't really see the problem. I appreciate the 'circular references' issue, but it seems a fairly minor one, dealt with by avoiding circular refs or manually breaking the circle. How often do people use circular refs?

I take it there is no clever way to spot a circular reference?

This is all in contrast to my Lisp manual, in which I just read "Don't worry about occasional pauses, it's just the GC firing". Eeek!

Does delayed destruction give some advantage like reusing the memory more quickly?

____________________
Jeremy
I didn't believe in evil until I dated it.

Replies are listed 'Best First'.
Re: Re: On timely destruction?
by Elian (Parson) on Aug 28, 2002 at 04:33 UTC
    Reference counting has a number of problems:
    1. Circular garbage leaks
    2. It's horribly error-prone
    3. It requires a lot of code
    4. It's slow
    5. It blows your cache
    To take those in more detail:

    1 It's not tough to end up with circular structures at the user (i.e. the application programmer) level. This means that either the program leaks, or every application that might generate circular structures has to break them by hand. (There's a small Perl sketch of such a cycle right after this list.)

    2 For refcounting to work right, every increment must have a matching decrement. Historically (which is to say, in every release of perl I've seen) this doesn't happen. Sometimes it's only a few variables, sometimes it's a lot of variables. Either way, there's leakage. It also means that every extension writer must get refcounting right. For anything other than trivial extensions, they don't.

    3 To support refcounting, each and every time a variable is inserted into a container or removed from a container the refcount must be changed. This is a lot of code. Go browse through the perl 5 source (and don't forget about extensions) to see how often this happens. This is essentially makework code--it does almost nothing useful, and is entirely bookkeeping. (The second sketch after this list shows the count on a variable changing as it goes into and out of a container.)

    4 Reference counting is the slowest form of garbage collection for all but the most extremely degenerate code. Reference counting requires you to spend time touching each variable at least twice. That's all your variables, not just the ones that are still alive. Tracing garbage collectors (which include the mark & sweep, compacting, and generational style collectors) generally only touch live data. Yes, it's only a quick increment or decrement, but it's a lot of them. Refcount schemes usually spend about twice as much time doing garbage collection work as tracing collectors do.

    5 Refcounting makes the internal data structures and code less dense, and that means that your program spends more time waiting on main memory. (Fetching data from RAM can cost up to 200 cycles on some systems.) Your code is less dense, since you have refcount twiddling code all through it. Your data is also less dense, since you need space for refcounts in it. And even worse, it means more of your data structure has to be touched when it's dealt with. This screws your L1 cache. Generally data comes into your CPU's L1 cache in small chunks--8 or 16 bytes. You want to use as few of these chunks as you can since, while they're the fastest to access (usually a few cycles at most), there isn't much L1 to go around. Loading a cache line can cost 10 or 20 cycles if the data you're looking for is in your L2 cache, and 200 or more if it isn't. The less memory that needs to be touched, the more efficiently the cache is used and the faster your program runs.
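
    To make point 1 concrete, here's a minimal Perl sketch (the variable names are mine, not anything from this thread): two structures that point at each other keep each other's count above zero, so under pure refcounting neither is ever freed unless the programmer breaks the cycle, e.g. with Scalar::Util::weaken.

        use strict;
        use warnings;
        use Scalar::Util qw(weaken);

        {
            my $parent = { name => 'parent' };
            my $child  = { name => 'child'  };
            $parent->{child}  = $child;     # parent holds a ref to child...
            $child->{parent}  = $parent;    # ...and child holds one back
            # When the block exits, both refcounts would still be >= 1: a leak
            # under pure refcounting unless the cycle is broken by hand:
            weaken($child->{parent});       # the back-link no longer counts
        }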
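
    And for point 3, you can watch that bookkeeping from Perl itself. Here's a small sketch using the core B module, which reports the interpreter's real counts (again, the names are mine): the count on a scalar goes up when a reference to it is pushed into an array, and back down when it's popped off.

        use strict;
        use warnings;
        use B ();   # core module; exposes the interpreter's refcounts read-only

        my $value = 'hello';
        sub refcount { B::svref_2object($_[0])->REFCNT }

        printf "before push: %d\n", refcount(\$value);
        my @container;
        push @container, \$value;    # inserting into a container bumps the count
        printf "after push:  %d\n", refcount(\$value);   # one higher than before
        pop @container;              # removing it drops the count again
        printf "after pop:   %d\n", refcount(\$value);   # back where it started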

    Tracing collectors, which is what parrot's using, generally have different characteristics.

    • No mainline code does anything with refcounts or other 'liveness' bookkeeping data. That means the mainline code is denser, with more of it actually dedicated to executing your program.
    • Internal data structures are either smaller (if the GC bookkeeping fields are kept out of band) or at least use less cache (since while still in the structure they aren't touched and thus fetched into L1 cache).
    • There's less code dedicated to GC itself and, while a tracing garbage collector isn't exactly trivial, it's not a big deal either. More to the point, the code required to do the GC is collected together in one single spot, rather than sprinkled throughout the entire codebase. (It took me only a few hours to write the first cut of Parrot's GC.)
    • Tracing collectors take less time to execute. Since they make more efficient use of the cache (there's only a small chunk of code doing GC, so it gets pinned in L1 cache while it runs, and it generally restricts itself to a smallish chunk of memory when it runs so it gains some cache locality that way) you waste less time on the memory bus, and since tracing GCs only look at the live data, their runtime's proportional to the amount of live data you have, generally a much smaller number than the number of data objects you've allocated since the last GC run.
    So, tracing collectors are generally faster, less error-prone, have less code, and put less burden on the people writing the core and extensions in C. The one (and only) advantage that refcounting has is deterministic destruction. (Assuming no bugs, of course) Other than that it's a massive pain to deal with for the internals folks.
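
    For the curious, here's a toy mark & sweep pass in plain Perl (nothing to do with Parrot's actual collector, just the shape of the idea, with names I made up): start from a root set, follow pointers to mark everything reachable, then sweep away whatever was never visited. The marking work is proportional to the live cells, and an unreachable cycle simply never gets marked, so it's collected for free.

        use strict;
        use warnings;

        # A toy heap: each "cell" holds the ids of the cells it points to.
        my %heap;      # id => { refs => [ids...], marked => 0 }
        my @roots;     # ids the program can reach directly
        my $next_id = 0;

        sub alloc {
            my (@refs) = @_;
            my $id = $next_id++;
            $heap{$id} = { refs => [@refs], marked => 0 };
            return $id;
        }

        sub collect {
            # Mark phase: walk only what's reachable from the roots.
            my @work = @roots;
            while (@work) {
                my $id = pop @work;
                next if $heap{$id}{marked};
                $heap{$id}{marked} = 1;
                push @work, @{ $heap{$id}{refs} };
            }
            # Sweep phase: anything unmarked is garbage, cycles included.
            for my $id (keys %heap) {
                if   ($heap{$id}{marked}) { $heap{$id}{marked} = 0 }
                else                      { delete $heap{$id} }
            }
        }

        # A circular pair that nothing points to: refcounting would leak it,
        # but the mark phase never reaches it, so the sweep frees it.
        my $x = alloc();
        my $y = alloc($x);
        $heap{$x}{refs} = [$y];

        @roots = (alloc());    # one genuinely live cell
        collect();
        printf "cells left: %d\n", scalar keys %heap;   # 1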

    All of that is why we don't like refcounts. :)

    BTW: Your LISP book's numbers may well be really, really old. Much of GC's bad reputation for pauses comes from benchmarks run 20 or more years ago, on VAX 11/780 machines with 2M of RAM and an RA60 as a swap disk. These days things run rather a lot faster. A quick run of one of parrot's benchmarks shows it did 43K GC runs in 22 seconds on my machine, in addition to the sample program's real work. That's not bad... (Though, yes, it does argue that the sample program needs more work to generate less garbage.)

      Thanks for the detailed explanation. Now I can see what you are trying for.

      To put the quote in context, it came from the manual for the Lisp machine on my Palm Pilot. Which is probably only slightly faster than the VAX you quote in your example.

      ____________________
      Jeremy
      I didn't believe in evil until I dated it.

        The palm's a bit faster than an old 11/780. (Which is more than 25 years old now) It's also possible that the Lisp you're using considers the entire memory part of its root set, which can slow things down a lot, as can poor structure allocation policies. (Parrot's dodged both those issues)

        It may also just be that the GC on that version of lisp sucks, and the folks writing it are used to long pauses while running. :)

      Assorted comments:

      I've read for years that the field of gc is quite mature, and some killer algorithms exist (way beyond simple mark&sweep).

      How is access kept local? Chasing all the pointers would hit all that main memory, even if the marks are collected somewhere central.

      I like the idea of incremental garbage collection, so there will be no pauses.

      What about the case of local variables that never have their ref taken, and are not inside a closure that itself gets "taken"? If detected at compile time, those could be kept on the stack most efficiently and not generate garbage at all.

      On a machine with VM, you must figure out when you're "out" of memory! I can use gobs more than the machine has, transparently to the program. Perhaps monitor page fault frequency? Best to check for garbage before a page gets swapped out at all.

      —John

        In order:
        • Yes, there are a number of algorithms well past the old mark & sweep. Throwing their names around isn't much use, since most people are barely (if at all) familiar with m&s, let alone generational or threaded collectors of various sorts.
        • Base structures are all allocated out of arenas, and most pointers to other base structures come from structures in the arena, so we're not walking through all of memory, instead usually being restricted to the arenas.
        • We don't currently have an incremental collector, though there's no reason we can't have one in the future. (The easy stuff first, after all... :)
        • The compilers are certainly within their rights to explicitly destroy variables. Given how introspective Perl can be, though, that's not always wise. (With, say, caller %MY peeking and such; there's a sketch of that sort of peeking right after this list.)
        • We allocate memory out of pools. Out of memory for us is when we have no more pool space, not when the system has no more memory handy. (Memory usage can be restricted this way as well, but that's a side benefit)
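
        To illustrate that introspection point (this isn't from the thread: caller %MY was a proposed Perl 6 feature, and PadWalker is a CPAN module, not core, but the idea is the same): a called routine can reach back and look at its caller's lexicals, so the compiler can't safely assume an apparently-dead lexical is unobservable.

            use strict;
            use warnings;
            use PadWalker qw(peek_my);   # CPAN module, not core

            sub snoop {
                # Look one level up the call stack at the caller's lexical pad.
                my $pad = peek_my(1);
                printf "caller's \$secret is: %s\n", ${ $pad->{'$secret'} };
            }

            sub victim {
                my $secret = 'still alive';
                # $secret is never used again in this sub, but snoop() can
                # still see it, so it can't simply be destroyed early.
                snoop();
            }

            victim();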

      What a wonderfully clear response. Might I suggest a quick cut and paste into the Parrot FAQ?

        Good point. I'll get it PODded up and added to the FAQ.
