Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Good metrics for Memoization

by Trizor (Pilgrim)
on Aug 08, 2007 at 20:36 UTC ( [id://631401]=perlquestion: print w/replies, xml ) Need Help??

Trizor has asked for the wisdom of the Perl Monks concerning the following question:

Last night in #perl vhold and I got to talking about a module that automatically determined good memoization candidates and took the appropriate action.

I'm now seriously considering writing this module, but I'm having some trouble coming up with what should be memoized, as opposed to the Memoize docs which tell what shouldn't be memoized.

So Monks, I put it to you: what are measurable criteria for memoization? Memoizing everything that passes the don't memoize tests seems impractical, inefficent, and an overzealous application of memoization.

As an aside, I'm open to module name suggestions because I'm not too good at naming things...

Update Fixed typo in title

Replies are listed 'Best First'.
Re: Good metrics for Memoization
by suaveant (Parson) on Aug 08, 2007 at 20:55 UTC
    What you are talking about is nearly impossible, since just because a function returns the same data for the same parameters 99 times, it doesn't mean it will on the hundredth...

    Take for example a function today_plus_x in a long running script... all of today today_plus_x(1) returns 2007/08/09... but after midnight it will return 2007/08/10... assume someone is calling this 1000 times a day... obviously it could use some help, but blanket memoization isn't going to do it.

    Probably your best be would be something like Devel::DProf, something which analyzes your running code and wraps all function calls... at the end you print out a list of the calls that had multiple calls that returned the same data for the same inputs, maybe combined with how long a typical call takes, then you can suggest memoizations, but it'll have to be up to the programmer to decide if it is appropriate on a case by case basis.

                    - Ant
                    - Some of my best work - (1 2 3)

      I fail to see the impossibility of this task. With B you can examine the optree of a given sub and see what it does, its calls, variable accesses, complexity, returns. It would be very easy to see that today_plus_x depended upon variable data from another sub. The classifier would simply examine the calls made and note that today_plus_x calls something not in an explicit list of safe-to-call subs (a time function of some sort) and then throw it out. The problem I face is that simply excluding subs will result in too broad an application of memoization, so I need reasons to include a function in memoization rather than exclude it.

        In a language such as Haskell where you can identify impure functions by their type signatures, you know which ones you can memoize and which you can't. Perl's not quite as easy.

        You might be looking for data flow analysis, which is not as easy as it sounds. (For example, proving terminating conditions in the general case is a hard problem.) You can make certain approximations, though.

        Well... good luck. :) It just seems to me that the realm of Perl programming is too varied and creative to be mapped in any easy way... you may write something that works in 99% of cases but that 1% of misses will be good enough to keep it out of production code.

        I am interested to hear the take of those more learned than I however... I am by no means an expert on the subject of Perl internals.

                        - Ant
                        - Some of my best work - (1 2 3)

Re: Good metrics for Memoization
by BrowserUk (Patriarch) on Aug 09, 2007 at 06:23 UTC

    Lots of subroutines get called many times but with different parameters, or maybe just one different parameter each time. Any auto-memoisation of such subroutines would just accumulate space and never provide benefit.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://631401]
Approved by Joost
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (2)
As of 2024-04-20 02:18 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found