
Is modifying the symbol table to redefine subroutines evil?

by tlm (Prior)
on Apr 11, 2007 at 21:47 UTC ( [id://609501]=perlquestion )

tlm has asked for the wisdom of the Perl Monks concerning the following question:

Is this evil:
sub foo {
    call_me_only_once();
    no warnings 'redefine';
    *foo = sub { ... };    # the final foo
    goto &foo;
}

?

Update: inserted the missing "no warnings" line.

the lowliest monk
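For readers who want to run the idiom, here is a minimal sketch. The bodies of call_me_only_once() and of the final foo, and the $setup_calls counter, are placeholders invented for illustration; only the overall shape comes from the question.

```perl
use strict;
use warnings;

my $setup_calls = 0;    # illustration only: counts one-time setup runs

sub call_me_only_once { $setup_calls++ }

sub foo {
    call_me_only_once();
    no warnings 'redefine';
    # Replace the symbol-table entry; later calls to foo() land here directly.
    *foo = sub { return "final foo got: @_" };
    goto &foo;    # re-dispatch this first call, with the same @_, to the new sub
}

my @results = map { foo($_) } 1 .. 3;
print "$_\n" for @results;
print "setup ran $setup_calls time(s)\n";
```

The goto &foo form matters on the first call: it hands the current @_ to the replacement, so the caller cannot tell the first call from later ones.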

Replies are listed 'Best First'.
Re: Is modifying the symbol table to redefine subroutines evil?
by Limbic~Region (Chancellor) on Apr 11, 2007 at 21:59 UTC
    tlm,
    Depends on your definition of evil and your perspective. Messing with the symbol table like this can certainly be useful. I think a more common version of call_me_only_once() for this would be:
    sub foo {
        no warnings 'redefine';
        if ($cond1) {
            *foo = sub { ... };
        }
        elsif ($cond2) {
            *foo = sub { ... };
        }
        else {
            *foo = sub { ... };
        }
        goto &foo;
    }
    The objective being to determine which version of foo() to use (perhaps XS, module, roll-your-own) only once so that subsequent calls are faster.

    Cheers - L~R
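A concrete sketch of that "pick the implementation once" use might look like the following. The function name max_of and the choice of List::Util as the preferred backend are assumptions made for the example, not part of the reply above.

```perl
use strict;
use warnings;

sub max_of {
    no warnings 'redefine';
    if (eval { require List::Util; 1 }) {
        # Preferred backend loaded: install a thin wrapper around it.
        *max_of = sub { List::Util::max(@_) };
    }
    else {
        # Fallback: roll-your-own loop in plain Perl.
        *max_of = sub {
            my $max = shift;
            for (@_) { $max = $_ if $_ > $max }
            return $max;
        };
    }
    goto &max_of;    # the decision above is never evaluated again
}

my $first  = max_of(3, 9, 4);
my $second = max_of(10, 2);
print "$first $second\n";
```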

      call_me_only_once() was meant to represent code that must be executed only on the first call to the function. I hadn't thought of the scenario you present, but I can see that the self-redefinition technique could be useful there too.

      the lowliest monk

Re: Is modifying the symbol table to redefine subroutines evil?
by perrin (Chancellor) on Apr 11, 2007 at 22:12 UTC
    Yes. Next question.

    Seriously, what's wrong with a much easier approach like this?

    our $FOO_HAS_BEEN_CALLED;
    sub foo {
        if (not $FOO_HAS_BEEN_CALLED) {
            call_me_only_once();
            $FOO_HAS_BEEN_CALLED = 1;
        }
    }
    Or maybe this code belongs in a BEGIN block.
      Yes. Next question.
      Next question: why?
      Seriously, what's wrong with a much easier approach like this?

      Somebody could mess with the package global $FOO_HAS_BEEN_CALLED and blow things up, so that call_me_only_once() gets called twice. Of course, call_me_only_once() disposing of itself properly after being called would make damn sure it isn't called a second time :-)

      --shmem

      _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                    /\_¯/(q    /
      ----------------------------  \__(m.====·.(_("always off the crowd"))."·
      ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
        Why? Because it solves a simple problem in a complex and convoluted way.

        Somebody could mess with the package global $FOO_HAS_BEEN_CALLED and blow things up

        The "somebody could mess with it" argument has never been valid for Perl. Of course somebody could mess with it! Somebody could mess with absolutely everything in your program. If you want to prevent possible accidental changes, you could make it a lexical instead and let foo() be a closure around it. But no one mentioned security as the reason for this.

        I still think it sounds like setup code that should be in some kind of BEGIN or INIT block.

        Actually, there's no reason for it to be a package variable. I would have used a lexical. And bah! If someone shoots themselves in the foot by messing with internal state variables, it's their own foot. It's not like someone could accidentally change the value of the variable.
      Seriously, what's wrong with a much easier approach like this?

      It could be much easier to read:

      if (not $FOO_HAS_BEEN_CALLED) {

      ... if it were instead:

      unless ($FOO_HAS_BEEN_CALLED) {

      Otherwise, it seems reasonable.

        Now that you mention it, I usually handle this kind of bail-out conditional with a simple test at the top:
        return if $foo_has_been_called;
        Then you don't have to indent the whole rest of the sub.
    ~dewey
        Here it is without globals:
        {
            my $foo_has_been_called;
            sub foo {
                if (not $foo_has_been_called) {
                    call_me_only_once();
                    $foo_has_been_called = 1;
                }
            }
        }

      There's nothing wrong with this approach. (I certainly hope there isn't because my own code is filled with instances of this pattern.) But it has always annoyed me to saddle a function with a test that is useful only once in its lifetime. It's a perverse sense of aesthetics, I admit it, but that is, at any rate, the motivation behind the gyrations... I was just wondering whether there were fundamental problems with the approach (other than its neurotic convolutedness).

      the lowliest monk

        There's nothing wrong with this approach. In fact, my own code is filled with instances of this pattern. But it has always annoyed me to saddle a function with a test that is useful only once in its lifetime. It's a perverse sense of aesthetics, I admit it, but that is, at any rate, the motivation behind the gyrations...

        Yeah, I'm concerned too with "this kinda things": see for example Doing "it" only once which was brought up here by Limbic~Region, but inspired by a post I made in p6l. There, the situation was somewhat lower level in that it boiled down not to a matter of mangling the symtable, but even the optree itself. Unfortunately most people didn't get it in both places, that is p6l and here: just pointing out that there are many WTDI and asking "what's wrong with this one?" Both me, LR and those who actually understood are perfectly aware that there are many ways around it, but still find them all somewhat unsatisfactory for some reason.

        To sum up what was going on in that thread, the mythological beast we were/are after is a statement modifier (or something like that) specifying that the statement itself must be executed once only, and then evaporate like it never existed, not being skipped upon a condition. There are fundamentally two categories of replies pointing out ways to do it without the mythological beast:

        1. use a counter and skip over the statement when it's positive;
        2. have a version of the loop with the statement that exits the loop when the statement is executed and then continue with a similar loop that does not contain the statement.

        In the former approach you keep checking a condition even after you know a priori that it can never be true again, and you don't want it to be. In the latter you avoid this at the price of some code duplication, which in certain circumstances may also require refactoring that would otherwise be unnecessary. Currently I would personally use the former when I'm not concerned about performance and the latter when I am, with a preference for the latter even in the first case, because I find it conceptually and aesthetically disturbing to perform unnecessary operations even if the overhead they introduce is negligible. I long for a solution that gives me the best of both on the basis of conceptual and aesthetic considerations: it should be syntactically simple like the first one and avoid unnecessary operations like the second.
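The two categories can be sketched side by side. This is a simplified illustration in which the one-time statement fires on the first iteration; the list contents and the 'init' marker are made up for the example.

```perl
use strict;
use warnings;

my @items = qw(a b c d);

# Approach 1: a counter that is tested on every iteration,
# even after it can no longer be false.
my @log1;
my $seen = 0;
for my $item (@items) {
    push @log1, 'init' unless $seen++;
    push @log1, $item;
}

# Approach 2: handle the first element separately, then run a loop
# without the one-time statement. No test remains, but the loop body
# is written twice.
my @log2;
my @rest = @items;
if (@rest) {
    my $item = shift @rest;
    push @log2, 'init';
    push @log2, $item;
}
for my $item (@rest) {
    push @log2, $item;
}

print "@log1\n@log2\n";
```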

        Somebody else, in that thread, pointed out that manipulations like those we were talking about were very common in early languages, but are now regarded as Evil™, and for good reasons that I can understand, the point still being that just like raw goto is also Evil™, Perl provides tamed forms of goto under the names of last, next and redo which are not and nevertheless make for very useful common Perl idioms. So, much in the same vein I do not long for generic means to achieve self-modifying code to be made excessively accessible, but for a tamed, specific, form of self-modifying code available as a teaspoonful of syntactic sugar simple enough to use from the typing-wise POV to make it elegant and appealing, and distinctive enough not to make it possible for one to use it inadvertently with the risk of being bitten in the neck by it.

        I was just wondering whether there were fundamental problems with the approach (other than its neurotic convolutedness).

        FWIW I find "its neurotic convolutedness" not to be terribly more relevant than the simple condition checking version.

        I can't think of a more fundamental problem with any code than being more complex than it needs to be, except maybe giving the wrong answer.
Re: Is modifying the symbol table to redefine subroutines evil?
by dewey (Pilgrim) on Apr 11, 2007 at 22:19 UTC
    I sort of like it myself-- abusable, but that's no reason to condemn it. It reminds me of make-doer in Forth (the best resource on this I've found is http://www.forthfreak.net/thinking-forth.pdf).

    Update: Actually, it's more commonly called doer/make or doer-make. The overall concept is referred to as "vectored execution" in the book I linked to. Book link fixed, thanks blazar.
    ~dewey
Re: Is modifying the symbol table to redefine subroutines evil?
by shmem (Chancellor) on Apr 11, 2007 at 22:20 UTC
    It is certainly not, but your purpose in writing that could be :-)

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: Is modifying the symbol table to redefine subroutines evil?
by NetWallah (Canon) on Apr 11, 2007 at 23:33 UTC
    Assuming code-tampering is not an issue, how about this:
    my $foo; # Global, because this is the only visible (subroutine ref)
    $foo = sub {
        print qq[First call - do first processing here \n];
        # First processing code goes here ...
        # Now setup for subsequent calls ....
        $foo = sub {
            print qq[subsequent calls\n]
            # Code for subsequent calls goes here
        }
    };
    $foo->() for 0..5;
    # OK - so the subroutine call looks slightly weird to newbies.
    Unfortunately, Murphy's laws will kick in if the caller decides to use a copy/clone of the $foo scalar, so don't do that.
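The copy problem is easy to reproduce. In this sketch the @calls log is an illustrative stand-in for the real first-call and later-call work:

```perl
use strict;
use warnings;

my @calls;
my $foo;
$foo = sub {
    push @calls, 'first';
    $foo = sub { push @calls, 'later'; return };    # rebinds $foo only
    return;
};

my $copy = $foo;    # taken before the first call: refers to the "first" sub
$foo->();           # runs the first-call code and rebinds $foo
$foo->();           # runs the replacement
$copy->();          # runs the first-call code a second time
print "@calls\n";
```

Rebinding $foo changes what that one variable points at; any other variable holding the old code reference is unaffected.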

         "Choose a job you like and you will never have to work a day of your life" - Confucius

Re: Is modifying the symbol table to redefine subroutines evil?
by educated_foo (Vicar) on Apr 12, 2007 at 05:44 UTC
    No. Perl makes this possible, and it's easy to understand (most people have probably already seen it in an AUTOLOAD). Perl's a lot more fun when you worry less about what "the authorities" will think of your code.
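For anyone who has not seen the AUTOLOAD variant, a minimal sketch follows. The package name Lazy, the %body table, and the $autoload_hits counter are hypothetical names chosen for the example.

```perl
use strict;
use warnings;

package Lazy;

our $AUTOLOAD;
our $autoload_hits = 0;
my %body = (                       # hypothetical table of lazily installed subs
    greet => sub { "hello, $_[0]" },
);

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;             # strip the package qualifier
    return if $name eq 'DESTROY';
    my $code = $body{$name} or die "No such function: $name";
    $autoload_hits++;
    no strict 'refs';
    *{"Lazy::$name"} = $code;      # install it so AUTOLOAD is skipped next time
    goto &$code;                   # finish this call in the real sub
}

package main;

my @out = (Lazy::greet('tlm'), Lazy::greet('L~R'));
print "@out (AUTOLOAD ran $Lazy::autoload_hits time)\n";
```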
Re: Is modifying the symbol table to redefine subroutines evil?
by belden (Friar) on Apr 14, 2007 at 00:33 UTC
    Modifying the symbol table can be useful: at work we used a similar trick to write Test::Resub. (Its use is slightly different from what you're doing, but you might like to give its source a quick read.)

    If call_me_only_once() really should only be called once, then why not solve the problem there instead of in its caller? (You've already seen how to enforce single execution by scoping a variable outside the sub.)

    Addressing the problem there would mean you wouldn't need to worry about bar() making a call to call_me_only_once(); presumably your program knows the answer to that question already.
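A sketch of that suggestion, with the guard living next to call_me_only_once() instead of inside each caller. The $runs counter and the bodies of foo() and bar() are placeholders for illustration.

```perl
use strict;
use warnings;

my $runs = 0;    # illustration only: counts how often the real work ran
{
    my $done;    # private to this block; only call_me_only_once() sees it
    sub call_me_only_once {
        return if $done++;
        $runs++;             # stand-in for the real one-time work
        return;
    }
}

# Callers no longer need to know whether setup has already happened.
sub foo { call_me_only_once(); return 'foo' }
sub bar { call_me_only_once(); return 'bar' }

foo(); bar(); foo();
print "one-time work ran $runs time(s)\n";
```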

    --Belden

Re: Is modifying the symbol table to redefine subroutines evil?
by doom (Deacon) on Apr 13, 2007 at 19:31 UTC
    In your example it would be mildly evil, because it's excessively complicated for what you want to do (any speed advantage you get from it is unlikely to be big enough to matter).

    In principle it isn't though -- I know one team of programmers that likes to use a trick like this to swap in mock code during testing.
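The testing trick is usually written with local, so the original sub comes back when the scope ends. The Weather package and its functions below are invented for the example.

```perl
use strict;
use warnings;

package Weather;
sub fetch_temperature { die "would hit the network\n" }    # hypothetical slow call
sub report { return "It is " . fetch_temperature() . " degrees" }

package main;

my $mocked;
{
    no warnings 'redefine';
    # Swap in mock code for the duration of this block only.
    local *Weather::fetch_temperature = sub { 21 };
    $mocked = Weather::report();
}

# Outside the block the real sub is back, so report() dies again.
my $restored = eval { Weather::report(); 1 } ? 'still mocked' : 'original restored';
print "$mocked; $restored\n";
```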

Re: Is modifying the symbol table to redefine subroutines evil?
by sfink (Deacon) on Apr 15, 2007 at 15:42 UTC
    Depends on what you want it to do. You need to be aware that:
    • It won't work if it's in a module that you're importing elsewhere (because you end up changing Module::foo, which is irrelevant to the ModuleUser::foo that was imported).

      You can change that with

      no strict 'refs';
      my $pkg = caller;
      *{ $pkg . "::foo" } = sub { ... }; # the final foo
      but only if that's what you want ("call once per user" instead of "call once globally").
    • There's no easy way to reset the trigger after it's fired. For your application, this is probably what you want, but you could imagine scenarios where it is not (eg mod_perl).
    ikegami's lexical variable version gives "call once globally" semantics, without allowing resets (which you can trivially add by introducing another subroutine that captures the same lexical.) Same for the global variable, except resetting can be done directly.
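The import pitfall from the first bullet can be shown in a single file. The BEGIN block stands in for what an Exporter-style import does, and the goto is dropped so that the return value reveals which version ran; both are simplifications for the example.

```perl
use strict;
use warnings;

package Module;
sub foo {
    no warnings 'redefine';
    *foo = sub { 'final' };    # rebinds Module::foo only
    return 'first';
}

package ModuleUser;
# Roughly what "use Module qw(foo)" does: alias the code ref at import time.
BEGIN { *ModuleUser::foo = \&Module::foo }

package main;

my @via_import = (ModuleUser::foo(), ModuleUser::foo());
my $direct     = Module::foo();
print "imported: @via_import; direct: $direct\n";
```

ModuleUser::foo keeps pointing at the original sub, so the one-time code runs on every call made through the imported name.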

    Personally, I like the idiom

    sub foo {
        our $CALL_COUNT;
        call_me_only_once() unless $CALL_COUNT++;
        ...do stuff...
    }
    But I'll admit I always wonder whether the code is going to get run 4.3 billion times and redo the init code...
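On the wrap-around worry: as far as I know, perl's ++ promotes an integer that would overflow to a floating-point value instead of wrapping to zero, so the counter should stay true. A quick check, assuming the default numeric behaviour (no "use integer" in scope):

```perl
use strict;
use warnings;

my $count = ~0;    # largest native unsigned integer on this perl
$count++;          # should be promoted to a float, not wrapped to 0
my $still_true = $count ? 1 : 0;
print "count is now $count; still true: $still_true\n";
```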
Re: Is modifying the symbol table to redefine subroutines evil?
by dragonchild (Archbishop) on Apr 15, 2007 at 16:38 UTC
    A completely different way to solve this is to use Aspect-Oriented Programming (AOP). Your problem is that you want to do something when foo is called, but before foo executes and only the first time foo executes.

    Aspect would be the Perl way to do this, but I don't know how to remove advice with it, so I'll demonstrate using Dojo. This is a Javascript library.

    function foo () { ... };
    function call_only_once () {
        dojo.event.disconnect( 'before', foo, call_only_once );
        // Do stuff here
    };
    dojo.event.connect( 'before', foo, call_only_once );
    foo();
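A rough Perl analogue of the Dojo snippet can be built on the same symbol-table trick discussed in this thread. The helper before_first_call() is hypothetical and is not the API of the Aspect module; it is only a sketch of "before advice that disconnects itself".

```perl
use strict;
use warnings;

# Hypothetical helper: run $hook before the first call to the named sub only.
sub before_first_call {
    my ($name, $hook) = @_;
    no strict 'refs';
    no warnings 'redefine';
    my $orig = \&{$name};
    *{$name} = sub {
        *{$name} = $orig;    # disconnect the advice before doing anything else
        $hook->(@_);
        goto &$orig;         # continue into the original sub with the same @_
    };
    return;
}

my @trace;
sub foo { push @trace, "foo(@_)"; return }

before_first_call('main::foo', sub { push @trace, 'setup' });
foo(1);
foo(2);
print "@trace\n";
```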

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: Is modifying the symbol table to redefine subroutines evil?
by halley (Prior) on Apr 13, 2007 at 16:49 UTC
    This approach to modifying the symbol-table after a first call is roughly the technique that we used when we implemented a deprecated module: use deprecated;
    package OldeCrufte;
    use deprecated qw(do_hack);
    # calling OldeCrufte::do_hack() will carp

    package OldeCrufte;
    use deprecated;
    # using the OldeCrufte module will carp

    --
    [ e d @ h a l l e y . c c ]
