PerlMonks  

Execution order of END/CHECK vs BEGIN/INIT

by belden (Friar)
on Jun 27, 2003 at 20:38 UTC ( [id://269734]=perlquestion )

belden has asked for the wisdom of the Perl Monks concerning the following question:

The following bit of code surprised me:

#!/usr/bin/perl

BEGIN { print "1st BEGIN\n" }
BEGIN { print "2nd BEGIN\n" }
BEGIN { print "\n" }

INIT { print "1st INIT\n" }
INIT { print "2nd INIT\n" }
INIT { print "\n" }

CHECK { print "1st CHECK\n" }
CHECK { print "2nd CHECK\n" }
CHECK { print "\n" }

print "1st code\n" ;
print "2nd code\n" ;
print "\n" ;

exit;

END { print "1st END\n" }
END { print "2nd END\n" }
END { print "\n" }

The output:
# begin output - this line not really printed by code above
1st BEGIN
2nd BEGIN
2nd CHECK
1st CHECK
1st INIT
2nd INIT
1st code
2nd code
2nd END
1st END
# end output - this line not really printed by code above

perlmod explains why we see "2nd CHECK" and "2nd END" before "1st CHECK" and "1st END":

You may have multiple END blocks within a file--they will execute in reverse order of definition; that is: last in, first out (LIFO)...Similar to END blocks, CHECK blocks are run just after the Perl compile phase ends and before the run time begins, in LIFO order.

perlmod also explains that BEGIN and INIT blocks are executed in first in, first out (FIFO) order.

But why do BEGIN and INIT blocks happen in FIFO order, and CHECK and END blocks happen in LIFO order? This is obviously a conscious design decision in the language, and must be better than all happening in FIFO order(0).

So, what gives? Any insight?

(0) - I say "must be better" because making END happen in LIFO order using -n or -p flags is referred to as a "degenerate case" by perlmod.

blyman
setenv EXINIT 'set noai ts=2'

Fixed 'readmore' - dvergin 2003-06-27

Replies are listed 'Best First'.
Re: Execution order of END/CHECK vs BEGIN/INIT
by tilly (Archbishop) on Jun 28, 2003 at 05:18 UTC
    There is indeed a reason for this.

    Suppose that you have code with an END block. Suppose that your code uses a module which also has an END block that cleans up the module's external dependencies (e.g. database connections) so that the module is then unusable.

    In which order do we want these END blocks to run? Well, we want your user code to go first, because you might save something to the database; then we want the module to unload second. Where do we think that these things will probably be placed? Well, the module use will likely be at the top of your code, and your END block at the bottom of your code. So the END block that we want to execute first is likely to be the one that we saw second. Which means that LIFO is the heuristic most likely to get it right.

    Referring to Programming Perl, 2nd Ed., page 283, they say: "You may have multiple END blocks in a file -- they will execute in reverse order of definition; that is last in, first out (LIFO). That is so that related BEGINs and ENDs will nest the way you'd expect, if you pair them up." Which is another variation of what I described. You can pair the initialization and cleanup code either in the file or in modules, and later code (which might expect to have the earlier initialized stuff there) will be entirely nested between the two.
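
    The nesting described above can be sketched in a few lines. This is only an illustration: FakeDB is a hypothetical module, inlined in a BEGIN block to stand in for a "use FakeDB;" at the top of the program, and its save routine and connection flag are made up for the example.

```perl
use strict;
use warnings;

# Stand-in for "use FakeDB;" at the top of the program. FakeDB is a
# hypothetical module, inlined here so the sketch is self-contained.
BEGIN {
    package FakeDB;
    our $connected = 1;    # "connect" while the module is loading

    sub save {
        my ($row) = @_;
        die "FakeDB: connection already closed\n" unless $connected;
        print "FakeDB: saved $row\n";
        return 1;
    }

    # Module cleanup: after this runs, the module is unusable.
    END { $connected = 0; print "FakeDB: connection closed\n" }
}

# User code, further down the file.
my @pending = ('row 1', 'row 2');

# Defined after FakeDB's END block, so it runs before it (LIFO)
# and can still use the connection.
END {
    FakeDB::save($_) for @pending;
    print "main: pending rows flushed\n";
}

print "main: working\n";
```

    Run as a script, this should print the two saves and the flush message before "FakeDB: connection closed". With FIFO ordering the module would close its connection first and the user's END block would die in save.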

    Sure it seems odd. But it is what is most likely to do The Right Thing. (Which Perl always tries to do.)

    UPDATE: I had said that the Camel's case was both a special case, and a generalization of what I described. That didn't make much sense - it is a variation.

      Like the man said, there had to be a reason:) I would dispute that it is a good one though. Apart from the lack of intuativity (Is that a word?), there are just so many ways to break this.

      The heuristic can be summed up as:

      • I want this piece of code in my main program to be the very last thing executed, so where do I put it?
      • Well, before anything that you want to execute before it.
      • So, I put it at the top of the program?
      • Well, no. If you do that then it will be executed after the END blocks in any packages you use which might not be what you want, so put it at the top of the program, except after any use statements for packages that might have END blocks that need to execute after your END block.
      • But how do I know if the modules I use need to execute their END blocks after I execute mine?
      • Read the source Luke. And hope that the authors thought to document the need.
      • Come to think of it, why might they need to do that?
      • Well, the module might have class data shared by all its instances that needs to be cleaned up. This cannot be done as a part of any individual instance's DESTROY method, so it has to be done in an END block.
      • Ah! But that gives me a problem. My program is converting some irreplaceable flat-file legacy data to DB format. I want to ensure that the file gets deleted after it has been successfully input, and that is what I am going to do in my END block, but I need to ensure that the data has been successfully flushed to the DB first. If there is any chance that the connection will fail and the data I stored is lost or corrupted, then I don't want to delete the file. How do I handle this?
      • You've got backups of the files haven't you:)

      A bit contrived, but it still seems more than slightly weird to me. Sort of makes me hanker for the simplicity of old-time BASIC's line-numbered code:)


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


        Have to disagree. Always found END block behaviour completely intuitive. Indeed, it never occurred to me that anybody would want it to work in any other way.

        I don't often have multiple END blocks, but when I do I want them to run LIFO. Either because later blocks need the context that earlier blocks clean up, or because early blocks need some state that is only known to later blocks.

        Yes, you can come up with situations where they don't do what you want - but I would argue that would be because you're trying to use them for the wrong sort of thing ;-)

        It is a heuristic. Which is to say that it is a fancy way of saying that it doesn't really work. If you want to be a real control freak, it is possible to manufacture cases in which you want operations to go in any order you could possibly ask for, and there is no way for Perl to meet every theoretically possible case.

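        In fact if you want some real room for foot shootage, just look at exec and C-level exits such as POSIX::_exit. If someone happens to include those in the code, your END blocks don't get to run at all... (Perl's own exit still runs them, as the exit in the original post shows.)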
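
        Which exit paths actually skip END blocks can be checked by running each one in a fresh perl process. A minimal sketch, assuming a system where list-form pipe opens work (any Unix-like perl); end_block_ran is a hypothetical helper written for this example, not part of any module. Per perlfunc, Perl's own exit runs END blocks before leaving, while exec and POSIX::_exit do not.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Run a tiny program in a fresh perl process and report whether its
# END block ran. $last_statement is how that program terminates.
sub end_block_ran {
    my ($last_statement) = @_;
    my ($fh, $file) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    print {$fh} <<'CHILD', "$last_statement\n";
$| = 1;                          # unbuffered, so no output is lost
END { print "END ran\n" }
print "body ran\n";
CHILD
    close $fh or die "close failed: $!";

    open(my $pipe, '-|', $^X, $file) or die "cannot run child perl: $!";
    my $output = do { local $/; <$pipe> } // '';
    close $pipe;
    return $output =~ /END ran/ ? 1 : 0;
}

my @cases = (
    [ 'exit'         => 'exit 0;' ],
    [ 'exec'         => 'exec $^X, "-e", "1";' ],
    [ 'POSIX::_exit' => 'use POSIX (); POSIX::_exit(0);' ],
);

my %ran;
$ran{ $_->[0] } = end_block_ran( $_->[1] ) for @cases;

printf "%-13s END block %s\n", $_, $ran{$_} ? 'ran' : 'was skipped'
    for sort keys %ran;
```

        The report should show that only the plain exit lets the END block run. Signals that kill the process outright are a fourth way to lose END blocks, but they are not exercised here.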

        The heuristic is that your BEGIN blocks are for initializations that you want to happen early, and END blocks are for final cleanup. Furthermore anything that appears after you might need your functionality, therefore your initializations need to happen before them, and your cleanup has to occur after them. Therefore BEGINs are FIFO and ENDs are LIFO.

        Now to your bullets.

        "I want this piece of code in my main program to be the very last thing executed, so where do I put it?"
        What is your reason for wanting it to be last? Control of program flow? END blocks are not meant to be part of normal program flow. If you try to use them for what they aren't meant to do, it is no surprise that you can cause yourself confusion and pain. If not control of program flow, then what? Well probably cleanup. In which case see above. You put it after any initializations that you want available in your END block, either in your code or in modules that you load.

        "Well, before anything that you want to execute before it."
        Regular code that appears after it will execute before it (if it executes at all). You only need to worry about its placement vs other END blocks. And there it mostly does the right thing.

        "So, I put it at the top of the program?"
        You place it at the point in the program where it is obvious that it will need to be run eventually. Which is generally directly after any initializations that it needs to clean up, and we like this because putting related code together makes synchronization errors less likely.

        "Well, no. If you do that then it will be executed after the END blocks in any packages you use which might not be what you want, so put it at the top of the program, except after any use statements for packages that might have END blocks that need to execute after your END block."
        Did you want to insist on it executing last, or merely to clean yourself up? END blocks have been thought through as a way to clean yourself up. I have yet to see a complaint about them in practice. (Now if you want to complain, go take a look at the calling of DESTROY in global destruction. Every so often I need to explain to people why that messed up and tell them how to fix it by doing their cleanup in an END block...)

        "But how do I know if the modules I use need to execute their END blocks after I execute mine?"
        You should not need to know whether they have END blocks or not. Place your END after your initialization, and your initialization after loading any desired functionality, and your END blocks normally will have whatever functionality is reasonable. However there is an interesting case here when the functionality that you need might be AUTOLOADed at runtime, and the AUTOLOAD might add an END block of its own. In this case you will want to be very careful to make sure that the functionality that you need is exercised before Perl sees your END block. Which means that you either make sure the initialization is in a BEGIN block, or put the initialization in regular code and wrap your END in an eval. But note that I have yet to see someone ask a question indicating that they tripped up on this, and even in this pathological case the principle of putting the END block right after all initializations have happened is precisely what you want to do.

        "Read the source Luke. And hope that the authors thought to document the need."
        If the authors used END blocks as intended, correct usage will be obvious. (Just put your END right after your initializations and don't worry about it...) If the authors chose to miscode their module and put an Easter egg in the END without warning, well this is but the smallest of ways in which bad code can cause problems for the person using it.

        "Come to think of it, why might they need to do that?"
        Normally because they have some state that they want to be properly cleaned up?

        "Well, the module might have class data shared by all its instances that needs to be cleaned up. This cannot be done as a part of any individual instance's DESTROY method, so it has to be done in an END block."
        Gotcha alert. Your instances might want to use that data in their own DESTROY methods. But they might be in global variables somewhere that is not cleaned up until global destruction, which happens after END blocks. If this is a problem (I have seen it be occasionally) then the module will want to also manage all of its instances and finalize them during the END phase. (If you want access to virtually any other data, including your own internal variables, then you want to do this. Ilya does have a patch which is in 5.8 IIRC which has a heuristic that mostly gets global destruction right, but it isn't perfect.)
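
        One way a module could "manage all of its instances and finalize them during the END phase" is to keep weak references to them. The following is a sketch under assumptions, not a recipe from any particular module: Tracked, finalize and finalize_all are hypothetical names, and the only library calls used are weaken and refaddr from Scalar::Util.

```perl
use strict;
use warnings;

{
    package Tracked;    # hypothetical class, for illustration only
    use Scalar::Util qw(weaken refaddr);

    my %live;           # weak references to every live instance
    our @finalized;     # names, in the order they were finalized

    sub new {
        my ($class, $name) = @_;
        my $self = bless { name => $name, done => 0 }, $class;
        # Weak, so the registry does not keep the object alive.
        weaken( $live{ refaddr($self) } = $self );
        return $self;
    }

    # Idempotent: safe to call from END and again from DESTROY.
    sub finalize {
        my ($self) = @_;
        return if $self->{done}++;
        push @finalized, $self->{name};
        return;
    }

    sub finalize_all {
        $_->finalize for grep { defined } values %live;
        return;
    }

    sub DESTROY {
        my ($self) = @_;
        $self->finalize;
        delete $live{ refaddr($self) };
    }

    # Finalize survivors while the class data is still intact,
    # instead of leaving them to global destruction.
    END { finalize_all() }
}

our $global = Tracked->new('global');     # survives until global destruction
{
    my $scoped = Tracked->new('scoped');  # destroyed at the end of this block
}
print "finalized so far: @Tracked::finalized\n";
```

        The scoped object is finalized by its own DESTROY as soon as it goes out of scope. The global one is still alive at END time, so the END block finalizes it there, and the later DESTROY during global destruction becomes a no-op.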

        "Ah! But that gives me a problem. My program is converting some irreplaceable flat-file legacy data to DB format. I want to ensure that the file gets deleted after it has been successfully input, and that is what I am going to do in my END block, but I need to ensure that the data has been successfully flushed to the DB first. If there is any chance that the connection will fail and the data I stored is lost or corrupted, then I don't want to delete the file. How do I handle this?"
        You have irreplaceable data which you are going to allow to be automatically deleted by possibly buggy code in the middle of execution? That would seem to be your biggest problem right there... But we shall suppose that the coder has good reasons for wanting to do this (umm..you are out of space and management refuses to buy backup media, OK, attempting to live with a PHB, I sympathize), how do you accomplish the act? Well in that unfortunate case I would decide on how I am going to track success/failure, and then in my END block, wrap my unlink in an if ($is_success) {...} block.
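
        A minimal sketch of that guarded unlink. The temp file stands in for the legacy file, and load_into_db and remove_source are hypothetical helpers invented for the example; a real loader would return true only once the database has confirmed the commit.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for the irreplaceable legacy file (a temp file here).
my ($fh, $legacy_file) = tempfile();
print {$fh} "legacy record\n";
close $fh or die "close failed: $!";

my $is_success = 0;    # flipped only after the load is confirmed

# Delete the source only when the caller vouches for the load.
sub remove_source {
    my ($ok, $file) = @_;
    return 0 unless $ok;
    return unlink($file) ? 1 : 0;
}

# Runs at program exit, whether we got there normally or via die.
END { remove_source($is_success, $legacy_file) }

# Hypothetical loader: a real one would insert the rows, commit, and
# return true only after the database has confirmed the commit.
sub load_into_db {
    my ($file) = @_;
    return -s $file ? 1 : 0;
}

$is_success = 1 if load_into_db($legacy_file);
```

        If anything dies before the flag is set, the END block still runs but leaves the file alone.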

        "You've got backups of the files haven't you:)"
        Before doing anything automatic and possibly nasty with data, I insist on having backups. I know I am human. I have messed up often enough to not trust myself, and I definitely know better than to trust someone else who has not yet learned to take proper precautions.

        In summary: it is a heuristic. It can theoretically go wrong. But I have yet to see the order of execution of END blocks fail to do what is desired in real code if the END block is placed directly after the initialization that it cleans up. Unlike, say, global destruction. Or even the ability of people to unexpectedly eliminate the END phase with a C-level exit or an exec. (If you use END blocks, make sure to plead with the sometime C coders to not call exit...)

      <blink><blink>

      Wow. Perfect explanation. Thanks. ++tilly

      blyman
      setenv EXINIT 'set noai ts=2'
Re: Execution order of END/CHECK vs BEGIN/INIT
by ctilmes (Vicar) on Jun 27, 2003 at 20:49 UTC
    Think about nested objects. You create A, then using A, you create B. When you go to destroy B, you may call some A methods, then finally destroy B, so you want A to still exist while B is finalizing itself.

    Always do "startup" stuff in FIFO and "shutdown" stuff in LIFO and everything nests nicely.

      I'd think that timely destruction of nested objects would be taken care of by DESTROY{}'s happening in the right order. Can you expand your point? I don't see how it applies to END{} blocks...

      blyman
      setenv EXINIT 'set noai ts=2'

        I think he was making an analogy.

        If module A "uses" module B, then you want module A's BEGIN block to run before module B's BEGIN block. When cleanup happens, you want module B's END block to run before module A's END block.

        Update: ... I thought i knew what i was talking about .. but i clearly didn't, i'm not even sure what i was trying to say anymore. so i retract it all (except for the analogy part)

        If the objects survive to global destruction, there's no guarantee of destruction order. Perl just sweeps the arenas and blows everything away in whatever order it finds things in.
Re: Execution order of END/CHECK vs BEGIN/INIT
by BrowserUk (Patriarch) on Jun 28, 2003 at 04:51 UTC

    I have to agree with you that this is a strange decision. BEGINs happening FIFO is obviously correct. The first one encountered is executed first, the second next etc.

    With ENDs, intuitively, you would think that the last one in the file would be the last one executed, the second from last, second from last etc. I.e. in LILO (last in, last out) order.

    I wonder if this comes down to someone missing the fact that LILO is actually the same as FIFO, and not FILO?

    I'd rate this as a bug. No amount of documentation would persuade me this is correct. The fact the behaviour changes when certain command line switches are applied further persuades me that it is wrong.

    If I ever need to make use of multiple END or CHECK blocks, I might even consider trying to find a way of applying the -n or -p switch to the script and 'disabling' the effects within the script so as to get the correct behaviour of the CHECK/END blocks:)


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


Node Type: perlquestion [id://269734]
Approved by traveler
Front-paged by traveler