PerlMonks  

Re^4: Summing numbers in a file

by jcb (Parson)
on Jun 01, 2020 at 02:59 UTC (id://11117543)


in reply to Re^3: Summing numbers in a file
in thread Summing numbers in a file

That recent discussion involved "pseudo-lexical" file handles using local. I agree that that is a bad idea, but maintain that bareword file handles are reasonable in top-level code. (Modules do not normally contain top-level code.)

While the typo-catching features of use strict are helpful, you should not be using file handle names that are that easily confused in the first place. In particular, FH is suitable for examples of I/O code, but should not be used in real programs as a global. Global file handles should have meaningful names. For example, I recently wrote code that imports a text-format package manifest into a database; the file is read using a handle named MANIFEST.

> bareword filehandles clash with package names

I presume that is the origin of the convention of always writing global file handles in all UPPERCASE, since package names are (with few exceptions, like UNIVERSAL) always mixed-case (or lowercase for pragmas) by convention?

Re^5: Summing numbers in a file
by haukex (Archbishop) on Jun 01, 2020 at 06:17 UTC

    First of all, note that all of this is in the context of what advice to give wisdom seekers. You're of course free to code however you like.

    > That recent discussion involved "pseudo-lexical" file handles using local.

    I know, but many of the issues discussed still apply. And again, I'll point out that lexical filehandles solve all of the issues discussed here. I'll also ask the same thing as I did in that thread: I've named some disadvantages, what are the advantages that you see to using bareword filehandles?

    > bareword file handles are reasonable in top-level code. (Modules do not normally contain top-level code.)

    The issue is not where the code is, i.e. whether it's "top-level" or not, it's action at a distance: a module may load another module that may load another module that may do something that clashes with a global the main code is using; those issues are not fun to debug.
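
    As a rough illustration of that kind of clash, here is a minimal sketch. It assumes the helper and the calling code end up in the same package (here, main); the sub names make_file, count_lines_bareword, count_lines_lexical and second_line_after are made up for this example.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Create a temporary file holding the given lines and return its name.
    sub make_file {
        my (@lines) = @_;
        my ($fh, $name) = tempfile(UNLINK => 1);
        print {$fh} map { $_ . "\n" } @lines;
        close($fh) or die "close: $!";
        return $name;
    }

    # Hypothetical helper that happens to reuse the bareword name IN.
    sub count_lines_bareword {
        my ($path) = @_;
        open(IN, '<', $path) or die "open $path: $!";
        my $n = () = <IN>;
        close(IN);
        return $n;
    }

    # Same helper with a lexical handle, which the caller cannot see.
    sub count_lines_lexical {
        my ($path) = @_;
        open(my $in, '<', $path) or die "open $path: $!";
        my $n = () = <$in>;
        close($in);
        return $n;
    }

    # Read the first line, call the helper, then try to read the second line.
    sub second_line_after {
        my ($helper, $main_file, $other_file) = @_;
        no warnings qw(closed unopened);   # keep the demonstration quiet
        open(IN, '<', $main_file) or die "open $main_file: $!";
        my $first = <IN>;
        $helper->($other_file);
        my $second = <IN>;   # undef if the helper reopened and closed IN
        close(IN);
        return $second;
    }
    ```

    With these definitions, calling second_line_after with \&count_lines_bareword should return undef, because the helper reopened and then closed the caller's IN, while passing \&count_lines_lexical should return the second line of the first file.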

    > While the typo-catching features of use strict are helpful, you should not be using file handle names that are that easily confused in the first place.

    Sorry, but how is this argument different from "you don't need strict as long as you don't make typos"?
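
    To make the comparison concrete, the following sketch compiles two hypothetical snippets, each containing the same kind of typo in a handle name. The names MANIFEST, manifest.txt and compiles are assumptions for illustration only; string eval is used just so that a compile-time error can be observed without aborting the script, and the snippets are wrapped in a sub so they are compiled but never run.

    ```perl
    use strict;
    use warnings;

    # Bareword handle with a typo in the read: MANFIEST instead of MANIFEST.
    my $bareword_typo = q{
        open(MANIFEST, '<', 'manifest.txt') or die "open: $!";
        my $line = <MANFIEST>;
        close(MANIFEST);
    };

    # Lexical handle with the same typo: $manfiest instead of $manifest.
    my $lexical_typo = q{
        open(my $manifest, '<', 'manifest.txt') or die "open: $!";
        my $line = <$manfiest>;
        close($manifest);
    };

    # Return 1 if the snippet compiles under strict, 0 otherwise.
    sub compiles {
        my ($code) = @_;
        local $@;
        local $SIG{__WARN__} = sub { };   # silence compile-time warnings
        my $sub = eval "use strict; use warnings; sub { $code }";
        return defined $sub ? 1 : 0;
    }
    ```

    Under these assumptions, compiles($bareword_typo) is true, since the misspelled bareword is simply treated as another (unopened) handle and at best produces a warning, while compiles($lexical_typo) is false because strict rejects the undeclared $manfiest at compile time.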

    Let me pull together several quotes from your replies in this subthread and add some emphasis to try to point out a theme:

    > In this example, I would say that the problem is not the use of a global file handle, but the main script placing its code into package Foo and calling frobnicate incorrectly.

    > The use of subroutine prototypes would either make the bug in frobnicate obvious...

    > The real problem in the contrived example is calling a subroutine with the wrong number of arguments.

    > which is normally limited to the main script because modules typically provide subs but do not execute code upon loading ... Modules do not normally contain top-level code.

    > please do not actually do that in production code, or at least very clearly document

    > ... you should not be using file handle names that are that easily confused in the first place.

    > In particular, FH is suitable for examples of I/O code, but should not be used in real programs as a global.

    > Global file handles should have meaningful names.

    > I presume that is the origin of the convention of always writing global file handles in all UPPERCASE, since package names are (with few exceptions, like UNIVERSAL) always mixed-case (or lowercase for pragmas) by convention?

    Of course the normal convention is that everyone should write correct, bug-free code! ;-P

    Update: Just to be clear, the theme I see here is that you seem to be placing a lot of expectations on people to write correct code, when simply using lexical filehandles easily provides protection from the issues.

    (By the way, prototypes are often discouraged now except when used to change how subroutine calls are parsed.)

    Speaking of your other post:

    > In a case where the file handle is intended to be an "environment parameter" to a subroutine, global file handles are the only option

    Sorry, but I don't get this - what do you mean by an "environment parameter"? And I very strongly disagree with "only option".

      > a module may load another module that may load another module that may do something that clashes with a global the main code is using; those issues are not fun to debug

      Those issues are especially hard to debug because modules are not supposed to do that: each module should be in its own package (or be closely coordinated with any other components that share a package) and each package has its own "global" namespace, including bareword file handles. (But I have already said indirectly that modules should be using lexical file handles, except, for example, a logging module that opens the log file as a global handle in its package.)

      > "you don't need strict as long as you don't make typos"

      Strictly (pardon the pun), that is correct, but Murphy's Law says that the typo you do make will drive you crazy when it happens if you rely on that. :-)

      > you seem to be placing a lot of expectations on people to write correct code, when simply using lexical filehandles easily provides protection from the issues

      As a polyglot programmer who often works in other languages that simply do not have those protective features (there is no "use strict" in Awk or Bourne-family shells, for two examples), I have come to see those expectations as routine because, in languages that do not require variable declarations, they are.

      I suspect that there is some limit of human attentiveness, such that a small set of "watch these carefully" is workable, but as that set expands, the risk of typos increases. In Awk or shell, this can effectively be an upper limit on the size of a program.

      > what do you mean by an "environment parameter"? And I very strongly disagree with "only option"

      An implicit parameter passed via a variable with dynamic extent, such as a variable declared special in Common Lisp or a global variable in Perl. Such usage is rarely a good idea (at least in Perl), but can sometimes be necessary to work around badly-designed API limitations and pass needed information to a callback procedure, although a combination of closures and function currying might work in most cases, at the expense of being even harder to debug.

        > there is no "use strict" in Awk or Bourne-family shells

        In larger bash scripts (which should be turned into Perl scripts, really) I always start with

        set -eu

        and put all code into a main() function, so that I can declare all variables with local. It's not as strict as strict, but it helps a lot.

        > modules should be using lexical file handles, except, for example, a logging module that opens the log file as a global handle in its package

        Even there, a global filehandle is not necessary - a lexical declared in the module will do exactly the same job.
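
        A minimal sketch of that alternative, assuming a hypothetical module named My::Logger with made-up subs open_log, log_msg and close_log. The handle lives in a file-scoped lexical, so code outside the file cannot reach it by name:

        ```perl
        package My::Logger;    # hypothetical module name
        use strict;
        use warnings;

        # File-scoped lexical: visible to the subs below and nothing else.
        my $log_fh;

        # Open the log file for appending.
        sub open_log {
            my ($path) = @_;
            open($log_fh, '>>', $path) or die "Cannot open $path: $!";
            return;
        }

        # Append one line to the log.
        sub log_msg {
            my ($msg) = @_;
            die "log is not open" unless $log_fh;
            print {$log_fh} "$msg\n";
            return;
        }

        # Close the log and forget the handle.
        sub close_log {
            if ($log_fh) {
                close($log_fh) or die "close: $!";
            }
            undef $log_fh;
            return;
        }

        1;
        ```

        Callers would use My::Logger::open_log($path), then My::Logger::log_msg(...), without ever touching a package-global handle.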

        > I suspect that there is some limit of human attentiveness, such that a small set of "watch these carefully" is workable, but as that set expands, the risk of typos increases.

        Definitely - but this seems to be an argument for lexical filehandles rather than global ones.

        > ... can sometimes be necessary to ... pass needed information to a callback procedure, although a combination of closures and function currying might work in most cases, at the expense of being even harder to debug.

        I disagree with these two bits - I think the usage of globals can be avoided 99.9% of the time (or more) through proper API design (and yes, I make the same exception for existing legacy APIs), and also I disagree with it being harder to debug; issues arising from incorrectly used globals are IMHO much more annoying to debug. But again, the problem usually arises more in larger programs rather than shorter ones.
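
        For illustration, a sketch of the closure approach, assuming a hypothetical callback-only API named for_each_item that offers no way to pass extra arguments through. The helper make_writer is also made up for this example:

        ```perl
        use strict;
        use warnings;

        # Hypothetical legacy-style API: calls $callback->($item) for each
        # item and provides no slot for additional arguments.
        sub for_each_item {
            my ($callback, @items) = @_;
            $callback->($_) for @items;
            return;
        }

        # Build a callback that closes over a lexical filehandle.
        sub make_writer {
            my ($fh) = @_;
            return sub {
                my ($item) = @_;
                print {$fh} "item: $item\n";
            };
        }

        # Usage: write into an in-memory file instead of a real one.
        my $buffer = '';
        open(my $out, '>', \$buffer) or die "open: $!";
        for_each_item(make_writer($out), qw(a b c));
        close($out) or die "close: $!";
        ```

        Here the callback carries its own handle, so nothing has to be passed through a global; after the loop, $buffer holds one "item: ..." line per element.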
