PerlMonks  

Why is it good practice for a cli script to take switch args before file list?

by leocharre (Priest)
on Jul 19, 2006 at 15:16 UTC ( [id://562315]=perlquestion )

leocharre has asked for the wisdom of the Perl Monks concerning the following question:

I have a cli script that takes as arguments a file path, and optionally, some switches. I am using Getopt::Std right now.

I was having some trouble with the interface. I wanted the user to have to specify switches (-f -a -w hatever), but I wanted the script to catch any ARGV element that does not belong to a switch and take it as a file argument.

I wanted the following to be valid ways to call the script:

script ./thisfile -f -a
script -f -a ./thisfile
script -f -a ./thisfile ./thisfilealso

Getopt::Std allows me to do this, but only if I specify the switches first and then the file arguments:
metadata -f value -a nothervalue ./thisfile
But if I specify files first, the switches are ignored.

This is where I get the options..

    use Getopt::Std;

    my $self = { opt => {}, };
    getopts('f:v:i:a', $self->{opt}); # -f field -v value -i

    # what file(s) did user specify.
    # (list assignment; a scalar here would only hold the number of arguments)
    my @files = @ARGV; # only gets fed if arg flags are called via cli before the file list
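To see the behaviour described above in isolation, here is a minimal sketch. The helper name `parse_std` is made up for illustration; it just runs `getopts` with the same option string against a given argument list and reports what was parsed and what was left in @ARGV.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical helper (not part of the original script): parse a given
# argument list and return the parsed switches plus whatever is left over.
sub parse_std {
    local @ARGV = @_;            # getopts() always works on @ARGV
    my %opt;
    getopts('f:v:i:a', \%opt) or die "bad options\n";
    return (\%opt, [@ARGV]);     # parsed switches, remaining arguments
}

# Switches first: they are parsed, and the file is left in the remainder.
my ($opt, $rest) = parse_std(qw(-f author -a ./thisfile));
print "switches first: f=$opt->{f}, rest=@$rest\n";

# File first: getopts() stops at the first non-switch argument, so nothing
# is parsed and every argument is still in the remainder.
($opt, $rest) = parse_std(qw(./thisfile -f author -a));
print "file first: parsed ", scalar(keys %$opt), " switches, rest=@$rest\n";
```

In the second call the switches are not lost; they are simply still sitting in the leftover list, unparsed.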

Upon further reading, I see that it is widely recommended (for example in Getopt::Long) that one provide switch arguments first, and then a file list.
And off the top of my head, I think most cli utilities take args before paths.. Why?

I think in the case of this script I'm working on, it may be more convenient to specify the path first. The script is an interface for showing and editing metadata on a file or directory. Here are some usage examples:

metafile ./path/2/file
metafile ./path/2/file -f author -v Joe
metafile ./path/2/file -f author -v Joe -a
metafile ./path/2/file -f author

The constant is that they provide file path(s), so I feel silly asking them to turn the above examples into:

metafile ./path/2/file
metafile -f author -v Joe ./path/2/file
metafile -f author -v Joe -a ./path/2/file
metafile -f author ./path/2/file

Replies are listed 'Best First'.
Re: Why is it good practice for a cli script to take switch args before file list?
by merlyn (Sage) on Jul 19, 2006 at 15:54 UTC
    Perhaps because if switches could occur after the start of the filename list, how could you tell whether "-f" is a filename or a switch?

    Bear in mind that "-f" is a perfectly fine filename, since filenames (at least in unix) can contain any byte that isn't "\0" or "/".

    By having the restriction "all switches before any filename list", you can ensure that once you stop looking at switches (including having processed the "--" standard switch), everything that remains, no matter how whacky looking, is intended to be a filename.

    I think that's rather sane.
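A small sketch of the "--" point, assuming a made-up interface where -f takes a field name and -a is a flag. `parse_args` is a hypothetical helper; `GetOptionsFromArray` is the Getopt::Long function that parses an array instead of @ARGV.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: -f takes a field name, -a is a flag, and whatever
# remains after option processing is treated as a list of file names.
sub parse_args {
    my @args = @_;
    my %opt;
    GetOptionsFromArray(\@args, \%opt, 'f=s', 'a') or die "bad options\n";
    return (\%opt, \@args);
}

# The second "-f" is meant as a file name. The "--" ends option processing,
# so it comes back as a file rather than being parsed as a switch.
my ($opt, $files) = parse_args(qw(-f author -- -f ./thisfile));
print "field: $opt->{f}\n";
print "file:  $_\n" for @$files;
```

Without the "--", the parser would have no way to know that the second "-f" was intended as a file.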

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

Re: Why is it good practice for a cli script to take switch args before file list?
by davorg (Chancellor) on Jul 19, 2006 at 15:33 UTC

    I don't know why it's a standard (the origins are probably somewhere deep in the past of Unix time) but the important thing is that it _is_ now a standard. People used to Unix commands will expect the usual method of dealing with command line options. Changing that is likely to confuse people unnecessarily.

    Having said that, it's simple enough to change the behaviour of the Getopt::* modules. They all act on @ARGV so you just need to shift your filename off of the front of @ARGV before calling the option parsing function.

        use Getopt::Std;

        my $file = shift;
        my %args;
        getopts 'abc', \%args;

        print "File is $file\n";
        print "$_ is $args{$_}\n" for keys %args;
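The same idea can be stretched to several leading paths. This is only a sketch under assumptions: the helper name `files_then_switches` and the option string 'f:v:a' are invented here, and a leading file whose name starts with "-" would be mistaken for a switch.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical helper: peel off every leading argument that does not look
# like a switch, then let getopts() handle the rest.
sub files_then_switches {
    local @ARGV = @_;
    my @files;
    push @files, shift @ARGV while @ARGV && $ARGV[0] !~ /^-/;
    my %args;
    getopts('f:v:a', \%args) or die "bad options\n";
    push @files, @ARGV;          # files given after the switches still count
    return (\@files, \%args);
}

my ($files, $args) = files_then_switches(qw(./thisfile ./thisfilealso -f author -a));
print "files: @$files\n";
print "$_ is $args->{$_}\n" for sort keys %$args;
```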
    --
    <http://dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

Re: Why is it good practice for a cli script to take switch args before file list?
by Sidhekin (Priest) on Jul 19, 2006 at 15:35 UTC

    To (non-)answer your stated question first: I have no good answer for "why" beyond "it's the (POSIX) standard".

    The gnu tools happily ignore this standard, so I have no qualms about doing the same.

    Addressing your want: I don't think you can with Getopt::Std, but Getopt::Long can be configured to allow it, and I invariably do so. The configure option is called "permute", and is usually the default. It is also implied by the "gnu_getopt" option, which makes it behave like GNU getopt_long, and is what I normally use:

        use Getopt::Long ':config', 'gnu_getopt';
        # or, during runtime:
        # Getopt::Long::Configure('gnu_getopt');
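Applied to the metafile examples from the question, this might look roughly as follows. The helper name `parse_metafile_args` and the usage message are invented for the sketch; the option letters are the ones from the original post.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray :config gnu_getopt);

# Hypothetical helper: -f field, -v value, -a flag; the rest are paths.
sub parse_metafile_args {
    my @args = @_;
    my %opt;
    GetOptionsFromArray(\@args, \%opt, 'f=s', 'v=s', 'a')
        or die "usage: metafile [-f field] [-v value] [-a] path...\n";
    return (\%opt, \@args);
}

# Path first or path last: with "permute" in effect both parse the same way.
for my $cmdline ('./path/2/file -f author -v Joe -a',
                 '-f author -v Joe -a ./path/2/file') {
    my ($opt, $files) = parse_metafile_args(split ' ', $cmdline);
    print "f=$opt->{f} v=$opt->{v} a=$opt->{a} files=@$files\n";
}
```

Both command lines produce the same parsed result, which is the behaviour the original poster was after.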

    print "Just another Perl ${\(trickster and hacker)},"
    The Sidhekin proves Sidhe did it!

      I can't see any POSIX on that... Where do you see that?

        I can't see any POSIX on that... Where do you see that?

        Thanks for that convenient link.

        The arguments following the last options and option-arguments are named "operands".

        (Emphasis mine.)

        Also on that page, under the heading "Utility Syntax Guidelines":

        Guideline 9:
        All options should precede operands on the command line.

        "Should" and "guidelines" may not be the strongest of words, but the intent seems clear enough for me.

        print "Just another Perl ${\(trickster and hacker)},"
        The Sidhekin proves Sidhe did it!

Re: Why is it good practice for a cli script to take switch args before file list?
by Solo (Deacon) on Jul 19, 2006 at 15:31 UTC
    And off the top of my head, I think most cli utilities take args before paths.. Why?

    Certainly there is some historical technical reason for the practice. Why has it never changed? Two reasons. By having a common standard we simplify understanding and increase usability. And if it ain't broke, don't fix it.

    --Solo

    --
    You said you wanted to be around when I made a mistake; well, this could be it, sweetheart.
Re: Why is it good practice for a cli script to take switch args before file list?
by rodion (Chaplain) on Jul 19, 2006 at 21:24 UTC
    As to why options-first is the standard, it's because in the formative days of Unix, memory wasn't cheap. Neither were the CPU cycles to make a copy of a list of 500 files, or even just to go through them. Unix was exceptional in allowing long command lines, which glob() would fill up in the shell.

    It was never necessary, however, to go through the list of files twice. You just process the options and then march on through the list, processing them one at a time. Just once. So long as the options come first, it's a one pass algorithm. If you allow the options to come later, you always have to do an earlier pass to find them, because they can change how you process files earlier in the list.
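The one-pass idea can be sketched in a few lines. This is an illustration only, not how any particular utility is written: `one_pass` is a hypothetical driver that understands single-letter flags without arguments, and it hands each file to a callback the moment it is reached.

```perl
use strict;
use warnings;

# Hypothetical one-pass driver: consume the leading switches, then walk the
# remaining arguments exactly once, never looking ahead or copying the list.
sub one_pass {
    my ($handler, @args) = @_;
    my %opt;
    while (@args && $args[0] =~ /^-(\w)$/) {
        shift @args;
        $opt{$1} = 1;
    }
    shift @args if @args && $args[0] eq '--';   # optional end-of-options marker
    $handler->($_, \%opt) for @args;            # the single walk over the files
    return \%opt;
}

one_pass(
    sub {
        my ($file, $opt) = @_;
        print(($opt->{v} ? "verbose: " : ""), "processing $file\n");
    },
    qw(-v a.txt b.txt),
);
```

If a switch were allowed after b.txt, the handler for a.txt could not be called until the whole list had been scanned, which is the extra pass described above.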

    In the present age of cheap memory and CPU cycles to burn, options-first is a leftover convention, now become, ahem, one of the standards. It helps you know where to look for things. (Except when people are following other standards.)

    Note that many of the organized and consistent ways of doing things that we all like about Unix come not just from the insight and self-discipline of our technical ancestors, but from external constraints of time and space (memory space). Unix itself started as a reaction to MIT's MULTICS project, the OS that could do most anything, if you could wait a very long time. (That's where the Unix name came from.)

      This kind of explains why some compilers and print-oriented utilities intermix switches and file names: they parse them sequentially, and files are only affected by switches that precede them.
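Getopt::Long can mimic that sequential style through its special '<>' entry, which is called for each non-option argument in the order it appears (this needs "permute" to be in effect). A minimal sketch, with made-up -u and -l switches that set a mode for the files following them:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray :config permute);

# Hypothetical example: -u and -l change the mode for the files that follow.
sub plan {
    my @args = @_;
    my $mode = 'asis';
    my @plan;
    GetOptionsFromArray(\@args,
        'u'  => sub { $mode = 'upper' },
        'l'  => sub { $mode = 'lower' },
        # '<>' is invoked for every non-option argument, in command-line order
        '<>' => sub { push @plan, join(':', $mode, "$_[0]") },
    ) or die "bad options\n";
    return @plan;
}

print join(' ', plan(qw(a.txt -u b.txt -l c.txt))), "\n";
```

Each file is tagged with whatever mode was current when it was reached, so a.txt is unaffected by the -u that comes after it.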

      emc

      e^(π√−1) = −1

      Wow that makes a ton of sense.. First the program is provided with the what, then the who... one by one. Thank you so much for the insight!

Re: Why is it good practice for a cli script to take switch args before file list?
by ambrus (Abbot) on Jul 19, 2006 at 16:10 UTC

    In Unix, programs usually take switch arguments first and other arguments later. That's still the case in real unices like FreeBSD, and also in SunOS. However, GNU has changed this behaviour to accept options anywhere before the first lone double-dash argument. They haven't only made this change in their utility programs; it is also the default behaviour of the getopt function in the GNU C library (the behaviour can be overridden by the caller, but of course normal programs don't do this).

    This change causes some incompatibilities, as an argument starting with a minus after an argument not starting with a minus is parsed differently. In this respect, the change is much more obtrusive than long options using double hyphens, as that syntax was otherwise an error. Luckily, as they know this change is incompatible, GNU also gave us an escape route: just set the POSIXLY_CORRECT environment variable, and programs will go back to the traditional Unix behaviour. Note also that these incompatibilities rarely cause trouble in practice, because when you don't know if the arguments expand to something starting with a minus, you have to escape them with a double minus anyway, and that escapes all the arguments.

    (If you ask for my opinion, I find the new behaviour sometimes convenient but only rarely, so I think it might not be worth what it costs. However, I have used Linux my entire life and *BSD only briefly, so I'm quite biased. Also note that it would cause too much breakage in scripts to change the new behaviour now, so it will probably stay in GNU libc.)

    Now as for Perl programs. I've never used the Getopt::Std module, but I have used Getopt::Long, which can be configured to whatever option style you want the program to have. I use the following incantation to make the option parsing GNU-like.

        Getopt::Long::Configure "bundling", "gnu_compat", "prefix_pattern=(--|-)";
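The difference between the traditional and the GNU ordering described above can be tried out from Perl, since Getopt::Long exposes the two as the "require_order" and "permute" configuration options. The helper `leftovers` below is hypothetical and only exists to run the same arguments under both settings.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse the same arguments under a given ordering mode
# ('permute' or 'require_order') and return the flag plus the leftovers.
sub leftovers {
    my ($mode, @args) = @_;
    Getopt::Long::Configure($mode);
    my $verbose;
    GetOptionsFromArray(\@args, 'v' => \$verbose) or die "bad options\n";
    return ($verbose, @args);
}

for my $mode (qw(permute require_order)) {
    my ($verbose, @rest) = leftovers($mode, qw(file.txt -v));
    printf "%-13s verbose=%s rest=%s\n",
        $mode, ($verbose ? 'yes' : 'no'), join(' ', @rest);
}
```

Under "require_order" parsing stops at file.txt, so the trailing -v is handed back as if it were a file name, which is the traditional behaviour.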
Re: Why is it good practice for a cli script to take switch args before file list?
by swampyankee (Parson) on Jul 19, 2006 at 17:24 UTC

    I would say, pretty much regardless of O/S, most cli users are used to the form command <switches> <files>. After all, this is how most *ix, VMS, and Windows cli functions seem to work. While there are exceptions e.g., some compilers and printing-oriented utilities, and confusions, such as switches that take file names as arguments, it is normal practice.

    Whether or not "normal practice" is "best practice" is a separate issue.

    emc

    e^(π√−1) = −1
      I have a minor quibble. Windows took its cli from MSDOS, which didn't care where the options were, so long as they started with a forward slash, and still doesn't. It didn't matter if you needed two passes to parse the command line when you only allowed 128 bytes of command.

      I've never quite forgiven the young Bill Gates and company for going along with IBM's wish to use forward-slash to identify options, as IBM used on their mainframes. It forced them to use back-slash for the directory separator, and I grumble at them every time I get it wrong switching between Unix and MSWindows.

        I thought they got the idea for forward slashes for command switches from VMS. Whether they got the idea from VMS or IBM, backslashes for directory separators was a really bad idea: they're not in a standard place on a keyboard.

        emc

        e^(π√−1) = −1

Node Type: perlquestion [id://562315]
Approved by Sidhekin