What makes File::Find's interface so commonly hated

by demerphq (Chancellor)
on Jul 16, 2006 at 13:06 UTC ( [id://561564]=perlmeditation )

It's often said that File::Find has a horrible interface. I'm curious why people think this, and how they think it could be improved.

---
$world=~s/war/peace/g

Replies are listed 'Best First'.
Re: What makes File::Find's interface so commonly hated
by adrianh (Chancellor) on Jul 16, 2006 at 14:03 UTC
    It's often said that File::Find has a horrible interface. I'm curious why people think this, and how they think it could be improved.

    Well - I wouldn't go as far as horrible. Insufficiently abstracted maybe. I find File::Find::Rule much more useful most of the time because:

    • I don't have to mess around with long global vars like $File::Find::name
    • I don't have to write any code for the common case of collecting all the matching results
    • I have a declarative style that I find much easier to read and maintain
    • I can never remember how no_chdir, follow and follow_fast affect the behaviour of $_ and the $File::Find::* variables without referring to the docs.
    • F::F::R makes it easy for me to compose rules
    • Changing directories by default is wrong for me most of the time.
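As a rough illustration of the no_chdir point in the list above, here is a minimal sketch using only core modules. The temporary tree and the hash names are made up for the demonstration; the behaviour shown (with the default chdir, $_ is the basename; with no_chdir, $_ is the same as $File::Find::name) is what the File::Find documentation describes.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Throwaway tree so the sketch runs anywhere (layout is hypothetical).
my $root = tempdir(CLEANUP => 1);
mkdir "$root/lib" or die "mkdir: $!";
for my $rel ('a.pm', 'lib/b.pm') {
    open my $fh, '>', "$root/$rel" or die "open $root/$rel: $!";
    close $fh;
}

my (%with_chdir, %without_chdir);

# Default behaviour: find() chdirs into each directory, so $_ holds only
# the basename while $File::Find::name holds the full path.
find(sub { $with_chdir{$File::Find::name} = $_ if -f $_ }, $root);

# With no_chdir the working directory is left alone and $_ is the same
# string as $File::Find::name.
find({
    no_chdir => 1,
    wanted   => sub { $without_chdir{$File::Find::name} = $_ if -f $_ },
}, $root);

for my $name (sort keys %with_chdir) {
    print "name:        $name\n";
    print "  default:   \$_ = $with_chdir{$name}\n";
    print "  no_chdir:  \$_ = $without_chdir{$name}\n";
}
```

Note that in both cases the callback takes no arguments and the results have to be collected by hand, which is the other complaint in the list.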
Re: What makes File::Find's interface so commonly hated
by itub (Priest) on Jul 16, 2006 at 14:01 UTC
    IMO: 1) requiring a confusingly-named subroutine that doesn't take arguments, but rather must use global variables is awkward. 2) If you actually want to produce a list of filenames (which is a very common scenario), you need to do it yourself. 3) Changing the working directory by default is not very nice.
Re: What makes File::Find's interface so commonly hated
by wfsp (Abbot) on Jul 16, 2006 at 13:20 UTC
Re: What makes File::Find's interface so commonly hated
by diotalevi (Canon) on Jul 16, 2006 at 15:48 UTC

    It can't recover from failures - I've tried using it on trees where I didn't have access to everything and it stopped as soon as it couldn't enter something. It would have been useful for it to just skip that and try the next thing. It also can't find *one* thing, let me process it, then I go back get *another* thing and I process it, etc.

    The thing I didn't have permission to go into? lost+found.

    ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊
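For illustration only, the "find one thing, process it, come back for the next" style asked for above could be sketched with a closure-based iterator. Nothing here is part of File::Find: make_finder is a hypothetical helper built on opendir, it walks breadth-first, and it reports and skips any directory it cannot open rather than stopping.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper, not part of File::Find: returns a closure that
# yields one path per call and undef once the tree is exhausted.
sub make_finder {
    my @queue = @_;                       # paths still to be visited
    return sub {
        my $path = shift @queue;
        return unless defined $path;
        if (-d $path && !-l $path) {      # do not follow symlinked dirs
            if (opendir my $dh, $path) {
                my @kids = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
                closedir $dh;
                push @queue, map { File::Spec->catfile($path, $_) } sort @kids;
            }
            else {
                # Unreadable directory: report it and carry on.
                warn "skipping $path: $!\n";
            }
        }
        return $path;
    };
}

# Throwaway tree for the demonstration (layout is hypothetical).
my $root = tempdir(CLEANUP => 1);
mkdir "$root/sub" or die "mkdir: $!";
for my $rel ('one.txt', 'sub/two.txt') {
    open my $fh, '>', "$root/$rel" or die "open $root/$rel: $!";
    close $fh;
}

my $next = make_finder($root);
my @seen;
while (defined(my $path = $next->())) {
    next if -d $path;                     # the caller decides what to skip
    push @seen, $path;
}
print "$_\n" for @seen;
```

Because the caller drives the loop, it can stop early, interleave other work, or resume later by holding on to $next.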

Re: What makes File::Find's interface so commonly hated
by broquaint (Abbot) on Jul 16, 2006 at 23:19 UTC
    The problem with the interface is that it is almost entirely lacking. As the user you must provide the code to do the finding, the ignoring and the collecting, whereas the module merely provides the code to get you there. Nearly every other module in the core provides an API to do the common things one might expect. File::Find, however, rather abruptly lumps you with a single function which you must fashion into something desirable. When your code is called, the available information is limited to the filename, directory name and the file's path, with any other information you may require to be extrapolated manually. That's not to mention the complete lack of control in how your code is used, i.e. no callback, no yielding, no iteration. All this wouldn't be so bad if the code weren't so positively inextricable. Suffice it to say, what makes File::Find's interface so commonly hated is that it is an assault upon the sensibilities of anyone who encounters it, novice or wizard.
    HTH

    _________
    broquaint

Re: What makes File::Find's interface so commonly hated
by graff (Chancellor) on Jul 17, 2006 at 05:50 UTC
    I agree with the comments in the previous replies, and would add that there is also File::Finder, which provides something more like the command-line interface of the common unix "find" utility; like File::Find::Rule, this "amendment" to File::Find makes it a lot easier to come up with working code.

    But both the ::Finder and ::Rule extensions are just wrappers around the core File::Find module, and all three end up suffering from the same problem relative to using the basic "find" utility -- they are much slower, and this is the main reason why I hate File::Find and anything based on it.

    I'd much rather open a pipeline file handle running the "find" command: this utility is either native or freely available for all common OS's, it's pretty easy to use in a perl script via the file handle idiom, and it runs a lot faster -- typically by a factor of six in wallclock time.
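A minimal sketch of the pipeline file handle idiom described above, not the benchmark code itself. It assumes a "find" binary on the PATH that supports -print0 (GNU and BSD find do), and the temporary directory and file names are invented for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Throwaway directory for the demonstration (names are hypothetical).
my $root = tempdir(CLEANUP => 1);
for my $name ('a.txt', 'b.txt', 'c.log') {
    open my $fh, '>', "$root/$name" or die "open $root/$name: $!";
    close $fh;
}

# List-form pipe open: no shell is involved, so $root needs no quoting.
# Assumes a find(1) that understands -print0.
open my $find, '-|', 'find', $root, '-type', 'f', '-name', '*.txt', '-print0'
    or die "cannot run find: $!";

my @files;
{
    local $/ = "\0";                 # records are NUL-terminated
    while (my $path = <$find>) {
        chomp $path;                 # strips the trailing NUL
        push @files, $path;
    }
}
close $find or warn "find exited with status $?\n";

print "$_\n" for sort @files;
```

Reading NUL-terminated records keeps file names containing newlines intact; with a find that lacks -print0, plain -print and the default line-based reading are the fallback.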

    I posted a benchmark on File::Find four years ago, and another on File::Finder two years ago, so here's a new one for File::Find::Rule (using an example from the module's man page). All of these benchmarks show pretty much the same timing difference between the module and the system "find" utility.

    The output shows that the OS's own caching behavior gives an "unfair advantage" to F::F::R -- the "shell-find pipe" approach took 7 sec on its first iteration, and less than 3 sec on each of the remaining nine iterations. But even with the OS caching already done, F::F::R still takes between 15 and 22 sec per iteration, and puts a much heavier load on the cpu. (This is with perl, v5.8.6 built for darwin-thread-multi-2level on macosx 10.4.7; I've seen similar results on freebsd and linux.)

    If you aren't doing any really big directory trees, and/or you don't care how long it takes, using some version of File::Find is "good enough", but for serious work on a really large directory tree, it's worthwhile to take advantage of Perl's value as a "glue" language (to make efficient use of existing system resources), rather than taking advantage of these particular modules.

Re: What makes File::Find's interface so commonly hated
by revdiablo (Prior) on Jul 16, 2006 at 16:57 UTC

    Another thing I have noticed is that people seem to have an aversion to File::Find's callback interface. There is a perceived loss of control. In some cases, there is a real loss of control, but usually it is mostly perception. Consider a case like the following, though (this is a made up interface, but it's just a demonstration):

    while (my $found = $finder->next) {
        print $found->filename, "\n";
        last if $found->pathname =~ $end_condition;
        next if $found->pathname =~ $skip_condition;
        process($found);
    }

    This wouldn't be quite so nice with a callback interface, even though there's no reason it isn't possible.

      There are other issues. F.ex., with a GUI app, the callback interface is a problem as you need to be able to keep the app responsive while scanning potentially huge directory trees. Yes, it’s possible even with the callback interface, but it’s a right pain. The natural way to do this would be to launch an iterator and then collect results whenever the UI is idle; File::Find forces you to instead poll the UI for events within your wanted function, which breaks separation of concerns and doesn’t work in all scenarios.

      Makeshifts last the longest.

      Not so. I use a module whose callback returns 0 or 1 for last or next respectively.
      sub callback {
          my ($found) = @_;
          print $found->filename, "\n";
          return 0 if $found->pathname =~ $end_condition;
          return 1 if $found->pathname =~ $skip_condition;
          process($found);
          return 1;
      }

      Constants would improve readability.
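To make the remark about constants concrete, here is a small sketch. The module in question is not named, so each_found below is a hypothetical stand-in driver, and STOP and CONTINUE are invented names for the 0 and 1 return values.

```perl
use strict;
use warnings;
use constant { STOP => 0, CONTINUE => 1 };

# Hypothetical driver standing in for the unnamed module: calls the
# callback once per item and stops as soon as it returns STOP.
sub each_found {
    my ($callback, @items) = @_;
    for my $item (@items) {
        last if $callback->($item) == STOP;
    }
    return;
}

my @processed;
each_found(sub {
    my ($path) = @_;
    return STOP     if $path =~ /\.end\z/;    # plays the role of `last`
    return CONTINUE if $path =~ /\.skip\z/;   # plays the role of `next`
    push @processed, $path;
    return CONTINUE;
}, qw(a.txt b.skip c.txt d.end e.txt));

print "@processed\n";    # prints "a.txt c.txt"
```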

        As I said, this is certainly possible with a callback interface. What you've shown is a way to do that, but it's not using the standard next and last operators, so it's not quite as easy to follow or understand.

        I should probably also note that I don't have a big problem with callback interfaces generally, nor with File::Find specifically. But I can certainly say in this case, one alternative is slightly nicer than the other.

Re: What makes File::Find's interface so commonly hated
by chromatic (Archbishop) on Jul 16, 2006 at 20:42 UTC

    It often gives me the wrong thing (one element of the list) by default (unless I want to collect a list of files) by requiring me to multiply entities (create my own callback) and muck with global variables (as if the callback couldn't take arguments).
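As a sketch of what "the callback could take arguments" might look like, here is a thin wrapper over the core module. find_matching is a hypothetical name, not an existing API: it passes the full path to the predicate as an argument and returns the matches as a list, so the caller touches no globals.

```perl
use strict;
use warnings;
use File::Find ();
use File::Temp qw(tempdir);

# Hypothetical wrapper: the predicate receives the full path as an
# argument and the matching paths come back as a list.
sub find_matching {
    my ($predicate, @dirs) = @_;
    my @matches;
    File::Find::find({
        no_chdir => 1,                        # keep the working directory
        wanted   => sub {
            my $path = $File::Find::name;     # the only global access
            push @matches, $path if $predicate->($path);
        },
    }, @dirs);
    return @matches;
}

# Throwaway directory for the demonstration (names are hypothetical).
my $root = tempdir(CLEANUP => 1);
for my $name ('a.pm', 'b.txt') {
    open my $fh, '>', "$root/$name" or die "open $root/$name: $!";
    close $fh;
}

my @modules = find_matching(
    sub { my ($path) = @_; -f $path && $path =~ /\.pm\z/ },
    $root,
);
print "$_\n" for @modules;
```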

Re: What makes File::Find's interface so commonly hated
by tphyahoo (Vicar) on Jul 17, 2006 at 10:46 UTC
    Everybody who hates File::Find should try the find2perl utility that comes bundled with perl.

    find2perl > my-find-script-template.pl
    chmod u+x my-find-script-template.pl

    It's a totally boneheaded utility. As far as I can tell, all it does is output the template for a basic file finder with perl, which you can then modify per your requirements.

    Well, actually, it's a little more powerful than that, but I never figured out how to give arguments to this script. I just output and tweak.

    And yet... and yet... what a help that is!

    Try it if you haven't. :)

      It’s not that I have trouble figuring out how to use File::Find, and I doubt all the experienced programmers who are outspoken about its interface have any either. I’ve made it do a lot of things it was probably never meant to. But doing simple things is much less simple than it could be, and some of the hard things are not even possible at all – in short, the interface runs counter to general Perl philosophy.

      Makeshifts last the longest.

        Maybe more experienced people and less experienced people hate File::Find in a different way.

        ;)

        As for me, I've learned to appreciate it, like a cranky old table saw that isn't always completely predictable, but that I rely on to get the job done.

      haha, they think constants are annoying too:
      ~ $ find2perl
      =snip
      # for the convenience of &wanted calls, including -eval statements:
      use vars qw/*name *dir *prune/;
      *name   = *File::Find::name;
      *dir    = *File::Find::dir;
      *prune  = *File::Find::prune;
      =snip

      Hey, good find on the find2perl script! I like it.

      It's not what you look like, when you're doin' what you’re doin'.
      It's what you’re doin' when you’re doin' what you look like you’re doin'!
           - Charles Wright & the Watts 103rd Street Rhythm Band, Express yourself
