PerlMonks  

File::Find::Closures (at last, some code re-use)

by brian_d_foy (Abbot)
on Sep 30, 2004 at 18:58 UTC ( [id://395445]=perlmeditation )

For a long time I've wanted to create a library of ready-to-use callbacks to give to File::Find::find(). I have never liked the idea that I had to code the accumulation of the filenames myself if I did not process them immediately. This sort of thing seemed like it could use a shot of code re-use mojo.

So I created File::Find::Closures to hold all of those sorts of callbacks. The functions in File::Find::Closures have names like find_by_regex(), find_by_min_size(), find_by_name(), and so on. Each one returns two things: the callback, which I give directly to find(), and a reporter closure that can access the list of files the callback accumulated.
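A minimal sketch of that two-closure pattern, using only core modules. The factory name my_find_by_regex() is hypothetical and the body is an assumption about how such a factory could be written, not code copied from File::Find::Closures. The demo builds its own throwaway directory so it can run anywhere.

```perl
use strict;
use warnings;
use File::Find ();
use File::Temp qw(tempdir);

# Hypothetical factory in the style described above: it returns the
# callback for find() first and the reporter second.
sub my_find_by_regex {
    my ($regex) = @_;
    my @files;    # shared only by the two closures below

    # find() sets $_ to the current basename and $File::Find::name
    # to the full path of the entry being visited.
    my $wanted   = sub { push @files, $File::Find::name if /$regex/ };
    my $reporter = sub { return @files };

    return ( $wanted, $reporter );
}

# Demo: create a temporary directory holding three empty files.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(Foo.pm Bar.pm notes.txt)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

my ( $wanted, $reporter ) = my_find_by_regex(qr/\.pm\z/);
File::Find::find( $wanted, $dir );

my @found = sort { $a cmp $b } $reporter->();
print "$_\n" for @found;
```

Because @files lives in the factory's lexical scope, every call to the factory yields an independent pair of closures, so several searches can run without sharing state.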

I thought Andy Lester was going in this direction with File::Find::Wanted, but I think he had something else in mind. Randal came close to what I wanted with File::Finder, but without the re-use.

Maybe this is just one of those things that everyone ends up writing for himself.

So far, File::Find::Closures is a developer release, so you won't be able to get it through CPAN.pm. It has no prereqs (other than File::Find, which it uses in the tests). I've written everything so other people can easily write their own functions to add to this module, and I hope people send me cool ones that I can include. Create a cool function and it may make it into an upcoming The Perl Journal article.

--
brian d foy <bdfoy@cpan.org>

Replies are listed 'Best First'.
Re: File::Find::Closures (at last, some code re-use)
by kelan (Deacon) on Sep 30, 2004 at 19:17 UTC

    I have never used File::Find, so I may be incorrect, but a brief reading through the POD of File::Find::Closures revealed a potential typo. In the synopsis, you have this:

    my( $list_reporter, $wanted ) = find_by_name( qw(README) );
    But in the line immediately after the heading "The closure factories" you say this:
    Each factory returns two closures. The first one is for find(), and the second one is the reporter.
    It seems your example and your description are at odds.

      Drats. The example is out of date, and I will fix that in the next upload.
      --
      brian d foy <bdfoy@cpan.org>
Re: File::Find::Closures (at last, some code re-use)
by SpanishInquisition (Pilgrim) on Oct 01, 2004 at 14:37 UTC
    One thing I really like about merlyn's module is that it supports the filter chaining idiom (which I used on a C++ project a couple of years ago). That, and it has completely replaced File::Find for me. Also it doesn't have to import anything and has a really cool object interface.

    How did I use Filter Chaining and still get reuse of my filters? My C++ module kept returning FilterCollection objects with each cascading application of a filter until the method was terminated with an apply statement.

    Thus I could build filters like:

    FilterCollection* fc = (new FilterCollection(rootObj))
        ->isType("foo")
        ->hasProperty("bar", 5150);
    FilterCollection* fc2 = fc->hasProperty("baz", 812)->apply();
    for (int i = 0; i < fc2->size(); i++) {
        // do something with fc2->elementAt(i)
    }

    Now that isn't Perl, but you can see how Merlyn's module can be used to support reuse, while still keeping the totally gnarly and radical filter chaining idiom. If you needed that, I'd rather see an optional mode on Merlyn's module, or File::Finder2 or something, rather than a procedural interface that discards this killer idiom -- if reuse is what you are after, maybe the wheel doesn't need reinvention. I assumed that by reuse you meant efficient reuse of the filter results, as if you just wanted extra file checking methods inside of Merlyn's filter chain you could have added them.

    (Long optional OT aside: The meta question here is -- was this module necessary for CPAN, especially when it is unfinished? It should probably have stayed on a developer machine. CPAN is kinda crowded and confusing as it is, and I don't like to see "alpha" modules up there, as that is just more to sort through, especially when the alpha module could have contributed to another project. As for why this module was created when close alternatives exist, this is a FOSS problem -- forking -- but it's a problem which Linux and GPL apps tend to manage pretty well (patches contribute upstream, etc). CPAN, however, doesn't resist forking as well...so we end up with a lot of modules that do the same thing. Sometimes it's good, but it's very rough on a user trying to find a module. This meta-question is one that we have not solved ... and probably won't ... but a little bit of education and end-user perspective may help stem the tide of CPAN bloat. These could have been made File::Finder compatible too -- such that people could have the best of both worlds. Why was that decision made?)

    Disclaimer: in my actual implementation, each filter does contain references to the data that successfully passed through it, for the purpose of passing them to the next filter. The filters are still reusable -- you just pay for them when you build each progression. If you want rules, though, you can easily see how this could be used to build a cascading engine, where apply() works the original list down each stage of filters and nothing is generated until apply() is called. You could also allow apply() in the middle of a chain, selectively choosing when you do or do not evaluate the filter chain, e.g. FilterCollection->hasProperty("grok",2)->apply() would be iterable, but you could then take the resultant FilterCollection from that apply() call and either reset() it or just tack on more calls, which would also add to the chain. Perhaps tacking an additional progression onto an already applied chain would automatically reset the filter.
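A minimal Perl sketch of the deferred variant described above, where nothing is evaluated until apply() is called. The FilterChain class and its where() method are hypothetical names invented for this illustration; they are not part of File::Finder or of the C++ code discussed here.

```perl
use strict;
use warnings;

package FilterChain;

# Hypothetical class: stores the source items and the pending predicates.
sub new {
    my ( $class, @items ) = @_;
    return bless { items => [@items], filters => [] }, $class;
}

# Returns a new chain with one more predicate, leaving the receiver
# untouched so it can serve as the base for other chains.
sub where {
    my ( $self, $predicate ) = @_;
    return bless {
        items   => $self->{items},
        filters => [ @{ $self->{filters} }, $predicate ],
    }, ref $self;
}

# Runs the original list through each predicate in order. No filtering
# happens before this call.
sub apply {
    my ($self) = @_;
    my @result = @{ $self->{items} };
    for my $predicate ( @{ $self->{filters} } ) {
        @result = grep { $predicate->($_) } @result;
    }
    return @result;
}

package main;

my $base  = FilterChain->new( 1 .. 20 );
my $evens = $base->where( sub { $_[0] % 2 == 0 } );
my $big   = $evens->where( sub { $_[0] > 10 } );

print join( ' ', $big->apply ),   "\n";
print join( ' ', $evens->apply ), "\n";
```

Since where() copies the predicate list instead of modifying it, extending $evens into $big does not change what $evens->apply returns. That is the reuse property the chaining idiom is after.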

    Dude, I love Computer Science...

Re: File::Find::Closures (at last, some code re-use)
by rinceWind (Monsignor) on Oct 01, 2004 at 16:41 UTC
    The limitations of the API to File::Find were one of the reasons for writing File::Wildcard. In cases like directory tree expansion, I much prefer an iterator to a callback.

    I also provide an "all" method which returns all the files as a list, which can be post-processed by the likes of map and grep.

    --
    I'm Not Just Another Perl Hacker
