http://qs321.pair.com?node_id=281907

AcidHawk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I want to search for file names that conform to a certain pattern. I want to read the pattern from a config file, something like aaaa-nnnnnnnn. The config part is not the problem; the template part is. This config value should match files whose names have 4 alphabetic characters, then a -, then 8 numeric characters.

I have looked (a bit) for template modules but the majority of these are for HTML.

Some pseudo code to try and help explain.

*** Get aaaa-nnnnnnnn from config file *** I am using XML::Simple
while ($file = readdir DIR) {
    next if $file =~ /^\./ || $file not matched by $filename_template;
    *** Carry on to do something with a file that does conform...
}
The files I want to process look something like ABCD-13022003*.* or even ZZZZ-99999999*.* and I would also like to be able to change the config to something like aa-nn-aa, which would match XX-55-DD*.* files.

Has anyone any direction to point me in?

-----
Of all the things I've lost in my life, it's my mind I miss the most.

Re: Filename Template
by Abstraction (Friar) on Aug 07, 2003 at 14:48 UTC
      Further to that ...
      use File::Find::Rule;

      my $template = 'aaaa-nnnnnnnn';
      # Turn each 'a' into a letter class and each 'n' into a digit class
      (my $regex = $template) =~ s/([an])/lc($1) eq 'a' ? '[a-z]' : '[0-9]'/eig;

      my $matcher = rule(
          file => maxdepth => 1,
          name  => qr/^$regex/i,
          start => $ARGV[0],
      );
      while (defined(my $file = $matcher->match)) { ... }
      See the File::Find::Rule docs for more info.
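      The template-to-regex step can also be tried on its own, without File::Find::Rule. The following is only a sketch: the helper name template_to_regex is made up for illustration, and it assumes that a lower-case 'a' stands for one ASCII letter, 'n' for one digit, and that any other character in the template is meant literally.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a template such as 'aaaa-nnnnnnnn' into a
# compiled regex. Assumes 'a' = one ASCII letter, 'n' = one digit, and
# every other character stands for itself.
sub template_to_regex {
    my ($template) = @_;
    my $pattern = '';
    for my $char (split //, $template) {
        $pattern .= $char eq 'a' ? '[A-Za-z]'
                  : $char eq 'n' ? '[0-9]'
                  :                quotemeta($char);
    }
    # Anchored at the start only, so trailing text such as '.txt' is allowed
    return qr/^$pattern/;
}

my $re = template_to_regex('aaaa-nnnnnnnn');
for my $name ('ABCD-13022003.txt', 'ZZZZ-99999999', 'AB-13022003.txt') {
    printf "%-20s %s\n", $name, ($name =~ $re ? 'matches' : 'does not match');
}
```

      Escaping the remaining characters with quotemeta means a template containing '.' or '+' would not accidentally turn into regex syntax.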
      HTH

      _________
      broquaint

      Abstraction,
      To expound on what you have said:
      #!/usr/bin/perl -w
      use strict;
      use File::Find::Rule;

      my @files = File::Find::Rule->file()
                                  ->name( qr/^[a-zA-Z]{4}-\d{8}/ )
                                  ->in( '/' );
      It appeared at first in the docs that all you could do was file globs, but later on I saw a regex, so I think this will work.

      Cheers - L~R

      Update: Bah, broquaint beat me to it, so yes it will work. What's worse - his solution provides the ability to modify the template without modifying the regex. I guess the only thing I would have done differently is changed the character class to [a-zA-Z] instead of using /i for speed reasons - but if it runs a tad bit slower it just means you can go get more caffeine.

        even a simple readdir should do the trick

        #!/usr/bin/perl -w
        use strict;

        my $dir     = '/some/dir';
        my $file_re = qr/^(?:\.+|[a-zA-Z]{4}-[0-9]{8})$/;

        opendir(DIR, $dir) or die "opening $dir: $!\n";
        for ( grep(!/$file_re/, readdir(DIR)) ) {
            # deal with non conforming file names here
        }
        closedir(DIR);
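        The snippet above picks out the non-conforming names. For the opposite case, which is what the original question asks about, a readdir loop might look roughly like this. It is a sketch under assumptions: conforming_files is a made-up helper name, the file names are invented, and the demo runs in a temporary directory so that it is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the names in $dir that match $re,
# skipping anything that starts with a dot.
sub conforming_files {
    my ($dir, $re) = @_;
    opendir(my $dh, $dir) or die "opening $dir: $!\n";
    my @names = grep { !/^\./ && /$re/ } readdir($dh);
    closedir($dh);
    return sort @names;
}

# Demo in a throwaway directory; the file names are made up for illustration.
my $dir = tempdir(CLEANUP => 1);
for my $name ('ABCD-13022003.txt', 'ZZZZ-99999999.log', 'notes.txt') {
    open(my $fh, '>', "$dir/$name") or die "creating $name: $!\n";
    close($fh);
}

# Regex for the 'aaaa-nnnnnnnn' template, anchored at the start only
# so that names with an extension still match.
my $file_re = qr/^[a-zA-Z]{4}-[0-9]{8}/;
print "$_\n" for conforming_files($dir, $file_re);
```

        Note that this regex has no trailing $, unlike the one above, because the files in the question carry extra text after the templated part.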

        use perl;

Re: Filename Template
by larsen (Parson) on Aug 07, 2003 at 14:50 UTC
    ... the template bit is. This config value should match files that have 4 alpha characters, then a -, then 8 numeric characters. ...
    If you say regular expression instead of template maybe you'll find the light.
      To expand on this, general usage of the following terms could be defined as follows:
      • Template: A format used for describing output
      • Format: A way of abstractly expressing how something should look, as opposed to what the content should be
      • Regular Expression: A way of defining some set of matching criteria, often used for limiting or parsing input
      • Mask: A format used for constraining input
      Hopefully, that helps!

      ------
      We are the carpenters and bricklayers of the Information Age.

      The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

      Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.

        Yes. And it offers me a way to expand my previous answer. You say that a template is "a format used for describing output". I'll go further and say that a template generates a language (i.e. a set of strings). A regular expression, on the other hand, denotes a language, and so can be used to recognize whether or not a string belongs to that set. Since the original question asked for a way to recognize strings in a set, I suggested changing the terminology so as to be led to the right abstraction that Perl provides.
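        The generate/recognize distinction can be made concrete with a small sketch. Everything here is illustrative: generate is a made-up function, and the meaning of 'a' (an upper-case ASCII letter) and 'n' (a digit) is an assumption taken from the example in the question.

```perl
use strict;
use warnings;

my $template = 'aa-nn-aa';

# Generating: produce one random member of the set the template describes.
# Assumes 'a' = upper-case ASCII letter and 'n' = digit (illustrative only).
sub generate {
    my ($tpl) = @_;
    my @letters = ('A' .. 'Z');
    (my $out = $tpl) =~ s{([an])}{
        $1 eq 'a' ? $letters[ int(rand(@letters)) ] : int(rand(10))
    }ge;
    return $out;
}

# Recognizing: a regex that denotes the same set and tests membership.
my $recognizer = qr/^[A-Z]{2}-[0-9]{2}-[A-Z]{2}$/;

my $name = generate($template);
print "$name ", ($name =~ $recognizer ? 'is' : 'is not'), " in the set\n";
```

        Every string that generate produces should be accepted by the recognizer, while the recognizer can also be applied to strings that came from anywhere else, such as directory listings.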