PerlMonks  

Reading Filenames

by jakeeboy (Sexton)
on Jan 04, 2002 at 05:47 UTC ( [id://136176]=perlquestion )

jakeeboy has asked for the wisdom of the Perl Monks concerning the following question:

I'm reading in filenames of the form ap_hm1_111 and ck_hm1_111, like so:

my @files = grep(/ap|ck_hm1_111/, @readdir(DIR))

The order is always alphabetical. But is it always alphabetical, or will my script read the files in a different order if I move it to a different machine?

Replies are listed 'Best First'.
(Ovid) Re: Reading Filenames
by Ovid (Cardinal) on Jan 04, 2002 at 05:58 UTC

    Not sure if the sorting is the same. However, your code is broken and shouldn't run.

                                          v-- what's that @?
    my @files = grep(/ap|ck_hm1_111/, @readdir(DIR))

    Also, your regular expression needs to have the 'ap' and 'ck' grouped properly, or else you'll match a file called 'apples.txt'. If you want the files to be sorted alphabetically (let's assume lower case), you can stick in a sort block.

    my @files = grep /(?:ap|ck)_hm1_111/, sort { lc $a cmp lc $b } readdir(DIR);
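    To see why the grouping matters, here is a minimal sketch. The filenames in @names are made up for illustration (they are not from the original post); the two patterns are the ones discussed above.

```perl
use strict;
use warnings;

# Hypothetical filenames, chosen only to illustrate the difference.
my @names = qw(ap_hm1_111 ck_hm1_111 apples.txt zz_hm1_111);

# Ungrouped: the alternation splits the whole pattern into
# "ap" OR "ck_hm1_111", so anything containing "ap" matches.
my @loose = grep { /ap|ck_hm1_111/ } @names;

# Grouped: only the two-letter prefix alternates; the suffix is always required.
my @strict = grep { /(?:ap|ck)_hm1_111/ } @names;

print "ungrouped: @loose\n";
print "grouped:   @strict\n";
```

    The ungrouped pattern lets apples.txt through; the grouped one keeps only the two _hm1_111 files.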

    Cheers,
    Ovid

Re: Reading Filenames
by wog (Curate) on Jan 04, 2002 at 06:03 UTC
    The order of files returned from readdir is highly system-dependent and cannot be relied upon. Usually, it depends on how the directory is stored on the underlying filesystem.
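    If the order matters, one option is to sort explicitly instead of trusting readdir. A rough sketch follows; sorted_entries is a hypothetical helper name, and the temporary directory and filenames exist only for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return the entries of $dir (minus . and ..) in a fixed order:
# case-insensitive first, with a case-sensitive tie-break.
sub sorted_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return sort { lc $a cmp lc $b or $a cmp $b } @entries;
}

# Demo in a throwaway directory with hypothetical filenames.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(ck_hm1_111 ap_hm1_111 ck_hm1_112)) {
    open(my $fh, '>', "$dir/$name") or die "Cannot create $name: $!";
    close($fh);
}

my @sorted = sorted_entries($dir);
print "$_\n" for @sorted;
```

    Because the sort is done in Perl, the result should not depend on how the filesystem happens to store the directory.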
Re: Reading Filenames
by jarich (Curate) on Jan 04, 2002 at 06:32 UTC
    There are two functions you can use to get the contents of a directory: glob and readdir. glob is the slower of the two, but it lets you limit which files you get back, e.g.
    my @files = glob("*_hm1_111");
    returns all files ending in _hm1_111. You need more flexibility than that, though, so readdir is probably the better choice.

    readdir, on the other hand, is very fast and returns the files in whatever order they are stored in the internal representation of your file system. This will not always be alpha/ascii-betical. glob always returns files sorted ascii-betically.

    The last difference between the two is how . (dot) files are handled. glob("*") will not return files such as .bashrc but readdir will. glob(".*") must be called if you want the . files. (In fact glob works very much like the UNIX c-shell.)
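    A small sketch of the dot-file difference, run inside a temporary directory. The names .hidden and visible are made up for the demo.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Throwaway directory holding one dot file and one ordinary file.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(.hidden visible)) {
    open(my $fh, '>', "$dir/$name") or die "Cannot create $name: $!";
    close($fh);
}

# glob("*") skips names that start with a dot.
my $old_cwd = getcwd();
chdir($dir) or die "Cannot chdir to $dir: $!";
my @from_glob = glob("*");
chdir($old_cwd) or die "Cannot chdir back to $old_cwd: $!";

# readdir returns every entry, dot files included.
opendir(my $dh, $dir) or die "Cannot open $dir: $!";
my @from_readdir = sort readdir($dh);
closedir($dh);

print "glob:    @from_glob\n";
print "readdir: @from_readdir\n";
```

    With readdir you usually want to filter out . and .. yourself before using the list.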

    Having said all of this, I'd go with Ovid's suggestion in most circumstances. However, if your directory is likely to be very large, you are only interested in a small fraction of the files, and performance is important, then I'd suggest you compare Ovid's suggestion with something like this:

    # glob returns each name with its "DIR/" prefix, so match after the last slash
    my @files = grep m{/(?:ap|ck)_hm1_111$}, glob("DIR/*_hm1_111");
    and go with whichever is faster.
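    One way to run that comparison is the core Benchmark module. This is only a sketch under assumptions: the file count, the other_N.txt filler names and the sub names via_readdir and via_glob are invented for the test, and it changes into the directory so both approaches see bare filenames. Timings will vary by machine and filesystem, so measure against your real directory.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Build a throwaway directory with 1500 hypothetical files.
my $dir = tempdir(CLEANUP => 1);
for my $i (1 .. 500) {
    for my $name ("ap_hm1_$i", "ck_hm1_$i", "other_$i.txt") {
        open(my $fh, '>', "$dir/$name") or die "Cannot create $name: $!";
        close($fh);
    }
}

my $old_cwd = getcwd();
chdir($dir) or die "Cannot chdir to $dir: $!";

# readdir approach: read everything, filter, then sort what is left.
sub via_readdir {
    opendir(my $dh, '.') or die "Cannot open directory: $!";
    my @found = sort { lc $a cmp lc $b }
                grep { /^(?:ap|ck)_hm1_111$/ } readdir($dh);
    closedir($dh);
    return @found;
}

# glob approach: let glob narrow the list (already sorted), then filter.
sub via_glob {
    return grep { /^(?:ap|ck)_hm1_111$/ } glob("*_hm1_111");
}

# Sanity check that both approaches find the same files.
my @r = via_readdir();
my @g = via_glob();
print "readdir: @r\n";
print "glob:    @g\n";

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, { readdir => \&via_readdir, glob => \&via_glob });

chdir($old_cwd) or die "Cannot chdir back to $old_cwd: $!";
```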

    Jacinta
