http://qs321.pair.com?node_id=137848

Prince99 has asked for the wisdom of the Perl Monks concerning the following question:

Hail fellow Monks,
I am currently trying to read through directories recursively. I have this down, but I came across a weird situation that I thought you may be able to help me with. The code for reading the directories is basic, but it works:
    my $dir = "../data/ifx/reports/";
    opendir (DIR, $dir) or die "cannot opendir $dir";
    foreach my $file (readdir(DIR)) {
        if(($file eq ".") || ($file eq "..")) {}
        else {
            print $file."<BR>";
            &Read_Next_Dir ($file);
        }
    }
    closedir (DIR);

    sub Read_Next_Dir {
        my $param = $_[0];
        my $dir = "../data/ifx/reports/".$param."/";
        opendir (DIR, $dir) or die "cannot opendir $dir";
        foreach my $file (readdir(DIR)) {
            if(($file eq ".") || ($file eq "..")) {}
            else {
                print "---->";print $file."<BR>";
                &Read_Report_Dir ($file, $param);
            }
        }
        closedir (DIR);
    }

    sub Read_Report_Dir {
        my @report_array;
        my $report_name = $_[0];
        my $param = $_[1];
        my $dir = "../data/ifx/reports/".$param."/".$report_name."/";
        opendir (DIR, $dir) or die "cannot opendir $dir";
        foreach my $file (readdir(DIR)) {
            if(($file eq ".") || ($file eq "..")) {}
            else {
                print "--------->";
                print $file."<BR>";
                push(@report_array,$file);
            }
        }
        foreach my $unsorted_file(@report_array) {
            print $unsorted_file;
        }
        @report_array = sort {$a cmp $b} @report_array;
        foreach my $sorted_file(@report_array) {
            print $sorted_file."<BR>";
        }
        closedir (DIR);
    }
The results it prints out are as follows:
    103
    ---->137
    --------->Y2001.pdf
    --------->Q2001-1.pdf
    --------->Q2001-2.pdf
    --------->M2001-12.pdf
    --------->M2001-11.pdf
    --------->M2001-01.pdf
    Y2001.pdfQ2001-1.pdfQ2001-2.pdfM2001-12.pdfM2001-11.pdfM2001-01.pdfM2001-01.pdf
    M2001-11.pdf
    M2001-12.pdf
    Q2001-1.pdf
    Q2001-2.pdf
    Y2001.pdf
    ---->142
    --------->M2001-01.pdf
    --------->M2001-11.pdf
    --------->M2001-12.pdf
    --------->Q2001-1.pdf
    --------->Q2001-2.pdf
    --------->Y2001.pdf
    M2001-01.pdfM2001-11.pdfM2001-12.pdfQ2001-1.pdfQ2001-2.pdfY2001.pdfM2001-01.pdf
    M2001-11.pdf
    M2001-12.pdf
    Q2001-1.pdf
    Q2001-2.pdf
    Y2001.pdf
    107
    ---->137
    --------->M2001-01.pdf
    --------->M2001-11.pdf
    --------->M2001-12.pdf
    --------->Q2001-1.pdf
    --------->Q2001-2.pdf
    --------->Y2001.pdf
    M2001-01.pdfM2001-11.pdfM2001-12.pdfQ2001-1.pdfQ2001-2.pdfY2001.pdfM2001-01.pdf
    M2001-11.pdf
    M2001-12.pdf
    Q2001-1.pdf
    Q2001-2.pdf
    Y2001.pdf
    ---->142
    --------->M2001-01.pdf
    --------->M2001-11.pdf
    --------->M2001-12.pdf
    --------->Q2001-1.pdf
    --------->Q2001-2.pdf
    --------->Y2001.pdf
    M2001-01.pdfM2001-11.pdfM2001-12.pdfQ2001-1.pdfQ2001-2.pdfY2001.pdfM2001-01.pdf
    M2001-11.pdf
    M2001-12.pdf
    Q2001-1.pdf
    Q2001-2.pdf
    Y2001.pdf
I guess my question is: why do the two bolded results (the unsorted file lists for 103/137 and 103/142) differ? The files in the directories are the same. Shouldn't they be read in the same order each time I do a readdir()?
Any help would be appreciated.

Prince99

Too Much is never enough...

Replies are listed 'Best First'.
Re: Working through directories
by gav^ (Curate) on Jan 11, 2002 at 02:47 UTC
    Any reason for ignoring the very useful File::Find?
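
    For readers who have not used File::Find, here is a minimal sketch of how it might replace the hand-written nesting in the question. The sub name list_tree is made up for illustration, and the preprocess hook is used here only to sort each directory's entries before they are visited, so the order within a directory no longer depends on the filesystem.

```perl
use strict;
use warnings;
use File::Find;

# list_tree is a hypothetical helper name, not part of File::Find.
# It returns the full path of everything found below $top.
sub list_tree {
    my ($top) = @_;
    my @found;
    find(
        {
            # Sort the names readdir() returned for each directory
            # before File::Find walks them.
            preprocess => sub { sort @_ },
            wanted     => sub {
                return if $_ eq '.';    # skip the starting directory itself
                push @found, $File::Find::name;
            },
        },
        $top,
    );
    return @found;
}

# Only runs when a directory is given on the command line.
if (@ARGV) {
    print "$_<BR>\n" for list_tree($ARGV[0]);
}
```

    Called with the reports directory from the question as its argument, this should print every entry beneath it. Entries within one directory come out in sorted order, but the order across the whole tree is whatever File::Find's traversal produces.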
Re: Working through directories
by rje (Deacon) on Jan 11, 2002 at 02:54 UTC
    I asked a co-worker, and we seem to think that file placement within a directory is based on the OS's strange preferences; we don't know the rhyme or reason for it. For example, try this on UNIX:

    ls -lf
    This lists a directory's contents without sorting. Note the apparent arbitrariness of the file order... so, even if there are identical files in two directories, there's no guarantee that a directory listing will return them in the same order.
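
    The same comparison can be made from Perl. The sketch below is only an illustration: raw_entries is a made-up helper that returns names exactly as readdir() hands them back, so they can be printed next to a sorted copy.

```perl
use strict;
use warnings;

# raw_entries is a hypothetical helper: it returns a directory's entries
# in whatever order readdir() yields them, minus "." and "..".
sub raw_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "cannot opendir $dir: $!";
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return @entries;
}

# Only runs when a directory is given on the command line.
if (@ARGV) {
    my @raw    = raw_entries($ARGV[0]);
    my @sorted = sort @raw;
    print "readdir order: @raw\n";
    print "sorted order:  @sorted\n";
}
```

    Running it against two directories that hold the same file names may or may not show the same readdir order; only the sorted line is guaranteed to match.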

    That's the best we can offer for an explanation.

    rje

Re: Working through directories
by hatter (Pilgrim) on Jan 11, 2002 at 03:59 UTC
    Although perl normally makes things pretty for the programmer, file listings aren't kept by the OS in any particular sort order. To sort it would require extra processing, and a lot of users of readdir don't care if it's sorted or not, or even don't want to read all of a directory listing, so it makes sense for perl not to do the extra work by default.

    If you'd structured your code to read the dir into an array, and then work through that array doing work on each entry, then it'd be trivial to sort the list items however you want. Also, it'd allow you to deal with any errors reading the dir more flexibly (should you want to produce no output, or a specific error if there is a problem with readdir()). This is generally a good programming practice if you want to reuse code, or want to change where a program sources its data from.
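
    A rough sketch of that structure, under some assumptions: sorted_entries and walk are made-up names, a lexical directory handle replaces the shared DIR bareword, and a failed opendir is reported with warn rather than die so the caller can carry on. The arrow prefix is only similar to the one in the original post, not identical.

```perl
use strict;
use warnings;

# Read a whole directory into an array and sort it.
# Returns undef (leaving $! set) if the directory cannot be opened,
# so the caller decides how to report the problem.
sub sorted_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or return;
    my @entries = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return \@entries;
}

# Walk the tree recursively and return one output line per entry,
# prefixed with an arrow whose length grows with the depth.
sub walk {
    my ($dir, $depth) = @_;
    my $entries = sorted_entries($dir);
    unless ($entries) {
        warn "cannot opendir $dir: $!\n";
        return;
    }
    my @lines;
    for my $name (@$entries) {
        my $path = "$dir/$name";
        push @lines, ('-' x (4 * $depth)) . ($depth ? '>' : '') . $name;
        push @lines, walk($path, $depth + 1) if -d $path;
    }
    return @lines;
}

# Only runs when a directory is given on the command line.
if (@ARGV) {
    print "$_<BR>\n" for walk($ARGV[0], 0);
}
```

    Because each directory is sorted before it is processed, two directories holding the same file names produce the same listing, and the one recursive sub handles any depth instead of needing a separate sub per level.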

    the hatter

Re: Working through directories
by jarich (Curate) on Jan 11, 2002 at 04:47 UTC
    For further information look here where this was discussed along similar lines.
Re: Working through directories
by screamingeagle (Curate) on Jan 11, 2002 at 02:59 UTC
Re: Working through directories
by particle (Vicar) on Jan 11, 2002 at 03:09 UTC
    rje is right. you'll need to do a sort to get consistently sorted results.

    ~Particle