http://qs321.pair.com?node_id=357183

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I have a bit of a problem with this script. It seems to be working happily, searching through all subfolders for a file ending .9. But when it prints to the output file, it appends all the other subfolders containing .9 contents to the file. So naturally the file is huge? Any suggestions?
find (\&process, $folder);

sub process {
    if ($_ =~ /\.9$/) {
        print "\nProcessing the fort.9 files, from taylor type to z values\n";
        open (FILE, '<', $_) or die "Cannot open file: $!";
        $/= "# input for";
        while (<FILE>) {
            #do something
            push(@outLines, $_ );
        }
        close FILE;
        open ( OUTFILE, ">adjusted.learn" ) or die "Cannot open file: $!";
        print ( OUTFILE @outLines );
        close ( OUTFILE );
    }
}
exit;

Re: Working with Subfolder
by tachyon (Chancellor) on May 28, 2004 at 11:53 UTC

    So naturally the file is huge? Any suggestions?

    First, whitespace is your friend, i.e. learn how to format and indent your code.

    Second, every time you find a file that ends in .9 you overwrite the file 'adjusted.learn' (if there is more than one in that dir) with the '#do something' processed content of the .9 file. So there is absolutely no possibility whatsoever of 'adjusted.learn' ever getting bigger than the largest 'do something' munged .9 file in that dir, unless you are doing something to make that so.

    Here is your code. I have formatted it and changed a few minor things. Have a look at the changes. Besides the whitespace there are useful error messages like "can't read (this file name in this dir) because of this reason", rather than what your code would generate, i.e. "cannot read file: access denied". That sort of error is useless. We know there is a problem. But what the hell is it! If you write code like you have presented you may as well write open(F, $file) or die "Error 123\n" for all the informative value you are adding. The only things you will generally ever see in $! are 'access denied' or 'file does not exist', but you really want to know WHICH FILE WHERE?

    Also, if you get data in $_ in a sub, it is generally not a good idea to mess with $_ within that sub. See the while loop.
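
    A minimal sketch of why that matters, not taken from the original script: the subs clobbers() and preserves() and the sample text are made up for illustration, and an in-memory filehandle stands in for a real file. A bare while (<$fh>) assigns every read to the global $_, so whatever the caller had aliased to $_ is overwritten.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";   # hypothetical stand-in for a file's contents

sub clobbers {
    open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
    while (<$fh>) { }               # each read is assigned to the global $_
    close $fh;
}

sub preserves {
    open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
    while (my $line = <$fh>) { }    # reads go to a lexical; $_ is untouched
    close $fh;
}

my @first  = ('fort.9');
my @second = ('fort.9');
clobbers()  for @first;     # $_ is aliased to $first[0] during the call
preserves() for @second;    # $_ is aliased to $second[0] during the call

print 'after clobbers():  ', (defined $first[0]  ? $first[0]  : '(undef)'), "\n";
print 'after preserves(): ', (defined $second[0] ? $second[0] : '(undef)'), "\n";
```

    Reading into a lexical such as my $line, as in the while loop of the reformatted code, sidesteps the problem without needing local.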

    Leaving all that aside, if you set $/ (the input record separator) to 'blah blah blah' and the string 'blah blah blah' does not appear in the input file, you will read the entire file into memory as a single record. Thus $outLines[0] will contain the entire file content. Is that your real problem?
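
    To see the effect in isolation, here is a small sketch. The helper read_records() and both sample strings are invented for the demonstration, and in-memory filehandles are used in place of real fort.9 files. It counts how many records come back with $/ set to "# input for", once for data that lacks the separator and once for data that contains it.

```perl
use strict;
use warnings;

# Hypothetical sample data; only the second string contains the separator.
my $without = "line one\nline two\nline three\n";
my $with    = "# input for A\n1 2 3\n# input for B\n4 5 6\n";

sub read_records {
    my ($text) = @_;
    local $/ = "# input for";       # restored automatically when the sub returns
    open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
    my @records = <$fh>;            # list context: one element per record
    close $fh;
    return @records;
}

my @a = read_records($without);
my @b = read_records($with);

printf "separator absent:  %d record(s)\n", scalar @a;
printf "separator present: %d record(s)\n", scalar @b;
```

    With the separator absent the whole text arrives as a single record, which is the situation described above.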

    find (\&process, $folder);

    sub process {
        if ($_ =~ /\.9$/) {
            print "\nProcessing file $File::Find::dir/$_\n";
            # parenthesise open() so that "or die" tests open's result,
            # not the truth of the file name
            open(FILE, '<', $_) or die "Cannot read $File::Find::name: $!";
            local $/= "# input for";
            # localise @outLines so we don't accumulate
            my @outLines = ();
            while ( my $line = <FILE> ) {
                #do something
                push @outLines, $line;
            }
            close FILE;
            my $out = 'adjusted.learn';
            open(OUTFILE, '>', $out) or die "Cannot write $out: $!";
            print OUTFILE @outLines;
            close OUTFILE;
        }
    }
    exit;

    cheers

    tachyon

      I have used your suggestions; however, I am still getting the same problem. It processed the first file fine, then the output for the second processed file lists both, and so on!

        You need to empty @outLines as noted in the code above. Otherwise each call to process() will add more lines to @outLines.

        cheers

        tachyon

Re: Working with Subfolder
by Aragorn (Curate) on May 28, 2004 at 12:35 UTC
    Variable @outLines isn't emptied after its contents are written to the file. So the contents of every file are added to it, after which the complete array (including the old contents) is written to adjusted.learn. Instead of having a global @outLines array, you should declare it within the process subroutine:
        sub process {
            my @outLines = ();
            ...
        }
    Now, every time process is called, the @outLines array is created afresh.

    Another option would be to insert an @outLines = () statement after writing to the adjusted.learn file.
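
    A self-contained sketch of the first option, under some assumptions: the temporary directory tree, the run1/run2 folder names and the file contents are invented so the script can run anywhere, and the '#do something' step is left out. It relies on File::Find changing into each directory before calling the callback, so adjusted.learn is written next to the fort.9 file it came from.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical test tree: two subfolders, each holding one fort.9 file.
my $folder = tempdir(CLEANUP => 1);
for my $sub (qw(run1 run2)) {
    make_path("$folder/$sub");
    open(my $fh, '>', "$folder/$sub/fort.9")
        or die "Cannot write $folder/$sub/fort.9: $!";
    print {$fh} "data from $sub\n";
    close $fh;
}

sub process {
    return unless /\.9$/;
    my @outLines;                   # fresh, empty array on every call
    open(my $in, '<', $_) or die "Cannot read $File::Find::name: $!";
    while (my $line = <$in>) {
        push @outLines, $line;
    }
    close $in;
    # find() has changed into the file's folder, so this lands beside fort.9
    open(my $out, '>', 'adjusted.learn')
        or die "Cannot write adjusted.learn in $File::Find::dir: $!";
    print {$out} @outLines;
    close $out;
}

find(\&process, $folder);

# Show what each output file ended up containing.
for my $sub (qw(run1 run2)) {
    open(my $fh, '<', "$folder/$sub/adjusted.learn")
        or die "Cannot read $folder/$sub/adjusted.learn: $!";
    print "$sub: ", <$fh>;
    close $fh;
}
```

    Because @outLines is declared inside process, each adjusted.learn should hold only the lines of the fort.9 file in its own folder.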

    Arjen

      seems to have done the job nicely, thanks!
Re: Working with Subfolder
by tucano (Scribe) on May 28, 2004 at 12:43 UTC
    Well, I probably don't understand your problem, but:

     if ($_ =~ /\.9$/)
    This matches a file name that ends with .9.
    But what do you want to pass to the subroutine process?
    It seems that you pass the folder, OK? But what is 'find', another subroutine? I don't understand where you are listing the files in the directory. On a Unix system you can use 'ls' to make a simple listing of a directory.
    #!/usr/bin/perl
    $test = process ($folder);
    #absolute dir
    $folder = 'test';
    system ("ls $folder > files");

    ### Now you have a file that is a list of the files in the directory
    ### You can now apply process to every line of the file

    sub process {
        if ($_ =~ /\.9$/) {
            print "\nProcessing the fort.9 files, from taylor type to z values\n";
            open (FILE, '<', $_) or die "Cannot open file: $!";
            $/= "# input for";
            while (<FILE>) {
                #do something
                push(@outLines, $_ );
            }
            close FILE;
            open ( OUTFILE, ">adjusted.learn" ) or die "Cannot open file: $!";
            print ( OUTFILE @outLines );
            close ( OUTFILE );
        }
    }
    print $test,"\n";
    exit;

    Hope this helps you.