http://qs321.pair.com?node_id=279200


in reply to Re: Re: Checking Perl script on server
in thread Checking Perl script on server

Thanks to both of you for the info. Yes, I am building huge arrays while finding and changing data recursively. Please advise: is this approach using a lot of memory?
sub mySub {
    # collect the names of .html files that contain "matchdata"
    if ( $_ =~ /\.html/ ) {
        my $name = $File::Find::name;
        open( F, $name ) || warn "Can't open file $name: $!\n";
        while ( $line = <F> ) {
            for $hit ( $line =~ /matchdata/gi ) {
                push @files, $name;
            }
        }
        close F;
    }
}
find( \&mySub, $dir );

# rewrite each collected file in place
foreach (@files) {
    open( LDATA, "$_" ) || warn "File does not open: $!\n";
    @data = (<LDATA>);
    close(LDATA);
    open( LDATA, ">$_" ) || warn "File write problem $_: $!\n";
    foreach (@data) {
        s/OLD/NEW/gi;
        print LDATA $_;
    }
    close(LDATA);
}

Re: Re: Re: Re: Checking Perl script on server
by snax (Hermit) on Jul 30, 2003 at 16:01 UTC
    I can't speak directly to efficiency as it's hardly my specialty, but I would suggest a change to this chunk:
    while ( $line = <F> ) {
        for $hit ( $line =~ /matchdata/gi ) {
            push @files, $name;
        }
    }
    close F;
    Logically, it seems you want to check whether the file you're currently examining has data that needs to be updated (you're saving the filename in @files for later processing) -- but you're pushing the filename onto your list once for each match in the file! Once you find a match in a particular file you should (based on my understanding of your goal) (1) push the filename onto your to-be-processed stack, and then (2) move on to the next file. So:
    while ( $line = <F> ) {
        if ( $line =~ /matchdata/i ) {
            push @files, $name;
            last;    # one hit is enough; stop scanning this file
        }
    }
    close F;
    Note that I dropped the global flag for the match operator, too -- you just want to know if there's something there, anywhere, to be fixed later, and then move on.

    Naturally, if I misunderstand your goals this could be way off base :)

Re: Re: Re: Re: Checking Perl script on server
by smalhotra (Scribe) on Jul 30, 2003 at 16:28 UTC
    A couple of things:
    1. If /matchdata/ has any capturing parentheses, then $name is pushed onto @files once for each capture group, per match.
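    For example, here is a minimal illustration of how a list-context match behaves (the sample $line is made up):
    my $line = "matchdata and matchdata again";

    # No capturing parens: the list holds the whole match, once per match.
    my @whole = ( $line =~ /matchdata/gi );       # ("matchdata", "matchdata")

    # With capturing parens: one element per capture group, per match,
    # so looping over the result pushes $name more than once per hit.
    my @caps = ( $line =~ /match(d)(ata)/gi );    # ("d", "ata", "d", "ata")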
    2. When you do @data = (<LDATA>), you read the entire file into memory. A better way to do this would be:
    foreach (@files) {
        open( LDATA, "$_" )    || warn "File does not open: $!\n";
        open( TMP, ">$_.tmp" ) || warn "File write problem $_.tmp: $!\n";
        while (<LDATA>) {
            s/OLD/NEW/gi;
            print TMP $_;
        }
        close(LDATA);
        close(TMP);
        rename( "$_.tmp", "$_" )
            or warn "Could not rename '$_.tmp' to '$_': $!\n";
    }
    Another side advantage of this is that if, for some reason, your script dies halfway through, you still have your old file rather than a half-written new file. hth.

    $will->code for @food or $$;

      Another side effect of the whole-file slurp in your code is that perl will allocate enough memory to fit the entire file into memory, and because of the way most OSes handle releasing memory back to the system, your memory usage for that portion of the code will stay >= the largest file you act on until your perl script stops executing. Depending on your OS, this may be an issue where the system does not realize which memory is actively in use and forces other applications to swap out if memory is constrained.
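      As a minimal sketch of that high-water-mark behavior (the filename here is made up):
      {
          # The slurp is the costly step: perl allocates enough memory
          # to hold every line of the file at once.
          open( my $fh, '<', 'big.html' ) or die "Can't open big.html: $!";
          my @data = <$fh>;
          close $fh;
          # ... process @data ...
      }
      # @data is out of scope now, and perl can reuse its memory internally,
      # but on most OSes the process never hands that memory back, so the
      # resident size stays near the size of the largest file slurped.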

      -Waswas