Re: How to process multiple input files?
by jwkrahn (Abbot) on May 22, 2011 at 20:42 UTC
|
#!/usr/bin/perl
use strict;
use warnings;
$^I = ".bak";
undef $/;
my $count = 0;
while ( my $line = <> ) {
    $line =~ s{
        (<\/div>)
    } {
        ++$count == 2
            ? "\t<?php include(\$_SERVER['DOCUMENT_ROOT'].\"\/includes\/footer.php\"); ?>\n\n$1"
            : $1
    }gex;
    print $line;
    $count = 0;
}
| [reply] [d/l] [select] |
|
I worry that an empty file will stop it prematurely. Or a file might contain just "0" or somesuch, but that's less likely. Since he's slurping whole files rather than reading lines, I think it would be prudent to test for defined. (Hmm, what does the normal line-oriented read do if an empty file is in the list? Maybe it's always an issue.)
Update: never mind. In production code I would simply have written defined to be sure, but looking through the docs I see that this construct is special even in the case of explicit assignment. I knew that the quick while(<>) tests for defined, or started to at some specific version of Perl (I remember the classic Camel book explaining how lines are never false because they end in "\n"), but I wasn't sure that applied when an assignment was being made.
In general, I rely less on special cases and magical meanings in well-written production code than in a quick one-liner. Declaring variables and not using $_ much fall into the same category, so I somehow was thinking the magic was not in effect.
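For anyone who wants to check the implicit defined test for themselves, here is a minimal sketch. The temporary file and the variable names are made up for the demonstration; the file holds only the character "0", which is false but defined.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical test file whose entire content is the single character "0".
my ( $fh, $name ) = tempfile( UNLINK => 1 );
print {$fh} "0";
close $fh or die "Cannot close $name: $!";

local @ARGV = ($name);
local $/;    # slurp mode, as in the script above

my @seen;
# Perl treats this condition as: defined( my $contents = <> )
while ( my $contents = <> ) {
    push @seen, $contents;
}
print scalar(@seen), " file(s) read\n";
```

The loop body runs once even though "0" is a false value, so the explicit defined() is a matter of style here rather than correctness.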
| [reply] [d/l] [select] |
|
I think it would be prudent to test for defined.
The code I posted:
while ( my $line = <> ) {
does test for defined.
| [reply] [d/l] |
Re: How to process multiple input files?
by graff (Chancellor) on May 22, 2011 at 20:51 UTC
|
I have tried to put the script within a foreach loop, but it did not work.
So, I'm guessing that you didn't try it this way:
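#!/usr/bin/perl
use strict;
use warnings;
for my $f ( @ARGV ) {
    local $/;    # slurp the whole file
    open( I, '<', $f )       or die "Cannot read $f: $!";
    open( O, '>', "$f.bak" ) or die "Cannot write $f.bak: $!";
    my $count = 0;
    my $line = <I>;
    $line =~ s{ (<\/div>) }
              { if (++$count == 2){
                    "\t<?php include(\$_SERVER['DOCUMENT_ROOT'].\"\/includes\/footer.php\"); ?>\n\n".$1;
                } else {
                    $1;
                }
              }gex;
    print O $line;    # the modified text goes to "$f.bak"; $f itself is left untouched
    close O or die "Cannot close $f.bak: $!";
}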
That works for me. (BTW, I'm compulsive about making the indentation look right -- seems silly, but it's really helpful to keep code less illegible.)
If you have so many files that you can't fit them all as args on a command line, there's the unix "xargs" tool:
ls | xargs your_prog ## or use "find ... | xargs your_prog"
| [reply] [d/l] [select] |
Re: How to process multiple input files?
by John M. Dlugosz (Monsignor) on May 22, 2011 at 19:30 UTC
|
As written, the <> construct will read from each file name given on the command line, in turn. You don't need to do anything else; just list more than one file on the command line.
| [reply] [d/l] |
|
my $line;
while (defined ($line = <>)) {
will repeat until there are no files left.
| [reply] [d/l] |
Re: How to process multiple input files?
by jaredor (Priest) on May 22, 2011 at 20:49 UTC
|
while (my $line = <>) {
...
}
Oops, after submission I saw jwkrahn responded in more detail. That comment should solve both of your problems, which I now understand to be 1) looping over command-line file names, and 2) modifying the second line of each file. One thing you might do instead of maintaining your own counter would be to use the built-in line counter, the special $. line-number variable. Correction: $. will not be reset from file to file with the <> operator unless you take special steps, as described in the link given. Thank you again, jwkrahn.
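The "special steps" for $. can be sketched as follows. This is a line-oriented illustration (not slurp mode), and the two temporary files and their contents are invented for the demo. The close ARGV if eof idiom is the one shown in the documentation for eof.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two hypothetical input files, with 2 and 3 lines respectively.
my @files;
for my $text ( "a\nb\n", "c\nd\ne\n" ) {
    my ( $fh, $name ) = tempfile( UNLINK => 1 );
    print {$fh} $text;
    close $fh or die "Cannot close $name: $!";
    push @files, $name;
}

local @ARGV = @files;
my @numbers;
while ( my $line = <> ) {
    push @numbers, $.;    # line number within the current file
    close ARGV if eof;    # explicit close resets $. before the next file
}
print "@numbers\n";       # prints "1 2 1 2 3"
```

Without the close ARGV line, the numbering would simply continue across files (1 2 3 4 5).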
| [reply] [d/l] |
|
He'll always have a line-count of 1, since he's slurping the files. The counter variable is used to count how many times the replacement is triggered with the /g option, not the number of "lines" read (he only reads one "line" in the original!).
Putting the declaration of $count inside the loop should do the trick simply. A better solution might be to rewrite the regex to find the second occurrence of </div> directly, rather than finding all of them and only substituting the second, and to "insert" the content rather than repeating the matched text in the replacement.
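One way to do the "find the second occurrence and insert" idea with ordinary code is a /g match in a loop plus four-argument substr. This is only a sketch: the sample markup and the placeholder text in $insert are made up, standing in for the slurped file and the PHP include line.

```perl
use strict;
use warnings;

my $html   = "<div>a</div>\n<div>b</div>\n<div>c</div>\n";    # made-up sample
my $insert = "<!-- footer -->\n";                             # placeholder text

my $count = 0;
while ( $html =~ m{</div>}g ) {
    next unless ++$count == 2;
    # $-[0] is the offset at which the current match starts;
    # a zero-length substr replacement inserts without overwriting anything.
    substr( $html, $-[0], 0, $insert );
    last;    # stop after the second </div>
}
print $html;
```

Only the second </div> gets the placeholder in front of it; the first and third are left alone, and nothing after the second match is scanned.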
| [reply] [d/l] |
Re: How to process multiple input files?
by John M. Dlugosz (Monsignor) on May 23, 2011 at 00:42 UTC
|
Oh, also your technique to find the second occurrence of something and do something to it is a bit strange. You could use a /g match in a loop, with normal code rather than the inside of an evaluated replacement. But you can locate the second occurrence directly and not need that kind of code.
You want to insert something just before the second </div>, right? Something like this (untested!):
my $replacement = "\t<?php include(\$_SERVER['DOCUMENT_ROOT'].\"/includes/footer.php\"); ?>\n\n";
s{ </div> .*? \K (?=</div>) }
 {$replacement}xs;    # /s lets . match newlines; \K needs Perl 5.10 or later
Note that you don't use /g, so it doesn't keep checking all the rest of the divs, and you don't use $1 or anything in the replacement but "insert" it without replacing any of the stuff used to find that spot.
The \K means that what came before is just context and not included in what gets replaced. The (?=pattern) does the same for what follows. Nothing is "in" the region replaced. See also the use of lazy quantifiers.
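A small self-contained check of the \K plus lookahead idea, run on an in-memory string rather than real files. The sample markup and the placeholder in $insert are invented for the demo. Two assumptions worth flagging: the /s modifier is added so that . can cross newlines in slurped content, and the replacement is written without surrounding spaces because whitespace there is literal.

```perl
use strict;
use warnings;
use 5.010;    # \K was introduced in Perl 5.10

my $html   = "<div>a</div>\n<div>b</div>\n<div>c</div>\n";    # made-up sample
my $insert = "<!-- footer -->\n";                             # placeholder text

# Copy the string, then substitute on the copy.
( my $changed = $html ) =~ s{ </div> .*? \K (?=</div>) }{$insert}xs;

print $changed;
```

The first </div> and everything up to the second one is matched only as context, so the placeholder lands immediately before the second </div> and nothing is removed.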
The whole program becomes:
#!/usr/bin/perl
use strict;
use warnings;
$^I = ".bak"; # same as the -i.bak option
undef $/; # slurp whole files!
my $replacement = "\t<?php include(\$_SERVER['DOCUMENT_ROOT'].\"/includes/footer.php\"); ?>\n\n";
my $filecontents;
while (defined ($filecontents=<>)) {
    # /s lets . match newlines; modifiers must directly follow the closing brace
    $filecontents =~ s{ </div> .*? \K (?=</div>) }{$replacement}xs;
    print $filecontents;
}
I added comments and changed the name of the variable from $line because nobody else noticed that this is not a single line. As written, it was confusing and hard to read because of built-in assumptions people make about idioms and style.
| [reply] [d/l] [select] |
Re: How to process multiple input files?
by Anonymous Monk on May 22, 2011 at 23:48 UTC