PerlMonks  

Re^2: Removing the carriage return in a Find & Replace?

by bobafifi (Beadle)
on Sep 26, 2008 at 12:04 UTC ( [id://713862]=note )


in reply to Re: Removing the carriage return in a Find & Replace?
in thread Removing the carriage return in a Find & Replace?

Thanks for the quick reply, psini!
Using your suggestion, I just tried
perl -i -pe 's/<TD>\s*<FONT FACE=arial SIZE=-1>/widget/g' * test.php
Unfortunately, it didn't work.

However, when I remove the carriage return in the HTML and run
perl -i -pe 's/<TD><FONT FACE=arial SIZE=-1>/widget/g' * test.php
there's no problem. Not sure why, but the \s* doesn't seem to be recognized.

Thanks again,
Bob

Replies are listed 'Best First'.
Re^3: Removing the carriage return in a Find & Replace?
by Fletch (Bishop) on Sep 26, 2008 at 12:13 UTC

    Because perl is reading the file a line at a time (or rather, you haven't told it to do otherwise, and line-at-a-time is the default), $_ will only contain <TD>\n, and the next line will have <FONT ....>. At no point is the entire text you expect to match in $_ at once and in the right order, so the match never happens and the substitution never triggers.

    See the documentation for the -0 switch in perlrun, specifically the part about turning on paragraph mode.

    The cake is a lie.

      The -p option you use processes the input one line at a time. For Perl, \n isn't the same as a space, even if it is for HTML. One solution is to undef $/ in order to enable "slurp mode" (or to use the previously mentioned command-line option -0):

      perl -i -pe 'BEGIN { undef $/ } s/<TD>\s*<FONT\s+FACE=arial\s+SIZE=-1> +/widget/g' test.html
        How can I use this to batch edit multiple pages? For some reason, I can only edit one page at a time.

        Thanks!
        -Bob
      I just tried mscharrer's variation on this and it worked!
      perl -i -pe 'BEGIN { undef $/ } s/<TD>\s*<FONT\s+FACE=arial\s+SIZE=-1>/widget/g' test.html

      Thanks so much!
      Bob
Re^3: Removing the carriage return in a Find & Replace?
by psini (Deacon) on Sep 26, 2008 at 12:17 UTC

    Are you sure it is a CR and not some evil non-printable character used by MS?

    Try editing the file with a text editor (not a word processor!): delete the current newline character, insert a CR, and try again. If it works, the problem becomes finding out which newline character the file actually uses.

    Rule One: "Do not act incautiously when confronting a little bald wrinkly smiling man."

      I'm on a Mac using TextEdit in text mode (no MS) and Terminal to run Perl.
      Have you been able to get my example to work on your machine? Thanks,
      Bob
        Did you try Fletch's suggestion? On a Mac (which shouldn't matter much for this):
        -> cat junk.html
        _ABC_<TD>
        <FONT FACE=arial SIZE=-1>_XYZ_
        -> perl -00pe 's/<TD>\s*<FONT FACE=arial SIZE=-1>/widget/g' junk.html
        _ABC_widget_XYZ_
