how to use matching operator on newlines

by redss (Monk)
on Jan 03, 2007 at 20:06 UTC ( [id://592821]=perlquestion )

redss has asked for the wisdom of the Perl Monks concerning the following question:

This should be a simple one. I have a unix-formatted double spaced file, so each line of text is followed by 2 linefeeds, which is \n, right?

I want to run a perl script from the command line to translate double newlines to single newlines but it's not working. What is wrong with this code?

perl -p -i -e 's/\n\n/\n/g;' foo.txt

Replies are listed 'Best First'.
Re: how to use matching operator on newlines
by kyle (Abbot) on Jan 03, 2007 at 20:22 UTC

    perl -p reads a line at a time, so it never sees any more than one \n at once. Try this:

    perl -pi -e 'BEGIN{undef $/} s/\n\n/\n/g;' foo.txt

      One option to avoid slurping would be to set $/ to the double newline:

      perl -i -pe 'BEGIN{$/="\n\n"} s/\n\n/\n/;' foo.txt

      Another option would be to set $/ and $\ (Yay, output format!) and chomp it:

      perl -i -pe 'BEGIN{$/="\n\n";$\="\n"} chomp;' foo.txt

      -i added back in per kyle's post. Also, ++ to ikegami, too!

      --
      $you = new YOU;
      honk() if $you->love(perl)

        perl -pe 'BEGIN{$/="\n\n";$\="\n"} chomp;' foo.txt
        can be shortened to
        perl -ple 'BEGIN{$/="\n\n"}' foo.txt

        You really need the -i option to edit the file in place as the OP did (I love the irrational number options "-pi -e"). Avoiding slurping is good (especially if it's a large file). The chomp usage is clever, but I think I prefer your first line. Not only does it have one less weird variable, but the s also makes it a lot more obvious what it's doing. On the other hand, maybe I shouldn't be worried about the readability of a one-liner.

      Good catch. thanks!
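The three record-separator variants from this sub-thread can be compared side by side. This is only a sketch on a made-up sample file; -i is left out so nothing is edited in place. Note that `$(...)` strips trailing newlines, so the comparison covers everything except the end of the output.

```shell
# Hypothetical sample: each line is followed by exactly two newlines.
tmp=$(mktemp)
printf 'a\n\nb\n\nc\n\n' > "$tmp"

# Variant 1: read "\n\n"-terminated records, replace the terminator with one "\n".
v1=$(perl -pe 'BEGIN{$/="\n\n"} s/\n\n/\n/' "$tmp")

# Variant 2: chomp the "\n\n" terminator and let $\ append a single "\n".
v2=$(perl -pe 'BEGIN{$/="\n\n";$\="\n"} chomp' "$tmp")

# Variant 3: -l chomps each record and sets $\ to "\n" before BEGIN changes $/.
v3=$(perl -ple 'BEGIN{$/="\n\n"}' "$tmp")

[ "$v1" = "$v2" ] && [ "$v2" = "$v3" ] && echo "all three variants agree"
rm -f "$tmp"
```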
Re: how to use matching operator on newlines
by shmem (Chancellor) on Jan 03, 2007 at 22:31 UTC

    XY Problem - you don't need a matching operator. But you stated X and Y :-)

    The simplest way is

    perl -lp00 -i -e '' foo.txt

    This sets $\ (the output record separator) to "\n" and puts the input in paragraph mode ($/ = "", which behaves roughly like a separator of qr{\n\n+}, although $/ is never actually treated as a regexp). The snippet squeezes any run of "\n"s into one, no matter how long. See perlrun.

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: how to use matching operator on newlines
by BrowserUk (Patriarch) on Jan 03, 2007 at 20:50 UTC
      This one-liner avoids that

      That may very well be what the OP actually wants, but it is not equivalent to his stated goal.

      The OP's goal was to "translate double newlines to single newlines." Your code gets rid of blank lines. Consider what happens when a file starts with a blank line. Or, for another example, when there is a non-blank line followed by two blank lines. Etc.

      -sauoq
      "My two cents aren't worth a dime.";
      I don't think that accomplishes the goal. Your solution removes all the blank lines, not just the double-spacing (unlike the other solutions). For example, yours converts a\n\n\n\nb\n\n to a\nb\n rather than a\n\nb\n.

        I don't think that accomplishes the goal.

        It depends upon which of the stated goals you focus on. The op's description started with

        I have a unix-formatted double spaced file, so each line of text is followed by 2 linefeeds,...

        And if that is an accurate statement of the problem, my one-liner will work.

        If it's not accurate, or omits significant details, then it won't and a slightly more sophisticated one-liner is required:

        perl -e"BEGIN{$/=qq[\n\n]}" -ne"chop;print" junk.txt

        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: how to use matching operator on newlines
by GrandFather (Saint) on Jan 03, 2007 at 20:14 UTC
    use strict;
    use warnings;

    my $text = do {local $/; <DATA>;};
    $text =~ s/\n\n/\n/g;
    print $text;

    __DATA__
    line 1

    line 2

    line 3

    Prints:

    line 1
    line 2
    line 3

    which is what I would expect. What are you doing that is different?


    DWIM is Perl's answer to Gödel
Re: how to use matching operator on newlines
by jonadab (Parson) on Jan 03, 2007 at 22:22 UTC
    I have a unix-formatted double spaced file, so each line of text is followed by 2 linefeeds, which is \n, right?

    Theoretically, yes, but this assumes much. Among other things, we are assuming that you are running this regular expression on a system where \n corresponds to a single linefeed character, which is not universally the case. (You say the file is from a Unix system, but you don't say if you are running Perl on a Unix system...) It is also worth looking at the text file in a hex editor, to verify that it has the same kind of line endings you think it has.


    Sanity? Oh, yeah, I've got all kinds of sanity. In fact, I've developed whole new kinds of sanity. You can just call me "Mister Sanity". Why, I've got so much sanity it's driving me crazy.
