PerlMonks  

Re^5: Perl code to format the text file by inserting tag in blank line

by AnomalousMonk (Archbishop)
on Feb 12, 2020 at 22:26 UTC ( [id://11112897]=note )


in reply to Re^4: Perl code to format the text file by inserting tag in blank line
in thread Perl code to format the text file by inserting tag in blank line

Would you mind explaining the 3 lines of code that you have replied me with?

Note: In addition to the regex doc perlre linked by GrandFather, I would also recommend the excellent perlretut tutorial and the perlrequick quick reference.

As mentioned before, in your posted code you seem to be trying to fix up some data after you've written it to a file. This is possible, but very tricky; the time to fix your data is before output.

perl -i -lpe "s/^\s*$/<r><\/r>/g" some_file_name does the trick from the command line, but what is it really doing and how can you incorporate it in your Perl code? Deparsing can be helpful; see the O and B::Deparse core modules:

    c:\@Work\Perl\monks>perl -MO=Deparse,-p -i -lpe "s/^\s*$/<r><\/r>/g"
    BEGIN { $^I = ""; }
    BEGIN { $/ = "\n"; $\ = "\n"; }
    LINE: while (defined(($_ = <ARGV>))) {
        chomp($_);
        s[^\s*$][<r></r>]g;
    }
    continue {
        print($_);
    }
    -e syntax OK
This is the internal code that perl generates and executes for the -i, -l and -p command-line switches (see perlrun). The workhorse s/// has been enclosed in a loop. The s[^\s*$][<r></r>]g expression operates by default on the $_ default scalar (see perlop and perlvar); in the while-loop condition of the deparsed code you can see $_ being assigned successive records/lines from a file. (Note that the /g substitution modifier is not needed here, although it does no harm: the entire line is being replaced, and this can only be done once.)
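
As a rough sketch, the same loop can be written out as an ordinary script. This assumes the lines come from an in-memory list rather than a file edited in place; the @lines sample data and the @fixed array are hypothetical and only there to make the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical sample input; the one-liner would read these lines from a file.
my @lines = ("first\n", "\n", "   \n", "last\n");

my @fixed;
for my $line (@lines) {
    chomp(my $copy = $line);       # like -l: strip the line ending on input
    $copy =~ s/^\s*$/<r><\/r>/;    # same substitution, without the unneeded /g
    push @fixed, "$copy\n";        # like -l: put the line ending back on output
}

print @fixed;
```

Only the empty and whitespace-only lines are replaced; lines with other content pass through unchanged.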

But you want to fix up the data before output. The first step is to generate the data:
    my $out = "$result{$moid}{$ext}{$kpi[$jk]}\n";
Then check the data and fix it if necessary:
    $out =~ s{ \A \s* \z }{<r></r>\n}xms;
Here, s/// operates not on $_ but on the $out variable; this is accomplished by the =~ "binding" operator (see perlop).
Finally, output correct data:
    print OUTFILE $out;
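
Put together, the three steps might look like the following minimal sketch. It makes some assumptions: the flat %result hash is a hypothetical stand-in for the nested $result{$moid}{$ext}{...} structure, and an in-memory filehandle takes the place of OUTFILE so the example runs without touching disk.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the nested %result hash in the original code.
my %result = (kpi_a => '42', kpi_b => '', kpi_c => '7');

# In-memory filehandle instead of OUTFILE (an assumption for this sketch).
my $buffer = '';
open my $outfile, '>', \$buffer or die "Cannot open in-memory file: $!";

for my $kpi (sort keys %result) {
    my $out = "$result{$kpi}\n";             # 1. generate the data
    $out =~ s{ \A \s* \z }{<r></r>\n}xms;    # 2. fix it if it is blank
    print {$outfile} $out;                   # 3. output correct data
}

close $outfile or die "Cannot close in-memory file: $!";
print $buffer;
```

The empty kpi_b value would otherwise have produced a bare blank line; here it is replaced before anything is written.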

The $out =~ s{ \A \s* \z }{<r></r>\n}xms; fixup statement could have been written in other, perhaps better ways. For instance:
    $out = "<r></r>\n" if $out =~ m{ \A \s* \z }xms;
which might be narrated as "$out is changed to an empty <r>-pair if $out matches a blank line." (Note the m// operator is used here, not substitution; see perlop.) This is perhaps clearer and thus more easily maintainable than doing the same thing with substitution. Developing this a bit further, one might write something like

    use constant BLANK_LINE   => qr{ \A \s* \z }xms;
    use constant EMPTY_R_PAIR => "<r></r>\n";
    ...
    $out = EMPTY_R_PAIR if $out =~ BLANK_LINE;
which could be considered self-documenting and moreover allows easy global alteration of the meanings of BLANK_LINE and EMPTY_R_PAIR. (This is an example of the Don't Repeat Yourself (DRY) or "define in one place" principle.)
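
For something runnable, here is one way the constants could be wrapped up. The fix_record helper and the sample records are hypothetical and not part of the original code; this is just a sketch of how the pieces fit together.

```perl
use strict;
use warnings;

use constant BLANK_LINE   => qr{ \A \s* \z }xms;
use constant EMPTY_R_PAIR => "<r></r>\n";

# Hypothetical helper: return an empty <r>-pair for a blank record,
# otherwise return the record unchanged.
sub fix_record {
    my ($out) = @_;
    return EMPTY_R_PAIR if $out =~ BLANK_LINE;
    return $out;
}

# Hypothetical sample records: one with data, one empty, one whitespace-only.
print fix_record($_) for ("data\n", "\n", " \t\n");
```

Changing what counts as a blank line, or what replaces it, then means editing a single constant.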

And that's all there is to it. It's a Simple Matter of Programming. :)


Give a man a fish:  <%-{-{-{-<
