Recursive insertion of tags

by rsriram (Hermit)
on Jul 20, 2006 at 10:57 UTC [id://562540]

rsriram has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a markup file in which there is an element like <ins cnt="#">. Every time I encounter this element, it has to be replaced with <TAG>, repeated the number of times specified in the attribute.

For example, if the tag is <ins cnt="4">, the output file should have <TAG><TAG><TAG><TAG>. This ins appears several times in the input file. I am reading through the file and using a for loop like this:

$file =~ /<ins cnt="([^>]+)">/g;
for ($x=0; $x != $1; $x++) {
print F2 <TAG>;
}

But this is not producing the result I need. Can anyone help me with the syntax/logic for this replacement?

Re: Recursive insertion of tags
by gellyfish (Monsignor) on Jul 20, 2006 at 11:05 UTC

    You could do it in a simple substitution:

    $file = <<EOF;
    <ins cnt="4"> blah <ins cnt="2">
    EOF

    $file =~ s/<ins cnt="(\d+)">/"<TAG>" x $1/egs;

    print $file;

    Note the /e modifier to the substitution that permits the evaluation of code in the RHS.
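
A minimal sketch of the same substitution wrapped in a sub, so it can be applied to any string. The name expand_ins is hypothetical and not from the thread. The /s modifier is left out here because it only changes how . matches, and this pattern contains no dot.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is not from the thread): expand every
# <ins cnt="N"> in the given text into N copies of <TAG>.
sub expand_ins {
    my ($text) = @_;
    # /e evaluates the replacement as Perl code; /g repeats it for every match.
    $text =~ s/<ins cnt="(\d+)">/'<TAG>' x $1/eg;
    return $text;
}

print expand_ins(qq{<ins cnt="4"> blah <ins cnt="2">\n});
```

Because the sub works on a copy of its argument, the caller's original string is left untouched, which makes it easy to compare input and output while debugging.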

    /J\

Re: Recursive insertion of tags
by swkronenfeld (Hermit) on Jul 20, 2006 at 14:52 UTC
    gellyfish's code is the way to solve this problem, but I'll point out a couple mistakes in your code to help you avoid them in the future.

    print F2 <TAG>;
    You want to put TAG in quotes. The line should be print F2 "<TAG>";

    This is something that using warnings would have helped you catch. Your program is attempting to read a line from the filehandle TAG, and print that to filehandle F2.
    # ./test.pl
    Name "main::TAG" used only once: possible typo at ./test.pl line 9.
    readline() on unopened filehandle TAG at ./test.pl line 9.
    readline() on unopened filehandle TAG at ./test.pl line 9.
    readline() on unopened filehandle TAG at ./test.pl line 9.
    readline() on unopened filehandle TAG at ./test.pl line 9.
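
The readline behaviour can be reproduced in isolation. This is only a sketch, assuming no filehandle named TAG has been opened. The @warnings array and the __WARN__ handler are additions for illustration, there to capture the runtime message.

```perl
use strict;
use warnings;

my @warnings;
# Collect runtime warnings instead of letting them go straight to STDERR.
local $SIG{__WARN__} = sub { push @warnings, @_ };

# <TAG> is the readline operator applied to the (unopened) filehandle TAG,
# so in scalar context it yields undef rather than the string "<TAG>".
my $line = <TAG>;

print defined $line ? "read a line\n" : "got undef\n";
print @warnings;
```

Quoting the text, as in "<TAG>", makes it an ordinary string, so no read is attempted.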

    Also, although it isn't broken in your example, your regular expression could use some work. You are matching anything that isn't ">", and then using it in a numerical comparison. This will be a problem if you capture something non-numeric. A better idea would be to write your regex like this:
    if ($file =~ /<ins cnt="(\d+)">/) {
        for ($x = 0; $x < $1; $x++) {
            print F2 "<TAG>";
        }
    }
    What I changed:
    • Matching on digits only for the count.
    • If $file doesn't match the pattern, your code doesn't attempt to use $1 in a for loop.
    • I removed the g modifier from your regex, as I don't think you intended it in this case.
    • Nitpicking: I changed the for loop condition from $x != $1 to $x < $1. It does not matter for this example, but it's less likely to get caught in an infinite loop when you are doing more complex things (like possibly modifying $x inside your loop). Note that there are more Perlish ways of writing this, including print F2 "<TAG>" for (1 .. $1).
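
Since the question says the ins element appears several times, one possible variation keeps the g modifier and uses it in a while loop, which visits each match in turn. This is a sketch rather than the approach given in the reply above; the sub name tags_for_each_ins is hypothetical, and the x repetition operator stands in for the counting loop.

```perl
use strict;
use warnings;

# Hypothetical helper: return one expanded string per <ins cnt="N"> found.
sub tags_for_each_ins {
    my ($text) = @_;
    my @expanded;
    # In scalar context, /g resumes after the previous match on each pass,
    # so $1 holds the count of a different element every iteration.
    while ($text =~ /<ins cnt="(\d+)">/g) {
        push @expanded, '<TAG>' x $1;
    }
    return @expanded;
}

print "$_\n" for tags_for_each_ins('<ins cnt="4"> blah <ins cnt="2">');
```

Note that this only collects the expansions and drops the surrounding text. To rewrite the file contents in place, the s///e substitution remains the simpler route.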
