I won't go into the (for me, vexed) question of the rights and wrongs of using regexes this way, but if you are going to do it this way, you can certainly make your code a lot more readable.
#!/usr/bin/perl
use warnings;
use strict;

open (READ, "test.xml") || die "ERROR: $!\n";
my @array = <READ>;
close READ;

open (WRITE, ">new.xml") || die "ERROR: $!\n";
foreach (@array) {
    if ($_ =~ m'<!-- Testing XML -->'){
        print WRITE "<bar>\n",
                    "<name> TEST </name>\n",
                    "<type> Foo </type>\n",
                    "<!-- PDP Status -->\n",
                    "<unknown_sec> 0 </unknown_sec>\n",
                    "</bar>\n\n";
    }
    if ($_ =~ m'</Test Tag>' ) {
        print WRITE "<bar><value> TEST </value></bar>\n";
    }
    if ($_ =~ m[\Q</v></row>\E\n$] ) {
        $_ =~ s[\Q</row>\E\n$][<v> UnKnown </v></row>\n];
    }
    print WRITE $_;
}
close WRITE;
I believe that the above is equivalent to your posted code, but it is untested and I may have introduced errors. Still, a picture is worth a thousand words.
The first thing I would change is to replace the multiple print statements with a single print statement.
Then there is little to be gained by using single-quoted strings for part of your output if you need to use a double-quoted string to add the newlines. The compiler will probably make a better job of optimising this than you will :) Using a single print statement is probably slightly more efficient than multiple calls, but the main benefit is readability (IMO :).
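As a rough sketch of the single-print idea, here is one print call given a list of double-quoted strings. The in-memory filehandle, the lexical $fh and the three-argument open are my assumptions, used only so the snippet runs without touching any files; they are not part of the code above.

```perl
use strict;
use warnings;

# Write to an in-memory buffer so the example needs no files (an assumption
# made for illustration; the posted code writes to new.xml instead).
my $out = '';
open my $fh, '>', \$out or die "ERROR: $!\n";

# One print call, one list of strings, one string per output line.
print $fh "<bar>\n",
          "<name> TEST </name>\n",
          "<type> Foo </type>\n",
          "</bar>\n";

close $fh;
print $out;
```

Because print takes a list, the layout of the source can mirror the layout of the output, which is where most of the readability gain comes from.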
Another change I would make is to avoid escaping characters in regexes where it isn't needed. Most of the characters you were escaping simply didn't need to be escaped, but where the regex doesn't contain any meta-characters, using single quotes as an alternative delimiter (e.g. m'') avoids interpolation and makes things a lot cleaner.
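A small sketch of the difference, using made-up sample strings (the variable names and the user@example text are hypothetical, chosen only for illustration):

```perl
use strict;
use warnings;

my $line = "</Test Tag>\n";

# With the usual slash delimiters, the slash in the tag has to be escaped.
my $escaped = $line =~ m/<\/Test Tag>/ ? 1 : 0;

# With single-quote delimiters nothing needs escaping here.
my $quoted = $line =~ m'</Test Tag>' ? 1 : 0;

# m'' also switches off variable interpolation, so an @ in the pattern
# is literal text rather than the start of an array name.
my $at_sign = 'mail user@example now' =~ m'user@example' ? 1 : 0;

print "escaped=$escaped quoted=$quoted at_sign=$at_sign\n";
```

Note that regex meta-characters such as . and $ keep their regex meaning inside m''; only the interpolation of Perl variables is suppressed.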
It would possibly be more efficient (given that was your question) to use index in these cases anyway.
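For a fixed string, the index version might look something like this. The sample lines are invented for the example, and I have not benchmarked it against the regex form:

```perl
use strict;
use warnings;

# Hypothetical input lines standing in for the contents of test.xml.
my @lines = (
    "<!-- Testing XML -->\n",
    "<row><v> 1 </v></row>\n",
);

# index returns the position of the substring, or -1 if it is absent,
# so ">= 0" means "the line contains this exact text".
my @hits = grep { index($_, '<!-- Testing XML -->') >= 0 } @lines;

print scalar(@hits), " matching line(s)\n";
```

Remember to compare against -1 (or >= 0) rather than testing for truth, since a match at the start of the line returns 0.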
Where the regex does contain some meta-characters and some which you would need to escape to prevent them being read as such (not the case here I think, but it serves as an example), then using \Q and \E around the bits you want escaped is often cleaner and more readable than escaping each character individually.
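To illustrate with a made-up pattern that does contain characters needing protection (the sub name is_value_line and the sample text are hypothetical, not taken from your data):

```perl
use strict;
use warnings;

# The literal part, "<v>1.5 (approx)</v>", contains ".", "(" and ")",
# all of which are regex meta-characters. \Q...\E quotes the lot, while
# ^, \s* and \z outside it keep their usual regex meaning.
sub is_value_line {
    my ($line) = @_;
    return $line =~ m{^\Q<v>1.5 (approx)</v>\E\s*\z} ? 1 : 0;
}

print is_value_line("<v>1.5 (approx)</v>\n"), "\n";   # literal text present
print is_value_line("<v>1x5 (approx)</v>\n"), "\n";   # "." must not match "x"
```

The hand-escaped equivalent would be m{^<v>1\.5 \(approx\)</v>\s*\z}, which works but is easier to get wrong as the literal part grows.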
The final change was to remove the duplicated print WRITE $_; statement and the redundant else clause. You could also reduce that to a slightly simpler print WRITE; as $_ is the default, but whether that clarifies or obfuscates is an open question.