You could try something like this, i.e. put your not-to-be-replaced entities into a hash for easy lookup:
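
#!/usr/bin/perl
# Entities that must be left untouched
my %entities = map { $_ => 1 } qw(&amp; &quot; &lt; &gt; &copy;);
while (my $line = <DATA>) {
    # Keep known entities as they are; escape the & of anything else
    $line =~ s/(&(\w+?;)?)/exists $entities{$1} ? $1 : "&amp;$2"/eg;
    print $line;
}
__DATA__
foo &amp; &quot; bar &blah; &foo baz & ...
TEST&TEST;A&E&an HTML--- string - &lt; &copy; TVS&gt;

Would output:

foo &amp; &quot; bar &amp;blah; &amp;foo baz &amp; ...
TEST&amp;TEST;A&amp;E&amp;an HTML--- string - &lt; &copy; TVS&gt;
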
(Note that if you run this with warnings enabled, it'll complain "Use of uninitialized value in concatenation" whenever $2 is undefined, i.e. for a bare & that has no entity name after it... I'll leave this as an exercise for you to fix :)
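For anyone who would rather not do the exercise, here is one possible way to silence that warning. It is only a sketch: it assumes the protected entities are &amp;amp; &amp;quot; &amp;lt; &amp;gt; and &amp;copy;, and escape_amps is a made-up helper name, not something from the thread. The idea is simply to fall back to an empty string when $2 is undefined.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Entities that must be left untouched (assumed list, see above).
my %entities = map { $_ => 1 } qw(&amp; &quot; &lt; &gt; &copy;);

# escape_amps is a hypothetical helper name used for illustration only.
sub escape_amps {
    my ($line) = @_;
    # Brace delimiters keep the replacement expression readable.
    # $2 is undef when a bare & is not followed by "name;",
    # so substitute an empty string in that case.
    $line =~ s{(&(\w+?;)?)}{
        exists $entities{$1} ? $1 : '&amp;' . (defined $2 ? $2 : '')
    }eg;
    return $line;
}

print escape_amps("foo &amp; &quot; bar &blah; &foo baz & ...\n");
# prints: foo &amp; &quot; bar &amp;blah; &amp;foo baz &amp; ...
```

On Perl 5.10 or later the defined test could be shortened to ($2 // ''), but then the brace delimiters are mandatory, since // would otherwise end the substitution early.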
Update: I hadn't followed your other thread... so looking at wfsp's solution there, I'd say just use that instead and be happy :)