http://qs321.pair.com?node_id=596572


in reply to clean html tags

For just escaping HTML special characters as entities, I use this code:
{   # closure
    my %HTML_ESCAPE = (
        "\xa0" => "&nbsp;",
        "&"    => "&amp;",
        "'"    => "&apos;",
        "\""   => "&quot;",
        "<"    => "&lt;",
        ">"    => "&gt;",
    );

    sub html_escape {
        return '' unless defined($_[0]);
        (my $t = $_[0]) =~ s/([\xa0\'\"&<>])/$HTML_ESCAPE{$1}/g;
        $t;
    }
}
It's best to escape the data as it comes in, before it is combined with any markup; otherwise it's very difficult to distinguish between, for example, a less-than sign that should be converted to &lt; and one that is part of the markup.
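To illustrate why the order matters, here is a minimal sketch. It repeats the html_escape routine so that it runs on its own; the $comment string and the <p> wrapper are made-up examples, not part of the original question.

```perl
use strict;
use warnings;

# Same table-driven escape as in the post, repeated so this sketch is self-contained.
my %HTML_ESCAPE = (
    "\xa0" => "&nbsp;",
    "&"    => "&amp;",
    "'"    => "&apos;",
    "\""   => "&quot;",
    "<"    => "&lt;",
    ">"    => "&gt;",
);

sub html_escape {
    return '' unless defined $_[0];
    (my $t = $_[0]) =~ s/([\xa0'"&<>])/$HTML_ESCAPE{$1}/g;
    return $t;
}

# Hypothetical user input containing characters that are special in HTML.
my $comment = q{if a < b && b > c, say "ok"};

# Escape first, then add markup: only the characters in the data are converted.
my $good = '<p>' . html_escape($comment) . '</p>';

# Add markup first, then escape: the <p> tags are converted along with the data.
my $bad = html_escape('<p>' . $comment . '</p>');

print "$good\n";
print "$bad\n";
```

In the second case the tags come out as &lt;p&gt; and &lt;/p&gt;, and at that point nothing in the string says which less-than signs were meant as markup.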