http://qs321.pair.com?node_id=213501


in reply to Fixing Bad HTML

HTML::TreeBuilder does a good job of finding and repairing such problems when it parses, and it also adds implicit tags that were left out. The following one-liner should be enough to get you started (note that with -n it parses each input line as a separate fragment):

perl -MHTML::TreeBuilder -ne 'print map {ref $_ ? $_->as_HTML : $_} HTML::TreeBuilder->new_from_content($_) ->look_down(_tag=>"body")->content_list'

perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'