http://qs321.pair.com?node_id=534748

skx has asked for the wisdom of the Perl Monks concerning the following question:

I'm interested in modifying user-submitted HTML so that all tags are balanced.

e.g. "<b><i>test</b>" is obviously broken HTML, because the <i> is never closed.

I realise I can handle simple cases with regexps, but to do it properly I probably want to use HTML::TreeBuilder or similar.

The problem is I'm not 100% sure how to start. I can certainly keep a stack of opened tags and detect when something is broken, but emitting the missing closing tags in the right order is a bit tricky.
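To make the stack idea concrete, here is a minimal sketch using only core Perl. The name balance_tags is hypothetical (it is not an existing CPAN function), and the regex tokenizer is an assumption made to keep the example short: it will be fooled by a '>' inside an attribute value or a comment, so a real parser would be the safer choice for untrusted input. The point is only the closing order: when a closing tag arrives, pop the stack and close everything opened inside that element first, and at the end of input close whatever is still open, innermost first.

```perl
use strict;
use warnings;

# Void elements never take a closing tag, so they are never pushed on the stack.
my %VOID = map { $_ => 1 }
    qw(area base br col embed hr img input link meta source track wbr);

# balance_tags() is a hypothetical helper, named here for illustration only.
sub balance_tags {
    my ($html) = @_;
    my @open;        # stack of open tag names, innermost last
    my $out = '';

    # Split into tags and text; the capture group keeps the tags in the list.
    for my $tok (split /(<[^>]*>)/, $html) {
        next if $tok eq '';

        if ($tok =~ m{^</\s*([A-Za-z][A-Za-z0-9]*)\s*>$}) {
            my $name = lc $1;

            # A closing tag with no matching opener is simply dropped.
            next unless grep { $_ eq $name } @open;

            # Close everything opened inside this element, innermost first,
            # then the element itself. Closing tags are emitted in lower case.
            while (@open) {
                my $top = pop @open;
                $out .= "</$top>";
                last if $top eq $name;
            }
        }
        elsif ($tok =~ m{^<([A-Za-z][A-Za-z0-9]*)\b[^>]*>$}) {
            my $name = lc $1;
            $out .= $tok;

            # Remember the tag unless it is void or written as <tag />.
            push @open, $name unless $VOID{$name} || $tok =~ m{/>$};
        }
        else {
            $out .= $tok;    # text, comments, anything unrecognised
        }
    }

    # Whatever is still open at the end gets closed, innermost first.
    $out .= "</$_>" for reverse @open;
    return $out;
}

print balance_tags('<b><i>test</b>'), "\n";    # <b><i>test</i></b>
```

This sketch always repairs by closing inner elements early. It does not try to reopen them afterwards, which is what browsers do for some formatting tags, so the rendered result may differ from what the user intended.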

Surprisingly, CPAN didn't seem to have anything to offer when I searched for terms such as 'html balance', so if there is existing code I haven't found it.

Steve
--