http://qs321.pair.com?node_id=983760


in reply to Fixing broken character encoding

It is possible if its been malformed once (single step), multiple iterations can be impossible.

IIRC I think http://validator.w3.org/ can help ( Bundle::W3C::Validator )

As can these
HTML::Encoding - Determine the encoding of HTML/XML/XHTML documents
Encode::Detective - detect a data encoding
Encoding::FixLatin - takes mixed encoding input and produces UTF-8 output
Encode::DoubleEncodedUTF8 - Fix double encoded UTF-8 bytes to the correct one

But you ought to post some minimal html

I figure it ought to be as simple as parsing the file, decoding the entities, treating the string as octets and deciding what charset it is