note
clinton
<p>A typical place where [mod://Encode::Guess] falls down (through no fault of its own) is in differentiating one variant of [wp://iso-8859] from another.</p>
<p>Who's to say if <c>chr(200)</c> is "Č" (ISO-8859-2) or "Θ" (ISO-8859-7)?</p>
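<p>A minimal sketch of that ambiguity, assuming a lone byte with value 200 (0xC8) and only those two encodings as suspects. The variable names are mine, purely for illustration:</p>

```perl
use strict;
use warnings;
use Encode qw(decode);
use Encode::Guess;    # exports guess_encoding

# A single byte, value 200 (0xC8), with no surrounding context.
my $byte = chr(200);

# The byte is valid in both encodings, but maps to different characters.
my $latin2 = decode('iso-8859-2', $byte);    # U+010C, C with caron
my $greek  = decode('iso-8859-7', $byte);    # U+0398, Greek capital theta

printf "iso-8859-2: U+%04X\n", ord $latin2;
printf "iso-8859-7: U+%04X\n", ord $greek;

# Ask Encode::Guess to choose between the two suspects.
# On success it returns an encoding object; otherwise a plain error string.
my $guess = guess_encoding($byte, qw(iso-8859-2 iso-8859-7));
print ref($guess)
    ? "guessed: " . $guess->name . "\n"
    : "can't guess: $guess\n";
```

<p>Both decodes succeed, so the guesser has nothing to distinguish them by and I'd expect it to report a failure rather than pick one.</p>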
<p>Without prior knowledge, you're up the creek without a paddle. So I agree wholeheartedly with [id:/748896|Moritz's suggestion] of converting everything to UTF-8 while you still know what encoding it is in.</p>
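<p>Roughly what "convert while you still know" could look like, assuming the data is known to be ISO-8859-7 at the point where you read it. <c>$known_encoding</c> and the sample byte are hypothetical stand-ins for whatever your source tells you:</p>

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: octets whose encoding we know from their source.
my $known_encoding = 'iso-8859-7';
my $octets         = "\xC8";

# Decode to Perl characters while the encoding is still known ...
my $text = decode($known_encoding, $octets);

# ... and keep or pass on only UTF-8 from here, so nobody has to guess later.
my $utf8 = encode('UTF-8', $text);

# Show the resulting UTF-8 bytes in hex.
print join(' ', map { sprintf '%02X', ord } split //, $utf8), "\n";
```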
<p>([graff] - I know you're too wise a monk to have been suggesting otherwise, but I wanted to provide a simple example of just how limited [mod://Encode::Guess] can be.)</p>
<p>Clint</p>
748893
749063