http://qs321.pair.com?node_id=537066


in reply to Re: problems with extended ascii characters in filenames
in thread problems with extended ascii characters in filenames

graff++

I have similar fights with HTML. While I know the source of the problem is UTF-8 related, I've never really been able to get to the bottom of it. I still come across both Rémy and RÃ©my!
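The garbled form usually appears when UTF-8 bytes are interpreted as Latin-1. Here is a minimal sketch of that mechanism using the core Encode module. The variable names are illustrative only, and this is not necessarily how any particular page got damaged.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# "Rémy" as a Perl character string (é is U+00E9)
my $name = "R\x{e9}my";

# Encoding to UTF-8 turns é into the two bytes 0xC3 0xA9
my $utf8_bytes = encode('UTF-8', $name);

# Wrong: decoding those bytes as Latin-1 makes each byte its own character
my $garbled = decode('ISO-8859-1', $utf8_bytes);

# Right: decoding as UTF-8 recovers the original four characters
my $restored = decode('UTF-8', $utf8_bytes);

binmode STDOUT, ':encoding(UTF-8)';
print "garbled:  $garbled\n";
print "restored: $restored\n";
```

The garbled string has five characters instead of four, because the single é has become the pair U+00C3 and U+00A9.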

This is the first time I've come across such a clear, straightforward explanation of what is actually happening. Hopefully, armed with your insights, I now have at least half a chance of avoiding these "screw ups" in future.

Many thanks!

wfsp

Re^3: problems with extended ascii characters in filenames
by fraktalisman (Hermit) on Mar 16, 2006 at 13:14 UTC

    As for HTML and Perl source code: once you start using UTF-8 here, you must not re-save the same files from text editors that do not yet support UTF-8; otherwise the extended characters in the source text get messed up.

    There are unfortunately still quite a lot of programs which only support Latin-1 (ISO-8859-1) encoding. In HTML, you could get around the problem with the classic solution of the nineties: writing HTML entities, like &eacute; for é etc.
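As a rough sketch of the entity approach: the hypothetical helper `ascii_html` below replaces every non-ASCII character with a numeric character reference (`&#233;` rather than the named `&eacute;`), so the resulting HTML is pure ASCII and survives a Latin-1-only editor. It assumes the input is a decoded character string, not raw UTF-8 bytes, and it does not escape `&`, `<` or `>`.

```perl
use strict;
use warnings;

# Hypothetical helper: turn each non-ASCII character into a numeric
# character reference such as &#233; so the output is 7-bit clean.
sub ascii_html {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $text;
}

# The input must be a character string (é as U+00E9), not UTF-8 bytes
my $html = ascii_html("R\x{e9}my");
print "$html\n";
```

For named entities and full escaping, the CPAN module HTML::Entities provides `encode_entities`.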

    For the same backward compatibility reason, I usually avoid any non-ASCII character (i.e. ord($char)>127) in filenames.
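The `ord($char)>127` test can be applied to a whole name at once with a character class. This is only a sketch: `is_ascii_name` and the sample filenames are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the name contains only 7-bit ASCII,
# i.e. no character (or byte) with ord() greater than 127.
sub is_ascii_name {
    my ($name) = @_;
    return $name !~ /[^\x00-\x7F]/;
}

# Made-up sample names; the second contains é (U+00E9)
my @names   = ("report.txt", "R\x{e9}my.txt");
my @suspect = grep { !is_ascii_name($_) } @names;

print scalar(@suspect), " name(s) contain non-ASCII characters\n";
```

The check works the same way whether the name is held as decoded characters or as raw bytes, since either form has at least one value above 127 for a non-ASCII character.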