in reply to UTF-8 and browsers - Update
Bug in Firefox. It should work as you describe.
As for the composition: first of all, work on characters, or at least on code points, not on UTF-8 bytes. Second, you want Unicode Normalization Form C (see Unicode::Normalize), so that you can write:
    use Unicode::Normalize;
    use charnames ':full';      # this is just to make things easier in this example
    binmode(STDOUT, ':utf8');   # this is to make 'print' output UTF-8 bytes
    my $a = "O\N{COMBINING DIAERESIS}";
    my $b = NFC($a);
    print length($a), $a, "\n";
    print length($b), $b, "\n";
Will print:
    2Ö
    1Ö
(more or less, depending on PM's escaping mechanisms)
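If your input arrives as raw UTF-8 bytes (say, from a form submission), the "characters, not bytes" advice means decoding before you normalize. Here is a rough sketch of that, assuming the bytes are valid UTF-8; the sample byte string and the variable names are made up for illustration, and Encode is just one way to do the decoding:

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Unicode::Normalize qw(NFC);

# Hypothetical input: the UTF-8 bytes for "O" followed by
# U+0308 COMBINING DIAERESIS (0xCC 0x88).
my $bytes = "O\xCC\x88";

# Decode the bytes into a character string first...
my $chars = decode('UTF-8', $bytes);

# ...then compose it into Normalization Form C.
my $nfc = NFC($chars);

# Encode back to UTF-8 bytes only when producing output.
my $out = encode('UTF-8', $nfc);

printf "bytes in: %d, code points: %d, after NFC: %d, bytes out: %d\n",
    length($bytes), length($chars), length($nfc), length($out);
# prints "bytes in: 3, code points: 2, after NFC: 1, bytes out: 2"
```

Calling NFC directly on the undecoded bytes would not compose anything, because Perl would treat 0xCC and 0x88 as two separate Latin-1 characters rather than as one combining mark.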
-- dakkar - Mobilis in mobile
Most of my code is tested...