
Re^4: truncate string to byte count

by ikegami (Pope)
on Feb 28, 2019 at 21:09 UTC (#1230690)

in reply to Re^3: truncate string to byte count
in thread truncate string to byte count

> But please feel free to provide some actual test code that demonstrates the bug

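Code that suffers from The Unicode Bug is code that returns different results for strings that compare equal but differ in their internal storage format (upgraded vs. downgraded). This is easily demonstrated using the following test, where is() comes from Test::More: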

    my $s = "\x80\x80";
    utf8::upgrade( my $u = $s );
    utf8::downgrade( my $d = $s );
    is($u, $d);
    is(utf8cut($u,2), utf8cut($d,2));

> better yet, show how you would've coded it to (at least in your view) "correctly" handle the different strings "\x80\x80" and "\N{U+80}\N{U+80}".

Perl considers those the same value, and any code that doesn't is by definition suffering from The Unicode Bug.
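
To make that concrete, here is a minimal sketch of a truncation routine that does not care how the string happens to be stored internally. The name truncate_to_bytes is hypothetical and this is not the utf8cut from the thread. It assumes the byte limit applies to the UTF-8 encoding of the string, and it measures each character by encoding a copy of it explicitly rather than looking at the internal buffer.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not the thread's utf8cut): returns the
# longest prefix of $str whose UTF-8 encoding is at most $max_bytes bytes.
sub truncate_to_bytes {
    my ($str, $max_bytes) = @_;
    my $used = 0;    # encoded bytes consumed so far
    my $keep = 0;    # characters kept so far
    for my $ch (split //, $str) {
        utf8::encode( my $enc = $ch );    # encode a copy of this one character
        my $len = length($enc);           # its size in UTF-8 bytes
        last if $used + $len > $max_bytes;
        $used += $len;
        $keep++;
    }
    return substr($str, 0, $keep);
}

my $s = "\x80\x80";
utf8::upgrade( my $u = $s );      # same value, stored upgraded
utf8::downgrade( my $d = $s );    # same value, stored downgraded

print "same value\n"  if $u eq $d;
print "same result\n" if truncate_to_bytes($u, 2) eq truncate_to_bytes($d, 2);
```

Both messages should print, since the routine only ever looks at characters and at an explicit encoding of them. Walking the string one character at a time is slow for long strings; it is chosen here only because it is easy to verify.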
