PerlMonks  

Re: use locale broken?

by moritz (Cardinal)
on Mar 16, 2011 at 19:20 UTC ( [id://893621] )


in reply to use locale broken?

If you use properly decoded strings (which you do, since use utf8; is in effect) and no locales, \w, \d etc. follow Unicode semantics, which means they match more than the basic Latin characters.
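
A minimal sketch of that behaviour, assuming the source file is saved as UTF-8 and no `use locale` is in scope. The variable names are only illustrative; both checks should report 1.

```perl
use strict;
use warnings;
use utf8;                            # string literals below are decoded from UTF-8 source
binmode STDOUT, ':encoding(UTF-8)';  # encode decoded strings again on output

my $word  = "öäå";        # Scandinavian letters, outside basic Latin
my $digit = "\x{0663}";   # ARABIC-INDIC DIGIT THREE, a non-Latin decimal digit

# On decoded strings, with no locale in effect, \w and \d use Unicode semantics.
my $word_ok  = $word  =~ /\A\w+\z/ ? 1 : 0;
my $digit_ok = $digit =~ /\A\d\z/  ? 1 : 0;

print "\\w matches '$word': $word_ok\n";
print "\\d matches U+0663: $digit_ok\n";
```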

I'm not very familiar with locales, but I guess that use locale expects the strings to be non-decoded binary strings in the encoding specified in the locale (here: UTF-8), so it might work without the utf8 pragma.

In general I recommend against locales, if you can avoid them. In my experience they are always a source of trouble, and don't bring the promised "do what I mean"-effect.

Replies are listed 'Best First'.
Re^2: use locale broken?
by december (Pilgrim) on Mar 17, 2011 at 18:13 UTC

    It seems that use locale just doesn't work well for Unicode character sets, because it doesn't consider these locale-specific characters valid word characters. I think it's a problem in Perl, because clearly \w should include "öäå" Scandinavian characters when such a locale is in effect, Unicode or not.

    But well, I can avoid the buggy locale handling by explicitly converting all input and output to Unicode, regardless of the user's settings. I just wish it had worked...
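
One way the explicit conversion could look, as a sketch using the core Encode module. The byte string here is a hypothetical stand-in for real input, and UTF-8 is assumed as the external encoding; a real program would pick the encoding its data actually uses.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: the UTF-8 bytes for "öäå", as they might arrive from a file or STDIN.
my $bytes = "\xC3\xB6\xC3\xA4\xC3\xA5";

# Decode on the way in: bytes become characters.
my $text = decode('UTF-8', $bytes);

# Work on characters; \w follows Unicode rules here, with no locale involved.
my $is_word = $text =~ /\A\w+\z/ ? 1 : 0;
my $chars   = length $text;

# Encode on the way out: characters become bytes again.
my $out = encode('UTF-8', $text);
print $out, "\n";
```

Decoding at the input boundary and encoding at the output boundary keeps the regex behaviour independent of the user's locale settings.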
