PerlMonks  

Re^6: UTF8 versus \w in pattern matching (basic test)

by haj (Vicar)
on Jul 06, 2021 at 17:54 UTC ( [id://11134713]=note )


in reply to Re^5: UTF8 versus \w in pattern matching (basic test)
in thread UTF8 versus \w in pattern matching

That's not strange. You're seeing Unicode codepoints, which for the characters in question happen to be identical to their ISO-8859-1 encodings. Add "\N{EURO SIGN}" to the string and you get "\x{20ac}": again, that's the codepoint, not a UTF-8 encoding.
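
To make the distinction concrete, here is a minimal sketch, assuming a Perl with the core Encode module. The sample string and the variable names are made up for illustration; it prints the codepoints Perl stores next to the bytes you get only after encoding explicitly.

```perl
use strict;
use warnings;
use charnames ':full';    # needed for \N{...} on Perls older than 5.16
use Encode qw(encode);

# A character string: each element is a Unicode codepoint, not a byte.
my $e_acute = "\N{LATIN SMALL LETTER E WITH ACUTE}";
my $str     = $e_acute . "\N{EURO SIGN}";

# What the string holds: one codepoint per character.
my @codepoints = map { sprintf 'U+%04X', ord($_) } split //, $str;

# What you get only after encoding explicitly: a sequence of bytes.
my @utf8   = map { sprintf '%02X', ord($_) } split //, encode('UTF-8', $str);
my @latin1 = map { sprintf '%02X', ord($_) } split //, encode('ISO-8859-1', $e_acute);

print "codepoints: @codepoints\n";    # U+00E9 U+20AC
print "UTF-8:      @utf8\n";          # C3 A9 E2 82 AC
print "Latin-1:    @latin1\n";        # E9
```

The e-acute has codepoint U+00E9 and the single ISO-8859-1 byte E9, which is why the two are easy to confuse. The euro sign has no ISO-8859-1 encoding at all, and its UTF-8 bytes E2 82 AC look nothing like its codepoint U+20AC.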

"Everything is UTF-8" is one of the most frequent false assumptions I encounter when dealing with non-ASCII characters.

Replies are listed 'Best First'.
Re^7: UTF8 versus \w in pattern matching (basic test)
by jo37 (Deacon) on Jul 06, 2021 at 18:03 UTC

    Thanks for the clarification.

    Greetings,
    -jo

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
