Re^3: Special character not being captured

by choroba (Archbishop)
on Jun 21, 2019 at 07:15 UTC (#11101652)

in reply to Re^2: Special character not being captured
in thread Special character not being captured

> when I go to get the first character (...) I suddenly need to specify the encoding

UTF-8 is a multi-byte encoding. This means that some characters, Æ being one of them, are encoded by more than one byte (in this case, two bytes: 0xC3 0x86). If a string starts with such a character but Perl doesn't know the encoding, it assumes Latin-1, which is a single-byte encoding. The first character then corresponds to the first byte only, which is 0xC3. On its own, that byte is not a valid UTF-8 sequence, so it is displayed as �, the replacement character.
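
The difference can be seen in a small sketch. It assumes the input arrives as the two raw bytes 0xC3 0x86 and uses the core Encode module; the variable names are made up for illustration only.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The two raw bytes that UTF-8 uses for the character U+00C6.
my $bytes = "\xC3\x86";

# Without decoding, Perl treats each byte as one (Latin-1) character,
# so the "first character" is only the first byte.
my $first_undecoded = substr($bytes, 0, 1);

# After decoding, the string holds a single character.
my $chars         = decode('UTF-8', $bytes);
my $first_decoded = substr($chars, 0, 1);

printf "undecoded: length %d, first char U+%04X\n",
    length($bytes), ord($first_undecoded);
printf "decoded:   length %d, first char U+%04X\n",
    length($chars), ord($first_decoded);
```

The undecoded string has length 2 and its first character is U+00C3, while the decoded string has length 1 and its first character is U+00C6.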

map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

Replies are listed 'Best First'.
Re^4: Special character not being captured
by Lady_Aleena (Curate) on Jun 23, 2019 at 17:47 UTC

    One last thing, I've been trying to figure out how to add utf8 to first_alpha, which I posted earlier. I am not having any success with it. So, how should I add it to that subroutine?

    No matter how hysterical I get, my problems are not time sensitive. So, relax, have a cookie, and a very nice day!
    Lady Aleena
      It doesn't belong there. You should always decode the input as soon as possible and, similarly, encode the output immediately before sending it out. first_alpha should receive an already decoded string.
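
A minimal sketch of that decode-early, encode-late pattern follows. The original first_alpha isn't shown in this node, so first_alpha_sketch below is a hypothetical stand-in that just returns the first alphabetic character, and the sample input bytes are invented for the example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for first_alpha: it assumes its argument is an
# already decoded character string and knows nothing about encodings.
sub first_alpha_sketch {
    my ($string) = @_;
    return $string =~ /(\p{Alpha})/ ? $1 : undef;
}

# Input boundary: raw UTF-8 bytes, as they might arrive from a file.
# The first two bytes (0xC3 0x86) encode the character U+00C6.
my $raw_input = "\xC3\x86on Flux";

# Decode once, as early as possible.
my $title = decode('UTF-8', $raw_input);

# In between, work on characters only.
my $initial = first_alpha_sketch($title);

# Output boundary: encode immediately before sending the data out.
my $output = encode('UTF-8', "$initial\n");
print $output;
```

When the data comes from a filehandle, an I/O layer such as `open my $fh, '<:encoding(UTF-8)', $path` or `binmode STDOUT, ':encoding(UTF-8)'` can do the same decoding and encoding at the boundary, so the subroutine itself still needs no changes.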

      map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
