PerlMonks  

Re: tr/// not working for replacment of curly quotes (use utf8)

by LanX (Saint)
on Jul 26, 2020 at 21:22 UTC


in reply to tr/// not working for replacment of curly quotes

> What am I missing?

use utf8 ?

You should dump your string to see what's inside; Devel::Peek is the best tool for inspecting it.
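
A minimal sketch of such a dump, assuming the script file itself is saved as UTF-8 (the variable names are only for illustration). It stores the same literal once without and once with the pragma, so the two dumps can be compared side by side:

```perl
use strict;
use warnings;
use Devel::Peek;    # core module; Dump() writes to STDERR

my ($bytes, $chars);
{
    no utf8;        # the literal is kept as the raw bytes from the file
    $bytes = "“";
}
{
    use utf8;       # the literal is decoded as UTF-8 source text
    $chars = "“";
}

Dump($bytes);       # PV holds three octets, no UTF8 in FLAGS
Dump($chars);       # FLAGS include UTF8, the string is one character

printf "bytes: length %d, flag %s\n",
    length($bytes), utf8::is_utf8($bytes) ? 'on' : 'off';
printf "chars: length %d, flag %s\n",
    length($chars), utf8::is_utf8($chars) ? 'on' : 'off';
```

If the first dump is what your own string looks like, tr/// is being handed bytes, not characters.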

Update

Tested on my mobile

    $ perl
    use utf8;
    $string = "“can’t”";
    $string =~ tr/“’”/"'"/;
    print $string . "\n";
    __END__
    "can't"
    $

Explanation

Without use utf8, Perl treats every string literal in your source as a byte string rather than a character string (the UTF8 flag will be missing).

Your multi-byte Unicode characters might look OK in your editor, but Perl will try to transliterate the individual bytes.
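
To make the byte-versus-character difference visible, here is a small sketch, again assuming the file is saved as UTF-8. It uses tr/// with an empty replacement list, which only counts matches and leaves the string unchanged:

```perl
use strict;
use warnings;

my ($byte_len, $byte_hits, $char_len, $char_hits);
{
    no utf8;                                # literals stay raw bytes
    my $string = "“can’t”";
    $byte_len  = length $string;            # each curly quote is 3 bytes
    $byte_hits = ($string =~ tr/“’”//);     # tr matches single bytes
}
{
    use utf8;                               # literals are decoded to characters
    my $string = "“can’t”";
    $char_len  = length $string;
    $char_hits = ($string =~ tr/“’”//);     # one hit per curly quote
}

print "bytes: length $byte_len, tr hits $byte_hits\n";   # 13 and 9
print "chars: length $char_len, tr hits $char_hits\n";   # 7 and 3
```

In the byte case the search list is really nine single bytes, so the three replacement characters cannot line up with the three quotes you see in the editor.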

Cheers Rolf
(addicted to the Perl Programming Language :)

Replies are listed 'Best First'.
Re^2: tr/// not working for replacment of curly quotes
by nysus (Parson) on Jul 26, 2020 at 21:30 UTC

    OK, so it seems I'm hopelessly confused about use utf8;. I didn't think that was needed anymore.

    $PM = "Perl Monk's";
    $MCF = "Most Clueless Friar Abbot Bishop Pontiff Deacon Curate Priest Vicar";
    $nysus = $PM . ' ' . $MCF;

      I have now added an explanation for it.

      And next time you cite the docs, please try to link to them, like [doc://tr]

      > I didn't think that was needed anymore.

      Perl still needs to be able to operate on byte strings, and to do so in a backwards-compatible way.

      So you need to indicate that the literals in your code contain multi-byte characters in UTF-8 (provided your editor is set to produce UTF-8).

      Otherwise you'll need to use Encode to convert between the two string states.

      This is also necessary when converting from an encoding other than UTF-8.
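
A short sketch of the Encode route, under the assumption that the data arrives as raw octets (the inputs below are made up for illustration and written with escapes, so the encoding of the script file does not matter). The same curly-quoted text is decoded once from UTF-8 and once from cp1252, then transliterated as characters:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical inputs: the same text as UTF-8 octets and as cp1252 octets
my $utf8_octets   = "\xE2\x80\x9Ccan\xE2\x80\x99t\xE2\x80\x9D";
my $cp1252_octets = "\x93can\x92t\x94";

# decode() turns octets into a character string
my $from_utf8   = decode('UTF-8',  $utf8_octets);
my $from_cp1252 = decode('cp1252', $cp1252_octets);

for my $text ($from_utf8, $from_cp1252) {
    # U+201C, U+2019 and U+201D are the left double, right single
    # and right double curly quotes
    $text =~ tr/\x{201C}\x{2019}\x{201D}/"'"/;
}

print "$from_utf8\n";     # "can't"
print "$from_cp1252\n";   # "can't"
```

Once the strings are decoded, tr/// works on whole characters regardless of which encoding the octets came from.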

      Cheers Rolf
      (addicted to the Perl Programming Language :)

      Unless this has changed again (I normally work in ASCII), use utf8 specifically declares that the Perl program itself is in UTF-8. As a side effect (because the DATA handle is the handle the parser used to read the code), it causes any __DATA__ block to also be read as UTF-8.
