http://qs321.pair.com?node_id=1227736


in reply to Re^4: Handling utf-8 characters when scraping
in thread Handling utf-8 characters when scraping

One way to verify what Perl is really storing internally is Devel::Peek. What I would look for is that the UTF8 flag is on, and the string when shown as UTF-8 is correct:

    use Devel::Peek;
    my $str = "\x{20AC}";
    Dump($str);
    __END__
    SV = PV(0x15c0d70) at 0x15e0440
      REFCNT = 1
      FLAGS = (POK,IsCOW,pPOK,UTF8)
      PV = 0x15e4e10 "\342\202\254"\0 [UTF8 "\x{20ac}"]
      CUR = 3
      LEN = 10
      COW_REFCNT = 1

Here, [UTF8 "\x{20ac}"] is correct: the three bytes \342\202\254 are the UTF-8 encoding of U+20AC, and the UTF8 flag is set in FLAGS. There's also utf8::is_utf8($str) to check for the UTF8 flag, although I'd recommend using that only for debugging as well. If you don't want all the extra output, you might just say:

    use Data::Dump;
    my $str = "\x{20AC}";
    dd $str, utf8::is_utf8($str);
    __END__
    ("\x{20AC}", 1)
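
For comparison, here is a minimal sketch of what the same check might show for a string that has not been decoded yet, which is a common situation when scraping. The variable names $bytes and $chars are only illustrative, and the raw bytes are written out by hand here rather than fetched from a real page:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes for the Euro sign, written by hand to stand in for
# undecoded input (an assumption for illustration, not real scraped data).
my $bytes = "\xE2\x82\xAC";

# Decode the byte string into a Perl character string.
my $chars = decode('UTF-8', $bytes);

# Undecoded: three separate bytes, UTF8 flag off.
printf "bytes: length=%d is_utf8=%d\n",
    length($bytes), utf8::is_utf8($bytes) ? 1 : 0;

# Decoded: one character, UTF8 flag on.
printf "chars: length=%d is_utf8=%d\n",
    length($chars), utf8::is_utf8($chars) ? 1 : 0;

print "decoded string matches \\x{20AC}\n" if $chars eq "\x{20AC}";
```

The undecoded string should report a length of 3 with the flag off, and the decoded one a length of 1 with the flag on. As above, I'd treat the flag as a debugging aid only; the length and the comparison against "\x{20AC}" are the more meaningful checks.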