http://qs321.pair.com?node_id=426988


in reply to utf8 && XML::Simple

Just want to point out (this is not what you're asking) that XMLin can also take a file name as first argument. So instead of:
# Yes, I could use a different slurp function...
my $readin="";
open (IN,"<out.xml");
while (<IN>){
    $readin = $readin . $_;
}
close IN;
my $result = XMLin($readin,KeyAttr=>{item=>'name'},ForceArray=>1);
you can just say:
my $result = XMLin("out.xml",KeyAttr=>{item=>'name'},ForceArray=>1);

Re^2: utf8 && XML::Simple
by zakzebrowski (Curate) on Feb 01, 2005 at 18:26 UTC
    ++ ++ !! This technique works. It looks like XML::Simple will *automatically* read a file as UTF-8. So you must explicitly write the file as UTF-8, and then pass the file name to XMLin to read it... Thanks! Zak
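A minimal sketch of that round trip, assuming XML::Simple is installed. The temporary file and the single `<tag>` element are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::Simple qw(XMLin);

# Hypothetical temporary file standing in for out.xml.
my ($fh, $path) = tempfile(SUFFIX => '.xml', UNLINK => 1);

# Write the document explicitly as UTF-8.
binmode($fh, ':encoding(UTF-8)');
print {$fh} qq{<?xml version="1.0" encoding="UTF-8"?>\n<tag>\x{fb}</tag>\n};
close($fh) or die "Cannot close $path: $!";

# Pass the file name, not the slurped contents, to XMLin.
my $data = XMLin($path, KeepRoot => 1);
printf "tag holds %d character(s), code point U+%04X\n",
    length($data->{tag}), ord($data->{tag});
```

If the file is read this way, the value should come back as a single decoded character rather than as two raw bytes.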


    ----
    Zak - the office
Re^2: utf8 && XML::Simple
by Trace On (Novice) on Apr 16, 2015 at 07:45 UTC
    I have the same problem as zakzebrowski. When I use his example I still do not get the characters right:
    #!/usr/bin/perl
    use XML::Simple;
    use Data::Dumper;
    use Encode;

    my $content = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n";
    $content .= "<tag>\x{c3}\x{bb}</tag>\n";
    print "input:\n$content\n";

    my $xml = new XML::Simple;
    my $data = $xml->XMLin($content, KeepRoot => 1);
    encode_utf8($data->{'tag'});
    print "data: ".$data->{'tag'}."\n";
    print Dumper $data;
    returns:
    input:
    <?xml version="1.0" encoding="UTF-8" ?>
    <tag>û</tag>

    data:
    $VAR1 = {
              'tag' => "\x{fb}"
            };

    My real-life code parses XML with XML::Simple and stores the data in a MySQL database. The database shows the same encoding problems as above.

    I have been staring at this sample code for days now with no idea how to proceed... Any help is appreciated!

      The code appears correct: the bytes C3 BB are the UTF-8 encoding of U+00FB, Latin Small Letter U With Circumflex, and that is the character the dump shows. So the next steps would be to check how the data gets stored in MySQL, how you retrieve it, and how you then display it.
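The byte-versus-character point can be checked with the core Encode module alone. This is a small sketch, not part of the original sample. It also shows that `encode_utf8` hands back the encoded bytes as its return value, so a call whose result is not captured, like the one in the sample, has no visible effect.

```perl
use strict;
use warnings;
use Encode qw(decode encode_utf8);

# The two bytes from the sample XML: the UTF-8 encoding of U+00FB.
my $bytes = "\x{c3}\x{bb}";

# Decoding turns the two bytes into one character, as the XML parser does.
my $char = decode('UTF-8', $bytes);
printf "decoded: %d character(s), code point U+%04X\n",
    length($char), ord($char);

# encode_utf8() returns the bytes; capture the result to use it.
my $out = encode_utf8($char);
printf "encoded: %d byte(s), same as input: %s\n",
    length($out), ($out eq $bytes ? 'yes' : 'no');
```

When printing such a value to a UTF-8 terminal, either print the captured bytes or set an encoding layer on the output handle.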

        Next step? Right step! Forgot a tiny $dbh->do('set names utf8'); in my code. Thanks!