http://qs321.pair.com?node_id=1199814

mje has asked for the wisdom of the Perl Monks concerning the following question:

I've been using XML::Fast to process XML files successfully for some time. However, the code was moved to a newer machine and now fails in some circumstances. One difference between the machines is the XML::Fast version: 0.11 on the original machine (working) and 0.17 on the new machine (not working). Upgrading to 0.17 on the old machine, with no other changes, makes it fail there too.

The error I'm getting is:

    Failed to encode 2017-9-21T08-49-17.XML to JSON for indexing - malformed or illegal unicode character in string [�ndby IF], cannot convert to JSON at xx.pm line 1827.

The XML file comes from a third party and is ISO-8859-1 encoded. The bit it is complaining about is <Value>Br<F8>ndby IF</Value>, where <F8> is the raw byte 0xF8 (ø in ISO-8859-1). A cut-down version of the XML which fails is:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <xx feedtype="delta"><Timestamp CreatedTime="2017-09-21T06:49:17" TimeZone="GMT"/><Value>Brøndby IF</Value></xx>
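For reference, here is a small sketch using only the core Encode module of what that one byte looks like at each stage. It is an illustration of the encodings involved, not of what XML::Fast does internally; the variable names are just for this example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The raw bytes as they sit in the ISO-8859-1 file (0xF8 is "ø").
my $bytes = "Br\xF8ndby IF";

# Decode the bytes into a Perl character string (10 characters).
my $chars = decode('ISO-8859-1', $bytes);

# Encode the characters as UTF-8, the form a JSON encoder emits (11 bytes).
my $utf8 = encode('UTF-8', $chars);

printf "code point:  U+%04X\n", ord(substr($chars, 2, 1));
printf "utf-8 bytes: %s\n",
    join ' ', map { sprintf '%02X', ord($_) } split //, $utf8;
# code point:  U+00F8
# utf-8 bytes: 42 72 C3 B8 6E 64 62 79 20 49 46
```

A lone 0xF8 byte is not valid UTF-8 on its own, which would fit the "malformed or illegal unicode character" wording if a string holding that raw byte were ever treated as UTF-8.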

The code which is now failing is:

    use Cpanel::JSON::XS;
    use XML::Fast;

    sub esIndexFile2 {
        my ($self, $file) = @_;

        my $xml = do {
            local $/ = undef;
            open (my $fh, "<:encoding(ISO-8859-1)", $file)
                or die "Failed to open $file - $!";
            <$fh>;
        };
        $xml =~ s/^(?:.*\n)//; # remove first line - the encoding line

        my $hash;
        eval {
            $hash = xml2hash $xml;
        };
        if (my $ev = $@) {
            warn("Failed to parse file $file for indexing - $@ - SKIPPING");
            return;
        }

        my $json = eval {
            encode_json($hash); # <------------ fails here
        };
        if (my $ev = $@) {
            $self->logwarn("Failed to encode $file to JSON for indexing - $@ - SKIPPING");
            return;
        }
        return 1;
    }

The Changes file for XML::Fast is not very helpful. I have discovered that adding utf8decode => 1 to the xml2hash call makes it work again, but I don't really understand why. Am I doing anything wrong here? What might have changed in XML::Fast to cause this?
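To narrow it down, one thing I could do is inspect the hash before it reaches encode_json. Below is a rough sketch using only core functions. find_bad_strings is a hypothetical helper of my own, not part of XML::Fast or Cpanel::JSON::XS, and the sample hash is hand-built to simulate a raw 0xF8 byte that has been flagged as UTF-8. I am only assuming that is the kind of value 0.17 hands back; I have not confirmed it.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: walk a nested structure and collect the paths of
# strings that carry the UTF8 flag but are not well-formed UTF-8 inside.
sub find_bad_strings {
    my ($data, $path, $bad) = @_;
    $path //= '';
    $bad  //= [];
    if (ref($data) eq 'HASH') {
        find_bad_strings($data->{$_}, "$path/$_", $bad) for sort keys %$data;
    }
    elsif (ref($data) eq 'ARRAY') {
        find_bad_strings($data->[$_], "$path\[$_]", $bad) for 0 .. $#$data;
    }
    elsif (defined $data && !ref $data) {
        push @$bad, $path if utf8::is_utf8($data) && !utf8::valid($data);
    }
    return $bad;
}

# Simulated input (an assumption, not real xml2hash output):
# a byte string containing 0xF8 with the UTF8 flag forced on.
my $suspect = "Br\xF8ndby IF";
Encode::_utf8_on($suspect);

my $hash = { xx => { Value => $suspect, Other => 'plain ascii' } };

my $bad = find_bad_strings($hash);
print "$_\n" for @$bad;
# /xx/Value
```

Calling find_bad_strings($hash) on the real result of xml2hash, with and without utf8decode => 1, should show whether the two XML::Fast versions differ in how they flag the returned strings.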