If your data is really encoded in UTF-16 rather than UTF-8, and it contains characters with a code point above 127, you're going to be in a world of hurt. I also want to point out that XML::XPath was last updated in 2003 and still has several outstanding bugs (I know, because I patched a couple of them for my own processing, and even that was years ago). For XPath processing, it is HIGHLY recommended to use XML::LibXML, which handles all the character encodings properly, so you don't have to mangle your input XML. Here's a PerlMonks intro to get you started: "Stepping up from XML::Simple to XML::LibXML"
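To make that concrete, here is a minimal sketch of running an XPath query over UTF-16 input with XML::LibXML. The sample document, its element names, and the XPath expression are all invented for illustration; `encode` from the core Encode module is used only to simulate UTF-16 bytes as they might arrive from a file or socket.

```perl
use strict;
use warnings;
use Encode qw(encode);
use XML::LibXML;

# Hypothetical sample document: the element names and content are made up.
# \x{e9} is "é", a character with a code point above 127.
my $text = qq{<?xml version="1.0" encoding="UTF-16"?>\n}
         . qq{<library><book id="1"><title>Caf\x{e9}</title></book></library>};

# Simulate real UTF-16 input. Encode's 'UTF-16' writes a BOM followed by
# big-endian data.
my $bytes = encode('UTF-16', $text);

# Hand the raw bytes straight to the parser; no manual re-encoding or
# stripping of the XML declaration is needed.
my $doc = XML::LibXML->load_xml(string => $bytes);

# XPath results come back as decoded Perl character strings.
my $title = $doc->findvalue('/library/book[@id="1"]/title');

# Encode on output so the non-ASCII character prints correctly.
binmode(STDOUT, ':encoding(UTF-8)');
print "Title: $title\n";
```

The point of the sketch is that the input bytes are passed to the parser untouched, and the encoding only has to be dealt with again on output. For real data you would more likely use `load_xml(location => $path)` and let the parser read the file itself.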