http://qs321.pair.com?node_id=745255


in reply to Character coding issues with Spreadsheet::XLSX

If M$ somehow decided that Excel 2007 would not change the way Unicode is handled in spreadsheets, then this might help you out: xls2tsv uses the old Spreadsheet::ParseExcel, but if the Unicode handling hasn't changed, you'll find a consistent clue there about when you need to decode() the cell data from UTF-16BE (and then encode it as utf8 for output) to get what you want.
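To make the decode step concrete, here is a minimal sketch using only the core Encode module. The helper name cell_to_utf8 and its $is_utf16be flag are hypothetical; the flag stands in for whatever clue your parser gives you (in Spreadsheet::ParseExcel I believe that is the cell's encoding being reported as 'ucs2', but check your version). The Latin-1 fallback for other cells is also an assumption, not something I've verified against Spreadsheet::XLSX.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: turn the raw bytes of one cell into UTF-8 bytes.
# $is_utf16be is the "clue" from the parser (assumed, see above).
sub cell_to_utf8 {
    my ($raw, $is_utf16be) = @_;

    # Step 1: bytes -> Perl character string.
    # Assumption: cells not flagged as UTF-16BE hold Latin-1 bytes.
    my $chars = $is_utf16be
        ? decode('UTF-16BE',   $raw)
        : decode('iso-8859-1', $raw);

    # Step 2: character string -> UTF-8 bytes for output.
    return encode('UTF-8', $chars);
}

# "cafe" with e-acute, stored as UTF-16BE (two bytes per character)
my $utf16 = "\x00\x63\x00\x61\x00\x66\x00\xe9";
my $utf8  = cell_to_utf8($utf16, 1);    # bytes 63 61 66 C3 A9
```

If you print to a filehandle that already has a :utf8 or :encoding(UTF-8) layer, skip the encode step and print the decoded string directly, otherwise you'll get double encoding.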

Then again, if M$ did decide to change their Unicode handling in Excel 2007, you might need to get some sort of hex-dump picture of the character data in the cells of interest. Save a spreadsheet with known non-ASCII characters in selected cells, dump what the parser hands back for those cells, and you should be able to work out what needs to be done.
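For the hex-dump picture, something along these lines should do. It's a rough sketch with no dependence on any spreadsheet module: dump_codes is a made-up name, and you'd feed it whatever value your parser returns for each test cell.

```perl
use strict;
use warnings;

# Hypothetical helper: show every element of a string as a hex number.
# Byte strings come out as two-digit values; if Perl already holds
# wide characters, values above FF will show up with more digits.
sub dump_codes {
    my ($value) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $value;
}

# What an e-acute (U+00E9) looks like in three common forms:
print dump_codes("\x00\xe9"), "\n";    # UTF-16BE bytes: 00 E9
print dump_codes("\xc3\xa9"), "\n";    # UTF-8 bytes:    C3 A9
print dump_codes("\xe9"),     "\n";    # Latin-1 byte or already-decoded character: E9
```

A single E9 is ambiguous on its own (Latin-1 byte vs. decoded character), so include at least one test character above U+00FF, such as the euro sign; a decoded string will then show a value like 20AC, which raw bytes never can.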