http://qs321.pair.com?node_id=461873

pepik_knize has asked for the wisdom of the Perl Monks concerning the following question:

Much to my dismay, I have to process Excel spreadsheets from our (low-tech) vendors. I have a script that opens each file, parses it, checks the data, then writes out an XML file. One vendor somehow generates an Excel 4.0 document, which Spreadsheet::ParseExcel can't handle. I was about to suggest that she get OpenOffice, but then I ran into the issue I'm asking about:

If I save a file in OpenOffice Calc (saving as Excel; the version doesn't matter), ParseExcel sees the formatting instead of the values for numeric fields. Specifically, if I open a new Calc worksheet and enter '1', 'abc', '4' in columns A through C, ParseExcel shows 'GENERAL', 'abc', 'GENERAL'. If I then change the formatting of column A to text, I see '1'. Excel apparently knows what OpenOffice means, though: if I open the same file with Excel, it looks and acts normally, and saving the file from Excel eliminates the problem. Is this an OpenOffice bug, or is it just something ParseExcel can't handle? (It also doesn't handle dates well, turning 12/31/2004 into 38352.)
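For illustration, here is a minimal sketch of two things touched on above. The number 38352 is an Excel serial date (days counted in the 1900 date system), so it can be turned back into a calendar date with core Perl. Separately, a cell's stored value can be read apart from its formatted value. The sub names serial_to_ymd and dump_raw_cells are made up for this example, and the second sub assumes a reasonably recent Spreadsheet::ParseExcel that provides parse(), worksheets(), get_cell(), unformatted() and type(). Whether reading the unformatted value avoids the 'GENERAL' output for a given OpenOffice-written file is an assumption, not something confirmed here.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Convert an Excel serial date (1900 date system) to YYYY-MM-DD.
# Serial 25569 is 1970-01-01, so the offset from it gives Unix epoch days.
# Only meant for serials above 60 (after Excel's fictitious 1900-02-29).
sub serial_to_ymd {
    my ($serial) = @_;
    my $epoch = (int($serial) - 25569) * 86400;
    return strftime('%Y-%m-%d', gmtime($epoch));
}

# Hypothetical helper: print stored and formatted values side by side.
# The module is loaded lazily so the date sub works without it installed.
sub dump_raw_cells {
    my ($path) = @_;
    require Spreadsheet::ParseExcel;
    my $parser   = Spreadsheet::ParseExcel->new();
    my $workbook = $parser->parse($path)
        or die "Cannot parse $path: ", $parser->error(), "\n";

    for my $sheet ($workbook->worksheets()) {
        my ($row_min, $row_max) = $sheet->row_range();
        my ($col_min, $col_max) = $sheet->col_range();
        for my $row ($row_min .. $row_max) {
            for my $col ($col_min .. $col_max) {
                my $cell = $sheet->get_cell($row, $col) or next;
                my $raw = $cell->unformatted();   # value as stored in the file
                # Assumption: cells the parser types as 'Date' hold a serial.
                $raw = serial_to_ymd($raw)
                    if $cell->type() eq 'Date' && $raw =~ /^\d+(?:\.\d+)?$/;
                printf "%d,%d raw=%s formatted=%s\n",
                    $row, $col, $raw, $cell->value();
            }
        }
    }
}

print serial_to_ymd(38352), "\n";   # prints 2004-12-31

# Pass a spreadsheet path on the command line to dump its cells.
dump_raw_cells($ARGV[0]) if @ARGV;
```

If the date output is off by years, the workbook may use the 1904 date system instead, which this sketch does not handle.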

I would appreciate any advice on this.

Of all the causes that conspire to blind
Man's erring judgment, and misguide the mind,
What the weak head with strongest bias rules,
Is pride, the never-failing vice of fools.
-- Pope.