http://qs321.pair.com?node_id=461873

pepik_knize has asked for the wisdom of the Perl Monks concerning the following question:

Much to my dismay, I have to process Excel spreadsheets from our (low-tech) vendors. I have a script that opens the file, parses it, checks the data, then writes out an XML file. One vendor somehow generates an Excel 4.0 document (which ParseExcel can't handle). I was about to suggest that she get OpenOffice, but then I found the issue I'm wondering about:

If I save a file in OpenOffice Calc (saving as Excel, version doesn't matter), ParseExcel sees the formatting instead of the values for numeric fields. Specifically, if I open a new Calc worksheet, then enter '1', 'abc', '4' in rows A -- C, ParseExcel will show 'GENERAL', 'abc', 'GENERAL'. If I then change the formatting of column A to text, I'll see '1'. Excel apparently understands what OO wrote, though: if I then open the same file in Excel, it looks and behaves normally. Saving the file with Excel eliminates the problem. Is this an OO bug? Is this just something ParseExcel can't handle? (It also doesn't handle dates well, changing 12/31/2004 into 38352.)

I would appreciate any advice on this.

Of all the causes that conspire to blind
Man's erring judgment, and misguide the mind,
What the weak head with strongest bias rules,
Is pride, the never-failing vice of fools.
-- Pope.
  • Comment on Spreadsheet::ParseExcel incompatible with Openoffice?

Re: Spreadsheet::ParseExcel incompatible with Openoffice?
by monarch (Priest) on May 30, 2005 at 23:51 UTC

    May I ask a bit more about your problem? I have a (small) interest in ParseExcel, as well as WriteExcel, as I recently wrote a script to deliver hundreds of Excel spreadsheets using the WriteExcel module.

    You were saying that your customer generates an Excel 4.0 document, and that ParseExcel can't handle it. Do you mean that ParseExcel just aborts on the file, or does it return the kind of problems you were experiencing with OpenOffice Calc?

    Also, according to the CPAN documentation for ParseExcel, each cell object has a number of fields. They include: Value, Val, Type, Code, Format, Merged, and Rich. Which field were you viewing? Is it possible that if you were dumping:

    $oWkC = $oWkS->{Cells}[$iR][$iC];
    print $oWkC->Value;
    that maybe the value you're looking for could be in:
    print $oWkC->{Val};
    ?

    Perhaps a contribution of:

    use Data::Dumper; print Dumper( $oWkC );
    might aid in your question here?

      ParseExcel cannot open the original document (Excel 4.0). The specific error is "Couldn't open your file::Bad file descriptor."

      As it turns out, I forgot that I was using ParseExcel::Simple. That wasn't really a problem, though, as when I switch back to ParseExcel, I see the same behavior.

      Now, that was a great suggestion to look at the object using Data::Dumper. I dumped the objects as they were saved by OO, then did the same after saving (but not changing) the file with Excel. Two fields differed: _Value and FmtIdx. It seems that OO causes the _Value field to be set to "GENERAL" for numeric cells, while the actual value is in the Val field. Excel has the value in both places. In other words, you were correct to surmise:

      that maybe the value you're looking for could be in:
      print $oWkC->{Val};

      I had switched over to ParseExcel::Simple because it was easier to use, but I suppose that if I'd like to get this vendor to send me usable files, I'll need to switch back and use the Val value. <sigh>
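For anyone hitting the same thing, the workaround described above could be sketched roughly as follows. This is only a minimal sketch: `cell_text` is a hypothetical helper name, and it assumes the cell is a hash reference carrying the `_Value` and `Val` fields seen in the Data::Dumper output. Those are internals of the cell object, so they may differ between ParseExcel versions.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Spreadsheet::ParseExcel).
# Assumption: $cell is a hashref with the _Value and Val keys seen in the dump.
sub cell_text {
    my ($cell) = @_;
    return unless defined $cell;               # empty cell: nothing to report

    my $formatted = $cell->{_Value};           # formatted value ("GENERAL" in OO-saved files)
    my $raw       = $cell->{Val};              # raw stored value

    # OO-saved numeric cells show the format name instead of the number,
    # so fall back to the raw value. A cell whose text really is "GENERAL"
    # has the same string in Val, so the fallback is harmless there.
    return $raw
        if defined $raw && defined $formatted && $formatted eq 'GENERAL';

    return defined $formatted ? $formatted : $raw;
}

# Plain hashes standing in for cell objects:
print cell_text({ _Value => 'GENERAL', Val => 1 }),     "\n";   # OO-saved numeric cell
print cell_text({ _Value => 'abc',     Val => 'abc' }), "\n";   # text cell
```

With a real workbook the same helper would be called on each entry of `$oWkS->{Cells}[$iR][$iC]`, which rules out ParseExcel::Simple for this file, as noted above.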

      ...

      On further examination, that doesn't help the date formatting at all. I'll still have to convert that, but I'm not too worried about it. (It should be related to the Julian date.) Thanks very much for your help!
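On the date side, 38352 is consistent with Excel's serial-date convention, where a date is stored as a day count starting from the beginning of 1900 rather than as a true Julian date. A minimal conversion sketch, assuming the workbook uses the default 1900 date system and the dates fall after February 1900; `excel_serial_to_date` is a hypothetical name, and 25569 is the serial number that corresponds to 1970-01-01:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: convert an Excel serial date (1900 date system)
# to an ISO date string. Any fractional part (time of day) is dropped.
sub excel_serial_to_date {
    my ($serial) = @_;
    my $days_since_unix_epoch = int($serial) - 25569;   # 25569 = 1970-01-01
    my $epoch = $days_since_unix_epoch * 86400;         # seconds per day
    return strftime('%Y-%m-%d', gmtime($epoch));
}

print excel_serial_to_date(38352), "\n";   # prints 2004-12-31
```

The Spreadsheet::ParseExcel distribution also ships a Spreadsheet::ParseExcel::Utility module with date helpers, which may be a better fit than hand-rolled arithmetic if it is available in the installed version.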

Re: Spreadsheet::ParseExcel incompatible with Openoffice?
by sekitan (Beadle) on May 31, 2005 at 01:02 UTC
    I've had similar problems with ParseExcel. In such cases, I was able to use OLE to get the job done. Even using newer versions of Excel through OLE to process xls files created by old versions didn't give me any trouble.

    However, I'm guessing you are not a Win32 Monk. Unless another monk has a good solution, Perl on Windows would be an easy fix. Best of luck.

      Actually, I am a Win32 Monk. However much I'd like to use OLE, though, I can't. This is a CGI script that currently lives on my laptop, where I access it. Eventually it will be an outward-facing page that resides on a Linux box, where our vendors will submit their files. No OLE there, I'm afraid.

      Thanks for the suggestion!