Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Parse error in Spreadsheet::ParseExcel::Simple

by igelkott (Priest)
on Apr 19, 2008 at 14:19 UTC ( [id://681650]=perlquestion: print w/replies, xml ) Need Help??

igelkott has asked for the wisdom of the Perl Monks concerning the following question:

I'm using Spreadsheet::ParseExcel::Simple to copy data from excel files. This works most of the time but has failed for one of the eight (nearly identical) test files and I can't figure out why.

In this one file, the sheets returned by Spreadsheet::ParseExcel::Simple are not in the same order as found in excel. Actually, I can't find the data I want in any sheet until I delete the other sheets (manually in excel).

The simple conclusion is that this one file is somehow corrupt but a simple VB script (not shown out of respect to the group) finds the sheets in the right order and includes the desired data.

The simple hack shown below is what I've used to look for the missing data and discover the sheet ordering error. It simply echoes a few cells from each sheet and works as expected on the other test files. My real script runs on a unix system with Perl 5.8 but I get the same results on a windows machine (Perl 5.10).

#!/usr/bin/perl use strict; use warnings; use Spreadsheet::ParseExcel::Simple; my $as_file = shift or die("Victim required"); my $xls = Spreadsheet::ParseExcel::Simple->read("$as_file") or die("Ex +cel unread"); my $s = 0; for my $sheet ($xls->sheets) { $s++; my $n = 0; print "#################################\n"; print " sheet $s\n"; print "#################################\n"; while ($sheet->has_data) { $n++; my @d = $sheet->next_row; $#d = 3 if @d > 3; print join("\t", @d), "\n"; last if $n > 10; } }

It may just be some pathological problem with this particular excel file so that the real issue is completely off-topic but thought I'd at least see if there's something simple I'm missing or if others have seen parsing errors with this module.

TIA, iggi

Replies are listed 'Best First'.
Re: Parse error in Spreadsheet::ParseExcel::Simple
by roboticus (Chancellor) on Apr 21, 2008 at 13:49 UTC
    igelkott:

    I'm not certain I understand the problem. Is it that the sheets aren't in the order you're expecting or that your code can't find the sheets until others are deleted?

    If it's the first case, try accessing the sheets by name rather than index. That way you'll be able to find them even if they're reordered or created in a different order than you expect. (Of course, then you're relying on people not renaming the sheets!)

    I don't have any ideas for you if it's the second case.

    ...roboticus
      accessing the sheets by name rather than index

      I tried doing through Spreadsheet::ParseExcel rather than the "simple" version.

      I get the sheet names in the right order (not that that is so important) but the contents appear with a different sheet. For example:

      Sheet foo contents of bar Sheet bar contents of baz Sheet baz empty

      What I want is the data from "foo" (what I expect to be the first sheet). As you suggest, if I could get it by name, all would be saved ... even if the event would be hard to explain.

      I was a bit vague regarding the "reappearing data" because it's just too weird. In excel, I deleted the other sheets from the file and found that I could then find my "foo" data on the "foo" sheet (the only one left).

        igelkott:

        I suspect that it's not a corrupt file, but that the sheets may not be stored in order of appearance on the tab bar. To check it, try adding the method:

        sub sheet_name { shift->{sheet}->{Name}; }

        to the Spreadsheet::ParseExcel::Simple.pm file, then you can print the sheet name associated with the sheet to check it out. Then, of course, let me know what you discover!

        ...roboticus
Re: Parse error in Spreadsheet::ParseExcel::Simple
by Aim9b (Monk) on Apr 21, 2008 at 11:15 UTC
    Since you have access to a windows box, you might try another perl option such as WIN32::OLE, just to see if it offers the same results. It's the only one I've played with so far. Good luck.
      try ... WIN32::OLE

      Thanks for the suggestion. Actually, I just tested Win32::OLE earlier today. A simple script (not shown) found that the sheet names and contents were in the right order. The "desired" data was present and readable.

      The only practical problem for me is that I'd be difficult to use this on the unix server (I don't even have wine there). If all else fails, I could switch things around to have the client (windows) computers do the parsing, but I'll waste some more time on this approach before giving up.

      Just for curiosity, I'll try to figure out why Spreadsheet::ParseExcel::Simple fails. Is it a weird excel file or a real bug?

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://681650]
Approved by ikegami
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others surveying the Monastery: (3)
As of 2024-04-19 23:52 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found