Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Excel Spreadsheet Data Set

by rob_au (Abbot)
on Aug 20, 2001 at 15:46 UTC ( [id://106188]=CUFP: print w/replies, xml ) Need Help??

This code was written for a web site where the optimal data source for the dynamic pages would be an Excel spreadsheet. This choice in data source, while perhaps not as flexible as a relational database, would allow the client to update their web site through an administrative web interface where the updated data set from the Excel spreadsheet can be readily uploaded to the site.
 
The following code supports multiple 'categories' of data sets across multiple Excel sheets and returns the sheet data in the hash key 'data' and the data fields in the hash key 'fields' - At present, this code assumes the data field names to reside in the first row of the Excel sheet with data fields aligned in columns.
 
The resultant hashref returned by this routine can readily be accessed by other modules such as Data::Dumper and XML::Simple - If there is sufficient interest, I can post some additional code showing how I have employed the returned hashref in my code.
sub _load_xls ($) { my ($fname) = shift; return undef unless (-e $fname); my ($parse) = Spreadsheet::ParseExcel->new(); my ($excel) = $parse->Parse($fname); my (%data); for (my $sheetcount = 0; $sheetcount < $excel->{SheetCount}; $shee +tcount++) { my ($sheet) = $excel->{Worksheet}[$sheetcount]; my (@fields, @data); for (my $field = $sheet->{MinCol}; defined $sheet->{MinCol} && + $field <= $sheet->{MaxCol}; $field++) { my ($cell) = $sheet->{Cells}[$sheet->{MinRow}][$field]; push (@fields, $cell->Value) if ($cell); }; for (my $row = $sheet->{MinRow} + 1; defined $sheet->{MinRow} +&& $row <= $sheet->{MaxRow}; $row++) { my (%item); for (my $col = $sheet->{MinCol}; defined $sheet->{MinCol} +&& $col <= $sheet->{MaxCol}; $col++) { my ($cell) = $sheet->{Cells}[$row][$col]; $item{$fields[$col - $sheet->{MinCol}]} = $cell->Value + if ($cell); }; push (@data, \%item); }; $data{$sheet->{Name}} = { 'fields' => \@fields, 'data' => \@data, }; }; return (\%data); };

Replies are listed 'Best First'.
Re: Excel Spreadsheet Data Set
by George_Sherston (Vicar) on Aug 20, 2001 at 16:13 UTC
    This is great - thanks very much for posting it. I think it's a good solution to a problem I'm facing, which is that I have to get a lot of different people, many of whom don't know a web site from a hole in the ground, much less MYSQL, to submit data. And one thing they DO understand is excel. But it's very nice not to have to do all that cutting and pasting. ++, you the man.

    In an attempt to make this comment worthy of its own node, I'd like to pose the general question to all monks, whether they have any keen ideas how to let users interact with perl by using systems they're already familiar with. I guess the challenge is making it error-proof, because it's easy to ask someone to submit a tab-separated list, but harder to make sure he or she puts all the necessary tabs and no unnecessary tabs. Forcing users to use OUR choice of interface is great because it makes it error-proof, but it's not so nice for the user.

    By the way, I have an ancient fax programme (maybe 1995) and when the fax is being sent it flips up a little picture of a fax machine. Which is funny. Because I remember when FAXES were the confusing weird technology that nobody understood, whereas now the fax is the comforting old-fashioned thing everyone's grown up with, which we use to help explain what's going on with all this bizarre modern computer stuff.

    § George Sherston
      In an attempt to make this comment worthy of its own node, I'd like to pose the general question to all monks, whether they have any keen ideas how to let users interact with perl by using systems they're already familiar with. I guess the challenge is making it error-proof, because it's easy to ask someone to submit a tab-separated list, but harder to make sure he or she puts all the necessary tabs and no unnecessary tabs. Forcing users to use OUR choice of interface is great because it makes it error-proof, but it's not so nice for the user.
       
      It was very much with this idea in mind that I wrote this code. I played with a few data storage methods for this web site - I decided that MySQL was more than what was needed by this client, DBM didn't really allow for easy update without yet-another-interface, CSV wasn't flexible enough (or more too that I don't really like using DBD::CSV) and while XML provided portability and easy updating, Excel seemed to be an easier option for the client.
       
      This data storage method in combination with a web-upload facility, I believe, will allow my client to readily update their web site with minimal fuss and without having to come back to me, as their web developer, to consistently update their site. Autonomy equates to empowerment and in a world where increasingly users are playing 'catch-up' in the IT sector, empowerment can translate (I hope) to higher levels of satisfaction.
       

       
      Ooohhh, Rob no beer function well without!
Re: Excel Spreadsheet Data Set
by dmmiller2k (Chaplain) on Aug 22, 2001 at 21:36 UTC

    This is a great solution if your site is hosted on a Lose321 system, but not as useful on UNIX/Linuxen.

    More often, I find myself receiving data in spreadsheet form and having to somehow manipulate it in perl before inserting it into a database, or vice versa.

    I have found it to be fairly cumbersome to save the individual worksheets from an Excel workbook file (*.xls) as CSV files. Once this is done, of course, using either DBI::CSV or just straight Perl to manipulate the data is simple and straightforward (and probably not worthy of further discussion).

    The reverse direction is even more cumbersome: putting several CSV files into a workbook as separate worksheets.

    Anyone know of a portable library (preferably with a Perl module interface) for directly manipulating Excel workbook files? It would have to work on UNIX (Solaris, and maybe Linux).

    Thanks, dmm

    Just call me the Anti-Gates ...
    

    1 Lose32 - I first saw this term used by Al Stevens in his C Programming column in Dr. Dobbs Journal sometime in 2000, I forget exactly when.

      The reverse direction is even more cumbersome: putting several CSV files into a workbook as separate worksheets.

      The Spreadsheet::WriteExcel module can create a multiple worksheet Excel file on Unix. The standard distribution comes with an example program called csv2xls.pl which will convert CSV to an Excel file. There is also an example here.

      Anyone know of a portable library (preferably with a Perl module interface) for directly manipulating Excel workbook files? It would have to work on UNIX (Solaris, and maybe Linux).

      Takinori Kawai's DBD::Excel module provides a DBI wrapper around Spreadsheet::WriteExcel and Spreadsheet::ParseExcel on all platforms. In addition Spreadsheet::ParseExcel, by the same author, contains a module called Spreadsheet::ParseExcel::SaveParser which uses WriteExcel and ParseExcel to provide a round-trip access to Excel files.

      None of these solutions are 100% effective but they may serve your needs.

      John.
      --

      The issue with directly manipulating Excel files is that you have to have a good parser and a good writer. Spreadsheet::ParseExcel works quite well as a Parser. We use it at work, quite effectively. (It's impossible to read his code, and there's numerous improvements that can be made to his alpha release, but that's neither here nor there.)

      However, Spreadsheet::WriteExcel has been known to have issues, especially when working with Spreadsheet::ParseExcel, primarily with formulas.

      That aside, those two should make a very good pair.

      ------
      /me wants to be the brightest bulb in the chandelier!

      Vote paco for President!


        However, Spreadsheet::WriteExcel has been known to have issues, especially when working with Spreadsheet::ParseExcel, primarily with formulas.

        "Issues" might scare away the punters. ;-)

        The situation is as follows. Excel writes a formula in two parts. The RPN binary encoding of the formula and the result of the calculation. Spreadsheet::WriteExcel only writes the encoded formula part because it wouldn't be possible to the calculate, a priori, the result of arbitrarily complex expressions. Equally Spreadsheet::ParseExcel only reads the result part of a formula because it isn't in a position to evaluate the formula. Therefore, when Spreadsheet::ParseExcel reads a formula written by Spreadsheet::WriteExcel it only sees a blank result.

        In the circumstances this is a best effort by both modules and most users would be unaffected by it.

        John.
        --

      This is a great solution if your site is hosted on a Lose321 system, but not as useful on UNIX/Linuxen.
       
      Entirely untrue - This solution was specifically built so that the client could upload their Excel spreadsheet from their Win32 machine to the Unix web server where the data could be integrated into the web page.
       
      Anyone know of a portable library (preferably with a Perl module interface) for directly manipulating Excel workbook files? It would have to work on UNIX (Solaris, and maybe Linux).
       
      You might do good to take a further look into the Spreadsheet::ParseExcel module - This module provides excellent methods for extracting spreadsheet data and format information, irrelevant of the platform.
       

       
      Ooohhh, Rob no beer function well without!

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: CUFP [id://106188]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others musing on the Monastery: (2)
As of 2024-04-25 05:55 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found