http://qs321.pair.com?node_id=11117438

nysus has asked for the wisdom of the Perl Monks concerning the following question:

I've gotten the ball to the 4 yard line on a problem I've been working on wrt to the Spreadsheet::XLSX module. Looking for help banging this problem into the end zone. I don't know enough about XLSX parsing to properly fix this.

The problem:

I'm parsing a .xlsx file generated by quickbooks. The spreadsheet has some custom format codes in it. Here's the appropriate xml from the styles.xml file embedded in the xlsx archive:

<numFmt numFmtId="165" formatCode="#,##0.00\ _€"/><numFmt numFmtId="16 +6" formatCode="&quot;$&quot;* #,##0.00\ _€"/>

When extracting data from the cell with this custom format code, the value looked like this:

'20.00  €'

Note the euro symbol tacked on to the end. This symbol does not show up when viewing the spreadsheet.

The hack:

To get this working the way I wanted, I hacked the Spreadsheet::XLSX module. I made two changes. One to the __load_styles function:

sub __load_styles { my ($zip) = @_; my $member_styles = $zip->memberNamed('xl/styles.xml'); my @styles = (); my %style_info = (); if ($member_styles) { my $formatter = Spreadsheet::XLSX::Fmt2007->new(); foreach my $t ($member_styles->contents =~ /xf\ numFmtId="([^" +]*)"(?!.*\/cellStyleXfs)/gsm) { #" push @styles, $t; } my $default = $1 || ''; foreach my $t1 (@styles) { $member_styles->contents =~ /numFmtId="$t1" formatCode="([ +^"]*)/; my $formatCode = $1 || ''; if ($formatCode eq $default || not($formatCode)) { if ($t1 == 9 || $t1 == 10) { $formatCode = '0.00000%'; } elsif ($t1 == 14) { $formatCode = 'yyyy-mm-dd'; } else { $formatCode = ''; } } $formatCode = $formatter->FmtStringDef($t1); $style_info{$t1} = $formatCode; $default = $1 || ''; } } return (\@styles, \%style_info); }

The other change I made was to the Spreadsheet::XLSX::Fmt2007.pm module where I added in the hex values for "165" and "166" along with the format I wanted without the euro sign:

0xA5 => '#,##0.00', 0xA6 => '&quot;$&quot;* #,##0.00',

Obviously this is a bad hack. What is the proper way to fix these modules so I can have custom fomatters that work as intended when parsing an xlsx file?

$PM = "Perl Monk's";
$MCF = "Most Clueless Friar Abbot Bishop Pontiff Deacon Curate Priest Vicar";
$nysus = $PM . ' ' . $MCF;
Click here if you love Perl Monks