I've gotten the ball to the 4 yard line on a problem I've been working on wrt to the Spreadsheet::XLSX module. Looking for help banging this problem into the end zone. I don't know enough about XLSX parsing to properly fix this.
The problem:
I'm parsing a .xlsx file generated by quickbooks. The spreadsheet has some custom format codes in it. Here's the appropriate xml from the styles.xml file embedded in the xlsx archive:
<numFmt numFmtId="165" formatCode="#,##0.00\ _€"/><numFmt numFmtId="16
+6" formatCode=""$"* #,##0.00\ _€"/>
When extracting data from the cell with this custom format code, the value looked like this:
'20.00 €'
Note the euro symbol tacked on to the end. This symbol does not show up when viewing the spreadsheet.
The hack:
To get this working the way I wanted, I hacked the
Spreadsheet::XLSX module. I made two changes. One to the
__load_styles function:
sub __load_styles {
my ($zip) = @_;
my $member_styles = $zip->memberNamed('xl/styles.xml');
my @styles = ();
my %style_info = ();
if ($member_styles) {
my $formatter = Spreadsheet::XLSX::Fmt2007->new();
foreach my $t ($member_styles->contents =~ /xf\ numFmtId="([^"
+]*)"(?!.*\/cellStyleXfs)/gsm) { #"
push @styles, $t;
}
my $default = $1 || '';
foreach my $t1 (@styles) {
$member_styles->contents =~ /numFmtId="$t1" formatCode="([
+^"]*)/;
my $formatCode = $1 || '';
if ($formatCode eq $default || not($formatCode)) {
if ($t1 == 9 || $t1 == 10) {
$formatCode = '0.00000%';
} elsif ($t1 == 14) {
$formatCode = 'yyyy-mm-dd';
} else {
$formatCode = '';
}
}
$formatCode = $formatter->FmtStringDef($t1);
$style_info{$t1} = $formatCode;
$default = $1 || '';
}
}
return (\@styles, \%style_info);
}
The other change I made was to the Spreadsheet::XLSX::Fmt2007.pm module where I added in the hex values for "165" and "166" along with the format I wanted without the euro sign:
0xA5 => '#,##0.00',
0xA6 => '"$"* #,##0.00',
Obviously this is a bad hack. What is the proper way to fix these modules so I can have custom fomatters that work as intended when parsing an xlsx file?
-
Are you posting in the right place? Check out Where do I post X? to know for sure.
-
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big>
<blockquote> <br /> <dd>
<dl> <dt> <em> <font>
<h1> <h2> <h3> <h4>
<h5> <h6> <hr /> <i>
<li> <nbsp> <ol> <p>
<small> <strike> <strong>
<sub> <sup> <table>
<td> <th> <tr> <tt>
<u> <ul>
-
Snippets of code should be wrapped in
<code> tags not
<pre> tags. In fact, <pre>
tags should generally be avoided. If they must
be used, extreme care should be
taken to ensure that their contents do not
have long lines (<70 chars), in order to prevent
horizontal scrolling (and possible janitor
intervention).
-
Want more info? How to link
or How to display code and escape characters
are good places to start.