Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

I've gotten the ball to the 4 yard line on a problem I've been working on wrt to the Spreadsheet::XLSX module. Looking for help banging this problem into the end zone. I don't know enough about XLSX parsing to properly fix this.

The problem:

I'm parsing a .xlsx file generated by quickbooks. The spreadsheet has some custom format codes in it. Here's the appropriate xml from the styles.xml file embedded in the xlsx archive:

<numFmt numFmtId="165" formatCode="#,##0.00\ _€"/><numFmt numFmtId="16 +6" formatCode="&quot;$&quot;* #,##0.00\ _€"/>

When extracting data from the cell with this custom format code, the value looked like this:

'20.00  €'

Note the euro symbol tacked on to the end. This symbol does not show up when viewing the spreadsheet.

The hack:

To get this working the way I wanted, I hacked the Spreadsheet::XLSX module. I made two changes. One to the __load_styles function:

sub __load_styles { my ($zip) = @_; my $member_styles = $zip->memberNamed('xl/styles.xml'); my @styles = (); my %style_info = (); if ($member_styles) { my $formatter = Spreadsheet::XLSX::Fmt2007->new(); foreach my $t ($member_styles->contents =~ /xf\ numFmtId="([^" +]*)"(?!.*\/cellStyleXfs)/gsm) { #" push @styles, $t; } my $default = $1 || ''; foreach my $t1 (@styles) { $member_styles->contents =~ /numFmtId="$t1" formatCode="([ +^"]*)/; my $formatCode = $1 || ''; if ($formatCode eq $default || not($formatCode)) { if ($t1 == 9 || $t1 == 10) { $formatCode = '0.00000%'; } elsif ($t1 == 14) { $formatCode = 'yyyy-mm-dd'; } else { $formatCode = ''; } } $formatCode = $formatter->FmtStringDef($t1); $style_info{$t1} = $formatCode; $default = $1 || ''; } } return (\@styles, \%style_info); }

The other change I made was to the Spreadsheet::XLSX::Fmt2007.pm module where I added in the hex values for "165" and "166" along with the format I wanted without the euro sign:

0xA5 => '#,##0.00', 0xA6 => '&quot;$&quot;* #,##0.00',

Obviously this is a bad hack. What is the proper way to fix these modules so I can have custom fomatters that work as intended when parsing an xlsx file?

$PM = "Perl Monk's";
$MCF = "Most Clueless Friar Abbot Bishop Pontiff Deacon Curate Priest Vicar";
$nysus = $PM . ' ' . $MCF;
Click here if you love Perl Monks


In reply to Looking for assistance for proper fix for Spreadsheet::XLSX bug(?) by nysus

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (3)
As of 2024-03-29 14:50 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found