PerlMonks  

Re: Looking for assistance for proper fix for Spreadsheet::XLSX bug(?)

by Tux (Canon)
on May 29, 2020 at 08:41 UTC ( [id://11117450]=note )


in reply to Looking for assistance for proper fix for Spreadsheet::XLSX bug(?)

I say it once more: DO NOT USE Spreadsheet::XLSX ! It is dead, buggy and not maintained anymore!

Chasing bugs in this old crap is a waste of time, as the bugs won't get fixed anyway. The module has served its goal and was useful, very useful, when it was written (and maintained), but please people, stop using it.

Install and use Spreadsheet::ParseXLSX, rewrite your script to match its API, and then check whether you still have problems.
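A minimal sketch of what a script rewritten for Spreadsheet::ParseXLSX might look like, assuming the module is installed from CPAN. It follows the Spreadsheet::ParseExcel-style interface that Spreadsheet::ParseXLSX documents (parse, worksheets, row_range, col_range, get_cell). The sub name read_cells and the file passed on the command line are hypothetical, not taken from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use open ':std', ':encoding(UTF-8)';

# Hypothetical helper: return [sheet name, row, col, value] for every
# non-empty cell in the workbook at $path.
sub read_cells {
    my ($path) = @_;

    # Loaded lazily so the script still compiles where the module is missing.
    require Spreadsheet::ParseXLSX;

    my $parser   = Spreadsheet::ParseXLSX->new;
    my $workbook = $parser->parse($path)
        or die "Cannot parse $path\n";

    my @cells;
    for my $sheet ($workbook->worksheets) {
        my ($row_min, $row_max) = $sheet->row_range;
        my ($col_min, $col_max) = $sheet->col_range;

        for my $row ($row_min .. $row_max) {
            for my $col ($col_min .. $col_max) {
                my $cell = $sheet->get_cell($row, $col);
                next unless defined $cell;    # cell was never written

                my $value = $cell->value;
                next unless defined $value && length $value;

                push @cells, [ $sheet->get_name, $row, $col, $value ];
            }
        }
    }
    return @cells;
}

# Usage: perl script.pl workbook.xlsx
if (@ARGV) {
    printf "%s!R%dC%d = %s\n", @$_ for read_cells($ARGV[0]);
}
```

Skipping undefined cells and undefined values before using them is one way to avoid "uninitialized value" warnings on sparse sheets; whether it removes every warning on a given workbook is not something this sketch guarantees.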


Enjoy, Have FUN! H.Merijn
Re^2: Looking for assistance for proper fix for Spreadsheet::XLSX bug(?)
by nysus (Parson) on May 29, 2020 at 12:01 UTC

    I did try Spreadsheet::ParseXLSX (stumbled on it while trying to see what to do about this problem) but I found it gave a lot of warnings for empty cells and it did not even handle the utf-8 euro symbol at all. I've come to the conclusion that there is no great solution for parsing Excel files with Perl. That makes me sad.

    $PM = "Perl Monk's";
    $MCF = "Most Clueless Friar Abbot Bishop Pontiff Deacon Curate Priest Vicar";
    $nysus = $PM . ' ' . $MCF;
    Click here if you love Perl Monks

      "...I found it gave a lot of warnings for empty cells and it did not even handle the utf-8 euro symbol at all"

      Emphasis mine. From rt://40061:

      'The Euro symbol isn't stored in Excel as a single character. It is stored as a UTF-16 character and is returned by Spreadsheet::ParseExcel as a Perl utf8 character.'

      'This is what you are seeing and in terms of the way Spreadsheet::ParseExcel handles Unicode it is the correct behaviour.'

      'How you should handle the utf8 string from there depends on what you want to do with it.'
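To illustrate the point in the quoted ticket, here is a minimal sketch using the core Encode module. It assumes the parser has handed back a Perl character string; the variable name $cell_value and its content are hypothetical stand-ins for a real cell. The character string is encoded to UTF-8 bytes only at the point where it leaves the program.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical cell content: a Perl character string holding
# U+20AC (EURO SIGN) followed by "40".
my $cell_value = "\x{20AC}40";

# Encode to UTF-8 bytes before writing to a file, socket, or terminal.
my $bytes = encode('UTF-8', $cell_value);

# The euro sign is one character but three bytes in UTF-8.
printf "characters: %d, bytes: %d\n", length $cell_value, length $bytes;

# Decoding the bytes gives back the original character string.
my $round_trip = decode('UTF-8', $bytes);
print "round trip ok\n" if $round_trip eq $cell_value;
```

Declaring an output layer, such as use open ':std', ':encoding(UTF-8)', achieves the same encoding step implicitly for the standard filehandles.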

      further on..

      "I've come to the conclusion that there is no great solution for parsing Excel files with Perl. That makes me sad."

      If you are saddened by what is available, extend what's there or write your own. Alternatively, use Spreadsheet::Read:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use feature 'say';
      use Spreadsheet::Read;
      use open ':std', ':encoding(UTF-8)';

      my $book = ReadData('euro.xlsx');
      my $cell = $book->[1]{A1};
      say $cell;

      Output:

      €40

        Spreadsheet::Read also uses Spreadsheet::XLSX under the hood. Buggy or not, I found that Spreadsheet::XLSX handled the euro symbol without issue via Spreadsheet::BasicRead.

        I think the heart of the problem, as I've discovered, is that the modules are trying to apply a custom format improperly. There is really no reason for these modules to apply the custom formatting at all, because all it does is insert some padding into the cell. To address this, I'm just going to strip the odd padding directive, _€, out of the format and not worry about this anymore.
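The workaround described above could be sketched roughly as follows. In Excel number formats an underscore followed by a character reserves blank space the width of that character, which is the padding being discussed. The sub name strip_euro_padding and the sample format string are hypothetical; where the format string is actually obtained depends on the parsing module and is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every "_€" padding directive
# (underscore followed by U+20AC) from an Excel number format string.
sub strip_euro_padding {
    my ($format) = @_;
    (my $clean = $format) =~ s/_\x{20AC}//g;
    return $clean;
}

# Hypothetical custom format with euro-width padding after the number.
my $format  = "#,##0.00_\x{20AC}";
my $cleaned = strip_euro_padding($format);

print "cleaned format: $cleaned\n";
```

Formats that do not contain the directive pass through unchanged, so the helper can be applied to every cell format without checking first.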

        All the XLSX modules, with the exception of Spreadsheet::Read, haven't been touched in about 4 years and have long issue queues. But Spreadsheet::Read relies on modules that are not maintained.


      Maybe fix those problems instead then? Post an SSCCE
Re^2: Looking for assistance for proper fix for Spreadsheet::XLSX bug(?)
by nysus (Parson) on May 29, 2020 at 12:13 UTC

    And based on the issue queue which looks to be largely ignored, I'd say this ParseXLSX is abandonware.


      IMHO it shows two things:

      1. People use it
      2. It needs more people to help fix the issues

      Don't moan, help!


      Enjoy, Have FUN! H.Merijn

        There is a difference between moaning and pointing things out. The two shouldn't be confused.

