PerlMonks  

Re^2: Is there a way to make these two regex lines cleaner?

by bartender1382 (Beadle)
on Apr 16, 2022 at 19:17 UTC ( [id://11143007]=note )


in reply to Re: Is there a way to make these two regex lines cleaner?
in thread Is there a way to make these two regex lines cleaner?

Awesome catch!

I am using the Spreadsheet::Read module. It is called like this:

my $book  = ReadData ("$upload_dir/$filename");

It will also allow me to read the file into a buffer myself, then hand that buffer off to ReadData.

Sadly, that's failing (see below), and I will have to debug more.

Again, awesome catch! Glad I included the garbage.

open my $fh, '<:raw:encoding(UTF-8)', "$upload_dir/$filename";
read $fh, my $string, -s $fh;
close $fh;
my $book = ReadData ($string);

Replies are listed 'Best First'.
Re^3: Is there a way to make these two regex lines cleaner?
by haukex (Archbishop) on Apr 16, 2022 at 19:30 UTC
    I am using the Spreadsheet::Read module.

    That's an important piece of information missing from the root node! I am guessing that your files are CSV files? Opening any other file type (XLS, XLSX, etc.) with a '<:raw:encoding(UTF-8)' layer will likely corrupt the data read from it, so ReadData($filename) should be preferred there. For CSV files, Spreadsheet::Read uses Text::CSV or Text::CSV_XS under the hood, both of which have a detect_bom option when used directly. Unfortunately, I currently don't see a way to get Spreadsheet::Read to apply that option, so unless Tux has any hints, you could use one of those two CSV modules directly.
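As a rough illustration of using one of those CSV modules directly, here is a minimal sketch, assuming Text::CSV is installed and the file has a header row. The helper name read_csv_with_bom is made up for this example; header() and getline_hr() are the module's own methods. It has not been run against the files from this thread.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical helper (not from the thread): read a CSV file into an
# array of hash references, letting header() deal with any BOM.
sub read_csv_with_bom {
    my ($path) = @_;

    my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });

    # Open without an encoding layer; header() applies one if it finds a BOM.
    open my $fh, '<', $path or die "Cannot open '$path': $!";

    # detect_bom is already the default for header(); it is spelled out here.
    # header() also lower-cases the column names unless told otherwise.
    $csv->header($fh, { detect_bom => 1 });

    my @rows;
    while (my $row = $csv->getline_hr($fh)) {
        push @rows, $row;
    }
    close $fh;

    return \@rows;
}
```

Each row comes back keyed by the lower-cased column name, so a column headed Name is read as $row->{name}.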

    In any case, you may want to check your $filename to see if it's a CSV file first, before handing it off to the processing code appropriate for the file type.
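One simple way to do that check is by file extension. This is only a sketch: the helper name looks_like_csv is invented here, and an extension on an uploaded file is user-supplied, so it is a hint rather than proof of the file type.

```perl
use strict;
use warnings;

# Hypothetical helper: guess from the extension alone whether a file is CSV.
sub looks_like_csv {
    my ($filename) = @_;
    return $filename =~ /\.csv\z/i ? 1 : 0;
}
```

A caller could then send files where looks_like_csv($filename) is true down the CSV-specific path and hand everything else to ReadData($filename) unchanged.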

    Update: Regarding read $fh, my $string, -s $fh; the idiomatic way to slurp a file in Perl is my $string = do { local $/; <$fh> }; (see $/). You also need to check your open for errors; see "open" Best Practices. Other minor edits.
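Put together, the slurp idiom with a checked open might look like the sketch below. The helper name slurp_utf8 is made up for illustration, and the layer string is simply the one from the code earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the whole file as one decoded string.
sub slurp_utf8 {
    my ($path) = @_;

    # Die with the system error message if the file cannot be opened.
    open my $fh, '<:raw:encoding(UTF-8)', $path
        or die "Cannot open '$path': $!";

    # Undefining $/ inside the do block makes <$fh> read everything at once.
    my $string = do { local $/; <$fh> };

    close $fh or die "Cannot close '$path': $!";
    return $string;
}
```

Note that this layer only decodes the bytes. If the file starts with a BOM, it is still present in the returned string as the character U+FEFF.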

Re^3: Is there a way to make these two regex lines cleaner?
by swl (Parson) on Apr 16, 2022 at 23:59 UTC

    You could also use File::BOM to open the file and then pass the file handle to Spreadsheet::Read.

    # untested
    use File::BOM qw( :all );
    use Spreadsheet::Read;
    open_bom(my $fh, $file, ':utf8');
    my $book = ReadData ($fh, parser => "csv");
