As I wrote in the CB, I think File::BOM might be the way to do it. I'm still searching, but at least I have a working example that reproduces the error:
#!/usr/bin/perl
use v5.12;
use warnings;
use autodie qw( :all );
use File::BOM;
use DBI;
open my $h,'>:encoding(utf-8)','test.csv';
say $h qq<\x{FEFF}"foo","bar","baz">;
say $h qq<"1","2","3">;
say $h qq<"4","5","6">;
close $h;
my $dbh=DBI->connect('dbi:CSV:',undef,undef,{ RaiseError => 1, PrintError => 0, f_ext => '.csv'});
my $sth=$dbh->prepare('select * from test');
$sth->execute();
while (my @a=$sth->fetchrow_array()) {
    say join(",",@a);
}
$sth->finish();
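To see what the parser is confronted with, here is a minimal sketch that writes one such line and reads it back through a plain UTF-8 layer, without File::BOM. The file name bom-demo.csv is made up for this sketch. It only shows that the BOM arrives as an ordinary character in front of the first quote; it says nothing about how DBD::CSV reads the file internally.

```perl
#!/usr/bin/perl
use v5.12;
use warnings;
use autodie;

# Hypothetical file name, used only for this sketch.
my $file='bom-demo.csv';

# Write a single header line that starts with a BOM, as in the test script.
open my $out,'>:encoding(utf-8)',$file;
say $out qq<\x{FEFF}"foo","bar","baz">;
close $out;

# Read it back with a plain UTF-8 layer, i.e. without File::BOM.
open my $in,'<:encoding(utf-8)',$file;
my $line=<$in>;
close $in;
unlink $file;

# The BOM is decoded as a regular character preceding the opening quote.
my $first=substr($line,0,1);
my $second=substr($line,1,1);
printf "first char: U+%04X, second char: %s\n",ord($first),$second;
```

So the first field does not start with a quote character, but with U+FEFF followed by a quote.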
Update:
It seems that only the header() method in Text::CSV_XS is able to handle the BOM, and DBD::CSV does not call that method. Instead, it calls getline().
#!/usr/bin/perl
use v5.14; # the package BLOCK syntax used below needs Perl 5.14
use warnings;
use autodie qw( :all );
use File::BOM;
use DBI;
use Data::Dumper;
package My::Text::CSV_XS {
    use parent "Text::CSV_XS";

    # Dump the argument list in a compact, sorted form.
    sub DUMP
    {
        return Data::Dumper->new([\@_],['*_'])->Sortkeys(1)->Indent(1)->Useqq(1)->Dump();
    }

    # Each wrapper logs the call and then delegates to Text::CSV_XS.
    sub new
    {
        my $proto=shift;
        say "$proto -> new(",DUMP(@_),")";
        $proto->SUPER::new(@_);
    }

    sub header
    {
        my $self=shift;
        say "$self -> header(",DUMP(@_),")";
        $self->SUPER::header(@_);
    }

    sub getline
    {
        my $self=shift;
        say "$self -> getline(",DUMP(@_),")";
        $self->SUPER::getline(@_);
    }

    sub getline_hr
    {
        my $self=shift;
        say "$self -> getline_hr(",DUMP(@_),")";
        $self->SUPER::getline_hr(@_);
    }

    sub getline_all
    {
        my $self=shift;
        say "$self -> getline_all(",DUMP(@_),")";
        $self->SUPER::getline_all(@_);
    }

    sub getline_hr_all
    {
        my $self=shift;
        say "$self -> getline_hr_all(",DUMP(@_),")";
        $self->SUPER::getline_hr_all(@_);
    }
}
open my $h,'>:encoding(utf-8)','test.csv';
say $h qq<\x{FEFF}"foo","bar","baz">;
say $h qq<"1","2","3">;
say $h qq<"4","5","6">;
close $h;
my $dbh=DBI->connect('dbi:CSV:',undef,undef,{ RaiseError => 1, PrintError => 0, f_ext => '.csv', csv_class => 'My::Text::CSV_XS'});
my $sth=$dbh->prepare('select * from test');
$sth->execute();
while (my @a=$sth->fetchrow_array()) {
    say join(",",@a);
}
$sth->finish();
>perl test2.pl
My::Text::CSV_XS -> new(@_ = (
{
"auto_diag" => 1,
"binary" => 1,
"escape_char" => "\"",
"quote_char" => "\"",
"sep_char" => ","
}
);
)
My::Text::CSV_XS -> new(@_ = (
{
"auto_diag" => 1,
"binary" => 1,
"eol" => "\r\n",
"escape_char" => "\"",
"quote_char" => "\"",
"sep_char" => ","
}
);
)
My::Text::CSV_XS=HASH(0x23e04e8) -> getline(@_ = (
bless( \*Symbol::GEN1, 'IO::File' )
);
)
DBD::CSV::st execute failed:
Execution ERROR: Missing first row due to EIF - Loose unescaped quote at /usr/lib64/perl5/vendor_perl/DBI/DBD/SqlEngine.pm line 1480.
.
at /usr/lib64/perl5/vendor_perl/DBI/DBD/SqlEngine.pm line 1271.
[for Statement "select * from test"] at test2.pl line 71, <GEN1> line 1.
>
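For comparison, a minimal sketch of reading a file like this directly with Text::CSV_XS and its header() method, bypassing DBD::CSV completely. It assumes a Text::CSV_XS recent enough to provide header(); the file name header-demo.csv is made up for this sketch. This is not what DBD::CSV does, it only illustrates the difference between header() and a bare getline().

```perl
#!/usr/bin/perl
use v5.12;
use warnings;
use autodie;
use Text::CSV_XS;

# Hypothetical file name, used only for this sketch.
my $file='header-demo.csv';

# Same content as test.csv: a BOM, a quoted header line, two data rows.
open my $out,'>:encoding(utf-8)',$file;
say $out qq<\x{FEFF}"foo","bar","baz">;
say $out qq<"1","2","3">;
say $out qq<"4","5","6">;
close $out;

my $csv=Text::CSV_XS->new({ binary => 1, auto_diag => 1 });

# Open without an encoding layer; header() looks at the BOM itself.
open my $fh,'<',$file;

# header() skips the BOM and uses the first row as column names.
my @cols=$csv->header($fh);
say join(",",@cols);

# The remaining rows come back as hashes keyed by those column names.
my @rows;
while (my $row=$csv->getline_hr($fh)) {
    push @rows,join(",",@{$row}{@cols});
}
close $fh;
unlink $file;

say for @rows;
```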
Second Update:
There is an ugly way to get File::BOM into DBD::CSV. DBD::CSV inherits from DBD::File, and DBD::File has an f_encoding attribute that is simply wrapped in :encoding(...), without further processing. That string is then passed to binmode.
This is the relevant part of DBD::File:
sub apply_encoding
{
    my ($self, $meta, $fn) = @_;
    defined $fn or $fn = "file handle " . fileno ($meta->{fh});
    if (my $enc = $meta->{f_encoding}) {
        binmode $meta->{fh}, ":encoding($enc)" or
            croak "Failed to set encoding layer '$enc' on $fn: $!";
        }
    else {
        binmode $meta->{fh} or croak "Failed to set binary mode on $fn: $!";
        }
    } # apply_encoding
So, with a carefully crafted encoding string, it works at least on my system:
#!/usr/bin/perl
use v5.12;
use warnings;
use autodie qw( :all );
use File::BOM;
use DBI;
open my $h,'>:encoding(utf-8)','test.csv';
say $h qq<\x{FEFF}"foo","bar","baz">;
say $h qq<"1","2","3">;
say $h qq<"4","5","6">;
close $h;
my $dbh=DBI->connect(
    'dbi:CSV:',
    undef,
    undef,
    {
        RaiseError => 1,
        PrintError => 0,
        f_ext => '.csv',
        f_encoding => 'utf-8):via(File::BOM' # <-- this is the evil trick!
    }
);
my $sth=$dbh->prepare('select * from test');
$sth->execute();
while (my @a=$sth->fetchrow_array()) {
    say join(",",@a);
}
$sth->finish();
Effectively, this calls binmode $meta->{fh}, ":encoding(utf-8):via(File::BOM)", and so, File::BOM can hide the BOM from Text::CSV_XS and DBD::CSV.
>perl test3.pl
1,2,3
4,5,6
>
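The trick relies on nothing more than string interpolation. A small sketch, using a made-up helper layer_string() that mirrors the ":encoding($enc)" expression from apply_encoding (the helper is not part of DBD::File):

```perl
#!/usr/bin/perl
use v5.12;
use warnings;

# Hypothetical helper mirroring the interpolation in apply_encoding:
# the f_encoding value is placed between ":encoding(" and ")".
sub layer_string
{
    my ($enc)=@_;
    return ":encoding($enc)";
}

# A regular value yields a single layer ...
say layer_string('utf-8');

# ... while the crafted value closes the first layer early and opens a second one.
say layer_string('utf-8):via(File::BOM');
```

The second call prints :encoding(utf-8):via(File::BOM), which is the layer string that ends up in binmode. Since this depends on f_encoding being interpolated unchecked, it may stop working if DBD::File ever validates that attribute.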
Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)