http://qs321.pair.com?node_id=11115129


in reply to Re^4: Lost in compressed encodings
in thread Lost in compressed encodings

I would suggest using Text::CSV_XS to read/parse the CSV. It already knows how to deal with UTF-8.

It can use a TAB as the separator character:

    use Text::CSV_XS;

    my $csv = Text::CSV_XS->new ({ binary => 1, sep_char => "\t", auto_diag => 1 });
    my @headers = $csv->header ($in); # Read the docs, there are options possible here
    while (my $row = $csv->getline ($in)) {
        # ...
        }
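For anyone who wants to try this in isolation, here is a minimal self-contained sketch. The temp file, its contents and the @rows array are made up for the illustration, and passing sep_set to header () is my assumption about the safest way to make sure TAB is the separator it settles on; check the docs for your version.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );
use Text::CSV_XS;

# Hypothetical sample data: a header line plus two TAB-separated rows,
# written as UTF-8 so the read side has something non-ASCII to decode.
my ($out, $file) = tempfile (SUFFIX => ".tsv", UNLINK => 1);
binmode $out, ":encoding(UTF-8)";
print $out "id\tname\n", "1\tcaf\x{e9}\n", "2\tna\x{ef}ve\n";
close $out or die "close $file: $!";

# Open with a decoding layer, so the fields come back as characters.
open my $in, "<:encoding(UTF-8)", $file or die "open $file: $!";

my $csv = Text::CSV_XS->new ({ binary => 1, sep_char => "\t", auto_diag => 1 });

# Name TAB explicitly as the only separator header () may detect.
my @headers = $csv->header ($in, { sep_set => [ "\t" ] });

# Collect the remaining lines; each $row is an array ref of fields.
my @rows;
while (my $row = $csv->getline ($in)) {
    push @rows, $row;
    }
close $in;

printf "%s: %d rows\n", join (",", @headers), scalar @rows;
```

In the real case $in would be whatever handle you already have open on the decompressed data.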

Update: I realized later that the csv function of Text::CSV_XS can already read gzip-compressed input when a gzip layer is passed as part of the encoding attribute.

    use Text::CSV_XS qw( csv );
    use PerlIO::via::gzip;

    my $aoa = csv (in => "test.csv.gz", encoding => ":via(gzip)");
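Since the data in this thread is TAB-separated, the csv function needs to be told so as well. A small sketch of that, under some assumptions: the sample data is invented, it is read from an in-memory string instead of a .gz file so it runs without PerlIO::via::gzip, and headers => "auto" is my choice to get a list of hashes keyed on the first line. I would expect the encoding attribute shown above to combine with these options, but I have not tested that combination here.

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# Hypothetical in-memory sample: header line plus two TAB-separated rows.
my $data = "id\tname\n1\tfoo\n2\tbar\n";

# sep_char is passed on to the parser; headers => "auto" takes the first
# line as column names and returns a reference to an array of hashes.
my $aoh = csv (in => \$data, sep_char => "\t", headers => "auto");

printf "%s => %s\n", $_->{id}, $_->{name} for @$aoh;
```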

Enjoy, Have FUN! H.Merijn