Re^6: Perl custom sort for Portuguese Lanaguage

Re^7: Perl custom sort for Portuguese Lanaguage by haukex (Archbishop) on Jul 08, 2020 at 21:06 UTC
This works for me: `csv (in => 'quux.csv', filter => {1 => sub { !/^#/ }});` Unfortunately, that also filters lines whose first field is `"#foo"` (with the quotes). I remember Tux recently saying that filtering before parsing isn't supported, though I'm having trouble finding the reference at the moment (it could have been in the chatterbox too). It may be a bit tricky because this is valid CSV too (the line breaks inside the quoted field are written here as `\n`): `abc,"d\n#e\nf",ghi` (That's one row, `["abc", "d\n#e\nf", "ghi"]`.) Update: I looked again and I think it must have been in the chatterbox; I do distinctly remember someone having a similar question recently...
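To illustrate the embedded-newline case described above, here is a minimal sketch (not from the thread). It assumes Text::CSV_XS is installed; the variable names `$data` and `@kept` are made up for the illustration. It compares a naive line-based pre-filter with what the parser sees for the same record.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# One CSV record whose quoted second field contains two embedded newlines.
# ($data and @kept are hypothetical names used only in this sketch.)
my $data = qq{abc,"d\n#e\nf",ghi\n};

# Naive pre-filter: drop every physical line that starts with '#'.
# This removes the middle line of the quoted field and damages the record.
my @kept = grep { !/^#/ } split /^/m, $data;

# A CSV parser with binary => 1 reads the same data as a single row.
my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my $row = $csv->getline($fh);
close $fh;

printf "physical lines kept by the naive filter: %d\n", scalar @kept;
printf "fields in the parsed row: %d\n", scalar @$row;
```

The point of the comparison is only that "starts with `#`" means something different for physical lines than for parsed records, which is why filtering after parsing is the safer default.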
Re^8: Perl custom sort for Portuguese Lanaguage by choroba (Cardinal) on Jul 08, 2020 at 21:27 UTC
The meta info knows whether the field was quoted or not. `#!/usr/bin/perl use warnings; use strict; use Text::CSV_XS; my $csv = 'Text::CSV_XS'->new ({ binary => 1, auto_diag => 1, keep_meta_info => 1 }); open my $in, '<:encoding(utf8)', shift or die $!; while (my $row = $csv->getline($in)) { next if $row->[0] =~ m/^#/ && ! $csv->is_quoted(0); $csv->say(STDOUT, $row); }` Tested with `#x,y,z skip abc,"d #e f",ghi keep #comment skip a,b,c,#xyz keep "#foo",x,y,z keep`
Re^9: Perl custom sort for Portuguese Lanaguage by haukex (Archbishop) on Jul 09, 2020 at 06:06 UTC
"The meta info knows whether the field was quoted or not." True, though AFAICT the `meta_info` doesn't seem to keep track of escaped characters: `use warnings; use strict; use Data::Dump; use Text::CSV; my $csv = Text::CSV->new({ binary=>1, auto_diag=>2, keep_meta_info=>1, escape_char=>"\\" }); while ( my $row = $csv->getline(*DATA) ) { dd $row, $csv->meta_info; } $csv->eof or $csv->error_diag; __DATA__ foo,bar "#foo","bar" #foo,bar \#foo,bar`
Re^8: Perl custom sort for Portuguese Lanaguage by soonix (Canon) on Jul 09, 2020 at 06:31 UTC
In this special case it looks like there won't be Portuguese words starting with a "#", so it would work for the OP, as long as he is aware of it.
Re^9: Perl custom sort for Portuguese Lanaguage by haukex (Archbishop) on Jul 09, 2020 at 06:36 UTC
"In this special case it looks like there won't be portuguese words starting with a "#", so it would work for OP, as long as he is aware of it" True as well `:-)` (I guess this is more about the generic case of filtering comments from CSV files.)
Re^7: Perl custom sort for Portuguese Lanaguage by Tux (Canon) on Jul 09, 2020 at 13:54 UTC
If you only want the first lines starting with `#` to be filtered, that is indeed what `filter` is for: `use Data::Peek; use Text::CSV_XS qw( csv ); my $r = 0; my $aoa = csv (in => DATA, filter => sub { $_[1][0] =~ m/^\s*#/ ? $r : ++$r; }); DDumper $aoa; __END__ # This is comment # and so is this # and this a,b,c #but,not,this 1,2,3` --> `[ [ 'a', 'b', 'c' ], [ '#but', 'not', 'this' ], [ '1', '2', '3' ] ]` Enjoy, Have FUN! H.Merijn
Re^8: Perl custom sort for Portuguese Lanaguage by haukex (Archbishop) on Jul 09, 2020 at 14:08 UTC
Thanks! I see you're filtering lines beginning with `#` when they occur at the beginning of the file; the way I understood the OP's sample data is that the comments can occur anywhere. My worry was that, even though this is probably not the case in the OP's data, `filter`-based solutions will remove lines that may not actually be comments, and I wasn't sure whether there is an easy solution for this: `use warnings; use strict; use Data::Peek; use Text::CSV_XS qw/csv/; DDumper csv( in=>DATA, escape_char=>"\\", filter => sub { $_[1][0] !~ m/^\s*#/ }); __DATA__ # This is a comment a,b,c # Also a comment x,y,z "#not",a,comment \#also,not,"a comment"` Output: `[ [ 'a', 'b', 'c' ], [ '' ], [ 'x', 'y', 'z' ], [ '' ] ]`
Re^9: Perl custom sort for Portuguese Lanaguage by Tux (Canon) on Jul 09, 2020 at 14:14 UTC
So more or less like this? `DDumper csv ( in => DATA, sep => "|", filter => sub { $_[1][0] =~ m/^\s*#/ && @{$_[1]} == 1 ? 0 : 1; }, );` Which would not even need a ternary if slightly rewritten. Enjoy, Have FUN! H.Merijn
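One way the ternary-free rewrite hinted at above might look, as a sketch only (not Tux's code): the filter simply returns the negated condition. It assumes Text::CSV_XS is installed; the sample data in `$data` is invented for the illustration and uses `|` as the separator, as in the snippet above.

```perl
use strict;
use warnings;
use Text::CSV_XS qw( csv );

# Hypothetical sample input: one comment line, then three data lines,
# one of which starts with '#' but has more than one field.
my $data = join "\n",
    '# This is a comment',
    'a|b|c',
    '#x|y|z',
    '1|2|3',
    '';

my $aoa = csv(
    in     => \$data,
    sep    => "|",
    filter => sub {
        # $_[1] is the current row (an array ref). Keep the row unless it
        # consists of a single field starting with optional whitespace
        # followed by '#'.
        !( @{ $_[1] } == 1 && $_[1][0] =~ m/^\s*#/ );
    },
);

printf "rows kept: %d\n", scalar @$aoa;
```

With this input the filter should drop only the first line, since `#x|y|z` has three fields. A comment that itself contains the separator would still be kept as data, so the single-field test is a heuristic rather than a guarantee.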
Re^10: Perl custom sort for Portuguese Lanaguage (updated x2) by haukex (Archbishop) on Jul 09, 2020 at 14:20 UTC

