Text::CSV encoding parse()

slugger415 has asked for the wisdom of the Perl Monks concerning the following question:

Hello esteemed monks, I am using Text::CSV to parse an array of text strings (pipe delimited) and want to use UTF-8 encoding to read the strings. In the doc at https://metacpan.org/pod/Text::CSV#new I see this instruction:

On parsing (both for "getline" and "parse"), if the source is marked being UTF8, then all fields that are marked binary will also be marked UTF8.

I have set my 'new' instance to binary, and it mostly works, except some accented characters are showing up in the resulting web page as black diamond question marks, e.g. conexi�n. (Japanese and other language characters look fine.) Is there something else I need to set? If I don't use Text::CSV and just 'split' the strings, those characters look fine, and correct.

my $csv = Text::CSV->new ({ binary => 1, sep_char => "|" });
foreach my $row (@sorted_urls) {
    $csv->parse($row);
    # processing
}

Thank you.

Re: Text::CSV encoding parse() by haukex (Archbishop) on Aug 13, 2019 at 18:05 UTC
"some accented characters are showing up in the resulting web page as black diamond question marks"
Are you sure you've also set your output filehandles to the correct encoding, and have specified that encoding in the HTML? Please provide a Short, Self-Contained, Correct Example. To debug the input end of the process, see my suggestions at Re: Parsing Problems (updated).
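As an illustration of the output-filehandle point above, here is a minimal sketch that is not taken from the thread. It assumes the input is the UTF-8 bytes of "conexión" and that they have been decoded into a character string. It uses in-memory filehandles (a choice made only for this example) to show which bytes get written with and without an encoding layer. This is one possible cause of the symptom, not a confirmed diagnosis of the original script.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: the UTF-8 bytes of "conexión" as they might arrive from a file.
my $bytes = "conexi\xC3\xB3n";
my $text  = decode('UTF-8', $bytes);    # now a character string of 8 characters

# Print the decoded string to an in-memory handle that has no encoding layer.
my $raw_out = '';
open my $raw_fh, '>', \$raw_out or die "open: $!";
print {$raw_fh} $text;
close $raw_fh;

# Print the same string through a UTF-8 encoding layer.
my $utf8_out = '';
open my $utf8_fh, '>:encoding(UTF-8)', \$utf8_out or die "open: $!";
print {$utf8_fh} $text;
close $utf8_fh;

# Show the bytes each handle actually produced.
printf "no layer:   %s\n", join ' ', map { sprintf '%02X', ord } split //, $raw_out;
printf "utf8 layer: %s\n", join ' ', map { sprintf '%02X', ord } split //, $utf8_out;
```

Without the layer, the ó is written as the single byte F3, which is not valid UTF-8 on its own, so a browser that was told the page is UTF-8 shows a replacement character. With the layer, it is written as C3 B3. For real output, the equivalent would be something like `binmode STDOUT, ':encoding(UTF-8)';` before printing.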
Re^2: Text::CSV encoding parse() by slugger415 (Monk) on Aug 13, 2019 at 18:45 UTC
Hi, yes, I'm using the CGI module and have it properly set: `print $q->header(-charset => 'utf-8');` And as mentioned, if I don't use Text::CSV, the characters display correctly.
Re^3: Text::CSV encoding parse() by haukex (Archbishop) on Aug 13, 2019 at 19:44 UTC
"Hi, yes I'm using the CGI module and have it properly set: `print $q->header(-charset => 'utf-8');` And as mentioned if I don't use Text::CSV the characters display correctly."
OK, but I'm sorry, there still isn't enough information to answer your question. Have another look at my reply above, plus the links therein.
Re^4: Text::CSV encoding parse() by slugger415 (Monk) on Aug 14, 2019 at 17:43 UTC
Re^5: Text::CSV encoding parse() by haukex (Archbishop) on Aug 14, 2019 at 19:55 UTC
Re^5: Text::CSV encoding parse() by choroba (Cardinal) on Aug 14, 2019 at 19:35 UTC
Re^3: Text::CSV encoding parse() by jcb (Parson) on Aug 14, 2019 at 03:24 UTC
That means that you are declaring to the browser that your output is UTF-8. Is it actually UTF-8?
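One way to check whether a given byte string really is UTF-8 is to try a strict decode and see whether it fails. The sketch below is not from the thread. `is_valid_utf8` is a hypothetical helper name and the sample strings are made up. It relies only on the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper (not from the thread): returns 1 if $bytes decodes as strict UTF-8, else 0.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input; LEAVE_SRC leaves $bytes unmodified.
    return eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;
}

# Made-up samples: "conexión" encoded as UTF-8 and as Latin-1.
my @samples = (
    [ 'utf8'   => "conexi\xC3\xB3n" ],
    [ 'latin1' => "conexi\xF3n"     ],
);

for my $sample (@samples) {
    my ($label, $bytes) = @$sample;
    printf "%-7s %s\n", $label, is_valid_utf8($bytes) ? 'valid UTF-8' : 'NOT valid UTF-8';
}
```

Running the same check on the bytes the script actually sends to the browser (captured to a file, for instance) would show whether the declared charset matches the real output.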
Re^4: Text::CSV encoding parse() by slugger415 (Monk) on Aug 14, 2019 at 17:48 UTC
Re^5: Text::CSV encoding parse() by afoken (Chancellor) on Aug 14, 2019 at 23:03 UTC
Re^5: Text::CSV encoding parse() by jcb (Parson) on Aug 14, 2019 at 23:55 UTC