You surely *can* split the record in one go.
Version 1 - quotes escaped with a backslash (\):
use strict;
use Data::Dumper;
while (<DATA>)
{
chomp;
my @rec;
# was - foreach (split /"(.*,.*)"|,/) ...
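# keep every non-empty piece; a plain truth test ("if $_") would also drop a field of "0"
foreach (split /"((?:\\"|.)*?)"|,/) { push @rec, $_ if defined($_) && length($_) }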
print Dumper(\@rec);
}
__DATA__
1,"Hello, world",This is good,2
121212,"Simpson, Bart",Springfield,"Roger"
121212,"2\" tape, \"white",springfield,"Roger"
121212,"Simpson \", Bart",Springfield,"Roger"
The output is:
$VAR1 = [
'1',
'Hello, world',
'This is good',
'2'
];
$VAR1 = [
'121212',
'Simpson, Bart',
'Springfield',
'Roger'
];
$VAR1 = [
'121212',
'2\\" tape, \\"white',
'springfield',
'Roger'
];
$VAR1 = [
'121212',
'Simpson \\", Bart',
'Springfield',
'Roger'
];
Update: There was a minor flaw in the original solution: it did not look for escaped quotes inside a quoted field. Here is the enhanced version.
Version 2 - quotes escaped by doubling them (""):
use strict;
use Data::Dumper;
while (<DATA>)
{
chomp;
my @rec;
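foreach (split /"(.*?)(?:(?<!")"(?!")|(?<="")"(?!"))|,/)
{
# skip the undef/empty pieces split produces, but keep a field of "0"
next unless defined($_) && length($_);
# collapse each doubled quote into a single quote
s/""/"/g;
push @rec, $_;
}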
print Dumper(\@rec);
}
__DATA__
1,"Hello, world",This is good,2
121212,"Simpson, Bart",Springfield,"Roger"
121212,"2"" tape, ""white",springfield,"Roger"
121212,"Simpson "", Bart",Springfield,"Roger"
121212,"2""",springfield,"Roger"
The output is:
$VAR1 = [
'1',
'Hello, world',
'This is good',
'2'
];
$VAR1 = [
'121212',
'Simpson, Bart',
'Springfield',
'Roger'
];
$VAR1 = [
'121212',
'2" tape, "white',
'springfield',
'Roger'
];
$VAR1 = [
'121212',
'Simpson ", Bart',
'Springfield',
'Roger'
];
$VAR1 = [
'121212',
'2"',
'springfield',
'Roger'
];
Update: Thanks to antirice for pointing out that quotes inside a quoted field are escaped by doubling the quote ("), not with a backslash (\).
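Both versions above throw away empty fields, because split hands back empty strings around the quoted pieces and the loop has to filter them out. If empty fields matter, one alternative is to match the fields themselves instead of splitting on the separators. The following is only a rough sketch under that assumption, not part of the original solution: parse_csv_line is a hypothetical helper name, and it handles the doubled-quote convention of Version 2 only.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): parse one CSV line
# in which a quote inside a quoted field is written as two quotes ("").
sub parse_csv_line {
    my ($line) = @_;
    my @fields;
    # Each iteration consumes one field plus the comma or end of line after it.
    while ($line =~ /\G(?:"((?:[^"]|"")*)"|([^,"]*))(,|\z)/g) {
        # Copy the captures first: the s/// below would overwrite $1..$3.
        my ($quoted, $bare, $sep) = ($1, $2, $3);
        my $field = defined $quoted ? $quoted : $bare;
        $field =~ s/""/"/g if defined $quoted;   # undo the quote doubling
        push @fields, $field;
        last if $sep eq '';                      # reached the end of the line
    }
    # On malformed input the match fails and only the fields parsed so far are returned.
    return @fields;
}

print join('|', parse_csv_line('121212,"Simpson "", Bart",Springfield,"Roger"')), "\n";
# prints: 121212|Simpson ", Bart|Springfield|Roger
```

Unlike the split versions, this keeps empty fields and a field of "0". For real-world data a dedicated parser such as the CPAN module Text::CSV is probably the safer choice.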