Contributed by dbetz on Mar 20, 2000 at 20:48 UTC
Q&A  > strings


Normally I would use split on ',', but that will not work because of the comma embedded within certain fields. Is there a "perl" way to parse this into its proper fields?

"R", "2164", "27-2164", "270102", "Add Terminal Server to John, Jane, and George's PC's", "3/13/00", "3/27/00", "00/00/00", "02:00:00", "Jane Doe", " 3 - Released", "Jane Doe"

Edited by davido: Added code tags and more legible formatting.

Answer: How can I split a comma-delimited string when the fields can have commas in them?
contributed by chromatic

The best answer is to use Text::CSV from CPAN. Otherwise, you'll have to craft a regex which can handle obscure cases like commas between quotes, escaped quotes within quotes, and other funny stuff like that.

One possibility is:

$string =~ m!"*?([^,])"*?(?:=,)!;
Answer: How can I split a comma-delimited string when the fields can have commas in them?
contributed by turnstep

Here's an answer from Mastering Regular Expressions:

sub parse_csv { my $text = shift; ## record containing comma-separated values my @new = (); push(@new, $+) while $text =~ m{ ## the first part groups the phrase inside the quotes "([^\"\\]*(?:\\.[^\"\\]*)*)",? | ([^,]+),? | , }gx; push(@new, undef) if substr($text, -1,1) eq ','; return @new; ## list of values that were comma-spearated } ## Use like this: @goodlist = parse_csv($csvlist);

Ugly, to be sure, but the complexity level really kicks up a notch when you add the delimiters into the fields themselves. Also, the above snippet allows quotes inside the fields, as long as they are backslashed.

Please (register and) log in if you wish to add an answer

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.