http://qs321.pair.com?node_id=489211

Angharad has asked for the wisdom of the Perl Monks concerning the following question:

Hi there. This is a question for a friend rather than myself. I can't think of any way around it, other than modifying the file to put a space between each item and splitting on that, but anyway. If one has the following info in a text file:
"foo1","foo2","foo3","foo4","foo5","foo6","foo7","foo8","foo9"
How might one split it so that, for example, you extract "foo1", including the ',' in the variable? Splitting on the ',' removes it from the result, of course. Thanks.

Replies are listed 'Best First'.
Re: another split question
by ysth (Canon) on Sep 05, 2005 at 12:46 UTC
    A couple of ways:
    @elements = split /(?<=,)/, $text;
    or
    @elements = $text =~ /([^,]+,?)/g;
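
To make the difference concrete, here is a minimal sketch of both suggestions run against a shortened sample line. The variable names `@by_split` and `@by_match` are only illustrative, and the sketch assumes no field contains an embedded comma.

```perl
use strict;
use warnings;

my $text = '"foo1","foo2","foo3"';

# Zero-width lookbehind: split at every position that follows a comma,
# so each comma stays attached to the element before it.
my @by_split = split /(?<=,)/, $text;

# Global match in list context: each element is a run of non-comma
# characters followed by an optional comma.
my @by_match = $text =~ /([^,]+,?)/g;

print "$_\n" for @by_split;
```

Both arrays should hold the same three strings; only the last element has no trailing comma, because the input line does not end with one.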
Re: another split question
by TomDLux (Vicar) on Sep 05, 2005 at 14:45 UTC

    The other way to deal with it is to use split /,/ anyway, and then just know that there is a comma at the end of the string. You could even stick it back on if you wanted to.
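
One possible reading of "stick it back on" is sketched below. The names `@fields` and `@with_commas` are made up for illustration, and the choice to leave the final field without a comma is an assumption based on the sample line, which has no trailing comma.

```perl
use strict;
use warnings;

my $text = '"foo1","foo2","foo3"';

# Plain split on the comma: the separators are discarded.
my @fields = split /,/, $text;

# Re-append a comma to every field except the last one.
my @with_commas = map {
    $_ < $#fields ? "$fields[$_]," : $fields[$_]
} 0 .. $#fields;

print "$_\n" for @with_commas;
```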

    But it makes me wonder why you want the comma. I bet you really want foo1, not the quotes, not the comma. Probably if you re-consider whatever processing happens next, your code might simplify.

    Or maybe I'm wrong.

    TomDLux

    --
    TTTATCGGTCGTTATATAGATGTTTGCA

Re: another split question
by Anonymous Monk on Sep 05, 2005 at 12:28 UTC
    perldoc -f split
    If the PATTERN contains parentheses, additional list elements are created from each matching substring in the delimiter.

        split(/([,-])/, "1-10,20", 3);

    produces the list value

        (1, '-', 10, ',', 20)
    How to RTFM
      This is not exactly what the OP asked for. Of course it may be a starting point to achieve that based on the gathered data. But that would be more complex than is needed IMHO. ysth's reply comes closer to what he (the OP) really wants.
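
A rough sketch of the extra step that the capturing-parentheses route would need is shown below. The names `@parts` and `@elements` are illustrative only; the point is that the separators come back as separate list elements and have to be glued onto the preceding field by hand.

```perl
use strict;
use warnings;

my $text = '"foo1","foo2","foo3"';

# Capturing parentheses make split return each separator as its own element:
# ('"foo1"', ',', '"foo2"', ',', '"foo3"')
my @parts = split /(,)/, $text;

# Glue each field to the separator that follows it, if there is one.
my @elements;
while (@parts) {
    my $field = shift @parts;
    my $sep   = @parts ? shift @parts : '';
    push @elements, $field . $sep;
}

print "$_\n" for @elements;
```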
      Just a suggestion... where L stands for "Link" ;)
      [doc://split]
      produces split.

      Flavio
      perl -ple'$_=reverse' <<<ti.xittelop@oivalf

      Don't fool yourself.
Re: another split question
by sh1tn (Priest) on Sep 05, 2005 at 22:55 UTC
    Alternatively regex can be used:
    $text = '"foo1","foo2","foo3","foo4","foo5","foo6","foo7","foo8","foo9"';
    push @comma_elems, $1 while $text =~ /(\S+?,)/g;
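
Note that the pattern above requires a comma after every match, so the last field of a line without a trailing comma is not captured. The sketch below shows that behaviour on a shortened line, together with one possible variant that also accepts end-of-string; the variant and the array name `@all_elems` are illustrative additions, not part of the original reply.

```perl
use strict;
use warnings;

my $text = '"foo1","foo2","foo3"';

# Pattern from the reply: every match must end in a comma,
# so the final field ("foo3") is skipped.
my @comma_elems;
push @comma_elems, $1 while $text =~ /(\S+?,)/g;

# Variant: let a match end either at a comma or at the end of the string.
my @all_elems;
push @all_elems, $1 while $text =~ /(\S+?(?:,|$))/g;

print scalar(@comma_elems), " vs ", scalar(@all_elems), "\n";
```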


Re: another split question
by Roger (Parson) on Sep 05, 2005 at 12:34 UTC
    I just realised that is not what OP is looking for...
    There is the hard way...

    use strict;
    use Data::Dumper;

    while (<DATA>) {
        chomp;
        my @rec;
        foreach (split /"(.*?)(?:(?<!")"(?!")|(?<="")"(?!"))|,/) {
            s/""/"/g, push @rec, $_ if $_
        }
        print Dumper(\@rec);
    }

    __DATA__
    1,"Hello, world",This is good,2
    121212,"Simpson, Bart",Springfield,"Roger"
    121212,"2"" tape, ""white",springfield,"Roger"
    121212,"Simpson "", Bart",Springfield,"Roger"
    121212,"2""",springfield,"Roger"

    And there is the easy way...
    use Text::CSV_XS;
    use Data::Dumper;

    my $csv = Text::CSV_XS->new();

    while (<DATA>) {
        chomp;
        my $status = $csv->parse($_);
        my @rec = $csv->fields();
        print Dumper(\@rec);
    }

    __DATA__
    1,"Hello, world",This is good,2
    121212,"Simpson, Bart",Springfield,"Roger"
    121212,"2"" tape, ""white",springfield,"Roger"
    121212,"Simpson "", Bart",Springfield,"Roger"
    121212,"2""",springfield,"Roger"