cidaris has asked for the wisdom of the Perl Monks concerning the following question:
I hate to post regex questions, but I'm second-guessing myself every time I think I make progress.
The problem: I have a CSV file that Text::CSV is handling very nicely.
Unfortunately, it will balk on data like this:
"crosby","stills","nash","and sometimes "young""
The input can include any characters, and unfortunately I don't have control over it, so I can't tell people "hey, don't use anything but letters or numbers!" Believe you me, I'd love to put some constraints on their input, but it's a proprietary tool. I could lecture until I was blue in the face and some snot-nosed kid would immediately enter every non-alphanumeric character he could find.
I had tried this, but it's not right:
if ($line =~ m/".?".?".?"/g)
because it will match the "," that I'm trying to delimit with. My next thought was something like this:
if ($line =~ m/"[^\,]?"[^\,]?"[^\,]?"/g)
but that would only work if a comma were a class of character... And probably not even then ;) I guess in pseudo-code, I'm after something like this:
if ($line =~ m/"(ANY # OF NON-COMMAS)"(ANY # OF NON-COMMAS)"(ANY # OF NON-COMMAS)"/g)
Can anyone who has been through this kind of nightmare help me?
Thanks,
cidaris
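The pseudo-code above can be sketched in plain Perl. This is only a sketch under two assumptions that the post does not confirm: every field is wrapped in double quotes, and no field contains a comma. The helper name split_quoted_line is hypothetical and not part of Text::CSV. "Any number of non-commas" is written as the character class [^,]*, and \G anchors each match where the previous one ended, so a malformed line is rejected rather than partly parsed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::CSV): split one line whose fields
# are all double-quoted and, by assumption, never contain a comma.
# Stray quotes inside a field are tolerated.
sub split_quoted_line {
    my ($line) = @_;
    chomp $line;
    my @fields;

    # One field per iteration: an opening quote, any number of non-comma
    # characters, a closing quote, then a comma or the end of the string.
    # The greedy [^,]* backtracks so the last quote before the comma
    # closes the field.
    while ($line =~ /\G"([^,]*)"(?:,|\z)/gc) {
        push @fields, $1;
    }

    # With /c a failed match keeps pos(), so it shows how far we got.
    my $end = defined pos($line) ? pos($line) : 0;

    # Return the fields only if the whole line was consumed.
    return $end == length($line) ? @fields : ();
}

my @fields = split_quoted_line(
    '"crosby","stills","nash","and sometimes "young""'
);
print "$_\n" for @fields;
```

On the sample line this yields four fields, and the last one keeps its inner quotes. A field that really does contain a comma cannot be handled this way, since the comma is the only thing that tells a closing quote from a stray one.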
Replies are listed 'Best First'.
Re: CSV and regex mixups
by Zaxo (Archbishop) on Jul 02, 2003 at 22:26 UTC
by Tomte (Priest) on Jul 02, 2003 at 23:43 UTC
Re: CSV and regex mixups
by BrowserUk (Patriarch) on Jul 02, 2003 at 23:35 UTC
Re: CSV and regex mixups
by flounder99 (Friar) on Jul 03, 2003 at 11:57 UTC
by flounder99 (Friar) on Jul 03, 2003 at 13:07 UTC
Re: CSV and regex mixups
by aquarium (Curate) on Jul 02, 2003 at 22:10 UTC
by cidaris (Friar) on Jul 02, 2003 at 23:29 UTC
Re: CSV and regex mixups
by clscott (Friar) on Jul 03, 2003 at 17:29 UTC