PerlMonks
Re: regular expression (search and destroy) by sweetblood (Prior)
on Nov 12, 2003 at 21:39 UTC ( [id://306645]=note )
This can get rather complicated to parse. The problems I've seen with this type of data can throw a wrench into your parsing methods, and I haven't found a good module that covers all the subtleties of quoted, delimited data. Just as an example, if your data looks like you describe:

121212, "Simpson, Bart", Springfield

this is a trivial matter to parse. But what if your data looks like:

121212,"2" tape, white", springfield

If you'll never encounter quotes embedded within your fields, then it is less of a problem. If you are dead set against using some of the fine CPAN modules, or even the previously suggested Text::Balanced (a core module), you could do something like this:
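A minimal sketch of that kind of character-by-character walk, not necessarily the code as originally posted. The sub name split_record is hypothetical, and the quote rules are assumptions: a quote opens protection only at the start of a field, closes it only when the next thing is a comma or the end of the record, and is otherwise kept as literal data. Surrounding whitespace is trimmed from every field. An embedded quote that happens to sit right before a comma will still fool it.

```perl
use strict;
use warnings;

# Hypothetical helper: split one record on commas, ignoring commas
# that sit inside a pair of protecting double quotes.
sub split_record {
    my ($line) = @_;
    my @chars = split //, $line;
    my @fields;
    my $field     = '';
    my $in_quotes = 0;

    for my $i (0 .. $#chars) {
        my $c = $chars[$i];
        if ($c eq '"') {
            if (!$in_quotes && $field =~ /^\s*$/) {
                # Quote at the start of a field: protection begins.
                $in_quotes = 1;
                $field     = '';
            }
            elsif ($in_quotes && substr($line, $i + 1) =~ /^\s*(?:,|$)/) {
                # Quote followed by a delimiter or end of record: protection ends.
                $in_quotes = 0;
            }
            else {
                # Assumption: any other quote is literal data.
                $field .= $c;
            }
        }
        elsif ($c eq ',' && !$in_quotes) {
            # Unprotected delimiter: the current field is complete.
            push @fields, $field;
            $field = '';
        }
        else {
            $field .= $c;
        }
    }
    push @fields, $field;

    # Trim leading and trailing whitespace from every field.
    s/^\s+|\s+$//g for @fields;
    return @fields;
}

print join('|', split_record('121212, "Simpson, Bart", Springfield')), "\n";
# prints: 121212|Simpson, Bart|Springfield
print join('|', split_record('121212,"2" tape, white", springfield')), "\n";
# prints: 121212|2" tape, white|springfield
```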
The idea is to read a record, then walk through it one byte at a time, deciding for each delimiter whether it sits inside a set of protecting quotes. One other thing: the above method is not very fast, so if you have tons of data (hundreds of megabytes, gigabytes, or terabytes) you may have to wait a while. In the end, you're probably best off using a module.