PerlMonks
This can get rather complicated to parse. The problems I've seen with this type of data can throw a wrench into your parsing methods, and I haven't found a good module that covers all the subtleties of quoted delimited data. Just as an example, if your data looks like you describe:

    121212, "Simpson, Bart", Springfield

this is a trivial matter to parse. But what if your data looks like:

    121212,"2" tape, white", springfield

If you'll never encounter quotes embedded within your fields, it is less of a problem. If you are dead set against using some of the fine CPAN modules, or even the previously suggested Text::Balanced (a core module), you could do something like this:
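A minimal sketch of that kind of character-by-character walk, not the code originally posted. The name split_quoted is hypothetical, and so is the rule for embedded quotes: a quote opens a protected run only at the start of a field, and closes it only when followed by optional whitespace and then a delimiter or the end of the record. Any other quote is kept as literal data. That rule is an assumption chosen to handle the second example above; it will guess wrong on some inputs.

```perl
use strict;
use warnings;

# Hypothetical helper: split one record on $delim (default ','), ignoring
# delimiters that sit inside a run protected by double quotes.
sub split_quoted {
    my ($record, $delim) = @_;
    $delim = ',' unless defined $delim;

    my @fields;
    my $field     = '';
    my $in_quotes = 0;

    for my $i (0 .. length($record) - 1) {
        my $ch = substr $record, $i, 1;

        if ($ch eq '"') {
            my $rest = substr $record, $i + 1;
            if (!$in_quotes && $field =~ /^\s*$/) {
                # Quote at the start of a field: begin a protected run.
                $in_quotes = 1;
                $field     = '';
            }
            elsif ($in_quotes && $rest =~ /^\s*(?:\Q$delim\E|\z)/) {
                # Quote followed by a delimiter or end of record: end the run.
                $in_quotes = 0;
            }
            else {
                # Assumed rule: any other quote is literal data.
                $field .= $ch;
            }
        }
        elsif ($ch eq $delim && !$in_quotes) {
            # Unprotected delimiter: the current field is complete.
            push @fields, $field;
            $field = '';
        }
        else {
            $field .= $ch;
        }
    }
    push @fields, $field;

    # Trim surrounding whitespace (this also trims inside quoted fields).
    s/^\s+|\s+$//g for @fields;
    return @fields;
}

print join(' | ', split_quoted('121212,"2" tape, white", springfield')), "\n";
```

Called on the first example, this should give three fields with "Simpson, Bart" kept together; on the second, the middle field keeps its embedded quote and comma.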
The idea is to read a record, then walk through it one byte at a time, working out whether each delimiter falls inside a set of protecting quotes. One other thing: this method is not very fast, so if you have tons of data (hundreds of megabytes, gigabytes, or terabytes) you may have to wait a while. In the end, you're probably best off using a module.

In reply to Re: regular expression (search and destroy)
by sweetblood