http://qs321.pair.com?node_id=522309


in reply to Regex with malformed CSV files

I'd probably just document the fact that Outlook's output is broken and not supported, or do what jZed suggests. You may be able to do a better job if you parse the problem lines yourself and examine individual fields for clues about where you are in a "line".

For example, most fields will probably not have embedded newlines in them; a zip code will probably not have embedded quotes or commas and will probably be short; email addresses will tend to have '@' in them; quoted fields will probably not stretch to hundreds of characters; etc.
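To make those clues concrete, here is a rough sketch in Python of what such per-field checks might look like. The function names, the US-style zip pattern, and the 200-character cutoff are all my own assumptions for illustration; tune them to whatever your data actually looks like.

```python
import re

# Assumed cutoff: a field longer than this probably swallowed
# the start of the next field or record.
MAX_FIELD_LEN = 200

def looks_like_zip(field):
    """True if the field looks like a US ZIP or ZIP+4 (an assumption about the data)."""
    return re.fullmatch(r"\d{5}(-\d{4})?", field.strip()) is not None

def looks_like_email(field):
    """Very loose check: contains '@' and no embedded newline."""
    return "@" in field and "\n" not in field

def is_suspicious(field):
    """Flag fields with embedded newlines or an implausible length."""
    return "\n" in field or len(field) > MAX_FIELD_LEN
```

None of these checks proves anything on its own; they are only hints that a field boundary is (or is not) where you think it is.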

If you enforce some rules like this, your parser may be able to determine most of the time where it is. Of course, you could go stark raving mad in a futile effort trying to figure out the perfect ruleset...
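For what it's worth, a minimal sketch of that idea in Python might look like the following. It assumes every record has a fixed number of fields and that one known column holds an email address; `split_fields`, `regroup`, and their parameters are hypothetical names, not part of any CSV library. It glues physical lines together until the result passes the rules, and gives up after a few lines so that one bad record cannot swallow the rest of the file.

```python
def split_fields(text):
    """Naive quote-aware split on commas.

    Returns (fields, open_quote); open_quote is True when the text
    ends inside a quoted field.
    """
    fields, buf, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            fields.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
    fields.append("".join(buf))
    return fields, in_quotes

def regroup(lines, n_fields, email_col, max_lines=5):
    """Join physical lines into records that satisfy the rules.

    Returns (records, rejects): records is a list of field lists,
    rejects is a list of raw text chunks that never looked right.
    """
    records, rejects, pending = [], [], []
    for line in lines:
        pending.append(line.rstrip("\n"))
        fields, open_quote = split_fields("\n".join(pending))
        if (not open_quote and len(fields) == n_fields
                and "@" in fields[email_col]):
            records.append(fields)
            pending = []
        elif len(pending) >= max_lines:
            # Give up on this chunk and set it aside for manual review.
            rejects.append("\n".join(pending))
            pending = []
    if pending:
        rejects.append("\n".join(pending))
    return records, rejects
```

Usage would be something like `records, rejects = regroup(open(path), n_fields=3, email_col=2)`, where `path` is your export file. Note that a single stray unescaped quote will still throw the quote tracking off, so this only handles the embedded-newline case; anything it cannot place ends up in `rejects` rather than being guessed at.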