
negative lookahead to the rescue! (boo)

by boo_radley (Parson)
on Jul 24, 2001 at 20:47 UTC (#99402)

in reply to Delete unmatched quotes from a delimited file?

$_='|value 1|value two|"|Value 3 was "NULL"|2001/06/06|';
s/(\|.*?)(")(?!")(.*?\|)/$1$3/;
print;
There are 4 sets of parens in the regex; I'll try to break them down:
The first one looks for a pipe, then any number of characters (but without being greedy about it).
The second grabs the lone quote. The third is a negative lookahead. This is where the golden nugget of regular expression goodness lies!
The negative lookahead makes sure that there's no quote following the one captured by the second group. Also, since it's a zero-width assertion, it doesn't create a backreference of its own, which is why the replacement uses $3 rather than $4 for the last group.
Finally, the last set of parens describes "the rest of the string" up to the next pipe delimiter.
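
For anyone who finds the one-liner hard to read, here is a rough sketch of the same substitution written with the /x modifier, so each set of parens gets its own commented line. The sub name strip_lone_quote is made up for illustration; the pattern itself is unchanged.

```perl
use strict;
use warnings;

# strip_lone_quote is a hypothetical helper name, used only for this sketch.
sub strip_lone_quote {
    my ($line) = @_;
    $line =~ s/
        ( \| .*? )   # group 1: a pipe, then as few characters as possible
        ( " )        # group 2: the quote we want to drop
        (?! " )      # lookahead: the next character must not be a quote (captures nothing)
        ( .*? \| )   # group 3: everything up to the next pipe
    /$1$3/x;
    return $line;
}

print strip_lone_quote('|value 1|value two|"|Value 3 was "NULL"|2001/06/06|'), "\n";
# prints: |value 1|value two||Value 3 was "NULL"|2001/06/06|
```

Since there is no /g, only the first matching quote on the line is removed.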

"look for a pipe, and then any characters up to a quote, make sure it's not followed by another quote, and then the rest of the string, up to a pipe"
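Now the one caveat for this regex is that it will misbehave on "" (an empty quoted field): the first quote fails the lookahead, but the second one passes it and gets removed, leaving a lone quote behind. My reg-fu is not strong enough to determine the handler for that contingency.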
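
To make the "" caveat concrete, here is a small sketch that runs the original pattern on an empty quoted field, next to one possible variant that adds a negative lookbehind, (?<!"), so a quote that directly follows another quote is left alone. The variant is only my assumption about how one might approach it, not a tested general fix; the names $re_original, $re_variant and strip_with are hypothetical.

```perl
use strict;
use warnings;

# The pattern from the post, precompiled.
my $re_original = qr/(\|.*?)(")(?!")(.*?\|)/;

# Hypothetical variant: the quote must also not be preceded by another quote.
my $re_variant  = qr/(\|.*?)(?<!")(")(?!")(.*?\|)/;

# Apply one substitution with the given pattern and return the result.
sub strip_with {
    my ($re, $line) = @_;
    $line =~ s/$re/$1$3/;
    return $line;
}

my $empty_field = '|a|""|b|';
print strip_with($re_original, $empty_field), "\n";   # prints: |a|"|b|
print strip_with($re_variant,  $empty_field), "\n";   # prints: |a|""|b|
```

Both patterns only look at the characters next to the quote. Neither one checks whether the quotes inside a field are actually balanced, so the opening quote of a pair such as "NULL" can still be matched if nothing earlier on the line matches first.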
