PerlMonks  

Regex question is this one of those look (ahead|behind)s ?

by misterperl (Pilgrim)
on Mar 25, 2014 at 17:36 UTC ( [id://1079709]=perlquestion )

misterperl has asked for the wisdom of the Perl Monks concerning the following question:

$_ = 'abcdef';

1. Replace def with xyz if the string DOESN'T have a preceding a: s/^(\SB^aSB*)def/$1xyz/

This works fine. (Note: SB = SQUARE BRACKET; this EDITOR is removing them for some reason.)

But let's say instead that I only want to replace if the preceding string is NOT exactly 'abc'?

I studied NLB Assertion and thought this would bring joy:

s/^(?<!abc)def/$1xyz/

but no joy. Now it never substitutes, whether or not def is preceded by abc.

Wisdom is appreciated.

Replies are listed 'Best First'.
Re: Regex question is this one of those look (ahead|behind)s ?
by davido (Cardinal) on Mar 25, 2014 at 17:53 UTC

    First order of business: If you read Writeup Formatting Tips you'll see that wrapping code in <code> ...... </code> tags is the way to prevent your code from getting mangled by the website's linking syntax. Following that protocol, your first regex must look like this:

    s/^([^a]*)def/$1xyz/

    ...which can't possibly match "abcdef", because it anchors the match to the start of the string, and an 'a' character comes between the start of the string and the literal 'def', blocking the match. I guess that "works fine" because it does fail to match "abcdef".
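    A minimal sketch of that behavior, assuming the bracketed form of the regex shown above is what was intended. The helper name replace_unless_a_precedes and the extra test strings are made up for illustration:

```perl
use strict;
use warnings;

# Apply the first substitution to a copy of the input and return the result.
sub replace_unless_a_precedes {
    my ($text) = @_;
    $text =~ s/^([^a]*)def/$1xyz/;
    return $text;
}

# 'abcdef' is left alone because an 'a' sits between the start of the
# string and 'def'; the other two inputs have no 'a' in the prefix.
for my $input ('abcdef', 'xbcdef', 'def') {
    printf "%-8s => %s\n", $input, replace_unless_a_precedes($input);
}
```

    Only 'xbcdef' and 'def' are changed (to 'xbcxyz' and 'xyz'); 'abcdef' comes back untouched.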

    The second regex is this:

    s/^(?<!abc)def/$1xyz/

    If your goal is to replace 'def' with 'xyz' when it is not preceded by 'abc', that one would work, except for the anchoring to the beginning of the string. You're explicitly stating that you don't want to match 'def' if 'abc' comes before it. But because the pattern is anchored to the beginning of the string, and because assertions don't consume what they match, the only strings it can match are those that start with "def". Also, since the lookbehind consumes nothing and there are no capturing parens, $1 is never set, so there's no need to have it in the replacement.
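    To see the effect of the anchor, here is a small sketch comparing the anchored lookbehind with a plain /^def/. The helper names and sample strings are invented for the example, and the unused $1 has been dropped from the replacement:

```perl
use strict;
use warnings;

# The asker's pattern, minus the unused $1 in the replacement.
sub anchored_lookbehind {
    my ($text) = @_;
    $text =~ s/^(?<!abc)def/xyz/;
    return $text;
}

# The same substitution with the lookbehind dropped entirely.
sub anchored_plain {
    my ($text) = @_;
    $text =~ s/^def/xyz/;
    return $text;
}

# Both versions only ever touch a string that starts with 'def'.
for my $input ('abcdef', 'xxxdef', 'defabc') {
    printf "%-8s lookbehind: %-8s plain: %s\n",
        $input, anchored_lookbehind($input), anchored_plain($input);
}
```

    On these inputs the two subs agree: 'abcdef' and 'xxxdef' are unchanged, and only 'defabc' becomes 'xyzabc'.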

    By adding capturing parens to the beginning of the regex, with three wildcard metacharacters, it works out better. You'll find that s/^(...)(?<!abc)def/$1xyz/ rejects 'abcdef', but 'xxxdef' matches, becoming 'xxxxyz', which I think is what you want.

    s/^(...)(?<!abc)def/$1xyz/;
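    A quick way to try this out (a sketch only; the sub name and the last two sample strings are assumptions added for illustration). It also shows a limitation of this form: the prefix has to be exactly three characters long.

```perl
use strict;
use warnings;

# Capture exactly three leading characters, then require that those
# three characters are not 'abc' before matching 'def'.
sub replace_after_three {
    my ($text) = @_;
    $text =~ s/^(...)(?<!abc)def/$1xyz/;
    return $text;
}

for my $input ('abcdef', 'xxxdef', 'def', 'xxxxdef') {
    printf "%-8s => %s\n", $input, replace_after_three($input);
}
```

    Only 'xxxdef' changes (to 'xxxxyz'). 'def' and 'xxxxdef' are left alone because 'def' does not start at the fourth character.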

    Dave

      Dave writes "...which can't possibly match "abcdef", "

      Dave, if you read closely, in that example I was looking for the string to NOT have a preceding "a". I was contrasting a single CHAR vs. a set of CHARS...

      So I was CLOSE, except for the anchor. THANKS to you and Kenneth. BOTH of you got my ++ vote...
Re: Regex question is this one of those look (ahead|behind)s ?
by kennethk (Abbot) on Mar 25, 2014 at 18:20 UTC
    use strict;
    use warnings;
    use YAPE::Regex::Explain;
    print YAPE::Regex::Explain->new(qr/^(?<!abc)def/)->explain;
    tells us
    The regular expression:

    (?-imsx:^(?<!abc)def)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      ^                        the beginning of the string
    ----------------------------------------------------------------------
      (?<!                     look behind to see if there is not:
    ----------------------------------------------------------------------
        abc                      'abc'
    ----------------------------------------------------------------------
      )                        end of look-behind
    ----------------------------------------------------------------------
      def                      'def'
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

    Lookaheads and lookbehinds are zero-width. At the position matched by ^ there is nothing before 'def' for the lookbehind to reject, so your regular expression matches only what /^def/ matches, which is clearly not what you intend. There are a number of ways to get what you express above. Probably the most natural would be s/(?<!^abc)\Kdef/xyz/. Moving the ^ inside the negative lookbehind means the match is rejected only when the 'abc' before 'def' starts the string. \K means 'keep everything before this' (see Character Classes and other Special Escapes).
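    For anyone who wants to experiment, here is a short sketch of that substitution. The sub names and sample inputs are invented for the example. It also runs the same pattern without \K: since the lookbehind consumes nothing, there is no matched text in front of 'def' for \K to keep, so both forms should give the same results here.

```perl
use strict;
use warnings;

# The suggested form: reject 'def' only when the 'abc' in front of it
# begins at the start of the string.
sub replace_with_keep {
    my ($text) = @_;
    $text =~ s/(?<!^abc)\Kdef/xyz/;
    return $text;
}

# Same pattern without \K, for comparison.
sub replace_without_keep {
    my ($text) = @_;
    $text =~ s/(?<!^abc)def/xyz/;
    return $text;
}

for my $input ('abcdef', 'xxxdef', 'def', 'xabcdef') {
    printf "%-8s with \\K: %-8s without: %s\n",
        $input, replace_with_keep($input), replace_without_keep($input);
}
```

    'abcdef' is left alone, while 'xxxdef', 'def' and 'xabcdef' become 'xxxxyz', 'xyz' and 'xabcxyz'. Note the last case: 'abc' that is not at the very start of the string does not block the replacement.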


    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re: Regex question is this one of those look (ahead|behind)s ?
by AnomalousMonk (Archbishop) on Mar 25, 2014 at 17:47 UTC
Re: Regex question is this one of those look (ahead|behind)s ?
by Anonymous Monk on Mar 25, 2014 at 23:27 UTC

Node Type: perlquestion [id://1079709]
Approved by Ratazong