PerlMonks  

A Regex simple Question

by pysome (Scribe)
on Sep 06, 2007 at 15:43 UTC ( [id://637461]=perlquestion )

pysome has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I want to match only strings that do not contain a given substring. For example:
    #!/usr/bin/perl
    $s = "a000000000b";
    if ($s =~ /^a.*(?!ab).*b$/) {
        print "yes";
    } else {
        print "no";
    }
The result I want is to match strings that:
1) start with "a"
2) end with "b"
3) do not include "ab" between them.

That is to say,
"a000000000b" : matched
"a0a0000b00b" : matched
but
"a0000ab000b" : unmatched
I used the negative look-ahead assertion (?!ab), but it doesn't work. Can somebody help me?
Thanks
Pysome

Replies are listed 'Best First'.
Re: A Regex simple Question
by FunkyMonk (Chancellor) on Sep 06, 2007 at 15:52 UTC
    I'd split it into two matches:
    while ( <DATA> ) {
        chomp;
        print "$_: ", /^a.*b$/ && !/.ab./ ? "yes" : "no", "\n";
    }
    __DATA__
    a000000000b
    a0a0000b00b
    a0000ab000b
    ab00000000b
    a000ab

    Output:

    a000000000b: yes
    a0a0000b00b: yes
    a0000ab000b: no
    ab00000000b: yes
    a000ab: yes

    Update: Added two more test cases and fixed a bug.

Re: A Regex simple Question
by Sidhekin (Priest) on Sep 06, 2007 at 15:54 UTC

    Negative lookahead is always tricky ... you're on the right track though.

    In this specific case you could do this:

    /^a(?!b)(?:(?!ab).)+b$/

    ... or even this:

    /^(?!.*ab)a.+b$/

    ... unless I've read you wrong, and you want to allow the string "ab" and even "ab000b"/"a000ab" (since the "ab" is not strictly between the "a" and the "b"), in which case you want this:

    /^a(?:(?!ab.).)*b$/

    ... or even this:

    /^(?!.+ab.)a.*b$/

    Always tricky, the negative lookahead ... if you can split it, like FunkyMonk suggests (or some variation thereof), it'll be far more readable.
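
To see how the two readings differ, here is a small sketch that runs the four patterns above over the strings mentioned in this thread. The labels (strict-1, loose-1, and so on) are made up for this illustration; the patterns themselves are copied unchanged.

```perl
use strict;
use warnings;

# The four patterns from this reply, copied verbatim.
# The labels are hypothetical names used only in this sketch.
my @patterns = (
    [ 'strict-1' => qr/^a(?!b)(?:(?!ab).)+b$/ ],
    [ 'strict-2' => qr/^(?!.*ab)a.+b$/ ],
    [ 'loose-1'  => qr/^a(?:(?!ab.).)*b$/ ],
    [ 'loose-2'  => qr/^(?!.+ab.)a.*b$/ ],
);

my @strings = qw( a000000000b a0a0000b00b a0000ab000b ab ab000b a000ab );

# Record a yes/no verdict for every pattern/string pair.
my %verdict;
for my $p (@patterns) {
    my ($label, $re) = @$p;
    for my $s (@strings) {
        $verdict{$label}{$s} = $s =~ $re ? "yes" : "no";
    }
}

# One row per string, one column per pattern.
for my $s (@strings) {
    printf "%-12s", $s;
    printf "  %s=%s", $_->[0], $verdict{ $_->[0] }{$s} for @patterns;
    print "\n";
}
```

The strict pair rejects "ab", "ab000b" and "a000ab"; the loose pair accepts them. All four reject "a0000ab000b".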

    print "Just another Perl ${\(trickster and hacker)},"
    The Sidhekin proves Sidhe did it!

Re: A Regex simple Question
by johngg (Canon) on Sep 06, 2007 at 15:55 UTC
    A negative look-ahead is fine, just move the .* into the assertion.

    if ($s =~ /^a(?!.*ab).*b$/) {

    Cheers,

    JohnGG

    Update: If you want "a00000000ab" to match then you could change the look-ahead to ensure there is at least one character after the "ab".

    use strict;
    use warnings;

    my @strings = qw{
        a000000000b
        a0a0000b00b
        a0000ab000b
        a00000000ab
    };

    print m{^a(?!.*ab.).*b$}
        ? qq{$_: matched\n}
        : qq{$_: unmatched\n}
        for @strings;

    which produces

    a000000000b: matched
    a0a0000b00b: matched
    a0000ab000b: unmatched
    a00000000ab: matched
Re: A Regex simple Question
by ikegami (Patriarch) on Sep 06, 2007 at 15:54 UTC
    You have to make sure that no position between the "a" and the "b" starts a match of /ab/.
    /^a(?:(?!ab).)*b$/

    (?:(?!regexp).)
    is to regexps as
    [^chars]
    is to chars.

    Update: Not quite. /$pre(?:(?!$regexp).)*$post/ only works when /$regexp/ can't match something /.$post/ matches.

    It doesn't work in this case because /ab/ can match something that /.b/ matches. It'll fail if the input is a000ab, for example.
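
A quick check of that caveat, as a sketch: the first pattern from this reply accepts and rejects the obvious cases correctly, but it also rejects a000ab, because the look-ahead blocks the final "a" and the closing b then has nothing to match. The %matched hash is just a name chosen for this example.

```perl
use strict;
use warnings;

# First pattern from this reply, unchanged.
my $re = qr/^a(?:(?!ab).)*b$/;

# %matched is a hypothetical helper for this sketch: string => 1 or 0.
my %matched;
for my $s (qw( a000000000b a0000ab000b a000ab )) {
    $matched{$s} = $s =~ $re ? 1 : 0;
    print "$s: ", ($matched{$s} ? "matched" : "unmatched"), "\n";
}
```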

    However, I suspect you have something closer to the following, which works fine:

    m{
        <tag>               # Start tag
        (?:(?!</tag>).)*    # Body
        </tag>              # End tag
    }x
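
A minimal sketch of the tag case in use, assuming single-line input and a made-up sample string. It adds a capture group around the body and the /g flag, neither of which is in the pattern above, and uses m{...} delimiters so the slash in the end tag does not close the regex.

```perl
use strict;
use warnings;

# Sample input invented for this illustration.
my $input = "before <tag>first</tag> middle <tag>second</tag> after";

# In list context with /g, the match returns every captured body.
my @bodies = $input =~ m{
    <tag>                     # Start tag
    ( (?: (?!</tag>) . )* )   # Body: characters, none of which starts the end tag
    </tag>                    # End tag
}xg;

print "$_\n" for @bodies;
```

Here the idiom is safe because nothing matched by the end tag preceded by one character can also be the start of the end tag, which is the condition stated in the update above.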
Re: A Regex simple Question
by johnlawrence (Monk) on Sep 06, 2007 at 16:04 UTC
    I don't like negative lookahead much either! My non-lookahead version:
    /^a([^a]|a[^b])*a?b$/
    Works as well I think.
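
A rough comparison, as a sketch rather than a verdict: the pattern above agrees with the look-ahead version /^a(?!.*ab.).*b$/ from elsewhere in the thread on the thread's own test strings. The last input, aaab0b, is one I made up; there the two appear to disagree, because a[^b] can consume "aa" even when the second "a" begins an "ab".

```perl
use strict;
use warnings;

my $no_lookahead = qr/^a([^a]|a[^b])*a?b$/;   # pattern from this reply
my $lookahead    = qr/^a(?!.*ab.).*b$/;       # look-ahead version from the thread

# The first four strings come from the thread; aaab0b is an invented edge case.
my @strings = qw( a000000000b a0a0000b00b a0000ab000b a000ab aaab0b );

# %got maps each string to [ no-lookahead verdict, look-ahead verdict ].
my %got;
for my $s (@strings) {
    $got{$s} = [
        ($s =~ $no_lookahead ? "yes" : "no"),
        ($s =~ $lookahead    ? "yes" : "no"),
    ];
    printf "%-12s no-lookahead=%-3s lookahead=%s\n", $s, @{ $got{$s} };
}
```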
Re: A Regex simple Question
by moritz (Cardinal) on Sep 06, 2007 at 15:56 UTC
    Your regex says "an a, followed by arbitrary characters, followed by arbitrary characters that don't start with ab, then a b".

    Because the first .* can stop anywhere, the engine can always find some position that is not followed by ab, so the (?!ab) never rejects anything.
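
This can be made visible with a small sketch. The capture group around the first .* is added here only for inspection; otherwise the pattern is the one from the question.

```perl
use strict;
use warnings;

# The pattern from the original question, unchanged.
my $original = qr/^a.*(?!ab).*b$/;

# A string the question wants rejected, because "ab" sits in the middle.
my $s = "a0000ab000b";

my $original_matches = $s =~ $original ? 1 : 0;

# Same pattern with a capture group around the first .* (added for this sketch)
# to show how much it swallowed before the look-ahead was tested.
my ($consumed) = $s =~ /^a(.*)(?!ab).*b$/;

print "original pattern matches '$s': ", ($original_matches ? "yes" : "no"), "\n";
print "first .* consumed: '$consumed'\n";
```

The first .* runs past the "ab" in the middle and stops just before the final "b", where the look-ahead has nothing to object to.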

    Knowing this, the solution is not so hard:

    /^a(?!.*ab).*b$/

      That will fail (false negative) for a000ab.
        You are right. That can easily be amended by changing it to /^a(?!.*ab.).*b$/.

        Regexes are hard, sometimes. Especially corner cases ;)

Re: A Regex simple Question
by injunjoel (Priest) on Sep 06, 2007 at 16:29 UTC
    Update: Disregard this, I read too quickly initially.
    Just my 2 cents...
    m/^a[^ab]*b$/

    -InjunJoel
    "I do not feel obliged to believe that the same God who endowed us with sense, reason and intellect has intended us to forego their use." -Galileo
