PerlMonks  

RegEx question

by vit (Friar)
on Dec 21, 2017 at 22:06 UTC ( [id://1206002]=perlquestion )

vit has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,
I need to write an expression which will match the following :
^(something but not "bank") (nova scotia) (something but not "bank")$

I tried different options but none works, for example
"^(?!bank).*?nova scotia.*?(?!bank).*$"
I feel that this is wrong but can't find the right one, please help.

Replies are listed 'Best First'.
Re: RegEx question
by ikegami (Patriarch) on Dec 21, 2017 at 22:14 UTC

    If you mean "something that isn't bank" (as you said),

    / ^ (?: .{0,3} | (?!bank).{4} | .{5,} ) nova [ ] scotia (?: .{0,3} | (?!bank).{4} | .{5,} ) $ /sx

    Alternative:

    / ^ (?: | [^b].* | b(?:[^a].*)? | ba(?:[^n].*)? | ban(?:[^k].*)? | bank.+ ) nova [ ] scotia (?: | [^b].* | b(?:[^a].*)? | ba(?:[^n].*)? | ban(?:[^k].*)? | bank.+ ) $ /sx

    If you mean "something that doesn't contain bank",

    / ^ (?: (?!bank). )* nova [ ] scotia (?: (?!bank). )* $ /sx
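The two readings give different answers on real strings. As a rough sketch, the snippet below runs the first and the last pattern against a few sample strings. The variable names, the `check` helper and the samples are made up for illustration, and the lookaheads are written as `(?!bank)`.

```perl
use strict;
use warnings;

# Reading 1: the text on each side of "nova scotia" is not exactly "bank".
my $not_exactly_bank = qr/
    ^
    (?: .{0,3} | (?!bank).{4} | .{5,} )
    nova [ ] scotia
    (?: .{0,3} | (?!bank).{4} | .{5,} )
    $
/sx;

# Reading 2: the text on each side of "nova scotia" does not contain "bank".
my $no_bank_anywhere = qr/
    ^
    (?: (?!bank). )*
    nova [ ] scotia
    (?: (?!bank). )*
    $
/sx;

# Hypothetical helper: returns a two-character string, one 0 or 1 per reading.
sub check {
    my ($str) = @_;
    return ($str =~ $not_exactly_bank ? 1 : 0) . ($str =~ $no_bank_anywhere ? 1 : 0);
}

for my $str ('nova scotia', 'banknova scotia', 'bank of nova scotia', 'nova scotia bank') {
    printf "%-24s %s\n", "'$str'", check($str);
}
```

Strings such as 'bank of nova scotia' pass the first reading (the prefix is longer than "bank") but fail the second, so it matters which one is wanted.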
      Is it possible to write it without Pattern Modifiers? In other words without /sx

        Sure, just squish it all together (untested)
            /^(?:(?!bank).)*nova[ ]scotia(?:(?!bank).)*$/
        and remember that . (dot) no longer "matches all": without /s it will not match a newline. But why would you want to?

        Update: Actually, you no longer need the character class (update: because without /x a space is a literal space):
            /^(?:(?!bank).)*nova scotia(?:(?!bank).)*$/
        and if you just gotta get rid of that last space:
            /^(?:(?!bank).)*nova\x20scotia(?:(?!bank).)*$/
        (both still untested).
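A quick way to test the squished versions is to run them next to the /sx original. This is only a sketch: the variable names, the `verdicts` helper and the sample strings are assumptions made for the example. The last sample contains a newline, which is where dropping /s changes the result.

```perl
use strict;
use warnings;

my $with_sx   = qr/ ^ (?: (?!bank). )* nova [ ] scotia (?: (?!bank). )* $ /sx;
my $squished  = qr/^(?:(?!bank).)*nova scotia(?:(?!bank).)*$/;
my $hex_space = qr/^(?:(?!bank).)*nova\x20scotia(?:(?!bank).)*$/;

# Hypothetical helper: one 0 or 1 per pattern, in the order declared above.
sub verdicts {
    my ($str) = @_;
    my @flags;
    push @flags, ($str =~ $_ ? 1 : 0) for $with_sx, $squished, $hex_space;
    return join '', @flags;
}

for my $str ("nova scotia", "bank of nova scotia", "head office\nnova scotia") {
    (my $shown = $str) =~ s/\n/\\n/g;   # make the newline visible when printing
    print "$shown => ", verdicts($str), "\n";
}
```

On single-line input the three patterns agree. Only the /sx version accepts the sample with an embedded newline, because without /s the dot cannot cross it.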


        Give a man a fish:  <%-{-{-{-<

Re: RegEx question
by Laurent_R (Canon) on Dec 21, 2017 at 23:52 UTC
    It might be simpler to use two regexes:
        # if the string is in $_:
        /nova scotia/ and not /bank/;
        # else:
        $str =~ /nova scotia/ and $str !~ /bank/;
    And that second option should also work in Java with the appropriate syntax adaptations.
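As a sketch of how the two-regex test compares with the single lookahead pattern from earlier in the thread (the "doesn't contain bank" reading), the snippet below wraps each in a sub and runs both on a few strings. The sub names and the samples are hypothetical.

```perl
use strict;
use warnings;

# Two simple regexes: the phrase is present and "bank" is absent.
sub two_regexes {
    my ($str) = @_;
    return ($str =~ /nova scotia/ && $str !~ /bank/) ? 1 : 0;
}

# Single pattern with a lookahead at every position, for comparison.
sub one_regex {
    my ($str) = @_;
    return $str =~ /^(?:(?!bank).)*nova scotia(?:(?!bank).)*$/s ? 1 : 0;
}

my @samples = (
    'nova scotia',
    'halifax, nova scotia, canada',
    'bank of nova scotia',
    'nova scotia bank',
    'new brunswick',
);

for my $str (@samples) {
    printf "%-30s two=%d one=%d\n", $str, two_regexes($str), one_regex($str);
}
```

On these samples the two approaches give the same verdicts, and the two-regex form is easier to read and to change later.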
Re: RegEx question
by Anonymous Monk on Dec 26, 2017 at 14:05 UTC
    What comes to mind is a few separate if-statements. First, ignore the line if it does not contain nova scotia. (Least probable so listed first.) Then, perhaps you can simply ignore the line if it does contain bank anywhere. Alternatively, you can split() the line on /nova scotia/ and verify that none of the pieces contains bank.
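A minimal sketch of the split() variant described above, assuming the line is passed in as a plain string; the sub name `passes_checks` and the sample lines are made up for the example.

```perl
use strict;
use warnings;

sub passes_checks {
    my ($line) = @_;

    # First check: ignore the line if it does not contain the phrase at all.
    return 0 unless $line =~ /nova scotia/;

    # Second check: split on the phrase and reject the line if any piece
    # contains "bank".
    for my $piece (split /nova scotia/, $line) {
        return 0 if $piece =~ /bank/;
    }
    return 1;
}

for my $line ('nova scotia', 'bank of nova scotia', 'nova scotia bank', 'halifax') {
    print "$line: ", passes_checks($line), "\n";
}
```

Because "bank" cannot overlap "nova scotia", checking the pieces one by one gives the same answer as checking the whole line for "bank".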

    I have come to dislike fancy-regexes because soon enough some new requirement comes along that messes them up, leaving the poor programmer to make a now-complicated change, and to somehow verify, not only that it handles the new requirement, but also that it continues to handle the old one(s).

      Five days after good real answers, this code-free, poor advice written without reading the thread arrives. One should never make the least likely operation the one always run. The regex is meant to be used in two languages so a coded logic tree isn't desired. Options exist in the construction of code between 2x4s and the JPL.

Node Type: perlquestion [id://1206002]
Approved by ikegami
Front-paged by Corion