PerlMonks  

To not capture recursive group while collecting certain matches

by Anonymous Monk
on Nov 29, 2021 at 09:55 UTC ( [id://11139211]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

How can a regex match avoid capturing a recursive group, which is used and required inside the pattern, when collecting all the matched capture groups?
Just an illustration:
@m='lkjkljkjlkhkjkfjkfkvklkv'=~/^\w*(kl(?1)\w*).*?(kv)/g;
How can I make @m contain all the matched kv, cleanly separated from whatever the recursive group matched?

Replies are listed 'Best First'.
Re: To not capture recursive group while collecting certain matches
by Anonymous Monk on Nov 29, 2021 at 16:31 UTC

    As given, your example does not match anything. If you enable use re 'debug'; you will get a fairly cryptic trace of what the regular expression engine is doing:

    $ perl -Mre=debug -E 'my @m='lkjkljkjlkhkjkfjkfkvklkv'=~/^\w*(kl(?1)\w*).*?(kv)/g; say for @m;'
    Compiling REx "^\w*(kl(?1)\w*).*?(kv)"
    Final program:
       1: SBOL /^/ (2)
       2: STAR (4)
       3:   POSIXU\w (0)
       4: OPEN1 (6)
       6:   EXACT <kl> (8)
       8:   GOSUB1-4:4 (11)
      11:   STAR (13)
      12:     POSIXU\w (0)
      13: CLOSE1 (15)
      15: MINMOD (16)
      16: STAR (18)
      17:   REG_ANY (0)
      18: OPEN2 (20)
      20:   EXACT <kv> (22)
      22: CLOSE2 (24)
      24: END (0)
    floating "klkl" at 0..9223372036854775807 (checking floating) anchored(SBOL) minlen 6 
    Matching REx "^\w*(kl(?1)\w*).*?(kv)" against "lkjkljkjlkhkjkfjkfkvklkv"
    Intuit: trying to determine minimum start position...
      doing 'check' fbm scan, 0..22 gave -1
      Did not find floating substr "klkl"...
    Match rejected by optimizer
    Freeing REx: "^\w*(kl(?1)\w*).*?(kv)"
    

    The trace says your match never got started because the string being matched does not contain 'klkl'. The first 'kl' is the literal one from your regular expression; the second is because of the recursion. I believe that even if the string contained 'klkl' there would be no match, because there is no way for the recursion to end; it would just look for 'klklkl', then 'klklklkl', and so on.
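The point about the recursion never ending can be checked with a small script. This is only a sketch: the `(?1)?` variant is an assumed fix (one guess at what the original poster intended), not something the thread confirms, and the variable names are illustrative. It needs Perl 5.10 or later for `(?1)`.

```perl
use strict;
use warnings;

my $str = 'lkjkljkjlkhkjkfjkfkvklkv';

# Original pattern: (?1) is mandatory, so group 1 can only match
# "kl" followed by another complete copy of group 1, and so on.
# No finite string can satisfy that.
my @m = $str =~ /^\w*(kl(?1)\w*).*?(kv)/g;
print "original pattern: ", scalar(@m), " captures\n";

# Even a string that contains "klkl" fails, for the same reason.
my @none = 'klklkv' =~ /^\w*(kl(?1)\w*).*?(kv)/g;
print "with klkl present: ", scalar(@none), " captures\n";

# Assumed fix: make the recursion optional so it can bottom out.
my @n = $str =~ /^\w*(kl(?1)?\w*).*?(kv)/g;
print "optional recursion: @n\n";

# To keep only group 2, slice the list that the match returns.
my ($kv) = ($str =~ /^\w*(kl(?1)?\w*).*?(kv)/)[1];
print "group 2 only: $kv\n";
```

Once the recursion is optional, both numbered groups still come back in list context, so the slice (or `$2`) is what drops group 1 from the result.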

    Can you say what you expect @m to contain in your example?

    And can you provide an example where you match against the shortest possible string that you want to match successfully?

Re: To not capture recursive group while collecting certain matches
by LanX (Saint) on Nov 29, 2021 at 17:15 UTC
    > clean from matched recursive ?

    There are no "recursive" inner matches returned.

    It's unclear what you want but the following demonstrates how to match only at the top level of a recursive match.

    DB<173> x 'bla[[abc]]foo' =~ / ( \[ (?1)* \w* \] ) /xg;
    0  '[[abc]]'
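If the goal is to collect only one group while a recursive group is also present, one option is to iterate in scalar context and keep just the group of interest. The sketch below reuses the bracket pattern from the debugger line; the input string and the second group `(k\w)` are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: nested brackets, each followed by a k-token.
my $text = 'a[[x]]kv b[y]kw';

# List context with /g returns every numbered group for every match,
# so the recursive group 1 is interleaved with group 2.
my @all = $text =~ / ( \[ (?1)* \w* \] ) (k\w) /xg;
print "all groups: @all\n";

# Scalar-context /g in a loop lets us pick only group 2 each time.
my @wanted;
while ($text =~ / ( \[ (?1)* \w* \] ) (k\w) /xg) {
    push @wanted, $2;    # group 1 is matched but not collected
}
print "group 2 only: @wanted\n";
```

Only the outermost match of group 1 is reported. The inner recursion levels do not add entries to the result list.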

    Not what you wanted?

    Then see how (not) to ask a question

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

Re: To not capture recursive group while collecting certain matches
by LanX (Saint) on Nov 29, 2021 at 10:40 UTC
Re: To not capture recursive group while collecting certain matches
by Anonymous Monk on Nov 29, 2021 at 10:39 UTC
    Use named patterns?
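One reading of the "named patterns" suggestion is to move the recursive part into a `(?(DEFINE)...)` block, call it with `(?&name)`, and read the wanted capture from `%+`. This is a sketch under that assumption; the names `nest` and `kv` and the input string are hypothetical.

```perl
use strict;
use warnings;

my $text = 'a[[x]]kv b[y]kw';

my $re = qr/
    (?&nest)                 # call the recursive pattern; nothing is captured here
    (?<kv> k\w )             # the only capture we care about
    (?(DEFINE)
        (?<nest> \[ (?&nest)* \w* \] )   # defined only, never matched directly
    )
/x;

my @kv;
my $nest_seen = 0;
while ($text =~ /$re/g) {
    push @kv, $+{kv};
    # Captures set inside a recursion are discarded on return,
    # so the "nest" group should stay undefined after the match.
    $nest_seen++ if defined $+{nest};
}
print "kv matches: @kv\n";
print "nest captured $nest_seen times\n";
```

Note that the `nest` group is still a numbered group, so a plain list-context match would return an undefined slot for it. Reading `$+{kv}` avoids that.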
