QM has asked for the wisdom of the Perl Monks concerning the following question:

Is it possible to have a capture for the purposes of backreferences, but not returned as part of the match?

# Example
@x = 'bogus firstblahjokeblahthird bogus' =~ /(first)(?:blah).*?\2(third)/; # doesn't work
# @x is empty

(In fact, I suspect the above doesn't work because, while there is eventually a 2nd capture group, it hasn't captured anything before the backreference. But that isn't really the point here.)
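
A minimal sketch of that suspicion, using the same test string. The variable names (`@forward`, `@captured`, `@wanted`) are made up for illustration. In the first pattern `(?:blah)` does not capture, so group 2 is `(third)` and `\2` is reached before group 2 has matched anything. In the second, `blah` is captured, and a plain list slice afterwards is one way to drop it from the result.

```perl
use strict;
use warnings;

my $str = 'bogus firstblahjokeblahthird bogus';

# Group 2 is (third) here; \2 refers to a group that has not matched yet,
# so the backreference fails and the list-context match returns nothing.
my @forward = $str =~ /(first)(?:blah).*?\2(third)/;

# Capturing blah makes \2 refer to it; the match then returns all three groups.
my @captured = $str =~ /(first)(blah).*?\2(third)/;

# Keep only the first and last captures with a list slice.
my @wanted = @captured[ 0, 2 ];

print scalar(@forward), "\n";    # number of captures from the failed match
print "@captured\n";
print "@wanted\n";
```

The slice does not stop the middle group from capturing; it only discards it after the fact.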

I don't really have a specific need, but it just seems interesting to say "this capture is only for backreference, and not for returning as a capture".

Quantum Mechanics: The dreams stuff is made of

Re: Regex backreference without capture
by rsFalse (Hermit) on Mar 01, 2019 at 15:52 UTC
    Hello, QM,

    I haven't heard of capture groups that exist only for backreferences.
    I wonder whether a simple workaround exists. The workaround below looks ugly:
    #!/usr/bin/perl -wl
    print map "[$_]", "bogus firstblahj?keblahthird bogus" =~ /
        (?|
            (*F)
            |
            first ( (blah) .*? \2 ) third (?{ $L = $1 }) (*THEN) (*F)
        #         1 2    2        1
            |
            (??{ defined $L ? "" : "(*F)" })
            (first) (??{ print "mid-match:[$L]" ; quotemeta $L }) (third)
        #   1     1                                               2     2
        )
    /x;
    mid-match:[blahj?keblah]
    [first][third]
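    The idea is to use a branch reset group '(?|...)' whose branches each have 2 capture groups. The first real branch matches the whole thing with an ordinary backreference, saves the mid-match in $L (only reached if the full match succeeds), and then fails anyway. The alternative branch then matches the same text again, overwriting $1 and $2, and interpolates the previously saved mid-match (via quotemeta) instead of capturing it.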
      Thanks. But I think that's too complex for any normal usage. (I'd hate to maintain that, even if I was the only coder.)

      But seriously, thanks. Maybe this will spur more ideas?

      Quantum Mechanics: The dreams stuff is made of

Re: Regex backreference without capture (?&NAME) NamedCapture
by beech (Parson) on Mar 05, 2019 at 03:36 UTC

    So you want  @x = qw( first third );  ?

    This "works"

    use Data::Dump qw( dd );   # assumed source of dd(); not shown in the original post

    dd( 'bogus firstblahjokeblahthird bogus' =~ m/
        (first) (?&patblah) .*? (?&patblah) (third)
        (?(DEFINE) (?<patblah>blah) )
    /sx );
    __END__
    ("first", "third", undef)

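    The undef comes from the named capture  (?<patblah>blah), which takes up slot $3 in this pattern but never captures anything in the main match. That is why (?(DEFINE) is placed at the end: the unused slot lands after the two wanted captures.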
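
A small sketch of how the position of the `(?(DEFINE) ...)` block changes the numbering, assuming the same test string. The array names `@define_last` and `@define_first` are hypothetical. Groups are numbered by their opening parenthesis, so moving the definition to the front pushes the unused slot to `$1`.

```perl
use strict;
use warnings;

my $str = 'bogus firstblahjokeblahthird bogus';

# DEFINE at the end: patblah is group 3, so the unused slot comes last.
my @define_last = $str =~ m/
    (first) (?&patblah) .*? (?&patblah) (third)
    (?(DEFINE) (?<patblah>blah) )
/sx;

# DEFINE at the start: patblah is group 1, so the unused slot comes first.
my @define_first = $str =~ m/
    (?(DEFINE) (?<patblah>blah) )
    (first) (?&patblah) .*? (?&patblah) (third)
/sx;

# Print undef slots visibly instead of triggering a warning.
print join( ', ', map { defined $_ ? $_ : 'undef' } @define_last ),  "\n";
print join( ', ', map { defined $_ ? $_ : 'undef' } @define_first ), "\n";
```

With the definition at the end, the wanted values stay in `$1` and `$2`, and the trailing slot can be dropped with a slice such as `@define_last[ 0, 1 ]`.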


      But does this do backreferences? I see it matches the same pattern, but that's not the same as a backreference.

      But thanks, that makes me think about the problem differently.

      Edited to add:
      I'm considering using named capture groups, where there's a need to have a well-defined return value. Which is something PCRE has, and some people (!) can't live without, apparently.
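
One way the named-capture idea might look, as a sketch rather than a settled answer. The group names `head`, `mid` and `tail` and the hash `%wanted` are invented for illustration. The middle group is still captured so that `\k<mid>` can refer to it, but the caller picks the values it wants out of `%+` by name instead of relying on the positional list.

```perl
use strict;
use warnings;

my $str = 'bogus firstblahjokeblahthird bogus';

my %wanted;
if ( $str =~ /(?<head>first)(?<mid>blah).*?\k<mid>(?<tail>third)/ ) {
    # Copy out only the named captures that should be "returned";
    # the mid group exists purely to feed the \k<mid> backreference.
    %wanted = ( head => $+{head}, tail => $+{tail} );
    print "$wanted{head} $wanted{tail}\n";
}
```

The return value is then well defined by name, and adding or removing helper groups does not shift the positions of the others.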

      Quantum Mechanics: The dreams stuff is made of

      E.g. if the input is "bogus firstblAHj?keblOWthird bogus" and the pattern is 'bl..', then (?&patblah) matches both "blAH" and "blOW", but the OP would like the second occurrence to match only the same text as the first :)
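
That difference can be checked with a short sketch. The sub-pattern name `pat` and the variables `$recursion_matches` and `$backref_matches` are assumptions for illustration. `(?&pat)` re-runs the sub-pattern at each use, so the two occurrences may match different text, whereas `\2` must repeat exactly what group 2 captured.

```perl
use strict;
use warnings;

my $str = 'bogus firstblAHj?keblOWthird bogus';

# Recursion: each (?&pat) matches bl.. independently ("blAH", then "blOW").
my $recursion_matches = $str =~ m/
    (first) (?&pat) .*? (?&pat) (third)
    (?(DEFINE) (?<pat>bl..) )
/sx ? 1 : 0;

# Backreference: \2 must be the same text that (bl..) captured, i.e. "blAH" again,
# and that text does not occur a second time in the input.
my $backref_matches = $str =~ m/ (first) (bl..) .*? \2 (third) /sx ? 1 : 0;

print "recursion: $recursion_matches, backreference: $backref_matches\n";
```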