http://qs321.pair.com?node_id=692281


in reply to Global regexp

I'm not sure of your parameters. Do you mean all pairs of numerically consecutive digits, or just adjacent digits? Does the string contain alphabetic characters? Although everyone loves to show off their regex-fu, not all solutions require a regex. Perhaps this:

my $s = "abc12341xyz";
my @pairs = grep /\d\d/, map { substr $s, $_, 2 } (0..length($s)-2);
print join " ", @pairs;
#prints 12 23 34 41


s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}

Re^2: Global regexp
by Anonymous Monk on Jun 17, 2008 at 09:21 UTC
    I meant adjacent digits.

    To be honest, my problem was more complex than just matching digits. I wanted to find all possible matches for any regexp.
    Take
    print $_, ", " for ('1aaab2' =~ /(a+b)/g)
    #prints aaab,

    It prints only one result instead of the list I need: 'ab', 'aab', 'aaab'.
    Solution given by Corion works perfectly in this case:
    print $_, ", " for ('1aaab2' =~ /(?=(a+b))/g)
    #prints aaab, aab, ab,

    But it has one side effect. See an example:
    while ('1234' =~ /(\d\d)/g) {
        print "$`<$&>$'", ", ";
    }
    #prints <12>34, 12<34>,

    Extended regexp:
    while ('1234' =~ /(?=(\d\d))/g) {
        print "$`<$&>$'", ", ";
    }
    #prints <>1234, 1<>234, 12<>34,

    So the side effect is that with this extended regexp, the $`, $& and $' variables can no longer be used as usual.
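    The lookahead form still records where its capture group matched, so the three pieces can be rebuilt from the offsets in @- and @+. The following is only a sketch of that idea; the helper name overlapping_parts is hypothetical and does not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns [prefix, match, suffix]
# for every overlapping match of $re, found through a capturing lookahead.
sub overlapping_parts {
    my ($str, $re) = @_;
    my @parts;
    while ($str =~ /(?=($re))/g) {
        # $-[1] and $+[1] are the start and end offsets of capture group 1.
        push @parts, [
            substr($str, 0, $-[1]),   # what $` would normally hold
            $1,                       # what $& would normally hold
            substr($str, $+[1]),      # what $' would normally hold
        ];
    }
    return @parts;
}

print join(", ", map { "$_->[0]<$_->[1]>$_->[2]" } overlapping_parts('1234', qr/\d\d/)), "\n";
#prints <12>34, 1<23>4, 12<34>
```

    Like the plain lookahead, this yields one match per starting position (the one the engine prefers), not every possible match.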

      To be honest, my problem was more complex than just matching digits. I wanted to find all possible matches for any regexp.

      That's simple enough too.

      local our @results; # Not "my".
      /(\d\d)(?{ push @results, $1 })(?!)/;
        Thanks!
        I guess the magic (?!) does all the work.
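        For readers wondering why it works: (?!) is an empty negative lookahead, which can never succeed, so the overall match always fails right after the code block has recorded the current candidate. The engine then backtracks and tries the next alternative, recording that one too. Below is a minimal runnable sketch of the same idiom; the sub name all_digit_runs and the choice of \d+ are illustrative assumptions, not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not from the thread) around the (?{ ... })(?!) idiom.
sub all_digit_runs {
    my ($str) = @_;
    local our @results;   # a package variable, as in the post above: not "my"
    # The code block records each candidate; (?!) then forces a failure,
    # so the engine backtracks into \d+ and later moves to the next start.
    $str =~ /(\d+)(?{ push @results, $1 })(?!)/;
    return @results;
}

print join(", ", all_digit_runs('123')), "\n";
#prints 123, 12, 1, 23, 2, 3
```

        Unlike the (?= ) approach, this reports every length the quantifier can take at each starting position, not just the preferred one.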

      Which is to be expected, because my approach never matches anything in the "real body" of the regular expression. If you want different behaviour of the regex engine, you can only achieve that by making it match different things, which will result in the match variables containing different values. If you want to keep the behaviour of $`, $& and $', then you will need to fiddle with pos. You haven't stated why you don't want to do that.

        I think dealing with pos will result in ugly code: perhaps a loop over the string length, or time-consuming code.
        So you mean Perl regexps are always greedy: they match as much as possible, and there's no way to configure that besides your approach with (?= )?
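        For comparison, here is a rough sketch of what fiddling with pos could look like. It keeps $`, $& and $' meaningful by matching the real pattern and then rewinding pos to just past the start of each match. The helper name overlapping_with_pos is hypothetical, and the sketch assumes the pattern never matches the empty string.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): collects "$`<$&>$'" for each
# overlapping match by rewinding pos() after every successful match.
# Assumes $re never matches the empty string.
sub overlapping_with_pos {
    my ($str, $re) = @_;
    my @out;
    while ($str =~ /$re/g) {
        push @out, "$`<$&>$'";
        # Resume one character after the START of this match, not after its end.
        pos($str) = $-[0] + 1;
    }
    return @out;
}

print join(", ", overlapping_with_pos('1234', qr/\d\d/)), "\n";
#prints <12>34, 1<23>4, 12<34>
```

        This still finds only one match per starting position, so it is an alternative to the (?= ) form rather than to an exhaustive search.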

      See Regexp::Exhaustive to get every possible match of a pattern against a string. It supports the use of $& et al (without global penalty).

      use Regexp::Exhaustive 'exhaustive';

      my @matches = exhaustive(
          'asdf' => qr/..??/,
          qw[ $` $& $' ],
      );

      printf "%s<%s>%s\n", @$_ for @matches;

      __END__
      <a>sdf
      <as>df
      a<s>df
      a<sd>f
      as<d>f
      as<df>
      asd<f>

      lodin

        Nice package. Thanks.
        Looking at the code in Regexp/Exhaustive.pm shows constructions like those in ikegami's post: (?!) and (?{push @array, ...})

      So, have the regex engine do what it does best, that is, return the longest matching string. Then, have a routine that takes that string and provides the permutations.
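      One possible reading of this suggestion, sketched under assumptions: take the longest match, enumerate its substrings, and keep those that the pattern matches in full. The sub name submatches is hypothetical, and the sketch ignores any context outside the longest match (lookbehind, anchors and so on).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns every substring of
# $longest that the pattern $re matches from beginning to end.
sub submatches {
    my ($longest, $re) = @_;
    my @found;
    for my $start (0 .. length($longest) - 1) {
        for my $len (1 .. length($longest) - $start) {
            my $piece = substr $longest, $start, $len;
            push @found, $piece if $piece =~ /\A(?:$re)\z/;
        }
    }
    return @found;
}

# Let the engine find the longest match first, then expand it.
my ($longest) = '1aaab2' =~ /(a+b)/;
print join(", ", submatches($longest, qr/a+b/)), "\n";
#prints aaab, aab, ab
```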

