Re: multiple matches with regexp

by CombatSquirrel (Hermit)
on Oct 10, 2003 at 20:23 UTC (#298407)

in reply to multiple matches with regexp

Well, the Perl way is obviously TIMTOWTDI, but I have a Perl-ish RegEx way for you ;-):
$a="aaaa"; $a=~m/(aa)(?{push @a, $1})(?!)/; print join ( "-", @a );
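This uses (?{}) (a bit of code embedded in the RegEx that is executed whenever the RE engine runs over it) and (?!) (an empty negative look-ahead, which always fails). Because the match as a whole can never succeed, the engine backtracks and retries from every starting position, running the code block each time (that's a bit of its magic). Both constructs are explained in perlre. You could say it is ugly, but I personally like it :-).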
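For anyone who wants to run it, here is the same one-liner spread out with comments. It is only a sketch: the names $text and @found are mine (the post uses $a and @a), and I have added strict and warnings.

```perl
use strict;
use warnings;

my $text = "aaaa";
my @found;   # illustrative name; the original post uses the package variable @a

# (aa)       capture two a's at the current position
# (?{ ... }) run this Perl code each time the engine gets this far
# (?!)       empty negative look-ahead: always fails, so the engine
#            backtracks and then retries from the next start position
$text =~ m/(aa)(?{ push @found, $1 })(?!)/;

print join("-", @found), "\n";   # prints aa-aa-aa
```

Note that the match itself returns false; all the useful work happens as a side effect in @found.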
Hope this helped.
Entropy is the tendency of everything going to hell.

Replies are listed 'Best First'.
Re: Re: multiple matches with regexp
by sandfly (Beadle) on Oct 10, 2003 at 21:08 UTC
    This is clever, and extended my understanding of the RE engine (++), but is it guaranteed to work?

    I got interested in why a negative look-ahead was required, and found that failing look-behinds (negative or positive) work too, but a simple mismatch doesn't, and neither does a failing zero-length positive look-ahead such as (?=x). For example, m/(aa)(?{push @a, $1})x/ pushes nothing onto @a. Presumably the regex optimiser sees that there is no 'x' in 'aaaa', so it doesn't bother with the step-wise attempts to match the 'a's.
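The comparison described above can be rerun with a small script like the following. It is a rough sketch with array names of my own choosing, and it deliberately prints the counts instead of asserting them for the 'x' and (?=x) variants, since what the optimiser does with those may differ between Perl versions.

```perl
use strict;
use warnings;

my $str = "aaaa";
my (@with_fail, @with_x, @with_lookahead);   # illustrative names

# Same capture and code block each time; only the failing tail differs.
$str =~ m/(aa)(?{ push @with_fail,      $1 })(?!)/;    # empty negative look-ahead
$str =~ m/(aa)(?{ push @with_x,         $1 })x/;       # plain mismatch
$str =~ m/(aa)(?{ push @with_lookahead, $1 })(?=x)/;   # failing positive look-ahead

# How many times did each code block actually run?
printf "%-6s ran %d time(s)\n", '(?!)',  scalar @with_fail;
printf "%-6s ran %d time(s)\n", 'x',     scalar @with_x;
printf "%-6s ran %d time(s)\n", '(?=x)', scalar @with_lookahead;
```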

    Is it possible that a future regex engine will realise that a mismatch is inevitable, because (?!) can never match, and so break this code?

      An overly smart RegEx engine would already break the (?{}) part of the code, which relies on being evaluated every time the engine runs over it. The main problem is that (?{}) is an experimental feature which may be changed or removed in future Perl versions. Still, AFAIK, it is considered useful for some RegExes (the above one is fairly standard), which will hopefully prevent major changes in the syntax. And don't forget we have Perl 6 coming up ;-).
      Entropy is the tendency of everything going to hell.
        To avoid experimental features one may choose the following:
        $regexp="a{2}"; $_="aaaa"; push @a , $1 while m/(?=($regexp))./g; print join ( "-", @a ) . "\n";
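The look-ahead idea above can also be wrapped in a small sub. This is a minimal sketch: overlapping_matches is a hypothetical helper name, and it uses a list-context m//g in place of the while loop, with the capture still inside the look-ahead.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every match of $re in $text, overlaps included.
sub overlapping_matches {
    my ($text, $re) = @_;
    # The look-ahead consumes nothing, so each match is zero-length;
    # with /g Perl then moves on one character before trying again.
    # In list context, m//g returns the captured groups of all matches.
    return $text =~ /(?=($re))/g;
}

my @pairs = overlapping_matches("aaaa", qr/a{2}/);
print join("-", @pairs), "\n";   # prints aa-aa-aa
```

Because nothing here depends on (?{}), the experimental-feature concern raised above does not apply to this version.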
Re: Re: multiple matches with regexp
by almaric (Acolyte) on Oct 11, 2003 at 00:46 UTC
    I like it, and with
    use re 'eval';
    I was also able to use a regular expression instead of a fixed string: "aa" -> "a{2}"
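A sketch of what that might look like, assuming the pattern is held in a variable I have called $pattern (the post does not show the actual code). As far as I know, Perls of this thread's vintage reject a regex that mixes an interpolated variable with a (?{}) block unless use re 'eval' is in effect; newer Perls are more lenient about code blocks written literally in the pattern, and the pragma does no harm there.

```perl
use strict;
use warnings;
use re 'eval';   # permit (?{ ... }) in a pattern that interpolates a variable

my $pattern = 'a{2}';   # assumed variable name; a regex instead of the fixed string "aa"
my @found;

# Same trick as before: capture, record the capture, then force failure.
"aaaa" =~ m/($pattern)(?{ push @found, $1 })(?!)/;

print join("-", @found), "\n";   # prints aa-aa-aa
```

Keep in mind that use re 'eval' also allows code blocks that arrive inside the interpolated string, so it is best reserved for patterns you control.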
