http://qs321.pair.com?node_id=687755


in reply to Re^2: cleaving a sequence with specific alphabets
in thread cleaving a sequence with specific alphabets

You don't need capturing parentheses when matching with /g in list context. AFAIK the .*? incurs a speed penalty, so the expression might be optimized as:

    ...
    my @arr = $string =~ /
        [^KR]+   # collect non K|R
        .        # the following must be K|R
        (?!P)    # ignore if fragment would start by P
    /gsx;
    ...
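To illustrate the point about capturing parentheses, here is a minimal sketch. The input string "AKBRC" and the simplified pattern are made up for the illustration and are not taken from the thread.

```perl
use strict;
use warnings;

# Hypothetical input, chosen only for this illustration.
my $string = "AKBRC";

# No capturing group: in list context, /g returns each whole match.
my @whole = $string =~ /[^KR]*[KR]/g;

# One capturing group: /g returns what the group captured instead.
# Here the group wraps the whole pattern, so the result is the same.
my @captured = $string =~ /([^KR]*[KR])/g;

print "whole:    @whole\n";
print "captured: @captured\n";
```

Both lists hold the same fragments, so the parentheses only matter when you want part of each match rather than all of it.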

Regards

mwa

Re^4: cleaving a sequence with specific alphabets
by tachyon-II (Chaplain) on May 21, 2008 at 16:30 UTC

    This is broken, not optimised. The .*? was there for a reason.

    $string = "RYOURPOSTBROKEN";
      That snippet shows that yours is broken too. It doesn't return the last segment ("EN").
      my $string = "RYOURPOSTBROKEN";
      print("input: $string\n");
      print("expecting: R YOURPOSTBR OK EN\n");
      print("\n");

      {
          # mwah (split)
          my $re = qr/(?<=[KR])(?!P)/;
          my @arr = split $re, $string;
          print("split $re: @arr\n");
      }

      {
          # tachyon-II (re)
          my $re = qr/(.*?[KR])(?!P)/s;
          my @arr = $string =~ /$re/g;
          print("$re: @arr\n");
      }

      {
          # mwah (re)
          my $re = qr/[^KR]+.(?!P)/s;
          my @arr = $string =~ /$re/g;
          print("$re: @arr\n");
      }
      input: RYOURPOSTBROKEN
      expecting: R YOURPOSTBR OK EN

      split (?-xism:(?<=[KR])(?!P)): R YOURPOSTBR OK EN
      (?s-xim:(.*?[KR])(?!P)): R YOURPOSTBR OK
      (?s-xim:[^KR]+.(?!P)): YOU POSTBR OK EN
      (?s-xim:[^KR]*.(?!P)): R YOU POSTBR OK EN

        Oh well, I made a mistake (I had better actually test the code next time).

        I couldn't come up with a solution that drops the .*?, so tachyon-II's solution only needs a slight modification ...

            ...
            my @arr = $string =~ /
                .*? [KR] (?!P)
              | [^KR]+$
            /gx;
            ...

        ... to get the string's tail.
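A quick way to check the modified expression is to run it on the test string used earlier in the thread and compare it with the split-based version. This is only a sketch; the variable names @by_split and @by_match are mine.

```perl
use strict;
use warnings;

my $string = "RYOURPOSTBROKEN";

# Reference: split after K or R unless the next character is P.
my @by_split = split /(?<=[KR])(?!P)/, $string;

# Modified match: a fragment ending in K/R that is not followed by P,
# or a trailing run without K/R that reaches the end of the string.
my @by_match = $string =~ /
    .*? [KR] (?!P)
  | [^KR]+ $
/gx;

print "split: @by_split\n";
print "match: @by_match\n";
```

Both approaches agree on this input. One caveat worth checking against your own data: if the tail contains a K or R followed by P (for example "ABKP"), the second alternative only picks up the part after the last K/R, whereas the split version keeps the whole tail.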

        Thanks for checking this!

        Regards

        mwa