http://qs321.pair.com?node_id=812913


in reply to Regex fun

I can do the deed with
while (m/\+([0-9]+)[ACGTNacgtn]/g) { print "diff+: $1\n"; my $m = $1; s/\+[0-9]+[ACGTNacgtn]{$m}// }
But that's not quite so nice.

I understand that you meant “It's not so nice because I'd like a single regex”, but it's also not so nice because you're doing some work (matching the number and a single base) twice. You might prefer something like

s/\G\+${1}[$bases]{$1}// while /(?=\+([0-9]+))/g;
which is at least 1 line, if 2 regexes. :-)

Here's a fairly naughty single-regex approach:

1 while s/(?<=\+)([0-9]+)[$bases]/$1 - 1/eg;

UPDATE: Changed the patterns to use look-around. Your version and my first will both loop forever on a malformed string like +2G, whereas the second one will just reduce it to +1 and terminate.
UPDATE: As Hena points out, I forgot a base case in my induction! The following fixes it (at least if no +0 strings are allowed), but loses a lot of the fun:

1 while s/\+([0-9]+)[$bases]/$1 > 1 ? '+' . ($1 - 1) : ''/eg;
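
For anyone who wants to try the countdown idea, here is a minimal sketch of that last substitution wrapped in a sub. The name strip_insertions, the value of $bases (copied from the character class in the quoted question) and the sample strings are all my assumptions, not anything from the thread. The count arithmetic is parenthesized because `.` and binary `-` have the same precedence in Perl, so `'+' . $1 - 1` would be evaluated as `('+' . $1) - 1`.

```perl
use strict;
use warnings;

# Assumption: $bases is never defined in the post; this value mirrors
# the character class used in the quoted question.
my $bases = 'ACGTNacgtn';

# strip_insertions is a hypothetical helper name used for illustration.
sub strip_insertions {
    my ($s) = @_;
    # Each pass consumes one base after every "+N" marker and rewrites
    # the marker as "+(N-1)". When N is 1 the marker is dropped, which
    # is the base case. A "+0" marker is not handled specially.
    1 while $s =~ s/\+([0-9]+)[$bases]/$1 > 1 ? '+' . ($1 - 1) : ''/eg;
    return $s;
}

# Made-up sample inputs: one well-formed, one malformed (count larger
# than the number of bases that follow), one with two markers.
for my $in ('AC+3GGTTA', '+2G', 'A+1CG+2TTA') {
    printf "%-12s -> %s\n", $in, strip_insertions($in);
}
# AC+3GGTTA    -> ACTA
# +2G          -> +1
# A+1CG+2TTA   -> AGA
```

The malformed +2G case stops at +1 instead of looping, because a marker with no base after it simply fails to match on the next pass.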