Good point.
or use features that I don't use and so don't remember to prevent backtracking.
I do remember, you must be talking about the "cut" assertion: (?>PATTERN). See japhy's drafts for his (unfortunately) abandoned book on regexes, at http://japhy.perlmonk.org/book/book/ , chapter 8 (MS Word format), section 8.1.4.
Only, it won't work here, because If I did /a(?>.*?)b/, it would never match, because once the .*? ate a "b", it would never give it back.
print 'xxxxaxxxxbxxxx' =~ /a(?>.*?)b/ ? 'match' : 'fail';
produces fail.
So yours is the better solution... you don't need the "?", because /^[^a]*?a/ and /^[^a]*a/ will match the exact same substring. I have no idea which is faster, or whether it even makes a difference, so I'll just leave it in.
So, my code adapted to produce your pattern:
$string = 'abcd';
$regex = '^' . join '', map "[^$_]*$_", map quotemeta, split //, $stri
+ng;
# embed capture to get offset:
$regex =~ s/(?<=\*)/(/; $regex =~ s/$/)/;
$sequence = 'xxxxxaxxxxxxbxxxxxcxxxxdxxxxx';
if($sequence =~ /$regex/s) {
printf "I found a match, at offset %d and with total length %d\n",
+ $-[1], $+[1]-$-[1];
}
As a precaution I've included a quotemeta so you can safely search for subsequences containing \W characters. |