Wow, I didn't know the empty capture and backreference trick, nice!
I also didn't know the trick where you put the pattern on the left side of the =~ operator and still get results somehow ;-)
I suspect there's some unexpected behavior in the backtracking?
I tried:
use v5.20;
use Data::Dump "pp";
my @z = glob('{a,b,c}'x6);
my $z = '(?:a()|a()|b()|b()|c()|c()){6}\1\2\3\4\5\6';
for my $j (@z)
{
    $j =~ $z and say pp {$j => \@- };
}
And I got a bunch of values like:
...
{ acbcaa => [0, 6, 6, 3, 3, 4, 4] }
{ acbcab => [0, 5, 5, 6, 6, 4, 4] }
{ acbcac => [0, 5, 5, 3, 3, 6, 6] }
{ acbcba => [0, 6, 6, 5, 5, 4, 4] }
{ acbcca => [0, 6, 6, 3, 3, 5, 5] }
{ accaab => [0, 5, 5, 6, 6, 3, 3] }
{ accaba => [0, 6, 6, 5, 5, 3, 3] }
...
{ cccbab => [0, 5, 5, 6, 6, 3, 3] }
{ cccbac => [0, 5, 5, 4, 4, 6, 6] }
{ cccbba => [0, 6, 6, 5, 5, 3, 3] }
{ cccbca => [0, 6, 6, 4, 4, 5, 5] }
{ ccccab => [0, 5, 5, 6, 6, 4, 4] }
{ ccccba => [0, 6, 6, 5, 5, 4, 4] }
Here the two alternatives of each pair (e.g. \1 and \2) always report exactly the same offset, no matter what. I suspect the identical branches are actually merged by the optimizer.
Is there some other magic to DWIM?
There's this:
my @y = glob('{a,b,c}'x6);
my $y = '(?:(?!\1)a()|(?!\2)a()|(?!\3)b()|(?!\4)b()|(?!\5)c()|(?!\6)c()){6}\1\2\3\4\5\6';
for my $j (@y)
{
    $j =~ $y and say $j;
}
aabbcc
aabcbc
aabccb
aacbbc
aacbcb
aaccbb
ababcc
abacbc
abaccb
...
ccbaab
ccbaba
ccbbaa
Edit: this also works actually (without \1\2\3\4\5\6 at the end):
# edit reformatted as a multiline regex for clarity
my $y = qr/(?:
(?!\1) a ()
| (?!\2) a ()
| (?!\3) b ()
| (?!\4) b ()
| (?!\5) c ()
| (?!\6) c ()
){6}
/x;
So TIL, (?!\N)XXX(), where N is the number of the empty group that follows XXX, is a pattern that only allows XXX to match once in the whole regex... Cool :-)
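For anyone who wants to poke at the guard in isolation, here is a cut-down sketch with only two branches. The helper name uses_each_once and the \A ... \z anchors are my own additions for illustration, not part of the code above. As far as I understand it, the idea relies on a backreference to a group that has not matched yet failing in Perl, so the negative lookahead succeeds only until its own empty group has been set.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post above): true if $str consists of
# exactly one "a" and one "b", in either order.
sub uses_each_once {
    my ($str) = @_;
    my $re = qr/\A(?:
          (?!\1) a ()   # \1 is unset until this branch has matched once
        | (?!\2) b ()   # likewise for \2 and the "b" branch
    ){2}\z/x;
    return $str =~ $re ? 1 : 0;
}

for my $s (qw(ab ba aa bb)) {
    printf "%s: %s\n", $s, uses_each_once($s) ? "match" : "no match";
}
```

With two iterations and two once-only branches, "ab" and "ba" should match while "aa" and "bb" should be rejected, since the second "a" (or "b") would have to reuse a branch whose guard is already tripped.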