perlmeditation
blokhead
I'd seen [Abigail-II|Abigail] do nasty things with regexes before, but never really understood them at all. However, in the recent [id://297616|N-Queens solver], it was a pure regex (backrefs only), there were no <code>(?{})</code> constructs, and I was finally able to understand how [Abigail-II|Abigail] does it..
<p>
Then, on a regex kick, I discovered [http://perl.plover.com/NPC/NPC-3SAT.html|Abigail's pure regex 3-SAT reduction]. If you haven't already seen it, it's the coolest couple of lines on the planet. With all this regex excitement, I decided to try my hand at solving some NP-complete problem(s) with pure regexes.
<p>
I tried a handful of different NP-complete problems, but after a few emails with [Dominus|MJD], I understood why these attempts went wrong. I kept getting hung up on the string and regex size being exponential on the size of input (not meaning my solution was incorrect, but invalidating its use as an NP-completeness reduction proof). Finally, I think I may have gotten it right with Hamiltonian Circuits. [Abigail-II|Abigail] has [id://282049|done this before] using extended regexes, but claimed to be stuck on a pure regex solution that wasn't exponential.
<p>
So without further ado, here's my solution. Given a (simple, undirected) graph with E edges and V vertices, it finds a [http://mathworld.wolfram.com/HamiltonianCircuit.html
|Hamiltonian Circuit] (a cycle that visits every vertex exactly once, starting and ending in the same place). The size of the string it creates is bounded by O(V^4), and the size of the regex it creates is bounded by O(V^2).
<readmore>
<code>
my @E = ([1,3],[1,5],[2,3],[2,4],[2,5],[4,5]);
my $V = 5;
my $verbose = 1;
my @all_edges = map { my $x = $_; map { [$x, $_] } $x+1 .. $V } 1 .. $V-1;
my $string = (join(' ', 1 .. $V) . "\n") x $V
. "\n"
. (join(' ', map { join "-", @$_ } @all_edges ) . "\n")
x @all_edges
. "\n"
. (join(' ', map { join "-", @$_ } @E ) . "\n") x $V;
my $regex = "^ "
. ".* \\b (\\d+) \\b .* \\n\n" x $V
. "\\n\n"
. join("", map { my ($x, $y) = @$_;
".* \\b (?: \\$x-\\$y | \\$y-\\$x ) \\b .* \\n\n"
} @all_edges)
. "\\n\n"
. join("", map { my ($x, $y) = ($_, $_+1);
".* \\b (?: \\$x-\\$y | \\$y-\\$x ) \\b .* \\n\n"
} 1 .. ($V-1))
. ".* \\b (?: \\$V-\\1 | \\1-\\$V ) \\b .* \\n \$\n";
print "'$string' =~ /\n$regex\n/x\n" if $verbose;
if (my @c = $string =~ /$regex/x) {
local $" = " -> ";
print "Hamiltonian circuit: [ @c -> $1 ]\n";
} else {
print "No Hamiltonian circuit\n";
}
</code>
Now the dirty details.. Here's the value of <tt>$string</tt> and <tt>$regex</tt> for the example graph included:
<code>
$string = q[
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-3 1-5 2-3 2-4 2-5 4-5
1-3 1-5 2-3 2-4 2-5 4-5
1-3 1-5 2-3 2-4 2-5 4-5
1-3 1-5 2-3 2-4 2-5 4-5
1-3 1-5 2-3 2-4 2-5 4-5
];
$regex = q[^
.* \b (\d+) \b .* \n
.* \b (\d+) \b .* \n
.* \b (\d+) \b .* \n
.* \b (\d+) \b .* \n
.* \b (\d+) \b .* \n
\n
.* \b (?: \1-\2 | \2-\1 ) \b .* \n
.* \b (?: \1-\3 | \3-\1 ) \b .* \n
.* \b (?: \1-\4 | \4-\1 ) \b .* \n
.* \b (?: \1-\5 | \5-\1 ) \b .* \n
.* \b (?: \2-\3 | \3-\2 ) \b .* \n
.* \b (?: \2-\4 | \4-\2 ) \b .* \n
.* \b (?: \2-\5 | \5-\2 ) \b .* \n
.* \b (?: \3-\4 | \4-\3 ) \b .* \n
.* \b (?: \3-\5 | \5-\3 ) \b .* \n
.* \b (?: \4-\5 | \5-\4 ) \b .* \n
\n
.* \b (?: \1-\2 | \2-\1 ) \b .* \n
.* \b (?: \2-\3 | \3-\2 ) \b .* \n
.* \b (?: \3-\4 | \4-\3 ) \b .* \n
.* \b (?: \4-\5 | \5-\4 ) \b .* \n
.* \b (?: \5-\1 | \1-\5 ) \b .* \n $
];
__OUTPUT__
Hamiltonian circuit: [ 5 -> 4 -> 2 -> 3 -> 1 -> 5 ]
</code>
Here's how it works: <ul>
<li>In the first "paragraph", <tt>$1</tt> through <tt>$5</tt> each get a vertex assigned to them. This paragraph of <tt>$string</tt> is bounded by O(V^2), and <tt>$regex</tt> by O(V).
<br><br>
<li>The second "paragraph" is the ugly bit. We have to make sure all the vertices chosen are different. This could have been done by originally choosing <tt>$1</tt> through <tt>$5</tt> to be any permutation from an exhaustive list of permutations, but such a list would have been exponential in size.
<br><br>
The way I checked is by picking any 5 vertices, then ensuring that they are all pairwise distinct. In <tt>$string</tt>, we have a repeated list of all "valid" (distinct) vertex pairs. In <tt>$regex</tt>, we make sure that every pair of two vertices in {<tt>$1</tt>,..,<tt>$5</tt>} shows up in that list.<p>
This isn't exponential, because there are V(V-1)/2 "valid" (distinct) pairs, and V(V-1)/2 pairs to verify. So this paragraph of <tt>$string</tt> is bounded by O(V^4), and <tt>$regex</tt> by O(V^2).
<br><br>
<li>The third "paragraph" is where we verify that our distinct vertex set has edges from <tt>$1</tt> to <tt>$2</tt> ... to <tt>$5</tt> and then back to <tt>$1</tt>. This paragraph of <tt>$string</tt> is bounded by O(E*V) = O(V^3), and <tt>$regex</tt> by O(V).
<br><br>
If all these conditions matched, then the regex matches, and <tt>$1</tt> through <tt>$5</tt> must be the vertices of a Hamiltonian Circuit.
</ul>
<p>
There you have it. Your feedback is welcome. I really hope I haven't made any mistakes in my analysis. This code isn't one-tenth as beautiful and elegant as [http://perl.plover.com/NPC/NPC-3SAT.html|Abigail's 3-SAT reduction], and perhaps it could be improved upon. But hey, we all have to start somewhere <tt>;)</tt>
</readmore>
<p>
<b>Update:</b> modified code so that <tt>$string</tt> is a little clearer to read. Instead of comma-separating the edge and vertex lists, they are separated by spaces. <tt>$regex</tt> is modified accordingly, matching on <tt>\b</tt> where appropriate, instead of a comma.
<div class="pmsig"><div class="pmsig-137386">
<p>
blokhead
</div></div>