Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

The N-queens problem using pure regexes

by Abigail-II (Bishop)
on Oct 08, 2003 at 14:52 UTC ( [id://297616]=perlmeditation: print w/replies, xml ) Need Help??

More than a year ago, I posted in this section a way of solving the N-queens problem (place N queens on an NxN board such that no two queens attack each other) using a regex. However, the regex was full of (?{ }) and (?(?{ })|) constructs, so it wasn't a real regex, as it executed Perl code all the time.

I never came around explaining how it works, until yesterday when I gave a small talk at a local Perl mongers meeting. Driving home last night, I started realizing that the problem is solvable with pure regexes. No fancy (?{ }) or (?(?{ })|) constructs.

The program below solves the N-queens problem using a pure regex. It takes a few options: -n followed by a number indicates the size of the board, and if you use -p, it prints out the regex (and the string it matches against). -P only prints out the regex and string, but doesn't try to match it. Since it's slow like hell (but I've some ideas to speed it up), try -n 5 or -n 6. -n 8 (the default) takes a long time.

There's no much explaination (yet), but if you see the string and the regex, you can figure it out.

#!/usr/bin/perl use strict; use warnings 'all'; use Getopt::Long; Getopt::Long::Configure ("bundling"); GetOptions ('p|print' => \my $print, 'P|Print' => \my $Print, 'n|number=i' => \(my $nr_of_queens = 8) ); my $nr_of_rows = $nr_of_queens; my $nr_of_cols = $nr_of_queens; my @rows = (1 .. $nr_of_rows); my @cols = map {chr ($_ - 1 + ord 'a')} 1 .. $nr_of_cols; # Return positions not attacked by a certain other position. sub free { my $pos = shift; my ($col, $row) = $pos =~ /(\D+)(\d+)/; $col = ord ($col) - ord ('a') + 1; map {my $c = chr ($_ -> [0] - 1 + ord 'a'); "$c$_->[1]"} grep {$_ -> [0] != $col && $_ -> [1] != $row && abs ($_ -> [0] - $col) != abs ($_ -> [1] - $row)} map {my $c = ord ($_) - ord ('a') + 1; map {[$c, $_]} @rows} @ +cols; } my $str = join "\n" => map {my $c = $_; my $l = join "," => map {"$c$_"} @rows; ",$l,"} @cols; $str .= "\n;\n"; map {$str .= "$_:" . join ("," => free $_) . ",\n"} map {my $c = $_; map {"$c$_"} @rows} @cols; my $re = join "\n" => (".*,(\\w+),.*") x $nr_of_queens; $re .= "\n"; map {my $q = $_; $re .= "[\\x00-\\xFF]*\\n\\$q:"; map {$re .= ".*\\$_,"} grep {$_ ne $q} 1 .. $nr_of_queens; $re .= ".*\n"} 1 .. $nr_of_queens; if ($print || $Print) { print "'$str' =~ \n/^$re/\n"; exit if $Print; } if (my @a = $str =~ /^$re/) { print "[@a]\n"; } __END__

Abigail

Replies are listed 'Best First'.
Re: The N-queens problem using pure regexes
by dragonchild (Archbishop) on Oct 08, 2003 at 16:56 UTC
    Bravo! Very neat solution. And, a beautiful demonstration of the fact that the right data structure isn't just important, it's everything.

    Also, I like the heavy usage of map and grep. A little harder to read at first (as a programmer who's functional at play, not at work), but very concise.

    Question - how does the regex engine not choose a free square in d4 when looking for free squares in d3? I'm guessing it has to do with [\x00-\xFF]*, but I'm not positive as to how that's working ...

    ------
    We are the carpenters and bricklayers of the Information Age.

    The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

      how does the regex engine not choose a free square in d4 when looking for free squares in d3?

      I do not know what you mean by this. Do you mean "why doesn't the regex engine never try to place a queen on both d4 and d3?" In that case, the answer is simple - it will never even attempt to place two queens on the same column. The first part of the string is:

      ,a1,a2,a3,a4,a5,a6,a7,a8, ,b1,b2,b3,b4,b5,b6,b7,b8, ,c1,c2,c3,c4,c5,c6,c7,c8, ,d1,d2,d3,d4,d5,d6,d7,d8, ,e1,e2,e3,e4,e5,e6,e7,e8, ,f1,f2,f3,f4,f5,f6,f7,f8, ,g1,g2,g3,g4,g5,g6,g7,g8, ,h1,h2,h3,h4,h5,h6,h7,h8,
      is being matched against the first part of the regex:
      .*,(\w+),.* .*,(\w+),.* .*,(\w+),.* .*,(\w+),.* .*,(\w+),.* .*,(\w+),.* .*,(\w+),.* .*,(\w+),.*
      The newlines in the string and regex are significant here (. won't match a newline). This part will make \1 one of 'a1' .. 'a8', \2 one of 'b1' .. 'b8', \3 one of 'c1' .. 'c8', etc. So, it will never try to place two queens on the 'd' column.

      But you may ask, "why doesn't it put a queen on d3 and one on e4?". That's where the second part of the string and regex come in. \4 contains the position of the queen on the d column, so, in this case, \4 equals 'd3'. Then there's this line in the second part of the regex:

      [\x00-\xFF]*\n\4:.*\1,.*\2,.*\3,.*\5,.*\6,.*\7,.*\8,.*
      We know that \4 equals 'd3', so part of this lines read '\nd3:'. Looking back at the second part of the string, there is only one line that starts with 'd3:':
      d3:a1,a2,a4,a5,a7,a8,b2,b4,b6,b7,b8,c1,c5,c6,c7,c8,e1,e5,e6,e7,e8,f2,f +4,f6,f7,f8,g1,g2,g4,g5,g7,g8,h1,h2,h4,h5,h6,h8,
      If you look carefully, after the colon are all the fields that aren't attacked by a queen on d3. Specifically, the field 'e4' is missing. But the line
      [\x00-\xFF]*\n\4:.*\1,.*\2,.*\3,.*\5,.*\6,.*\7,.*\8,.*
      is saying "match a line that starts with 'd3', and has after the colon a list of fields that include the positions of all other 7 queens". The [\x00-\xFF]* just skips enough lines to get to the next queen.

      Abigail

        The [\x00-\xFF]* just skips enough lines to get to the next queen.

        Your explanation above is good. However, the above quote is the actual meat of my question ...

        ------
        We are the carpenters and bricklayers of the Information Age.

        The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

        Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

Re: The N-queens problem using pure regexes
by Abigail-II (Bishop) on Oct 09, 2003 at 14:30 UTC
    Since it's slow like hell (but I've some ideas to speed it up)

    Speeding it up turned out to be easier than I thought it was. Below is reworked program that is dramatically faster that the original. But first a table comparing running times of three versions, the original, pure regex solution from the parent node, the faster (still pure regex) solution presented below, and the non-pure variant presented last year. The latter is still the faster solution though.

    Timings (values in wall clock seconds):

      N    Original    Faster   Non-pure
     
      4       0.035     0.034      0.035
      5       0.045     0.036      0.036
      6       0.769     0.041      0.038
      7       4.833     0.042      0.038
      8                 0.082      0.049
      9                 0.072      0.044
     10                 0.113      0.056
     11                 3.504      0.051
     12                            0.096
     13                            0.071
     14                            0.577
     15                            0.467
     16                            3.864
     17                            2.289
     18                           19.630
     19                            1.324
     20                          117.227
    

    Before giving the program, some sample output:

    $ ./queens -p -n 4 ';,a1,a2,a3,a4, ;,b1,b2,b3,b4, b1:,a3,a4, b2:,a4, b3:,a1, b4:,a1,a2, ;,c1,c2,c3,c4, c1:,a2,a4,b3,b4, c2:,a1,a3,b4, c3:,a2,a4,b1, c4:,a1,a3,b1,b2, ;,d1,d2,d3,d4, d1:,a2,a3,b2,b4,c3,c4, d2:,a1,a3,a4,b1,b3,c4, d3:,a1,a2,a4,b2,b4,c1, d4:,a2,a3,b1,b3,c1,c2, ' =~ /^;.*,(\w+),.* ;.*,(\w+),.* [^;]*\2:.*,\1[^;]* ;.*,(\w+),.* [^;]*\3:.*,\1.*,\2[^;]* ;.*,(\w+),.* [^;]*\4:.*,\1.*,\2.*,\3[^;]* / [a3 b1 c4 d2] $ ./queens -n 8 [a8 b4 c1 d3 e6 f2 g7 h5]

    And here's the program:

    #!/usr/bin/perl use strict; use warnings 'all'; use Getopt::Long; Getopt::Long::Configure ("bundling"); GetOptions ('p|print' => \my $print, 'P|Print' => \my $Print, 'n|number=i' => \(my $nr_of_queens = 8) ); my @rows = 1 .. $nr_of_queens; my @cols = ('a' .. 'z') [0 .. $nr_of_queens - 1]; sub a2i {ord ($_ [0]) - ord ('a') + 1} sub i2a {chr ($_ [0] + ord ('a') - 1)} # Given a square, return all non-attacked squares on columns to # the *left* of the given square. (a1 is the lower left corner). sub free { my ($C, $R) = $_ [0] =~ /(\D)(\d+)/; $C = a2i $C; map {join "" => i2a ($_ -> [0]), $_ -> [1]} grep {$_ -> [0] != $C && $_ -> [1] != $R && abs ($_ -> [0] - $C) != abs ($_ -> [1] - $R)} map {my $c = a2i $_; map {[$c, $_]} @rows} @cols [0 .. $C - 1] } my ($str, $re) = ("", ""); foreach my $c (@cols) { $str .= ";," . (join "," => map {"$c$_"} @rows) . ",\n"; $re .= ";.*,(\\w+),.*\n"; next if $c eq 'a'; map {$str .= "$_:," . join ("," => free ($_)) . ",\n"} map {"$c$_" +} @rows; my $C = a2i $c; $re .= "[^;]*\\$C:" . join ("" => map {".*,\\$_"} 1 .. $C - 1) . " +[^;]*\n"; } if ($print || $Print) { print "'$str' =~ \n/^$re/\n"; exit if $Print; } if (my @a = $str =~ /^$re/) { print "[@a]\n"; } __END__

    Abigail

      N Original Faster Non-pure 4 0.035 0.034 0.035 5 0.045 0.036 0.036 6 0.769 0.041 0.038 7 4.833 0.042 0.038 8 0.082 0.049 9 0.072 0.044 10 0.113 0.056 11 3.504 0.051 12 0.096 13 0.071 14 0.577 15 0.467 16 3.864 17 2.289 18 19.630 19 1.324 20 117.227

      I'm curious - have you had a chance to look at why the speeds actually improve when going from 8 to 9 for both Faster and Non-pure and from 10-11 for Non-pure, but slows down 30x for Faster? And, what's with 17, 18, and 19 when it's 2.289 -> 19.630 -> 1.324??

      ------
      We are the carpenters and bricklayers of the Information Age.

      The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

      Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

        It has to do with how many positions are rejected before a suitable one is found. The solutions found for n = 8 and n = 9 are:
        [a8 b4 c1 d3 e6 f2 g7 h5] [a9 b7 c4 d2 e8 f6 g1 h3 i5]
        As you can see, for n = 8, it never has to backtrack for the first queen (a8 is choosen), but for the seconde queen, b8, b7, b6, and b5 need to be rejected. b8 and b7 will be rejected right away (as they are attacked by a8), but for b6 and b5 to be rejected, lots of other queens will be have to be placed. For n = 9, no backtracking for the first queen is needed, and for the second queen, the positions b9 and b8 are rejected immediately. It's only the third queen were there's some real backtracking going on - c9, c8, c7, and c6 are rejected immediately, and only for c5 more queens will be tried before rejecting it.

        The timings for 'faster' with n >= 10 cannot be trusted, as the program contained a bug for n >= 10 (see elsewhere in this thread - the bug is now fixed). Here's a new table (done on a different computer, and recording user times, not wall clock time), with the fixed programs:

        N Original Faster Non-Pure 4 0.06 0.05 0.04 5 0.07 0.04 0.05 6 1.57 0.07 0.05 7 9.29 0.06 0.05 8 0.23 0.06 9 0.16 0.06 10 0.50 0.07 11 0.41 0.07 12 2.64 0.14 13 1.58 0.10 14 37.23 0.82 15 35.45 0.70 16 5.45 17 3.18 18 27.17 19 1.89 20

        And, in case you are interested, the code that generated the table:

        #!/usr/bin/perl use strict; use warnings; no warnings qw /syntax/; $| = 1; my $width = 15; my $time_out = 120; my @cmds = ("./queens2 -n ", "./queens3 -n ", "./queens1 -f -n "); my $nr_of_commands = @cmds; my $N = 4; print " N"; printf "%${width}s" => $_ for qw /Original Faster Non-Pure/; print "\n"; while ($nr_of_commands) { printf "%3d" => $N; foreach my $cmd (@cmds) { unless (defined $cmd) { print " " x $width; next; } local $SIG {ALRM} = sub {die "Time out!"}; alarm ($time_out); eval { my $time = (`/usr/bin/time -f "%U" $cmd $N 2>&1`) [-1]; alarm (0); chomp $time; printf "%$width.2f" => $time; }; if ($@ && $@ =~ /Time out/) { undef $cmd; $nr_of_commands --; print " " x $width; } } print "\n"; $N ++; }

        Home work question: the code above is lacking something vital. What is it not doing what it should do?

        Abigail

        For any odd N odd N not divisible by 3: $a_solution = [ map { chr(ord('a') + $_ - 1).(((2 * $_) - 1) % $N); } (1..$N) ]

        (e.g., for $N = 11: [ a1, b3, c5, d7, e9, f11, g2, h4, i6, j8, k10 ])

        This is equivalent to starting in the bottom-left corner of the board and moving right one square and up two squares (wrapping when you hit the edge) N-1 times.

        If the regex picks the first space not-already-capturable in each column (From brief inspection, it appears to do so -- It finds an equivalent solution for odd-N), this is the first solution it will find. In the even-N case, this process will not leave any not-already-capturable squares in the last column on the first pass, so it must then backtrack.

        Update: Whoa there, Ben. I spoke way too soon. The above solution only applies when N is odd AND not divisible by 3. (So, for N = (1,5) mod 6)

        More update: I was wrong about most of the analysis, too. This won't be the first solution found.

Re: The N-queens problem using pure regexes
by thor (Priest) on Oct 10, 2003 at 01:33 UTC
    I think I've found a bug:
    thor@bravo:~/perl> queens.pl -n 10 [a10 b8 c6 d9 e1 f1 g1 h7 i5 j2]
    Wouldn't e1, f1, and g1 all be able to attack one another?

    thor

      You are quite correct. The faulty solution can only happen with n >= 10; it happens because 'e10' isn't attacked by 'f1', and "e10" =~ /e1/. Luckely, the fix is simple:
      #!/usr/bin/perl use strict; use warnings 'all'; use Getopt::Long; Getopt::Long::Configure ("bundling"); GetOptions ('p|print' => \my $print, 'P|Print' => \my $Print, 'n|number=i' => \(my $nr_of_queens = 8) ); my @rows = 1 .. $nr_of_queens; my @cols = ('a' .. 'z') [0 .. $nr_of_queens - 1]; sub a2i {ord ($_ [0]) - ord ('a') + 1} sub i2a {chr ($_ [0] + ord ('a') - 1)} # Given a square, return all non-attacked squares on columns to # the *left* of the given square. (a1 is the lower left corner). sub free { my ($C, $R) = $_ [0] =~ /(\D)(\d+)/; $C = a2i $C; map {join "" => i2a ($_ -> [0]), $_ -> [1]} grep {$_ -> [0] != $C && $_ -> [1] != $R && abs ($_ -> [0] - $C) != abs ($_ -> [1] - $R)} map {my $c = a2i $_; map {[$c, $_]} @rows} @cols [0 .. $C - 1] } my ($str, $re) = ("", ""); foreach my $c (@cols) { $str .= ";," . (join "," => map {"$c$_"} @rows) . ",\n"; $re .= ";.*,(\\w+),.*\n"; next if $c eq 'a'; map {$str .= "$_:," . join (",," => free ($_)) . ",\n"} map {"$c$_ +"} @rows; my $C = a2i $c; $re .= "[^;]*\\$C:" . join ("" => map {".*,\\$_,"} 1 .. $C - 1) . +"[^;]*\n"; } if ($print || $Print) { print "'$str' =~ \n/^$re/\n"; exit if $Print; } if (my @a = $str =~ /^$re/) { print "[@a]\n"; } __END__

      Abigail

Re: The N-queens problem using pure regexes
by Roy Johnson (Monsignor) on Nov 21, 2003 at 19:42 UTC
    This may get me a very clever, -10, but I've tweaked the program to run rather quickly (-n 22 runs in a few seconds) by using a known general solution as the basis of my regex. Technically, it still generates a regular expression and matches it against the who-doesn't-attack-whom string, so -n 3 returns no match.

    #!/usr/bin/perl use strict; use warnings 'all'; use Getopt::Long; Getopt::Long::Configure ("bundling"); GetOptions ('p|print' => \my $print, 'P|Print' => \my $Print, 'n|number=i' => \(my $nr_of_queens = 8) ); my @rows = 1 .. $nr_of_queens; my @cols = ('a' .. 'z') [0 .. $nr_of_queens - 1]; sub a2i {ord ($_ [0]) - ord ('a') + 1} sub i2a {chr ($_ [0] + ord ('a') - 1)} # Given a square, return all non-attacked squares on columns to # the *left* of the given square. (a1 is the lower left corner). sub free { my ($C, $R) = $_ [0] =~ /(\D)(\d+)/; $C = a2i $C; map {join "" => i2a ($_ -> [0]), $_ -> [1]} grep {$_ -> [0] != $C && $_ -> [1] != $R && abs ($_ -> [0] - $C) != abs ($_ -> [1] - $R)} map {my $c = a2i $_; map {[$c, $_]} @rows} @cols [0 .. $C - 1] } my ($str, $re) = ('', ''); my $solrow = 0; foreach my $c (@cols) { $str .= ";," . (join "," => map {"$c$_"} @rows) . ",\n"; $solrow += 2; $solrow = 1 if $solrow > $nr_of_queens; $re .= ".*\n" unless $c eq 'a'; $re .= ";.*,($c$solrow),"; next if $c eq 'a'; $re .= sprintf "(?:.*\n){%d}\\%d:", $solrow, a2i($c); for my $i (1..a2i($c)-1) { $re .= ".*,\\$i,"; } $re .= sprintf "(?:.*\n){%d}", $nr_of_queens - $solrow; map {$str .= "$_:," . join (",," => free ($_)) . ",\n"} map {"$c$_ +"} @rows; } if ($print || $Print) { print "'$str' =~ \n/^$re/\n"; exit if $Print; } if (my @a = $str =~ /^$re/) { print "[@a]\n"; } __END__

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://297616]
Front-paged by broquaint
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having an uproarious good time at the Monastery: (6)
As of 2024-04-16 05:58 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found