http://qs321.pair.com?node_id=28764
Category: Cryptography
Author/Contact Info: Randal L. Schwartz, merlyn
Description: Usage: pat ABCABC finds any word made of a three-character sequence repeated twice (such as "murmur" in my dictionary). pat XYYX finds words that are four-character palindromes, such as "deed". Distinct uppercase letters must match distinct characters, so in the result X and Y must be different. So pat ABCDEFGHAB finds ten-letter words whose last two characters repeat the first two, while all the other letters are distinct, such as "thousandth" or "Englishmen" (matching ignores case).

To require literal characters, use lowercase, as in pat fXXd, which requires an f, two identical letters (neither of them f or d), and a d, such as "food" or "feed".

For grins, the program also dumps the regex that each pattern has been transformed into, so you can write your own, or see how much work you're avoiding by using this program.
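
To give a feel for what those dumped regexes look like, here is a small sketch with three of them written out by hand, traced from the translation rules in the program. The word list is inlined as an assumption so the snippet runs without a dictionary file. The names %regex_for and @words are illustrative only and do not appear in the program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hand-traced translations of three patterns from the description.
# Each new uppercase letter becomes "(.)" guarded by a negative lookahead
# against the literals and every earlier capture; a repeated uppercase
# letter becomes a backreference; lowercase letters are copied as-is.
my %regex_for = (
    'ABCABC' => qr/^(.)(?!\1)(.)(?!\1|\2)(.)\1\2\3$/i,
    'XYYX'   => qr/^(.)(?!\1)(.)\2\1$/i,
    'fXXd'   => qr/^f(?![fd])(.)\1d$/i,
);

# Illustrative word list (an assumption; the real program reads a
# system dictionary instead).
my @words = qw(murmur deed noon aaaa food feed ford banana);

for my $pattern (sort keys %regex_for) {
    my @hits = grep { $_ =~ $regex_for{$pattern} } @words;
    print "$pattern => @hits\n";
}
```

With this word list, "aaaa" is rejected for XYYX because X and Y must differ, and "ford" is rejected for fXXd because the two middle letters are not identical.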

  "Fun for the entire family!" -- Rolling Stone magazine (but not about this program)

#!/usr/bin/perl -w
use strict;

# Word list location varies by system; /usr/share/dict/words is common on Linux.
open WORDS, "/usr/dict/words" or die "no more words: $!";

for (@ARGV) {
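  # Template (uppercase) letters may never match a literal (lowercase)
  # letter, so seed @avoid with a character class of all the literals.
  my @avoid = do {
    my @lits = /[a-z]/g;
    @lits ? "[" . join("", @lits) . "]" : ()
  };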
  my %template;
  my $regex = "^";
  for (split //) {
    if (/[a-z]/) {
      $regex .= "$_";
    } elsif (/[A-Z]/) {
      if (exists $template{$_}) {
        $regex .= $template{$_};
      } else {
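        # First occurrence: capture one character, after a negative
        # lookahead that rejects every literal and every earlier capture.
        my $id = 1 + keys %template;
        if (@avoid) {
          $regex .= "(?!" . join("|", @avoid) . ")";
        }
        $regex .= "(.)";
        # Remember the backreference, and forbid it for later letters.
        push @avoid, $template{$_} = "\\$id";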
      }
    } else {
      warn "ignoring $_";
    }
  }
  $regex .= "\$";
  print "$_ => $regex\n";
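  seek WORDS, 0, 0;  # rewind the word list for each pattern
  # Read into a lexical so the loop does not clobber $_, which is
  # still aliased to the current element of @ARGV.
  while (my $word = <WORDS>) {
    next unless $word =~ /$regex/i;
    print $word;
  }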
}