Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Re: N Permutations of M Items

by Zaxo (Archbishop)
on Dec 20, 2002 at 08:27 UTC ( [id://221385]=note: print w/replies, xml ) Need Help??


in reply to N Permutations of M Items

It's better to avoid checking all permutations if you can. Permutations grow in number very rapidly with string length.

You don't need to know which permutation gets you a dictionary word, only that one exists. You can do that by rearranging the characters of words in alphabetical order and matching with wildcards. We deal with 'location' in the form 'acilnoot'.

First, let's get the preliminaries out of the way...

#!/usr/bin/perl use warnings; use strict;
and make an anagram dictionary. That's not necessary, but it will speed things up if the wordlist is not too long for memory.
my ($wl, %words) = '/usr/dict/words'; { open my $wf, '<', $wl or die $!; chomp, push @{$words{join( '', sort split //, lc)}}, $_ while <$wf>; close $wf }

It would be nice to give several words on the command line, so we set that up in a hash which will hold results:

my %matches; @matches{map {lc} @ARGV} = ();

Now, for instance, 'coal' appears in the %words dictionary under the key 'aclo'. We can construct the regex /a.*c.*l.*o/. If 'acilnoot' matches, then 'location' contains 'coal'.

for (keys %matches) { my $ordered = join '', sort split //; $matches{$_} = [ grep { length($ordered) < length($_) ? 0 : do { my $re = '^(.*?)' . join('(.*)', map {quotemeta} split '') . '(.*)$'; $ordered =~ /$re/; }; } keys %words]; }
I've fancied up the regex to capture unused characters (just in case that may be handy ;-), though there is no use made of that here. I also arranged for length check to save time on hopeless cases.

Finally, print results just to have something to show for all that.

print $_, ' contains ', "@{[map {@$_} @words{@{$matches{$_}}}]}", $/, $/ for keys %matches; __END__

Running this on my machine as 'time perl anaword.pl location brevity Zoroaster' prints:

location contains loon octal loot tool Al an at in Io it no on to Acto +n canto Alton onto alnico lion loin Olin lint toil action into Toni I +lona Latin colon location Tonio Cain Inca clan coal cant coat lotion +lain nail tail anti Tina iota Lac can act cat ail Ali Ian Lao ant Nat + tan oat Clint con coo cot loan tonic NATO Clio coil loci coin icon L +in nil oil lit ion tin Ito cool clot colt lot coon not ton too antic zoroaster contains Zoroaster as at re et or so to Oz ooze Eros ores Ro +se sore tore zero toes Azores errs rest zest rotor roost roots arose +Erato roaster rears Serra rater Terra aster rates stare tears razors +orators sorer store zeros Starr rare rear Ares ears eras sear rate te +ar Ezra raze east eats sate seat teas rooster resort roster sorter or +ator are ear era Rae sea ate eat tea oar Sao oat art rat tar sat root + soot zoos rots sort ore roe toe Zoe err set arrest rarest raster rat +ers Sartre starer ersatz roar oars Rosa soar oats Taos arts rats star + roars razor Astor roast too zoo Orr rot rooter brevity contains try be by re et it very rivet bier Brie bite Bert ver +b byte bevy brevity bet bye rib bit biter Tiber tribe ire tie vie rye + yet ivy rite tier tire vier Viet real 0m8.766s user 0m7.450s sys 0m0.070s
I didn't show how to limit attention to four letter words, but that is easy given the structure of this treatment. There are other nodes here which use this technique. Search 'anagram' to find them.

Update: Modified regex to anchor at the ends and not miss unused characters, updated times to the new values.

After Compline,
Zaxo

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://221385]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others rifling through the Monastery: (4)
As of 2024-04-19 03:20 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found