perlquestion
smgfc
Yesterday, while going through the daily (read: hourly) routine of slashdot appreciation, I noticed a banner add for <a href="http://www.itasoftware.com">ItaSoftware</a> (this isnt a plug) and this add contained information of a<a href="http://www.itasoftware.com/careers/programmers.php"> challenge</a>. Upon visiting their site I realized it was really just a way to evaluate anyone applying for a job, but decided to do one the challenges for fun. I chose Add-A-Grams, where you start with a three letter word, and add a letter to create another word. The challenge was to find the longest in their 1.6 meg dictionary. After evaluating problem I can to the conclusion that I should start from the largest word and work my way down, anyway here is my sparsely documented code, that I will explain a little. I am really looking for performance tips. The last feature, writing the add-a-grams to a file, i havent tested, because it took an hour to run on my old mac, my pc is having nine heart attacks, and the new mac wont arrive for way too long, so if ther e is a problem there really sorry! <CODE>$dict="<untitled:Desktop Folder:WORD.LST";
$adda=">untitled:Desktop Folder:addagram.lst";
#######
# open the dictionary and create an array of the words
#######
open (dict, $dict) or die ("Can't open $dict: $!\n");
while(defined($line=readline(*dict))) {
chomp $line;
push @words, $line;
}
close (dict);
#######
#create a hash of array, where the key is the length of each element in the array
#######
foreach $word (@words) {
push @{ $wordpl{length($word)} }, $word;
}
#######
#extract the lengths and sort them
#######
foreach $k (keys %wordpl ) {
push @lengths, $k;
}
@lengths = sort {$b <=> $a} @lengths;
#######
#do something for every length
#######
foreach $l (@lengths) {
word();
}
#######
#do something with every word of every length
#######
sub word {
my $word;
foreach $word ( @{ $wordpl{$l} } ) {
$oword=$word; #define original word
my $p=$l-1;
word1($p, $word);
}
}
#######
#see if any word from one array of lengths down differs by only one character
#######
sub word1 {
my $p=@_[0];
my @a=split(//,@_[1]);
my $sword;
WO: foreach $sword ( @{ $wordpl{$p} } ) {
my @b=split(//,$sword);
my %seen = ();
my @aonly = ();
my $item;
@seen{@b}=();
foreach $item (@a) {
push (@aonly, $item) unless exists $seen{$item};
}
if (length(@aonly)==1) { #if the difference between the two strings is one character
if ($p==3) {
push @addw, $oword; #if the length of the string is three, store the original word as a add-a-gram
last WO; #return to word()
}
$p--;
word2($p, $sword); #If the length of the string is >3 but differs by one go to word2 for another pass
last WO; #upon exiting word2 start a new "original word"
}
}
}
#######
#see if any word from one array of lengths down differs by only one character
#######
sub word2 {
my $p=@_[0];
my @a=split(//,@_[1]);
my $sword;
WT: foreach $sword ( @{ $wordpl{$p} } ) {
@b=split(//,$sword);
%seen = ();
@aonly = ();
@seen{@b}=();
foreach $item (@a) {
push (@aonly, $item) unless exists $seen{$item};
}
if (length(@aonly)==1) { #if the difference between the two strings is one character
if ($p==3) {
push @addw, $oword; #if the length of the string is three, store the original word as a add-a-gram
last WT; #return to word()
}
$p--;
word1($p, $sword); #If the length of the string is >3 but differs by one go to word1 for another pass
last WT; #upon exiting word1 start a new "original word"
}
}
}
#######
#find the longest add-a-gram
#######
$lo="";
foreach $i (@addw) {
if ( length($lo)<length($i) ) {
$lo=$i;
}
}
#######
#print all anagrams to a file and signify the longest
######
open (add, $adda) or die ("Cant open $adda: $!\n");
foreach $i (@addw) {
if ($i ne $lo) {
print add "$i\n";
} else {
print "longest: $i\n";
}
}
close (add);
</CODE> Essentially it reads all the words, sorts them into arrays where each word has the same length, puts the arrays into a hash with the keys being the lengths of the elements in the array. Then it starts with the largest words, and takes a pass looking for a word which is one character smaller, and differs by one character. Another pass starts, doing the same thing, but to the new word. I hope you can understand it because I havent explained it well, documented it well, or named variable well, but any performance tips would be great, or even just ways to make the code a little shorter! Thanks again!