Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much

Re^2: Comparing Lines within a Word List

by dominick_t (Acolyte)
on Apr 27, 2016 at 16:19 UTC ( #1161650=note: print w/replies, xml ) Need Help??

in reply to Re: Comparing Lines within a Word List
in thread Comparing Lines within a Word List

Just to clarify, I'm not a computer science student and my deadline is not a class deadline. I'm an artist with a background in mathematics, and I also occasionally publish crosswords and other word puzzles. My sense was that Perl and regular expressions could be enormously helpful in the latter, so I've been teaching myself with the O'Reilly book, but haven't learned enough yet to solve on my own a particular matching question that I could really use an answer to in the next few days, as it's to do with a printer deadline for a puzzle that I'm writing for my best friend's wedding.

Thanks for these thoughts on the spell-checking approach. The code looks enormously interesting but obviously a lot to work through, and maybe more than I need at the moment? The solutions above appear to be able to tackle my specific problem, but perhaps there is something I'm not seeing.

  • Comment on Re^2: Comparing Lines within a Word List

Replies are listed 'Best First'.
Re^3: Comparing Lines within a Word List
by poj (Abbot) on Apr 27, 2016 at 16:40 UTC

    Try this with your dictionary file. If performance is a problem break into separate files according to word length.

    #!perl use strict; open IN,'dict.txt' or die "$!"; my %dict=(); for (<IN>){ chomp; $dict{uc $_}=1; }; close IN; for my $word (sort keys %dict){ next unless $word =~ /R/; my @f = split //,$word; # loop over each letter # changing R to S # to create new word in $w for my $i (0..$#f){ if ($f[$i] eq 'R'){ my $w = $word; substr($w,$i,1) = "S"; # check if generated word exists in dict if (exists $dict{$w}){ print "$word $w\n"; } } } }
Re^3: Comparing Lines within a Word List
by Anonymous Monk on Apr 27, 2016 at 16:40 UTC

    That's fine of course, we just get a lot of "do my homework for me" type questions here. The code works (at least on my machine) and as far as I can tell from your descriptions in this thread it generates the output you're looking for, so feel free to just use it. The learning curve of this particular code may be a little steep, and it contains a lot of benchmarking/statistics code which of course you don't need, but at least the theory behind a Trie is still something you can look into if you like. Doing a linear search on a long word list is inefficient when run often (which doesn't seem to be the case in your situation) so the size/speed tradeoff of a trie can be very useful.

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1161650]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (5)
As of 2022-05-22 19:08 GMT
Find Nodes?
    Voting Booth?
    Do you prefer to work remotely?

    Results (81 votes). Check out past polls.