http://qs321.pair.com?node_id=1059756

corfuitl has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I want to find the best (maximum) fuzzy match between a sentence in one file and a sentence in another file. I already have the following code, but something goes wrong and it does not work. In addition, how can I get the maximum fuzzy match?
#!/usr/bin/perl
use String::Approx qw(amatch);
use Text::Fuzzy;

my $f1='bmc_pe.it.txt';
open (FILE1, "<:encoding(utf8)", $f1) or die "can't open file '$f1' $!";
my @mt = <FILE1>;

my $f2='bmc_mt.it.txt';
open (FILE2, "<:encoding(utf8)", $f2) or die "can't open file '$f2' $!";
my @tm = <FILE2>;

my $max=0;
my $lm;
my $i, $j;
for $i (0 .. $#mt) {
    for $k (0 .. $#tm) {
        print if amatch($mt[$i], $mt[$k] ['i', '25%']);
    }

Replies are listed 'Best First'.
Re: find the max fuzzy matching - perl
by kcott (Archbishop) on Oct 26, 2013 at 05:48 UTC

    G'day corfuitl,

    Welcome to the monastery.

    As ww points out (above), if you don't tell us what the problem is, we're not in much of a position to advise how to fix whatever that problem might be. Read the guidelines in "How do I post a question effectively?" to find out how to get better answers. When you understand that, "How do I change/delete my post?" explains how to add the additional information to your original post.

    Having said that, there are some basic problems with your code which you should address immediately. This exercise may fix whatever your problem is. Even if it doesn't, it'll certainly provide better information for us to help you.

    • Add use strict; and use warnings; near the start of your script. See strict and warnings. I strongly recommend you do this with all your scripts.
    • If you don't understand the output from strict or warnings, add use diagnostics; for more detailed messages. I'd recommend removing (or, at least, commenting out) this line in your production code. See diagnostics.
    • Consider adding use autodie; to your scripts. It saves having to hand-craft all the "... or die "Some specific message: $!";" pieces of code; it also means that you don't need to check that you haven't accidentally omitted such code. See autodie.
    • Take a look at the documentation for open. Lexical filehandles are generally a better choice than package variables.
    • Use meaningful variable names. @mt and @tm mean nothing to me; their meaning may well elude you when you revisit this code in six months or so for upgrade or maintenance work. Each is converted to the other by a simple swapping of their two characters: this makes your code highly error prone. The same comments apply for $lm and any similarly meaningless names you may have elsewhere.
    • You've coded "my $i, $j;". This is wrong three times over:
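      1. It doesn't declare what you think it does: without parentheses, my applies only to $i, so $j is left undeclared. Declaring both takes my ($i, $j);. See my.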
      2. It's also wrong because, while it looks like you're attempting to declare $j, that's not a variable you use in your script: you either meant $k here, or $j further down your code.
      3. Assuming you're attempting to declare these variables for use in the for loops, that's not the way to do it.

        The reasons why are somewhat subtle and are explained in perlsyn: Foreach Loops: the typical gotcha occurs because values assigned to those variables within the loop are not visible outside the loop; the value of the variable will be the same before and after the loop regardless of how it might have been modified within the loop.

        Don't worry if that's confusing or seems rather heavy going. Just declare your loop variables when you code your loop, like so:

        for my $i (...) {
            # $i available (and localised to) here
        }

        and for nested loops

        for my $i (...) {
            # $i available (and localised to) here
            for my $j (...) {
                # The same $i available here
                # $j available (and localised to) here
                for my $k (...) {
                    # The same $i and $j available here
                    # $k available (and localised to) here
                }
            }
        }
    • Finally, you're missing a closing brace ('}') at the end of your posted script.
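
    Pulling the points above together, here is a minimal sketch rather than a finished solution. The names @edited_sentences and @machine_sentences are guesses based on your file names; read_sentences and edit_distance are hypothetical helpers; and the hand-written Levenshtein distance is only a stand-in for String::Approx or Text::Fuzzy, so that the example runs with core modules alone. In-memory strings replace your two files: pass a file name to read_sentences instead.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use autodie;                 # open and close die with a clear message on failure
    use List::Util qw(min);

    # Hypothetical helper: read one sentence per line through a lexical filehandle.
    # $source is a file name or, as in the demo below, a reference to a string.
    sub read_sentences {
        my ($source) = @_;
        open my $fh, '<:encoding(UTF-8)', $source;
        chomp(my @sentences = <$fh>);
        close $fh;
        return @sentences;
    }

    # Plain Levenshtein edit distance (assumption: a stand-in for a fuzzy-match module).
    sub edit_distance {
        my ($from, $to) = @_;
        my @previous = (0 .. length($to));
        for my $row (1 .. length($from)) {
            my @current = ($row);
            for my $col (1 .. length($to)) {
                my $cost = substr($from, $row - 1, 1) eq substr($to, $col - 1, 1) ? 0 : 1;
                push @current, min(
                    $previous[$col] + 1,            # deletion
                    $current[$col - 1] + 1,         # insertion
                    $previous[$col - 1] + $cost,    # substitution
                );
            }
            @previous = @current;
        }
        return $previous[-1];
    }

    # Stand-in data so the sketch runs without the original files.
    my $edited_text  = "il gatto nero\nla casa rossa\n";
    my $machine_text = "la casa rosa\nil gato nero\n";

    my @edited_sentences  = read_sentences(\$edited_text);
    my @machine_sentences = read_sentences(\$machine_text);

    # For each edited sentence, keep the machine sentence with the smallest distance.
    my @best_match;
    for my $edited_index (0 .. $#edited_sentences) {
        my ($best_index, $best_distance);
        for my $machine_index (0 .. $#machine_sentences) {
            my $distance = edit_distance($edited_sentences[$edited_index],
                                         $machine_sentences[$machine_index]);
            if (!defined $best_distance or $distance < $best_distance) {
                ($best_index, $best_distance) = ($machine_index, $distance);
            }
        }
        push @best_match, $best_index;
        print "$edited_sentences[$edited_index] => $machine_sentences[$best_index] ($best_distance)\n";
    }

    # The foreach gotcha: a loop variable declared outside the loop is restored afterwards.
    my $index = 'unchanged';
    for $index (1 .. 3) { }
    print "$index\n";            # prints "unchanged"
    ```

    Here the closest match is the one with the smallest edit distance; if you prefer a similarity score where bigger is better, you could derive one from the distance and the sentence lengths. Once the structure is sound, swapping edit_distance for a call into String::Approx or Text::Fuzzy should be a local change.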

    -- Ken

Re: find the max fuzzy matching - perl
by ww (Archbishop) on Oct 25, 2013 at 23:28 UTC

    And just what are the DETAILS! of "something goes wrong and it does not work"?

    IOW, what actually happens? Nothing? Error messages? Warnings? Output other than what you expect?

    Please tell us. Make it easy for us to help you.