PerlMonks  

Re: Finding a _Similar_ Substring? (Fuzzy Searching?)

by BUU (Prior)
on May 21, 2004 at 02:52 UTC


in reply to Finding a _Similar_ Substring? (Fuzzy Searching?)

Actually, reading your requirements, it sounds like a better solution might be to define a list of characters that "don't matter" when you're matching (or doing whatever you want to do). An easy way to do this would be something like:
my @ignore=(' ','-'); #whatever
for(@ignore){
    s/$_//g;
}
#match against $_
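A minimal sketch of the same stripping idea, assuming the text lives in named variables rather than in $_. The helper name strip_ignored and the sample strings are made up for illustration. It deletes every ignored character from both the search term and the candidate string, then looks for one inside the other with index:

```perl
use strict;
use warnings;

my @ignore = (' ', '-');    # characters that "don't matter"

# Build one character class from the ignore list; quotemeta keeps
# characters such as '-' literal inside the brackets.
my $class = '[' . join('', map { quotemeta($_) } @ignore) . ']';

# Hypothetical helper: return a copy of the text with ignored characters removed.
sub strip_ignored {
    my ($text) = @_;
    $text =~ s/$class//g;
    return $text;
}

# Sample data, invented for this sketch.
my $needle   = strip_ignored('555-12 34');
my $haystack = strip_ignored('call 5 5 5 - 1 2 3 4 today');

print "found\n" if index($haystack, $needle) >= 0;
```

Stripping both sides means the comparison no longer needs a regexp at all, at the cost of losing the original offsets in the unstripped string.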

Re: Re: Finding a _Similar_ Substring? (Fuzzy Searching?)
by hv (Prior) on May 21, 2004 at 03:04 UTC

    Since in this type of situation I'd normally expect the one pattern to be matched against many strings, I'd usually aim to approach this instead by modifying the regexp:

    my @ignore=(' ','-'); #whatever
    # the trailing * lets zero or more ignored characters sit between letters
    my $ignoreclass = sprintf '[%s]*', join '', map quotemeta, @ignore;
    $re = join $ignoreclass, split //, $re;

    Of course this is only so simple if the initial pattern is a simple string: a full-on regexp is rather more difficult to introduce such modifications to reliably.
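A rough sketch of the pattern-rewriting approach for the simple-string case, under a few assumptions: the search term is a plain literal (the sample 'foobar' and the candidate strings are invented), each of its characters is passed through quotemeta, and the ignore class carries a `*` so that zero or more ignored characters are accepted between letters. The pattern is compiled once with qr// and then reused against every string:

```perl
use strict;
use warnings;

my @ignore = (' ', '-');

# Class of ignorable characters, allowed zero or more times.
my $ignoreclass = sprintf '[%s]*', join '', map { quotemeta($_) } @ignore;

# Assumption: the search term is a literal string, not a regexp.
my $literal = 'foobar';

# Put the ignore class between every pair of (escaped) characters.
my $pattern = join $ignoreclass, map { quotemeta($_) } split //, $literal;
my $re      = qr/$pattern/;    # compile once, match many times

my @candidates = ('foo bar', 'f-o-o-b-a-r', 'foo_bar', 'xx foo - bar yy');
my @hits       = grep { $_ =~ $re } @candidates;

print "$_\n" for @hits;
```

Because the candidate strings are left untouched, match offsets such as @- and @+ still refer to the original text.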

    Hugo

Re: Re: Finding a _Similar_ Substring? (Fuzzy Searching?)
by TomDLux (Vicar) on May 21, 2004 at 03:14 UTC

    If your ignore set is too complicated for a character class, you can OR the entries together into a regex. I doubt that is necessary here; it is more likely to be useful for sets of words.

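    # escape each entry so it is matched literally
    my $ignoreStrings = join "|", map quotemeta, @ignore;
    # qr// does not accept /g; the flag belongs on the substitution
    my $deleteThese = qr/$ignoreStrings/;
    $string =~ s/$deleteThese//g;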

    By the way, you're using $_ to represent the various elements of @ignore, but also to denote the default object of s///. That's why I tend to avoid defaults .... better to be explicit, self-documenting, and avoid irritating errors.
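To make the point about defaults concrete, here is a small sketch that uses a named loop variable and a named target for the substitution, so nothing depends on what $_ happens to be aliased to. The multi-character ignore entries and the sample strings are invented for illustration, and sorting the entries longest-first is an extra precaution that the thread does not mention:

```perl
use strict;
use warnings;

# Made-up ignore list mixing words and single characters.
my @ignore = ('Inc.', 'Ltd', ' ', '-');

# Longest entries first, so a longer string is tried before any
# shorter one it starts with; quotemeta keeps '.' literal.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } @ignore;
my $delete_these = qr/$alternation/;

my @strings = ('Acme-Widgets Inc.', 'Acme Widgets Ltd');
my @cleaned;

for my $string (@strings) {    # named loop variable instead of $_
    # Copy first, then edit the copy, leaving @strings unchanged.
    (my $copy = $string) =~ s/$delete_these//g;
    push @cleaned, $copy;
    print "$string => $copy\n";
}
```

With an implicit loop variable, a bare s///g inside the loop would act on the current list element itself, which is the kind of surprise the explicit names avoid.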

    --
    TTTATCGGTCGTTATATAGATGTTTGCA
