PerlMonks  

grep : partial and/or exact match

by symŽ (Acolyte)
on Nov 30, 2001 at 07:40 UTC ( [id://128538]=perlquestion )

symŽ has asked for the wisdom of the Perl Monks concerning the following question:

Suspect code:
@restricted_words = qw ( funk shucks crud );
if (grep /$word/, @restricted_words) {
    print "wash your mouth out!\n";
} else {
    print "Such a nice boy\n";
}

Ok that works fine for exact matches, but what if the word is funkhead or crudshorts?
I know I can probably accomplish this without grep, but can I do it with grep?

Replies are listed 'Best First'.
Re: grep : partial and/or exact match
by chipmunk (Parson) on Nov 30, 2001 at 07:57 UTC
    Yup, you could do it with grep, by reversing the match. Match the restricted word against the input word:
    my @restricted_words = qw ( funk shucks crud );
    if (grep $word =~ /$_/, @restricted_words) {
        print "wash your mouth out!\n";
    } else {
        print "Such a nice boy\n";
    }
    That code will be very slow, however, because it's recompiling the regex each time. Although you could avoid that problem with qr//, it would be simpler to skip grep and just make one big regex:
    my @restricted_words = qw ( funk shucks crud );
    my $restricted_re = '(?:' . join('|', @restricted_words) . ')';
    $restricted_re = qr/$restricted_re/;
    if ($word =~ $restricted_re) {
        print "wash your mouth out!\n";
    } else {
        print "Such a nice boy\n";
    }
    This approach will give you both speed and flexibility. Enjoy!
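
    A minimal sketch of the single-regex approach wrapped in a helper. The name build_restricted_re is hypothetical, and the quotemeta call is an addition that the reply above does not use; it only matters if a list entry contains regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical helper: build one alternation regex from a word list.
# quotemeta is an assumption added here so that metacharacters in a
# list entry are matched literally instead of being interpreted.
sub build_restricted_re {
    my @words = @_;
    my $alternation = join '|', map { quotemeta($_) } @words;
    return qr/(?:$alternation)/;    # compiled once, reused for every test
}

my $restricted_re = build_restricted_re(qw(funk shucks crud));

for my $word (qw(funk crudshorts hello)) {
    if ($word =~ $restricted_re) {
        print "$word: wash your mouth out!\n";
    }
    else {
        print "$word: Such a nice boy\n";
    }
}
```

    Note that an empty word list would produce an empty pattern, which matches every string, so a caller would probably want to guard against that case.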
Re: grep : partial and/or exact match
by dvergin (Monsignor) on Nov 30, 2001 at 07:47 UTC
    Since you have already written the code that illustrates the issue you wanted to test, you are within reach of an answer to your own question. A little fiddling and:
    use strict;
    my @restricted_words = qw (funk shucks crud);
    my $word = 'crudshorts';
    if (grep $word =~ /$_/, @restricted_words) {
        print "wash your mouth out!\n";
    } else {
        print "Such a nice boy\n";
    }
    and run it to see that you are not as nice as you hoped.

    Update: The above example includes a correction I made after chipmunk graciously pointed out a goof-up; a correction which turns out (quite by coincidence -- I swear it) to be just what he suggests at the top of his response to this item.

    BTW: The ultimate robustness of this approach is perhaps open to question.

    I have long since stopped trying to memorize the manuals and docs. While working on a project I just keep a test.pl script open in my editor and a command line window at the ready so I can test things on the fly. For questions like this it is much faster than trying to sort out the answer by reading the docs.

    Actually, expressing my questions by writing a few lines of script to illustrate the point of ignorance is often a very helpful exercise in itself. And I remember the answer better because I took a moment to write out the sample code and then got instant response from the command line (often impossible in the context of a larger project).

    I also keep test.pl and a command line window at the ready while reading through Perl Monks. :-)

    All in all, a highly recommended working style. Try it. The quick validating feedback in the midst of a larger project is very satisfying. You may find it soon becomes second nature.

    As Saint Larry says, "Perl is an empirical science." That means you verify an assertion, frame a question, or test a theory by actually running some code.

    ------------------------------------------------------------
    "Perl is a mess and that's good because the problem space is also a mess." - Larry Wall

Re: grep : partial and/or exact match
by blakem (Monsignor) on Nov 30, 2001 at 07:58 UTC
    You might also want to take a quick look at Regexp::Common...
    specifically the /$RE{profanity}/ functionality.

    -Blake

Re: grep : partial and/or exact match
by dws (Chancellor) on Nov 30, 2001 at 08:08 UTC
    Ok that works fine for exact matches, but what if the word is funkhead or crudshorts?

    What do you want to do with "funkhead" or "crudshorts"? Think carefully. If you want to reject them based on your list of restricted words, you're going to get a lot of false positives based on legitimate words like "crude". (The same happens when you substitute "real" dirty words.)

    If you don't want a false positive, you'll find that \b is your friend. See perlman:perlre.
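
    A small sketch of the difference \b makes, assuming the same word list as the question. The variable names $partial_re and $whole_re are made up for this illustration.

```perl
use strict;
use warnings;

my @restricted_words = qw(funk shucks crud);
my $alternation = join '|', map { quotemeta($_) } @restricted_words;

my $partial_re = qr/(?:$alternation)/;        # matches anywhere inside a word
my $whole_re   = qr/\b(?:$alternation)\b/;    # matches only as a whole word

for my $word (qw(crud crude crudshorts)) {
    my $partial = $word =~ $partial_re ? 'hit' : 'miss';
    my $whole   = $word =~ $whole_re   ? 'hit' : 'miss';
    print "$word: partial=$partial whole=$whole\n";
}
```

    With \b in place, "crud" is still caught, while "crude" and "crudshorts" are not. Whether "crudshorts" should slip through is exactly the policy question raised above.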

      How is this partial mismatching done in this scenario?

      file1 contents:
      he/is/man/reg30
      don't/you/reg31
      what/goes/on/reg32

      file2 contents:
      /is/man/reg30
      on/reg32
      try/to/do/reg65

      output should be:
      don't/you/reg31
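
      One possible reading of this scenario, sketched with nested grep: print each file1 line that does not contain any file2 line as a substring. That interpretation is an assumption, as are the array names @file1, @file2 and @unmatched; the sample data is held in memory here, where a real script would read the two files.

```perl
use strict;
use warnings;

# Sample data copied from the question above.
my @file1 = ("he/is/man/reg30", "don't/you/reg31", "what/goes/on/reg32");
my @file2 = ("/is/man/reg30", "on/reg32", "try/to/do/reg65");

# Keep a file1 line only if no file2 line occurs inside it.
# index() does a literal substring search, so no regex escaping is needed.
my @unmatched = grep {
    my $line = $_;
    !grep { index($line, $_) >= 0 } @file2;
} @file1;

print "$_\n" for @unmatched;
```

      For the sample data this leaves only don't/you/reg31. If the file2 entries are meant to match only at the end of a line, the index test would need to become a suffix comparison instead.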
