http://qs321.pair.com?node_id=229980

powerhouse has asked for the wisdom of the Perl Monks concerning the following question:

Ok Perl Monks, I really could use your wisdom here :o)

I've created a site in perl and MySQL. when my visitors register, I have it check a database table of obscene usernames and return "bad" if it's in there.

Here is how I am searching it:

$sth = $dbh->prepare (qq{ SELECT * FROM `forbidden_usernames` WHER +E `f_user` LIKE ? }); $sth->execute("$str"); $row = $sth->fetchrow_hashref(); $sth->finish(); if ($str =~ /$row->{f_user}/i) { return("bad"); }


Ok, now my problem is that if I even try to use my name as a username it is returning it as bad.

I did have a wildcard in the $sth->execute("$str"); like this: $sth->execute("%$str%");

I thought that might have been the problem, but it was not. when I took out the wild cards on both sides of it, It is still saying any username is bad.

However, if I just take out the LIKE and make it a = then it won't get something like asshole when I only have ass in the database and visaversa. Is there a "contains" like Perl's =~ for MySQL, that you know of in your infinite wisdom?

Or should I go ask on a MySQL board, if I can find one?

thx,
Richard

Replies are listed 'Best First'.
Re: Perl and MySQL - performing a search...
by Trimbach (Curate) on Jan 26, 2003 at 16:40 UTC
    The reason why nonmatching rows are reading as "bad" is because of your regex... if you use your name as $str and then do a SQL lookup then the query will yield zero rows and $row will be set to undef (assuming "Richard" isn't a bad name in your database. :-D) Therefore this $str =~ /$row->{f_user} becomes logically equivalent to "Richard" =~ // And since there's an empty string somewhere in "Richard" the result will always be true... it'll always be true whenever your SQL lookup yields zero rows.

    That said, it's a little weird to do a LIKE SQL on your database and then turn right around and do (nearly) the same thing in a Perl regex. If you wanna compare names to your "bad names" table remember that LIKE will only work if the "name" is the same length or shorter than an entry in your database, so "%ass%" will match "asshole" but "%asshole%" will never match "ass". Depending on what you're trying to do you might be better off if you just do the whole "tell me if it's a bad name" routine in Perl and forget trying to do it in SQL. :-D

    Gary Blackburn
    Trained Killer

      Yeah, a lookup hash with least-common-denominator bad words might work ok. By "LCD" I mean your hash would only contain "ass" but not "asshole" or "dumbass" or whatever. That way you can do something like:
      sub i_am_bad { my $name = shift; my %bad_words = ( .. ); # Some long list. foreach my $bad_word(keys %bad_words) { return "Bad!" if $name =~ m/$bad_word/i; } return undef; } # Later... print "Bad username!" if i_am_bad($name);
      This has the advantage of getting multiple permutations of different curse words, but keep in mind it'll also disallow innocent words, too. For example, there's probably no permutation of "fuck" that would be ok, but "ass" will disallow "passover" and "class" and "sassy". That might be a small price to pay, though, if you're running, say, a church website. :-)

      Alternatively you could just write down every possible variation of every curse word you could imagine and just do a eq instead of a m// but unless your imagination is very good, or you've spent a goodly period of time in some military service, you're not likely to cover all the objectionable possibilities. ;-)

      Gary Blackburn
      Trained Killer

        Excellent, thank you!
        Before I got this, here is what I came up with.
        Please tell me what you think.

        The reason I would like it to stay in the database, is because I am making an Admin interface for our company, where we can add new names delete them, and do a ton of other things not related to the forbidden names, such as manage the registered users, change our hosting packages, domain prices and all that.

        So here is what I came up with...

        my $is_bad = 0; $sth = $dbh->prepare (qq{ SELECT * FROM `forbidden_usernames` }); $sth->execute(); while($row = $sth->fetchrow_hashref()) { if ($row->{f_user} =~ m/$str/i || $str =~ m/$row->{f_user}/i) +{ $is_bad++; last; } } $sth->finish(); if ($is_bad != 0) { return("bad"); }


        That checks it both ways.

        I have not yet tested it. But what do you think, would it work about like yours does?

        Thank you!!

        Richard.
      So, your saying I should POSSIBLY have something like a hash, with all my obscene names in it, then have perl just run through the hash with a foreach, and do a "contains" search?

      Something like that?

      Even If I did that, would not it have the same problem? if asshole =~ ass would never be true, because asshole is being searched for in the whole word ass, therefore it would never find it.

      So, what do you suggest I do, to eliminate this problem?

      I'm open to suggestions. Thanks for any you have :o)

      thx,
      Richard
        You don't need to have a hash or to use a regular expression. All you have to check is whether the fetch found anything or not.
        Is there a module that will do this for me, if I pass a username to it?

        Anyone Know of one?

        thx,
        Richard

Re: Perl and MySQL - performing a search...
by CountZero (Bishop) on Jan 26, 2003 at 22:38 UTC

    Provided you have a good collection of bad usernames, this should do the trick:

    $sth = $dbh->prepare ("SELECT count(*) FROM forbidden_usernames WHERE +f_user LIKE ?"); $sth->execute($str); ($bad) = $sth->fetchrow_array; print "Bad username\n" if $bad;

    Warning: untested code! It's too late to cobble a mySQL database together to test it and my "Programming the Perl DBI" is upstairs.

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Perl and MySQL - performing a search...
by trs80 (Priest) on Jan 27, 2003 at 04:04 UTC
    There is regular expression support in MySQL that allows for queries like this:
    sub detect_profanity { my $str = shift; my $sth = $dbh->prepare (qq{ SELECT * FROM `forbidden_usernames` W +HERE `f_user` REGEXP ? }); $sth->execute($str); my $row = $sth->fetchrow_hashref(); $sth->finish(); return "bad" if $row; }
    You can also use several special characters/constructs to limit the matches. See the MySQL docs for more info.
      EXCELLENT, Thank you, one and all!

      thx,
      Richard
Re: Perl and MySQL - performing a search...
by Mr. Muskrat (Canon) on Jan 27, 2003 at 14:37 UTC

    I'm surprised that no one has mentioned Regexp::Common.
    Here is the synopsis from the Regexp::Common::profanity docs...

    use Regexp::Common qw /profanity/; while (<>) { /$RE{profanity}/ and print "Contains profanity\n"; }
Re: Perl and MySQL - performing a search...
by Bilbo (Pilgrim) on Jan 27, 2003 at 09:33 UTC

    You don't seem to be interested in which words in your list are matched, just in whether any words are matched, so why not use the rows method which tells you how many rows were matched by your query.

    $sth = $dbh->prepare (qq{ SELECT * FROM `forbidden_usernames` WHERE `f +_user` LIKE ? }); $sth->execute("$str"); my $rows = $sth->rows; $sth->finish(); if ($rows) { return("bad"); }

    Or you might just want to return the nuumber of rows matched.

    Note: Untested code

    Update: Actually I suppose that the suggestion of CountZero above is a better way of achieving the same result. My suggestion would only be more appropriate if you might be interested in which words were matched.