PerlMonks  

spam problem

by batmanor (Novice)
on Aug 22, 2008 at 22:09 UTC ( [id://706329] : perlquestion )

batmanor has asked for the wisdom of the Perl Monks concerning the following question:

PerlMonks, last March you helped me with this code. It worked then, but it no longer does. I have a query form with three fields (Comments, Address and name) and am trying to control at least some of the spam by weeding out comments that contain HTML markup or links. I assume my Perl syntax error is simple, but I cannot find where it is.
    # If http or <a href= tags are present, abort the message
    if ([ $FORM{'Comments'}, $FORM{'addr'}, $FORM{'name'} ] ~~ /http|html|a href/i) {
        &html_message;
    }
Thank you, David

Replies are listed 'Best First'.
Re: spam problem
by FunkyMonk (Chancellor) on Aug 22, 2008 at 22:29 UTC
    This code doesn't give a syntax error with perl 5.10, but it will with any earlier version, because the smart match operator (~~) was only added in 5.10. Perhaps it would be better if you copied and pasted the error message.

    That said, I have a couple of guesses:

    • You are really running 5.10, aren't you?
    • You've capitalised "Comment", but not "addr" and "name". Is that correct?
      FunkyMonk, the capital for Comment is correct, as are addr and name. The web site is on a large University of Va server, and its version of Perl is 4.0, patch level 35, so I have to work with that. Can you suggest a sequence that would work with Perl 4.0? Much obliged.
        Wow. Perl 4.035. That's old, real old. According to perlhist 4.035 was released 1992-Jun-23. The first version of Perl I can remember using was 4.036, but it wasn't until 1998 (with 5.004.04) that I started using Perl seriously (at least that's the earliest Perl file I can find).

        You have my sympathies, having to deal with Perl 4.
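For the Perl 4 question above, here is a minimal sketch of the same check written without the smart match operator, the anonymous array, "my" or "use", none of which exist before Perl 5. It loops over the fields and applies the original pattern to each one. The sub name contains_link and the sample form values are made up for illustration, and this has not been run on 4.035 itself, so treat it as a starting point rather than a verified fix.

```perl
# Sticks to constructs that should exist in Perl 4:
# local(), foreach, statement modifiers and a plain regex match.
# Returns 1 if any argument looks like it contains a link or HTML.
sub contains_link {
    local($field);
    foreach $field (@_) {
        return 1 if $field =~ /http|html|a href/i;
    }
    0;
}

# Sample data standing in for the parsed form (hypothetical values).
%FORM = ('Comments', 'Visit <a href="http://example.com">here</a>',
         'addr',     '1 Main St',
         'name',     'David');

if (&contains_link($FORM{'Comments'}, $FORM{'addr'}, $FORM{'name'})) {
    print "rejected\n";    # the original script calls &html_message here
}
```

In the real script the print line would be replaced by the existing &html_message call.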

Re: spam problem
by brycen (Monk) on Aug 23, 2008 at 00:03 UTC

    May I humbly suggest a totally different approach? If you escape the HTML then spammers get no benefit from it. Also see the "nofollow" attribute to <a href>.

    In your case, substitute &lt; for < in the message body; then no HTML tags will work and the spammers get no benefit. Or do both: reject messages with the string "a href", but also escape the < to prevent other trickery. Or use:

    use URI::Escape;
    my $escaped = uri_escape( $unescaped_string );
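The substitution described above can be sketched with plain s/// operations, which also avoids "use" for anyone stuck on Perl 4. Note that uri_escape percent-encodes a string for use in a URL, so it is a different transformation from turning < into &lt;. The sub name escape_html is hypothetical, and the extra replacements for &, > and the double quote go slightly beyond the single < substitution suggested in the reply.

```perl
# Replace the characters that give HTML its meaning with entities.
# The ampersand must be handled first, otherwise the "&" in the
# entities produced by the later substitutions would be escaped again.
sub escape_html {
    local($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    $text;
}

print &escape_html('<a href="http://example.com">spam</a>'), "\n";
```

After escaping, the browser shows the tag as literal text instead of rendering a link.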
Re: spam problem
by Sagacity (Monk) on Aug 23, 2008 at 03:20 UTC

    Here is a filter that you can start with and modify to your liking. I use it for an email script that uses a hash, but the process is not as important as the filter itself.

    sub validate_mail {
        my $self = shift;
        $self->{hackattempt} = "false";
        my @entries = @_;
        foreach my $i (@entries) {
            # Catch comment spam/injection attempts!!!
            # Here is the filter you can modify --
            if ($i =~ /(\.\.)|[\\]+|[\<\>]+|[\{\}]+|[\(\)]+|[\|]+|[\[]+|[\]]+/gi) {
                $self->{hackattempt} = "true1";
                $self->{hackpattern} = $i;
                return $self->{hackattempt};
            }
            # Remove any hi-jack attempts!!!
            if ($i =~ /BCC/gi) {
                $self->{hackattempt} = "true2";
                $self->{hackpattern} = $i;
                return $self->{hackattempt};
            }
        }
        return $self->{hackattempt};
    }

    My scripts use it to trigger an email that gives me the IP address and a sample of the entry, so I can look at the sample and decide whether it is spam or just a mistake (that code is not shown here). I then use this information to block the IPs using .htaccess. This stops wasting resources on spammers and helps you manage the posting process. The BCC portion of this code is aimed at attempts to inject a Blind Carbon Copy list into the headers of the email.
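To show how the return value might be consumed, here is a small self-contained sketch. It needs Perl 5, like the filter itself. The package name MailFilter, the constructor and the sample inputs are assumptions for illustration, the sub is a condensed restatement of the filter above with the same two checks, and the notification step is reduced to a print because the mailing code is not part of the post.

```perl
use strict;
use warnings;

package MailFilter;    # hypothetical package name, not from the post

sub new { return bless {}, shift }

# Condensed restatement of the filter: same two checks, same return values.
sub validate_mail {
    my ($self, @entries) = @_;
    $self->{hackattempt} = "false";
    foreach my $entry (@entries) {
        # "..", backslash, or any of < > { } ( ) | [ ]
        if ($entry =~ /\.\.|[\\<>{}()|\[\]]/) {
            $self->{hackattempt} = "true1";
            $self->{hackpattern} = $entry;
            last;
        }
        # Possible Bcc header injection
        if ($entry =~ /BCC/i) {
            $self->{hackattempt} = "true2";
            $self->{hackpattern} = $entry;
            last;
        }
    }
    return $self->{hackattempt};
}

package main;

my $filter = MailFilter->new;
my $status = $filter->validate_mail('David', 'Nice site', "hello\nBcc: a\@b.example");

if ($status ne "false") {
    # The post's scripts send a notification mail here; this sketch only prints.
    print "blocked ($status): $filter->{hackpattern}\n";
}
```

Because the BCC check is a plain substring match, it will also flag harmless text that happens to contain those letters, which is presumably why the sample is reviewed by hand before an IP is blocked.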

    Good luck,