http://qs321.pair.com?node_id=604258

exodist has asked for the wisdom of the Perl Monks concerning the following question:

I am writing some web stuff in perl-CGI and need an account registration page. I want to verify that the registrant is a real person and not a script, the same way most sites do: with an image containing random text that has been skewed so an OCR program cannot read it. What is the best way to accomplish this?

P.S. I am a teacher, and one of my students is blind. He informed me that many sites now also offer an audio alternative to the image. Does anyone know about those as well?

Thank you!
  • Comment on howto: Perl CGI, image with random scewed text for account creations

Replies are listed 'Best First'.
Re: howto: Perl CGI, image with random scewed text for account creations
by GrandFather (Saint) on Mar 11, 2007 at 23:13 UTC

    You may find that Authen::Captcha solves the image part of the problem at least.


    DWIM is Perl's answer to Gödel
      I think there's a move to use GD::SecurityImage instead of Authen::Captcha, although I don't know why, except that the latter hasn't been updated in years. There's a drop-in replacement for Authen::Captcha called GD::SecurityImage::AC to make the switch easier.

       

      -justin simoni
      skazat me

Re: howto: Perl CGI, image with random scewed text for account creations
by jettero (Monsignor) on Mar 12, 2007 at 00:53 UTC
    I have no idea how to do the audio except via festival or something like that. In the past I have used a machine for generating random weak captchas that make my sites too much effort to bother with, since there's never actually anything of value for the bot to do, and I ban URLs and things as necessary.

    In general the captchas are all something the jerks can get past if they try hard, so it's best to plan on being in an arms race.

    First of all, I don't want to store anything, so I make the thing hash the CGI args together with some secret part. I'm sure they could figure the secret out if they spent the effort, but I could always change it.

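    use Digest::SHA qw(sha1_base64);

    # Hash the secret plus every form field except the captcha bookkeeping ones.
    my $msg = sha1_base64(join("^G", "secret part",
        map  { $cgi->param($_) }
        grep { not m/^(?:usec|utime|tcode)$/ } $cgi->param));
    # The code is the last four characters of the digest.
    # (List assignment avoids the "my $x = ... if ..." pitfall.)
    my ($code) = $msg =~ m/(.{4})\z/;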
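
    A minimal sketch of how the generate and verify halves of such a stateless scheme could fit together, using a plain hash of form fields instead of a CGI object. The helper names captcha_code and captcha_ok are hypothetical, and I am assuming that "^G" stands for a control-G separator and that tcode holds what the user typed; neither is confirmed by the post.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_base64);

# Hypothetical helper: derive the 4-character code from a secret
# and the form fields. Nothing is stored on the server.
sub captcha_code {
    my ($secret, %fields) = @_;
    # Sort the keys so the digest does not depend on hash ordering,
    # and skip the bookkeeping fields named in the post.
    my @values = map  { $fields{$_} }
                 grep { !/^(?:usec|utime|tcode)$/ }
                 sort keys %fields;
    my $digest = sha1_base64(join "\cG", $secret, @values);
    return substr($digest, -4);    # last four characters of the digest
}

# Hypothetical helper: recompute the code on submission and compare it
# with what the user typed (assumed to arrive in the 'tcode' field).
sub captcha_ok {
    my ($secret, %fields) = @_;
    my $typed = defined $fields{tcode} ? $fields{tcode} : '';
    return $typed eq captcha_code($secret, %fields);
}
```

    The image generator would draw captcha_code(...) for the current fields, and the form handler would call captcha_ok(...) on the submitted ones. Note that the digest is base64, so the code is case-sensitive and may contain "+" or "/".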

    I use GD and GD::Text::Align to draw things, and I use Math::Trig for various angle calculations. The code is all a bit sloppy, as it grew rather organically, but it technically functions.

    This captcha, where I rotate, resize and translate one font, is considered very weak. The random circles sometimes make it impossible for humans to read as well, so I also have a js onClick event to reload the image (thus re-randomizing its rendering).

    -Paul

Re: howto: Perl CGI, image with random scewed text for account creations
by agianni (Hermit) on Mar 12, 2007 at 00:57 UTC
    I have no experience with it, and it looks like it would take more heavy lifting to get working, but Authen::PluggableCaptcha provides both standard scrambled-text captchas and audio output. Fortunately, the module comes with a decent-looking tutorial.
Re: howto: Perl CGI, image with random scewed text for account creations
by gloryhack (Deacon) on Mar 12, 2007 at 04:36 UTC
    I spent quite a long time looking at the various CAPTCHA schemes that are out there in the wild, and what I didn't like about them was that they discriminated against the blind and particularly those who are both blind and hearing impaired. I'm neither, and none of my clients are, either, but I'm just that kind of a guy. So I developed my own, one that plugs a randomly generated non-word string into a sentence and asks the user to find it and enter it into a text input. So far, in about eight months of using it on my low-volume site, it has worked quite well and no one has complained.

    Screen readers, in general, will spell out the garbage strings, making it easy enough for the blind to find them and comply. Those who are both blind and deaf can find the garbage string via their Braille terminals... or so goes the theory, since so far no one who's both blind and deaf has contacted me. This was true even before my CAPTCHA went online, though.

    It's an easy enough thing to do. From a predefined list of sentences (which could come out of the fortune program), select a sentence at random and a random point within that sentence in which to plug a garbage string. Generate the garbage string, test that it doesn't exist in a dictionary, and plant it in that random spot. Explain to the user that he's supposed to find that non-word and type it into the text input. Use caching similar to that of Authen::Captcha to keep track of what's been recently served and to whom. Bingo bango bongo, an accessible CAPTCHA.
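
    The steps above can be sketched roughly as follows. This is only an illustration, not my production code: the sentence list and the tiny dictionary are stand-ins, and the helper names garbage_string and make_challenge are made up for the example. The caching of served challenges is left out.

```perl
use strict;
use warnings;

# Stand-in data: a real setup might draw sentences from the fortune
# program and check against a full word list.
my @SENTENCES = (
    'The quick brown fox jumps over the lazy dog',
    'A journey of a thousand miles begins with a single step',
);
my %DICTIONARY = map { $_ => 1 } qw(quick brown journey single);

# Generate a random lowercase string that is not in the dictionary.
sub garbage_string {
    my ($len) = @_;
    my @letters = ('a' .. 'z');
    while (1) {
        my $s = join '', map { $letters[ int rand @letters ] } 1 .. $len;
        return $s unless $DICTIONARY{$s};
    }
}

# Pick a sentence, plant the garbage string at a random word boundary,
# and return both the challenge text and the expected answer.
sub make_challenge {
    my @words   = split ' ', $SENTENCES[ int rand @SENTENCES ];
    my $garbage = garbage_string(6);
    my $pos     = int rand(@words + 1);    # 0 .. number of words
    splice @words, $pos, 0, $garbage;
    return (join(' ', @words), $garbage);
}
```

    The form would show the first return value along with the instructions, and the handler would compare the submitted text against the second.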

    Nuthin' to it but to do it.

      Would this not be too easy a pick to bypass? With the use of Aspell or any spellchecker, I could quite reliably find the typo automatically. If you added a word that would be a good English word but clearly didn't fit in the sentence, that would be another matter, but that would be also very hard to do randomly with certainty that it will be clear what the inserted word is.

        'Tis the nature of arms races. My grandchildren will be playing this game long after I'm buried in a box.

        I think the next step for any CAPTCHA is to add some additional noise into the equation along with some dynamic firewalling, and that's my plan for my own implementation. The policy will be simply "bounce off of the defense x times in y seconds and you're firewalled away for z seconds". That'll work for a while, then it'll be time for a radical rethink. Again.
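
        A rough sketch of that "x times in y seconds, blocked for z seconds" policy, kept in memory for illustration only. The thresholds and the names record_failure and is_blocked are assumptions for the example; a real deployment would need shared storage and an actual firewall hook.

```perl
use strict;
use warnings;

# Assumed policy values: x failures within y seconds blocks for z seconds.
my ($MAX_FAILURES, $WINDOW, $BLOCK) = (5, 60, 300);
my (%failures, %blocked_until);

# Record one failed CAPTCHA attempt for a client at time $now.
sub record_failure {
    my ($ip, $now) = @_;
    my $recent = $failures{$ip} ||= [];
    push @$recent, $now;
    # Keep only the failures that fall inside the sliding window.
    @$recent = grep { $_ > $now - $WINDOW } @$recent;
    if (@$recent >= $MAX_FAILURES) {
        $blocked_until{$ip} = $now + $BLOCK;
        @$recent = ();
    }
    return;
}

# True while the client is still inside its block period.
sub is_blocked {
    my ($ip, $now) = @_;
    return ($blocked_until{$ip} // 0) > $now;
}
```

        Passing the time in explicitly (rather than calling time inside) keeps the logic easy to test.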

      Over at Akismet they use simple math problems. Simple multiple-choice problems would also work. Keeping the data in plain text, so that all the sensorily deprived but still plain-text-capable people can handle it -- brilliant.
Re: howto: Perl CGI, image with random scewed text for account creations
by deeknow (Novice) on Mar 12, 2007 at 03:35 UTC
    There's also GD::SecurityImage, which requires that you have GD installed (obviously). We've used this in a couple of production apps for CAPTCHA purposes. Usability is a concern with this text-in-image approach, though; I would be interested in hearing how others attack an audio alternative.

      Actually, (and oddly), GD::SecurityImage also has support for an Image::Magick backend as well. You're probably better off using the GD backend. The amount of support for fiddling with the image'd text is pretty low though, and probably fairly easy to crack. The included font with the package is pretty well suited for this application, though.

      I personally have used this module for CAPTCHA work on my web app and also have had requests for an audio-version of the CAPTCHA image. I've been looking at the Authen::PluggableCaptcha module, but there's no audio support for it, yet. Sigh.

       

      -justin simoni
      skazat me

        Hi, I'm the author of GD::SecurityImage. Image::Magick was an early addition and it was a request from someone else. It does not use the full power of Image::Magick (which is far more powerful than GD, but slower) and is merely a compatibility layer. You can use several different fonts and randomly changing styles/particles/scramble to make it hard for OCRs. But it may be possible to crack. I don't know (I got zero feedback on this subject).

        I didn't like Authen::Captcha's approach with "graphic letters" and its lack of plug-ability. It has its own flat file database, while I prefer DB & sessions. I also like Tim Toady ;) I've added a sample part to show the generated images: GD::SecurityImage
Re: howto: Perl CGI, image with random scewed text for account creations
by zentara (Archbishop) on Mar 12, 2007 at 12:39 UTC
    I would use flite to generate the audio. It is fast and reasonably clear. If you wanted true randomness, you could generate and output directly to the cgi (the way graphic files are done... just slap on an audio header and print binary).

    But you may run into problems, like not being able to compile flite on your webserver, or it draining too many resources on a heavy-activity site. In that case, you could pre-record a series of random .au files (with flite or someone with a clear voice) and upload them daily. Even a hundred random image-audio pairs would probably be sufficient to defeat automated scripts.

    Whatever you do, make sure your audio is going to be in a format (bit depth and sampling frequency) that all systems will be able to play. Quite often, especially in cgi, people will go for the smallest audio file size, like 8-bit, 8 kHz audio, which will not play everywhere. Better to stick with a high quality standard that almost all systems support, like CD quality at 16 bit, 44.1 kHz.
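
    As a rough illustration of "slap on an audio header", here is a sketch that builds the standard 44-byte header for uncompressed PCM WAV data. The function name wav_header is made up for the example, and it only matters if you are emitting raw samples yourself; a synthesizer that writes complete .wav files already includes its own header.

```perl
use strict;
use warnings;

# Build a canonical 44-byte RIFF/WAVE header for uncompressed PCM.
# $data_bytes is the size of the sample data that will follow.
sub wav_header {
    my ($data_bytes, $rate, $channels, $bits) = @_;
    my $block_align = $channels * $bits / 8;    # bytes per sample frame
    return pack('a4 V a4 a4 V v v V V v v a4 V',
        'RIFF', 36 + $data_bytes, 'WAVE',
        'fmt ', 16,                  # size of the fmt chunk
        1,                           # format tag 1 = PCM
        $channels, $rate,
        $rate * $block_align,        # bytes per second
        $block_align, $bits,
        'data', $data_bytes);
}

# Usage in a CGI script, assuming $pcm holds raw 16-bit mono samples:
#   binmode STDOUT;
#   print "Content-Type: audio/x-wav\r\n\r\n";
#   print wav_header(length $pcm, 44100, 1, 16), $pcm;
```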


    I'm not really a human, but I play one on earth. Cogito ergo sum a bum
Re: howto: Perl CGI, image with random scewed text for account creations
by hacker (Priest) on Mar 13, 2007 at 03:16 UTC

    The image issue is a bit at odds with Section 508, especially for blind or deaf users of your site.

    I personally like the way Drupal solved it with a captcha alternative (which I use on two of my websites). They just ask a simple math question in a form, offering an image as an alternative (the admin can toggle this).
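
    A minimal sketch of a math-question check in Perl. This is not Drupal's implementation, just an illustration of the idea; math_question and answer_ok are hypothetical names, and the expected answer would have to be kept server-side (in a session, say) between the GET and the POST.

```perl
use strict;
use warnings;

# Returns the question text and the expected numeric answer.
sub math_question {
    my $x = 1 + int(rand(9));    # 1 .. 9
    my $y = 1 + int(rand(9));
    return ("What is $x plus $y?", $x + $y);
}

# Accept the answer if it is a plain number matching the expected value.
sub answer_ok {
    my ($typed, $expected) = @_;
    return 0 unless defined $typed && $typed =~ /^\s*(\d+)\s*$/;
    return $1 == $expected ? 1 : 0;
}
```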

    But let's also not forget about merlyn's neat little hack to brute-force OCR (better techniques have surfaced since that time).

    My personal favorite would have to be the one I saw a few years ago (image-based, though you could make it text) that asked you to pick the one item that did NOT match the other 3. You'd be shown 3 fruits and a monkey for example.

    You could also go with a multiple choice kind of captcha, like "I like to read a ______ when I relax" and your dropdown could include things like "hat", "apple", "book", and so on.

    You could try to put something in session when a GET request is made and when a form is submitted you check the session for that variable. You'd use this to filter out badly-written bots that submit POST requests directly without requesting the parent page first. This is easily defeated by bots that behave like a web browser, however.
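
    A bare-bones sketch of that GET/POST check. The %session hash stands in for whatever real session storage you use, and issue_form_token and check_form_token are names invented for this example.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

my %session;    # stand-in for real session storage

# Call when serving the form (GET): remember a token in the session
# and embed the returned value in a hidden form field.
sub issue_form_token {
    my ($session_id) = @_;
    my $token = sha1_hex(join ':', $session_id, time, rand);
    $session{$session_id}{form_token} = $token;
    return $token;
}

# Call when the form is submitted (POST): the token must exist in the
# session and match. It is deleted so that it can be used only once.
sub check_form_token {
    my ($session_id, $submitted) = @_;
    my $expected = delete $session{$session_id}{form_token};
    return (defined $expected && defined $submitted
            && $expected eq $submitted) ? 1 : 0;
}
```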

    Lots of ways to go about it. I'd stick with a text-based one to start, and make it complex enough not to be easily "guessed".

Re: howto: Perl CGI, image with random scewed text for account creations
by davidnicol (Acolyte) on Mar 12, 2007 at 23:39 UTC
    The keyword to search for is "captcha". Also, festival's text2wave can be used for an audio captcha.
Re: howto: Perl CGI, image with random scewed text for account creations
by pileofrogs (Priest) on Mar 13, 2007 at 18:47 UTC

    Are you sure you want to go the captcha route? Why not ask for an email address and email them the access code? Emailing the access code eliminates problems for blind users because we can assume they have a way to read/hear their email. Plus, you then have a way to track users and contact them. If you want your service to be anonymous, then that might be a bad thing.

    On the other hand, bots could generate a million fake accounts by using a million self-generated email addresses, but is that really what you're worried about? Is your registration for a service that bots would want? Does it include giving users an email address or something?

    Hope that helps!
    --Pileofrogs

Re: howto: Perl CGI, image with random scewed text for account creations
by Phaysis (Pilgrim) on Mar 16, 2007 at 07:06 UTC
    There's no doubting it; captchas are ultimately hackable and as such are not much of a defense against the determined. The spammer's workaround scenario (which is in practice as we speak) goes like so:
    1. An unscrupulous spammer finds a board or guestbook (the victim) that has been protected by a captcha.
    2. He trains a spambot to the victim's form.
    3. Somewhere on another site (the bait, also run by the spammer), some user (an unknowing agent) manually clicks for a form to post something to that site.
    4. The bait site calls the spambot which grabs a form from the victim site, fills it with spam, pulls the URL of the captcha image served with the victim form, and feeds that captcha URL in the bait's form.
    5. The unknowing agent fills the bait form, decodes the captcha (which appears to come from the bait site), and submits.
    6. The bait site passes the captcha code to the spambot and then goes about its business.
    7. The spambot then adds the final captcha piece to the puzzle and submits the spam-filled form to the victim site.
    You folks are correct to say it is an arms race. There are several tacks one could take to forego any nefariousness, but rest assured that if the stakes are high enough the forgoing will be foregone.

    Never take your eye off the smart bully.

    (Ph) Phaysis (Shawn)
    If idle hands are the tools of the devil, are idol tools the hands of god?