Re: Image Character Recognition

You'll notice that if the developers on those sites have done it right, there are various forms of noise in those images (lines in a grid, pseudo-random placement of pixels, etc.). The noise is there specifically to foil OCR software and stop people from doing precisely what you're trying to do.

The problem with these methods is that they pretty much stop any blind user from using these web sites, but that's another issue.

----
I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
-- Schemer

: () { :|:& };:

Note: All code is untested, unless otherwise stated

Re: Re: Image Character Recognition by cees (Curate) on Dec 05, 2003 at 00:19 UTC
"The problem with these methods is that they pretty much stop any blind user from using these web sites, but that's another issue."
A good way to handle that issue is to also offer a sound bite that reads out the word. This is also hard for automated scripts to bypass, and if a blind person is using a computer you can pretty much guarantee that they have sound. I don't know of any websites that actually implement both methods, though. I guess the issue with implementation is that the graphic image is easy to randomly generate using software, but the sound bite is much more challenging to generate! Anyway, this is getting off topic, so I will stop here...
Re: Re: Re: Image Character Recognition by duff (Parson) on Dec 05, 2003 at 00:30 UTC
"I guess the issue with implementation is that the graphic image is easy to randomly generate using software, but the sound bite is much more challenging to generate!"
Not really. Just create a series of files, each with its own audio for a single letter, then string them all together and send the concatenated audio. We have an application that uses a similar technique to generate real-time navigation information for ships.
PerlJam
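The "one clip per letter, then concatenate" idea above can be sketched roughly as follows. This is only an illustration in Python using the standard `wave` module, not the ship-navigation application mentioned in the post. The names `make_clip`, `concat_clips`, `spell`, and `LETTERS` are hypothetical, and the per-letter "recordings" are placeholder tones standing in for real spoken-letter files. It also assumes every clip is mono, 16-bit, and shares one sample rate.

```python
import io
import math
import struct
import wave

RATE = 8000  # samples per second (assumed; all clips must share it)


def make_clip(freq_hz, seconds=0.25):
    """Return WAV bytes for a placeholder tone standing in for one spoken letter."""
    n = int(RATE * seconds)
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * freq_hz * i / RATE)))
        for i in range(n)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(RATE)
        w.writeframes(frames)
    return buf.getvalue()


def concat_clips(clips):
    """Concatenate WAV clips (as bytes) that share one format into a single WAV."""
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(RATE)
        for clip in clips:
            with wave.open(io.BytesIO(clip), "rb") as r:
                fmt = (r.getnchannels(), r.getsampwidth(), r.getframerate())
                if fmt != (1, 2, RATE):
                    raise ValueError("clip format does not match the output format")
                w.writeframes(r.readframes(r.getnframes()))
    return out.getvalue()


# Hypothetical letter library: in practice these would be recorded audio files.
LETTERS = {ch: make_clip(300 + 40 * i)
           for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz")}


def spell(word):
    """Build one WAV that plays the clip for each letter of `word` in order."""
    return concat_clips(LETTERS[ch] for ch in word.lower())
```

Calling `spell("cat")` returns the bytes of a single WAV file that could be sent to the browser. Characters without a clip raise a `KeyError` in this sketch.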
Re: Re: Re: Re: Image Character Recognition by cees (Curate) on Dec 05, 2003 at 00:52 UTC
"Just create a series of files, each with its own audio for a single letter, then string them all together and send the concatenated audio."
But wouldn't it then be easy to write a program that knows the signature of each of those letters? The challenge is randomizing the sound of each letter so that it is easy for a human to understand, but hard for a program to recognize. This is especially tricky since there is already some really good speech recognition software out there. Perhaps you could overlay the sound of the letters/words with some music, or something else that will not affect the ability of the person to hear the word.
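As a rough illustration of the overlay idea, here is a minimal Python sketch that mixes random noise into 16-bit PCM samples, so the same letter never produces byte-identical audio twice. The function name `overlay_noise` and its parameters are hypothetical, and uniform noise is just a stand-in for the music suggested above. Whether this actually hinders speech recognition software is not something the sketch demonstrates.

```python
import random


def overlay_noise(samples, noise_amplitude=2000, seed=None):
    """Mix uniform random noise into 16-bit PCM samples.

    `samples` is a sequence of ints in the int16 range. Each output sample
    is the input plus a random offset in [-noise_amplitude, noise_amplitude],
    clamped so it still fits in a signed 16-bit value.
    """
    rng = random.Random(seed)  # pass a seed only for repeatable tests
    mixed = []
    for s in samples:
        noisy = s + rng.randint(-noise_amplitude, noise_amplitude)
        mixed.append(max(-32768, min(32767, noisy)))
    return mixed
```

The noise amplitude would need tuning by ear: too low and each letter keeps a recognizable signature, too high and a human listener can no longer make out the word.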

