Re: Constructive criticism of a dictionary / text comparison script
by Not_a_Number (Prior) on Aug 30, 2003 at 08:49 UTC
Hi allolex. There is a problem that nobody has mentioned yet. It concerns this line:
    next if $element =~ /[^A-Za-zĄ-’]/;
This is doing a lot more than you want it to, I think. Basically, it means "ignore any $element containing a character not in the set defined between the square brackets". It is therefore skipping, for example, any 'word' with attached punctuation. In a sentence such as:
"Shut up!" he said.
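you are throwing away three quarters of your 'words'! Only 'he' survives, because the other three each carry a quote, an exclamation mark or a full stop. And you are also, of course, ignoring hyphenated words.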
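To see the effect for yourself, here is a minimal sketch. It assumes the text is split on whitespace, and it uses an ASCII-only character class, so the accented range from your original line is left out. The variable names are mine, not from your script.

```perl
use strict;
use warnings;

# Assumption: elements come from splitting the text on whitespace.
my $sentence = '"Shut up!" he said.';
my @elements = split ' ', $sentence;

# Same test as the line under discussion, restricted to ASCII letters:
# keep an element only if it contains no character outside A-Za-z.
my @kept = grep { !/[^A-Za-z]/ } @elements;

printf "%d of %d kept: %s\n", scalar @kept, scalar @elements, "@kept";
# prints: 1 of 4 kept: he
```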
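It also means that the line:
    $element =~ s/[\s\,\!\?\.\-\_\;\)\(\"\']//g;
never actually does anything, with or without surplus backslashes: any $element containing one of those characters has already been skipped by the 'next' above, so there is nothing left for the substitution to remove.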
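One possible way round it, offered only as a sketch and not as the definitive fix, is to strip the punctuation first and filter afterwards. The version below trims non-letters from both ends of each element, so inner hyphens survive, and then rejects whatever still contains something other than a letter or a hyphen. It again assumes whitespace splitting and ASCII letters only. The sample sentence and the @words array are made up for illustration.

```perl
use strict;
use warnings;

my @words;
for my $element (split ' ', '"Shut up!" he said. A well-known example.') {
    # Trim punctuation from both ends only, so inner hyphens are preserved.
    $element =~ s/^[^A-Za-z]+//;
    $element =~ s/[^A-Za-z]+$//;

    # Skip empty leftovers and anything still holding a non-letter, non-hyphen.
    next if $element eq '' || $element =~ /[^A-Za-z-]/;

    push @words, $element;
}

print "@words\n";
# prints: Shut up he said A well-known example
```

You would need to widen the character classes again if you want to keep accented letters.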