PerlMonks
Re: Constructive criticism of a dictionary / text comparison script
by Not_a_Number (Prior) on Aug 30, 2003 at 08:49 UTC ( [id://287887]=note )
Hi allolex.

There is a problem that nobody has mentioned yet. It concerns this line:

    next if $element =~ /[^A-Za-zĄ-’]/;

This is doing a lot more than you want it to, I think. Basically, it means "ignore any $element containing a character that is not in the set defined between the square brackets". It is therefore discarding, for example, any 'word' with attached punctuation. In a sentence such as:

    "Shut up!" he said.

you are throwing away three quarters of your 'words': only 'he' survives. And you are also, of course, ignoring hyphenated words.

It also means that the line:

    $element =~ s/[\s\,\!\?\.\-\_\;\)\(\"\']//g;

never actually does anything, with or without the surplus backslashes, because any $element containing one of those characters has already been skipped by the time it is reached...

hth

dave
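To make the effect concrete, here is a minimal sketch rather than a drop-in fix. It assumes the text is split on whitespace, and it uses a simplified ASCII-only character class in place of the accented range from the original line. The "strip first" ordering and the variable names @kept_by_filter and @kept_after_strip are my own illustration, not code from the script under discussion.

```perl
use strict;
use warnings;

# Assumption: elements come from splitting the text on whitespace.
my $sentence = q{"Shut up!" he said.};
my @elements = split ' ', $sentence;

# Filter as in the line discussed above (ASCII letters only here):
# anything containing a non-letter is skipped outright.
my @kept_by_filter = grep { !/[^A-Za-z]/ } @elements;

# One possible alternative (an assumption): strip the punctuation
# first, then apply the filter to what is left.
my @kept_after_strip;
for my $element (@elements) {
    (my $word = $element) =~ s/[\s,!?.\-_;)("']//g;
    next if $word eq '' or $word =~ /[^A-Za-z]/;
    push @kept_after_strip, $word;
}

print "filter only: @kept_by_filter\n";     # filter only: he
print "strip first: @kept_after_strip\n";   # strip first: Shut up he said
```

Note that stripping '-' this way glues hyphenated words together ('well-known' becomes 'wellknown'), so you may prefer to split on the hyphen or allow it in the character class, depending on what you want to count as a word.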