in reply to Constructive criticism of a dictionary / text comparison script
Hi allolex. There is a problem that nobody has yet mentioned. It concerns this line:
next if $element =~ /[^A-Za-zĄ-’]/;
This is doing a lot more than you want it to, I think. Basically, it means "ignore any $element containing a character not in the set defined between square brackets". It is therefore throwing out, for example, any 'word' with attached punctuation. For example, in a sentence such as:
"Shut up!" he said.
you are throwing away three quarters of your 'words'! And you are also, of course, ignoring hyphenated words.
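Here is a minimal sketch of that effect. It assumes the text is split on whitespace (the splitting code is not shown here) and uses an ASCII-only stand-in for your character class, so treat it as an illustration rather than your actual script:

```perl
use strict;
use warnings;

my $sentence = '"Shut up!" he said.';

# Assumption: elements come from a plain whitespace split.
my @elements = split ' ', $sentence;

# Same test as the 'next if' line, with an ASCII-only class (assumption):
# keep an element only if it contains no character outside A-Z and a-z.
my @kept = grep { !/[^A-Za-z]/ } @elements;

print scalar(@kept), " of ", scalar(@elements), " kept: @kept\n";
# prints: 1 of 4 kept: he
```

Only 'he' survives, because the other three elements each carry a quote, an exclamation mark or a full stop.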
It also means that the line:
$element =~ s/[\s\,\!\?\.\-\_\;\)\(\"\']//g;
never actually does anything, with or without surplus backslashes...
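One possible fix, sketched under the same assumptions as before (whitespace split, ASCII-only class), is to do the stripping before the test, so that the substitution has something to work on. This is only a guess at what you intended, not a drop-in replacement:

```perl
use strict;
use warnings;

my @kept;
for my $element (split ' ', '"Shut up!" he said.') {
    # Strip punctuation first; only the hyphen needs a backslash here.
    $element =~ s/[\s,!?.\-_;)("']//g;

    next if $element eq '';            # nothing left after stripping
    next if $element =~ /[^A-Za-z]/;   # ASCII-only stand-in (assumption)

    push @kept, $element;
}

print "@kept\n";
# prints: Shut up he said
```

Note that deleting the hyphen glues hyphenated words together ('well-known' becomes 'wellknown'), so you may prefer to leave '-' out of the substitution and allow it in the test instead.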
hth
dave