http://qs321.pair.com?node_id=28747

eduardo has asked for the wisdom of the Perl Monks concerning the following question:

maverick and I were talking at lunch about trying to figure out what the longest word that could be typed solely with the left hand was. I am not a Perl master; more specifically, I am certainly not a regexp master, so I never think of solutions in terms of regular expressions. My solution was:
    sub without_regexp {
        my $largest = "";

        # create my set
        my %set = ( );
        foreach (qw( y h n u j m i k o l p )) {
            $set{$_} = 1;
        }

        # open file, and for every word that is input, if it's
        # longer than the current longest, for every character, if
        # it is in the set, jump out, otherwise, if all characters
        # are NOT in the set (the word had none of the offending
        # letters), then make it the largest...
        open (INFILE, "</usr/dict/words") || die "error $!";
        while (<INFILE>) {
            chomp;
            if (length($_) > length $largest) {
                if (! scalar grep { $set{$_} } split('', lc($_))) {
                    $largest = $_;
                }
            }
        }
        close (INFILE) || die "error $!";
        print "LARGEST FOUND: $largest\n";
    }
It gave me an answer that seems plausible enough (although I haven't bothered to check it): aftereffect. None of its letters are in the set that I gave it. maverick, however, being the super genius that he is, said: "I would just make a character class of all of the chars in the right hand, negate it with ^, and if the word matched that, it didn't have any of those characters, so it was a possible solution." His code, as I understood it, would look like this (well, his would be nicer than this, but this is a good guess):
    sub with_regexp {
        my $largest = "";

        # for every line in the file, if its length is greater
        # than the current longest, and it matches the set that
        # is all of the characters in the RIGHT hand, not'ed, then
        # it is the largest
        open (INFILE, "</usr/dict/words") || die "error $!";
        while (<INFILE>) {
            chomp;
            if (length($_) > length $largest) {
                if (/[^yhnujmikolp]/i) {
                    $largest = $_;
                }
            }
        }
        close (INFILE) || die "error $!";
        print "LARGEST FOUND: $largest\n";
    }
It, however, gave me the answer: antidisestablishmentarianism, which clearly has letters in that set (n, i, etc.)! So I thought, maybe my logic is wrong. What happens if I just negate the test, like so:
    sub with_regexp {
        my $largest = "";

        # for every line in the file, if its length is greater
        # than the current longest, and it matches the set that
        # is all of the characters in the RIGHT hand, not'ed, then
        # it is the largest
        open (INFILE, "</usr/dict/words") || die "error $!";
        while (<INFILE>) {
            chomp;
            if (length($_) > length $largest) {
                # NOTICE THE ! AT THE BEGINNING
                if (! /[^yhnujmikolp]/i) {
                    $largest = $_;
                }
            }
        }
        close (INFILE) || die "error $!";
        print "LARGEST FOUND: $largest\n";
    }
Then the answer it gives me is: Honolulu, which is a word whose letters are all within that set. So, in other words, the original test should have worked! OK, so why are the first two sections of code not equivalent, and why does the second one do exactly what I would expect it to, but not the first? Thanks!
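For anyone who wants to poke at this, here is a minimal sketch that runs the tests side by side on the three words mentioned above instead of on /usr/dict/words. The sub names are made up for illustration, and the anchored pattern in the last sub is an assumption about what a negated-class test for "left hand only" might look like, not maverick's actual code.

```perl
use strict;
use warnings;

# True if the word contains AT LEAST ONE character that is not a
# right-hand letter. This is the unanchored test from with_regexp.
sub has_non_right_char {
    my ($word) = @_;
    return $word =~ /[^yhnujmikolp]/i ? 1 : 0;
}

# True if the word contains NO right-hand letter at all. This is the
# same condition that the grep over %set checks in without_regexp.
sub left_hand_only {
    my ($word) = @_;
    return $word !~ /[yhnujmikolp]/i ? 1 : 0;
}

# Hypothetical anchored variant: every character from start to end
# must be outside the right-hand set.
sub left_hand_only_anchored {
    my ($word) = @_;
    return $word =~ /\A[^yhnujmikolp]+\z/i ? 1 : 0;
}

for my $word (qw(aftereffect antidisestablishmentarianism Honolulu)) {
    printf "%-30s any-non-right=%d none-right=%d anchored=%d\n",
        $word,
        has_non_right_char($word),
        left_hand_only($word),
        left_hand_only_anchored($word);
}
```

On these three words, only aftereffect passes the last two tests. The unanchored pattern is satisfied by antidisestablishmentarianism as soon as it reaches the "a", and it fails only for Honolulu, where every letter is in the set. Putting ! in front of it therefore selects words made up entirely of right-hand letters.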