D'Oh!! has asked for the wisdom of the Perl Monks concerning the following question:
Can someone help me with a regex that will test to see if a scalar contains ALL of the vowels (aeiou) in ANY order?
So $word = "alighthouse" would return true since it contains all of the vowels.
I know this is a simple question but I'm a beginner taking a Perl class and would appreciate any help. Basically, it should return true for any scalar that has ALL of the vowels in it.
Thanks
Re: regex testing for ALL of the vowels in a scalar
by davido (Cardinal) on Feb 11, 2004 at 05:08 UTC
Here's my take on it. Someone's going to do a pure regex solution... but not me. Still, this seems like a pretty simple to follow solution, so it might be useful to you.
use strict;
use warnings;
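my $string = "Understand, this string really contains all of the vowels.";
my $count = 0;   # start at 0 so the comparison below never sees undef
# Six letters are tested here: a e i o u plus y.
$string =~ /$_/i and $count++ for qw/ a e i o u y /;
print "They're all there!\n" if $count == 6;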
The meat and potatoes of this solution is the $string =~ /$_/i and... line. Start with the "for qw/ a e i o u y /" part. That creates a little loop that hands each vowel, one by one, to $_. Next, the string is pattern matched against the contents of $_ (each vowel, one by one). The logical short-circuit operator ("and") ensures that $count is only incremented when a match occurs. So each time through the loop you're testing one vowel to see if it's found in the string, and incrementing $count if the vowel exists. Then you loop back, and check for the existence of the next vowel in the list.
The hardest part is the logical short-circuit "and". Just think of it like this... for 'and' to be true, both the left hand side and the right hand side must be true. If the left hand side isn't true (i.e., if there's no match), there's no point in even evaluating the right hand side, since the 'and' operator already knows that it's going to fail (return false). That means that $count++ only gets evaluated if the match succeeds, and thus, $count only gets incremented if the match is successful.
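To make the short circuit concrete, here is a minimal sketch that wraps the same loop in a subroutine. The name count_vowels_present is made up for illustration, and this version checks only a e i o u (the vowels listed in the original question), not y.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the post above): counts how many of the
# five vowels occur at least once in $string, ignoring case.
sub count_vowels_present {
    my ($string) = @_;
    my $count = 0;
    # "and" short-circuits: $count++ runs only when the match succeeds.
    $string =~ /$_/i and $count++ for qw/ a e i o u /;
    return $count;
}

print count_vowels_present("alighthouse"), "\n";   # every vowel is present
print count_vowels_present("rhythm"), "\n";        # no vowel is present
```

Comparing the return value with 5 then answers the original question.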
Hope this helps...
|
Indeed, this is the approach I had in mind. My idea was to look for the
least frequently occurring vowel first, up to the most frequently occurring
vowel last. That way candidates could be excluded as quickly as possible.
The following one-liner, when run on /usr/share/dict/words
perl -nle '++$f{lc $_} for split //; END{ print "$_\t$f{$_}" for sort keys %f}'
... allows me to determine that the order is u o a i e. My test, relying on the benefit of lazy evaluation, would be:
# assuming target word is in $_
my $all_there = /u/ && /o/ && /a/ && /i/ && /e/;   # && rather than "and": "=" binds tighter than "and"
Disclaimer about case sensitivity: you might want to $_ = lc $_
beforehand, or add the i modifier to each RE.
I find this code very elegant: clear and straightforward, and above all, the
person coming along behind you will understand immediately what is going on without
requiring major regexp-fu.
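As a rough sketch of how the chained test might be packaged for reuse: the subroutine name has_all_vowels is hypothetical, and lowercasing the input first is just one of the two case-handling options mentioned above. It uses && because "=" and "return" both bind more tightly than "and", so a chain written with "and" would only see the first match.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when $word contains a, e, i, o and u.
sub has_all_vowels {
    my ($word) = @_;
    my $w = lc $word;   # fold case once instead of adding /i to every match
    # Rarest vowel first; evaluation stops at the first one that is missing.
    return ( $w =~ /u/ && $w =~ /o/ && $w =~ /a/ && $w =~ /i/ && $w =~ /e/ ) ? 1 : 0;
}

print has_all_vowels("alighthouse") ? "yes\n" : "no\n";
print has_all_vowels("lighthouse")  ? "yes\n" : "no\n";
```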
Re: regex testing for ALL of the vowels in a scalar
by Roger (Parson) on Feb 11, 2004 at 04:05 UTC
Good candidate for an OBFU. :-)
In the following example, $found is the number of unique vowels found in your string. Just test if it is equal to 5.
use strict;
use warnings;
my $word = "alighthousa";
my $found;
if (($found = ( keys %{{map { $_ => 1 } ($word =~ m/([aeiou])/g) }} )) == 5) {
print "Found all 5 vowels\n";
} else {
print "Found only $found vowels\n";
}
OK, maybe I should explain what is happening here:
($word =~ m/([aeiou])/g)
# returns a list of all vowels found in $word
map { $_ => 1 } ($word =~ m/([aeiou])/g)
# builds a list for initializing a hash
{ map { $_ => 1 } ($word =~ m/([aeiou])/g) }
# a reference to an anonymous hash initialized with the list
keys %{ { map { $_ => 1 } ($word =~ m/([aeiou])/g) } }
# return keys of the dereferenced anonymous hash
# the keys in a hash are unique, this effectively
# eliminates duplicates, and gives a list of unique
# vowels found in the string
$found = ( keys ... )
# this assigns the number of unique vowels found into $found
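The same steps can also be spelled out with named intermediate variables, which may be easier to follow than the nested one-liner. This is only a sketch: the subroutine name unique_vowel_count is invented here, and unlike the code above it folds case (via /i and lc) so that "A" and "a" count as the same vowel.

```perl
use strict;
use warnings;

# Hypothetical helper: number of distinct vowels (a e i o u) in $word.
sub unique_vowel_count {
    my ($word) = @_;
    my @vowels = $word =~ m/([aeiou])/gi;       # every vowel, in the order found
    my %seen   = map { lc($_) => 1 } @vowels;   # hash keys collapse duplicates
    return scalar keys %seen;                   # keys in scalar context gives the count
}

print unique_vowel_count("alighthousa"), "\n";
print unique_vowel_count("alighthouse"), "\n";
```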
|
$_ = 'A Lighthouse';
if (5 == keys %{{map {lc $_ => undef} m/([aeiou])/ig}}) {
print "Found at least one of each vowel.\n";
}
else {
print "Not all vowels were found.\n";
}
Re: regex testing for ALL of the vowels in a scalar
by etcshadow (Priest) on Feb 11, 2004 at 05:06 UTC
This is an odd case to use a regex for, actually (it kind of demonstrates a lack of understanding of what a regex is good for).
That being said, though, this:
/(?=.*a)(?=.*e)(?=.*i)(?=.*o)(?=.*u)/i
should do you pretty easily. Granted, it makes use of positive look-ahead which isn't strictly speaking "regular". The basic description of the regex is: there exists a place in the string (the beginning of the string, for example), after which exists an "a" AND after which exists an "e", etc.
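For anyone who wants to try it, here is a small sketch that runs the lookahead pattern against a few sample words. The word list and the variable name $all_vowels are made up for the example.

```perl
use strict;
use warnings;

# The lookahead pattern from above, compiled once for reuse.
my $all_vowels = qr/(?=.*a)(?=.*e)(?=.*i)(?=.*o)(?=.*u)/i;

# Sample words chosen for illustration only.
for my $word (qw/ alighthouse lighthouse Sequoia rhythm /) {
    printf "%-12s %s\n", $word,
        ( $word =~ $all_vowels ? "has all five vowels" : "is missing at least one" );
}
```

Note that "." does not match a newline by default, so multi-line input would need the /s modifier as well.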
To do this with a true regular expression (and hence without lookahead assertions), you'd essentially build a list of all 5-factorial (120) permutations of aeiou and join them into one big alternation, like so (written out kind of long-hand so that the idea should be transparent; @permutations is assumed to already hold strings such as "aeiou", "aeiuo", and so on):
my @regex_pieces = ();
foreach my $permutation (@permutations) {
my @vowels = split //, $permutation;
push @regex_pieces, join(".*", @vowels);
}
my $regex = qr/@{[join("|", @regex_pieces)]}/i;
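The snippet above leaves @permutations undefined. One possible way to fill that in, as a self-contained sketch, is a small recursive generator; the permutations subroutine below is a hypothetical helper written for this example and returns array references rather than strings.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every ordering of the given items,
# each one as an array reference.
sub permutations {
    my @items = @_;
    return ([]) unless @items;          # one empty ordering for an empty list
    my @result;
    for my $i (0 .. $#items) {
        my @rest = @items;
        my ($picked) = splice @rest, $i, 1;   # take item $i out of the copy
        push @result, [ $picked, @$_ ] for permutations(@rest);
    }
    return @result;
}

# 5! = 120 alternatives such as "a.*e.*i.*o.*u", "a.*e.*i.*u.*o", ...
my @regex_pieces = map { join ".*", @$_ } permutations(qw/ a e i o u /);
my $alternation  = join "|", @regex_pieces;
my $regex        = qr/$alternation/i;

print scalar(@regex_pieces), " alternatives\n";
print "alighthouse matches\n" if "alighthouse" =~ $regex;
```

The resulting pattern is long, which is part of the point: it shows what the lookahead version is saving you from writing.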
------------
:Wq
Not an editor command: Wq
|
Without having benchmarked it, I have a feeling that your lookahead regex will fail faster if it's anchored:
/^(?=.*a)(?=.*e)(?=.*i)(?=.*o)(?=.*u)/i
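If you want to check that hunch rather than guess, a sketch along these lines with the core Benchmark module would do it. The sample string is invented for the test (it deliberately contains no "u" so that both patterns have to fail), and no result is claimed here; timings will vary by perl version and input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $unanchored = qr/(?=.*a)(?=.*e)(?=.*i)(?=.*o)(?=.*u)/i;
my $anchored   = qr/^(?=.*a)(?=.*e)(?=.*i)(?=.*o)(?=.*u)/i;

# Made-up sample input: a long string with no "u", so neither pattern matches.
my $no_u = "banana ice cone " x 50;

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese(-1, {
    unanchored => sub { my $hit = $no_u =~ $unanchored },
    anchored   => sub { my $hit = $no_u =~ $anchored },
});
```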
|
Yes, the "look for a string containing this and that" idiom ...
/(?=.*?this)(?=.*?that)/
Very nice.
Re: regex testing for ALL of the vowels in a scalar
by Abigail-II (Bishop) on Feb 11, 2004 at 10:22 UTC
/(?{local %_; @_{qw {a e i o u}} = ();})
(?:(.)(?{delete $_{$1}}))*
(?(?{keys %_})(?!)|)/x;
Abigail
Re: regex testing for ALL of the vowels in a scalar
by eyepopslikeamosquito (Archbishop) on Feb 11, 2004 at 08:40 UTC
If you like obfus, try a.pl:
#!perl -p
!(y&aA&&&&y&eE&&&&y&iI&&&&y&oO&&&&y&uU&&)&&y&&&cd
and b.pl:
#!perl -p
$_ x=!!eval"1@{[<&&y&{aA,eE,iI,oO,uU}&&>]}"
When run with, for example:
perl a.pl /usr/share/dict/words
they both seem to work ok.
BTW, can anyone recognize what
inspired these solutions?
/-\