Actually, I'm glad you brought this up. Perl 5.8.4 has (thanks to me) an improved
ability to create your own Unicode classes, and even to build cascading ones. The documentation is in
perlunicode, and here's an example (you need at least Perl 5.8.4 for this to work):
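package MyUnicode;

# ASCII letters: each line is a range of hex code points (A-Z, a-z).
sub InLetters {
    return << 'END';
0041 005a
0061 007a
END
}

# ASCII vowels, upper and lower case, one code point per line.
sub InVowels {
    return << 'END';
0041
0045
0049
004f
0055
0061
0065
0069
006f
0075
END
}

# Cascading class: start with the letters, then subtract the vowels.
sub InConsonants {
    return << 'END';
+MyUnicode::InLetters
-MyUnicode::InVowels
END
}

package main;

my $string = "Chicken Stromboli";
while ($string =~ /(\p{MyUnicode::InConsonants}+)/g) {
    print "consonant cluster: '$1'\n";
}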
__END__
consonant cluster: 'Ch'
consonant cluster: 'ck'
consonant cluster: 'n'
consonant cluster: 'Str'
consonant cluster: 'mb'
consonant cluster: 'l'
I could write about that, and explain the new '&' operator, which lets you take the intersection of two or more Unicode classes.
I like this idea. Maybe I can do this and one other topic -- I don't want the article to be too widely scoped.
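To give a rough idea of what the '&' intersection mentioned above looks like, here's a minimal sketch in the same style as the consonant example. The class names InUpper and InUpperVowels are made up for illustration, and the range is written with an explicit tab so it doesn't depend on how whitespace survives copy-and-paste. As far as I know from perlunicode, the first line of a definition shouldn't use '&', since that would intersect with nothing.

```perl
use strict;
use warnings;

package MyUnicode;

# Hypothetical class: uppercase ASCII letters A-Z, as a tab-separated range.
sub InUpper {
    return "0041\t005A\n";
}

# ASCII vowels in both cases, one hex code point per line.
sub InVowels {
    return <<'END';
0041
0045
0049
004f
0055
0061
0065
0069
006f
0075
END
}

# Hypothetical class: start from the vowels, then keep only the
# characters that are also in InUpper (the '&' line is the intersection).
sub InUpperVowels {
    return <<'END';
+MyUnicode::InVowels
&MyUnicode::InUpper
END
}

package main;

my $string = "Chicken StrOmbolI And Eggs";
my @found  = $string =~ /(\p{MyUnicode::InUpperVowels})/g;
print "uppercase vowels: @found\n";
```

The uppercase consonants (C, S) and the lowercase vowels are both skipped, because a character has to be in both classes to match.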
_____________________________________________________
Jeff [japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;