PerlMonks  

Re: japhy's regex article for the TPJ

by graff (Chancellor)
on May 19, 2004 at 02:12 UTC ( [id://354497] )


in reply to japhy's regex article for the TPJ

Maybe this would be too esoteric or somewhat "ahead of its time", but a little more exposure for the Unicode tricks that are now possible with Perl regexes could yield some useful surprises for the average reader.

For example, making up expressions and character classes with things like \p{Punctuation} or \p{CurrencySymbol} (or their short forms \p{P}, \p{Sc}) -- and having these work regardless of what language the text is in -- has a certain attraction to it. (Or maybe I just don't realize what a nerd I am to think so.)
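As a rough illustration of the point above, here is a minimal sketch that pulls currency symbols and punctuation out of a string with \p{Sc} and \p{P}. The sample text and variable names are made up for this example; the non-ASCII characters are written as \x{...} escapes so the script does not depend on the source file's encoding.

```perl
use strict;
use warnings;

# Hypothetical sample text: U+20AC is the euro sign, U+00A5 the yen sign.
my $text = "Preis: 10\x{20AC}, prix: \x{A5}500! Price: \$7?";

# \p{Sc} matches any character in the Currency_Symbol category.
my @currency = $text =~ /(\p{Sc})/g;

# \p{P} matches any punctuation character.
my @punct = $text =~ /(\p{P})/g;

binmode STDOUT, ':encoding(UTF-8)';
print "currency symbols: ", scalar(@currency), " (@currency)\n";
print "punctuation marks: ", scalar(@punct), " (@punct)\n";
```

Note that the dollar sign counts as a currency symbol here, not as punctuation, so it shows up only in the first list.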

Re: Re: japhy's regex article for the TPJ
by japhy (Canon) on May 19, 2004 at 04:03 UTC
    Actually, I'm glad you brought this up. In 5.8.4, there's improved ability (thanks to me) to create your own Unicode classes, and even build cascading ones. The documentation is in perlunicode, and here's an example (you must have Perl 5.8.4 for this to work):
    package MyUnicode;

    sub InLetters {
      return << 'END';
    0041 005a
    0061 007a
    END
    }

    sub InVowels {
      return << 'END';
    0041
    0045
    0049
    004f
    0055
    0061
    0065
    0069
    006f
    0075
    END
    }

    sub InConsonants {
      return << 'END';
    +MyUnicode::InLetters
    -MyUnicode::InVowels
    END
    }

    package main;

    my $string = "Chicken Stromboli";
    while ($string =~ /(\p{MyUnicode::InConsonants}+)/g) {
      print "consonant cluster: '$1'\n";
    }

    __END__
    consonant cluster: 'Ch'
    consonant cluster: 'ck'
    consonant cluster: 'n'
    consonant cluster: 'Str'
    consonant cluster: 'mb'
    consonant cluster: 'l'
    I could write about that, and explain the new '&' class operand, which lets you take the intersection of two or more Unicode classes.
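A minimal sketch of how the '&' operand might be used, assuming the user-defined property format described in perlunicode. InUpper and InUpperVowels are names invented for this illustration, and the property bodies are built with ordinary string functions instead of here-docs so that the tab separating a range is explicit.

```perl
use strict;
use warnings;

package MyUnicode;

# One hex code point per line: A E I O U a e i o u
sub InVowels {
    return join '', map { sprintf "%04x\n", ord } qw(A E I O U a e i o u);
}

# A single range, A through Z (two hex numbers separated by a tab)
sub InUpper {
    return "0041\t005a\n";
}

# Start from the vowels, then keep only those that are also in InUpper
sub InUpperVowels {
    return "+MyUnicode::InVowels\n"
         . "&MyUnicode::InUpper\n";
}

package main;

my $string = "An Owl In Utah";
my @hits = $string =~ /(\p{MyUnicode::InUpperVowels})/g;
print "upper-case vowels: @hits\n";
```

The '&' line is applied to whatever set the earlier lines have built up, so the order of the lines matters.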

    I like this idea. Maybe I can do this and one other topic -- I don't want the article to be too widely scoped.

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;
