PerlMonks  

Re^3: One regex construct to handle multiple string types

by ww (Archbishop)
on Nov 29, 2008 at 12:34 UTC


in reply to Re^2: One regex construct to handle multiple string types
in thread One regex construct to handle multiple string types

The precise definition of \w depends on the language.

Mastering Regular Expressions, 2nd Ed., by Jeffrey E. F. Friedl (published by O'Reilly), characterizes \w in its "Common Metacharacters..." chapter this way:

Part-of-word character   Often the same as [a-zA-Z0-9_], although some tools omit the underscore, while others include all the extra alphanumerics characters in the locale. If Unicode is supported, \w usually refers to all alphanumerics (notable exception: Sun's Java regex package whose \w is exactly [a-zA-Z0-9_]).
Regular Expressions Pocket Reference (also from O'Reilly) defines \w as:
  • \p{IsWord} for Perl
  • and as [A-Za-z0-9_] for Java.

Regrettably, the definition of \p{IsWord} -- [_\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}] -- is, for me, almost impenetrable, but Friedl's characterization may be as good as you'll get without deep study of perlretut and friends.
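
To see the difference between the "ASCII" and "all alphanumerics" readings of \w in practice, here is a minimal sketch. It assumes Perl 5.14 or later, since it relies on the /u and /a regex modifiers (which postdate both books cited above); word_chars and the sample string are hypothetical names made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every character of $text that \w matches.
# $mode is 'u' (Unicode rules) or 'a' (ASCII-only rules); both need Perl 5.14+.
sub word_chars {
    my ($text, $mode) = @_;
    my @hits = $mode eq 'a' ? ($text =~ /(\w)/ag) : ($text =~ /(\w)/ug);
    return join '', @hits;
}

# Sample: "caf" + e-acute, a space, Greek alpha, underscore, digit, punctuation.
my $sample = "caf\x{e9} \x{3b1}_9 -!";

binmode STDOUT, ':encoding(UTF-8)';
print "Unicode rules: ", word_chars($sample, 'u'), "\n";
print "ASCII rules:   ", word_chars($sample, 'a'), "\n";
```

Under /a, only caf_9 survives, which is the [a-zA-Z0-9_] behaviour Friedl describes for Java. Under /u, the e-acute and the Greek alpha are kept as well, while the spaces, hyphen and exclamation mark are dropped in both cases.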
