Re^2: One regex construct to handle multiple string types

Re^3: One regex construct to handle multiple string types by Krambambuli (Curate) on Nov 29, 2008 at 10:41 UTC
It's not about \w, but about backtracking. \w* initially 'eats' 2L, but is then forced to ... well ... put the 'L' back on the table so that \S can have it. Hmm... maybe 'eating' is not the best image for what goes on in a backtracking regex? :) Krambambuli
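To watch the give-back happen, here is a minimal sketch. The pattern and input are assumptions: the regex from the original question is not quoted in this reply, so `/^(\w*)(\S)/` against the string `2L` stands in for it. The possessive `*+` quantifier used for contrast needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Assumed stand-in for the pattern under discussion (not the poster's regex).
my $input = '2L';

# Greedy \w* first takes "2L". \S then has nothing left to match, so the
# engine backtracks: \w* gives up one character and \S takes it.
my ($greedy, $must_match) = $input =~ /^(\w*)(\S)/;
printf "\\w* kept '%s', \\S took '%s'\n", $greedy, $must_match;

# A possessive quantifier never gives characters back, so the
# otherwise identical pattern cannot match at all.
my $possessive_ok = $input =~ /^\w*+\S/ ? 1 : 0;
print "possessive version matched: $possessive_ok\n";
```

Run as written, the greedy match should report that \w* kept 2 and \S took L, while the possessive version fails because \w*+ never hands the L back.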
Re^4: One regex construct to handle multiple string types by pobocks (Chaplain) on Nov 29, 2008 at 10:46 UTC
AHHH! Thank you. Enlightenment is mine. "Must-match" versus "may-match" - separating the men from the boys of regex tug-o-war. `for(split(" ","tsuJ rehtonA lreP rekcaH")){print reverse . " "}print "\b.\n";`
Re^3: One regex construct to handle multiple string types by ww (Archbishop) on Nov 29, 2008 at 12:34 UTC
The precise definition depends on the language. Mastering Regular Expressions, 2nd Ed., by Jeffrey E. F. Friedl (O'Reilly), characterizes `\w` in its "Common Metacharacters..." chapter this way: Part-of-word character. Often the same as `[a-zA-Z0-9_]`, although some tools omit the underscore, while others include all the extra alphanumerics in the locale. If Unicode is supported, `\w` usually refers to all alphanumerics (notable exception: Sun's Java regex package, whose `\w` is exactly `[a-zA-Z0-9_]`). Regular Expressions Pocket Reference (also from O'Reilly) defines `\w` as `\p{IsWord}` for Perl and as `[A-Za-z0-9_]` for Java. Regrettably, the definition of `\p{IsWord}` -- `[_\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}]` -- is, for me, almost impenetrable, but Friedl's characterization may be as good as you'll get without deep study of perlretut and friends.
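For anyone who would rather poke at `\w` than read property tables, here is a small sketch. The sample characters are my own picks, and the `/a` modifier is an assumption about your Perl version: it only exists from Perl 5.14 on. The code prints code points rather than the characters themselves to avoid "Wide character" warnings.

```perl
use strict;
use warnings;

# Illustrative samples; "\x{3b1}" is GREEK SMALL LETTER ALPHA.
my @samples = ('a', '7', '_', '-', ' ', "\x{3b1}");

my (%default_w, %ascii_w);
for my $ch (@samples) {
    my $cp = ord $ch;
    # Code points above 255 are always matched under Unicode rules.
    $default_w{$cp} = $ch =~ /\A\w\z/  ? 1 : 0;
    # /a (Perl 5.14+) restricts \w to [A-Za-z0-9_].
    $ascii_w{$cp}   = $ch =~ /\A\w\z/a ? 1 : 0;
    printf "U+%04X  \\w: %d  \\w with /a: %d\n",
        $cp, $default_w{$cp}, $ascii_w{$cp};
}
```

The letter, digit and underscore should match both ways, the hyphen and space neither way, and the Greek alpha only without `/a`, which is the "all alphanumerics if Unicode is supported" case Friedl describes.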
Re^3: One regex construct to handle multiple string types by JadeNB (Chaplain) on Nov 30, 2008 at 18:38 UTC
Of course, Re^3: One regex construct to handle multiple string types has already answered why you get the indicated match, but …. While it doesn't seem to be in the documentation at perldoc.perl.org, the Perl 5.10 documentation for perlre has a section called "Character Classes and other Special Escapes" that says: `\w` Match a "word" character (alphanumeric plus "_"). UPDATE: Ah, found it, at perlre.

