Re: With Unicode, \d is wrong if you mean [0-9]

by hardburn (Abbot)
on Oct 26, 2004 at 15:46 UTC

in reply to With Unicode, \d is wrong if you mean [0-9]
in thread Regex help

No, \d will match digit characters in many languages (as TimToady mentioned). I think it's more accurate to say that it's wrong to mean [0-9], as letting people enter digits in whatever language they want is usually the right thing.
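
To make the difference concrete, here is a minimal sketch comparing the two patterns on an ASCII string and on a string of Arabic-Indic digits. The helper names are hypothetical, chosen only for this illustration, and the example assumes the default Unicode matching rules that apply to strings containing code points above 255.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.
sub matches_backslash_d { return $_[0] =~ /^\d+\z/    ? 1 : 0 }
sub matches_ascii_range { return $_[0] =~ /^[0-9]+\z/ ? 1 : 0 }

my $ascii  = "2004";
# ARABIC-INDIC DIGITS TWO, ZERO, ZERO, FOUR (U+0662 U+0660 U+0660 U+0664)
my $arabic = "\x{0662}\x{0660}\x{0660}\x{0664}";

# \d+ accepts both strings; [0-9]+ accepts only the ASCII one.
printf "ascii:  \\d+ = %d, [0-9]+ = %d\n",
    matches_backslash_d($ascii), matches_ascii_range($ascii);
printf "arabic: \\d+ = %d, [0-9]+ = %d\n",
    matches_backslash_d($arabic), matches_ascii_range($arabic);
```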

"There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

Replies are listed 'Best First'.
Re^2: With Unicode, \d is wrong if you mean [0-9]
by hv (Parson) on Oct 27, 2004 at 10:40 UTC

    Hmm, if you want to add one to it, it probably wants to consist of [0-9]+ rather than \d+.
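
    The arithmetic concern can be sketched as follows. The sub name add_one is hypothetical, and the "numeric" warning is silenced only so the sketch runs quietly; the point is that a string can pass \d+ and still be treated as non-numeric when Perl converts it to a number.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: numeric addition on a string.
sub add_one {
    my ($s) = @_;
    no warnings 'numeric';   # suppress the "isn't numeric" warning for the demo
    return $s + 1;
}

my $ascii  = "42";
my $arabic = "\x{0664}\x{0662}";   # ARABIC-INDIC DIGITS FOUR, TWO

# Both strings validate against \d+ ...
print "both pass \\d+\n" if $ascii =~ /^\d+\z/ && $arabic =~ /^\d+\z/;

# ... but only the ASCII string is read as a number by string-to-number conversion.
print add_one($ascii),  "\n";   # 43
print add_one($arabic), "\n";   # 1, because the string numifies to 0
```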


      That could be construed as a bug in Perl's internal grok_number() routine.

        Patches welcome. :)

        If anything, I think I'd rather see things like grok_number() become hooks so you can intercept them, whether to augment, replace, or just add:

        warn "Hey, scare's over now, you can go back to 2 digits" if $num =~ /^20\d\d\z/;

        Of course we probably don't reach grok_number() unless the lexer has already spotted a digit, so we'd need hot-pluggable grammars as well.

