
Re^2: what is the best way to seperate the digits and strings from variable ?

by dbwiz (Curate)
on May 09, 2005 at 10:58 UTC (#455142)

in reply to Re: what is the best way to seperate the digits and strings from variable ?
in thread what is the best way to seperate the digits and strings from variable ?

Please, test your code.

\d* will match 0 (ZERO) or more digits.

Thus, it will happily match an empty string at the beginning of a string like "abc123" and return a 0 length string. Try it.

The right expression to use in this case is \d+.
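To make the difference concrete, here is a minimal sketch that runs both quantifiers against the same string. The variable names $star and $plus are only illustrative and do not come from the thread.

```perl
use strict;
use warnings;

my $variable = "abc123";

# \d* may match zero digits, so it succeeds immediately at position 0
# and captures an empty string.
my ($star) = $variable =~ /(\d*)/;

# \d+ needs at least one digit, so the engine keeps scanning until "123".
my ($plus) = $variable =~ /(\d+)/;

printf "star: '%s' (length %d)\n", $star, length($star);   # star: '' (length 0)
printf "plus: '%s'\n", $plus;                               # plus: '123'
```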

Moreover, a capturing regular expression should always be used with a test:

my $num;
my $variable = "abc123";
if ( $variable =~ /(\d+)/ ) {
    $num = $1;
}
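One reason the test matters: a failed match does not reset $1, so an unguarded assignment can silently pick up the capture from an earlier successful match. The sketch below illustrates this; the strings and the names $from_first and $from_second are made up for the example.

```perl
use strict;
use warnings;

my $first  = "abc123";
my $second = "no digits here";

# Successful match: $1 is set to "123".
$first =~ /(\d+)/;
my $from_first = $1;

# Failed match with the result ignored: $1 is not reset and still holds "123".
$second =~ /(\d+)/;
my $from_second = $1;

# Guarded version: $num is assigned only when the match succeeds.
my $num;
if ( $second =~ /(\d+)/ ) {
    $num = $1;
}

print "unguarded: $from_second\n";                                # unguarded: 123
print "guarded:   ", ( defined $num ? $num : "undef" ), "\n";     # guarded:   undef
```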

Replies are listed 'Best First'.
Re^3: what is the best way to seperate the digits and strings from variable ?
by polettix (Vicar) on May 09, 2005 at 13:04 UTC
    Please read the OP and the variable names in the test code before slapping hands. You may be right from your implied point of view, but I had my rationale when posting (which implies that I tested my code, of course).

    Here, I'm considering digits as a character class, not as components of a number whose semantics differ from those of a string; this is why I named my variable $start_digits, not $num as you do. As you've surely noted, the OP never talks about numbers, always about digits (anyway, as I noted in my post, the OP was not clear about how the extracted data would be used).

    Thus, when I weighed "*" against "+"*, I decided that having no digits was acceptable, in which case the returned string would simply be empty.

    I think you can agree with me that, had the OP asked for initial letters, the regex:

    my $variable = "abc123"; my ($letters) = $variable =~ /^([a-zA-Z]*)/;
    would do the job.
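As a rough illustration of this trade-off, the sketch below uses the same list-assignment form on a string with no leading letters. With "*" the match succeeds and yields a defined, empty string; with "+" the match fails and the variable is left undefined. The names $star and $plus are hypothetical.

```perl
use strict;
use warnings;

my $variable = "123abc";    # no letters at the start

# "*" accepts zero letters: the match succeeds with an empty capture.
my ($star) = $variable =~ /^([a-zA-Z]*)/;

# "+" demands at least one letter: the match fails, the list is empty,
# and $plus stays undef.
my ($plus) = $variable =~ /^([a-zA-Z]+)/;

print "star: ", ( defined $star ? "'$star'" : "undef" ), "\n";   # star: ''
print "plus: ", ( defined $plus ? "'$plus'" : "undef" ), "\n";   # plus: undef
```

Because the capture is taken from the list returned by the match rather than from $1, a failed match here cannot pick up a stale value from an earlier regex.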

    *I swear I thought about that!

    Flavio (perl -e 'print(scalar(reverse("\nti.xittelop\@oivalf")))')

    Don't fool yourself.

      Testing your code is a great concept. Of course, we all have to agree on the specs so we can all agree on what tests are needed. Your code works just fine for a certain subset of possibilities, dbwiz's code works just fine for a different subset of possibilities, and both work just fine for the subset of possibilities as presented by the OP. In the absence of a better spec, we all make assumptions that show the world that we, individually, live in more than they show the world that the OP lives in. (Which is why, if you look back at questions I pose, they're usually quite long-winded: to reduce the "absence of a better spec".)

      As for the initial letters, not that we're straying from the initial thread here ;-), I'd recommend matching with /^([[:alpha:]]*)/ instead. Again, we have to agree on a spec of what "initial letters" means (does it mean English letters, or can it include accented characters, or letters in other non-Roman lettering systems?). If it includes other languages, I like letting perl worry about that stuff for me ;-). Note that it is perfectly reasonable to only accept straight-ascii for some things. We just can't tell from what has been stated so far. (And I've just revealed a bit more about the world I live in.)
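A small sketch of how the two character classes diverge once the input leaves ASCII. The test string (three Greek letters followed by digits) and the variable names are assumptions made for the example. Code points between 128 and 255, such as accented Latin letters, can behave differently depending on how the string is stored and on modifiers like /u, so the sketch deliberately sticks to characters above 255, where Unicode rules always apply.

```perl
use strict;
use warnings;

# Greek alpha, beta, gamma followed by digits.
my $variable = "\x{3b1}\x{3b2}\x{3b3}123";

# ASCII-only class: none of the leading characters qualify.
my ($ascii_letters) = $variable =~ /^([a-zA-Z]*)/;

# POSIX class: perl treats the Greek letters as alphabetic.
my ($posix_letters) = $variable =~ /^([[:alpha:]]*)/;

# Print lengths rather than the characters to avoid wide-character output.
printf "ascii: %d, posix: %d\n",
    length($ascii_letters), length($posix_letters);   # ascii: 0, posix: 3
```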

        In doing some theatre I think it is (or should be) allowed to stretch specs at will, at least as long as they don't STRRRRAAAAAAAPPP miserably! I'm a lazy guy (as you implicitly, and correctly, noted with your /^([[:alpha:]]*)/ regex :), but I couldn't accept that reply the one time I had actually, actively thought about "+" and "*"!

        One of the things that PM lacks is a pub section, where dbwiz and I could enjoy a beer laughing about all that! (Others would be welcome as well.)

        Flavio (perl -e 'print(scalar(reverse("\nti.xittelop\@oivalf")))')

        Don't fool yourself.

      frodo72, don't fool yourself, as your signature says.

      dbwiz has given you good advice. Assigning $1 without testing is almost a capital sin in Regex parlance. Your initial code would pass a test against the only example provided by the OP, but it would fail in many other cases.


Re^3: what is the best way to seperate the digits and strings from variable ?
by Errto (Vicar) on May 10, 2005 at 02:49 UTC
    Please, test your code.

    Testing code in replies is encouraged (if that), but not required. One must simply have the common sense to accept that occasionally untested code will prove itself wrong, and one must be willing to correct it should that occur.
