http://qs321.pair.com?node_id=1017070

live4tech has asked for the wisdom of the Perl Monks concerning the following question:

Hello. I do not understand why the following regex does not match "123dog":

/\A\d+[a-z]+\z/

I can get this to match with an app that checks regex expressions, but not within a perl program.

Also, I can get /^\d+[a-z]+$/ to always match, either within the perl program or in a regex checking app.

Does anyone know why this is? Am I missing something obvious?

Re: REGEX problem with anchors
by mbethke (Hermit) on Feb 05, 2013 at 06:50 UTC
    perl -E'say "OK" if "123dog"=~/\A\d+[a-z]+\z/' works for me. You're not using some stone-age Perl that didn't have \A, are you? It should have been there since before 5.8.

      Thanks for this, it led to the answer!

      The one-liner worked for me as well and so I went back to my code (which I should have posted) and realized I forgot to chomp the input strings (the user was prompted to enter input strings and a regex). I chomped the regex, not the input strings. A novice mistake, I apologize!
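
A minimal sketch of that fix, assuming the input arrives as a line with its trailing newline still attached (simulated here with a literal string rather than a real prompt). The helper name matches_strict is made up for illustration and is not from the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $input is digits followed by lowercase
# letters and nothing else, using the \A ... \z anchors from the question.
sub matches_strict {
    my ($input) = @_;
    return $input =~ /\A\d+[a-z]+\z/ ? 1 : 0;
}

# A line read with <STDIN> keeps its trailing newline; simulate that here.
my $raw = "123dog\n";
print "before chomp: ", matches_strict($raw), "\n";    # prints 0

# Copy the line, then strip the trailing newline from the copy.
chomp(my $clean = $raw);
print "after chomp:  ", matches_strict($clean), "\n";  # prints 1
```

In a real program the same chomp would go right after reading the input string, not only after reading the regex.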

        If you are using Eclipse with the EPIC plugin as your IDE, you can check your regular expression in the RegExp view. Also, if your string has leading whitespace or a line break at the end, you can easily see it in debug mode in the Variables view. I always find that a great help.

        ^_^

        Cheers!

        Tobias
Re: REGEX problem with anchors
by kcott (Archbishop) on Feb 05, 2013 at 09:52 UTC

    G'day live4tech,

    I see you've resolved your problem. There can be rare instances when you want to preserve all the input, including terminal newlines. In these cases, you can use \Z (uppercase), which also matches just before a newline at the end of the string, instead of the more usual \z (lowercase), which matches only at the very end.

    $ perl -Mstrict -Mwarnings -E '
        while (<>) {
            say "z-match" if /\A\d+[a-z]+\z/;
            say "Z-match" if /\A\d+[a-z]+\Z/;
        }
    '
    123dog
    Z-match

    Details are in: perlre - Regular Expressions under Assertions.
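
For comparison, here is a rough sketch that runs the same pattern with each of the three end anchors ($, \Z and \z) against a few strings. It may also explain why the /^\d+[a-z]+$/ version in the original question matched the unchomped input. The helper name end_anchor_report is invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: reports which end anchors accept $str.
sub end_anchor_report {
    my ($str) = @_;
    return {
        dollar => ($str =~ /\A\d+[a-z]+$/  ? 1 : 0),  # end, or before a final newline
        Z      => ($str =~ /\A\d+[a-z]+\Z/ ? 1 : 0),  # same as $ without /m
        z      => ($str =~ /\A\d+[a-z]+\z/ ? 1 : 0),  # very end of string only
    };
}

for my $s ("123dog", "123dog\n", "123dog\n\n") {
    my $r = end_anchor_report($s);
    (my $shown = $s) =~ s/\n/\\n/g;    # make newlines visible in the output
    printf "%-12s \$=%d \\Z=%d \\z=%d\n",
        $shown, $r->{dollar}, $r->{Z}, $r->{z};
}
```

Only the single trailing newline is tolerated by $ and \Z; with two newlines none of the three variants match.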

    -- Ken

      But of course, there is no reason to ever use \Z (uppercase)--because you can just use \z (lowercase) and specify the \n in your regex:
      /$my_regex \n \z/xms

      ...which makes it clearer what you are doing.
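
A small sketch of that approach, assuming the core pattern is the \d+[a-z]+ from the original question. The names $core and matches_with_newline are placeholders chosen for this example.

```perl
use strict;
use warnings;

# Assumed core pattern, taken from the question at the top of the thread.
my $core = qr/\d+ [a-z]+/x;

# Hypothetical helper: requires exactly one newline after the core pattern,
# spelled out in the regex instead of being implied by \Z.
sub matches_with_newline {
    my ($str) = @_;
    return $str =~ /\A $core \n \z/xms ? 1 : 0;
}

print matches_with_newline("123dog\n"),   "\n";  # prints 1
print matches_with_newline("123dog"),     "\n";  # prints 0
print matches_with_newline("123dog\n\n"), "\n";  # prints 0
```

If the newline should be optional, \n? in place of \n would accept both the chomped and the unchomped string.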

      When you are having regex problems, the first thing you want to do is nail down what is in your string:

      my $str = "123dog\n";
      say "-->$str<--";

      --output:--
      -->123dog
      <--

      Or, to reveal the ord() of each character in the string:
      printf "%vd", $str;

      --output:--
      49.50.51.100.111.103.10

      #6 ascii chars in '123dog', but outputs 7 codes.
      #Checking an ascii chart for the code 10:
      #line feed.  Ah hah, I forgot to chomp()
      #the string!
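
Both checks can be wrapped in small helpers. The following is only a sketch using core Perl; the names char_codes and visible are made up here. visible() rewrites anything outside printable ASCII as a \x{..} escape, so a stray newline or tab shows up in the string itself.

```perl
use strict;
use warnings;

# Hypothetical helper: dot-separated character codes, via sprintf's %vd.
sub char_codes {
    my ($str) = @_;
    return sprintf "%vd", $str;
}

# Hypothetical helper: replace each non-printable or non-ASCII character
# with a \x{..} escape so that it becomes visible.
sub visible {
    my ($str) = @_;
    (my $shown = $str) =~ s/([^\x20-\x7e])/sprintf '\\x{%02x}', ord $1/ge;
    return $shown;
}

my $str = "123dog\n";
print char_codes($str), "\n";   # prints 49.50.51.100.111.103.10
print visible($str),    "\n";   # prints 123dog\x{0a}
```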