http://qs321.pair.com?node_id=1040291


in reply to RegEx Headaches

G'day oryx3,

Welcome to the monastery.

I see a number of solutions that appear to be more complicated than necessary.

You've provided two pieces of sample input with expected output for each. In both cases, either of these will achieve what you want:

/\.(\d+)/g
/(\d+)\./g

Here's my test:

$ perl -Mstrict -Mwarnings -de 1

Loading DB routines from perl5db.pl version 1.39_09
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

main::(-e:1):   1
  DB<1> $_ = 'ActionLogs.1.1998.xml'
  DB<2> x /\.(\d+)/g
0  1
1  1998
  DB<3> x /(\d+)\./g
0  1
1  1998
  DB<4> $_ = 'ActionLogs.1.2.3.4.5.6.7.8.9.xml'
  DB<5> x /\.(\d+)/g
0  1
1  2
2  3
3  4
4  5
5  6
6  7
7  8
8  9
  DB<6> x /(\d+)\./g
0  1
1  2
2  3
3  4
4  5
5  6
6  7
7  8
8  9
  DB<7> q
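For anyone who would rather run this as a script than in the debugger, here is a minimal sketch of the same test. The first two file names are the sample inputs above; the third is a hypothetical name I made up (it is not from the original question) to show that the two patterns only agree when every digit group has a dot on both sides. The %result hash is just an illustrative way to keep the lists around.

```perl
use strict;
use warnings;

# Two sample inputs from the debugger session, plus one HYPOTHETICAL name
# (an assumption, not from the question) whose first digit has no leading dot.
my @names = (
    'ActionLogs.1.1998.xml',
    'ActionLogs.1.2.3.4.5.6.7.8.9.xml',
    'ActionLogs2.1.1998.xml',    # hypothetical
);

my %result;
for my $name (@names) {
    # In list context, //g returns every captured group in order.
    my @after_dot  = $name =~ /\.(\d+)/g;    # digits preceded by a dot
    my @before_dot = $name =~ /(\d+)\./g;    # digits followed by a dot

    $result{$name} = { after_dot => \@after_dot, before_dot => \@before_dot };

    print "$name\n";
    print "  /\\.(\\d+)/g  => @after_dot\n";
    print "  /(\\d+)\\./g  => @before_dot\n";
}
```

On the two sample inputs both patterns return the same lists. On the hypothetical third name, the second pattern also picks up the 2 in "ActionLogs2.", so which pattern is right depends on what the real file names can look like.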

-- Ken

Re^2: RegEx Headaches
by AnomalousMonk (Archbishop) on Jun 22, 2013 at 21:28 UTC
    ... more complicated than necessary.

    Since the decimal digit groups appear to be unambiguously delimited to begin with, no need to worry about a delimiter or capture group at all:

    >perl -wMstrict -le
    "$_ = 'ActionLogs.1.22.333.4.5.6.7.8.987.xml';
     ;;
     my @digit_groups = m{ \d+ }xmsg;
     printf qq{'$_' } for @digit_groups;
    "
    '1' '22' '333' '4' '5' '6' '7' '8' '987'

      ++ Yes, that works and is less complicated still. :-)

      It wouldn't have occurred to me not to use a capture group. I checked the online docs and found in perlretut - Using regular expressions in Perl - Global matching (after following links from perlre):

      In list context, //g returns a list of matched groupings, or if there are no groupings, a list of matches to the whole regexp. [my emphasis]

      That seemed new to me (those links are for 5.16.2), so I checked the earliest online perldoc version (5.8.8). The text is in a different manpage (http://perldoc.perl.org/5.8.8/perlop.html#Regexp-Quote-Like-Operators) and is worded differently, but the behaviour was the same back then:

      In list context, it returns a list of the substrings matched by any capturing parentheses in the regular expression. If there are no parentheses, it returns a list of all the matched strings, as if there were parentheses around the whole pattern. [my emphasis, again]
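To make the two quoted passages concrete, here is a small sketch of list-context //g with and without capturing parentheses. It reuses the first sample file name from this thread; the variable names are only for illustration.

```perl
use strict;
use warnings;

my $name = 'ActionLogs.1.1998.xml';

# With a capture group, list-context //g returns only the captured text.
my @captured = $name =~ /\.(\d+)/g;

# With no group at all, it returns each whole match, so the dots come along.
my @whole = $name =~ /\.\d+/g;

# A bare \d+ with no group returns just the digit runs.
my @digits = $name =~ /\d+/g;

print "captured: @captured\n";    # captured: 1 1998
print "whole:    @whole\n";       # whole:    .1 .1998
print "digits:   @digits\n";      # digits:   1 1998
```

The second case is the one to watch: drop the parentheses from a pattern that also matches a delimiter and the delimiter ends up in the returned list.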

      I've learned something new. Thank you.

      -- Ken