OK, it took me a few minutes to completely understand how and
why this works. Here is my dissected version of the regex:
s/(\d+)            # first number (group #1)
  (?:              # non-capturing group (not numbered)
    ,              # followed by a comma
    (              # group #2
      (??{$++1})   # match previous number + 1 (does not capture by itself)
    )              # end group #2
  )+               # end non-capturing group, repeat
/$1-$+/gx;         # substitute the first number, a dash, and the
                   # last matched one
Group #1 matches the first number in a sequence of numbers.
Then (??{$++1}) is used to match "the last number plus one"
($+ stands for whatever was matched by the last set of grouping
parentheses). For the second number in a sequence, the "last number"
is the one matched by group #1. But for subsequent numbers (because
of the +), the last number matched (that is, whatever the (??{$++1})
matched last time) becomes the "last number". So the thing repeats
until the "last number plus one" part doesn't match anymore (that is,
until a non-consecutive number is found), and then the substitution
replaces the whole match with the first number (group #1), a dash,
and the last number matched.
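
To see the walkthrough above in action, here is a small sketch that runs the one-line form of the regex on a sample list. The input string and the variable names ($list, $ranges) are made up for illustration, and the input deliberately avoids pairs like "1,25", where the "plus one" part could match only a prefix of the next number.

```perl
use strict;
use warnings;

# Hypothetical sample input: a comma-separated list of ascending numbers.
my $list = "1,2,3,5,7,8,9,12";

# Copy the list, then collapse each run of consecutive numbers into
# "first-last" using the regex discussed above (without /x).
(my $ranges = $list) =~ s/(\d+)(?:,((??{$++1})))+/$1-$+/g;

print "$ranges\n";   # prints "1-3,5,7-9,12"
```

Numbers that are not part of a run (5 and 12 here) are left alone, because the (?:...)+ part needs at least one "comma, previous number plus one" to match.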
At first look, I thought the double parentheses around (??{$++1})
were unnecessary, but without them it does not work, and here is
why: $+ contains what was matched by the last set of capturing
parentheses, and (??{...}) by itself does not capture anything.
Without the extra parentheses, $+ would always be group #1. With
them, $+ holds the number that the expression matched on the
previous pass. Very clever!
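
A quick way to convince yourself the extra parentheses matter is to run both variants side by side. This is only a sketch. The sample string and the names $with and $without are mine, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical sample input.
my $list = "1,2,3";

# With the extra capturing parentheses: $+ follows the last number matched.
(my $with = $list) =~ s/(\d+)(?:,((??{$++1})))+/$1-$+/g;

# Without them: nothing but group #1 ever captures, so $+ stays at "1".
# The regex keeps looking for "2", stops after "1,2", and substitutes "1-1".
(my $without = $list) =~ s/(\d+)(?:,(??{$++1}))+/$1-$+/g;

print "with:    $with\n";      # prints "with:    1-3"
print "without: $without\n";   # prints "without: 1-1,3"
```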
--ZZamboni