I agree "pure" regex isn't the way to go, but...
Win8 Strawberry 5.30.3.1 (64) Wed 05/26/2021 9:07:19
C:\@Work\Perl\monks
>perl
use 5.018; # need lexicals in regexes, regex extensions
use strict;
use warnings;
my @Test = (
'43:1:1; 43:1:2; 43:1:3; 43:1:4; 43:1:5; 43:1:6; 27:3:7; 27:3:8; 27:3:9; 65:1:4; 65:1:18',
'987:23:45; 987:23:46; 65:1:17; 65:1:19',
);
for my $data (@Test) {
print "'$data' \n";
my $rx_base = qr{ (?> \d+ : \d+ :) }xms;
my $rx_tail = qr{ (?> \d+) }xms;
my $rx_sep = qr{ (?> ;? \s*) }xms;
my @run;
$data =~ s{
($rx_base) ($rx_tail) (?{ push @run, $^N })
(?: $rx_sep \1 ($rx_tail) (?{ push @run, $^N })
(?(?{ $run[-1] - $run[-2] != 1 }) (*F))
)+
}
{$1$2-$3}xmsg;
print "'$data' \n\n";
}
^Z
'43:1:1; 43:1:2; 43:1:3; 43:1:4; 43:1:5; 43:1:6; 27:3:7; 27:3:8; 27:3:9; 65:1:4; 65:1:18'
'43:1:1-6; 27:3:7-9; 65:1:4; 65:1:18'
'987:23:45; 987:23:46; 65:1:17; 65:1:19'
'987:23:45-46; 65:1:17; 65:1:19'
(I think this could be scaled back to pre-5.10 regexes if necessary.)
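For comparison with the regex above, here is a minimal split-and-loop sketch of the same collapsing that uses no embedded code blocks and so should not depend on the Perl version. The name collapse_runs is hypothetical (it does not appear in the thread), and the sketch assumes every item has the form N:N:N and that items are separated by semicolons. It also normalizes separators to '; ' on output.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the original post: collapse runs of
# consecutive "A:B:N" items that share the same "A:B:" prefix.
sub collapse_runs {
    my ($data) = @_;
    my @runs;    # each entry: [prefix, first tail, last tail]
    for my $item (split /\s*;\s*/, $data) {
        # Assumption: every item is well formed; anything else is an error.
        my ($base, $tail) = $item =~ /\A(\d+:\d+:)(\d+)\z/
            or die "unexpected item '$item'\n";
        if (@runs && $runs[-1][0] eq $base && $tail - $runs[-1][2] == 1) {
            $runs[-1][2] = $tail;               # extend the current run
        }
        else {
            push @runs, [$base, $tail, $tail];  # start a new run
        }
    }
    my @parts;
    for my $run (@runs) {
        my ($base, $first, $last) = @$run;
        # A run of length one is printed as a single item, without a range.
        push @parts, $first == $last ? "$base$first" : "$base$first-$last";
    }
    return join '; ', @parts;
}

print collapse_runs('987:23:45; 987:23:46; 65:1:17; 65:1:19'), "\n";
# prints: 987:23:45-46; 65:1:17; 65:1:19
```

Keeping the run state in an ordinary array makes the "is this tail exactly one more than the last" test an explicit if, which is what the conditional with (*F) expresses inside the regex.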
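Update: Here's another version that I think is a bit nicer.
It avoids "absolute" capture group variables ($1, $2, $3) and
backreferences (\1), relying on $^N and the relative backreference
\g-2 instead. It is also not push-y: instead of pushing onto an array,
it uses plain scalars that are initialized inside the regex at the
start of each match attempt.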
Win8 Strawberry 5.30.3.1 (64) Tue 06/01/2021 11:31:49
C:\@Work\Perl\monks
>perl
use 5.018; # need lexicals in regexes, regex extensions
use strict;
use warnings;
my @Test = (
'43:1:1; 43:1:2; 43:1:3; 43:1:4; 43:1:5; 43:1:6; 27:3:7; 27:3:8; 27:3:9; 65:1:4; 65:1:18',
'987:23:45; 987:23:46; 65:1:17; 65:1:19',
);
for my $data (@Test) {
print "'$data' \n";
my $rx_base = qr{ (?> \d+ : \d+ :) }xms;
my $rx_tail = qr{ (?> \d+) }xms;
my $rx_sep = qr{ (?> \s* ; \s*) }xms;
my ($start, $end, $next);
$data =~ s{
($rx_base) \K ($rx_tail) (?{ $start = $end = $^N })
(?: $rx_sep \g-2 ($rx_tail) (?{ $next = $^N })
# test first: assignments made in code blocks are not undone on backtracking
(?(?{ $next - $end != 1 }) (*F)) (?{ $end = $next })
)+
}
{$start-$end}xmsg;
print "'$data' \n\n";
}
^Z
'43:1:1; 43:1:2; 43:1:3; 43:1:4; 43:1:5; 43:1:6; 27:3:7; 27:3:8; 27:3:9; 65:1:4; 65:1:18'
'43:1:1-6; 27:3:7-9; 65:1:4; 65:1:18'
'987:23:45; 987:23:46; 65:1:17; 65:1:19'
'987:23:45-46; 65:1:17; 65:1:19'
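For anyone unfamiliar with the two constructs the second version leans on, here is a small isolated sketch of $^N and a relative backreference. The helper name same_prefix_pair is hypothetical and exists only for this illustration. It matches just one adjacent pair and makes no attempt at collapsing runs.

```perl
use 5.018;    # lexical variables inside regex code blocks
use strict;
use warnings;

# Hypothetical helper for illustration only: if two adjacent items share
# the same "N:N:" prefix, return the prefix and both tails.
sub same_prefix_pair {
    my ($text) = @_;
    my $last;
    if ($text =~ m{
            (\d+ : \d+ :)       # group 1: the shared prefix
            (\d+)               # group 2: first tail
            \s* ; \s*
            \g-2                # relative backreference: two groups back, which is group 1
            (\d+)               # group 3: second tail
            (?{ $last = $^N })  # copy out the text of the most recently closed group
        }xms)
    {
        return ($1, $2, $last);
    }
    return;
}

my @hit  = same_prefix_pair('43:1:1; 43:1:2');
my @miss = same_prefix_pair('43:1:1; 27:3:2');
print "hit:  @hit\n";                   # prints "hit:  43:1: 1 2"
print "miss: ", scalar(@miss), "\n";    # prints "miss: 0"
```

Because \g-2 counts backwards from where it appears, the pattern keeps working if another capture group is later added in front of it, which is the point of avoiding absolute group numbers.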
Give a man a fish: <%-{-{-{-<