PerlMonks  


I agree "pure" regex isn't the way to go, but...

Win8 Strawberry 5.30.3.1 (64) Wed 05/26/2021  9:07:19
C:\@Work\Perl\monks
>perl
use 5.018;  # need lexicals in regexes, regex extensions

use strict;
use warnings;

my @Test = (
    '43:1:1; 43:1:2; 43:1:3; 43:1:4; 43:1:5; 43:1:6; 27:3:7; 27:3:8; 27:3:9; 65:1:4; 65:1:18',
    '987:23:45; 987:23:46; 65:1:17; 65:1:19',
    );

for my $data (@Test) {
    print "'$data' \n";

    my $rx_base = qr{ (?> \d+ : \d+ :) }xms;
    my $rx_tail = qr{ (?> \d+) }xms;
    my $rx_sep  = qr{ (?> ;? \s*) }xms;

    my @run;
    $data =~ s{
        ($rx_base) ($rx_tail)  (?{ push @run, $^N })
        (?: $rx_sep \1 ($rx_tail)  (?{ push @run, $^N })
            (?(?{ $run[-1] - $run[-2] != 1 }) (*F))
            )+
        }
        {$1$2-$3}xmsg;

    print "'$data' \n\n";
    }
^Z
'43:1:1; 43:1:2; 43:1:3; 43:1:4; 43:1:5; 43:1:6; 27:3:7; 27:3:8; 27:3:9; 65:1:4; 65:1:18'
'43:1:1-6; 27:3:7-9; 65:1:4; 65:1:18'

'987:23:45; 987:23:46; 65:1:17; 65:1:19'
'987:23:45-46; 65:1:17; 65:1:19'
(I think this could be scaled back to pre-5.10 regexes if necessary.)
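For comparison, here is a minimal sketch that uses no embedded code blocks at all. It is not the poster's method: it lets a plain regex grab every adjacent entry that shares a base, then does the "is it consecutive?" arithmetic in an /e replacement. The sub names collapse_runs and format_runs are made up for illustration, and the sketch assumes entries are always separated by a semicolon (it rejoins split runs with '; ').

```perl
use strict;
use warnings;

# Hypothetical helper: given a base such as '43:1:' and its tails in order,
# return 'base lo-hi' pieces for consecutive runs, joined with '; '.
sub format_runs {
    my ($base, @tails) = @_;
    my $lo = shift @tails;
    my $hi = $lo;
    my @out;
    for my $t (@tails) {
        if ($t == $hi + 1) { $hi = $t; next; }   # still consecutive: extend run
        push @out, $lo == $hi ? "$base$lo" : "$base$lo-$hi";
        ($lo, $hi) = ($t, $t);                   # start a new run
    }
    push @out, $lo == $hi ? "$base$lo" : "$base$lo-$hi";
    return join '; ', @out;
}

# Hypothetical helper: collapse runs in a whole string.
sub collapse_runs {
    my ($data) = @_;
    $data =~ s{
        ( \d+ : \d+ : ) ( \d+ )            # base, then first tail
        ( (?: \s* ; \s* \1 \d+ )* )        # further entries with the same base
    }{
        # Copy the captures first: the split below runs its own matches.
        my ($base, $first, $rest) = ($1, $2, $3);
        my @more = map { (split /:/)[-1] } grep { length } split /\s*;\s*/, $rest;
        format_runs($base, $first, @more);
    }xge;
    return $data;
}

print collapse_runs($_), "\n" for (
    '43:1:1; 43:1:2; 43:1:3; 43:1:4; 43:1:5; 43:1:6; 27:3:7; 27:3:8; 27:3:9; 65:1:4; 65:1:18',
    '987:23:45; 987:23:46; 65:1:17; 65:1:19',
);
```

The trade-off is that the regex no longer decides where a run ends; it only groups by base, and ordinary Perl handles the rest, which avoids (*F) and code blocks at the cost of a helper sub.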

Update: Here's another version that I think is a bit nicer. It avoids "absolute" capture group variables and backreferences. It is also not push-y, using plain scalars that are self-initializing.

Win8 Strawberry 5.30.3.1 (64) Tue 06/01/2021 11:31:49
C:\@Work\Perl\monks
>perl
use 5.018;  # need lexicals in regexes, regex extensions

use strict;
use warnings;

my @Test = (
    '43:1:1; 43:1:2; 43:1:3; 43:1:4; 43:1:5; 43:1:6; 27:3:7; 27:3:8; 27:3:9; 65:1:4; 65:1:18',
    '987:23:45; 987:23:46; 65:1:17; 65:1:19',
    );

for my $data (@Test) {
    print "'$data' \n";

    my $rx_base = qr{ (?> \d+ : \d+ :) }xms;
    my $rx_tail = qr{ (?> \d+) }xms;
    my $rx_sep  = qr{ (?> \s* ; \s*) }xms;

    my ($start, $prev, $end);
    $data =~ s{
        ($rx_base) \K ($rx_tail)  (?{ $start = $end = $^N })
        (?: $rx_sep \g-2 ($rx_tail)  (?{ ($prev, $end) = ($end, $^N) })
            (?(?{ $end - $prev != 1 }) (*F))
            )+
        }
        {$start-$end}xmsg;

    print "'$data' \n\n";
    }
^Z
'43:1:1; 43:1:2; 43:1:3; 43:1:4; 43:1:5; 43:1:6; 27:3:7; 27:3:8; 27:3:9; 65:1:4; 65:1:18'
'43:1:1-6; 27:3:7-9; 65:1:4; 65:1:18'

'987:23:45; 987:23:46; 65:1:17; 65:1:19'
'987:23:45-46; 65:1:17; 65:1:19'
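One thing to watch with plain scalars set inside (?{ }) blocks: such assignments are not undone when the regex engine backtracks (perlre notes that only local-ized changes are unwound). So when a run is followed by an entry with the same base that is not consecutive, for example '1:1:1; 1:1:2; 1:1:4', the rejected tail appears to be left in $end after the (*F). The test strings in the post do not exercise that case. Below is a minimal sketch of one way around it, assuming it is acceptable to test the candidate tail before storing it. The sub name collapse_checked and the extra test string are made up for illustration, and $prev is no longer needed.

```perl
use 5.018;  # lexical variables inside regex code blocks
use strict;
use warnings;

# Hypothetical wrapper around the scalar-based substitution.
sub collapse_checked {
    my ($data) = @_;

    my $rx_base = qr{ (?> \d+ : \d+ :) }xms;
    my $rx_tail = qr{ (?> \d+) }xms;
    my $rx_sep  = qr{ (?> \s* ; \s*) }xms;

    my ($start, $end);
    $data =~ s{
        ($rx_base) \K ($rx_tail)  (?{ $start = $end = $^N })
        (?: $rx_sep \g-2 ($rx_tail)
            (?(?{ $^N - $end != 1 }) (*F))   # reject before storing anything
            (?{ $end = $^N })                # reached only for an accepted tail
            )+
        }
        {$start-$end}xmsg;

    return $data;
}

# A run followed by a same-base entry that is not consecutive.
print collapse_checked('1:1:1; 1:1:2; 1:1:4'), "\n";
```

Because the comparison reads $^N directly and $end is written only after the check passes, a failed iteration leaves $end at the last accepted tail.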


Give a man a fish:  <%-{-{-{-<


In reply to Re: Regexp substitution on variable-length ranges with embedded code? (updated) by AnomalousMonk
in thread Regexp substitution on variable-length ranges with embedded code? by Polyglot
