Modifying muliple matched strings in text

nysus has asked for the wisdom of the Perl Monks concerning the following question:

In the following piece of markdown text, I want to replace the spaces with a hyphen and also make the link absolute with a slash in front. So I want to go from:

blah blah [click me](click me) more stuff
blah [link here](link here) blah blah
...
[download]

to:

blah blah [click me](/click-me) more stuff
blah [link here](/link-here) blah blah
...
[download]

So this is as far as I got and I'm stymied:

#!/usr/bin/env perl
use strict;
use warnings;

my $text = "blah blah [click me](click me) more stuff\nblah [link here
+](link here) blah blah";
$text =~ s/(\[[^]]+]\()/$1\//g; # make links absolute
$text =~ /(\[[^]]+]\(\/)([^\)]+)/;
my $part1 = $1;
my $part2 = $2;
my $orig_part2 = $part2;
$part2 =~ s/ /-/g;
$text =~ s/\Q$part1$orig_part2\E/$part1$part2/g;
[download]

The obvious problem with this code is that only the first link gets the space replaced:

blah blah [click me](/click-me) more stuff
blah [link here](/link here) blah blah
[download]

I'm drawing a blank on how I might loop through all matches to links and modify those matches. Or maybe there is an entirely better way of pulling this off. Thanks!

UPDATE:

OK, I noodled around with this some more. On a lark, I tried a "global" match (which I didn't know existed outside substitution), and came up with this:

my $text = "blah blah [click me](click me) more stuff\nblah [link here
+](link here) blah blah";
$text =~ s/(\[[^]]+]\()/$1\//g; # make links absolute
my @matches = $text =~ /(\[[^]]+]\(\/)([^\)]+)/g;
while (@matches) {
  my $part1 = shift @matches;
  my $part2 = shift @matches;
  my $orig_part2 = $part2;
  $part2 =~ s/ /-/g;
  $text =~ s/\Q$part1$orig_part2\E/$part1$part2/g;
}
[download]

Though it seems to work, I have a hunch this isn't ideal.

$PM = "Perl Monk's";
$MCF = "Most Clueless ~~Friar~~ ~~Abbot~~ ~~Bishop~~ ~~Pontiff~~ ~~Deacon~~ ~~Curate~~ ~~Priest~~ Vicar";
$nysus = $PM . ' ' . $MCF;
Click here if you love Perl Monks

Comment on Modifying muliple matched strings in text Select or Download Code

Replies are listed 'Best First'.
Re: Modifying muliple matched strings in text (updated) by haukex (Archbishop) on Jul 12, 2020 at 13:36 UTC
Though it seems to work, I have a hunch this isn't ideal. Of course, the best way is to use a proper parser. In this case, the original Markdown is a Perl script itself, and the regex extracted from it looks like the following (modified slightly to put it into `qr//` form). Of course, now there are tons of different Markdown variations and parsers, and probably several that are more robust than this "just a regex" parser, but as long as your Markdown input is simple enough, it should probably be ok. `<update>` There are caveats to this approach, though - for example, the following regex will also operate on code blocks! As usual, more representative sample data will result in more accurate solutions `:-)` `</update>` `my $g_nested_brackets; $g_nested_brackets = qr{ (?> # Atomic matching [^\[\]]+ # Anything other than brackets \| \[ (??{ $g_nested_brackets }) # Recursive set of nested brackets \] )* }x; my $anchors = qr{ ( # wrap whole match in $1 \[ ($g_nested_brackets) # link text = $2 \] $ # literal paren [ \t]* <?(.?)>? # href = $3 [ \t] ( # $4 (['"]) # quote char = $5 (.?) # Title = $6 \5 # matching quote )? # title is optional $ ) }xs;` [download] I've taken this regex and modified it to modernize it a bit and only capture the things we're interested in: use warnings; use strict; my $anchors = qr{ (?(DEFINE) (?<nested_brackets> (?> [^\[\]]+ \| \[ (?&nested_brackets) \] ) ) ) \[ (?<text> (?&nested_brackets) ) \] $ (?<link> [ \t]* <? .? >? [ \t] (?: (?<titlequote>['"]) .*? \k<titlequote> )? ) $ }xs; my $input = <<'END'; blah blah [click me](click me) more stuff blah [link here](link here) blah blah END my $expect = <<'END'; blah blah [click me](/click-me) more stuff blah [link here](/link-here) blah blah END (my $output = $input) =~ s{$anchors}{ my ($t, $l) = @+{qw/ text link /}; $l =~ s/\s+/-/g; "[$t](/$l)" }ge; use Test::More tests=>1; is $output, $expect; [download]	[reply] [d/l] [select]
Re^2: Modifying muliple matched strings in text (updated) by nysus (Parson) on Jul 14, 2020 at 13:32 UTC
Holy cow, there's some really advanced regex stuff going on here I've never seen before. I will study this closely and pick up some new tricks. Thanks. I thought about researching a markdown parser but I figured it was easier, for now, just to roll my own before going down a big wormhole. $PM = "Perl Monk's"; $MCF = "Most Clueless ~~Friar~~ ~~Abbot~~ ~~Bishop~~ ~~Pontiff~~ ~~Deacon~~ ~~Curate~~ ~~Priest~~ Vicar"; $nysus = $PM . ' ' . $MCF; Click here if you love Perl Monks	[reply]
Re: Modifying muliple matched strings in text by LanX (Saint) on Jul 12, 2020 at 13:00 UTC
Hint: Call a function in the substitution part (activated by /e modifier) which returns your transformation. `s/PATTERN/replace($1,$2)/eg` Not sure if you even need to pass the matchgroups $1,... into the sub, but it's certainly more readable. Edit Pattern should match the links `[$1]($2)` (pseudo syntax). I recommend using the /x flag for readability. The transformations of $2 should happen inside the replace() sub HTH! Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery}	[reply] [d/l] [select]
Re^2: Modifying muliple matched strings in text by nysus (Parson) on Jul 14, 2020 at 13:24 UTC
Very cool. Didn't know that was even possible (well probably read about it once and then forgot). Thanks. $PM = "Perl Monk's"; $MCF = "Most Clueless ~~Friar~~ ~~Abbot~~ ~~Bishop~~ ~~Pontiff~~ ~~Deacon~~ ~~Curate~~ ~~Priest~~ Vicar"; $nysus = $PM . ' ' . $MCF; Click here if you love Perl Monks	[reply]
Re: Modifying muliple matched strings in text by tybalt89 (Monsignor) on Jul 12, 2020 at 15:01 UTC
`#!/usr/bin/perl use strict; # https://perlmonks.org/?node_id=11119206 use warnings; my $text = "blah blah [click me](click me) more stuff\nblah [link here +](link here) blah blah"; s!]$\K.*?$! '/' . $& =~ tr~ ~-~r !ge for $text; print "$text\n";` [download]	[reply] [d/l]

Back to Seekers of Perl Wisdom

Edit