PerlMonks  

Re: wiki regex reprocessing replacement

by LanX (Cardinal)
on Feb 15, 2020 at 15:19 UTC (#11112989)


in reply to wiki regex reprocessing replacement

> I'll update an SSCCE soon.

here we go

(update: please see Re: wiki regex reprocessing replacement (UPDATED^2) for better testcases including wrong markup)

use strict;
use warnings;
use Data::Dump qw/pp dd/;
use Test::More;

my $wiki     = '_/one *two*/ three_ null _/four *five*/ six_ null _/seven *eight*/ nine_';
my $expected = '<u><i>one <b>two</b></i> three</u> null <u><i>four <b>five</b></i> six</u> null <u><i>seven <b>eight</b></i> nine</u>';

my $pre  = qr/(^|\s|>)/;
my $post = qr/($|\s|<)/;

my %h = (
    '*' => 'b',
    '/' => 'i',
    '_' => 'u',
);

sub tf {
    s{ $pre ([_*/]) (.*?) \2 (?=$post)}{$1<$h{$2}>$3</$h{$2}>}xg
};

$_ = $wiki;

my $DBG = 1;

diag "IN <= '$wiki'\n\n" if $DBG;

for my $i (1..3) {
    tf();
    diag "$i: '$_'\n\n" if $DBG;
}

is($_, $expected, " repeated replace works");

done_testing;

# IN <= '_/one *two*/ three_ null _/four *five*/ six_ null _/seven *eight*/ nine_'
# 
# 1: '<u>/one *two*/ three</u> null <u>/four *five*/ six</u> null <u>/seven *eight*/ nine</u>'
# 
# 2: '<u><i>one *two*</i> three</u> null <u><i>four *five*</i> six</u> null <u><i>seven *eight*</i> nine</u>'
# 
# 3: '<u><i>one <b>two</b></i> three</u> null <u><i>four <b>five</b></i> six</u> null <u><i>seven <b>eight</b></i> nine</u>'
# 
ok 1 -  repeated replace works
1..1
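
The loop above runs exactly three passes, one per nesting level in the test string. As a minimal sketch, assuming the same %h, $pre and $post, the substitution could instead be repeated until a pass replaces nothing. The name wiki_to_html is hypothetical and not part of the original post:

```perl
use strict;
use warnings;

my %h    = ( '*' => 'b', '/' => 'i', '_' => 'u' );
my $pre  = qr/(^|\s|>)/;
my $post = qr/($|\s|<)/;

# Hypothetical helper (not from the post): repeat the pairwise
# substitution until a pass makes no replacement, so the nesting
# depth does not have to be known in advance.
sub wiki_to_html {
    my ($text) = @_;
    1 while $text =~ s{ $pre ([_*/]) (.*?) \2 (?=$post) }{$1<$h{$2}>$3</$h{$2}>}xg;
    return $text;
}

print wiki_to_html('_/one *two*/ three_ null'), "\n";
# prints: <u><i>one <b>two</b></i> three</u> null
```

The loop should terminate because the slash in a generated closing tag is preceded by `<` and followed by a letter, so it can match neither as opening nor as closing markup.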

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

Re^2: wiki regex reprocessing replacement
by hippo (Chancellor) on Feb 15, 2020 at 16:59 UTC

    Here's a non-recursive way which I think fits your criteria:

    use strict;
    use warnings;
    use Test::More;

    my $wiki     = '_/one *two*/ three_ null _/four *five*/ six_ null _/seven *eight*/ nine_';
    my $expected = '<u><i>one <b>two</b></i> three</u> null <u><i>four <b>five</b></i> six</u> null <u><i>seven <b>eight</b></i> nine</u>';

    my %h = (
        '*' => 'b',
        '/' => 'i',
        '_' => 'u',
    );
    my $DBG = 1;

    sub flip {
        my $s = shift;
        my $z = $h{$s};
        $h{$s} = $z =~ /\// ? substr ($z, 1, 1) : "/$z";
        return "<$z>";
    }

    sub tf {
        diag "Pre: '$_'\n\n" if $DBG;
        s{([_*/])}{flip($1)}eg
    };

    $_ = $wiki;
    diag "IN <= '$wiki'\n\n" if $DBG;
    tf();
    is ($_, $expected, " repeated replace works");
    done_testing;

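
The flip() approach keeps its open/closed state in the global %h, so the state carries over between calls whenever markup is unbalanced. A minimal sketch of the same toggling idea with per-call state follows; toggle_markup and %tag_for are hypothetical names, not from the post. Like the original, it replaces every markup character in a single pass and ignores word boundaries:

```perl
use strict;
use warnings;

my %tag_for = ( '*' => 'b', '/' => 'i', '_' => 'u' );

# Hypothetical variant of the toggling idea: each markup character
# alternately opens and closes its tag. The state lives in a lexical
# hash, so every call starts with all tags closed.
sub toggle_markup {
    my ($text) = @_;
    my %open;    # markup char => true while its tag is open
    $text =~ s{([_*/])}{
        my $tag = $tag_for{$1};
        ( $open{$1} = !$open{$1} ) ? "<$tag>" : "</$tag>";
    }eg;
    return $text;
}

print toggle_markup('_/one *two*/ three_'), "\n";
# prints: <u><i>one <b>two</b></i> three</u>
```

Because nothing checks the surrounding characters, an input such as 'a_b' becomes 'a<u>b'.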
      Many thanks, :)

      ... but ...

      The test suite should also have included markup which must not be replaced.

      My fault sorry, I thought it's obvious by the $pre and $post regex.

      The markup must come in pairs and be embraced by special word boundaries.

      (whitespace or other markup or tag-brackets or ... depending on pre/post)

      Hence a _ inside a word is forbidden, which makes sense for joined_identifiers.

      I've updated the tests in Re: wiki regex reprocessing replacement (UPDATED^2) with markup to ignore.
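
The boundary rule described here can be illustrated with a small sketch that reuses $pre and $post from the parent post. The helper name mark_up and the sample strings are assumptions for illustration only; they are not the updated test cases:

```perl
use strict;
use warnings;

my %h    = ( '*' => 'b', '/' => 'i', '_' => 'u' );
my $pre  = qr/(^|\s|>)/;
my $post = qr/($|\s|<)/;

# Hypothetical helper: a single pass of the boundary-aware substitution.
# Markup only counts when it opens after $pre and closes before $post.
sub mark_up {
    my ($text) = @_;
    $text =~ s{ $pre ([_*/]) (.*?) \2 (?=$post) }{$1<$h{$2}>$3</$h{$2}>}xg;
    return $text;
}

print mark_up('snake_case and _real_'), "\n";   # snake_case and <u>real</u>
print mark_up('a*b*c'), "\n";                   # a*b*c (markup inside a word is ignored)
print mark_up('an _unpaired marker'), "\n";     # an _unpaired marker (no closing partner)
```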

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        > My fault sorry, I thought it's obvious by the $pre and $post regex.

        I noticed that they were of no use to the sample dataset so just applied Occam's Razor to them. Given the new(ly stated) criteria and the extra complications those entail I'm happy to wait for tybalt89's solution*.

        *which has appeared even as I was typing this. Top work!
