http://qs321.pair.com?node_id=11113016


in reply to Re: wiki regex reprocessing replacement
in thread wiki regex reprocessing replacement

Wow, thanks :)

And the test suite ++

> One thing I don't understand is the inclusion of > < characters in the pre- and post-markup tag delimiters

Because the repetitive solution with tf() needs to look past the tags inserted by previous runs.

*/_word_/* -> <b>/_word_/</b> -> <b><i>_word_</i></b> -> etc.

The recursive solution with rec() doesn't really need it, which is one of the reasons why I prefer this approach.
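To make the chain above concrete, here is a minimal sketch of a repeated-pass transformation. It is not the tf() from this thread: the name tf_sketch, the %TAG table and the exact guard conditions are assumptions made for illustration. The idea it shows is that a marker only counts if it is preceded by start-of-string, whitespace or '>' and followed by end-of-string, whitespace or '<', so a later pass can still pair markers that now sit directly against tags from an earlier pass.

```perl
use strict;
use warnings;

# Assumed mapping of wiki markers to HTML tags (hypothetical, for illustration).
my %TAG = ( '*' => 'b', '/' => 'i', '_' => 'u' );

# Hypothetical repeated-pass transformer; one substitution per loop iteration.
sub tf_sketch {
    my ($text) = @_;
    1 while $text =~ s{
        (?<![^\s>])       # before: start of string, whitespace or a tag end
        ([*/_])           # opening marker
        (\S.*?(?<=\S))    # body starts and ends with non-whitespace
        \1                # the same marker closes
        (?![^\s<])        # after: end of string, whitespace or a tag start
    }{<$TAG{$1}>$2</$TAG{$1}>}x;
    return $text;
}

print tf_sketch('*/_word_/*'), "\n";   # <b><i><u>word</u></i></b>
```

The slash inside a generated closing tag is always preceded by '<', so it fails the "before" guard and is never mistaken for an italic marker. Dropping '>' and '<' from the guards would stop the chain after the first pass.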

> probably because I'm not familiar with wikisyntax.

No, you are not wrong; there was information missing.

In this particular case the syntax is also meant to coexist with more verbose html tags.

There are cases where one doesn't want to have a whitespace in between neighboring tags.

Just compare Re^3: Good Intentions: Wikisyntax for the Monastery and the complaint about 'ARGV'<br> not expanding.

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

Re^3: wiki regex reprocessing replacement
by AnomalousMonk (Archbishop) on Feb 16, 2020 at 23:47 UTC

    You're welcome.

    Here are some test cases I've added since posting. I'd be very interested to hear your comments, especially as regards the "questionable" ones.

        '--- tests added 16feb20 after pm#11113014 post ---',
        '"failing" (i.e., no transformation) tests',
        [ '' => '', ],
        [ '*' => '*', ],
        [ '*_/' => '*_/', ],
        [ ' * _ / ' => ' * _ / ', ],
        [ '*fail/' => '*fail/', ],
        [ ' * fail / ' => ' * fail / ', ],
        'possibly questionable transformations',
        [ '__' => '<u></u>', ],
        [ ' __ ' => ' <u></u> ', ],
        [ '__ __' => '<u></u> <u></u>', ],
        [ ' __ __ ' => ' <u></u> <u></u> ', ],
        [ '____' => '<u></u><u></u>', '???' ],
        [ ' ____ ' => ' <u></u><u></u> ', '???' ],
        [ '______' => '<u></u><u></u><u></u>', '???' ],
        [ ' ______ ' => ' <u></u><u></u><u></u> ', '???' ],
        [ '________' => '<u></u><u></u><u></u><u></u>', '???' ],
        [ ' ________ ' => ' <u></u><u></u><u></u><u></u> ', '???' ],
        [ '__ __ __ __' => '<u></u> <u></u> <u></u> <u></u>', ],
        [ ' __ __ __ __ ' => ' <u></u> <u></u> <u></u> <u></u> ', ],

    > In this particular case the syntax is also meant to coexist with more verbose html tags.
    > There are cases where one doesn't want to have a whitespace in between neighboring tags.

    Can you supply some test cases for variations, especially WRT intermixtures with standard HTML?


    Give a man a fish:  <%-{-{-{-<

      > especially as regards the "questionable" ones

      Yes, sorry.

      I didn't want to overcomplicate the question, and just wrote .*? between the markup.

      Actually, I'm now using something like (\S.*?(?<=\S)) to enforce at least one non-whitespace character between the markers.
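A small sketch of what that body pattern does in practice, using bold only. The helper name bold_sketch is made up for this example; the surrounding \* markers and the <b> replacement are assumptions, and only the (\S.*?(?<=\S)) group comes from the post.

```perl
use strict;
use warnings;

# Hypothetical helper: bold only, with the body pattern quoted above.
sub bold_sketch {
    my ($text) = @_;
    # \S          first body character must be non-whitespace
    # .*?         as little as possible, never across a newline
    # (?<=\S)     last body character must be non-whitespace
    $text =~ s{\*(\S.*?(?<=\S))\*}{<b>$1</b>}g;
    return $text;
}

for my $in ( '*A*', '*A B*', '**', '* A*', '*A *', '*A *B*' ) {
    printf "%-10s -> %s\n", "'$in'", bold_sketch($in);
}
```

A single-character body such as '*A*' still matches, because the \S and the lookbehind can both be satisfied by the same character. An empty body ('**') or a body touching whitespace at either end is left alone.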

      The objective of the question was "How best to allow * / _ to be chained and/or nested".

      The recursive approach already does it pretty well.

      And actually, nesting these markups is of rather low priority on the to-do list.

      > Can you supply some test cases for variations, especially WRT intermixtures with standard HTML?

      That's my project: Wikisyntax for the Monastery =)

      JS-regex is mostly compatible with Perl4 regex.

      These are some tests I use ATM:

          sub is_tf {
              my ($in,$out,$label) = @_;
              is( rec( $in ) => $out => "$label: \t'$in'\t->\t'$out'" );
          }

          sub no_tf {
              my ($in,$label) = @_;
              is_tf($in,$in,$label);
          }

          no_tf( '**' => "no letter" );
          is_tf( '*A*' => '<b>A</b>' => "one letter");
          is_tf( '*A B*' ,'<b>A B</b>' , "multi word");
          no_tf( '* A*' , "before non-whitespace");
          no_tf( '*A *' , "after non-whitespace");
          no_tf( "*A\nB*" , "line break");
          is_tf( '*A *B*' ,'<b>A *B</b>' , "after non-whitespace prolonged");
          is_tf( '/**/' ,'<i>**</i>' , "nested no letter");
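For readers who want to run cases like these without the rest of the thread, here is one possible shape of a recursive transformer. It is only a sketch: rec_sketch and %TAG are hypothetical names, the marker-to-tag mapping is assumed, and the actual rec() may well differ. It pairs a marker with its next matching partner and then recurses into the body only, so generated tags are never rescanned.

```perl
use strict;
use warnings;

# Assumed mapping of wiki markers to HTML tags (hypothetical, for illustration).
my %TAG = ( '*' => 'b', '/' => 'i', '_' => 'u' );

# Hypothetical stand-in for rec(); not the implementation from the thread.
# The captures are copied into lexicals before recursing, since the nested
# substitution sets its own $1 and $2.
sub rec_sketch {
    my ($text) = @_;
    $text =~ s{([*/_])(\S.*?(?<=\S))\1}{
        my ($marker, $inner) = ($1, $2);
        "<$TAG{$marker}>" . rec_sketch($inner) . "</$TAG{$marker}>";
    }ge;
    return $text;
}

for my $in ( '**', '*A*', '* A*', '*A *B*', '/**/', '*/_word_/*' ) {
    printf "%-12s -> %s\n", "'$in'", rec_sketch($in);
}
```

Because the output of a replacement is not fed back into the outer match, no '>' or '<' guards are needed here. The sketch makes no attempt to skip markers inside existing HTML in the input, which is a separate problem.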

      Cheers Rolf