PerlMonks  

Re: How to enforce match priority irrespective of string position

by tybalt89 (Monsignor)
on Mar 08, 2021 at 00:49 UTC


in reply to How to enforce match priority irrespective of string position

Finally, a "sort of" test case :)

#!/usr/bin/perl

use strict; # https://perlmonks.org/?node_id=11129253
use warnings;

local $_ = <<END;
Point 1.3.4: A piece of text.

Point 1.3.5: A piece of text.

Point 1.3.6: Another piece of text. Point 1.3.6: For some reason this piece of text isn't finished yet.

Point 1.3.6: In fact, this piece of text even broke into a new line.

Point 1.3.7: Finally, a new piece of text.

END

my @parts;
push @parts, $& while /
  (Point\s[\d.]+:)   # capture the label, e.g. "Point 1.3.6:"
  .*?                # as little text as possible ...
  (?=Point|\z)       # ... up to the next "Point" or the end of the string,
  (?!\1)             # but only if what follows is not the same label again
  /gsx;

use Data::Dump 'dd'; dd \@parts;

Outputs four chunks, just like you asked for:

[
  "Point 1.3.4: A piece of text.\n\n",
  "Point 1.3.5: A piece of text.\n\n",
  "Point 1.3.6: Another piece of text. Point 1.3.6: For some reason this piece of text isn't finished yet.\n\nPoint 1.3.6: In fact, this piece of text even broke into a new line.\n\n",
  "Point 1.3.7: Finally, a new piece of text.\n\n",
]

Re^2: How to enforce match priority irrespective of string position
by Polyglot (Chaplain) on Mar 08, 2021 at 01:33 UTC

    And that method worked! (Though I've had to restructure a bit to accommodate, as that was not in a simple substitution form.) I don't mind doing whatever is necessary to get things working, though...so thank you very much! I'll certainly upvote this when I get my next day's rations.

    This part seems to be the crucial bit: (?=Point|\z) (?!\1). I find this sort of syntax confusing because it always seems to me that the "Point" here should have precedence over anything coming afterward in the regex sequence, in this case the "\1" backreference. If "Point" is already detected from the forward assertion, why can it be matched again (overlapped) by this reference, even if in the negative?

    Well, no complaints at the moment, certainly, as at least the script is now past this hurdle. Thank you.

    Blessings,

    ~Polyglot~

      Because (?= and (?! are ZERO-WIDTH assertions.
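
To make "zero-width" concrete, here is a minimal sketch. The sample string and variable names are made up for illustration and are not from the original question. A lookahead inspects the text at the current position without consuming any of it, so (?=Point|\z) and (?!\1) are two independent tests applied at the same spot, one after the other.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: the label "Point 1.3.6:" appears twice in a row.
my $text = "Point 1.3.6: one. Point 1.3.6: two. Point 1.3.7: three.";

# A pattern made only of a lookahead matches the empty string:
# it checks what comes next but consumes nothing.
$text =~ /(?=Point)/ or die "no match";
my $lookahead_width = $+[0] - $-[0];

# Two lookaheads in a row are both tested at the SAME position:
# "the text ahead starts with Point" AND
# "the text ahead does not start with Point 1.3.6:".
my @offsets;
push @offsets, $-[0] while $text =~ /(?=Point)(?!Point\s1\.3\.6:)/g;

# The same idea with a backreference, as in the node above:
# a chunk may only end where the NEXT label differs from the captured one.
my @parts;
push @parts, $& while $text =~ /(Point\s[\d.]+:).*?(?=Point|\z)(?!\1)/gs;

print "lookahead width: $lookahead_width\n";
print "offsets where both tests pass: @offsets\n";
print "chunk: [$_]\n" for @parts;
```

Only the offset of "Point 1.3.7:" passes both tests, which is why the two "Point 1.3.6:" pieces end up in a single chunk.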

        I appreciate knowing that, but while the assertion may be "zero-width," my mind still stumbles on the point that "Point" is certainly not zero-width. I'd always understood the "zero-width" aspect to be more related to the capturing and positioning of the match within the string. Are all look-arounds zero-width? If so, why must a look-behind be always of a specified length (width) that is not variable?
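
On the look-behind question, a small sketch may help; the sample strings are made up. Look-behinds are zero-width too: the text they inspect stays outside the match. As far as I understand it, the length restriction is about how far back the engine has to step before testing the assertion, not about how much text the match consumes. The \K escape (Perl 5.10 and later) is shown as one common way to get the effect of a variable-length look-behind.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "foobar";    # made-up sample

# Look-behind: "bar" must be preceded by "foo", but "foo" is only
# inspected, never consumed, so it is not part of the match.
$text =~ /(?<=foo)bar/ or die "no match";
my $matched = $&;       # the matched text
my $start   = $-[0];    # offset where the match begins

# A look-behind on its own matches the empty string, like a lookahead.
$text =~ /(?<=foo)/ or die "no match";
my $width = $+[0] - $-[0];

# \K discards everything matched so far from $&, so the prefix
# before it may have variable length (here \d+).
my $kept = '';
$kept = $& if "foo123bar" =~ /foo\d+\Kbar/;

print "matched '$matched' starting at offset $start\n";
print "bare look-behind width: $width\n";
print "kept after \\K: '$kept'\n";
```

In both cases the prefix has to be present, yet it never becomes part of $&.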

        Well, it may be that it's just too abstract for me.

        Thank you for your explanation.

        Blessings,

        ~Polyglot~
