http://qs321.pair.com?node_id=11129258


in reply to How to enforce match priority irrespective of string position

I have trouble fully grasping your intention, especially because your example text and your description overlap.

Could it be that you are looking for recursive parsing, where anything in "quotes" is not split at a period?

The Perl documentation (perldoc) has examples of how to implement this.


Re^2: How to enforce match priority irrespective of string position
by Polyglot (Chaplain) on Mar 07, 2021 at 12:24 UTC

    I am, of course, dealing with some exceptions in a body of text. The text has some irregularities, but it could be parsed correctly if only I were able to impose a strict ordering of match priority. It isn't an issue of quotes, nor is nesting involved; it's actually an issue of some potential "false positives" that must initially be skipped in favor of a preferred match. Only if that preferred match cannot be found might the "false positive" be the correct match. Does this make sense?

    Blessings,

    ~Polyglot~

        I sure was hoping someone would be able to suggest a regexp secret that I had not yet learned. I was hoping there would be some way of doing this. I may have to just pre-parse looking for the false positives, and exchange them temporarily for a marker of some sort before parsing a second time. I'm not even sure if that would work. I'll have to ponder that some more. I need to be able to reorder the sentences following a specific ruleset and in a specific order, by order of appearance in the sentence.

        Sigh. Too bad regex can't do everything!

        Blessings,

        ~Polyglot~
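
The placeholder idea described above could be sketched roughly as follows. This is only an illustration under loud assumptions: the real "false positive" patterns are not given in the thread, so `$false_positive` (periods after a few abbreviations), the helper name `split_protected`, and the use of a NUL byte as the marker are all hypothetical stand-ins.

```perl
use strict;
use warnings;

# Hypothetical "false positive": a period after one of these abbreviations
# should not end a sentence. Substitute whatever the real text requires.
my $false_positive = qr/\b(?:Dr|Mr|Mrs)\./;

sub split_protected {
    my ($text) = @_;
    my @stash;

    # Pass 1: swap each false positive for a numbered marker.
    # Assumption: the NUL byte (\x00) never occurs in the real text.
    $text =~ s/($false_positive)/push @stash, $1; "\x00" . $#stash . "\x00"/ge;

    # Pass 2: the remaining periods are unambiguous, so split on them.
    my @parts = split /\.\s*/, $text;

    # Pass 3: put the original text back into every piece.
    s/\x00(\d+)\x00/$stash[$1]/g for @parts;

    return @parts;
}

my @sentences = split_protected('Dr. Smith arrived. Mr. Jones left.');
print "$_\n" for @sentences;
```

Note that this sketch always protects the false positives. It does not implement the fallback where a false positive becomes the correct match because nothing better was found; that would need an extra check before pass 1.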

      I'd say use Hippo's SSCCE template (see Re: Matching a string in a parenthesized block (regex help)) to write some tests for
      • what you want and
      • what you don't want.
      This would certainly be beneficial for you too.

      Other than that, an alternation (|) whose first branch swallows a whole region can prioritize such regions, like "quoted" ones. Demo:

      DB<132> $_ = 'phrase. "phrase1.phrase2" phrase. phrase'
      0  'phrase. "phrase1.phrase2" phrase. phrase'
      DB<133> split /(".*?"|\.)/
      0  'phrase'
      1  '.'
      2  ' '
      3  '"phrase1.phrase2"'
      4  ' phrase'
      5  '.'
      6  ' phrase'
      DB<134>
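
For anyone who wants to run the debugger demo as a standalone script, a minimal version might look like this. The variable names `$text` and `@fields` are my own; the string and the pattern are taken from the session above.

```perl
use strict;
use warnings;

my $text = 'phrase. "phrase1.phrase2" phrase. phrase';

# The quoted branch is listed first, so wherever a quote begins, the whole
# "..." span is consumed before the \. branch is tried at that position.
# The capturing parentheses make split return the separators as well.
my @fields = split /(".*?"|\.)/, $text;

# Print each field in brackets so leading spaces stay visible.
print "[$_]\n" for @fields;
```

This prioritizes the quoted span only because its opening quote comes before the period inside it. The regex engine still takes the leftmost match first, and the order of the alternatives only decides between branches that could start at the same position. So this trick alone does not give priority "irrespective of string position".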