PerlMonks  

Is there an easy way to parse and modify regular expressions programmatically

by DrWhy (Chaplain)
on Jul 21, 2015 at 18:00 UTC ( [id://1135679]=perlquestion )

DrWhy has asked for the wisdom of the Perl Monks concerning the following question:

I have a system where a user provides a set of regular expressions and a set of replacement strings (what you put in the right side of the s/// operator). All of these 'rules' (not to be confused with Perl 6 rules) can be embedded inside each other. The problem with embedding is that it makes numeric capture group references difficult to manage. If you embed one rule in another and they both have capture groups, the numbering of the capture groups gets shifted around in the final combined regex. I'd like not to require the users to think about this problem, which means that the part of my preprocessing code that runs before actually creating the final regex needs to somehow figure out where all the capture groups are and renumber the references and backreferences appropriately. This does not seem to be an easy problem, but I'm hoping that there might be either a simple solution other than parsing the whole regex myself, or a good regex parser around that I could use to manage this problem. Does anyone here know of good tools for finding out where all the capture groups are inside a regular expression?
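
To make the numbering shift concrete, here is a minimal sketch. The rule patterns and the count_groups helper are made up for illustration and are not taken from the actual system. It shows a backreference changing meaning once a rule is embedded, and one way to count the groups in a pattern (via the $#+ variable) so that an offset could be computed.

```perl
use strict;
use warnings;

# Hypothetical inner rule: matches a doubled word such as "the the".
# On its own, \1 refers to the rule's own (\w+) group.
my $inner = '(\w+) \1';

# Hypothetical outer rule that adds a capture group in front of it.
my $outer = "(\\d+): $inner";

# After embedding, (\w+) has become group 2, but \1 still says "group 1",
# which is now the outer (\d+) group.
my $alone    = "the the"     =~ /^$inner$/ ? 1 : 0;   # 1: works standalone
my $embedded = "42: the the" =~ /^$outer$/ ? 1 : 0;   # 0: doubled word no longer matches
my $wrong    = "42: 42 42"   =~ /^$outer$/ ? 1 : 0;   # 1: \1 now repeats the number

# Count the capture groups in a pattern without parsing it: the empty
# alternative always matches, and $#+ then holds the number of groups
# in the regex that matched.
sub count_groups {
    my ($pattern) = @_;
    return "" =~ /|(?:$pattern)/ ? $#+ : 0;
}

my $inner_groups = count_groups($inner);   # 1
my $outer_groups = count_groups($outer);   # 2
```

Counting groups gives the size of the shift, but it does not say where in the pattern text each group or backreference sits, so it only solves part of the problem.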

--DrWhy

"If God had meant for us to think for ourselves he would have given us brains. Oh, wait..."

Replies are listed 'Best First'.
Re: Is there an easy way to parse and modify regular expressions programmatically
by Anonymous Monk on Jul 21, 2015 at 18:12 UTC

    I haven't worked with it myself, but maybe PPIx::Regexp is worth looking into. I suspect that similar to PPI, it probably doesn't support the full range of Perl features.

    However it sounds like you might want to be using named capture groups with unique names rather than plain numbered capture groups?
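
The named-capture suggestion above might look roughly like this. It is only a sketch: the rule patterns and the prefixing convention (word_, line_) are assumptions made for the example, and it needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical rules. Each rule prefixes its group names with its own
# rule name, so the names stay unique after embedding.
my $word_rule = '(?<word_w>\w+) \k<word_w>';
my $line_rule = "(?<line_num>\\d+): $word_rule";

my %captured;
if ("42: the the" =~ /^$line_rule$/) {
    # %+ maps group names to the matched text; copy it before the
    # next successful match overwrites it.
    %captured = %+;
}

# The \k<word_w> backreference still means "the doubled word",
# no matter how many groups the outer rule put in front of it.
my $rejects_number = "42: the 42" =~ /^$line_rule$/ ? 0 : 1;
```

With this approach a replacement string would refer to $+{word_w} rather than $1 or $2, so nothing has to be renumbered when rules are combined.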

Re: Is there an easy way to parse and modify regular expressions programmatically
by AnomalousMonk (Archbishop) on Jul 22, 2015 at 02:56 UTC

    In addition to named capture groups, already suggested, there are \gn and \g{n} numbered capture group backreferences, where a negative n counts backwards from the position of the backreference (relative) and a positive n is an absolute group number.
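
A short sketch of the relative form, using made-up rule patterns: \g{-1} refers to the nearest capture group that opened before it, so the inner rule keeps working after it is embedded.

```perl
use strict;
use warnings;

# Hypothetical inner rule: \g{-1} means "the closest capture group
# opened before this point", whatever its absolute number turns out to be.
my $inner = '(\w+) \g{-1}';

# Hypothetical outer rule that adds a group in front.
my $outer = "(\\d+): $inner";

my $alone    = "the the"     =~ /^$inner$/ ? 1 : 0;   # 1
my $embedded = "42: the the" =~ /^$outer$/ ? 1 : 0;   # 1: still a doubled word
my $shifted  = "42: the 42"  =~ /^$outer$/ ? 1 : 0;   # 0: does not pick up group 1
```

This only helps with backreferences inside the pattern. References such as $1 in a replacement string are still absolute and would still shift.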

    But I agree that the problem of having users define regexes using any regex construct under the sun and then be able to combine these pieces together in ways that will still "Do What You Want" is formidable. I think that, in the end, you will have to impose (substantial) limitations on what users can define. And yes, please give us some reasonable examples!


    Give a man a fish:  <%-(-(-(-<

Re: Is there an easy way to parse and modify regular expressions programmatically
by stevieb (Canon) on Jul 21, 2015 at 18:07 UTC

    Please give a couple-to-three examples of data, user-supplied regexes, how they get convoluted, and what your expected output should be.

    Are you working with consistent (format is exactly the same) incoming data?

Re: Is there an easy way to parse and modify regular expressions programmatically
by Laurent_R (Canon) on Jul 21, 2015 at 18:45 UTC

    I agree with stevieb: please provide some examples of your input data and desired output and, if possible, of your current code, with explanations of where it does not match your expectations.

    I think there are some possible solutions (named captures, for example), but we need to know more to figure out whether they are appropriate and fit your bill.
