PerlMonks  

A complex recursive regex,help

by OM_Zen (Scribe)
on Feb 06, 2003 at 00:40 UTC ( [id://233001]=perlquestion )

OM_Zen has asked for the wisdom of the Perl Monks concerning the following question:

Hi, Great Monks of the Monastery, please help with this:

my $a = "x|xx|xx|xxxx|xx|xxx|xx|xx|x";
$a =~ s/x\|xx\|x/x\|x x\|x/g;
print "[$a]\n";

This gives the output [x|x x|xx|xxxx|x x|xxx|x x|xx|x]

The string above contains the pattern x|xx|x, which has to be changed to x|x x|x (my intent is to change |xx| alone to |x x|, and not any other occurrences of xx). When I do this, a part of the string like x|xx|xx|x is changed to x|x x|xx|x. I understand that when the string is scanned and x|xx|x is found, it is changed to x|x x|x. But the match consumes the trailing x, so when two such patterns occur back to back, the scan has already moved past the start of the second one. Only the first occurrence is changed and the second is left as it was in the string.
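The consumed trailing x can be made visible by comparing where the plain pattern matches under /g with where it could match if nothing were consumed. This is only a sketch: the variable names are made up for illustration, and the lookahead wrapper is used purely as a way to list overlapping start positions.

```perl
use strict;
use warnings;

my $s = "x|xx|xx|xxxx|xx|xxx|xx|xx|x";

# Start offsets found by a normal /g scan: each match consumes its text,
# so the next search begins after the trailing x of the previous match.
my @consumed;
while ($s =~ /x\|xx\|x/g) {
    push @consumed, $-[0];
}

# Start offsets where the pattern could begin if matches may overlap.
# The lookahead is zero-width, so nothing is consumed.
my @overlapping;
while ($s =~ /(?=x\|xx\|x)/g) {
    push @overlapping, $-[0];
}

print "consumed:    @consumed\n";     # consumed:    0 11 18
print "overlapping: @overlapping\n";  # overlapping: 0 3 11 18 21
```

The offsets 3 and 21 are the two |xx| fields that the original substitution leaves untouched.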

I am asking for a way to do the pattern matching recursively, or with backtracking or something similar, so that the second occurrence is matched as well.

Alternatively, a colleague had a while loop going:

while($_ =~ s/x\|xx\|x/x\|x x\|x/g){ }
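For reference, the loop works because s///g returns the number of substitutions it made, which is false once a pass changes nothing. A small sketch on the sample string ($passes is a counter added here only for illustration):

```perl
use strict;
use warnings;

my $s = "x|xx|xx|xxxx|xx|xxx|xx|xx|x";

# Repeat the substitution until a pass makes no change.
# $passes counts only the passes that changed something.
my $passes = 0;
$passes++ while $s =~ s/x\|xx\|x/x|x x|x/g;

print "[$s] after $passes passes\n";
# [x|x x|x x|xxxx|x x|xxx|x x|x x|x] after 2 passes
```

Each pass rescans the whole string, which is why a single regex that does not consume the neighbouring characters is attractive.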


I have come to the monastery for an answer that does this recursively, or with some other pattern, within the regular expression itself.

20030208 Edit by Corion - changed text from within code to normal text.

Replies are listed 'Best First'.
Re: A complex recursive regex,help
by tachyon (Chancellor) on Feb 06, 2003 at 00:46 UTC

    You don't need recursion. Just use the positive look back and look forward assertions (lookbehind and lookahead), which are zero-width and so don't eat string. See perlman:perlre

    my $a = "x|xx|xx|xxxx|xx|xxx|xx|xx|x";
    $a =~ s/(?<=\|)xx(?=\|)/x x/g;
    print "[$a]\n";
    __DATA__
    [x|x x|x x|xxxx|x x|xxx|x x|x x|x]

    You don't say what you want to do with the edge cases:

    $a = 'xx|xx|xx';
    # should this be (use example above):
    #   [xx|x x|xx]
    # or should it be
    #   [x x|x x|x x]
    # in which case you will need to add an extra regex because you can't have
    # variable width lookbacks. this regex just processes those edge cases
    $a =~ s/^xx(?=\|)|(?<=\|)xx$/x x/g;
    # combined with the first part gives you some real perl line noise
    # if you want it all in a single regex
    $a =~ s/^xx(?=\|)|(?<=\|)xx(?=\|)|(?<=\|)xx$/x x/g;

    cheers

    tachyon
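    The two edge-case policies can be compared side by side on the same input. This is a sketch only; the variable names are illustrative, and the two regexes are the interior-only one and the combined one from the post above.

```perl
use strict;
use warnings;

my $input = 'xx|xx|xx';

# Policy 1: only fields with a pipe on both sides are changed.
(my $interior_only = $input) =~ s/(?<=\|)xx(?=\|)/x x/g;

# Policy 2: fields at the start and end of the string are changed too.
(my $with_edges = $input)
    =~ s/^xx(?=\|)|(?<=\|)xx(?=\|)|(?<=\|)xx$/x x/g;

print "[$interior_only]\n";  # [xx|x x|xx]
print "[$with_edges]\n";     # [x x|x x|x x]
```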

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      Hi Tachyon ,

      my $a = "x|xx|xx|xxxx|xx|xxx|xx|xx|x";
      $a =~ s/(?<=\|)xx(?=(\||$))/x x/g;
      # this is actually doing the regex edge cases too Tachyon


        Well *actually* it does not. Your test string does not *actually* contain the edge cases (xx at the beginning and end of the string) :o) If your test string had contained the edge cases you would no doubt have noted the following failure case: the leading xx is left unchanged.

        my $a = "xx|xx|xx|xxxx|xx|xxx|xx|xx|xx";
        $a =~ s/(?<=\|)xx(?=(\||$))/x x/g;
        print $a;
        __DATA__
        xx|x x|x x|xxxx|x x|xxx|x x|x x|x x

        Here is one way to fix your regex:

        # the (?: ... ) group is needed: without it the top-level | would make
        # the bare lookbehind an alternative on its own, matching empty strings
        $a =~ s/(?:(?<=\|)|^)xx(?=(\||$))/x x/g;
        # and here is a way using the zero width boundary assertion \b
        # that will match at | ^ or $ but also matches next to any non alphanumeric
        # so would potentially fail on '|xx,xx|' type strings
        # if they exist in practice as your real data is no doubt
        # not really xx.....
        $a =~ s/\bxx\b/x x/g;

        cheers

        tachyon
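        A short sketch of both points: an anchor and a lookbehind can be combined in one alternation as long as the alternation is grouped, and \b is looser than the pipe-based assertions. The '|xx,xx|' style input and the variable names are assumptions made up for illustration.

```perl
use strict;
use warnings;

my $s = "xx|xx|xx|xxxx|xx|xxx|xx|xx|xx";

# Grouped alternation: "start of string OR preceded by a pipe",
# then xx, then "followed by a pipe OR end of string".
(my $grouped = $s) =~ s/(?:^|(?<=\|))xx(?=\||$)/x x/g;
print "$grouped\n";   # x x|x x|x x|xxxx|x x|xxx|x x|x x|x x

# Hypothetical data where a field contains a comma.
my $csv = 'a|xx,xx|b';

# \b treats the comma as a boundary, so both xx runs are changed.
(my $boundary = $csv) =~ s/\bxx\b/x x/g;
print "$boundary\n";  # a|x x,x x|b

# The pipe-based version leaves the field alone, since neither xx
# is delimited by pipes (or string ends) on both sides.
(my $strict = $csv) =~ s/(?:^|(?<=\|))xx(?=\||$)/x x/g;
print "$strict\n";    # a|xx,xx|b
```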


      Hi Tachyon ,

      my $a = "x|xx|xx|xxxx|xx|xxx|xx|xx|x";
      $a =~ s/(?<=\|)xx(?=\|)/x x/g;
      print "[$a]\n";


      These are the backward and forward assertions that I needed, man. I have to learn more about this. THANKS A BUNCH, Tachyon, for the help.

        Glad to help. The only bummer is that you can't have variable width lookbacks (positive or negative) like (?<=x*) or (?<!x*). You can have variable width look forward assertions though: (?=x*) or (?!x*) are OK.....

        cheers

        tachyon
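        One possible workaround, assuming Perl 5.10 or later (newer than this thread), is \K: everything matched before \K is kept out of the replaced text, and that prefix may be variable width. This is a sketch rather than a drop-in replacement, because the prefix is still consumed, unlike a true lookbehind. The whitespace-padded input is invented for illustration.

```perl
use strict;
use warnings;

my $s = "x|xx|xx|xxxx|xx|xxx|xx|xx|x";

# Fixed-width case: \|\K behaves like (?<=\|) for this data, because
# the trailing pipe is only looked at, never consumed.
(my $fixed = $s) =~ s/\|\Kxx(?=\|)/x x/g;
print "[$fixed]\n";   # [x|x x|x x|xxxx|x x|xxx|x x|x x|x]

# Variable-width case: allow optional whitespace after the pipe,
# which a lookbehind like (?<=\|\s*) could not express.
my $padded = "x| xx |xx|  xx|x";
(my $kept = $padded) =~ s/\|\s*\Kxx(?=\s*\|)/x x/g;
print "[$kept]\n";    # [x| x x |x x|  x x|x]
```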


Re: A complex recursive regex,help
by BrowserUk (Patriarch) on Feb 06, 2003 at 00:55 UTC

    See how you get on with this?

    $s = "x|xx|xx|xxxx|xx|xxx|xx|xx|x";
    $s =~ s[(x\|x)(x\|x)([x\|]*)][$1 $2$3]g;
    print $s;
    x|x x|xx|xxxx|xx|xxx|xx|xx|x

    Examine what is said, not who speaks.

    The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

      Hi BrowserUk,

      My request was not clear, I think. Actually I need all the |xx| changed to |x x|, and so I have now used tachyon's answer with the backward and forward assertions, which I did not know much about before. I have seen a node of yours that uses code inside the regex and assigns $1 to a local variable within the regex itself, which was a lovely piece of regex, but I sure have to read up on and practice backward and forward assertions, and a few other regex concepts, thoroughly. THANKS A BUNCH, BrowserUk.



        Sorry. I misunderstood your requirement (though it could have been stated a little more clearly:^).

        This one (I think) does what you want, edge cases as well, and is relatively simple.

        $s = "xx|xx|xxxx|xx|xxx|xx|xx|xx";
        $s =~ s[(?<!x)(x)(x)(?!x)][$1 $2]g;
        print $s;
        x x|x x|xxxx|x x|xxx|x x|x x|x x

        Update: Thinking about it, stick with tachyon's, as this will fall over if your data contains any occurrence of xx without |'s or x's on either side.
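        A quick sketch of that failure mode, using a made-up input in which one xx sits between spaces rather than pipes:

```perl
use strict;
use warnings;

# Hypothetical data: the first xx is delimited by spaces, not pipes.
my $input = 'a xx b|xx|c';

# Negative lookarounds only require "not next to another x",
# so the space-delimited xx is changed as well.
(my $negative = $input) =~ s[(?<!x)(x)(x)(?!x)][$1 $2]g;
print "$negative\n";  # a x x b|x x|c

# Positive lookarounds require a pipe on both sides,
# so only the |xx| field is changed.
(my $positive = $input) =~ s/(?<=\|)xx(?=\|)/x x/g;
print "$positive\n";  # a xx b|x x|c
```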



Re: A complex recursive regex,help
by Enlil (Parson) on Feb 06, 2003 at 00:59 UTC
    TMTOWTDI
    my $a = "x|xx|xx|xxxx|xx|xxx|xx|xx|x";
    1 while ($a =~ s/\|xx\|/|x x|/);
    print "[$a]\n";
    __DATA__
    [x|x x|x x|xxxx|x x|xxx|x x|x x|x]

    but as tachyon mentioned: what to do with the edge cases? (this method is not optimal by any means.)

    -enlil
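    Another way to do it, not taken from any reply above and offered only as a sketch, is to skip lookarounds and treat the string as pipe-delimited fields. The helper name space_pairs is made up for illustration. Note the design choice: this version changes xx fields at the start and end of the string too, and also a lone 'xx' with no pipes at all, which the regexes above would leave alone.

```perl
use strict;
use warnings;

# Split on pipes, rewrite fields that are exactly 'xx', and rejoin.
# The -1 limit keeps trailing empty fields, so 'xx|' round-trips.
sub space_pairs {
    my ($line) = @_;
    return join '|',
        map { $_ eq 'xx' ? 'x x' : $_ }
        split /\|/, $line, -1;
}

print space_pairs("x|xx|xx|xxxx|xx|xxx|xx|xx|x"), "\n";
# x|x x|x x|xxxx|x x|xxx|x x|x x|x
print space_pairs("xx|xx|xx"), "\n";
# x x|x x|x x
```

Each field is examined once, so there is no rescanning and no question of overlapping matches.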
