PerlMonks  

regex for regex?

by shy2 (Initiate)
on Jun 26, 2006 at 22:11 UTC

shy2 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm not sure if there's an obvious solution to this, but is there a way to parse a script and find every regular expression in it? Thanks in advance.

Replies are listed 'Best First'.
Re: regex for regex?
by saintmike (Vicar) on Jun 26, 2006 at 22:35 UTC
Re: regex for regex?
by Solo (Deacon) on Jun 27, 2006 at 07:17 UTC
    Perhaps B::Concise helps?

    # in code.pl, for example
    if ( /test1/ ) { print; }
    my $re = qr/test2/;
    my @array = split( /test3/, $ARGV[0] );

    perl -MO=Concise,-exec code.pl | grep "</>"

    produces the output,

    code.pl syntax OK
    3  </> match(/"test1"/) s/RTIME
    9  </> qr(/"test2"/) s/64
    e  </> pushre(/"test3"/) s/64

    "</>" is the symbol for an OP with a regular expression. Someone smarter than I might tell you whether this will catch all the regex cases. If the regex is read in at runtime (with YAML or Storable, for example) I think B::Concise would tell you there was a regex involved, but not what it was.

    --Solo

    --
    You said you wanted to be around when I made a mistake; well, this could be it, sweetheart.

      I think B::Concise would tell you there was a regex involved, but not what it was.

      Yes. You're actually searching for the match operator (not regexp construction), so it doesn't matter how the regexp was constructed, as long as the match operator is in static code.

      >perl -MO=Concise -e "$re = eval 'qr/test3/'; '' =~ $re" | find "</>"
      -e syntax OK
      c  </> match() vKS

      Keep in mind that '' =~ $re is short for '' =~ /$re/.
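The B::Concise approach from this subthread can be wrapped in a small script. This is only a sketch under some assumptions: the helper name regex_ops and the temporary file standing in for code.pl are made up for illustration, and the exact op names in the output depend on the Perl version (older perls report pushre for split, newer ones do not).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the B::Concise output lines that carry the "</>" regex marker.
# (regex_ops is a hypothetical helper name, not part of B::Concise.)
sub regex_ops {
    my ($path) = @_;
    # $^X is the currently running perl; the list form of open avoids
    # shell quoting problems with "</>".
    open my $pipe, '-|', $^X, '-MO=Concise,-exec', $path
        or die "cannot run perl on $path: $!";
    my @hits = grep { m{</>} } <$pipe>;
    close $pipe;
    return @hits;
}

# Write the thread's sample code to a temporary file (stand-in for code.pl).
my ( $fh, $sample_path ) = tempfile( SUFFIX => '.pl', UNLINK => 1 );
print {$fh} <<'END_PERL';
if ( /test1/ ) { print; }
my $re = qr/test2/;
my @array = split( /test3/, $ARGV[0] );
END_PERL
close $fh;

my @ops = regex_ops($sample_path);
print @ops;
```

As noted above, this only sees ops in statically compiled code. A regex built at runtime shows up as a bare match() with no pattern text.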

Re: regex for regex?
by GrandFather (Saint) on Jun 26, 2006 at 22:38 UTC

    The mantra "Only Perl can parse Perl" applies here. Regexes can be hidden in so many ways, and things that look like regexes can appear in so many places, that finding all regexes in an arbitrary Perl script is a very large problem. A trivial search for =~ and !~ will find some (and may produce some false hits). A search for m/ and s/ may find some more. A search for [\s=][ms][-`~!@#$%^&*(){}[\]:";',.<>/?\\|] may find a few more. But any simplistic search will be unreliable.


    DWIM is Perl's answer to Gödel
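To make the unreliability concrete, here is a small sketch of a deliberately simplistic search. The pattern and the sample lines are assumptions chosen for illustration, not the exact searches suggested above. It reports one real regex and one comment, and misses three real regexes.

```perl
use strict;
use warnings;

# Deliberately simplistic: flag lines that contain =~, !~, m/ or s/.
sub naive_regex_lines {
    my (@lines) = @_;
    return grep { m{=~|!~|\b[ms]/} } @lines;
}

my @sample = (
    'my $ok = $name =~ /foo/;',          # real regex, found via =~
    '# rename with s/old/new/ later',    # only a comment: false hit
    'my @parts = split /,/, $line;',     # real regex, missed
    'if ( /test1/ ) { print; }',         # real regex, missed
    'my $re = qr{test2};',               # real regex, missed
);

my @found = naive_regex_lines(@sample);
print "$_\n" for @found;
```

Widening the pattern to catch the misses tends to add more false hits from strings, comments, and division operators.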
      The mantra "Only Perl can parse Perl" has been proven wrong by PPI.

        My understanding from the documentation for PPI is that it analyses Perl Documents in isolation, so I suspect there are simple cases where a regex is provided by a module and used in the Perl Code being analysed that PPI would not recognise. However, I don't have PPI available (it doesn't seem to be in ActiveState's ppm repositories), so I can't test that.

        Note that PPI doesn't claim to parse Perl Code, 'only' Perl Documents. I agree PPI very likely suffices for the OP's purpose, but it's not clear that PPI invalidates the mantra. :)


        DWIM is Perl's answer to Gödel
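For the PPI route, a minimal sketch might look like the following. It assumes PPI is installed from CPAN (it is not a core module) and reuses the code.pl sample from earlier in the thread as an inline string. It only sees regex literals in the document it is given, so regexes supplied by other modules or built at runtime are not covered.

```perl
use strict;
use warnings;
use PPI;    # CPAN module, assumed to be installed

my $source = <<'END_PERL';
if ( /test1/ ) { print; }
my $re = qr/test2/;
my @array = split( /test3/, $ARGV[0] );
END_PERL

my $doc = PPI::Document->new( \$source )
    or die "PPI could not parse the source";

# find() returns an array ref, or a false value when nothing matches.
my $found = $doc->find( sub {
    my ( $top, $elem ) = @_;
    return 1 if $elem->isa('PPI::Token::Regexp');             # m//, s///, tr///
    return 1 if $elem->isa('PPI::Token::QuoteLike::Regexp');  # qr//
    return 0;
} ) || [];

printf "line %d: %s\n", $_->line_number, $_->content for @$found;
```

Note that PPI::Token::Regexp also covers tr///, which is not really a regex, so a stricter search may want to skip PPI::Token::Regexp::Transliterate.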
