PerlMonks  

Another Parse::RecDescent Question

by sth (Priest)
on Nov 03, 2003 at 20:36 UTC ([id://304241])

sth has asked for the wisdom of the Perl Monks concerning the following question:

Hello Fellow Monks, I am trying to use Parse::RecDescent to parse SQL as well. I want to split each statement into two parts (placeholder / comment). Basically, if I have 'select * from foo where bar = ? --This is a comment', I want to capture 'select * from foo where bar = ?' and then '--This is a comment'. Here is the grammar:

    $parser = new Parse::RecDescent (q{
        startrule   : PlaceHolder Comment
                    | PlaceHolder

        Comment     : DD_Style

        DD_Style    : m{(\s+?--.*$)}
                      { print "Comment : \n\t", $item[1], "\n\n" }

        PlaceHolder : Xopen
                    | Sprintf

        Sprintf     : /.*=\s+?%s.*?/
                      { $item[1] =~ s{%s}{%%s}g;
                        print "After : \n\t", $item[1], "\n\n" }

        Xopen       : /.*=\s+?\?.*?/
                      { $item[1] =~ s{\?}{%s}g;
                        print "After : \n\t", $item[1], "\n\n" }
    }) unless $parser;

I set RD_TRACE on and see that it matches 'Select name from foo_name where id = ?', but it does not match the '--This is a Comment'. I know I am missing something obvious, but I can't find it.

Any help would be much appreciated!

STH

Replies are listed 'Best First'.
Re: Another Parse::RecDescent Question
by Abigail-II (Bishop) on Nov 03, 2003 at 21:56 UTC
    Just like many other parsers, PRD allows for optional whitespace between 'tokens'. So, after PRD has matched a token, it will first eat all whitespace before attempting to match another token. Therefore, if you don't change the whitespace eating behaviour, PRD will never match a token that starts with whitespace. And DD_Style is a token that starts with whitespace, the regex starts with \s+?.

    You can do one of two things. You can either make use of the <skip> directive (or some other means) to alter the 'eating of whitespace between tokens' behaviour, or you can remove the leading \s+? from the DD_Style token. I'd prefer to do the latter; after all, the comment starts with the --, and not with the whitespace.

    Abigail
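
The whitespace-eating behaviour described above can be modelled in plain Perl, without Parse::RecDescent itself. This is only a rough sketch: `token_matches_at` is a hypothetical helper, and the two skip patterns are assumptions standing in for the default skipping and for skipping switched off. It shows why a token regex that demands leading whitespace fails once the whitespace has already been consumed.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Parse::RecDescent: a rough model of
# "consume the skip pattern first, then match the token right there".
sub token_matches_at {
    my ($text, $offset, $token_re, $skip_re) = @_;
    pos($text) = $offset;                     # start where the previous token ended
    $text =~ /\G$skip_re/gc;                  # eat the skip pattern (may be empty)
    return $text =~ /\G$token_re/gc ? 1 : 0;  # token must match at the new position
}

my $sql    = 'bar = ? --This is a comment';
my $offset = length 'bar = ?';    # assumed end of the PlaceHolder match

my $default_skip = qr/\s*/;       # assumption: models the default whitespace skipping
my $no_skip      = qr/(?:)/;      # assumption: models skipping turned off

my @cases = (
    [ 'original  \s+?--.*  (default skip)', qr/\s+?--.*/, $default_skip ],
    [ 'Paladin   \s*?--.*  (default skip)', qr/\s*?--.*/, $default_skip ],
    [ 'Abigail   --.*      (default skip)', qr/--.*/,     $default_skip ],
    [ 'original  \s+?--.*  (no skip)',      qr/\s+?--.*/, $no_skip      ],
);

for my $case (@cases) {
    my ($label, $token_re, $skip_re) = @$case;
    my $ok = token_matches_at($sql, $offset, $token_re, $skip_re);
    printf "%-40s %s\n", $label, $ok ? 'matches' : 'fails';
}
```

In this model only the first case fails: the space before the -- is gone by the time the token regex runs, so \s+? has nothing left to match. That is also a plausible answer to where the space went in the \s*? variant.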

      I plan to change it. I was thinking in terms of regex, not tokens. This is my first PRD attempt, learning as I go....

      Thanks Abigail-II!

Re: Another Parse::RecDescent Question
by Paladin (Vicar) on Nov 03, 2003 at 21:20 UTC
    I'm no P::RD expert, but I did get your code to find the comment by changing the regex for DD_Style from
    m{(\s+?--.*$)}
    to
    m{(\s*?--.*$)}

    I'm not sure where the space before the -- went though.

      Thanks Paladin! That's what I needed: an objective set of eyes. I threw the ? in last because the PlaceHolder rule was matching the space. I should have looked more carefully; well, actually I stared at it too long :-). Thanks again.

Re: Another Parse::RecDescent Question
by allolex (Curate) on Nov 04, 2003 at 06:36 UTC

    I think Abigail-II has your answer, but please allow me to make a suggestion about RecDescent grammar style. You could probably improve the legibility of your grammar by slightly altering your grammar notation. The idea is to divide the expressions into groups which will represent the individual items in the returned array.

        my $prd_grammar = q(
            startrule   : PlaceHolder Comment
                        | PlaceHolder

            PlaceHolder : Xopen
                        | Sprintf

            Comment     : DD_Style

            DD_Style    : '--' /.*/
                          { print "Comment : \n\t" . $item[2] . "\n\n" }

            Sprintf     : '=' /.*/ '%s' /.*/
                          { print "After : \n\t%%s" . $item[4] . "\n\n" }

            Xopen       : /.*/ '?' /.*/
                          { print "After : \n\t%s", $item[3], "\n\n" }
        );

        new Parse::RecDescent($prd_grammar);

    Often just putting everything into one regex is fine, but your problem with whitespace made me think of this. This notational style allows you (mostly) to avoid explicitly accounting for spaces in your grammar.

    I don't think I like the '.*' notation here (maybe /\w+/?), but the notational style is the idea I'm trying to get across, so I'll leave your regexen up to you (mostly). :) There's always one more way to do it.

      This was just a start/experiment not a final version, but that is exactly the advice I'm looking for.

      Thank You allolex

Node Type: perlquestion [id://304241]
Approved by Ovid
Front-paged by broquaint