http://qs321.pair.com?node_id=462581


in reply to Order of Precedence in Parse::RecDescent grammar

Ideally, you'd want:

expr : bin_op_2 { $item[1] } bin_op_2 : bin_op_2 /[+-]/ bin_op_1 { [ @item[2, 1, 3] ] } | bin_op_1 { $item[1] } bin_op_1 : bin_op_1 /[*\\\/%]/ term { [ @item[2, 1, 3] ] } | term { $item[1] }

but the module doesn't allow left-recursive grammars. Here's how left-recursion is removed:

bin_op_2 : bin_op_1 bin_op_2_(s?) { treeify($item[1], map { @$_ } @{$item[2]}); } bin_op_1 : term bin_op_1_(s?) { treeify($item[1], map { @$_ } @{$item[2]}); } bin_op_2_ : /[+-]/ bin_op_1 { [ $item[1], $item[2] ] } bin_op_1_ : /[*\\\/%]/ term { [ $item[1], $item[2] ] }

Since it returns a list, we need to spend extra effort making it into a tree again. treeify is defined as:

sub treeify { my $t = shift; $t = [ shift, $t, shift ] while @_; return $t; }

Parse::RecDescent provides a shortcut in the form of <leftop>:

# Lowest precedence. bin_op_2 : <leftop: bin_op_1 SUM bin_op_1 > { treeify(@{$item[1]}); } bin_op_1 : <leftop: term PROD term > { treeify(@{$item[1]}); } # Highest precedence. SUM : '+' | '-' PROD : '*' | '/' | '\\' | '%'

Here's the final product (with a couple of operators added for demonstration purposes):

use strict; use warnings; use Data::Dumper (); use Parse::RecDescent (); my $grammar = <<'__EOI__'; { use strict; use warnings; sub treeify { my $t = shift; $t = [ shift, $t, shift ] while @_; return $t; } } parse : expr EOF { $item[1] } expr : assign { $item[1] } # Lowest precedence. assign : IDENT '=' assign { [ $item[2], $item[1], $item[3] ] } | log_or { $item[1] } log_or : <leftop: log_and LOG_OR log_and > {treeify(@{$item[1]})} log_and : <leftop: sum LOG_AND sum > {treeify(@{$item[1]})} sum : <leftop: prod SUM prod > {treeify(@{$item[1]})} prod : <leftop: term PROD term > {treeify(@{$item[1]})} # Highest precedence. term : '(' expr ')' { $item[2] } | NUMBER { [ $item[0], $item[1] ] } # Tokens NUMBER : /\d+/ IDENT : /\w+/ LOG_OR : '||' LOG_AND : '&&' SUM : '+' | '-' PROD : '*' | '/' | '\\' | '%' EOF : /^\Z/ __EOI__ # $::RD_HINT = 1; # $::RD_TRACE = 1; my $parser = Parse::RecDescent->new($grammar) or die("Bad grammar\n"); foreach ( '1*2*3', '1%2*3', '1+2*3', 'a=b=c=1', ) { print(Data::Dumper->Dump([ $parser->parse($_) ], [ $_ ])); print("\n"); }

You could use the following for assignment, but I think it's slower:

sub treeify_r { my $t = pop; $t = [ pop, pop, $t ] while @_; return $t; } assign : <rightop: IDENT ASSIGN assign > { treeify_r(@{$item[1]}); } ASSIGN : '='

Update: Changed
expr : bin_op_5 { $item[1] }
to
expr : assign { $item[1] }

Changed
'\'
to
'\\'

Replies are listed 'Best First'.
Re^2: Order of Precedence in Parse::RecDescent grammar
by suaveant (Parson) on Jun 02, 2005 at 14:26 UTC
    This seems to work... I am guessing it is basically the shortcut way to do what blokhead proposed... I was able to get it to return a little less depth in the tree, though it isn't quite perfect.

    Here is my grammar, for posterity's sake.

    my $grammar = <<'GRAMMAR' { use strict; use warnings; sub treeify { my $t = shift; # my($l,$r) = (shift,shift); while(@_) { my($l,$r) = (shift,shift); $t = [ 'op',$l, (ref $t eq 'ARRAY' && !$#$t ? @$t : $t), (ref + $r eq 'ARRAY' && !$#$r ? @$r : $r) ] } return $t; } } startrule: bin_op bin_op: comp_op # Lowest precendance. comp_op: <leftop: add_op COMP add_op> { treeify(@{$item[1]}); } add_op: <leftop: prod_op SUM prod_op > { treeify(@{$item[1]}); } prod_op : <leftop: mod_op PROD mod_op > { treeify(@{$item[1]}); } # Highest precendance. mod_op : <leftop: term MOD term > { treeify(@{$item[1]}); } SUM : '+' | '-' PROD : '*' | '/' MOD : '%' COMP : />=?|<=?|!=|==?|le|ge|eq|ne|lt|gt/ term: function | '(' bin_op ')' { $item[2] } | number | string if: /if/i '(' list_bin_op list_bin_op bin_op ')' { ['func',@item[1,3 +..5]] } concat: /concat/i '(' list_bin_op(s) bin_op ')' { ['func',$item[1],@ +{$item[3]},$item[4]] } left: /left/i '(' list_bin_op bin_op ')' { ['func',@item[1,3,4]] } right: /right/i '(' list_bin_op bin_op ')' { ['func',@item[1,3,4]] } ifnull: /ifnull/i '(' list_bin_op bin_op ')' { ['func',@item[1,3,4]] + } function: if | concat | left | right | ifnull number: /[+-]?\d+/ { $item[1] } string: { $_ = extract_quotelike($text); chop; substr($_,0,1,''); $_ +; } { [@item] } #$thisparser->startrule($item[2],1,$type) list_bin_op: bin_op ',' { $item[1] } GRAMMAR ;
    This is actually parsing MySQL-like statements, so there is a bit in there for some function definitions, seems to work, though. Thanks all!

                    - Ant
                    - Some of my best work - (1 2 3)

      You're giving % higher precedence than *. Are you sure that's correct? According to your grammar, 4*2%3 is equivalent to 4*(2%3) (8) rather than (4*2)%3 (0).

      The words op and func are probably redundant. Do you really need to know that something is a op rather than specifically an addition, substratction, multiplication, etc.?

      Does extract_quotelike remove the quotes? If so, you have a problem if someone create a string called op or func. That's why I had term in the tree. This is the only level you removed, and it hurts you.

      Update: Oops, I'm mistaken on that last one. If it's a ref, it's an op or func. It's it's not, it's a number or string. It's not very orthogonal to flattern terms, but it works for this grammar.

        Oh.. my bad... apparently they are equal in precedence... oops. Oh well... that wasn't the only problem, so it was still useful :)

                        - Ant
                        - Some of my best work - (1 2 3)