Re: Order of Precedence in Parse::RecDescent grammar

in reply to Order of Precedence in Parse::RecDescent grammar

Ideally, you'd want:

   expr     : bin_op_2 { $item[1] }

   bin_op_2 : bin_op_2 /[+-]/ bin_op_1  { [ @item[2, 1, 3] ] }
            | bin_op_1                  { $item[1] }

   bin_op_1 : bin_op_1 /[*\\\/%]/ term  { [ @item[2, 1, 3] ] }
            | term                      { $item[1] }
[download]

but the module doesn't allow left-recursive grammars. Here's how left-recursion is removed:

   bin_op_2  : bin_op_1 bin_op_2_(s?)
               { treeify($item[1], map { @$_ } @{$item[2]}); }

   bin_op_1  : term bin_op_1_(s?)
               { treeify($item[1], map { @$_ } @{$item[2]}); }

   bin_op_2_ : /[+-]/     bin_op_1 { [ $item[1], $item[2] ] }
   bin_op_1_ : /[*\\\/%]/ term     { [ $item[1], $item[2] ] }
[download]

Since it returns a list, we need to spend extra effort making it into a tree again. treeify is defined as:

      sub treeify {
         my $t = shift;
         $t = [ shift, $t, shift ]
            while @_;
         return $t;
      }
[download]

Parse::RecDescent provides a shortcut in the form of <leftop>:

   # Lowest precedence.
   bin_op_2 : <leftop: bin_op_1 SUM bin_op_1 >
              { treeify(@{$item[1]}); }

   bin_op_1 : <leftop: term PROD term >
              { treeify(@{$item[1]}); }
   # Highest precedence.

   SUM      : '+' | '-'
   PROD     : '*' | '/' | '\\' | '%'
[download]

Here's the final product (with a couple of operators added for demonstration purposes):

use strict;
use warnings;

use Data::Dumper      ();
use Parse::RecDescent ();


my $grammar = <<'__EOI__';

   {
      use strict;
      use warnings;

      sub treeify {
         my $t = shift;
         $t = [ shift, $t, shift ]
            while @_;
         return $t;
      }
   }

   parse   : expr EOF { $item[1] }

   expr    : assign { $item[1] }


   # Lowest precedence.
   assign  : IDENT '=' assign { [ $item[2], $item[1], $item[3] ] }
           | log_or           { $item[1] }
   log_or  : <leftop: log_and LOG_OR  log_and > {treeify(@{$item[1]})}
   log_and : <leftop: sum     LOG_AND sum     > {treeify(@{$item[1]})}
   sum     : <leftop: prod    SUM     prod    > {treeify(@{$item[1]})}
   prod    : <leftop: term    PROD    term    > {treeify(@{$item[1]})}
   # Highest precedence.

   term    : '(' expr ')'   { $item[2] }
           | NUMBER         { [ $item[0], $item[1] ] }


   # Tokens

   NUMBER   : /\d+/
   IDENT    : /\w+/

   LOG_OR   : '||' 
   LOG_AND  : '&&'
   SUM      : '+' | '-'
   PROD     : '*' | '/' | '\\' | '%'

   EOF      : /^\Z/

__EOI__


# $::RD_HINT  = 1;
# $::RD_TRACE = 1;

my $parser = Parse::RecDescent->new($grammar)

or die("Bad grammar\n");

foreach (
  '1*2*3',
  '1%2*3',
  '1+2*3',
  'a=b=c=1',
) {
   print(Data::Dumper->Dump([ $parser->parse($_) ],
                            [ $_                 ]));
   print("\n");
}
[download]

You could use the following for assignment, but I think it's slower:

      sub treeify_r {
         my $t = pop;
         $t = [ pop, pop, $t ]
            while @_;
         return $t;
      }

   assign   : <rightop: IDENT ASSIGN assign >
              { treeify_r(@{$item[1]}); }

   ASSIGN   : '='
[download]

Update: Changed
expr : bin_op_5 { $item[1] }
to
expr : assign { $item[1] }

Changed
'\'
to
'\\'

Comment on Re: Order of Precedence in Parse::RecDescent grammar Select or Download Code

Replies are listed 'Best First'.
Re^2: Order of Precedence in Parse::RecDescent grammar by suaveant (Parson) on Jun 02, 2005 at 14:26 UTC
This seems to work... I am guessing it is basically the shortcut way to do what blokhead proposed... I was able to get it to return a little less depth in the tree, though it isn't quite perfect. Here is my grammar, for posterity's sake. my $grammar = <<'GRAMMAR' { use strict; use warnings; sub treeify { my $t = shift; # my($l,$r) = (shift,shift); while(@_) { my($l,$r) = (shift,shift); $t = [ 'op',$l, (ref $t eq 'ARRAY' && !$#$t ? @$t : $t), (ref + $r eq 'ARRAY' && !$#$r ? @$r : $r) ] } return $t; } } startrule: bin_op bin_op: comp_op # Lowest precendance. comp_op: <leftop: add_op COMP add_op> { treeify(@{$item[1]}); } add_op: <leftop: prod_op SUM prod_op > { treeify(@{$item[1]}); } prod_op : <leftop: mod_op PROD mod_op > { treeify(@{$item[1]}); } # Highest precendance. mod_op : <leftop: term MOD term > { treeify(@{$item[1]}); } SUM : '+' \| '-' PROD : '*' \| '/' MOD : '%' COMP : />=?\|<=?\|!=\|==?\|le\|ge\|eq\|ne\|lt\|gt/ term: function \| '(' bin_op ')' { $item[2] } \| number \| string if: /if/i '(' list_bin_op list_bin_op bin_op ')' { ['func',@item[1,3 +..5]] } concat: /concat/i '(' list_bin_op(s) bin_op ')' { ['func',$item[1],@ +{$item[3]},$item[4]] } left: /left/i '(' list_bin_op bin_op ')' { ['func',@item[1,3,4]] } right: /right/i '(' list_bin_op bin_op ')' { ['func',@item[1,3,4]] } ifnull: /ifnull/i '(' list_bin_op bin_op ')' { ['func',@item[1,3,4]] + } function: if \| concat \| left \| right \| ifnull number: /[+-]?\d+/ { $item[1] } string: { $_ = extract_quotelike($text); chop; substr($_,0,1,''); $_ +; } { [@item] } #$thisparser->startrule($item[2],1,$type) list_bin_op: bin_op ',' { $item[1] } GRAMMAR ; [download] This is actually parsing MySQL-like statements, so there is a bit in there for some function definitions, seems to work, though. Thanks all! - Ant - Some of my best work - (1 2 3)	[reply] [d/l]
Re^3: Order of Precedence in Parse::RecDescent grammar by ikegami (Patriarch) on Jun 02, 2005 at 17:00 UTC
You're giving `%` higher precedence than ``. Are you sure that's correct? According to your grammar, `42%3` is equivalent to `4(2%3)` (8) rather than `(42)%3` (0). The words `op` and `func` are probably redundant. Do you really need to know that something is a op rather than specifically an addition, substratction, multiplication, etc.? Does `extract_quotelike` remove the quotes? If so, you have a problem if someone create a string called `op` or `func`. That's why I had `term` in the tree. This is the only level you removed, and it hurts you. Update: Oops, I'm mistaken on that last one. If it's a ref, it's an op or func. It's it's not, it's a number or string. It's not very orthogonal to flattern terms, but it works for this grammar.	[reply] [d/l] [select]
Re^4: Order of Precedence in Parse::RecDescent grammar by suaveant (Parson) on Jun 02, 2005 at 19:54 UTC
Oh.. my bad... apparently they are equal in precedence... oops. Oh well... that wasn't the only problem, so it was still useful :) - Ant - Some of my best work - (1 2 3)	[reply]

In Section Seekers of Perl Wisdom