Parsing CCL (Common Command Language) commands

e_bachmann has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to parse CCL commands like:

       dylan
       "bob dylan"
       dylan or zimmerman
       bob and (dylan or zimmerman)
[download]

I need to seach a database for each token and combine the results.

What the best approach for this task?
I've been looking at a lot of parsers, but can't find anything similar.
I've found a grammar for such a comman, but failed to build any perl code from this

Any surgestions?

:-) Erik Bachmann

Examle grammar:

  CCL-Find ::= CCL-Find Op Elements
             | Elements.

  Op ::= "and" | "or" | "not"
  -- The above means that Elements are separated by boolean operators.

  Elements ::= '(' CCL-Find ')'
             | Set
             | Terms
             | Qualifiers Relation Terms
             | Qualifiers Relation '(' CCL-Find ')'
             | Qualifiers '=' string '-' string
  -- Elements is either a recursive definition, a result set reference
+, a
  -- list of terms, qualifiers followed by terms, qualifiers followed
  -- by a recursive definition or qualifiers in a range (lower - upper
+).

  Set ::= 'set' = string
  -- Reference to a result set

  Terms ::= Terms Prox Term
          | Term
  -- Proximity of terms.

  Term ::= Term string
         | string
  -- This basically means that a term may include a blank

  Qualifiers ::= Qualifiers ',' string
               | string
  -- Qualifiers is a list of strings separated by comma

  Relation ::= '=' | '>=' | '<=' | '<>' | '>' | '<'
  -- Relational operators. This really doesn't follow the ISO8777
  -- standard.

  Prox ::= '%' | '!'
  -- Proximity operator
[download]

Comment on Parsing CCL (Common Command Language) commands Select or Download Code

Replies are listed 'Best First'.
Re: Parsing CCL (Common Command Language) commands by mugwumpjism (Hermit) on Oct 17, 2002 at 11:46 UTC
Left logical grammars such as you have listed are very mechanically converted to `recursive descent' parsers. You might want to check out Parse::RecDescent. You should be able to go straight from your grammar to a working parser. Otherwise, make yourself a `next_token()`, an `eat_token()` function, and to make things easier, a `save_position()` and `restore_position()` pair to allow you to easily put already eaten tokens back onto the front of the stream (not necessary with some grammars, see books on recursive descent and LL grammars for more) and write code like this: sub expect_Elements { my $stream = shift; my $expression; $p = save_position($stream); if (next_token() eq "(") { eat_token($stream); $expression = [ expect_CCL_Find($stream) ]; # It might make more sense to die() here if (eat_token() ne ")") { $expression = undef; } } else { if ($expression = expect_Set($stream)) { } elsif ($expression = expect_Terms($stream)) { } elsif ($expression = expect_Qualifiers($stream)) { if ($rel = expect_Relation($stream)) { if (next_token($stream) eq "(") { eat_token(); $expression = [ expect_CCL_Find($stream) ]; (eat_token() eq ")") or $expression = undef; } elsif ( my $terms = expect_Terms($stream) ) { $expression = [ $rel, $expression, $terms ]; } else { $expression = undef; } } elsif (next_token($stream) eq "=") { eat_token(); $string1 = expect_string($stream); $op = next_token($stream); $string2 = expect_string($stream); if (!defined $string1 or !defined $string2 or !defined $op or $op ne "-") { $expression = undef; } else { $expression = [ "=", $expression, $string1, "-", $ +string2 ]; } } else { $expression = undef; } } else { $expression = undef; } } if (defined $expression) { return $expression; } else { restore_position($stream, $pointer); return undef; } } [download] I think it should be fairly obvious why you'd want a module (or flex/yacc) to write this for you for larger grammers, but every programmer should write at least one recursive descent parser :-).	[reply] [d/l]
Re: Parsing CCL (Common Command Language) commands by perlcgi (Hermit) on Oct 17, 2002 at 12:01 UTC
The best approach IMHO is not to reinvent the wheel. As CCL is a standard, (ANSI/NISO Z39.58-1992), it is hard to believe that code is not already available somewhere. That said, the many implementations in library catalogs for example, are commercially sensitive and you may well have to roll your own. On the positive side, much of the work has been done for you, given that you have the grammar. I would check out Damian Conway's module Parse::RecDescent. The following are useful starting sources: Data Munging with Perl by davorg - I found this treatment of Parse::RecDescent particularly easy to digest The FAQ Damian Conway's TPJ article Best of luck, perlcgi	[reply]
Re:x2 Parsing CCL (Common Command Language) commands by grinder (Bishop) on Oct 17, 2002 at 12:18 UTC
Just keep in mind that Parse::RecDescent is painfully slow for anything beyond a toy. It's not for nothing that Damian Conway is planning on writing a Parse::FastDescent. In the meantime, you'll get good mileage out of Parse::Yapp, which basically redoes yacc(1) in Perl. print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'	[reply]
Re: Re:x2 Parsing CCL (Common Command Language) commands by e_bachmann (Novice) on Oct 18, 2002 at 09:50 UTC
Thank you for alle the surgestions! print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u' This REALLY says all about my relations to parsers! :-( I've gone for the YAPP and taken the Calp.yp example and replaced the grammer with the CCL grammer (leaving out the "Set" statement for now). Compiled it using: `yapp -v myParser.yp` [download] with one message: `1 shift/reduce conflict` [download] Then I've written a test: `use MyParser; $parser=new MyParser(); $value=$parser->YYParse(yylex => \&lexer_sub, yyerror => \&error_sub +); $nberr=$parser->YYNberr(); $parser->YYData->{DATA}= [ 'a and ( b or c)' ]; $data=$parser->YYData->{DATA}[0];` [download] All this gives is a: `Syntax error. Can't locate object method "new" via package "MyParser" (perhaps you f +orgot to l oad "MyParser"?) at myparser.pl line 2, <STDIN> line 1.` [download] How do I really activate this parser? How do I add my functions to process the searching? "If there are two or more ways to do something and one of them can lead to a disaster, then one or more persons will do it" Edgar A. Murphy, Captain in U.S. Air Force (1949)	[reply] [d/l] [select]
Re: Parsing CCL (Common Command Language) commands by JaWi (Hermit) on Oct 17, 2002 at 11:57 UTC
Well, you could use Yapp or YALALR to create a LALR parser for you according to a specific (Yacc like) grammar. Since yours is already clearly defined in BNF there shouldn't be much trouble converting it. Since you didn't specify a particular database, I assume you're free to choose. In that case I would suggest MySQL, since it has this nice RegExp feature, where you can query according to some (simplified) regular expression. This would allow you to convert your parse tree into a valid (MySQL) regexp. Simply select everything which looks like your regexp and your finished. Hope it helps (a bit), -- JaWi "A chicken is an egg's way of producing more eggs."	[reply]

Back to Seekers of Perl Wisdom