Re: Perl not BNF-able??

Replies are listed 'Best First'.
Re^2: Perl not BNF-able?? by anonymized user 468275 (Curate) on Jul 01, 2005 at 11:45 UTC
The problem with 'Try it' is that it's an all or nothing proposition. For lisp, the BNF is small enough not to worry about that. I can see that you can't wrap up a piece of non-context-free syntax in isolation. However, what I expected was that it might be possible to describe different syntax-elements for each type of contextual variation. I had a bad feeling about eval from the start, but I will still need to see where the process breaks down when worked through from first principles before I understand exactly what will go wrong (assuming the context-free issue is insurmountable, that is). One world, one people	[reply]
Re^3: Perl not BNF-able?? by dave_the_m (Monsignor) on Jul 01, 2005 at 12:30 UTC
A little warning. The yacc parser config in perl (perly.y) is 763 lines long, while the lexer (toke.c) is 10998 lines long. That should be a clue as to how much special-case code perl needs that the parser can't handle itself. Dave.	[reply]
Re^4: Perl not BNF-able?? by chromatic (Archbishop) on Jul 01, 2005 at 16:42 UTC
I was in a conversation the other day where Larry said that the reason he did that was to be able to say that Perl 5 had a smaller parser than did Perl 4. Now that he's refactoring the tokenizer, I think he regrets that choice somewhat.	[reply]
Re: Perl not BNF-able?? by jonadab (Parson) on Jul 02, 2005 at 21:58 UTC
For lisp, the BNF is small enough not to worry about that Well, yeah, lisp barely even has syntax. That's the opposite end of the spectrum. I can see that you can't wrap up a piece of non-context-free syntax in isolation. However, what I expected was that it might be possible to describe different syntax-elements for each type of contextual variation. Yes, in principle. Perl can be parsed, obviously, because perl does parse it, QED. But the question is, what is required to make that happen? Perl has both left-associative and right-associative operators, for instance (although Perl5 is less whole-hog in this regard than Perl6 will be). Perhaps the larger issue, though, is that there are various things that can change the rules for how to parse other things, and they don't all come right before the thing whose interpretation they change. Just for one little example, consider the /x modifier, which changes the interpretation of whitespace and number signs that occur in the middle of the regex, before it. For a Turing-complete parser, this is not a problem, because you just scan ahead and see what's coming. I'm not a BNF guru, but if it has a concept of lookahead, I'm not aware of it. Another problem is that the way BNF likes to handle things is to lump cases together and treat them as the same, but this presents serious problems when parsing Perl code, especially as regards the various quoting constructs, especially the quoting constructs related to pattern matching, because of the number of special cases. In the example above, a typical BNF approach would be to lump the allowed regular expression modifiers together (by defining a rule that can match any of them and then using that rule at given points in various larger rules), but as the example above illustrates, you can't do that, because a regular expression modifier can change the rules for parsing the regular expression that precedes it. It also can change the interpretation of what is parsed; /s and /m do that. BNF is not troubled by that wrinkle, since it is not concerned as much with semantics as it is with syntax -- but that is BNF's whole problem when it comes to Perl, because the syntax and the semantics are not fully separable: the parser needs (some) knowledge of the semantics in order to sort out the syntax. "In adjectives, with the addition of inflectional endings, a changeable long vowel (Qamets or Tsere) in an open, propretonic syllable will reduce to Vocal Shewa. This type of change occurs when the open, pretonic syllable of the masculine singular adjective becomes propretonic with the addition of inflectional endings." — Pratico & Van Pelt, BBHG, p68	[reply]
Re^2: Perl not BNF-able?? by TimToady (Parson) on Jul 03, 2005 at 02:02 UTC
Well, you can claim that Perl 6 is going whole hogger on context sensitivity than Perl 5, but the fact is that Perl 6 is cleaning up all those silly post-declarational switches you're carping about, and forcing them to be predeclarations (except for `/x`, which is going away entirely because it's mandatory). Left-to-right parsing is darn near mandatory in Perl 6. Lookahead dependencies are generally limited to one token. About the only exception I can think of offhand is deciding whether curlies are composing a hash or declaring a closure, and this is handled by thinking of both of them as closures, but then forcing an evaluation-time call on the closure if semantic analysis determines it's a hash composer. As for `/e`, that's also gone in favor of generalized expression interpolation.	[reply] [d/l] [select]
Re: Perl not BNF-able?? by jonadab (Parson) on Jul 04, 2005 at 12:40 UTC
Re^2: Perl not BNF-able?? by anonymized user 468275 (Curate) on Jul 04, 2005 at 08:48 UTC
This 471967 is the best explanation so far of why perl defies BNF although some other posts arguably provide better insight into the problems of parsing perl, which of course is also an important motivation for my OP. Yes I chose lisp in my comment as a contrary example because its BNF is about the simplest. Here is one version of it: http://cui.unige.ch/db-research/Enseignement/analyseinfo/LISP/BNFlisp.html One world, one people	[reply]
Re^3: Perl not BNF-able?? by QM (Parson) on Jul 05, 2005 at 20:57 UTC
Re: Perl not BNF-able?? by jonadab (Parson) on Jul 02, 2005 at 23:14 UTC
Oh, one other thing... I had a bad feeling about eval from the start If you think eval is problematic, consider the /e substitution modifier, which may be combined with nonstandard delimiters... `s ((foo))[{ some_possibly_lengthy_chunk_of_code_here(); somefunc($1) }]ex;` [download]	[reply] [d/l]


P is for Practical
	PerlMonks