http://qs321.pair.com?node_id=184966

ichimunki has asked for the wisdom of the Perl Monks concerning the following question:

Below is my code. I am happy with my grammar, except that when a token is surrounded by punctuation, the token's delimiters get slurped into the punctuation. I want the third line of $text to result in ... <punct: ..><link: spaced><punct: ..> ..., not in ... <punct: ..[[><word: spaced><punct: ]]> ... (as happens currently). Does anybody know how I can fix it? Also, if my grammar looks screwy, or there are other things I can do to improve this, let me know. I am just getting my arms around the basics of this very cool module, so pointers will be appreciated. Thanks!
#!/usr/bin/perl -w
use strict;
use Parse::RecDescent;

my $grammar = join( '', <DATA> );
my $parser = Parse::RecDescent->new( $grammar ) or die "Error: Bad grammar\n";

#while(<STDIN>){ $text .= $_; }
my $text =<< "SUB_STDIN";
A nicely [[spaced]] link.
A poorly[[spaced]]link.
Another poorly..[[spaced]]..link.
SUB_STDIN

my $results = $parser->startrule( $text ) or die "Error: Bad text\n";

__DATA__
startrule: <skip:''> bit(s)
bit: eol | word | space | token | punct
eol: /\n[ \t]*/ {print "<newline>\n" }
space: /[ \t]+/ {print "< >" }
word: /[\w\']+/ {print "<word: $item[1]>" }
punct: /[^\w\s]+/ {print "<punct: $item[1]>" }
token: link
link: /\[\[(.+?)\]\]/ {print "<link: $item[1]>" }
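One likely cause, offered as a sketch rather than a confirmed fix: at the position of the leading "..", the link pattern cannot match, so the parser falls through to punct, and /[^\w\s]+/ greedily takes "..[[" because "[" is neither a word character nor whitespace. A negative lookahead that stops punct in front of "[[" avoids this. The code below reproduces the alternation order of the bit rule with plain core-Perl regexes, so it runs without Parse::RecDescent. The tokenize helper is hypothetical and not part of the original code. The link branch uses the capture group so that only the text inside the brackets is reported.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): walk the text with \G
# and try the alternatives in the same order as the `bit` rule:
# eol | word | space | token | punct.
sub tokenize {
    my ($text) = @_;
    my $out = '';
    pos($text) = 0;
    while ( pos($text) < length $text ) {
        if    ( $text =~ /\G\n[ \t]*/gc )      { $out .= "<newline>\n" }
        elsif ( $text =~ /\G([\w']+)/gc )      { $out .= "<word: $1>" }
        elsif ( $text =~ /\G[ \t]+/gc )        { $out .= "< >" }
        elsif ( $text =~ /\G\[\[(.+?)\]\]/gc ) { $out .= "<link: $1>" }
        # Assumed fix: consume punctuation one character at a time, but
        # refuse to step onto the start of a "[[" link opener.
        elsif ( $text =~ /\G((?:(?!\[\[)[^\w\s])+)/gc ) { $out .= "<punct: $1>" }
        else  { die "Error: Bad text at offset " . pos($text) . "\n" }
    }
    return $out;
}

print tokenize("Another poorly..[[spaced]]..link.\n");
```

For the third line of $text this prints <word: Another>< ><word: poorly><punct: ..><link: spaced><punct: ..><word: link><punct: .><newline>. In the grammar itself, the corresponding changes would presumably be punct: /(?:(?!\[\[)[^\w\s])+/ plus an action in link that extracts the bracketed text from $item[1]; I have not run that against Parse::RecDescent, so treat it as an assumption. Note that an unclosed "[[" now matches neither link nor punct, so the sketch dies there.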