PerlMonks  

Re: Re: Re: Re: parsing an ASP file

by dada (Chaplain)
on May 25, 2004 at 09:21 UTC ( [id://356148]=note )


in reply to Re: Re: Re: parsing an ASP file
in thread parsing an ASP file

well, your code certainly looks good, but it seems to get the line count wrong. on a simple ASP page of mine I get these results (block type and starting line, my code on the left, yours on the right):

mine      yours
HTM  1    HTM  1
ASP 31    ASP  1
HTM 31    HTM 31
ASP 44    ASP 31
HTM 46    HTM 46
ASP 50    ASP 46
HTM 50    HTM 50
ASP 55    ASP 50
HTM 59    HTM 59
ASP 73    ASP 59
HTM 75    HTM 75

that is, it counts correctly for HTM blocks, but doesn't increment the line number for ASP blocks. I tried moving the line $line += ... before the push, but it didn't help.

cheers,
Aldo

King of Laziness, Wizard of Impatience, Lord of Hubris

Replies are listed 'Best First'.
Re: Re: Re: Re: Re: parsing an ASP file
by Juerd (Abbot) on May 25, 2004 at 14:47 UTC

    seems to be failing line count.

    You're right. Because the regex can match a block of HTML and a block of ASP in one go, $line has to be updated in between the two, not just once per match. So I removed the extra set of parens and the counter line again and added two new \n-counters: one for $1 and one for $2.
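A minimal sketch of the counting idiom described above, assuming the two captures are already in plain variables. The helper name `count_newlines` and the sample strings are made up for illustration; they are not part of the posted code.

```perl
use strict;
use warnings;

# tr/\n// with an empty replacement list and no /d modifier changes nothing;
# it only returns how many newlines the string contains.
sub count_newlines {
    my ($text) = @_;
    return defined $text ? ($text =~ tr/\n//) : 0;
}

# Hypothetical captures from a single match: an HTML chunk followed by the
# ASP chunk right behind it.
my ($html, $asp_code) = ("<p>\nhello\n", " x = 1\n y = 2\n");

my $line = 1;
my $html_start = $line;               # the HTML chunk starts on line 1
$line += count_newlines($html);       # advance past the HTML chunk
my $asp_start = $line;                # only now is the ASP start line known
$line += count_newlines($asp_code);   # advance past the ASP chunk

print "html starts on line $html_start, asp on line $asp_start, next block on line $line\n";
```

If the counter were advanced only once per match, the ASP chunk would be recorded with the start line of the HTML chunk in front of it, which is the symptom in the table above.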

    my @parsed;
    my $line = 1;
    while ($asp =~ /\G((?: [^<]+ | <(?!%) )*) (?: <%(.*?)%> | ((?=<%)) )?/gsx) {
        # $1 is the HTML in front of the block, $2 the ASP code itself.
        $1 and push @parsed, [ $line, html => $1 ];
        $line += $1 =~ tr/\n//;
        $2 and push @parsed, [ $line, asp => $2 ];
        $line += $2 =~ tr/\n//;
        # $3 is only defined when a <% follows that has no closing %>.
        defined $3 and die "Unclosed ASP code block starting on line $line near '",
            $asp =~ /\G(<%\s*\n?.*)/g, "'.\n";
    }
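To check the line numbers by hand, here is a small self-contained sketch that wraps the same regex in a sub and runs it on a five-line sample. The sub name `parse_asp`, the sample document and the copying of the captures into lexicals are assumptions made for this example. It also differs from the loop above in two small ways: it tests captures with `defined`/`length` rather than truthiness, so a block consisting of just "0" is kept, and its error message leaves out the "near" context.

```perl
use strict;
use warnings;

# Hypothetical wrapper: returns a list of [ start_line, type, text ] blocks.
sub parse_asp {
    my ($asp) = @_;
    my @parsed;
    my $line = 1;
    while ($asp =~ /\G((?: [^<]+ | <(?!%) )*) (?: <%(.*?)%> | ((?=<%)) )?/gsx) {
        # Copy the captures first so later operations cannot disturb them.
        my ($html, $code, $unclosed) = ($1, $2, $3);
        if (defined $html and length $html) {
            push @parsed, [ $line, html => $html ];
            $line += $html =~ tr/\n//;    # count newlines in the HTML part
        }
        if (defined $code) {
            push @parsed, [ $line, asp => $code ];
            $line += $code =~ tr/\n//;    # count newlines in the ASP part
        }
        die "Unclosed ASP code block starting on line $line\n"
            if defined $unclosed;
    }
    return @parsed;
}

# Made-up sample: ASP blocks open on lines 2 and 5.
my $sample = "<html>\n<% a = 1\n b = 2 %>\n<p>text</p>\n<% c %>\n";
my @blocks = parse_asp($sample);

printf "%-4s %d\n", uc $_->[1], $_->[0] for @blocks;
```

For this sample the two ASP blocks should be reported at lines 2 and 5. An HTML block that begins with the newline right after a `%>` is reported on the line where that `%>` sits, since that is where its first character is.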

    Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

      /\G((?: [^<]+ | <(?!%) )*) (?: <%(.*?)%> | ((?=<%)) )?/gsx

      Just for fun, I wanted to try to make a Perl 6 rule from this. Here it is, untested and mostly guessed. I have no idea how to do line numbers, so I cheated and imagined a method of .pos for that :)

      rule code_begin ($type) {
          <{ { asp => '<%', php => '<?', plp => '<:' }.{$type}
             // fail "Unknown type: $type" }>
      }
      rule code_end ($type) {
          <{ { asp => '%>', php => '?>', plp => ':>' }.{$type}
             // fail "Unknown type: $type" }>
      }
      rule code_block ($type) {
          <code_begin $type> (.*?) <code_end $type>
      }
      rule code_document ($type) {
          [
              # First, match any number of subsequent code blocks.
              [
                  { $?line := .pos.line }
                  <code_block $type> ::
                  { push @?blocks, [ $?line, code => $?code_block ] }
              ]*

              # If there is now a code_begin, obviously that is an open block.
              # (In PLP, that is valid, but let's assume for now that it's not.)
              [
                  { $?line := .pos.line }
                  <code_begin $type> \n* $?context := (\N<,15>)
                  { fail "Unclosed code block on line $?line, near '$?context'" }
              ]?

              # And then a piece of text.
              # (At least one character, to avoid having empty blocks.)
              [
                  { $?line := .pos.line }
                  $?html := (.+?) [ <before <code_begin $type>> | $ ] ::
                  { push @?blocks, [ $?line, html => $?html ] }
              ]?
          ]*
      }

      my @parsed = ($asp ~~ /<code_document 'asp'>/).{blocks}

      Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }
