PerlMonks  

Re: Re: parsing an ASP file

by dada (Chaplain)
on May 20, 2004 at 10:12 UTC (#354910)


in reply to Re: parsing an ASP file
in thread parsing an ASP file

yep. one thing I forgot to mention is that, for the application I'm currently writing (which is basically an ASP cross-reference generator) I need to have the line number where each block appears. so, the code I'm using is something more like:
    sub get_asp_blocks {
        my($file) = @_;
        open(my $fh, '<', $file) or die "can't open '$file': $!\n";
        my $dot = 1;                          # current line number
        my @blocks = ( ["HTM", $dot, ""] );   # each block: [type, line, text]
        my $state = "HTM";
        my $last = "";                        # previous character read
        my $char;
        while(read($fh, $char, 1)) {
            $dot++ if $char eq "\n";
            if($last eq "<" && $char eq "%" && $state eq "HTM") {
                chop $blocks[-1][-1];         # drop the "<" already appended
                $state = "ASP";
                push(@blocks, ["ASP", $dot, ""]);
            } elsif($last eq "%" && $char eq ">" && $state eq "ASP") {
                chop $blocks[-1][-1];         # drop the "%" already appended
                $state = "HTM";
                push(@blocks, ["HTM", $dot, ""]);
            } else {
                $blocks[-1][-1] .= $char;
            }
            $last = $char;
        }
        close($fh);
        return @blocks;
    }
this way, each element of the returned array contains three elements: the type (ASP or HTM), the line number, and the block itself.

cheers,
Aldo

King of Laziness, Wizard of Impatience, Lord of Hubris
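
To illustrate the structure described above, here is a minimal sketch of how the returned list might be consumed. The sample data and the helper name asp_index are assumptions made up for this example; only the [type, line, text] layout comes from the post.

```perl
use strict;
use warnings;

# Hypothetical sample shaped like the return value of get_asp_blocks():
# each element is [ type, starting line, text ].
my @blocks = (
    [ "HTM", 1, "<html>\n<body>\n" ],
    [ "ASP", 3, " Response.Write(\"hi\") " ],
    [ "HTM", 3, "\n</body>\n</html>\n" ],
);

# asp_index is a hypothetical helper: it builds one "line N: code"
# entry per ASP block and ignores the HTM blocks.
sub asp_index {
    my @index;
    for my $block (@_) {
        my ($type, $line, $text) = @$block;
        next unless $type eq "ASP";
        $text =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        push @index, "line $line: $text";
    }
    return @index;
}

print "$_\n" for asp_index(@blocks);
```

For a cross-reference generator, the line number in the second slot is what lets each ASP block be reported against its position in the source file.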

Re: Re: Re: parsing an ASP file
by Juerd (Abbot) on May 23, 2004 at 22:57 UTC

    my $state = "HTM";

    The state is what I don't like. It means that everything needs to be done manually. So to get the line numbers, I'd probably just extend the regex with one set of all-enclosing parens (or for simple stand-alone scripts just use $&), and then count the number of \n characters found in it.

    my @parsed;
    my $line = 1;
    while ($asp =~ /\G( ((?: [^<]+ | <(?!%) )*) (?: <%(.*?)%> | ((?=<%)) )? )/gsx) {
        $2 and push @parsed, [ $line, html => $2 ];
        $3 and push @parsed, [ $line, asp => $3 ];
        defined $4 and die "Unclosed ASP code block starting on line $line near '",
            $asp =~ /\G(<%\s*\n?.*)/g, "'.\n";
        $line += $1 =~ tr/\n//;
    }

    Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

      well, your code looks surely good, but seems to be failing line count. on a simple ASP page of mine I get these results:

      mine      yours
      HTM  1    HTM  1
      ASP 31    ASP  1
      HTM 31    HTM 31
      ASP 44    ASP 31
      HTM 46    HTM 46
      ASP 50    ASP 46
      HTM 50    HTM 50
      ASP 55    ASP 50
      HTM 59    HTM 59
      ASP 73    ASP 59
      HTM 75    HTM 75

      that is, it counts correctly for HTM blocks, but doesn't increment the line number for ASP blocks. I tried moving the line $line += ... before the push, but it didn't help.

      cheers,
      Aldo

      King of Laziness, Wizard of Impatience, Lord of Hubris
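
The lag reported above can be reproduced on a tiny input. This is only a sketch: the sample page is made up, and the unclosed-block check from the original loop is left out for brevity.

```perl
use strict;
use warnings;

# Made-up three-line page: the ASP block opens on line 3.
my $asp = "<p>a</p>\n<p>b</p>\n<% x %>\n<p>c</p>\n";

my @parsed;
my $line = 1;
while ($asp =~ /\G( ((?: [^<]+ | <(?!%) )*) (?: <%(.*?)%> | ((?=<%)) )? )/gsx) {
    # One match can cover an HTML part and the ASP part that follows it.
    # Both pushes therefore see the same $line, which is only advanced
    # once the whole match has been handled.
    $2 and push @parsed, [ $line, html => $2 ];
    $3 and push @parsed, [ $line, asp  => $3 ];
    $line += $1 =~ tr/\n//;
}

printf "%-4s %d\n", $_->[1], $_->[0] for @parsed;
```

On this input the ASP block is reported on line 1, the line where the preceding HTML started, although it actually opens on line 3. The HTML block after it is reported on line 3, which is correct.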

        seems to be failing line count.

        You're right. Because the regex can match a block of html and a block of asp in one go, $line needs to be updated in between the two pushes. So I removed the extra set of parens and the counter line again and added two new \n-counters: one for $1 and one for $2.

        my @parsed;
        my $line = 1;
        while ($asp =~ /\G((?: [^<]+ | <(?!%) )*) (?: <%(.*?)%> | ((?=<%)) )?/gsx) {
            $1 and push @parsed, [ $line, html => $1 ];
            $line += $1 =~ tr/\n//;                    # count lines in the html part
            $2 and push @parsed, [ $line, asp => $2 ];
            $line += $2 =~ tr/\n// if defined $2;      # count lines in the asp part
            defined $3 and die "Unclosed ASP code block starting on line $line near '",
                $asp =~ /\G(<%\s*\n?.*)/g, "'.\n";
        }

        Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }
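
As a way to check the two-counter idea, the loop can be wrapped in a function and run on a small input. This is a sketch under assumptions: the name parse_asp, the sample page, and the shortened die message are not from the thread, and the captures are copied into lexicals first so that later code cannot disturb them.

```perl
use strict;
use warnings;

# parse_asp is a hypothetical wrapper around the two-counter loop.
# It returns a list of [ line, kind, text ] triples, kind being
# 'html' or 'asp'.
sub parse_asp {
    my ($asp) = @_;
    my @parsed;
    my $line = 1;
    while ($asp =~ /\G((?: [^<]+ | <(?!%) )*) (?: <%(.*?)%> | ((?=<%)) )?/gsx) {
        # Copy the captures before doing anything else with them.
        my ($html, $code, $unclosed) = ($1, $2, $3);
        if (defined $html && length $html) {
            push @parsed, [ $line, html => $html ];
            $line += ($html =~ tr/\n//);    # advance past the HTML part
        }
        if (defined $code && length $code) {
            push @parsed, [ $line, asp => $code ];
            $line += ($code =~ tr/\n//);    # advance past the ASP part
        }
        die "Unclosed ASP code block starting on line $line.\n"
            if defined $unclosed;
    }
    return @parsed;
}

# Made-up page: the ASP block opens on line 3 and closes on line 4.
my @parsed = parse_asp("<p>a</p>\n<p>b</p>\n<% x\ny %>\n<p>c</p>\n");
printf "%-4s %d\n", $_->[1], $_->[0] for @parsed;
```

With the counter updated between the two pushes, the ASP block is reported on line 3 and the HTML that follows it on line 4, which matches the character-by-character approach on this input.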
