Re: Clunky parsing problem, looking for simple solution

by talexb (Chancellor)
on Jun 17, 2002 at 03:56 UTC ( [id://175020]=note )


in reply to Clunky parsing problem, looking for simple solution

This is a little late, but I had so much fun writing it I thought I'd post it anyway.
#!/usr/bin/perl -w

# Something to decompose BASIC If-Then-Else statements into simpler
# statements, expanding IF .. THEN .. ELSE and IF .. THEN statements
# recursively as necessary, and also expanding sub-statements separated by
# ':' characters.
#
# In response to node 174889 on PerlMonks.

use strict;

my @Block = ();   # Block to hold output statements in reverse order.
my $Depth = 0;    # Variable to indicate our depth of recursion.

# Read DATA in from in-line data section, parse line and output result.

while (<DATA>)
{
  print $_ . "expands to:\n";
  my ( $LineNumber, $LineSource ) = m/^(\d+) (.+)$/;
  $Depth = 0;
  ParseStatement ( $LineSource );
  DumpSimpleEquivalent ( $LineNumber );
}

# Parse the BASIC statement, recursively if necessary.

sub ParseStatement
{
  my ( $ThisLine ) = @_;

  # This variable will reflect any additional lines added to the block
  # of statements.

  my $StatementCount = 0;

  # Check for the more difficult case of an IF .. THEN .. ELSE

  if ( $ThisLine =~ /^\s*IF(.+?)THEN(.+?)ELSE(.+?)$/ )
  {
    my ( @Tokens ) = ( $1, $2, $3 );
#   print "3 part statement: $Tokens[0]-$Tokens[1]-$Tokens[2].\n";

    # Only if we're at the first level do we need to output an end block.

    if ( $Depth == 0 )
    { push ( @Block, "REM End of If Then Else block\n\n" ); }

    # If there's another level of IF statement within, call self recursively.

    if ( $Tokens[2] =~ /IF/ )
    {
      my $LocalCount = 4;
      $Depth++;
      $LocalCount += ParseStatement ( $Tokens[2] );
      push ( @Block, "ELSE" );
      push ( @Block, "GOTO +$LocalCount" );  # To be fixed up later
    }
    else

    # Otherwise, handle normally: Split statement on ':', add them, the ELSE
    # statement and the GOTO to go around this block.

    {
      my $LocalCount = 3;
      my @SubStatements = split ( /:/, $Tokens[2] );
      $LocalCount += @SubStatements - 1;
      push ( @Block, @SubStatements );
      push ( @Block, "ELSE" );
      push ( @Block, "GOTO +$LocalCount" );  # To be fixed up later
    }
    push ( @Block, $Tokens[1] );
    push ( @Block, "IF$Tokens[0]THEN" );
  }

  # OK, it's not an IF .. THEN .. ELSE; try just an IF .. THEN. We don't need
  # to worry about recursion.

  elsif ( $ThisLine =~ /^\s*IF(.+?)THEN(.+?)$/ )
  {
    my ( @Tokens ) = ( $1, $2 );
#   print "2 part statement: $Tokens[0]-$Tokens[1].\n";

    if ( $Depth == 0 )
    { push ( @Block, "REM End of If Then Else block\n\n" ); }

    my @SubStatements = split ( /:/, $Tokens[1] );
    $StatementCount = @SubStatements - 1;
    push ( @Block, @SubStatements );
    push ( @Block, "IF$Tokens[0]THEN" );
  }
  return ( $StatementCount );
}

# Routine to dump the expanded (simplified) version of BASIC code.

sub DumpSimpleEquivalent
{
  my $LineNumber = shift;
  my @NewBlock = reverse @Block;
  my @SubLine = ( (0..9), ('a'..'z') );
  my $Index = 0;

  foreach ( @NewBlock )
  {
    # Update line numbering, delete leading space.

    $NewBlock[ $Index ] =~ s/\+(\d)/$LineNumber.".".$SubLine[ $Index+$1 ]/e;
    $NewBlock[ $Index ] =~ s/^\s+//;
    print "$LineNumber.$SubLine[ $Index ] $NewBlock[ $Index++ ]\n";
  }
  @Block = ();
}

__DATA__
7130 IF B3<>0 THEN PRINT "FROM E TO S":W1=B4:X=B5:GOTO 6920
9810 IF K$="Y" AND RND(1) <.5 THEN GOTO 9820 ELSE GOTO 9770
8020 IF SO=0 THEN SO=1 ELSE SO=0
4835 IF V$="K" THEN A$="+K+" ELSE IF V$="M" THEN A$="!M!" ELSE IF V$="R" THEN A$="?R?"
1850 IF K3=0 AND EX(Q1,Q2)=0 THEN GOTO 8500 ELSE GOSUB 6000
1800 IF V$="K" THEN A$="+K+" ELSE IF V$="R" THEN A$="?R?" ELSE IF V$="M" THEN A$="!M!":Z1=R1:Z2=R2

--t. alex

"Mud, mud, glorious mud. Nothing quite like it for cooling the blood!" --Michael Flanders and Donald Swann

Update: After reading the other posts (I couldn't bear to read anything else until I had finished my solution), I acknowledge that there are shortcomings in my solution:

  • No optimizations like deleting the second of two GOTO statements
  • A colon inside a string will muck up my statement separation operation
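
For the colon problem, one possible workaround is to replace the plain split on ':' with a match that treats a double-quoted string as a single unit. This is only a sketch: split_statements is a made-up helper name, not part of the script above, and it assumes BASIC strings are delimited by double quotes with no escaped quotes inside them.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): split a BASIC
# statement list on ':' but leave colons inside double-quoted strings
# alone. An unterminated string is assumed to run to the end of the line.
sub split_statements {
    my ($text) = @_;

    # Each piece is a run of quoted strings and of characters that are
    # neither a quote nor a colon. A colon outside quotes ends the piece.
    my @parts = $text =~ /((?:"[^"]*"?|[^":])+)/g;

    return @parts;
}

# Usage: the colon inside the string no longer splits the PRINT statement.
my @pieces = split_statements('PRINT "FROM E: TO S":W1=B4:X=B5');
print "$_\n" for @pieces;
```

Unlike split ( /:/, ... ), this drops empty statements (as in "A=1::B=2") instead of returning empty strings for them, which may or may not be what you want.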

Update 2: Well, of course it goes without saying, but jepri mentioned that Parse::RecDescent could be used for parsing your BASIC syntax .. but that could be likened to using a transport to carry a single ream of paper.

I could also have implemented a b-tree structure to store the IF .. THEN .. ELSE statement pieces, but I decided just to write a quick and dirty solution, an array, instead. My solution ain't complete, but it will most likely take care of 90% of the job, and that sounded like what you needed.
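
For comparison, here is roughly what the tree approach might look like, using a nested hash rather than a true b-tree. This is a sketch under assumptions: parse_if and the keys cond, then and else are names invented for illustration, and like the script above it pairs the first ELSE with the outermost IF and will be confused by keywords that appear inside strings or variable names.

```perl
use strict;
use warnings;

# Hypothetical sketch: turn "IF c THEN a ELSE b" into a nested hash.
# A plain statement is returned unchanged as a leaf (a string).
sub parse_if {
    my ($src) = @_;
    $src =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace

    if ( $src =~ /^IF\s*(.+?)\s*THEN\s*(.+?)(?:\s*ELSE\s*(.+))?$/ ) {
        # Copy the captures before recursing, since the recursive call
        # runs its own regex match.
        my ( $cond, $then, $else ) = ( $1, $2, $3 );

        my %node = ( cond => $cond, then => $then );

        # The ELSE part may itself be another IF, so recurse into it.
        $node{else} = parse_if($else) if defined $else;
        return \%node;
    }
    return $src;
}
```

Calling parse_if on line 4835 from the data section would give a hash whose else entry is another hash, and so on down the chain. Walking that structure to emit the simplified lines is a separate step that this sketch does not cover.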
