Comment a block that match a keyword

by yorkwu (Novice)
on Aug 16, 2007 at 12:08 UTC

yorkwu has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I'm trying to comment out every block that matches the keyword "TIMINGCHECK".
The block is always bracketed by `()`. At the beginning, I thought this would be easy, but in the end I was beaten.

I tried reading and processing the data line by line, but soon found it very hard to determine the extent of a TIMINGCHECK block that way.

So I turned to slurping the whole file and identifying the range of each TIMINGCHECK block with the Text::Balanced module, which I'm still learning to use. However, I ran into another problem: Text::Balanced seems only able to extract the range I want into a scalar. After I comment it out with a statement like the one below,

$extracted_range =~ s/^/\/\//mg;

how can I insert it back at the location it was extracted from? And even if I find a way to insert it back, is this procedure efficient? I have a lot of TIMINGCHECK blocks to comment out in the data file, and the procedure below doesn't look elegant:
find the block -> extract -> insert back -> find next block -> extract...

I suspect I'm not approaching my problem in a good Perl way. Could any monk give me some suggestions or hints for dealing with this kind of problem?
Thanks in advance!
York
Original data
=============
...
(CELL
  ...
  (TIMINGCHECK
    ....
    ....
  )
)
(CELL
  (CELLTYPE "SEDFQD1")
  (INSTANCE uTrigger/TrcInclCtrlReg_reg[13])
  (DELAY
    (ABSOLUTE
      (IOPATH CP Q (0.10:0.15:0.25)(0.09:0.15:0.24))
    )
  )
  (TIMINGCHECK
    (SETUP (posedge SI) (posedge CP) (0.14:0.23:0.41))
    (SETUP (negedge SI) (posedge CP) (0.09:0.16:0.30))
    ....(random lines)
    (HOLD (negedge SI) (posedge CP) (0.00:0.00:0.00))
    (HOLD (negedge D) (posedge CP) (0.00:0.00:0.00))
  )
)

What I hope it become
========================
...
(CELL
  ...
//  (TIMINGCHECK
//    ....
//    ....
//  )
)
(CELL
  (CELLTYPE "SEDFQD1")
  (INSTANCE uTrigger/TrcInclCtrlReg_reg[13])
  (DELAY
    (ABSOLUTE
      (IOPATH CP Q (0.10:0.15:0.25)(0.09:0.15:0.24))
    )
  )
//  (TIMINGCHECK
//    (SETUP (posedge SI) (posedge CP) (0.14:0.23:0.41))
//    (SETUP (negedge SI) (posedge CP) (0.09:0.16:0.30))
//    ....(random lines)
//    (HOLD (negedge SI) (posedge CP) (0.00:0.00:0.00))
//    (HOLD (negedge D) (posedge CP) (0.00:0.00:0.00))
//  )
)
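
On the question of putting the extracted text back: one option is to skip the extract/insert round trip and let a single substitution rewrite each block where it sits. What follows is a minimal sketch and not part of the original post. It assumes Perl 5.10 or later (for the recursive `(?-1)` construct), that the whole file fits in memory, that every `(TIMINGCHECK` starts its own line, and that no parentheses hide inside quoted strings. The name `comment_timingchecks` is made up for illustration.

```perl
use strict;
use warnings;

# One balanced parenthesised group; (?-1) recurses into the group itself.
my $balanced = qr{ ( \( (?: [^()]++ | (?-1) )* \) ) }x;

# Hypothetical helper: comment out every balanced "(TIMINGCHECK ...)" block
# in a slurped string and return the modified copy.
sub comment_timingchecks {
    my ($text) = @_;

    # The outer capture holds the indentation plus the whole block;
    # the lookahead restricts matches to blocks opening with the keyword.
    $text =~ s{ ( ^ [ \t]* (?= \( TIMINGCHECK \b ) $balanced ) }{
        my $block = $1;
        $block =~ s/^/\/\//mg;    # same per-line prefix as in the question
        $block;
    }gmxe;

    return $text;
}

my $sample = <<'END';
(CELL
  (TIMINGCHECK
    (SETUP (posedge SI) (posedge CP) (0.14:0.23:0.41))
  )
)
END

print comment_timingchecks($sample);
```

Because the replacement happens in place, there is no separate "insert back" step, and the `/g` flag walks through all blocks in one pass. A block whose closing paren shares a line with other text would still need extra handling.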

Replies are listed 'Best First'.
Re: Comment a block that match a keyword
by BrowserUk (Patriarch) on Aug 16, 2007 at 12:37 UTC

    This is very dependent upon the correct and logical formatting of the input. It could probably be simplified through refactoring and needs a lot of testing, but the principles it uses might help:

    #! perl -slw
    use strict;

    while( <DATA> ) {
        chomp;
        print, next unless m[TIMINGCHECK];
        my $count = tr[(][(] - tr[)][)];
        {
            s[^][//];
            print;
            last unless defined( $_ = <DATA> );
            chomp;
            $count += tr[(][(] - tr[)][)];
            redo unless $count < 0;
        }
        print;
    }
    __DATA__
    ...
    (CELL
      ...
      (TIMINGCHECK
        ....
        ....
      )
    )
    (CELL
      (CELLTYPE "SEDFQD1")
      (INSTANCE uTrigger/TrcInclCtrlReg_reg[13])
      (DELAY
        (ABSOLUTE
          (IOPATH CP Q (0.10:0.15:0.25)(0.09:0.15:0.24))
        )
      )
      (TIMINGCHECK
        (SETUP (posedge SI) (posedge CP) (0.14:0.23:0.41))
        (SETUP (negedge SI) (posedge CP) (0.09:0.16:0.30))
        ....(random lines)
        (HOLD (negedge SI) (posedge CP) (0.00:0.00:0.00))
        (HOLD (negedge D) (posedge CP) (0.00:0.00:0.00))
      )
    )

    Output:

    c:\test>junk6
    ...
    (CELL
      ...
    //  (TIMINGCHECK
    //    ....
    //    ....
    //  )
    )
    (CELL
      (CELLTYPE "SEDFQD1")
      (INSTANCE uTrigger/TrcInclCtrlReg_reg[13])
      (DELAY
        (ABSOLUTE
          (IOPATH CP Q (0.10:0.15:0.25)(0.09:0.15:0.24))
        )
      )
    //  (TIMINGCHECK
    //    (SETUP (posedge SI) (posedge CP) (0.14:0.23:0.41))
    //    (SETUP (negedge SI) (posedge CP) (0.09:0.16:0.30))
    //    ....(random lines)
    //    (HOLD (negedge SI) (posedge CP) (0.00:0.00:0.00))
    //    (HOLD (negedge D) (posedge CP) (0.00:0.00:0.00))
    //  )
    )

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Here is a simple demonstration that Perl does indeed have a "post-condition loop construct" (contrary to the claim nearby in this thread), and that it is useful. Here is a trivial refactoring of the above code to make use of that. I don't repeat the input or the output, but given the former, it will duplicate the latter.

      #!/usr/bin/perl -lw
      use strict;

      while( <DATA> ) {
          chomp;
          if(  ! m[TIMINGCHECK]  ) {
              print;
              next;
          }
          my $count= tr[(][(] - tr[)][)];
          do {
              s[^][//];
              print;
              last    # Leaves the while( <DATA> ) loop
                  if  ! defined( $_= <DATA> );
              chomp;
              $count += tr[(][(] - tr[)][)];
          } while(  0 <= $count  );
          print;
      }

      Note that the last jumps out of both "loops", which can be considered an improvement since the original code would warn about trying to print an undefined value (when given incomplete input).

      And, here is one way I might refactor this (in order to avoid repeating the code used to read input and the code to count parens):

      #!/usr/bin/perl -lw
      use strict;

      my $count= -1;
      while( <DATA> ) {
          chomp;
          $count= 0
              if  m[TIMINGCHECK];
          $count += tr[(][(] - tr[)][)]
              if  0 <= $count;
          s[^][//]
              if  0 <= $count;
          print;
      }

      Which also duplicates the above output, though it might not always agree on all inputs (it doesn't warn on incomplete input, certainly).

      - tye        

      There are a couple of things that smell bad to me about this. They're personal things; I don't have any doubts that the code works. They're just things that trouble me:
      • a block pretending to be a loop, and the use of redo and last therein.
      • (this is definitely just me) chomp that isn't the first thing done in the "loop". I'd have put it first, even if any subsequent print (etc) had to include \n.

      I'd be happier with a do block, rather than a bare block. Perhaps it's just me :)

      Like I said, personal things.

      update: added stuff about do after further thought

        I wasn't particularly enamoured with it, hence my "could be refactored" comment. The reason I didn't refactor it at the time was that I couldn't see a nice way to do so.

        Noting your "personal preferences" emphasis, I hope you don't mind if I respond with my take on things?

        1. a block pretending to be a loop, and the use of redo and last therein.

          And that repetition of 'do stuff' is a problem. In this case, having read a line at the top of the while loop, we need to

          • do some stuff (initialise our parens count from the first line)
          • enter the loop construct
          • do some more stuff (prepend the comment card and print)
          • read the next line, check for eof.
          • chomp the line we just read.
          • Do some more stuff (adjust the parens count from the new line)
          • decide whether to loop or not

          And the only way I know how to do that in perl (without artificial means like setting flags and/or double condition tests) is redo.

          You said: "I'd be happier with a do block, rather than a bare block." But that doesn't work:

          #! perl -slw
          use strict;

          my $i =0;
          do{
              print ++$i;
              redo if $i < 5;
          };
          __END__
          c:\test>junk2
          1
          Can't "redo" outside a loop block at c:\test\junk2.pl line 7.

          You'd have to do

          #! perl -slw
          use strict;

          my $i =0;
          do{{
              print ++$i;
              redo if $i < 5;
          }};
          __END__
          c:\test>junk2
          1
          2
          3
          4
          5

          That is, embed a bare block within the do block, and that is redundant and very obscure.

          You could adopt a Perl 6 like construct:

          LOOP:{ ... redo LOOP; }

          which could be construed as clearer. But frankly, redo in a bare block is a perfectly valid and useful construct and, I think, it is better to just become familiar with it than to obscure it. Indeed, it is actually the most flexible looping construct. It can be used to construct many of the other looping constructs Perl has, even the much-decried but extremely flexible C-style for loop, with its otherwise unique ability to vary multiple indexes concurrently.

          ## draw the diagonals
          for( my $x=0, my $y=0;     $x < $xMax; $x++, $y++ ) { draw( $x, $y ); }
          for( my $x=0, my $y=$yMax; $x < $xMax; $x++, $y-- ) { draw( $x, $y ); }

          It's a little-used feature, but when you need it, you need it.

        2. chomp that isn't the first thing done in the "loop". I'd have put it first, even if any subsequent print (etc) had to include \n.

          Hm. I'm not sure what the position of chomp has to do with the loop construct. The chomp has to follow the readline. The readline has to occur in the middle of the loop.

        I'm still not happy with the construction I posted, but I haven't come up with a better one.


      Wow! What a cool way to find the balanced bracketed block.
      Thank you, BrowserUk! I really learned a lot. York
      I quite like that solution, but it will fail if there are unbalanced brackets in quoted strings, like "foo(".

      If you want to consider that, you have to purge all quoted strings first. You can achieve this with the regexes from Regexp::Common::delimited.
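
A rough sketch of that purge step, using a hand-written pattern in place of Regexp::Common::delimited so that it runs without extra modules. It assumes strings are double-quoted, may contain backslash escapes, and never span lines; whether the data format actually allows escapes is a guess. The helper name `paren_delta` is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: net change in paren depth for one line, ignoring
# any parens that sit inside double-quoted strings.
sub paren_delta {
    my ($line) = @_;
    (my $bare = $line) =~ s/"(?:[^"\\]|\\.)*"//g;   # drop "..." strings
    return ($bare =~ tr/(//) - ($bare =~ tr/)//);   # opens minus closes
}

print paren_delta('(CELLTYPE "foo(")'), "\n";   # prints 0
print paren_delta('(TIMINGCHECK'), "\n";        # prints 1
```

The returned value could stand in for the `tr[(][(] - tr[)][)]` expression in the counting loops above.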

        Yes. Hence my "This is very dependent upon the correct and logical formatting of the input." caveat.

        If there are any errors in the balancing of parens, it will fail horribly, but as the text is obviously source to some parser somewhere, it's a reasonable, pragmatic, economic ROI decision to say: This 'comment out timing checks script' is only usable on source that parses correctly using a.n.other tool. A pragmatic decision to save having to reverse engineer that a.n.other tool's parser from scratch and without the original specs.

        It will also fail in many cases that would (probably; no spec!) be successfully parsed by that other parser. For example, if the close parens placements are coalesced on a single line, rather than laid out in a logically structured way as per the OP's example:

        (CELL
          (TIMINGCHECK
            (SETUP (posedge SI) (posedge CP) (0.14:0.23:0.41))
            (SETUP (negedge SI) (posedge CP) (0.09:0.16:0.30))
            ....(random lines)
            (HOLD (negedge SI) (posedge CP) (0.00:0.00:0.00))
            (HOLD (negedge D) (posedge CP) (0.00:0.00:0.00))
          ))

        In this case, the close paren of the (CELL block will also be commented out and the result will fail to parse with that other tool.
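
One way a line-based counter might cope with coalesced closers is to track depth paren by paren and split the line at the point where the block closes. This is only a sketch under assumptions: each `(TIMINGCHECK` starts its own line, no parens appear inside quoted strings, and the helper name `comment_blocks` is hypothetical. It takes chomped lines and returns the rewritten lines.

```perl
use strict;
use warnings;

# Hypothetical helper: comment out TIMINGCHECK blocks, splitting the final
# line of a block if other text (such as the CELL's ")") follows its close.
sub comment_blocks {
    my @lines = @_;
    my @out;
    my $depth = 0;                        # > 0 while inside a block
    for my $line (@lines) {
        if ($depth == 0 && $line !~ /^\s*\(TIMINGCHECK\b/) {
            push @out, $line;             # ordinary line, pass through
            next;
        }
        # Walk the parens until the block that started this run closes.
        my $end;
        while ($line =~ /([()])/g) {
            $depth += $1 eq '(' ? 1 : -1;
            if ($depth == 0) { $end = pos $line; last; }
        }
        if (!defined $end) {              # block continues on the next line
            push @out, "//$line";
            next;
        }
        push @out, '//' . substr($line, 0, $end);
        my $rest = substr($line, $end);
        push @out, $rest if $rest =~ /\S/;   # leftover text stays uncommented
    }
    return @out;
}

print "$_\n" for comment_blocks(
    '(CELL',
    '  (TIMINGCHECK',
    '    (HOLD (negedge D) (posedge CP) (0.00:0.00:0.00)) ))',
);
```

Here the closing paren of the CELL ends up on a line of its own, outside the commented region.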

        In an ideal world one would go back to the authors of a.n.other tool, request a copy of their parser, or the specifications from which it was drawn, and produce a 'proper parser' script that understood all the rules of the input language and performed the required operation.

        But this isn't an ideal world, and time is money, and performing ad-hoc text munging tasks like this is exactly what Perl was invented for. (Amongst other things. :)

        But then again, it seems that the authors of a.n.other tool were either on-the-ball or responsive to in-use experience of using their tool, because it seems they may already have provided an option that makes this entire thread redundant.

        It's just a shame that post hasn't received the attention and votes it deserves. It's a non-Perl solution, but, assuming the poster has correctly recognised the nature of the data and the OP is using the correct a.n.other tool, by far the best solution to the OP's problem.


Re: Comment a block that match a keyword
by NetWallah (Canon) on Aug 16, 2007 at 14:05 UTC
    Same result in 3 lines of code (Use BrowserUk's __DATA__ block):
    while( <DATA> ) {
        m/\(TIMINGCHECK/ .. m/^\s*\)\s*$/ or print , next;
        print "//$_"
    }
    This leverages the "Toggle" nature of the ".." operator (perldoc perlop).
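
For readers who have not met the operator in scalar context, here is a small standalone sketch of the toggle behaviour. The sub name `between` and the START/END markers are invented for illustration and have nothing to do with the data above.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the items from one matching START through the
# next one matching END, inclusive. In scalar context ".." turns true when
# its left operand matches and stays true until its right operand matches.
sub between {
    my @out;
    for (@_) {
        push @out, $_ if /START/ .. /END/;
    }
    return @out;
}

my @picked = between(qw(a START b END c START d END e));
print "@picked\n";   # prints: START b END START d END
```

The operator keeps its state between iterations, which is why no explicit "inside a block" flag is needed.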

         "An undefined problem has an infinite number of solutions." - Robert A. Humphrey         "If you're not part of the solution, you're part of the precipitate." - Henry J. Tillman

      Try it with this slight variation of data:

      ...
      (CELL
        ...
        (TIMINGCHECK
          ....
          ....
        )
      )
      (CELL
        (CELLTYPE "SEDFQD1")
        (INSTANCE uTrigger/TrcInclCtrlReg_reg[13])
        (DELAY
          (ABSOLUTE
            (IOPATH CP Q (0.10:0.15:0.25)(0.09:0.15:0.24))
          )
        )
        (TIMINGCHECK
          (OTHERTEST
            (SETUP (posedge SI) (posedge CP) (0.14:0.23:0.41))
            (SETUP (negedge SI) (posedge CP) (0.09:0.16:0.30))
            ....(random lines)
            (HOLD (negedge SI) (posedge CP) (0.00:0.00:0.00))
            (HOLD (negedge D) (posedge CP) (0.00:0.00:0.00))
          )
        )
      )

      Your code is relying on the happy fortuity of the sample data, that the close of the block is a single ')' on a line by itself. Mine actually counts the parens to determine when the block is closed.

      It's a hack, but a useful one.


        OK - here is updated code - one line added, and some tweaking to count parens, but still using the flip-flop. This works with your variation of the data...

        I'm not arguing your point that my original code was a hack - I simply wanted to illustrate the value of the flip-flop operator for this type of situation, including the fact that it CAN be used properly in production-level code, and that it simplifies code.

        #! perl -slw
        use strict;
        my $p=0; # Counts number of unmatched parens
        while( <DATA> ) {
            ##m/\(TIMINGCHECK/ .. do {$p++ for m/\(/g; $p-- for m/\)/g;$p==0} or print , next;
            ## Faster+simpler version, plagiarizing BrowserUk's paren counting mechanism
            m/\(TIMINGCHECK/ .. ($p+=tr[(][(] - tr[)][)])==0 or print ,next;
            print "//$_"
        }
        __DATA__
        (CELL
          ...
          (TIMINGCHECK
            ....
            ....
          )
        )
        (CELL
          (CELLTYPE "SEDFQD1")
          (INSTANCE uTrigger/TrcInclCtrlReg_reg[13])
          (DELAY
            (ABSOLUTE
              (IOPATH CP Q (0.10:0.15:0.25)(0.09:0.15:0.24))
            )
          )
          (TIMINGCHECK
            (OTHERTEST
              (SETUP (posedge SI) (posedge CP) (0.14:0.23:0.41))
              (SETUP (negedge SI) (posedge CP) (0.09:0.16:0.30))
              ....(random lines)
              (HOLD (negedge SI) (posedge CP) (0.00:0.00:0.00))
              (HOLD (negedge D) (posedge CP) (0.00:0.00:0.00))
            )
          )
        )

             "An undefined problem has an infinite number of solutions." - Robert A. Humphrey         "If you're not part of the solution, you're part of the precipitate." - Henry J. Tillman

      This leverages the "Toggle" nature of the ".." operator
      You learn a new thing every day. Interesting. Range operators in perlop

      Clint

Re: Comment a block that match a keyword
by shoness (Friar) on Aug 16, 2007 at 16:19 UTC
    Are your library cells missing the checks? You could just leave these checks in place during SDF backannotation and just add "+notimingchecks" at compile or runtime to your simulator. VCS, MTI and NC all support that switch. You wouldn't have any Perl fun though!