G'day chengchl,
Welcome to the Monastery.
Here's a generic solution for your problem.
It handles:
- Extraction of any block (i.e. there's no hard-coded or constant block number).
- Extraction of multiple blocks.
- Blocks of lines actually containing (plural) lines.
- Rogue START or END tokens within START-END blocks.
- Specification of wanted blocks in any order.
- Invalid block specifications (e.g. out of range and non-integer identifiers).
In production code, you may want to add some form of validation and sanity checking,
such that the function is short-circuited if no valid blocks are specified
(which could mean not even having to open the input file).
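As a rough sketch of that kind of sanity check (not part of the script in this post): the helper name valid_block_ids is hypothetical, and I'm assuming block identifiers are positive integers. Only the form of each identifier is checked; an out-of-range number such as 99 can't be detected without reading the input, so it passes through and simply never matches a block.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from the original script):
# keep only identifiers that look like positive integers, dropping duplicates.
sub valid_block_ids {
    my %seen;
    return grep { /\A[1-9][0-9]*\z/ && !$seen{$_}++ } @_;
}

# 'A', '0', '2.5' and the repeated '1' are rejected.
my @wanted = valid_block_ids(qw{1 A 0 4 2.5 1});
print "Wanted blocks: @wanted\n";    # prints "Wanted blocks: 1 4"

# Short-circuit: with no valid identifiers there is nothing to extract,
# so the input file would not need to be opened at all.
print "Nothing to do\n" unless valid_block_ids(qw{A B C});
```

In a real script you'd pass @ARGV (or whatever list you're given) to the helper and return early when it comes back empty.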
The following shows the technique (specifically for testing via the command line);
you'll need to adapt this to your needs (e.g. change <DATA> to <$fh_r>).
I've embedded test data to check all the things I've said it handles;
you should create your own test data, which more realistically reflects your actual data,
and use that for any proof-of-concept or regression tests.
#!/usr/bin/env perl
use strict;
use warnings;
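# Wanted block numbers come from the command line, e.g. "1 4".
my %print_block = map { $_ => 1 } @ARGV;
my $found_block = 0;
while (<DATA>) {
    # Flip-flop: true from a line that is exactly START
    # through the next line that is exactly END.
    next unless /^START$/ .. /^END$/;
    # Count each block at its START line; delimiters are not printed.
    ++$found_block, next if /^START$/;
    next if /^END$/;
    print if $print_block{$found_block};
}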
__DATA__
...
line BEFORE any wanted blocks
...
START
block A line 1
block A line 2 with rogue END token
block A line 3
block A line 4 with rogue START token
block A line 5
END
...
line BETWEEN any wanted blocks
...
START
block B line 1
block B line 2 with rogue START token
block B line 3
block B line 4 with rogue END token
block B line 5
END
...
line BETWEEN any wanted blocks
...
START
block C line 1
block C line 2 with rogue END token
block C line 3
block C line 4 with rogue START token
block C line 5
END
...
line BETWEEN any wanted blocks
...
START
block D line 1
block D line 2 with rogue START token
block D line 3
block D line 4 with rogue END token
block D line 5
END
...
line AFTER any wanted blocks
...
Some example test runs (the script name is pm_1202989_flip_flop_selection.pl):
$ pm_1202989_flip_flop_selection.pl
$ pm_1202989_flip_flop_selection.pl 99
$ pm_1202989_flip_flop_selection.pl A B C
$ pm_1202989_flip_flop_selection.pl 1
block A line 1
block A line 2 with rogue END token
block A line 3
block A line 4 with rogue START token
block A line 5
$ pm_1202989_flip_flop_selection.pl 1 4
block A line 1
block A line 2 with rogue END token
block A line 3
block A line 4 with rogue START token
block A line 5
block D line 1
block D line 2 with rogue START token
block D line 3
block D line 4 with rogue END token
block D line 5
$ pm_1202989_flip_flop_selection.pl 3 4 2 # NOTE: specified order irrelevant
block B line 1
block B line 2 with rogue START token
block B line 3
block B line 4 with rogue END token
block B line 5
block C line 1
block C line 2 with rogue END token
block C line 3
block C line 4 with rogue START token
block C line 5
block D line 1
block D line 2 with rogue START token
block D line 3
block D line 4 with rogue END token
block D line 5
$
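For the adaptation mentioned earlier (changing <DATA> to <$fh_r>), here is one possible shape; treat it as a sketch under assumptions rather than the way to do it. The function name extract_blocks is hypothetical, and returning the lines instead of printing them is my choice for the example. The demo reads from an in-memory filehandle so it runs standalone; with real data you'd open a file instead.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical wrapper (an assumption): the same flip-flop loop, but reading
# from any filehandle and returning the selected lines rather than printing.
sub extract_blocks {
    my ($fh_r, @wanted) = @_;
    my %print_block = map { $_ => 1 } @wanted;
    my $found_block = 0;
    my @lines;
    while (<$fh_r>) {
        next unless /^START$/ .. /^END$/;
        ++$found_block, next if /^START$/;
        next if /^END$/;
        push @lines, $_ if $print_block{$found_block};
    }
    return @lines;
}

# Demo input: one stray line, then two blocks.
my $text = join '', map { "$_\n" }
    'skip', 'START', 'a1', 'END', 'START', 'b1', 'b2', 'END';

# In-memory filehandle for the demo only.
open my $fh_r, '<', \$text or die "Can't open in-memory handle: $!";
print extract_blocks($fh_r, 2);    # prints the two lines of the second block
close $fh_r;
```

One thing to watch: the range operator keeps its own state between calls to the subroutine that contains it, so if an input ends inside an unterminated START block, the next call would begin as though it were still inside that block.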
[Side note:
As you're new here, you may have been surprised by certain responses.
You can safely ignore these;
a quick perusal of the "Worst Nodes" page should explain why.]