
I have no issue with that for a short (<1 page) piece of code.

For a short script, I don't see the advantage of a sub over just inlining the code. But since TMTOWTDI, it's fine.

I don't see any issue here at all. ... I don't see any case for "more flexible".

Just to be clear, I was talking about the general case, and especially about longer scripts, where I disagree with this pattern. Personally, I think it's best to read from the file in just one place in the code because, as I said, that is more flexible across different input file formats. In a long script it would also become difficult to keep track of all the places that read the file, what state each one expects the filehandle to be in, and what state it leaves it in.
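
To make the filehandle-state concern concrete, here is a minimal sketch of how two readers on one handle can lose data. It is not the code from earlier in the thread: the sub name read_section and the input are made up for illustration, and the input is read from an in-memory string so the example is self-contained.

```perl
#!/usr/bin/env perl
use warnings;
use strict;

# Hypothetical input: two sections, each introduced by a START line.
my $input = <<'END_INPUT';
START
a
b
START
c
d
END_INPUT

# Hypothetical helper: collects lines until it runs into the next START.
sub read_section {
    my ($fh) = @_;
    my @lines;
    while ( my $line = <$fh> ) {
        chomp $line;
        # This START is now consumed; the caller will never see it.
        last if $line eq 'START';
        push @lines, $line;
    }
    return \@lines;
}

open my $fh, '<', \$input or die "open: $!";
my @sections;
while ( my $line = <$fh> ) {
    chomp $line;
    # The main loop only starts a section when it reads START itself.
    push @sections, read_section($fh) if $line eq 'START';
}
close $fh;

print scalar(@sections), " section(s): @{ $sections[0] }\n";
# prints "1 section(s): a b"
```

The second START is swallowed inside the sub, so the main loop treats "c" and "d" as lines outside any section and drops them. Since there is no way to push the line back, the sub would have to hand it back to the caller somehow, which is exactly the kind of shared state that gets hard to track.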

You said "You are correct in that there is no 'unget' or 'un-read' for a line that has already been read." That's what I was referring to. I still think a state machine approach is better, but if you disagree, perhaps you could show how you'd use the pattern you showed (a <DATA> in the main loop and a <DATA> in a sub) to read a file like the __DATA__ section below.

#!/usr/bin/env perl
use warnings;
use strict;

my @output;
use constant { STATE_IDLE=>0, STATE_IN_SECTION=>1 };
my $state = STATE_IDLE;
my @buf;
# Flush the buffered lines as one finished section, then return to idle.
my $end_section = sub {
    if ( $state == STATE_IN_SECTION )
        { push @output, [@buf]; @buf = () }
    $state = STATE_IDLE;
};
while (<DATA>) {
    chomp;
    # Text before START closes the current section; text after it opens the next.
    if ( my ($x,$y) = /^ (?: (.+) \s+ )? START (?: \s+ (.+) )? $/x ) {
        if ( defined $x ) {
            die "unexpected: $_\n" unless $state == STATE_IN_SECTION;
            push @buf, $x;
        }
        $end_section->();
        $state = STATE_IN_SECTION;
        push @buf, $y if defined $y;
    }
    elsif ( my ($z) = /^ (?: (.+) \s+ )? END $/x ) {
        die "unexpected: $_\n" unless $state == STATE_IN_SECTION;
        push @buf, $z if defined $z;
        $end_section->();
    }
    else {
        if ( $state == STATE_IN_SECTION )
            { push @buf, $_ }
        else {} # ignore outside of section
    }
}
$end_section->();

use Test::More tests=>1;
is_deeply \@output, [["a", "b"], ["c" .. "g"], ["h", "i"], ["j", "k"]]
    or diag explain \@output;

__DATA__
START
a
b
START c
d
e
f
g END
ignoreme
START h
i START j
k

In reply to Re^4: processing file content as string vs array by haukex
in thread processing file content as string vs array by vinoth.ree
