http://qs321.pair.com?node_id=853652

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

O dwellers of Perl, have I a mighty perplexing affair that causes me a grave despair. It is basically with regard to the different ways we read a file once it has been opened. I know that there's reading a file line wise or in bulk, that's via syntax similar to $line = <FH> or @lines = <FH> respectively, and that the second case might warrant undefing the variable $/ and then maybe splitting or doing array manipulation.

What I am confused about is when in a loop I have something like while(<FH>){...}, is this any different than $line=<FH> or are these granularly the same?

To delete a file we use unlink and to close a file handle we use close and then we have undef which can also close a filehandle for us, my heart tells me that using undef this way is somewhat frowned upon or sinful, is that justified?

While I feel I can handle files to some degree, I still find it difficult to summarize my knowledge with respect to the issue presented in this post, could ye wise ones steer forth to the podium and clarify these and thus thou shalt assist a needy knowledge-seeker.

Re: File Reading and Closing confusion
by zentara (Archbishop) on Aug 08, 2010 at 14:03 UTC
    Actually, the topic is so broad and full of complexities, you need to ask a more specific question, showing some code.

    As far as the code $text = <FILE> goes, look at the difference

    #!/usr/bin/perl
    use warnings;
    use strict;

    open (FILE,"< $0 ") or die "Couldn't open: $!";
    # check the difference between these two lines
    my $text = do {local $/; <FILE>};
    #my $text = <FILE>;
    close FILE;
    print "$text\n";

    In the absence of a code example, these links may get you started. Basically, you can read a file line by line (the default behavior), or sysread it in chunks. The diamond operator <> adds some magic. See diamond operator and "perldoc -f read".
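
To make the line-by-line versus chunked distinction concrete, here is a minimal sketch. It uses read rather than sysread so that it can run against an in-memory filehandle; the sample data, the variable names, and the chunk size of 8 are all illustrative assumptions, not anything from the original post.

```perl
use strict;
use warnings;

# In-memory filehandle so the sketch needs no file on disk (illustrative data).
my $data = "alpha\nbeta\ngamma\n";

# Line by line: readline splits on $/ (a newline by default).
open(my $by_line, '<', \$data) or die "Cannot open in-memory file: $!";
my @lines;
while (my $line = <$by_line>) {
    chomp $line;
    push @lines, $line;
}
close($by_line) or die "close failed: $!";

# Fixed-size chunks: read() ignores line boundaries entirely.
open(my $by_chunk, '<', \$data) or die "Cannot open in-memory file: $!";
my @chunks;
while (read($by_chunk, my $buf, 8)) {
    push @chunks, $buf;
}
close($by_chunk) or die "close failed: $!";

print scalar(@lines), " lines, ", scalar(@chunks), " chunks\n";
```

The chunked loop ends when read returns 0 at end of file; a chunk can start or stop in the middle of a line.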

    Some filehandles you cannot seek on, like a socket filehandle. See read and sysread, then run "perldoc -q file" and read all the sections. Remember, if you are on a unix style OS, you have file descriptors associated with each filehandle in /proc/$pid/fd. See "How to close all open file descriptors after a fork?" and "close filehandles". Remember, the file descriptors (numbers) in /proc/$pid/fd/ are where the real action takes place (at least on Linux); the filehandles are just convenience handles.

    A lot of us Perl hackers are lazy, and don't close filehandles after use, just letting the system clean them up when the program exits. :-)

    One common gotcha encountered with a mix of unix and win32 systems is that the two systems have different line endings. When the default line-by-line reading behavior is used to read a Windows file on Linux, each line keeps a trailing carriage return even after chomp, which can break later processing. Read "perldoc -f binmode".
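
A small sketch of that gotcha, assuming a unix-style platform where no :crlf layer is active on the handle. The sample text and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Illustrative Windows-style content: every line ends in CR LF.
my $dos_text = "first\r\nsecond\r\n";
open(my $fh, '<', \$dos_text) or die "Cannot open in-memory file: $!";

my (@chomped, @cleaned);
while (my $line = <$fh>) {
    my $only_chomp = $line;
    chomp $only_chomp;                      # removes "\n" but leaves "\r"
    (my $both = $line) =~ s/\r?\n\z//;      # removes "\n" and an optional "\r"
    push @chomped, $only_chomp;
    push @cleaned, $both;
}
close($fh) or die "close failed: $!";

# A stray "\r" makes otherwise equal strings compare unequal.
print "chomp alone is not enough\n" if $chomped[0] ne 'first';
```

Stripping with s/\r?\n\z// is one possible workaround that accepts both line-ending styles; opening the file with a :crlf layer is another.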


    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku
Re: File Reading and Closing confusion
by shmem (Chancellor) on Aug 08, 2010 at 14:17 UTC
    and that the second case might warrant undefing the variable $/ and then maybe splitting or doing array manipulation.

    Erm...no. You undefine $/ to slurp a file into a scalar, whereas @array = <FH> populates @array with lines, each with its trailing $/ still attached. (Update: the implicit chomp of the command line switch -l only happens in combination with the -n or -p switch.)
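
The two cases can be put side by side in a short sketch. The in-memory data and the variable names are illustrative assumptions.

```perl
use strict;
use warnings;

my $data = "one\ntwo\nthree\n";    # illustrative content

# List context: one element per line, each with "\n" still attached.
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
my @lines = <$fh>;
close($fh) or die "close failed: $!";

# Slurp mode: with $/ undefined, a single read returns the whole file.
open($fh, '<', \$data) or die "Cannot open in-memory file: $!";
my $whole = do { local $/; <$fh> };
close($fh) or die "close failed: $!";

# Removing the line endings is a separate, explicit step.
chomp(my @bare = @lines);

print scalar(@lines), " lines, ", length($whole), " characters\n";
```

Using local inside the do block keeps the change to $/ from leaking into the rest of the program.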

    What I am confused about is when in a loop I have something like while(<FH>){...}, is this any different than $line=<FH> or are these granularly the same?

    Run perldoc -f readline. That said, no, those are not the same: while(<FH>){...} assigns to $_, but $line=<FH> assigns to $line. Other than that, they are the same.

    To delete a file we use unlink and to close a file handle we use close and then we have undef which can also close a filehandle for us, my heart tells me that using undef this way is somewhat frowned upon or sinful, is that justified?

    That's not frowned upon or sinful, since a common idiom is

    my @lines = do { open my $fh, '<', $file or die "$file: $!\n"; <$fh> };

    Perl scoping rules apply. As the documentation for open explains, file handles are closed when they go out of scope (and are thus undefined implicitly). So there is nothing wrong with undef $fh.

    One caveat, though: You might encounter problems closing a file handle - for example flushing buffers and closing a file residing on a remote mountpoint, which became unavailable during operation of your program.

    So, it is always wise to close your file handles explicitly and check the return value.
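
A minimal sketch of both styles, using a temporary file from the core File::Temp module. The file contents and variable names are illustrative assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration; removed automatically at exit.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh) or die "Cannot close $path: $!";

open(my $out, '>', $path) or die "Cannot open $path for writing: $!";
print {$out} "buffered line\n" or die "Cannot write to $path: $!";
# Buffered data is flushed here, so a write error may only surface at close.
close($out) or die "Cannot close $path: $!";

# Implicit close: $in goes out of scope at the end of the do block,
# and any error from that close goes unnoticed.
my @lines = do {
    open(my $in, '<', $path) or die "Cannot open $path: $!";
    <$in>;
};

print "read back: $lines[0]";
```

The explicit check matters most for handles opened for writing, since that is where unflushed data can be lost.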

        That said, no, those are not the same: while(<FH>){...} assigns to $_, but $line=<FH> assigns to $line. Other than that, they are the same.

      Actually, a crucial difference is that while(<FH>){...} gets interpreted by Perl as:
      while (defined($_ = <FH>)) { ... }

      which effectively ends the while loop once there are no more lines to be read from the filehandle. With $line=<FH>, we need to check for defined-ness.
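
One case where the difference shows up is a final line that is just "0" with no trailing newline. The sketch below uses made-up data to compare a plain truth test with a defined test. Note that while (my $line = <FH>) is also given the implicit defined test by Perl, so the explicit check is only needed when the read happens outside such a while condition.

```perl
use strict;
use warnings;

# Illustrative data: the last line is "0" with no trailing newline.
my $data = "10\n5\n0";

# Truth test: stops one line early, because the string "0" is false.
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
my @by_truth;
while (1) {
    my $line = <$fh>;
    last unless $line;
    push @by_truth, $line;
}
close($fh) or die "close failed: $!";

# Defined test: stops only at end of file, when readline returns undef.
open($fh, '<', \$data) or die "Cannot open in-memory file: $!";
my @by_defined;
while (1) {
    my $line = <$fh>;
    last unless defined $line;
    push @by_defined, $line;
}
close($fh) or die "close failed: $!";

print scalar(@by_truth), " vs ", scalar(@by_defined), " lines\n";
```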

        my @lines = do { open my $fh, '<', $file or die "$file: $!\n"; <$fh> };

      Another way:
      my @lines = do { local @ARGV = $file; <> };
        With $line=<FH>, we need to check for defined-ness.

        No. While there are lines read, they are defined since they have $/ attached. If $/ is undef, no point for while, since that's slurp mode. An empty string on the last line with no $/ is - no line.

Re: File Reading and Closing confusion
by cdarke (Prior) on Aug 08, 2010 at 16:10 UTC
    Another difference not mentioned by the esteemed monks above:
    while (<FH>) {...}
    assigns a line to $_ on each iteration of the loop, as mentioned. Outside a loop, however, it does not. Just plain
    <FH>;
    does read a line from the file, but does not assign it to $_. So far as I know it is not stored. The assignment to $_ is magic that only occurs inside the condition of a while loop. $line = <FH>; works as expected - the line which has been read is assigned to $line.
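
This is easy to check with a small sketch. The data and variable names are illustrative, and $_ is given a marker value first so that any change would be visible.

```perl
use strict;
use warnings;

my $data = "first\nsecond\nthird\n";    # illustrative content
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

local $_ = 'untouched';

<$fh>;                     # reads "first\n" and throws it away
my $after_bare = $_;       # $_ was not assigned by the bare read

my $next = <$fh>;          # explicit assignment gets "second\n"

my @rest;
while (<$fh>) {            # only here is the line assigned to $_
    push @rest, $_;
}
close($fh) or die "close failed: $!";

print "after bare read: $after_bare\n";
```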

    By the way, if you use sysread then you should open with sysopen, but there is also read, which is used after our old friend open. With read you can specify how many characters to read, which can also be done with $/ (although I find read easier to, um, read).
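
Both ways of asking for a fixed number of characters can be sketched as follows. Setting $/ to a reference to an integer switches readline to fixed-size records; the data and the size of 4 are illustrative assumptions.

```perl
use strict;
use warnings;

my $data = "abcdefghij";    # illustrative content, 10 characters
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

# read: ask for 4 characters; the return value is the number actually read.
my $got = read($fh, my $buf, 4);
die "read failed: $!" unless defined $got;

# $/ as a reference to a number: readline returns records of at most 4.
my $record = do { local $/ = \4; <$fh> };
my $tail   = do { local $/ = \4; <$fh> };   # shorter: only 2 characters left

close($fh) or die "close failed: $!";

print "$buf $record $tail\n";
```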
Re: File Reading and Closing confusion
by ikegami (Patriarch) on Aug 09, 2010 at 00:17 UTC

    What I am confused about is when in a loop I have something like while(<FH>){...}, is this any different than $line=<FH> or are these granularly the same?

    while (<FH>) is transformed into while (defined($_ = <FH>)).

    Yes, $_=<FH> has the same granularity as $line=<FH>, whatever that means.