http://qs321.pair.com?node_id=456341

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, how do we read a file character by character without loading it into memory? Regards, Sid

Replies are listed 'Best First'.
Re: To read Char-by-Char from a file
by cog (Parson) on May 12, 2005 at 11:07 UTC
    As long as each character has one byte:

    $/ = \1;

    Setting $/ to a reference to an integer makes Perl read fixed-size records of that many bytes instead of lines, so \1 yields one byte per read.

    Note: Bytes, not characters.

    Try it out:

    perl -pe 'BEGIN{$/=\1}s/.*/<$&>/' file
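    The same idea can be sketched as a script rather than a one-liner. This is only an illustration: the helper name for_each_record and the throwaway temp file are made up for the example, and the file is opened with :raw so that each record is exactly one byte.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: call $callback once per one-byte record in $path.
sub for_each_record {
    my ($path, $callback) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/ = \1;    # fixed-size records of 1 byte; restored when the sub returns
    while (defined(my $record = <$fh>)) {
        $callback->($record);
    }
    close $fh;
    return;
}

# Demo on a throwaway file containing four bytes.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "ab\nc";
close $tmp_fh;

for_each_record($tmp_path, sub { printf "<%s>\n", $_[0] eq "\n" ? '\n' : $_[0] });
```

    Using local keeps the change to $/ from leaking into other code that still expects line-oriented reads.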

Re: To read Char-by-Char from a file
by gaal (Parson) on May 12, 2005 at 11:10 UTC
    my ($read, $char);
    while ($read = read FILE, $char, 1) {
        print "got: $char\n";
    }
    die "read error: $!" if not defined $read;

    Note that this reads a character at a time, not a byte; the two differ whenever the filehandle decodes a multi-byte encoding such as UTF-8 (with no decoding layer, one character is one byte). See read for more details.

    Update: in terms of memory, this actually does read more than one byte at a time; but this is normally what you want. Perl performs buffering for you so you don't make many needless system calls. The buffer doesn't endanger your memory. See also sysread for the low-level call.
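    To make the character-versus-byte distinction concrete, here is a small sketch that reads the same file one unit at a time under two different layers. The helper count_units and the sample text are assumptions made for the example; the sample is "café" plus a newline, stored as UTF-8, so it holds 6 bytes but 5 characters.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: count how many units read() hands back, one at a time,
# when $path is opened with the given I/O layer.
sub count_units {
    my ($path, $layer) = @_;
    open my $fh, "<$layer", $path or die "Cannot open $path: $!";
    my ($count, $unit, $got) = (0);
    while ($got = read($fh, $unit, 1)) {
        $count++;
    }
    die "read error: $!" unless defined $got;
    close $fh;
    return $count;
}

# Sample data: "caf", then U+00E9 as the two UTF-8 bytes C3 A9, then a newline.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "caf\xC3\xA9\n";
close $tmp_fh;

printf "raw layer:   %d units\n", count_units($tmp_path, ':raw');
printf "UTF-8 layer: %d units\n", count_units($tmp_path, ':encoding(UTF-8)');
```

    With the raw layer each call returns one byte; with the decoding layer each call returns one character, even when that character occupies two bytes on disk.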

Re: To read Char-by-Char from a file
by rob_au (Abbot) on May 12, 2005 at 11:10 UTC
    If you have an open file handle, see perlfunc:getc, otherwise cog's suggestion is quite elegant (assuming your character-set is not "wide", requiring more than one byte per character).
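    A minimal sketch of the getc approach, for illustration only. The helper char_histogram and the sample file are invented for the example; it keeps just a running count per distinct character, so the file itself is never held in memory.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: tally how often each character occurs in $path,
# pulling one character at a time with getc.
sub char_histogram {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my %count;
    # getc returns undef at end of file, so test definedness:
    # a literal "0" character is false but still valid input.
    while (defined(my $ch = getc($fh))) {
        $count{$ch}++;
    }
    close $fh;
    return \%count;
}

# Demo on a throwaway file.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "hello\n";
close $tmp_fh;

my $histogram = char_histogram($tmp_path);
printf "'l' occurs %d times\n", $histogram->{l};
```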

     

    perl -le "print unpack'N', pack'B32', '00000000000000000000001000000000'"

Re: To read Char-by-Char from a file
by eXile (Priest) on May 12, 2005 at 13:41 UTC
      Absolutely. This has the added benefit of bypassing Perl's buffered I/O, which is what you really want if you need to read a precise amount of data from a "source".

      In addition, if you want to read byte-by-byte instead of char-by-char (which is different due to Unicode support), it's safer to use binmode:

      binmode FILE; sysread(FILE,$buffer,1);
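      The two calls above could be fleshed out roughly as follows. This is a sketch under assumptions: the helper sum_bytes and the three-byte sample file are made up for the example. Each sysread here is a separate system call, so the loop is unbuffered and will be slow on large files.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read $path one byte at a time with sysread and
# return the number of bytes seen and the sum of their values.
sub sum_bytes {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;                                  # bytes, not characters
    my ($seen, $sum, $buffer) = (0, 0);
    while (1) {
        my $got = sysread($fh, $buffer, 1);
        die "sysread error: $!" unless defined $got;  # undef signals an error
        last if $got == 0;                            # 0 signals end of file
        $seen++;
        $sum += ord $buffer;
    }
    close $fh;
    return ($seen, $sum);
}

# Demo on a throwaway file holding the bytes 0x01, 0x02 and 0xFF.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "\x01\x02\xFF";
close $tmp_fh;

my ($seen, $sum) = sum_bytes($tmp_path);
print "read $seen bytes, sum of values $sum\n";
```

      Checking for undef separately from 0 distinguishes a read error from a normal end of file.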

      Flavio (perl -e 'print(scalar(reverse("\nti.xittelop\@oivalf")))')

      Don't fool yourself.
Re: To read Char-by-Char from a file
by Anonymous Monk on Apr 05, 2018 at 21:39 UTC
    Some of us like to do things ourselves, without shopping around for a pre-built library with pages of syntax and options. A simple and braindead way of reformatting things like XML or Oracle tnsnames.ora entries is to count opening and closing delimiters to determine where to use tabs, CRs, etc. Reading one character at a time makes this somewhat easy. In this case it doesn't matter whether the file is being buffered, because decisions about the output, not the input, are being made per character.
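    As a rough illustration of the delimiter-counting idea, the sketch below re-indents parenthesised text of the kind found in tnsnames.ora entries. Everything in it is an assumption made for the example: the helper name reindent, the two-space indent, the sample input, and the simplification of discarding all existing whitespace (which would mangle values that legitimately contain spaces).

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in to $out one character at a time, starting a
# new line indented by nesting depth at every opening parenthesis.
sub reindent {
    my ($in, $out) = @_;
    my $depth = 0;
    while (defined(my $ch = getc($in))) {
        next if $ch =~ /\s/;              # simplification: drop existing whitespace
        if ($ch eq '(') {
            $depth++;
            print {$out} "\n", '  ' x $depth, '(';
        }
        elsif ($ch eq ')') {
            print {$out} ')';
            $depth-- if $depth > 0;       # tolerate a stray closing parenthesis
        }
        else {
            print {$out} $ch;
        }
    }
    print {$out} "\n";
    return;
}

# Demo using in-memory filehandles in place of real files.
my $text = "A=(B=(C=1)(D=2))";
open my $in, '<', \$text or die "Cannot open in-memory input: $!";
my $result = '';
open my $out, '>', \$result or die "Cannot open in-memory output: $!";
reindent($in, $out);
close $out;
close $in;
print $result;
```

    Only the current nesting depth is kept between characters, which is why it makes no difference whether the input is buffered.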
      Unfortunately, some important syntax constructs in XML are longer than one character (e.g. <!-- or <![CDATA[ and their closing counterparts), but they can change how the following < should be interpreted. Good luck with implementing your own XML parser!

      ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,