http://qs321.pair.com?node_id=1103787

PackerX has asked for the wisdom of the Perl Monks concerning the following question:

I have a file that I would like to update, line by line. This is what I'm attempting, which is not working.
    open(DATA, "+<myfile.csv") || die "Error: $!";
    while(<DATA>) {
        s/foo/bar/g;
        # print;
    }
    close(DATA);
If I uncomment #print, the output gives me exactly what I expect, but the file is not updating. What am I missing here? I need to update the file line by line because the file may be extremely large and I do not want to pull the whole thing into memory.

Replies are listed 'Best First'.
Re: Using read/write to update a file
by blue_cowdawg (Monsignor) on Oct 14, 2014 at 18:51 UTC

    Try this:

    #!/usr/bin/perl -w
    use strict;
    use Tie::File;

    tie my @ry,"Tie::File","myfile.csv" or die $!;
    $_ =~ s/foo/bar/g for @ry;
    untie @ry;


    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; Blog: http://blog.berghold.net Warning: No political correctness allowed.
Re: Using read/write to update a file
by toolic (Bishop) on Oct 14, 2014 at 18:51 UTC
    In your code, print prints to STDOUT, not your myfile.csv file. You could specify the DATA filehandle, but I think you'll just be appending lines, not replacing them. Consider Tie::File.

      I believe that you will end up overwriting the next record, not appending. So if you start with a file

      afoo
      bfoo
      cfoo
      dfoo
      you should end up with
      afoo
      abar
      cfoo
      cbar

      (untested)

      --MidLifeXis

Re: Using read/write to update a file
by Laurent_R (Canon) on Oct 14, 2014 at 19:00 UTC
    Although there are some ways around, you usually can't update an existing file, except sometimes when it has fixed-length records (e.g. database context). The solution is usually to write to another file and then replace the old file with the new one, doing the renaming etc. Or, if your file is not too large, to first read from the file into memory (for example into an array of lines), to make the changes in memory, and to write the memory content back to the file at the end (that's essentially what happens when you open a file with a text editor, with MS Word or similar applications).

    Using the -i command line flag or the Tie::File module can enable you to get away with this, but this is only because the dirty house-keeping aspects are basically hidden behind a smokescreen.
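
    The "write to another file, then replace the old one" approach described above can be sketched roughly as follows. This is only a minimal sketch, not anyone's posted solution: the helper name replace_in_file is made up for illustration, and it assumes the search text is a literal string and that the directory holding the file is writable.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper (not from the thread): stream $file through a
# substitution into a temporary file, then rename it over the original.
sub replace_in_file {
    my ($file, $from, $to) = @_;

    open(my $in, '<', $file) or die "Cannot read $file: $!";

    # Create the temporary file in the same directory so that rename()
    # does not have to cross a filesystem boundary.
    my ($out, $tmp) = tempfile(DIR => dirname($file), UNLINK => 0);

    # Only one line is held in memory at a time.
    while (my $line = <$in>) {
        $line =~ s/\Q$from\E/$to/g;    # \Q...\E: treat $from literally
        print {$out} $line or die "Cannot write $tmp: $!";
    }

    close($in);
    close($out) or die "Cannot close $tmp: $!";
    rename($tmp, $file) or die "Cannot rename $tmp to $file: $!";
    return;
}
```

    Calling replace_in_file('myfile.csv', 'foo', 'bar') would then rewrite the file without slurping it. Note that File::Temp creates the new file with restrictive permissions, so the original file's mode and ownership are not carried over; a real script would probably want to copy them before the rename.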

      "Although there are some ways around, you usually can't update an existing file ... except sometimes ... only because the dirty house-keeping aspects are basically hidden behind a smokescreen."

      Chief Vitalstatistix, the famous leader of the Gaulish village, had only one fear: that the sky might fall on his head tomorrow.

      But fortunately this hasn't happened so far ;-)

      Best regards, Karl

      «The Crux of the Biscuit is the Apostrophe»

Re: Using read/write to update a file
by karlgoethebier (Abbot) on Oct 14, 2014 at 19:19 UTC

    Hi and welcome PackerX!

    You print to STDOUT.

    Something like what you are trying to accomplish is described in the Perl Cookbook, recipe 7.10 (Modifying a File in Place):

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Data::Dump;

    my $file = qq(buff.txt);

    open( FH, "+<", $file ) or die "Opening: $!";
    my @array = <FH>;
    dd \@array;
    s/cuke/beer/ for @array;
    seek( FH, 0, 0 ) or die "Seeking: $!";
    print FH @array or die "Printing: $!";
    truncate( FH, tell(FH) ) or die "Truncating: $!";
    close(FH);
    dd \@array;

    __END__

    karls-mac-mini:monks karl$ cat buff.txt
    foo
    bar
    nose
    cuke
    karls-mac-mini:monks karl$ ./inplace.pl
    ["foo\n", "bar\n", "nose\n", "cuke\n", "\n", "\n"]
    ["foo\n", "bar\n", "nose\n", "beer\n", "\n", "\n"]
    karls-mac-mini:monks karl$ cat buff.txt
    foo
    bar
    nose
    beer

    Please see also seek, tell, and truncate.

    Update: Posted a bit too late because of Germany vs. Ireland.

    Update2: Added forgotten link to tell.

    Regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

Re: Using read/write to update a file
by MidLifeXis (Monsignor) on Oct 14, 2014 at 18:53 UTC

    You will likely need seek, tell, print (with a filehandle argument), and possibly truncate.

    Unless the replacement is the same length or shorter (and even then you still might), you will also need to write to a temporary file, and not just the original file, doing a rename after closing both files. If you do not (or if you don't use the Tie::File suggestion above), you risk corruption or truncation of the data file.

    --MidLifeXis

Re: Using read/write to update a file
by McA (Priest) on Oct 14, 2014 at 19:16 UTC

    Hi,

    you must be aware that a file is a stream of bytes. There is no inherent concept of a record or line. A line ends where a byte representing a Newline (UNIX-like) or two bytes representing ASCII Carriage Return and Newline are found.

    So, when you have a substitution which makes the line longer, you need more bytes between the first byte of the line and the one marking its end. That means you have to move the whole rest of the file towards the end. If the substitution makes the line shorter, you have to shift the whole rest of the file towards the beginning of the file.

    Only when you know that the substitution will NOT change the length of a line, so that the line needs the same space before and after the substitution, could you implement something like what you want.
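
    For that same-length case, overwriting through a "+<" handle might look roughly like the sketch below. It is an illustration under stated assumptions, not tested advice from this thread: the helper name overwrite_same_length is invented here, the search text is treated as a literal byte string, and the helper refuses to run unless the replacement has exactly the same length.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): overwrite each changed line
# in place. Only safe when the replacement has exactly the same length
# in bytes as the text it replaces.
sub overwrite_same_length {
    my ($file, $from, $to) = @_;
    die "Replacement must have the same length\n"
        unless length($from) == length($to);

    open(my $fh, '+<', $file) or die "Cannot open $file: $!";
    binmode($fh);    # work in bytes so tell/seek offsets match the data

    while (1) {
        my $start = tell($fh);    # byte offset where this line begins
        my $line  = <$fh>;
        last unless defined $line;

        if ($line =~ s/\Q$from\E/$to/g) {
            seek($fh, $start, 0) or die "Seeking: $!";     # back to line start
            print {$fh} $line    or die "Printing: $!";    # overwrite the line
            # Seek again (to the current position) before the next read,
            # since the handle switches from writing back to reading.
            seek($fh, tell($fh), 0) or die "Seeking: $!";
        }
    }

    close($fh) or die "Closing: $!";
    return;
}
```

    The key points are remembering the offset with tell before each read, seeking back to it before the print, and printing to the filehandle rather than to STDOUT. As soon as the replacement is longer or shorter than the original text, this stops being safe and a copy-and-rename is the better choice.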

    In general you always produce a copy of the original file with all substitutions and delete the original afterwards.

    When you're short of space you can split the original file into X parts having a certain count of lines, make the substitution with copy in that part and concatenate all resulting files afterwards. Something like that on a Unix shell:

    split -l 1000 myfile myfile.
    for file in myfile.*
    do
        perl -i -pe 's/3/2/g' $file
    done
    cat myfile.* > newfile
    rm myfile myfile.*

    where -l 1000 splits every 1000 lines and perl -i -pe 's/3/2/g' $file substitutes 3 by 2.

    UPDATE: That is nonsense. With my example you copy the whole file in smaller pieces and gain nothing. I shouldn't talk to my children while answering questions here. Sorry.

    Regards
    McA

Re: Using read/write to update a file
by PackerX (Initiate) on Oct 14, 2014 at 18:56 UTC
    Thank you, Tie::File works. But since I can't let things go, how does one overwrite using +< ?
Re: Using read/write to update a file
by aitap (Curate) on Oct 19, 2014 at 09:20 UTC

    By the way, there is a mode in Perl to run like sed, editing files in place. As in sed, this mode is activated with the -i switch, described in perlrun. In a Perl program this mode can be controlled via the $^I variable. There is also the special ARGV filehandle, which iterates over the command-line filenames in @ARGV. It's easy to combine these two to edit a file without slurping it into memory:

    {
        local ($^I, @ARGV) = (".bak", $filename);
        # $filename is renamed to "$filename.bak" and new "$filename" is opened for writing
        while (<>) { # ARGV filehandle is opened to read from "$filename.bak"
            s/foo/bar/g;
            print;
        }
    }
    # at this point @ARGV and $^I are restored to normal

    This might look too low-level. There are modules implementing this behaviour: File::Inplace and IO::InSitu. The latter is described in the Perl Best Practices book by Damian Conway.