http://qs321.pair.com?node_id=1023340

littlewenwen has asked for the wisdom of the Perl Monks concerning the following question:

Dear All

I have a tab delimited dataset "RawFile":

abcd	123	456	defg
cdefg	23		as
	345	235	
xsd		swe	

And I want to change all missing values to "Missing". So I wrote the following code:

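#!/usr/bin/perl
use warnings;
use strict;

open(my $outfile, ">", "UpdatedFile") or die "Cannot open UpdatedFile: $!\n";
open(my $infile, "<", "RawFile") or die "Cannot open RawFile: $!\n";

while (<$infile>) {
    chomp;
    # A LIMIT of -1 keeps trailing empty fields
    my @fieldsvar = split(/\t/, $_, -1);
    foreach (@fieldsvar) {
        if ($_ eq "") { $_ = "Missing" }
    }
    # Join with tabs so the output stays tab delimited
    print $outfile join("\t", @fieldsvar), "\n";
}
close($outfile) or die "Cannot close UpdatedFile: $!\n";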

The code works OK, except that I have to create a new file, "UpdatedFile". Can anyone suggest how the code could be rewritten to do an in-place update of "RawFile"?

Re: how to in-place update a dataset?
by kcott (Archbishop) on Mar 14, 2013 at 03:47 UTC

    G'day littlewenwen,

    Welcome to the monastery.

    One way to do that is with Tie::File:

    $ cat > fred.dat
    abcd	123	456	defg
    cdefg	23		as
    	345	235	
    xsd		swe	
    $ perl -Mstrict -Mwarnings -e '
        use Tie::File;
        tie my @fred_data, q{Tie::File}, q{fred.dat} or die $!;
        @fred_data = map { s/(^|\t)(?=\t|$)/${1}Missing/g; $_ } @fred_data;
        untie @fred_data;
    '
    $ cat fred.dat
    abcd	123	456	defg
    cdefg	23	Missing	as
    Missing	345	235	Missing
    xsd	Missing	swe	Missing

    -- Ken

Re: how to in-place update a dataset?
by educated_foo (Vicar) on Mar 14, 2013 at 02:24 UTC
    The standard approach is to write the new contents to UpdatedFile, then rename it to RawFile after you're done. That way if something goes wrong, you still have the original file.
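A minimal sketch of that write-then-rename approach, for illustration only. The sub name `fill_missing_via_rename` and the `.tmp` suffix are assumptions, not something from the thread; the temporary file is placed next to the original so that `rename` stays on one filesystem.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name and ".tmp" suffix are assumptions): write the
# cleaned data to a temporary file, then rename it over the original.
sub fill_missing_via_rename {
    my ($file) = @_;
    my $tmp = "$file.tmp";

    open my $in,  '<', $file or die "Cannot open '$file': $!\n";
    open my $out, '>', $tmp  or die "Cannot open '$tmp': $!\n";

    while (my $line = <$in>) {
        chomp $line;
        # LIMIT of -1 keeps trailing empty fields
        my @fields = map { $_ eq '' ? 'Missing' : $_ } split /\t/, $line, -1;
        print {$out} join("\t", @fields), "\n";
    }

    close $in;
    close $out or die "Cannot close '$tmp': $!\n";

    # The original is replaced only after the new file is complete.
    rename $tmp, $file or die "Cannot rename '$tmp' to '$file': $!\n";
    return;
}

# Usage: perl script.pl RawFile
fill_missing_via_rename($ARGV[0]) if @ARGV;
```

If the script dies before the `rename`, RawFile is left untouched and only the temporary file needs cleaning up.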

      Thank you. I am just wondering if the in-line update could be done. As I am still learning Perl, I hope to take this opportunity to learn more.

      Again, thank you for your help.

        Hello littlewenwen, and welcome to the Monastery!

        Yes, you can do in-line updates using the standard Tie::File module:

        #! perl
        use strict;
        use warnings;
        use Tie::File;

        my $filename = 'RawFile.txt';

        tie my @lines, 'Tie::File', $filename
            or die "Cannot tie file '$filename': $!";

        for my $i (0 .. $#lines)
        {
            my @fields = split /\t/, $lines[$i], -1;
            @fields    = map { $_ eq '' ? 'Missing' : $_ } @fields;
            $lines[$i] = join("\t", @fields);
        }

        untie @lines;

        Output in RawFile.txt:

        abcd	123	456	defg
        cdefg	23	Missing	as
        Missing	345	235	Missing
        xsd	Missing	swe	Missing

        Hope that helps,

        Update 1: Added -1 as LIMIT argument to split to get trailing empty fields.

        Update 2: Added final “Missing” to output; the final tab was missing from my input file on line 4. Thanks to kcott for the heads-up.

        Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,
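The effect of the -1 LIMIT mentioned in Update 1 can be seen in a small sketch. The sample line and the helper name `fill_missing` are made up for illustration: without a LIMIT, `split` silently drops trailing empty fields, so a missing value in the last column would never be seen.

```perl
use strict;
use warnings;

my $line = "xsd\t\tswe\t";    # sample record whose last field is empty

my @default = split /\t/, $line;        # trailing empty field is dropped
my @all     = split /\t/, $line, -1;    # trailing empty field is kept

printf "without LIMIT: %d fields\n", scalar @default;    # 3 fields
printf "with -1:       %d fields\n", scalar @all;        # 4 fields

# Hypothetical helper: replace every empty field in one record.
sub fill_missing {
    my ($record) = @_;
    return join "\t", map { $_ eq '' ? 'Missing' : $_ } split /\t/, $record, -1;
}

print fill_missing($line), "\n";
```

Leading and interior empty fields are kept either way; only the trailing ones depend on the LIMIT.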

Re: how to in-place update a dataset?
by jaredor (Priest) on Mar 14, 2013 at 07:15 UTC

    Use the -i option for command line perl, e.g.,

    perl -pi -e 'chomp; ($_ = "\t$_\t") =~ s/\t(?=\t)/\tMissing/g;' \
             -e 's/\A\t//; s/\t\Z/\n/;' RawFile

      Wow! There is so much information in all of your replies that I may need some time to digest. Thank you all very much for great help.
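The -i switch can also be turned on from inside a script by setting the special variable `$^I` and reading with `<>`. The following is a sketch under assumptions: the sub name `fill_missing_in_place` and the `.bak` backup suffix are illustrative choices, not something from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: edit each named file in place, as perl -i.bak would.
sub fill_missing_in_place {
    my @files = @_;
    return unless @files;     # with an empty @ARGV, <> would read STDIN

    local $^I   = '.bak';     # enable in-place editing, keep a .bak copy
    local @ARGV = @files;     # <> reads the files listed in @ARGV

    while (my $line = <>) {
        chomp $line;
        my @fields = map { $_ eq '' ? 'Missing' : $_ } split /\t/, $line, -1;
        # While in-place editing is active, print writes to the file
        # currently being edited rather than to the terminal.
        print join("\t", @fields), "\n";
    }
    return;
}

# Usage: perl script.pl RawFile
fill_missing_in_place(@ARGV) if @ARGV;
```

As with the command-line switch, Perl still writes a new file behind the scenes; the backup suffix just keeps the original around as RawFile.bak.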