PerlMonks  

Printing Columns from Two Different Files

by gisrob (Novice)
on Jan 31, 2003 at 16:18 UTC ( [id://231653]=perlquestion )

gisrob has asked for the wisdom of the Perl Monks concerning the following question:

Monks - I have two different directories containing files with the same name, e.g.

    /home/robs/data/10.06.94.txt
    /home/robs/newdata/10.06.94.txt

I'd like to loop over the directories, open each similarly named file, read it into an array, and then print selected columns to a new file. Seems pretty basic, but a) I'm stumped on how to print to the third file, and b) I don't know how to automate this with opendir and readdir to do this iteratively for all files. Here's what I've come up with, but it doesn't work.
    #!usr/bin/perl
    $folder1 = "C:/neaq/ArcData/bluefin/data_analysis/1994sptr/data";
    chdir "$folder1";
    $textfile = "10.06.94.txt";
    open(INFILE, $textfile) or die "Cannot open: $textfile\n";
    while(<INFILE>) {
        chomp;
        @data = split;
        ($mask, $x, $y, $dfront, $ftdens, $depth, $slope, $tuna, $temp, $sptr) = @data;
    }
    close(INFILE);

    $folder2 = "C:/neaq/ArcData/bluefin/data_dens/1994sptr";
    chdir "$folder2";
    $textfile2 = "10.06.94.txt";
    open(INFILE2, $textfile2) or die "Cannot open: $textfile2\n";
    while(<INFILE2>) {
        chomp;
        @data2 = split;
        ($mask, $x, $y, $ftdens) = @data2;
    }
    close(INFILE2);

    open(OUTFILE, "> newdens.txt");
    while (<>) {
        print OUTFILE $data[1], $data[2], $data[3], $data[4], $data[5], $data[6], $data[7], $data2[3], "\n";
    }
    close(OUTFILE);
The code opens the outfile, but prints nothing to it, and just hangs. Any advice on how to accomplish this? Thanks

Replies are listed 'Best First'.
Re: Printing Columns from Two Different Files
by boo_radley (Parson) on Jan 31, 2003 at 16:59 UTC
    You've got a bunch of the pieces put together correctly, but there are a few areas where these individual bits don't mesh together right. You're reading two files (the while (<INFILE>) and while (<INFILE2>) loops), but you're not storing the results of those reads. In fact, each run through the loop overwrites the information that was there previously. You might declare arrays outside of these loops and store the results of the splitting in them, so that you preserve the work you're doing.
    You don't seem to use $mask, $x, $y, $dfront, $ftdens, $depth, $slope, $tuna, $temp or $sptr in any useful fashion; is this part of a larger script? If not, you might consider dropping them entirely because they're not doing you any good.
    It may also benefit you to drop the chdirs and simply open the files with:

    $folder2 = "C:/neaq/ArcData/bluefin/data_dens/1994sptr/";
    $textfile2 = "10.06.94.txt";
    open(INFILE2, "$folder2$textfile2") or die "Cannot open: $textfile2\n";

    Finally, the diamond operator doesn't do what I expect you expect it does. It runs through @ARGV (which itself gets populated with the script's command line arguments), treating each element as a filename which gets opened and read line by line. Since no arguments were specified, it's waiting for input on STDIN ("just hangs").
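    To see that behaviour in isolation, here is a minimal sketch. The file name diamond_demo.txt is made up for the demo and is created and deleted by the script itself; setting @ARGV by hand stands in for passing the name on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical demo file, created only so the example is self-contained.
my $demo = "diamond_demo.txt";
open(my $fh, '>', $demo) or die "Cannot write $demo: $!";
print $fh "first line\nsecond line\n";
close($fh);

# With something in @ARGV, <> opens each named file in turn.
# With an empty @ARGV it would read STDIN instead and wait for input.
@ARGV = ($demo);
my @lines;
while (my $line = <>) {
    chomp $line;
    push @lines, $line;
}
print scalar(@lines), " lines read from $demo\n";
unlink $demo;
```

    Run it without the @ARGV assignment and it sits waiting for keyboard input, which is the "hang" in the original script.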
    As a rough, untested outline, here's my take on the problem. It's not fully fleshed out, but it provides a quick idea of what a solution might look like.
    # "or" rather than "||": with ||, the die binds to the filename string and never fires
    open FH, "first file" or die "problem with first file : $!";
    @first = <FH>;   # slurp into array
    close FH;
    open FH, "next file" or die "problem with next file : $!";
    @second = <FH>;  # slurp into array
    close FH;

    # at this point, we've got the lines of each file in two different arrays.
    # if it's important that they have the same number of lines, you can
    # check with if (scalar (@first) != scalar (@second)) {...
    #
    # now start output to the third file. maybe check to see if it exists?
    open OUTPUT, ">output.txt" or die "problem with output file : $!";
    # since we're reading from 2 different arrays
    # it's easier to use for (0..$#first) rather than foreach (@first)
    # so that we can apply the element number to both arrays.
    #
    my $sep = ",";  # specifies comma separated output. this might be bad if your data has commas in it.
    for (0..$#first) {  # this is a liability if the second file has more lines...
        # split ' ' breaks on runs of whitespace and drops the trailing newline;
        # array elements are indexed with [], not ()
        print OUTPUT join($sep, split(' ', $first[$_]), split(' ', $second[$_])), "\n";
    }
    close OUTPUT;
Re: Printing Columns from Two Different Files
by Hofmator (Curate) on Jan 31, 2003 at 17:19 UTC
    If you are processing the three files (two input, one output) really line by line, then one while loop is enough. Just open all three files beforehand and close them afterwards. Something like the following code should work then:
    my $folder1    = 'folder1';
    my $folder2    = 'folder2';
    my $folder_out = 'outfolder';
    my $textfile   = 'file.txt';

    open(my $file1, "< $folder1/$textfile") or die $!;
    open(my $file2, "< $folder2/$textfile") or die $!;
    open(my $outfile, "> $folder_out/$textfile") or die $!;

    while (my $line1 = <$file1>) {
        my $line2 = <$file2>;
        # split takes the pattern first, then the string to split
        my @data1 = split ' ', $line1;
        my @data2 = split ' ', $line2;
        # join with a space so the columns stay separated in the output
        print $outfile join(' ', @data1[1..7], $data2[3]), "\n";
    }

    close($file1);
    close($file2);
    close($outfile);

    -- Hofmator

Re: Printing Columns from Two Different Files
by cfreak (Chaplain) on Jan 31, 2003 at 17:00 UTC

    I see a couple of problems. First, and I'm sure you're about to hear a lot of this: you should always use strict; and use warnings;. Strict forces you to declare your variables with my, and both directives (placed at the top of your program) are very useful for helping you debug your program.
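    As a small illustration of what strict buys you, the sketch below feeds it a misspelled array name. The name @dtaa is an invented typo, and the snippet goes through a string eval only so that the demo can print the error and keep running; in a normal script the same typo would stop compilation outright.

```perl
use strict;
use warnings;

# Code with a typo: @dtaa was never declared, @data was.
my $code = 'my @data = (1, 2, 3); my $n = @dtaa; 1;';

# The string eval is compiled under the enclosing "use strict",
# so the undeclared variable is reported instead of silently being empty.
my $ok  = eval $code;
my $err = $ok ? '' : $@;
print "strict caught: $err" if $err;
```

    Without use strict the same code compiles quietly and $n ends up 0, which is exactly the kind of bug that is hard to spot by eye.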

    As for the immediate problems: While reading data in I notice that you have this (similar for the second one):

    @data = split; ($mask,$x,$y,$ftdens) = @data;
    split with no arguments splits on nothing for $_. I'm not certain of your data structure, but this is going to leave you with an array of every character on each line, and I'm not convinced that is what you want. Perhaps you could show us a sample of your data file. Also, the @data and @data2 arrays, as well as the variables that you create on the second line, aren't even being used, and get clobbered every time through the loop.

    Finally your output is the biggest problem. Your program is hanging because you've opened the output file and then did:

    while(<>) {

    That is what makes your program appear to hang. That while does not read from either of your input files: <> reads from the files named on the command line, or from STDIN when there are none, so the program just sits there waiting for input. On top of that, your @data is only going to contain the last line from the file that you opened.

    Here is a program that works on comma separated values. You'll have to tailor it for your own data structure :)

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file1   = "file1.txt";
    my $file2   = "file2.txt";
    my $outfile = "outfile.txt";

    my @data1 = ();
    my @data2 = ();

    open(INFILE1, "$file1") or die "Couldn't open $file1: $!\n";
    while (<INFILE1>) {
        chomp;
        my @line = split(/,/, $_);
        push(@data1, \@line);   # keep every row, not just the last one
    }
    close(INFILE1);

    open(INFILE2, "$file2") or die "Couldn't open $file2: $!\n";
    while (<INFILE2>) {
        chomp;
        my @line = split(/,/, $_);
        push(@data2, \@line);
    }
    close(INFILE2);

    # now merge the data in the outfile
    open(OUTFILE, ">$outfile") or die "Couldn't open $outfile: $!\n";
    my $count = 0;
    foreach (@data1) {
        # join with commas and end the line, so the output stays one row per line
        print OUTFILE join(",", @$_, $data2[$count]->[3]), "\n";
        $count++;
    }
    close(OUTFILE);

    I hope that helps. Post some of your data and maybe we can help further.

    Chris

    Lobster Aliens Are attacking the world!

    Update: Oops, Fletch is correct: split with no arguments splits $_ on whitespace.

      split with no arguments splits on nothing for $_.

      Actually it splits $_ on whitespace. See perldoc -f split.</pedant>
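      A quick way to check this for yourself is sketched below. The sample row is the first data row posted elsewhere in this thread, with an extra space and a tab added by me to show that runs of whitespace count as one separator.

```perl
use strict;
use warnings;

my $row = "0 8852.68825  442495.5253\t82293.02 0\n";

# split with no arguments means split(' ', $_): leading whitespace is
# skipped and fields are separated by runs of whitespace, so the
# trailing newline disappears as well.
my @fields;
for ($row) {
    @fields = split;
}

# An empty pattern is what actually yields one element per character.
my @chars = split //, "0 88";

print scalar(@fields), " fields: @fields\n";
print scalar(@chars), " characters\n";
```

      So the original @data = split; was fine as far as splitting goes; the trouble in that script lies elsewhere.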

      Thanks for the tips. I posted a snippet of data in one of the replies below. Let me know if it helps.
Re: Printing Columns from Two Different Files
by l2kashe (Deacon) on Jan 31, 2003 at 17:11 UTC
    OK, first off I'm going on the assumption that the files will always have the same name; second, I am assuming that you know what data you want every time. Also, apparently your data file only has one line? If not, then you are only getting the last line from the file each time. I have moved any ||'s and &&'s to new lines so that the code doesn't wrap so much.
    #!/usr/bin/perl

    $base1   = "C:/neaq/ArcData/bluefin/data_analysis/1994sptr/data";
    $base2   = "C:/neaq/ArcData/bluefin/data_dens/1994sptr";
    $outfile = "C:/some/dir/to/reports";

    opendir(BASE1, "$base1")
        || die "Cant open $base1\nReason: $!\n";

    foreach $file ( grep(/\.txt$/, readdir(BASE1)) ) {
        # warn needs its parens here: "warn ... && next" would run next
        # before the message was ever printed
        warn("Skip: No matching $file in $base2\n"), next
            unless (-f "$base2/$file");

        open(IN, "$base1/$file")
            || die "Cant access $file\nReason: $!\n";
        while ( <IN> ) {
            @data1 = ( split(/\s+/) )[1,2,3,4,5,6,7];
        }
        close(IN);

        open(IN, "$base2/$file")
            || die "Cant access $base2/$file\nReason: $!\n";
        while ( <IN> ) {
            $data2 = ( split(/\s+/) )[3];
        }
        close(IN);

        open(OUT, ">>$outfile")
            || die "Cant append to $outfile\nReason: $!\n";
        print OUT "@data1 $data2\n";
        close(OUT);
    } # END foreach $file readdir(BASE1)
    closedir(BASE1);

    #
    # or alternately to save up our data and only print once
    # push(@out, join(' ', "$file: ", @data1, $data2));
    # then loop through out once and print it out here instead
    #


    /* And the Creator, against his better judgement, wrote man.c */
      Well actually no. My data files have more than one line. Guess I'm getting many things wrong. The data look like: oldfile:
      mask  x            y            dln9476326lce  dln94930101lc
      0     8852.68825   442495.5253  82293.02       0
      0     9913.98125   442495.5253  82751.22       0
      0     10975.27425  442495.5253  83238.23       0
      0     10975.27425  441434.2323  81704.7        0
      0     12036.56725  440372.9393  80174.36       0
      0     12036.56725  439311.6463  80174.36       0
      0     12036.56725  438250.3533  78647.4        0
      0     13097.86025  438250.3533  79192.59       0
      0     13097.86025  437189.0603  77679.91       0
      0     13097.86025  436127.7673  77679.91       0
      <snip>

      I haven't included all the columns, but there are 5 more.
      newfile:
      mask  x            y             dln9476331lc
      0     23710.79025  396859.92632  0
      0     27955.96225  396859.92632  0.2530461
      0     29017.25525  395798.63332  0.4151559
      0     29017.25525  394737.34032  2.826168
      0     19465.61825  393676.04732  0
      <snip>
      The files always have the same name in each directory, the same number of lines, and the same structure (within each directory, but different across directories). They are output from a GIS sampling program. The only difference is the days on which they were sampled. Files look like:
      C:\neaq\ArcData\bluefin\data_dens\1994_sptr>dir
      10.06.94.txt
      10.07.94.txt
      7.10.94.txt
      7.11.94.txt
      7.12.94.txt
      7.14.94.txt
      <snip>

      C:\neaq\ArcData\bluefin\data_analysis\1994sptr\data>dir
      10.06.94.txt
      10.07.94.txt
      7.10.94.txt
      7.11.94.txt
      7.12.94.txt
      7.14.94.txt
      <snip>
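      Given data shaped like this (same file names in both directories, same number of lines, whitespace-separated columns), one possible way to put the pieces from this thread together is sketched below. It is not anyone's posted solution: the sub names merge_file and merge_dirs, the script name merge.pl and the idea of a separate output directory are my own assumptions, and the column indexes (1..7 from the first file, 3 from the second) are simply the ones used in the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Merge one pair of files line by line: fields 1..7 of the first file
# plus field 3 of the second (the indexes used in the original question).
sub merge_file {
    my ($in1, $in2, $out) = @_;
    open(my $fh1, '<', $in1) or die "Cannot open $in1: $!";
    open(my $fh2, '<', $in2) or die "Cannot open $in2: $!";
    open(my $ofh, '>', $out) or die "Cannot write $out: $!";
    while (defined(my $line1 = <$fh1>)) {
        my $line2 = <$fh2>;
        last unless defined $line2;          # second file ran out early
        my @data1 = split ' ', $line1;
        my @data2 = split ' ', $line2;
        # Keep only fields that exist, so short rows do not trigger warnings.
        my @keep = grep { defined } @data1[1 .. 7], $data2[3];
        print $ofh join(' ', @keep), "\n";
    }
    close($_) for $fh1, $fh2, $ofh;
}

# Walk the first directory and merge every .txt file that also exists
# in the second one. Returns the number of files written.
sub merge_dirs {
    my ($dir1, $dir2, $outdir) = @_;
    opendir(my $dh, $dir1) or die "Cannot open directory $dir1: $!";
    my @files = sort grep { /\.txt$/ } readdir($dh);
    closedir($dh);
    my $count = 0;
    for my $file (@files) {
        unless (-f "$dir2/$file") {
            warn "Skip: no matching $file in $dir2\n";
            next;
        }
        merge_file("$dir1/$file", "$dir2/$file", "$outdir/$file");
        $count++;
    }
    return $count;
}

# Usage (paths are placeholders): perl merge.pl olddir newdir outdir
merge_dirs(@ARGV) if @ARGV == 3;
```

      Reading both files in step keeps memory use flat and avoids the "only the last line survives" problem, at the cost of assuming the two files really do line up row for row. The header line is merged like any other row, and the output directory has to exist before the script runs.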
Re: Printing Columns from Two Different Files
by Fletch (Bishop) on Jan 31, 2003 at 16:54 UTC
    while (<>) {

    It's not hanging, it's waiting for input on STDIN just like you asked it to do. Perhaps you mean to iterate over @data instead?

Re: Printing Columns from Two Different Files
by BUU (Prior) on Jan 31, 2003 at 18:24 UTC
    For extreme magic:
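    # my list of files
    @ARGV = qw/txt1.txt foo.txt baz.txt etc.foo/;
    # the '>' is needed to open out.txt for writing rather than reading
    open OUT, '>out.txt' or die "Cannot write out.txt: $!";
    while (<>) { print OUT $_; }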
Re: Printing Columns from Two Different Files
by hardburn (Abbot) on Jan 31, 2003 at 17:00 UTC
Re: Printing Columns from Two Different Files
by bsb (Priest) on Feb 03, 2003 at 00:13 UTC
    If you're on unix the 'cut' and 'paste' commands might do part of what you're after.
    Not perl but handy

    cut - remove sections from each line of files
    paste - merge lines of files

    Or perhaps you can use perl for part of it and just merge the results with paste.
