in reply to Re^2: incrementing already existing file
in thread incrementing already existing file
It looks like both files use column 5 as a "key" of sorts to connect them. I would approach this by reading all of the first file (the one you open as MYFILE), collecting the values from the last column along the way. Since you only need one value from each line, I save them in an array, indexed by the key, as I read the file. This will work fine even for fairly large files. Once the first file is processed, read the second file (the one you open as NEWF) line by line, do the substitutions, and write the output as you go.
Note that I use split (not substr) to get the fields of interest from each line (same approach for both files). For the output, I join the fields with a tab character. You should change that to something else (e.g., a fixed number of space characters) if you need the output formatted differently. And of course this writes to STDOUT, so you will need to redirect the output on the command line or add to this code to open an output file and print to that.

    #!/usr/bin/env perl
    use strict;
    use warnings;

    my $file1 = "pm-890461-in1.txt";
    my $file2 = "pm-890461-in2.txt";

    open( MYFILE, '<', $file1 ) or die "cannot open $file1: $!";
    open( NEWF, '<', $file2 ) or die "cannot open $file2: $!";

    my @in_values;
    while ( <MYFILE> ) {
        chomp;
        my( $index, $value ) = ( split /\s+/ )[4, -1];
        # above line does same thing as next three
        # my @fields = ( split /\s+/ );
        # my $index = $fields[4];
        # my $value = $fields[-1];
        $in_values[ $index ] = $value;
    }
    close MYFILE;

    while ( <NEWF> ) {
        chomp;
        my @fields = ( split /\s+/ );
        my $index = $fields[4];
        $fields[-1] = $in_values[ $index ];
        my $output = join "\t", @fields;
        print "$output\n";
    }
    close NEWF;
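As mentioned above, the script joins fields with tabs and prints to STDOUT. Here is a minimal sketch of the two suggested changes: fixed-width columns and printing to an output file. The helper name format_fixed, the column width of 8, the output file name, and the sample record are all assumptions made up for illustration; none of them come from the original post.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): pad each field to a
# fixed width, left-justified, instead of joining the fields with tabs.
sub format_fixed {
    my ( $width, @fields ) = @_;
    return join '', map { sprintf '%-*s', $width, $_ } @fields;
}

# Hypothetical output file name; change it to suit.
my $outfile = 'pm-890461-out.txt';
open( my $out, '>', $outfile ) or die "cannot open $outfile: $!";

# Made-up sample record standing in for the @fields array built in the loop.
my @fields = qw( x y z w 2 20 );
print {$out} format_fixed( 8, @fields ), "\n";

close $out or die "cannot close $outfile: $!";
```

In the real script, the open would go before the second while loop and the print {$out} line would replace the print to STDOUT. Note that sprintf pads short fields but does not truncate long ones, so a field wider than the chosen width will push the following columns to the right.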
When you are more comfortable with Perl, you will find that some of this is actually on the "verbose" side. Using Perl idioms would make some of my code more compact, but also a bit harder to follow until you have more experience.
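For comparison, here is one possible tighter version of the same two-pass approach. It is only a sketch, not the code from the post: the sub names load_values and substitute are made up, the input is made-up sample data read through in-memory filehandles so the example runs without any files, and it uses lexical filehandles and split ' ' (which also discards leading whitespace and the trailing newline, so no chomp is needed).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# First pass: the key is field 5 (index 4), the value is the last field.
sub load_values {
    my ($fh) = @_;
    my @values;
    while ( my $line = <$fh> ) {
        my ( $index, $value ) = ( split ' ', $line )[ 4, -1 ];
        $values[$index] = $value;
    }
    return \@values;
}

# Second pass: replace the last field with the value stored under the key.
sub substitute {
    my ( $fh, $values ) = @_;
    my @out;
    while ( my $line = <$fh> ) {
        my @fields = split ' ', $line;
        $fields[-1] = $values->[ $fields[4] ];
        push @out, join "\t", @fields;
    }
    return @out;
}

# Made-up sample data standing in for the two input files.
my $first  = "a b c d 1 10\na b c d 2 20\n";
my $second = "x y z w 2 0\nx y z w 1 0\n";

open( my $in1, '<', \$first )  or die "cannot open in-memory file: $!";
open( my $in2, '<', \$second ) or die "cannot open in-memory file: $!";

my $values = load_values($in1);
print "$_\n" for substitute( $in2, $values );
```

With this sample data, the last column of each line in the second input (0) is replaced by 20 and 10 respectively. Like the longer version, it assumes every line has at least five fields and that every key in the second file also appears in the first.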
Re^4: incrementing already existing file
by wanttoprogram (Novice) on Feb 28, 2011 at 22:57 UTC
by broomduster (Priest) on Mar 01, 2011 at 00:04 UTC
"There are two sets of values 'A' and 'B'. The code you gave me is considering B set values only. Is there any way I can ask it to look for A first and then move to B?"

You almost certainly can. But you need to explain better what 'A' and 'B' are and show some examples of the input files. It sounds as if your data from NEWF have 'A' and 'B' somewhere on each line. 'A' should be replaced by a value from your MYFILE and 'B' should be replaced by a value from SOME_OTHER_FILE. This will be an easy modification if 'A' and 'B' are the last two columns in NEWF. If that is correct, here's how to proceed:
I think you should try to write this code yourself. If I made some wrong assumptions, post back with clarification and (most important) samples of the input files and a sample of what the output should look like. I'm quite happy to help you get your work done, but you will learn best if you give it a go on your own and then ask for help if something doesn't work the way you want.
by wanttoprogram (Novice) on Mar 01, 2011 at 21:25 UTC
by broomduster (Priest) on Mar 01, 2011 at 23:35 UTC
by wanttoprogram (Novice) on Mar 02, 2011 at 21:12 UTC
In Section
Seekers of Perl Wisdom