Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
Hi,
I am new to Perl scripting. I have two input files, file1 and file2. I have to retrieve the second field of each record from file1 and check whether it exists in file2. I have read the contents of file2 and stored them in an array. Please find the contents of my file2 (npn.txt) below:
11111111
10001781
11111222
File1 contents:
abc,,,,
abc,10001781,,,
abc,10001782,,,
abd,10001783,,,
Based on the above files, what I need is to print the records whose second field does not match any entry in file2. Hence my output should contain:
abc,,,,
abc,10001782,,,
abd,10001783,,,
But it prints:
abc,10001781,,,
abc,10001782,,,
abd,10001783,,,
Can anyone please tell me what the bug in this is? Please find my code snippet below:
...
open (info, "<npn.txt");
chomp (@contents = (<info>));
close info;
$record = $_."~";
@flds = split(/\,/, $record);
if (grep !/^$flds[1]/, @contents) {
print $record;
}
' >hi5res.txt
Re: grep in Perl
by Laurent_R (Canon) on Jun 05, 2015 at 13:16 UTC
I think that you should use a hash instead of an array to load file2, because a hash lookup is far more efficient than grepping through the full array for each line of file1. Something like this (untested):
use strict;
use warnings;
my $second_file = "npn.txt";
open my $fh2, '<', $second_file or die "Cannot open $second_file $!";
my %contents = map {chomp; $_ => 1} <$fh2>;
close $fh2;
my $first_file = "first_file.txt";
open my $fh1, '<', $first_file or die "Cannot open $first_file $!";
while( my $line = <$fh1>) {
chomp($line);
my $field2 = (split /,/, $line)[1];
# $line was chomped above, so add the newline back when printing.
print "$line\n" and next unless defined $field2;
print "$line\n" unless exists $contents{$field2};
}
close $fh1;
Please also note the way the files are opened (three-argument open, lexical filehandles, and a check of the return value), which is more in line with commonly accepted best practices.
Re: grep in Perl
by Tux (Canon) on Jun 05, 2015 at 13:32 UTC
$ cat npn.txt
11111111
10001781
11111222
$ cat first_file.txt
abc,,,,
abc,10001781,,,
abc,10001782,,,
abd,10001783,,,
$ cat test.pl
use strict;
use warnings;
use Text::CSV_XS "csv";
my %npn = map { chomp; $_ => 1 } do { local @ARGV = "npn.txt"; <> };
csv (in => "first_file.txt", filter => { 2 => sub { !exists $npn{$_} } });
$ perl test.pl
abc,,,,
abc,10001782,,,
abd,10001783,,,
Enjoy, Have FUN! H.Merijn
Re: grep in Perl
by Anonymous Monk on Jun 05, 2015 at 09:18 UTC
I am new to Perl scripting. I have two input files, file1 and file2. I have to retrieve the second field of each record from file1 and check whether it exists in file2. I have read the contents of file2 and stored them in an array. Please find the contents of my file2 (npn.txt) below:
11111111
10001781
11111222
File1 contents:
abc,,,,
abc,10001781,,,
abc,10001782,,,
abd,10001783,,,
Based on the above files, my output should contain:
abc,,,,
abc,10001782,,,
abd,10001783,,,
But it prints:
abc,10001781,,,
abc,10001782,,,
abd,10001783,,,
Can anyone please tell me what the bug in this is? Please find my code snippet below:
...
open (info, "<npn.txt");
chomp (@contents = (<info>));
close info;
$record = $_."~";
@flds = split(/\,/, $record);
if (grep !/^$flds[1]/, @contents) {
print $record;
}
' >hi5res.txt
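Your grep returns all the elements of @contents that are NOT matched by your regex. Because the file holds several different keys, at least one element always fails to match a non-empty field, so the test is true for every such record; for the empty field, /^/ matches every element, grep returns nothing, and abc,,,, is skipped. Removing the ! from the regex and negating the result of the grep instead, as in unless (grep { $_ eq $flds[1] } @contents), should fix that, assuming the rest of the code that we can't see is okay.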
This is not the best way to do this, though, because you're looping through the contents of one file for each line in the other file. The standard way to "match lines from one file to keys from another file" is to load the keys from one file (file2) into a hash as its keys, then go through the other file (file1) splitting out the matching part and seeing if it exists as a key in the hash. Since you want the records that are not in file2, print the line when the key does not exist.
Try coding it that way, and let us know if you need help.
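To make the difference between negating the regex and negating the grep result concrete, here is a minimal sketch using the sample keys from the question. The helper names any_not_matching and none_equal are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Sample keys from npn.txt in the question.
my @contents = qw(11111111 10001781 11111222);

# Hypothetical helper mirroring the original test: in scalar context grep
# returns how many elements make the expression true, i.e. how many
# elements do NOT start with $field.
sub any_not_matching {
    my ($field, @list) = @_;
    my $count = grep { !/^$field/ } @list;
    return $count;
}

# Hypothetical helper: true only when no element is equal to $field.
sub none_equal {
    my ($field, @list) = @_;
    my $hits = grep { $_ eq $field } @list;
    return $hits == 0;
}

for my $field ('', '10001781', '10001782') {
    printf "field='%s' any_not_matching=%d none_equal=%s\n",
        $field,
        any_not_matching($field, @contents),
        none_equal($field, @contents) ? 'yes' : 'no';
}
# field='' any_not_matching=0 none_equal=yes
# field='10001781' any_not_matching=2 none_equal=no
# field='10001782' any_not_matching=3 none_equal=yes
```

With this data the first test is true for both 10001781 and 10001782, which is why both records get printed, and false only for the empty field. The second test picks out exactly the records whose second field is missing from npn.txt.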
Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.
use strict;
use warnings;
my $second_file = "npn.txt";
open my $fh2, '<', $second_file or die "Can't open '$second_file' for reading: $!\n";
my @contents;
chomp (@contents = (<$fh2>));
close $fh2;
my $first_file= "hi5res.txt";
open my $fh1, '<', $first_file or die "Can't open '$first_file' for reading: $!\n";
while( my $eachLine=<$fh1>)
{
chomp($eachLine);
my @array = split(/,/,$eachLine);
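# A missing second field cannot be in npn.txt; treat it as an empty string.
my $field2 = defined $array[1] ? $array[1] : '';
# Print the record only if its second field is NOT listed in npn.txt.
unless (grep {$_ eq $field2} @contents) {
print "$eachLine\n" ;
}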
}
All is well. I learn by answering your questions...