PerlMonks  

grep in Perl

by Anonymous Monk
on Jun 05, 2015 at 09:11 UTC ( [id://1129174]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am new to Perl scripting. I have two input files, file1 and file2. I have to retrieve the second field of each record in file1 and check whether it exists in file2. I have read the contents of file2 and stored them in an array. Please find the contents of my file2 (npn.txt) below:

    11111111
    10001781
    11111222

File1 contents:

    abc,,,,
    abc,10001781,,,
    abc,10001782,,,
    abd,10001783,,,

Based on the above files, I need to print the records whose second field does not match any entry in file2. Hence my output should contain:

    abc,,,,
    abc,10001782,,,
    abd,10001783,,,

But it prints:

    abc,10001781,,,
    abc,10001782,,,
    abd,10001783,,,

Can anyone please guide me on what the bug is in this? Please find my code snippet below:

    ...
    open (info, "<npn.txt");
    chomp (@contents = (<info>));
    close info;
    $record = $_."~";
    @flds = split(/\,/, $record);
    if (grep !/^$flds[1]/, @contents) {
        print $record;
    }
    ' >hi5res.txt

Replies are listed 'Best First'.
Re: grep in Perl
by Laurent_R (Canon) on Jun 05, 2015 at 13:16 UTC
    I think that you should use a hash instead of an array to load file2, because a hash lookup is far more efficient than grepping through a full array for each line of file1. Something like this (untested):

        use strict;
        use warnings;

        my $second_file = "npn.txt";
        open my $fh2, '<', $second_file or die "Cannot open $second_file: $!";
        # Load every key from file2 into a hash for fast lookups.
        my %contents = map {chomp; $_ => 1} <$fh2>;
        close $fh2;

        my $first_file = "first_file.txt";
        open my $fh1, '<', $first_file or die "Cannot open $first_file: $!";
        while( my $line = <$fh1>) {
            chomp($line);
            my $field2 = (split /,/, $line)[1];
            # The line was chomped, so restore the newline when printing.
            print "$line\n" and next unless defined $field2;
            print "$line\n" unless defined $contents{$field2};
        }
        close $fh1;

    Please also note the way the files are opened (lexical filehandles, three-argument open, and an error check), which is more in line with commonly accepted best practices.
Re: grep in Perl
by Tux (Canon) on Jun 05, 2015 at 13:32 UTC

    Looks like a perfect task for Text::CSV_XS's csv using a filter:

    $ cat npn.txt
    11111111
    10001781
    11111222
    $ cat first_file.txt
    abc,,,,
    abc,10001781,,,
    abc,10001782,,,
    abd,10001783,,,
    $ cat test.pl
    use strict;
    use warnings;

    use Text::CSV_XS "csv";

    my %npn = map { chomp; $_ => 1 } do { local @ARGV = "npn.txt"; <> };
    csv (in => "first_file.txt", filter => { 2 => sub { !exists $npn{$_} } });
    $ perl test.pl
    abc,,,,
    abc,10001782,,,
    abd,10001783,,,

    Enjoy, Have FUN! H.Merijn
Re: grep in Perl
by Anonymous Monk on Jun 05, 2015 at 09:18 UTC

    I am new to Perl scripting. I have two input files, file1 and file2. I have to retrieve the second field of each record in file1 and check whether it exists in file2. I have read the contents of file2 and stored them in an array. Please find the contents of my file2 (npn.txt) below:

        11111111
        10001781
        11111222

    File1 contents:

        abc,,,,
        abc,10001781,,,
        abc,10001782,,,
        abd,10001783,,,

    Based on the above files, my output should contain:

        abc,,,,
        abc,10001782,,,
        abd,10001783,,,

    But it prints:

        abc,10001781,,,
        abc,10001782,,,
        abd,10001783,,,

    Can anyone please guide me on what the bug is in this? Please find my code snippet below:

        ...
        open (info, "<npn.txt");
        chomp (@contents = (<info>));
        close info;
        $record = $_."~";
        @flds = split(/\,/, $record);
        if (grep !/^$flds[1]/, @contents) {
            print $record;
        }
        ' >hi5res.txt

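      Your grep returns all the elements of @contents that are NOT matched by your regex, so it is true whenever at least one element fails to match, which is the case for every record with a non-empty second field. For "abc,,,," the second field is empty, the pattern becomes /^/, which matches every element, and the negated grep returns nothing, so that record is dropped. To print a record only when no element matches, negate the grep as a whole rather than the regex, for example: unless (grep { $_ eq $flds[1] } @contents). That should fix it, assuming the rest of the code that we can't see is okay.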
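
      To make the difference concrete, here is a minimal sketch using the sample data from the question. The helper names negated_inside and not_in_list are made up for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Sample data copied from the question; @contents stands in for the lines of npn.txt.
my @contents = qw(11111111 10001781 11111222);

# Negation INSIDE the grep: counts the elements that do NOT match the field.
# (Hypothetical helper, mirrors the condition used in the question.)
sub negated_inside {
    my ($field) = @_;
    return scalar grep { !/^\Q$field\E/ } @contents;
}

# Negation OUTSIDE the grep: true only when NO element equals the field.
# (Hypothetical helper, shows the intended test.)
sub not_in_list {
    my ($field) = @_;
    return !grep { $_ eq $field } @contents;
}

for my $field ('10001781', '10001782', '') {
    printf "field='%s' negated_inside=%d not_in_list=%s\n",
        $field, negated_inside($field), not_in_list($field) ? 'yes' : 'no';
}
```

      With this data, negated_inside is non-zero (true) for both 10001781 and 10001782, so both records get printed, and it is 0 for the empty field, so "abc,,,," is skipped. not_in_list is false only for 10001781, which is the behaviour the question asks for.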

      This is not the best way to do this, though, because you're looping through the contents of one file for each line in the other file. The standard way to "match lines from one file to keys from another file" is to load the keys from one file (file2) into a hash as its keys, then go through the other file (file1), splitting out the field to match and checking whether it exists as a key in the hash. Since you want the records that do not match, print the line when the key does not exist.
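
      A minimal sketch of that hash approach, using in-memory copies of the sample data instead of real files. The names @file2_lines, @file1_lines, %seen and unmatched_records are assumptions made for illustration. Since the question asks for the records that do not match, the sketch keeps a record when its second field is missing or absent from the hash.

```perl
use strict;
use warnings;

# In-memory stand-ins for the two files from the question (assumed one record per line).
my @file2_lines = ("11111111\n", "10001781\n", "11111222\n");
my @file1_lines = ("abc,,,,\n", "abc,10001781,,,\n", "abc,10001782,,,\n", "abd,10001783,,,\n");

# Load the keys from file2 into a hash so each lookup is a single hash access.
my %seen;
for my $key (@file2_lines) {
    chomp $key;
    $seen{$key} = 1;
}

# Hypothetical helper: return the records whose second field is not a key in the hash.
sub unmatched_records {
    my ($keys, @records) = @_;
    my @out;
    for my $record (@records) {
        chomp(my $line = $record);
        # split drops trailing empty fields, so "abc,,,," yields an undefined second field.
        my $field2 = (split /,/, $line)[1];
        push @out, $line unless defined $field2 && exists $keys->{$field2};
    }
    return @out;
}

print "$_\n" for unmatched_records(\%seen, @file1_lines);
```

      With the sample data this prints abc,,,, then abc,10001782,,, then abd,10001783,,, which is the output the question expects. Reading from real files would only replace the two arrays with lines read from filehandles.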

      Try coding it that way, and let us know if you need help.

      Aaron B.
      Available for small or large Perl jobs and *nix system administration; see my home node.

      This should work for you:

          use strict;
          use warnings;

          my $second_file = "npn.txt";
          open my $fh2, '<', $second_file or die "Can't read '$second_file': $!\n";
          my @contents;
          chomp (@contents = (<$fh2>));
          close $fh2;

          my $first_file= "hi5res.txt";
          open my $fh1, '<', $first_file or die "Can't read '$first_file': $!\n";
          while( my $eachLine=<$fh1>) {
              chomp($eachLine);
              my @array = split(/,/,$eachLine);
              # Print the record when its second field is missing or is not listed in npn.txt.
              if (!defined $array[1] || !grep {$_ eq $array[1]} @contents) {
                  print "$eachLine\n";
              }
          }

      All is well. I learn by answering your questions...

Node Type: perlquestion [id://1129174]
Approved by Corion