PerlMonks  

Nested foreach problem

by Anonymous Monk
on Aug 19, 2008 at 16:52 UTC ( [id://705265]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am seeking help with a script that uses nested loops. I extract a name from @name_array and search
each line of each file in @files. If the line contains
the name ($wanted_name), I want the line to be printed
to OUTPUTFILE.
Problem 1: I don't know how to pass a variable, $wanted_name,
into a regular expression. It must be a variable so that it can be
updated. Something like: if ($line =~ m/$wanted_name/i) etc.

Problem 2: OK, since I can't deal with Problem 1, if I
hard-code a name I know to exist in several files (and
therefore will be printed to OUTPUTFILE), the matching
lines will be printed but REPEATEDLY. It seems as if
the program runs correctly, then repeats itself like
there is some form of infinite loop going on.
    foreach $name (@name_array) {
        $wanted_name = ($name);
        foreach $file (@files) {
            open (FILE, $file);
            @array = <FILE>;
            close(FILE);
            foreach $line (@array) {
                if ($line =~ m/$wanted_name/i) {
                    print OUTPUTFILE "$line";
                }
            }
        }
    }
    close(OUTPUTFILE);
Thanks in Advance

Replies are listed 'Best First'.
Re: Nested foreach problem
by kyle (Abbot) on Aug 19, 2008 at 17:14 UTC

    This is a little better (but not tested):

    NAME:
    foreach my $name ( @name_array ) {
        FILE:
        foreach my $file ( @files ) {
            open my $fh, '<', $file
                or die "Can't read '$file': $!";
            while ( my $line = <$fh> ) {
                chomp $line;
                if ( lc $line eq lc $name ) {
                    print OUTPUTFILE "$line\n";
                    next FILE;
                }
            }
        }
    }

    Note a few things:

    • Once a name is found in a file, I break out of the loop and go to the next file using next FILE (made possible by the label on that loop).
    • I check to see whether open failed, and I die if so.
    • I use chomp to take the line ending off the lines. That leaves just the name with no newline.
    • I compare names using eq and lc. That makes it case-insensitive, but it won't match partial names to full names (i.e., "Fred" won't match "Frederick").
    • I do not read the whole file at once. I read it a line at a time and conserve memory.

    The way you're trying to do it with regular expressions means that you'll end up matching partial names with full names (i.e., "Fred" will match "Frederick" and "Liam" will match "William"). If that's what you want, fine. You can say $line =~ /\Q$wanted_name/ and get that (see also quotemeta).
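A minimal sketch of the \Q approach mentioned above, assuming substring matching is what you want. The helper name lines_containing and the sample lines are hypothetical, made up for illustration. It shows that a variable interpolates into a pattern directly, and that \Q...\E keeps characters such as "." in the name from acting as regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines that contain $name, ignoring case.
# \Q...\E quotes any regex metacharacters in the interpolated variable.
sub lines_containing {
    my ($name, @lines) = @_;
    return grep { /\Q$name\E/i } @lines;
}

# Made-up sample data, not from the original question.
my @lines = (
    "Frederick Smith\n",
    "fred jones\n",
    "J.R. Ewing\n",
    "JXRY Corp\n",
);

# Substring match: both the "Frederick" and the "fred" line are returned.
print lines_containing('Fred', @lines);

# With \Q the dots are literal, so "JXRY Corp" is not returned.
print lines_containing('J.R.', @lines);
```

Without \Q, the pattern /J.R./ would also match "JXRY Corp", because an unquoted "." matches any character.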

    If you want each name to be found only once regardless of how many files it's in, you can change my "next FILE" to "next NAME" to skip over other files once the name is found.

    Update: I also use strict and warnings.

Re: Nested foreach problem
by olus (Curate) on Aug 19, 2008 at 17:16 UTC

    You don't seem to have problem 1 in your code. As for problem 2, if a name matches in several places, all occurrences will be printed. If you want uniqueness, use a hash. The following example illustrates the difference:

    use strict;
    use warnings;

    my @names = ('joe', 'mary', 'francis');
    my @lines = <DATA>;
    my %lines_matched;
    my @all_found;

    foreach my $name (@names) {
        my $wanted = $name;
        foreach my $line (@lines) {
            if ($line =~ m/\b$wanted\b/i) {
                push @all_found, $line;
                $lines_matched{$line} = 1;
            }
        }
    }

    print foreach @all_found;
    print "\nNow the uniques:\n";
    print foreach keys %lines_matched;

    __DATA__
    Joe is lazy
    Robin Wood
    Mary Poppins
    Francis Bacon
    Ruppert
    Joe married Mary

    outputs

    Joe is lazy
    Joe married Mary
    Mary Poppins
    Joe married Mary
    Francis Bacon

    Now the uniques:
    Francis Bacon
    Joe married Mary
    Joe is lazy
    Mary Poppins

    Update: added \b after reading kyle's comment.

Re: Nested foreach problem
by dwm042 (Priest) on Aug 19, 2008 at 18:09 UTC
    1) I'm sure others have told you, but with the file read inside the loop that checks each name, you end up reading each file once for every name checked. That's inefficient: the file loop should be the outer loop.

    One way of doing this is:

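    my @name_array = qw(fred joe john frank);
    for my $file (@files) {
        open(FILE, "<", $file) or die("can't open file $file: $!\n");
        while (my $line = <FILE>) {
            # grep in scalar context counts the names that match this line
            my $match = grep { $line =~ /$_/i } @name_array;
            print OUT $line if ( $match );   # OUT must already be open for writing
        }
        close(FILE);
    }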
    2) Since the "printing" loop doesn't check to see if the line has been printed before, you can print similar text over and over. If you want to avoid that, use a hash to ensure uniqueness of the lines being printed.

    An example:

    my %unique;
    if ( $have_a_line_to_print ) {
        $unique{$have_a_line_to_print} = 1;
    }
    #
    # much later
    #
    for ( sort keys %unique ) {
        print;
    }
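The hash-of-keys approach above prints the unique lines in sorted order. If the order in which lines were first found matters, one possible variant is to filter with a "seen" hash instead. This is only a sketch: the helper name unique_in_order is hypothetical, and the sample lines are borrowed from olus's example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns its arguments with duplicates removed,
# keeping the first occurrence of each item in its original position.
sub unique_in_order {
    my %seen;
    # $seen{$_}++ is false (0) the first time a value appears, true afterwards.
    return grep { !$seen{$_}++ } @_;
}

my @found = (
    "Joe is lazy\n",
    "Joe married Mary\n",
    "Mary Poppins\n",
    "Joe married Mary\n",   # duplicate, dropped by the filter
);

print unique_in_order(@found);
```

The same test, unless $seen{$line}++, can also be applied at the point where each line is printed, so nothing has to be collected first.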
Re: Nested foreach problem
by Bloodnok (Vicar) on Aug 19, 2008 at 17:20 UTC
    As has been mentioned time without number elsewhere, if nothing else, you ought to...
    • use strict;
    • use warnings;
    • Use the 3 argument variant of open
    • Test the status returned from the call to open
    That being said, you ...
    • Attempt to slurp the file in one, but have forgotten local undef $/;.
    • Fail to open OUTPUTFILE
    A user level that continues to overstate my experience :-))

      If you're reading into an array, you always get the whole file regardless of the value of $/. Also, local alone is enough to undef $/.
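A small sketch of the point above, using an in-memory filehandle so that it runs without any real file (the sample text is made up). Reading in list context returns every record of the file, whereas slurping into a single scalar is the case where $/ needs to be undefined, and local $/ on its own does that.

```perl
use strict;
use warnings;

# Made-up sample content, read through an in-memory filehandle.
my $text = "one\ntwo\nthree\n";

# List context: <$fh> returns all remaining records, one per array element.
open my $fh, '<', \$text or die "Can't open in-memory file: $!";
my @lines = <$fh>;
close $fh;

# Scalar context: "local $/;" leaves $/ undefined inside the do block,
# so a single read returns the entire content.
open $fh, '<', \$text or die "Can't open in-memory file: $!";
my $content = do { local $/; <$fh> };
close $fh;

printf "%d lines in the array, %d characters in the scalar\n",
    scalar @lines, length $content;
```

Because local is scoped to the do block, $/ returns to its previous value afterwards and later line-by-line reads are unaffected.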

Node Type: perlquestion [id://705265]
Approved by olus