Open several files and read line by line

by Leiria (Novice)
on Dec 04, 2015 at 15:09 UTC

Leiria has asked for the wisdom of the Perl Monks concerning the following question:

Hi PerlMonks!

First please allow me to apologize for my rookie question for which I seek enlightenment :)

I'm very new to programming, especially Perl (which I'm enjoying very much so far).

I've got an apparently simple problem that I can't quite figure out. Here's the scenario:

- On the command line I can pass several parameters, each containing strings or file paths separated by pipes: script.pl --id1="test1|test2" --id2="test3|test4" --file="file1.txt|file2.txt"

- The files can be in several directories, so placing them all in one directory and simply reading it wouldn't be an option, I guess

- I wish to use the strings in parameters "id1" and "id2" to search all the files specified in parameter "file", and then print the file name with the line numbers that match (or, even better, write this to a brand new file)

So far, I've managed to write the following code after doing some readings, which works well if I specify only one file:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Getopt::Long;

    GetOptions(
        'id1=s'  => \my $id1,
        'id2=s'  => \my $id2,
        'file=s' => \my $file,
    );

    open (my $filename, "<", $file) or die "Could not open $file, $!";
    while (<$filename>){
        while (/$id1/g && /$id2/g) {
            print "$file:$.\n";
        }
    }
    close ($filename);

If I run as is, with only one file specified, it works fine, except for the print part, where I'm getting globs instead of the actual filenames:

GLOB(0xa5a36c):61

GLOB(0xa5a36c):65

So moving on to the actual questions:

1. How could I loop through all the files and open them line by line and search for the matching strings?

2. How could I output the matches to a new file?

3. How could I list the actual matching file names instead of the globs?

Thank you all very much, and I apologize if I wasn't clear enough about my intentions. Any guidance or directions towards the right path are highly appreciated.

With best regards,

Leiria

Replies are listed 'Best First'.
Re: Open several files and read line by line
by toolic (Bishop) on Dec 04, 2015 at 15:21 UTC
    One pretty standard way is to just pass the file(s) on the command line and parse each file one after the next:
        use strict;
        use warnings;
        use Getopt::Long;

        GetOptions(
            'id1=s' => \my $id1,
            'id2=s' => \my $id2,
        );

        for my $file (@ARGV) {
            open (my $filename, "<", $file) or die "Could not open $file, $!";
            while (<$filename>){
                while (/$id1/g && /$id2/g) {
                    print "$file:$.\n";
                }
            }
            close ($filename);
        }
    Called like:
    script.pl --id1="test1|test2" --id2="test3|test4" file1.txt file2.txt

    Regarding your glob problem, post a small sample of your input file so others can reproduce the issue.
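The second question in the original post (writing the matches to a new file) is not addressed above. A minimal sketch, assuming the pipe-separated file list from the original command line is kept; the helper name write_matches and the variables $file_spec and $out_path are illustrative, not from the thread:

```perl
use strict;
use warnings;

# Hypothetical helper: search every file named in a pipe-separated list and
# write "name:line_number" for each line that matches both patterns.
sub write_matches {
    my ($id1, $id2, $file_spec, $out_path) = @_;

    open(my $out_fh, '>', $out_path)
        or die "Could not open $out_path for writing: $!";

    # "file1.txt|file2.txt" becomes ("file1.txt", "file2.txt").
    for my $file_name (split /\|/, $file_spec) {
        open(my $in_fh, '<', $file_name)
            or die "Could not open $file_name: $!";
        while (my $line = <$in_fh>) {
            # $. holds the line number of the handle read most recently.
            print {$out_fh} "$file_name:$.\n"
                if $line =~ /$id1/ && $line =~ /$id2/;
        }
        close($in_fh);    # an explicit close resets $. for the next file
    }
    close($out_fh) or die "Could not close $out_path: $!";
    return;
}
```

It could be called as write_matches('test1|test2', 'test3|test4', 'file1.txt|file2.txt', 'matches.txt'). Note that this sketch uses a plain if, so each matching line is written once, wherever the two patterns occur in the line.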

      Thanks for the super fast reply, toolic.

      That worked very well and I don't see the glob issue any more either.

      I thought of and tried a few ways to achieve my purpose, but failed to see the most simple one. Thank you for clearing this up so quickly. I appreciate it!

Re: Open several files and read line by line
by BillKSmith (Monsignor) on Dec 04, 2015 at 16:29 UTC
    The code you have shown should not have the glob problem. I suspect that in another version of this code, you used $filename when you meant $file. Your choice of names for these variables is very misleading. $file contains the name of the file. $filename contains a file handle (Strictly, a reference to a file handle). Perl does not care what names you choose to use, but more descriptive names would help you prevent this kind of error.
    Bill
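The GLOB(0x...) text is what a lexical file handle looks like when it is interpolated into a string. A small sketch of the mix-up described above, using an in-memory file so that it runs without any input files; the names $file_name and $fh are illustrative:

```perl
use strict;
use warnings;

my $file_name = 'file1.txt';       # the name of the file (a plain string)
my $contents  = "first line\n";    # stand-in for the file's data

# Opening a reference to a scalar gives an in-memory file handle.
open(my $fh, '<', \$contents) or die "Could not open in-memory file: $!";

my $wrong = "$fh:1";           # interpolates the handle: "GLOB(0x...):1"
my $right = "$file_name:1";    # interpolates the name:   "file1.txt:1"

print "$wrong\n$right\n";
close($fh);
```

The hexadecimal address after GLOB varies from run to run, which is why it carries no useful information in the report.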
Re^2: Open several files and read line by line
by AnomalousMonk (Archbishop) on Dec 04, 2015 at 17:57 UTC
        while (<$filename>){
            while (/$id1/g && /$id2/g) {
                print "$file:$.\n";
            }
        }

    I don't understand. If /$id1/g && /$id2/g ever becomes true (matching by default against $_ assigned in the outer while-loop), when will it ever become false? I.e., isn't this an infinite loop? And what's the point of the /g modifier in these regexes? And yes, $filename is a terrible name for a file handle!

    Update: I should have known. Many thanks, choroba. (But $filename is still terrible!)


    Give a man a fish:  <%-{-{-{-<

      Without /g it would be an infinite loop. But with /g, the matching starts where the last one stopped:
          use feature 'say';

          $_ = "abcabcabc";
          while (/c/g && /a/g) {
              say pos;
          }
          __END__
          Output:
          4
          7
      ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
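One consequence of this, sketched under the assumption that the goal is to report each line containing both patterns: because both /g matches advance the same pos() on $_, the while form depends on the order in which the patterns occur and can report one line several times. The helper names count_with_while and count_with_if are hypothetical:

```perl
use strict;
use warnings;

# Count how often the original loop condition succeeds for one line.
sub count_with_while {
    my ($line, $id1, $id2) = @_;
    my $count = 0;
    for ($line) {    # aliases $_ to $line, as the outer read loop would
        $count++ while /$id1/g && /$id2/g;
    }
    return $count;
}

# Report a line at most once, wherever the patterns occur in it.
sub count_with_if {
    my ($line, $id1, $id2) = @_;
    return ($line =~ /$id1/ && $line =~ /$id2/) ? 1 : 0;
}

print count_with_while('abcabcabc', 'c', 'a'), "\n";    # 2
print count_with_while('ac',        'c', 'a'), "\n";    # 0
print count_with_if('abcabcabc',    'c', 'a'), "\n";    # 1
print count_with_if('ac',           'c', 'a'), "\n";    # 1
```

For 'ac' the while form reports nothing, because /a/g starts searching after the position where /c/g stopped. If one report per matching line is wanted, the plain if form without /g is the simpler choice.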
Re: Open several files and read line by line
by Leiria (Novice) on Dec 05, 2015 at 05:55 UTC

    Thank you all for the valuable input. I will definitely change my variable names to something more immediately understandable. Now that I have your feedback and have thought more about it, these were in fact terrible choices.

    Thank you!

    Leiria

Re: Open several files and read line by line
by u65 (Chaplain) on Dec 05, 2015 at 00:54 UTC

    I agree with BillKSmith. With a C background, when starting with Perl I tended to use short names like $fp (for "file pointer") or $fh (for "file handle") for the actual file reference. For file name variables I like to use variants of $fname.

      What's wrong with using $fh if you only have one file in scope? Even the documentation uses it as an example.
Re: Open several files and read line by line
by QuillMeantTen (Friar) on Dec 05, 2015 at 10:23 UTC

    Greetings, I have had to solve a problem that seems to have enough in common with yours to maybe give you some ideas. Have a look; I hope you will find it helpful.

      Thank you, QuillMeantTen. Very nice thread!

      With best regards,

      Leiria
