PerlMonks  

Opening multiple files in directory

by molson (Acolyte)
on Mar 18, 2009 at 21:30 UTC ( [id://751577]=perlquestion )

molson has asked for the wisdom of the Perl Monks concerning the following question:

I'm a complete newb to Perl so excuse the simple question. I've searched for an answer but there seem to be so many ways to do the same thing I think I'm confusing myself. I have multiple files I need to read data from. The data is always in the same format and there is no repeatable delimiter, so I am using substr to parse the data. I want a count of all unique company IDs combined for all the files. Here is part of my code (I removed unnecessary info)

    #!/usr/bin/perl
    open (INFILE, "TEST") or die "DIR NOT FOUND! \n$!";
    open (OUTFILE, '>> COID_LIST.txt') or die "Unable to open Write File! \n$!";
    while (<INFILE>) {
        my $COMPANY_ID = substr $_, 260, 5;
        push @COMPANIES, {comp => $COMPANY_ID};
    }
    foreach (@COMPANIES) {
        $sum{$_->{'comp'}} += $_->{'count'};
    }
    #Output in %sum
    use Data::Dumper;
    print OUTFILE Dumper(\%sum);
    close OUTFILE;

This works for a single file but I can't get it to read multiple files. I've tried using open MYDIR "." and foreach to iterate through the files, but I end up with an empty output file. Can someone help me out with this newb question?

Replies are listed 'Best First'.
Re: Opening multiple files in directory
by almut (Canon) on Mar 18, 2009 at 21:52 UTC
    I've tried using open MYDIR "." and foreach to iterate through the files

    open? — or opendir (and readdir)? 

Re: Opening multiple files in directory
by olus (Curate) on Mar 18, 2009 at 21:58 UTC

    This works for me. Adjust as convenient.

        use strict;
        use warnings;
        use Data::Dumper;

        my $some_dir = ".";
        opendir(DIR, $some_dir) || die "can't opendir $some_dir: $!";
        my @files = grep { /txt/ } readdir(DIR);
        closedir DIR;

        foreach my $f (@files) {
            open IN, "<$f";
            my @cmpids = ();
            while(<IN>) {
                push @cmpids, $_;
            }
            close IN;
            open OUT, ">>COID_LIST.TXT";
            print OUT Dumper(\@cmpids);
            close OUT;
        }

Re: Opening multiple files in directory
by kennethk (Abbot) on Mar 18, 2009 at 21:58 UTC

    The naive response is that you must open each file individually for I/O. By 'tried using open MYDIR "." and foreach to iterate through the files', do you mean you tried something like the following?

        #!/usr/bin/perl
        open (OUTFILE, '>>', 'COID_LIST.txt') or die "Unable to open Write File! \n$!";
        opendir MYDIR, '.' or die "opendir failed: $!";
        my @file_list = readdir(MYDIR);
        closedir MYDIR;
        foreach $file (@file_list) {
            next unless -f $file;    # skip '.', '..' and subdirectories
            open (INFILE, '<', $file) or die "File not found\n$!";
            while (<INFILE>) {
                my $COMPANY_ID = substr $_, 260, 5;
                push @COMPANIES, {comp => $COMPANY_ID};
            }
            close INFILE;
        }
        foreach (@COMPANIES) {
            $sum{$_->{'comp'}} += $_->{'count'};
        }
        #Output in %sum
        use Data::Dumper;
        print OUTFILE Dumper(\%sum);
        close OUTFILE;

    Note that I changed to the three-argument form of open; this is particularly important if you take in a list of files. You should also get into the habit of starting files with use strict; use warnings; since they'll save you from typos.
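
To illustrate why the three-argument form matters when names come from a directory listing: with the two-argument form, characters such as > or | at the start or end of a name are read as part of the mode, while the three-argument form treats the name as a plain path. A minimal sketch, assuming a hypothetical helper named count_lines that is not part of the thread's code:

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines in one file.
# The three-argument open treats $path purely as a file name,
# so a name such as ">important.txt" cannot be mistaken for a mode.
sub count_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open '$path': $!";
    my $n = 0;
    while (my $line = <$fh>) {
        $n++;
    }
    close $fh;
    return $n;
}
```

The lexical filehandle ($fh) is a design choice of this sketch; the bareword handles used elsewhere in the thread work with the three-argument form as well.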

Re: Opening multiple files in directory
by gwadej (Chaplain) on Mar 19, 2009 at 03:56 UTC

    Since others have answered the question you asked, I thought I would answer one that you didn't.<grin/>

    Often the simplest way to process more than one file is to pass those files to your script on the command line. Then you can use one of Perl's built-in features to make your life easier. In the code below, I've removed the open() call for the read file and changed your while loop.

        #!/usr/bin/perl
        open (OUTFILE, '>> COID_LIST.txt') or die "Unable to open Write File!\n$!";
        while (<>) {
            my $COMPANY_ID = substr $_, 260, 5;
            push @COMPANIES, {comp => $COMPANY_ID};
        }
        foreach (@COMPANIES) {
            $sum{$_->{'comp'}} += $_->{'count'};
        }
        #Output in %sum
        use Data::Dumper;
        print OUTFILE Dumper(\%sum);
        close OUTFILE;

    The resulting code is called with the list of files to process on the command line. The code will then process the files from the command line, one at a time until complete. This approach can simplify some problems.
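
As a rough sketch of how the command-line approach above might be combined with a per-ID tally: the offset 260 and length 5 come from the original question, while the helper name tally_ids and the length check are assumptions added for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: tally company IDs from every file named in @ARGV.
# The offset 260 and length 5 are taken from the original post.
sub tally_ids {
    my %count;
    while (my $line = <>) {
        chomp $line;
        next if length($line) < 265;    # line too short to hold an ID
        my $id = substr($line, 260, 5);
        $count{$id}++;
    }
    return \%count;
}

# Run only when file names were given, so the script never waits on STDIN.
if (@ARGV) {
    my $count = tally_ids();
    printf "%d unique company IDs\n", scalar keys %$count;
    print "$_\t$count->{$_}\n" for sort keys %$count;
}
```

It could be invoked as, for example, perl tally.pl file1.txt file2.txt (the script and file names are placeholders). On shells that do not expand wildcards, such as cmd.exe, the file names need to be listed explicitly.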

    G. Wade

      Just wanted to say thanks; this helped me a lot in making this:

      grepcertainfilesgrepcertainlinesandputresultsinfiles.pl

          #!/usr/bin/perl
          use strict;
          use warnings;

          # FIND THE PATH TO THE Directory:
          my $dir = $ARGV[0];
          opendir(DIR, $dir) or die $!;

          # select for files with names containing txt.fit
          my @files= grep { /txt.fit/ } readdir DIR;
          closedir DIR;

          # Read files line by line
          foreach my $file(@files) {
              open IN, "<$file" or die $!;
              my $tree = ();
              while(<IN>) {
                  # skip everything in file not containing Tree mixture
                  next unless ($_ =~ m/Tree mixture/);
                  # remove content from line that's unwanted in output
                  $tree = $_;
                  $tree =~s/Tree mixtureTree=//;
              }
              #name output file after input file but add .tre
              my $outfile = "$file.tre";
              # here i 'open' the file, saying i want to write to it with the '>>' symbol
              open (FILE, ">> $outfile") || die "problem opening $outfile\n";
              print FILE $tree;
          }

        I haven’t tried to run this code, but just looking through it there are two obvious problems:

        1. Within a regex, a dot matches any character other than a newline (unless the regex has an /s modifier). So /txt.fit/ matches “mytxt.fit”, but it also matches “yourtxtafit”, “histxt0fitandmore”, etc. You need to backslash the . to make it match a literal dot only: /txt\.fit/; and unless “fit” may be followed by other characters, you want: /txt\.fit$/.

        2. $tree is a scalar variable. Within the inner while loop, each time the line $tree = $_; is executed, it overwrites whatever was in the variable. So when this inner loop finishes, only the last line containing “Tree mixture” is written to the output file. Here is one way to correct this (untested):

            foreach my $file (@files) {
                my $outfile = "$file.tre";
                open(IN, '<', $file) or die $!;
                open(FILE, '>>', $outfile) or die $!;
                while (my $tree = <IN>) {
                    # match against $tree, since the line is no longer in $_
                    next unless $tree =~ /Tree mixture/;
                    $tree =~ s/Tree mixtureTree=//;
                    print FILE $tree;
                }
                close FILE or die $!;
                close IN or die $!;
            }
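
The first point above, about the unescaped dot, can be checked with a short sketch. The three file names are the examples quoted in that point; the variable names @loose and @exact are made up for this illustration.

```perl
use strict;
use warnings;

# File names taken from the examples in the reply above.
my @names = ('mytxt.fit', 'yourtxtafit', 'histxt0fitandmore');

my @loose = grep { /txt.fit/ }   @names;    # unescaped dot: any character
my @exact = grep { /txt\.fit$/ } @names;    # literal dot, anchored at the end

print "loose: @loose\n";
print "exact: @exact\n";
```

The unescaped pattern keeps all three names, while the escaped and anchored pattern keeps only mytxt.fit.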

        Hope that helps,

        Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: Opening multiple files in directory
by bichonfrise74 (Vicar) on Mar 19, 2009 at 00:49 UTC
    Please try this...
        #!/usr/bin/perl
        use strict;

        my %count;
        my $directory = "c:\\Test";

        opendir( DIR, $directory ) || die "Unable to open directory - $!\n";
        my @files = grep /\.txt/, readdir( DIR );
        closedir( DIR );

        open( OUTFILE, ">> $directory\\COID_LIST.txt" )
            || die "Unable to open write file! - $!\n";

        foreach my $file (@files) {
            open( FH, "$directory\\$file" ) || die "Unable to open $file - $!\n";
            while( <FH> ) {
                my $ID = substr( $_, 260, 5 );
                print OUTFILE "$ID\n";
                $count{$ID}++ if ( defined( $ID ) );
            }
            close( FH );
        }
        print "Number of company ID = " . scalar keys %count;

      You guys rock! Thanks for all the replies!! I used the code above from bichonfrise74 and it gives me a count of the total unique company IDs, but is there a way to return a list of unique company IDs as well?
        If you stored the values in an array rather than printing them directly, you could use uniq from List::MoreUtils, which is available from CPAN (it is not a core module; the core List::Util also provides uniq as of version 1.45).
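
In the script above, the keys of %count are already the unique company IDs, so printing keys %count would list them. Where the IDs are held in an array instead, a minimal sketch of de-duplicating them without any module follows; the helper name unique_ids and the sample IDs are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the distinct values of a list,
# keeping the order in which each value was first seen.
sub unique_ids {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @ids    = qw(10001 10002 10001 10003 10002);    # made-up sample IDs
my @unique = unique_ids(@ids);
print "$_\n" for @unique;
```

The %seen hash is the same idea as %count in the script above: the first time an ID appears its count is zero, so grep keeps it, and later occurrences are dropped.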
