molson has asked for the wisdom of the Perl Monks concerning the following question:
I'm a complete newb to Perl so excuse the simple question. I've searched for an answer but there seem to be so many ways to do the same thing I think I'm confusing myself. I have multiple files I need to read data from. The data is always in the same format and there is no repeatable delimiter, so I am using substr to parse the data. I want a count of all unique company IDs combined for all the files. Here is part of my code (I removed unnecessary info)
#!/usr/bin/perl
open (INFILE, "TEST") or die "DIR NOT FOUND! \n$!";
open (OUTFILE, '>> COID_LIST.txt') or die "Unable to open Write File!\n$!";
while (<INFILE>) {
my $COMPANY_ID = substr $_, 260, 5;
push @COMPANIES, {comp => $COMPANY_ID};
}
foreach (@COMPANIES) {
$sum{$_->{'comp'}} += $_->{'count'};
}
#Output in %sum
use Data::Dumper;
print OUTFILE Dumper(\%sum);
close OUTFILE;
This works for a single file but I can't get it to read multiple files. I've tried using open MYDIR "." and foreach to iterate through the files, but I end up with an empty output file.
Can someone help me out with this newb question?
Re: Opening multiple files in directory
by almut (Canon) on Mar 18, 2009 at 21:52 UTC
|
Re: Opening multiple files in directory
by olus (Curate) on Mar 18, 2009 at 21:58 UTC
|
use strict;
use warnings;
use Data::Dumper;
my $some_dir = ".";
opendir(DIR, $some_dir) || die "can't opendir $some_dir: $!";
my @files = grep { /txt/ } readdir(DIR);
closedir DIR;
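foreach my $f (@files) {
    # three-argument open, with error checks
    open(IN, '<', $f) or die "can't open $f: $!";
    my @cmpids = ();
    while (<IN>) {
        push @cmpids, $_;
    }
    close IN;
    open(OUT, '>>', 'COID_LIST.TXT') or die "can't open COID_LIST.TXT: $!";
    print OUT Dumper(\@cmpids);
    close OUT;
}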
Re: Opening multiple files in directory
by kennethk (Abbot) on Mar 18, 2009 at 21:58 UTC
|
The naive response is that you must open each file individually for I/O. By 'tried using open MYDIR "." and foreach to iterate through the files', do you mean you tried something like the following?
#!/usr/bin/perl
open (OUTFILE, '>>', 'COID_LIST.txt') or die "Unable to open Write File! \n$!";
my $dir_name = '.';
opendir(MYDIR, $dir_name) or die "opendir $dir_name failed: $!";
# keep plain files only (readdir also returns '.' and '..') and skip the output file
my @file_list = grep { -f $_ and $_ ne 'COID_LIST.txt' } readdir(MYDIR);
closedir(MYDIR) or die "closedir $dir_name failed: $!";
foreach $file (@file_list) {
    open (INFILE, '<', $file) or die "File not found\n$!";
    while (<INFILE>) {
        my $COMPANY_ID = substr $_, 260, 5;
        push @COMPANIES, {comp => $COMPANY_ID};
    }
    close INFILE;
}
foreach (@COMPANIES) {
    $sum{$_->{'comp'}} += $_->{'count'};
}
#Output in %sum
use Data::Dumper;
print OUTFILE Dumper(\%sum);
close OUTFILE;
Note that I changed to the three-argument form of open. This is particularly important if you take in a list of files, because a file name can then never be mistaken for a mode character. You should also get into the habit of starting files with use strict; use warnings; since it'll save you from typos.
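To make the point about three-argument open and strict/warnings concrete, here is a minimal sketch, not taken from any of the replies. The helper names plain_files and count_lines are made up for illustration, and the demo writes its own throwaway files into a temporary directory, so it does not touch your data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: return the plain files in $dir, skipping
# subdirectories and the '.' and '..' entries that readdir also returns.
sub plain_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!";
    my @paths = grep { -f $_ } map { File::Spec->catfile($dir, $_) } readdir $dh;
    closedir $dh;
    return sort @paths;
}

# Hypothetical helper: count the lines in one file. The three-argument
# open keeps the mode ('<') separate from the file name.
sub count_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open '$path': $!";
    my $n = 0;
    while (my $line = <$fh>) {
        $n++;
    }
    close $fh;
    return $n;
}

# Demo with made-up files in a temporary directory.
my $dir = tempdir(CLEANUP => 1);
for my $name ('a.txt', 'b.txt') {
    my $path = File::Spec->catfile($dir, $name);
    open(my $out, '>', $path) or die "Cannot write '$path': $!";
    print {$out} "line one\nline two\n";
    close $out;
}
my $subdir = File::Spec->catdir($dir, 'subdir');
mkdir($subdir) or die "Cannot create '$subdir': $!";

for my $path (plain_files($dir)) {
    printf "%s: %d lines\n", $path, count_lines($path);
}
```

The -f test is what keeps the directory entries out of the list; without it the loop would also try to open '.', '..' and 'subdir' as if they were data files.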
Re: Opening multiple files in directory
by gwadej (Chaplain) on Mar 19, 2009 at 03:56 UTC
|
Since others have answered the question you asked, I thought I would answer one that you didn't.<grin/>
Often the simplest way to process more than one file is to pass those files to your script on the command line. Then you can use one of Perl's built-in features to make your life easier. In the code below, I've removed the open() call for the read file and changed your while loop.
#!/usr/bin/perl
open (OUTFILE, '>> COID_LIST.txt') or die "Unable to open Write File!\n$!";
while (<>) {
my $COMPANY_ID = substr $_, 260, 5;
push @COMPANIES, {comp => $COMPANY_ID};
}
foreach (@COMPANIES) {
$sum{$_->{'comp'}} += $_->{'count'};
}
#Output in %sum
use Data::Dumper;
print OUTFILE Dumper(\%sum);
close OUTFILE;
The resulting code is called with the list of files to process on the command line. The code will then process the files from the command line, one at a time until complete. This approach can simplify some problems.
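As a rough sketch of the same idea with the counting done directly in a hash, something like the following should work. The offset 260 and width 5 come from the original question; the helper name count_ids, the sample file names and the sample IDs are made up for the demo, which writes its own temporary files so that it runs without any arguments.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Position and width of the company ID, as given in the question.
use constant { OFFSET => 260, WIDTH => 5 };

# Hypothetical helper: count how often each company ID occurs across
# all the named files, reading them through the diamond operator.
sub count_ids {
    my (@files) = @_;
    die "count_ids needs at least one file name\n" unless @files;
    local @ARGV = @files;    # <> reads the files named in @ARGV, in order
    my %count;
    while (my $line = <>) {
        chomp $line;
        next if length($line) < OFFSET + WIDTH;    # too short to hold an ID
        $count{ substr($line, OFFSET, WIDTH) }++;
    }
    return \%count;
}

# Demo: two made-up fixed-width files with one shared ID.
my $dir     = tempdir(CLEANUP => 1);
my %records = (
    'one.dat' => [qw(AAAAA BBBBB)],
    'two.dat' => [qw(BBBBB CCCCC)],
);
my @files;
for my $name (sort keys %records) {
    my $path = File::Spec->catfile($dir, $name);
    open(my $out, '>', $path) or die "Cannot write '$path': $!";
    for my $id (@{ $records{$name} }) {
        my $record = ('x' x OFFSET) . $id . "\n";
        print {$out} $record;
    }
    close $out;
    push @files, $path;
}

my $count = count_ids(@files);
print "$_ seen $count->{$_} time(s)\n" for sort keys %$count;
print "Unique IDs: ", scalar(keys %$count), "\n";
```

In a real script the loop can simply sit at the top level as while (<>) { ... }, with the files named on the command line as described above. The helper only sets @ARGV itself so that the demo is self-contained.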
|
#!/usr/bin/perl
use strict;
use warnings;
# FIND THE PATH TO THE Directory:
my $dir = $ARGV[0];
die "Usage: $0 DIRECTORY\n" unless defined $dir;
opendir(DIR, $dir) or die "can't opendir $dir: $!";
# select files with names containing txt.fit
my @files = grep { /txt\.fit/ } readdir DIR;
closedir DIR;
# Read files line by line
foreach my $file (@files) {
    # readdir returns bare names, so prepend the directory
    open(IN, '<', "$dir/$file") or die "can't open $dir/$file: $!";
    my $tree = '';
    while (<IN>) {
        # skip everything in the file not containing Tree mixture
        next unless ($_ =~ m/Tree mixture/);
        # remove content from the line that's unwanted in the output
        $tree = $_;
        $tree =~ s/Tree mixtureTree=//;
    }
    close IN;
    # name the output file after the input file but add .tre
    my $outfile = "$file.tre";
    # open the output file for appending with the '>>' mode
    open(FILE, '>>', $outfile) or die "problem opening $outfile: $!\n";
    print FILE $tree;
    close FILE;
}
Re: Opening multiple files in directory
by bichonfrise74 (Vicar) on Mar 19, 2009 at 00:49 UTC
|
#!/usr/bin/perl
use strict;
my %count;
my $directory = "c:\\Test";
opendir( DIR, $directory )
|| die "Unable to open directory - $!\n";
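# skip the output file, which lives in the same directory and also matches .txt
my @files = grep { /\.txt/ && lc($_) ne 'coid_list.txt' } readdir( DIR );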
closedir( DIR );
open( OUTFILE, ">> $directory\\COID_LIST.txt" )
|| die "Unable to open write file! - $!\n";
foreach my $file (@files) {
open( FH, "$directory\\$file" )
|| die "Unable to open $file - $!\n";
while( <FH> ) {
my $ID = substr( $_, 260, 5 );
next unless defined( $ID );    # line too short to hold an ID
print OUTFILE "$ID\n";
$count{$ID}++;
}
close( FH );
}
print "Number of company ID = " . scalar keys %count;
|
You guys rock! Thanks for all the replies!!
I used the code below and it gives me a count of the total unique company IDs, but is there a way to return a list of unique company IDs as well?
Please try this...
#!/usr/bin/perl
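use strict;
my %count;
my $directory = "c:\\Test";
opendir( DIR, $directory )
|| die "Unable to open directory - $!\n";
# skip the output file, which lives in the same directory and also matches .txt
my @files = grep { /\.txt/ && lc($_) ne 'coid_list.txt' } readdir( DIR );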
closedir( DIR );
open( OUTFILE, ">> $directory\\COID_LIST.txt" )
|| die "Unable to open write file! - $!\n";
foreach my $file (@files) {
open( FH, "$directory\\$file" )
|| die "Unable to open $file - $!\n";
while( <FH> ) {
my $ID = substr( $_, 260, 5 );
next unless defined( $ID );    # line too short to hold an ID
print OUTFILE "$ID\n";
$count{$ID}++;
}
close( FH );
}
print "Number of company ID = " . scalar keys %count;
|
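If you stored the values in an array rather than printing them directly, you could use uniq from List::MoreUtils, which is available from CPAN (it is not a core module; recent versions of the core List::Util also export uniq). Since the IDs are already the keys of %count, keys %count also gives you the unique list.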
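If you would rather not install anything, the usual %seen idiom does the same job as uniq. This is only a sketch: the helper name unique_in_order and the sample IDs are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the distinct values of a list,
# keeping the order in which each value was first seen.
sub unique_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Made-up sample IDs standing in for the values read from the files.
my @ids    = qw(AAAAA BBBBB AAAAA CCCCC BBBBB);
my @unique = unique_in_order(@ids);

print "$_\n" for @unique;
print "Number of unique company IDs = ", scalar(@unique), "\n";
```

With the %count hash from the code above, print "$_\n" for sort keys %count; gives the same list in sorted order without needing an extra array.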