First, if I understand your stated task, the common Unix/GNU "grep" command already does what you want:
cd /path/to/search
grep -l pattern_to_find *
The two shell commands above do exactly what you were trying to do in Perl. (If you're on a Windows system, the GNU "bash" shell and "grep" are available for your OS; since you have Perl, you should know about these other tools.)
There are a few problems with the OP script:
- You opendir() a path provided by the user and get the file names, which is fine, but then you need either to chdir to that path or to prepend the path string to each file name in order to open the file successfully. The OP script does neither.
- You read the full content of every file into memory, but you don't need to do that. Just read a line at a time until you either (a) reach EOF, or (b) find the first occurrence of the target pattern. In the latter case, you can print the file name, stop reading that file, and move on to the next. That way, big files will only increase the memory footprint if they happen to be binaries without any embedded line breaks.
- You do a regex match on an array, but the =~ operator is supposed to be used on a single scalar value (one string) -- i.e. on each element of the array. That's another good reason just to read one line at a time, to check each line against the regex, and not use an array for file data.
- (added as an update:) You require the user to type input to the script after it starts running, rather than getting all the required user input from command-line args (using @ARGV) -- that gets really tiresome.
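As a rough illustration of the second and third points, a helper along these lines reads one line at a time, applies the regex to a single scalar, and stops at the first match. The sub name file_matches is hypothetical (it is not from the OP script), and the sketch assumes the caller passes a path that already includes the directory.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if any line of the file at $path matches
# $regex, 0 if no line does, and nothing (undef) if the file can't be opened.
sub file_matches {
    my ( $path, $regex ) = @_;
    open( my $fh, '<', $path ) or do {
        warn "open failed for $path: $!\n";
        return;
    };
    my $hit = 0;
    while ( my $line = <$fh> ) {
        if ( $line =~ $regex ) {    # =~ is applied to one scalar at a time
            $hit = 1;
            last;                   # no need to read the rest of the file
        }
    }
    close $fh;
    return $hit;
}
```

Only one line is held in memory at any moment, so the size of the file stops mattering unless it has no line breaks at all.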
I'm not sure if you've given us the exact wording of the error message you got, and I'm not sure why you got a message like "readline() on closed filehandle" -- but that's the least of your problems. If you really don't want to use the existing "grep" command (e.g. if you want to use a regex that only Perl will support), then try something like this:
#!/usr/bin/perl
use strict;
use warnings;
my $Usage = "Usage: $0 [-p path/to/search] regex\n";
if ( @ARGV > 2 and $ARGV[0] eq '-p' ) {
shift;
chdir $ARGV[0] or die "Can't chdir to $ARGV[0]: $!\n";
shift;
}
die $Usage unless ( @ARGV == 1 );
my $regex = shift;
opendir( D, '.' ) or die "Can't opendir current directory: $!\n";
my @files = grep { -f } readdir D; # we only want to look at data files
my @matches;
for my $f ( @files ) {
open( F, '<', $f ) or do {
warn "open failed for $f: $!\n";
next;
};
while (<F>) {
if ( m{$regex} ) {
push @matches, $f;
last;
}
}
}
print "The pattern {$regex} was found in ", scalar @matches, " files:\n";
print "@matches\n";
Add error checking to your open statements. You are only referencing the filename and not the full path, so it can't find the file.
The following is how I'd clean up your script:
#!/usr/bin/perl
#regexp.pl
use File::Spec;
use strict;
use warnings;
print "Gimme the address of the directory:\n";
chomp(my $folder = <>);
print "What's the phrase you're looking for?\n";
chomp(my $find = <>);
my @found;
opendir my $dh, $folder or die "Can't open $folder: $!";
while (my $file = readdir($dh)) {
next if $file =~ /^\.+$/;
my $path = File::Spec->catfile($folder, $file);
next if ! -f $path;
open my $fh, '<', $path or die "Can't open $path: $!";
my $data = do {local $/; <$fh>};
close $fh;
if ($data =~ /\Q$find\E/i){
push @found, $file;
}
}
closedir $dh;
print "Your query was found in the following files:\n";
print "@found\n";
If you're working in your target directory, the error message you've asked about appears because you're trying to open the parent directory (..), the current directory (.) and subdirectories, if any, as if they were files, because readdir returned those into your array of files.
Knock those out of your @files before trying to open anything. (see http://perldoc.perl.org/perlfunc.html)
You have some other problems in this script. Pay special attention to graff's discussion of the path. Some will be easily diagnosed if you add use diagnostics; to your pragmata; others stem from your failure to test the open at line 21 (and, in the same line, failure to use what's now considered best practice: 3-arg open with lexical filehandles).
Perhaps most critical among the problems you didn't ask about is the attempt at line 23 to test an array (in scalar context) for a match. You also need to read about qr/.../ (either under qr or in Quote and Quote-like Operators) to make your regex match what you expect. And look at line 24, where you'll find you're pushing something quite unexpected onto @found.
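To make the array-match problem concrete, here is a small sketch with made-up data; the contents of @lines and the sub name count_matching_lines are illustrative only and not taken from your script. Binding an array to =~ evaluates the array in scalar context, so the regex is tested against the element count rather than the contents, whereas grep tests each element in turn.

```perl
use strict;
use warnings;

my @lines = ( "foo\n", "bar\n", "food\n" );    # made-up sample data

# What "@lines =~ /foo/" effectively tests: the element count (here "3").
my $scalar_hit = scalar(@lines) =~ /foo/ ? 1 : 0;

# Hypothetical helper: count the elements that match, one scalar at a time.
sub count_matching_lines {
    my ( $regex, @data ) = @_;
    return scalar grep { $_ =~ $regex } @data;
}

my $searchterm = qr/foo/i;    # compile once; flags such as /i belong on the qr
print "scalar-context test matched: $scalar_hit\n";
print "lines matching: ", count_matching_lines( $searchterm, @lines ), "\n";
```

With this sample data the scalar-context test fails (the string "3" contains no "foo"), while the per-element count is 2.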
Updated for grammar, markup and clarity
Update 2 (Warning: Sunday morning content): This is one way of attacking your target (and problem) that's along the lines you initially tried:
#!/usr/bin/perl
use warnings;
use strict;
# use diagnostics;
#regexp.pl # 911425
print "Enter the full path to the directory you want to search: ";
my $folder = <>;
chomp $folder;
chdir($folder) or die "Can't chdir to $folder, $!";
opendir(DIR, $folder) or die "Can't open $folder, $!";
my @files = readdir(DIR) or die "Can't readdir $folder, $!";
print "What's the phrase you're looking for?: ";
my $find = <STDIN>;
chomp $find;
my $searchterm = qr/$find/i; # /i must go on the qr; a compiled regex ignores flags added at match time
my (%found, $found, $file);
for $file(@files) {
next if ($file =~ /^\./);
next unless (-T $file); # text files only (excludes binary files such as *.doc or .xls)
open(my $fh, '<', $file) or die "Can't open $file: $!";
my @content = <$fh>;
for my $line(@content) {
if ($line =~ /$searchterm/i) {
my $key = $file;
$found{$key} += 1;
}
}
}
while (my ($key, $value) = each %found) {
print "$key has \t $value instance(s) of \t \"$find\"\n";
}
The glob() function will tack the file name onto the path specified for the search directory, and it will skip hidden files that start with a dot:
use strict;
use warnings;
use 5.010;
my @files = glob "/users/me/*";
for (@files) {
say;
}
--output:--
/users/me/066.JPG
/users/me/069.JPG
/users/me/072.JPG
/users/me/077.JPG
/users/me/079-1.JPG
/users/me/079.JPG
/users/me/081-1.JPG
/users/me/081.JPG
/users/me/1.txt
/users/me/1perl.pl
...
...
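Building on that, here is a sketch of how glob might feed the search loop. The sub name plain_files_in is hypothetical, and the sketch assumes the directory path contains no whitespace or wildcard characters, since glob would interpret them.

```perl
use strict;
use warnings;

# Hypothetical helper: list the plain files in $dir, each with $dir prepended.
# Assumes $dir has no spaces or wildcard characters, which glob would interpret.
sub plain_files_in {
    my ($dir) = @_;
    return grep { -f } glob "$dir/*";    # the * pattern skips dot-files
}

# Usage: print every plain file in the current directory.
for my $path ( plain_files_in('.') ) {
    print "$path\n";
}
```

Because each returned name already carries the directory, it can be passed straight to open without a chdir.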