You need to use quotemeta, since you are trying to match a list of fixed strings, not a list of regular expressions. This also makes it easy to figure out the maximum length of "carry over" you need between buffers: it is the length of the longest string in the list.
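As a small illustration of the quotemeta point, here is a minimal sketch. The word list and the test strings are made up for the example (they are not from the original wordlist); it only shows how an unescaped list can both match too much and miss a literal string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical fixed strings that happen to contain regex metacharacters.
my @words = ( 'a.b', 'x|y', '1+1' );

# Joined as-is, the metacharacters keep their regex meaning.
my $raw = join '|', @words;
# Escaped with quotemeta, every character matches only itself.
my $quoted = join '|', map { quotemeta } @words;

my $raw_re    = qr/($raw)/i;
my $quoted_re = qr/($quoted)/i;

for my $text ( 'axb', 'a.b', 'just y', '1+1=2' ) {
    my $r = $text =~ $raw_re    ? "match" : "no match";
    my $q = $text =~ $quoted_re ? "match" : "no match";
    printf "%-8s raw: %-9s quoted: %s\n", $text, $r, $q;
}
```

In the unescaped pattern, 'a.b' also matches "axb", 'x|y' turns into two separate alternatives, and '1+1' no longer matches the literal text "1+1". The quoted pattern matches only the strings as written.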
#!/usr/bin/perl -w
use strict;

my @checklist;
open( FILE, "wordlist" )
    or die "Can't read wordlist: $!\n";
my $maxLen = 0;
while( <FILE> ) {
    chomp;                      # strip the newline so it isn't counted below
    next if /^ *$/ || /^#/;     # skip blank lines and comments
    $maxLen = length($_) if $maxLen < length($_);
    push @checklist, $_;
}
close(FILE);
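my $bufSize= 8*1024;
$bufSize= 2*$maxLen if $bufSize < 2*$maxLen;
# Escape each word so it matches literally, then build one alternation.
my $GrepList = join '|', map quotemeta $_, @checklist;
$GrepList = qr/($GrepList)/i;
@ARGV = grep -f $_, <*>;    # every plain file in the current directory
$/= \$bufSize; # Have <> read $bufSize bytes
if( @ARGV ) {
    my $prev= "";
    while( <> ) {
        $_ =~ s/\n//g;      # drop newlines so a match can span line breaks
        if( ($prev.$_) =~ /$GrepList/ ) {
            print "$ARGV : $1\n";
            close( ARGV );  # one hit is enough; move on to the next file
            $prev= "";
        } elsif( eof ) {
            $prev= "";      # end of this file; don't carry into the next
        } else {
            $prev = substr( $_, -$maxLen );     # carry-over for next buffer
        }
    }
}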
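Tested and works. Note that this assumes that you don't have huge runs of newlines in the middle of your matches. The carry-over is taken from the current buffer after its newlines have been stripped, so a buffer that is almost entirely newlines can leave fewer than $maxLen characters to carry over.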
- tye
Updated: I originally left out the setting of $/.
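For anyone who wants to see the carry-over idea in isolation, here is a minimal sketch. The sub found_in_chunks, its parameters, and the sample data are hypothetical names made up for this example. It is a variation on the script above, not the same code: it reads from an in-memory filehandle and takes the carry-over from the combined window rather than from the current buffer alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: report whether $needle occurs in the data read from
# $fh, reading $buf_size bytes at a time and keeping the last $carry
# characters of each window to prepend to the next buffer.
sub found_in_chunks {
    my ( $fh, $needle, $buf_size, $carry ) = @_;
    my $re = quotemeta $needle;     # match the needle literally
    local $/ = \$buf_size;          # fixed-size records instead of lines
    my $prev = "";
    while ( my $buf = <$fh> ) {
        my $window = $prev . $buf;
        return 1 if $window =~ /$re/i;
        # Keep at most $carry characters from the end of the window.
        my $keep = $carry < length($window) ? $carry : length($window);
        $prev = substr( $window, length($window) - $keep );
    }
    return 0;
}

# "needle" straddles the boundary between the two 8-byte buffers.
my $data = "xxxxxneedlexxxxx";

for my $carry ( 0, length("needle") - 1 ) {
    open my $fh, '<', \$data
        or die "Can't open in-memory file: $!\n";
    my $hit = found_in_chunks( $fh, "needle", 8, $carry );
    print "carry=$carry: ", ( $hit ? "found" : "missed" ), "\n";
    close $fh;
}
```

With no carry-over the match is missed, because neither 8-byte buffer contains the whole string; carrying over one character less than the string's length is enough to find it. The script above carries $maxLen characters, which is one more than strictly needed and so errs on the safe side.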