I can't say much without seeing examples of data that produce these "erroneous results," but I have two guesses:
- $in{for} is empty. An empty pattern makes the regex reuse whatever pattern last matched successfully, which is probably not what you want. Append .* after $in{for} in the regex to avoid this.
- $in{for} contains metacharacters. That is, things like . and * which have special meaning to a regular expression. To get around this, use \Q$in{for}\E instead of $in{for} in the regex; or, somewhere above the regex, run $in{for} through quotemeta.
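Both guesses can be guarded against in one place. The following is only a sketch: the sub name literal_matches and the sample messages are made up for illustration and are not from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: return the messages that contain $term literally.
sub literal_matches {
    my ($term, @messages) = @_;

    # Guess 1: refuse to search with an empty term instead of building
    # an empty pattern.
    return () unless defined $term && length $term;

    # Guess 2: escape metacharacters so '.' and '*' are matched literally.
    # Interpolating \Q$term\E in the regex would have the same effect.
    my $pattern = quotemeta $term;

    return grep { /$pattern/i } @messages;
}

my @messages = ('abc', 'a.c', 'xyz');

# Unescaped, 'a.c' matches 'abc' as well, because '.' matches any character.
my @loose  = grep { /a.c/i } @messages;
my @strict = literal_matches('a.c', @messages);

print "loose:  @loose\n";
print "strict: @strict\n";
```

With the sample data, the unescaped pattern picks up both 'abc' and 'a.c', while the escaped one only finds the literal 'a.c'.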
perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'
So my question is, why am I getting these erroneous results?
That's hard to say without seeing some representative data. Show us a value for $in{for}, an @thread[0,4] pair that you expect to match (but doesn't), and a pair that does match but that you expect should not.
OK, here's (I hope) a little clarification.
Here's a big snippet of code:
my @dat_files = <$board_dir/*.dat>;
$q = 0;
foreach (@dat_files) {
$number;
$number = $_;
$number =~ s/\/var\/www\/cgi-bin\/2930forum\/data\///g;
$number =~ s/\.dat//g;
open THREAD, "$_" or die "Can't open .dat file: $!";
$x = 0;
while (<THREAD>) {
$thread_data[$x] = $_;
$x++;
}
close THREAD;
foreach (@thread_data) {
@details = split /\|/, $_;
if ($details[4] =~ m/\Q$in{for}\E/i) {
$found[$q] = $number;
$q++;
}
}
}
It's a little sloppy at this point, but I'm just trying to get valid results for now; I'll clean it up later.
$in{for}: is defined by form input (I've been using simple searches like "cheese").
$thread_data[4]: is the messages posted in every thread. It could potentially contain just about anything except empty.
$thread_data[0]: I just realized this isn't used. Instead it's $number, which is just a number string denoting the thread number.
I do the search and get results. Some of the threads contain the search word $in{for} and others do not. For example, I put in $in{for} = cheese and get around 30 results containing messages like:
Hard work pays off after time, but lazyness always pays off now.
One thing I just noticed is that many of the results are in numerical order. For example, I get results like 121, 122, 124, 125, 126, 127, 128, 129, 21, 282, 321, 343, 344, 345, etc.
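my @dat_files = <$board_dir/*.dat>;
my %found = ();
foreach my $file ( @dat_files ) {
    open(DAT, $file) or die "$file: $!";
    while ( <DAT> ) {
        # Split on a literal pipe; a bare "|" would be treated as a regex
        # alternation and split the line into single characters.
        my @thread_data = split /\|/;
        # \Q...\E escapes any metacharacters in the search term.
        if ( $thread_data[4] =~ m/\Q$in{for}\E/i ) {
            $found{$thread_data[0]}++;
        }
    }
    close(DAT);
}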
The keys of %found are now the thread numbers that contain a match, and the corresponding values are the number of matches.
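To report the results, the keys can be sorted numerically. This is a small sketch with made-up contents for %found; a plain sort would compare the keys as strings and give an order like 121, 21, 282.

```perl
use strict;
use warnings;

# Hypothetical contents of %found after a search: thread number => match count.
my %found = (121 => 2, 21 => 1, 282 => 3);

# Compare keys as numbers rather than as strings.
my @ordered = sort { $a <=> $b } keys %found;

print "thread $_: $found{$_} match(es)\n" for @ordered;
```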
Maybe you need to do something like ...
my $for = quotemeta($in{for});
if ($thread_data[4] =~ m/$for/i) {
    print $thread_data[0];
}
Hi,
I can't see anything wrong in the snippet... Maybe the problem lies in how the @thread_data array is filled? Are you sure that when you test the next thread, _all_ the @thread_data elements have changed?
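Here is a sketch of how stale elements could produce false hits. It assumes @thread_data is filled by index and never emptied between files, as in the posted snippet; the @threads data and the search sub are invented for the demonstration, with in-memory lines standing in for the .dat files.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for two .dat files: thread number plus its lines.
my @threads = (
    [ 121, [ "121|a|b|c|first post", "121|a|b|c|I like cheese" ] ],
    [ 122, [ "122|a|b|c|nothing relevant" ] ],
);

sub search {
    my ($term, $reset) = @_;
    my (@thread_data, @found);
    for my $thread (@threads) {
        my ($number, $lines) = @$thread;

        # The one-line fix: empty the array before reading each file.
        @thread_data = () if $reset;

        # Fill by index, as in the snippet; this never shrinks the array,
        # so a shorter file leaves lines from the previous one behind.
        my $x = 0;
        $thread_data[$x++] = $_ for @$lines;

        for my $line (@thread_data) {
            my @details = split /\|/, $line;
            push @found, $number if $details[4] =~ /\Q$term\E/i;
        }
    }
    return @found;
}

print "without reset: @{[ search('cheese', 0) ]}\n";
print "with reset:    @{[ search('cheese', 1) ]}\n";
```

Without the reset, thread 122 is reported even though none of its own lines mention cheese, because the second line of thread 121 is still sitting in the array. That would also be consistent with false hits showing up in runs of neighbouring thread numbers.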
Leo TheHobbit
GED/CS d? s-:++ a+ C++ UL+++ P+++>+++++ E+ W++ N+ o K? !w O? M V PS+++
PE-- Y+ PPG+ t++ 5? X-- R+ tv+ b+++ DI? D G++ e*(++++) h r++ y+++(*)