This works okay, though I'm not sure how it fares performance-wise compared with other methods.
#! perl -slw
use strict;
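# Build one regex that matches only if every word passed in is present.
sub reAnd{
    my $re = '';
    # One anchored lookahead per word; quotemeta() operates on $_.
    $re .= '(?=^.*\b' . quotemeta() . '\b)' for @_;
    return qr[$re];
}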
my @words = qw[ an of and ];
my $re1 = reAnd( @words );
#print $re1;
my $re2 = reAnd( qw[ a great sweet mother by the wellfed voice beside him ] );
#print $re2;
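while( <DATA> ) {
    # Caveat: /i here does not reach inside a precompiled qr// object;
    # the sample line matches because it is already lowercase.
    m[$re1]i and print "1:$_";
    m[$re2] and print "2:$_";
}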
__DATA__
Stephen, an elbow rested on the jagged granite, leaned his palm against
his brow and gazed at the fraying edge of his shiny black coat-sleeve.
Pain, that was not yet the pain of love, fretted his heart. Silently,
in a dream she had come to him after her death, her wasted body within
its loose brown graveclothes giving off an odour of wax and rosewood,
her breath, that had bent upon him, mute, reproachful, a faint odour of
wetted ashes. Across the threadbare cuffedge he saw the sea hailed as
a great sweet mother by the wellfed voice beside him. The ring of bay
and skyline held a dull green mass of liquid. A bowl of white china had
stood beside her deathbed holding the green sluggish bile which she had
torn up from her rotting liver by fits of loud groaning vomiting.
Prints:
C:\test>624296.pl
1:its loose brown graveclothes giving off an odour of wax and rosewood,
2:a great sweet mother by the wellfed voice beside him. The ring of bay
The basic mechanism is to use a regex of the form (?=^.*\bword\b). That is, a positive lookahead assertion that reads: starting at the beginning of the line, skip as much of anything as needed to locate the word 'word', delimited by word/nonword transitions (\b).
As these are zero-length assertions, they do not advance the match point, so adding a second one again starts from the beginning of the string. This gives the ability to match any number of words in any order. If they all match, the regex succeeds and the AND operation is achieved.
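To see the order independence concretely, here is a minimal sketch that builds the same kind of pattern for two words and tries it against strings with the words in either order. The helper name all_words_re and the sample strings are made up for illustration; it is equivalent in spirit to reAnd() above, not a replacement for it.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): one anchored
# lookahead per word, concatenated into a single pattern.
sub all_words_re {
    my $re = join '', map { '(?=^.*\b' . quotemeta($_) . '\b)' } @_;
    return qr/$re/;
}

my $re = all_words_re( qw[ wax rosewood ] );

# Made-up test strings: both orders, one missing word, and one
# near miss where \b rejects "waxed".
for my $line ( 'wax and rosewood', 'rosewood and wax',
               'wax only', 'waxed rosewood' ) {
    printf "%-18s %s\n", $line, $line =~ $re ? 'matches' : 'no match';
}
```

The first two strings should match regardless of word order; the last two should not, because one word is missing or is only part of a longer word.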
By generating the regex in a sub, the 'horrors' of the 'bunch of regex' can be hidden from the squeamish.
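If you need case-independent matching, build the regex with /i (for example, return qr[$re]i from the sub). Adding /i where the precompiled regex is used has no effect, because a qr// object keeps the flags it was compiled with.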
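One caveat worth checking on your own Perl: a qr// object carries its own flags, so case-insensitivity is most reliable when it is baked in at construction time. A minimal sketch, assuming a hypothetical variant named reAndNoCase (not part of the script above):

```perl
use strict;
use warnings;

# Hypothetical variant of reAnd(): the /i flag is applied when the
# pattern is compiled, so it travels with the qr// object.
sub reAndNoCase {
    my $re = join '', map { '(?=^.*\b' . quotemeta($_) . '\b)' } @_;
    return qr/$re/i;
}

my $re = reAndNoCase( qw[ an of and ] );
print "matched\n" if 'AN odour OF wax AND rosewood' =~ $re;

# For contrast: a pattern compiled without /i stays case-sensitive
# even when /i is added where it is used.
my $plain = qr/(?=^.*\bodour\b)/;
print "use-site /i ignored\n" unless 'ODOUR' =~ /$plain/i;
```

The second check is there only to illustrate why the flag belongs on the qr// itself.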
If you omit the ^, the lookaheads start from the current match position, so you can append the AND operation to a longer regex. However, continuing to match after the successful match is more involved.
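As a rough illustration of the unanchored form, the sketch below drops the ^ and places the lookaheads after a literal "--", so both words must appear after that point. The helper name reAndHere, the "--" separator and the sample strings are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: like reAnd() but without the ^ anchor, so each
# lookahead scans forward from wherever the match currently stands.
# Returns a plain string so it can be embedded in a longer pattern.
sub reAndHere {
    return join '', map { '(?=.*\b' . quotemeta($_) . '\b)' } @_;
}

my $and = reAndHere( qw[ sea mother ] );
my $re  = qr/--\s*$and/;    # both words must follow the "--"

for my $line ( 'hailed -- the sea, a great sweet mother',
               'the sea -- a great sweet mother' ) {
    printf "%-42s %s\n", $line, $line =~ $re ? 'matches' : 'no match';
}
```

The first string should match; the second should not, because 'sea' occurs only before the "--" and the lookaheads never look backwards.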