If you want to match Guinea but not New Guinea or Equatorial Guinea, then what you probably want is a negative lookbehind assertion that specifically rules out a preceding "New " or "Equatorial ".
One caveat: you can't combine the two prefixes with alternation inside a single lookbehind, because they differ in length and variable-length negative lookbehind isn't supported. Instead, you must list the alternatives as separate assertions. You can, of course, use alternation in the lookahead assertion.
use strict;
use warnings;
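my $pattern = qr{
    (?<!New\s)                # not preceded by "New" plus whitespace
    (?<!Equatorial\s)         # not preceded by "Equatorial" plus whitespace
    Guinea
    (?![\s-](?:Bissau|pig))   # not followed by whitespace or hyphen, then "Bissau" or "pig"
}ix;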
while (my $text = <DATA>) {
    my $match = $text =~ m/$pattern/ ? 1 : 0;
    print "$match $text";
}

# This prints...
# 0 Papua New Guinea
# 1 I live in Guinea.
# 1 i live in guinea, but i don't have a shift key.
# 0 Guinea-Bissau
# 0 Guinea Bissau
# 0 Equatorial Guinea
# 0 I love guinea pigs!
__DATA__
Papua New Guinea
I live in Guinea.
i live in guinea, but i don't have a shift key.
Guinea-Bissau
Guinea Bissau
Equatorial Guinea
I love guinea pigs!
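
If the fixed-length restriction on lookbehind gets in the way (for example, if you also want to reject "New  Guinea" with two spaces, which the lookbehind version above would accept), one possible alternative is the (*SKIP)(*FAIL) idiom available since Perl 5.10. This is a different technique from the lookbehind approach above, not a drop-in replacement, and the helper name mentions_guinea is made up purely for illustration. A minimal sketch, assuming the same rules as before:

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this example):
# returns 1 if $text mentions plain "Guinea", 0 otherwise.
sub mentions_guinea {
    my ($text) = @_;
    return $text =~ m{
        # Unwanted forms: match them, then throw the match away and
        # resume searching after them. Alternation and \s+ are fine
        # here because this part is not inside a lookbehind.
        (?: New | Equatorial ) \s+ Guinea (*SKIP) (*FAIL)
      |
        # Wanted form: Guinea as a whole word, not followed by
        # whitespace or a hyphen and then "Bissau" or "pig".
        \b Guinea \b (?! [\s-] (?: Bissau | pig ) )
    }ix ? 1 : 0;
}

for my $text ("I live in Guinea.", "Papua New Guinea", "New  Guinea") {
    print mentions_guinea($text), " $text\n";
}
```

The first branch consumes the names you don't want, so the second branch never gets a chance to start matching in the middle of them. Whether that reads better than two separate lookbehinds is a matter of taste.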