Re: Regular expression help: why does this not match?
by Zaxo (Archbishop) on Jan 18, 2007 at 22:57 UTC
'?' is a regex metacharacter for optional. It doesn't match a literal '?'.
'.' is a metacharacter too (it matches any character), but it will only surprise you later, because it does also match a literal '.'.
You can write,
if ($a =~ /\Q$b\E/) {
# . . .
}
to get all characters in $b taken as literal.
BTW, $a and $b are special to sort, so shouldn't be chosen as user variable names.
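To see what \Q...\E is doing, here is a minimal sketch that prints the quoted form and compares the two matches. The names $text and $needle are my own stand-ins for $a and $b, and the URL is the one from the thread.

```perl
use strict;
use warnings;

# Hypothetical names: $text and $needle stand in for $a and $b.
my $text   = 'http://www.myurl.com/e.php?a';
my $needle = 'http://www.myurl.com/e.php?a';

# quotemeta backslashes every ASCII non-word character; \Q...\E applies
# the same quoting to whatever is interpolated between them.
my $quoted = quotemeta $needle;
print "$quoted\n";    # http\:\/\/www\.myurl\.com\/e\.php\?a

my $unquoted_hit = ($text =~ /$needle/)     ? 1 : 0;
my $quoted_hit   = ($text =~ /\Q$needle\E/) ? 1 : 0;

print "unquoted: ", ($unquoted_hit ? "FOUND" : "NOT FOUND"), "\n";   # NOT FOUND
print "quoted:   ", ($quoted_hit   ? "FOUND" : "NOT FOUND"), "\n";   # FOUND
```

The variable is interpolated first and the result is then quoted, which is why \Q$needle\E still sees the contents of $needle.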
yes, i figured that... so if i have a very long string that contains multiple possible regex metacharacters, how can i do a match and tell that match to "ignore" any such metacharacters? or do i have to process the string first and backslash all of them??
Or don't use a regexp at all.
$a =~ /^\Q$b\E\z/
is equivalent to
$a eq $b
$a =~ /\Q$b\E/
is equivalent to
index($a, $b) >= 0
Case-insensitive versions:
$a =~ /^\Q$b\E\z/i
is equivalent to
lc($a) eq lc($b)
$a =~ /\Q$b\E/i
is equivalent to
index(lc($a), lc($b)) >= 0
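A quick way to convince yourself of the substring equivalences above is to run both forms over a few strings and look for disagreements. This is only a sketch; the sample strings and the names $hay, $needle and @mismatches are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data; $hay and $needle stand in for $a and $b.
my $needle = 'e.php?a';
my @hays   = (
    'http://www.myurl.com/e.php?a',
    'http://www.myurl.com/eXpha',
    'E.PHP?A',
);

my @mismatches;
for my $hay (@hays) {
    # Case-sensitive substring test, both ways.
    my $re_sub  = ($hay =~ /\Q$needle\E/)       ? 1 : 0;
    my $idx_sub = (index($hay, $needle) >= 0)   ? 1 : 0;

    # Case-insensitive substring test, both ways.
    my $re_ci   = ($hay =~ /\Q$needle\E/i)              ? 1 : 0;
    my $idx_ci  = (index(lc($hay), lc($needle)) >= 0)   ? 1 : 0;

    push @mismatches, $hay if $re_sub != $idx_sub or $re_ci != $idx_ci;
    print "$hay => substring $re_sub/$idx_sub, case-insensitive $re_ci/$idx_ci\n";
}

print @mismatches ? "disagreements: @mismatches\n" : "regex and index agree\n";
```

For plain ASCII data like this the two forms agree. With non-ASCII text the lc comparison and /i can differ in a few cases, since /i uses Unicode case folding.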
Re: Regular expression help: why does this not match?
by chargrill (Parson) on Jan 18, 2007 at 23:00 UTC
Correct, ? is a regex meta character which means "Match 1 or 0 times". In your first example, your regex is looking for:
http://www(any char)myurl(any char)com/e(any char)ph(0 or 1 p)a.
Please note that the dot . also has special meaning: it matches any character (and a literal dot qualifies as "any character" :-). You could write \. to match a dot, but since you want to match the question mark too, you might be better off with \Q, which quotes the regex metacharacters that follow, generally paired with \E, which stops the quoting, i.e.:
$a = "http://www.myurl.com/e.php?a";
$b = "http://www.myurl.com/e.php?a";
if ($a =~ /\Q$b\E/) {
print "FOUND\n";
} else {
print "NOT FOUND\n";
}
Incidentally, $a and $b are bad names for variables, as they have special meaning to sort. And you might also have to fix up your shebang (!/usr/bin/perl should be #!/usr/bin/perl) in case you ever want to run your program via ./ instead of perl program.pl.
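To make the "what is the unquoted regex actually looking for" point concrete, here is a small sketch that runs the unquoted pattern against a few strings. The name $pattern and the candidate strings are invented for the example.

```perl
use strict;
use warnings;

# $pattern is a hypothetical name for the string used as an unquoted regex.
my $pattern = 'http://www.myurl.com/e.php?a';

# Hypothetical test strings.
my @candidates = (
    'http://www.myurl.com/e.php?a',   # the literal URL
    'http://wwwXmyurlYcom/eZpha',     # every dot replaced, second 'p' dropped
    'http://www.myurl.com/e.phpa',    # 'p?' consumes one 'p', then 'a'
);

my %found;
for my $s (@candidates) {
    $found{$s} = ($s =~ /$pattern/) ? 1 : 0;
    printf "%-32s %s\n", $s, ($found{$s} ? 'FOUND' : 'NOT FOUND');
}
```

The literal URL is the only one of the three that does not match, because after "ph" and the optional "p" the pattern wants an "a" and finds a "?" instead.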
--chargrill
s**lil*; $*=join'',sort split q**; s;.*;grr; &&s+(.(.)).+$2$1+; $; =
qq-$_-;s,.*,ahc,;$,.=chop for split q,,,reverse;print for($,,$;,$*,$/)
Re: Regular expression help: why does this not match?
by gaal (Parson) on Jan 18, 2007 at 23:00 UTC
You're not stupid; after all, you answered your own question correctly. The '?' in the first $b is indeed being interpreted as a metacharacter. To prevent this, use the \Q...\E codes in your regular expression (or the quotemeta builtin outside it):
$a = "http://www.myurl.com/e.php?a";
$b = "http://www.myurl.com/e.php?a";
if ($a =~ /\Q$b\E/) { # <---------
print "FOUND\n";
} else {
print "NOT FOUND\n";
}
ah ha! \Q..\E is what i was looking for...
however, it's funny and a bit counterintuitive how this code still knows to interpolate $b rather than simply turning it into \$b
It may be surprising at first, but the idea is that it lets you construct regexps on the fly. One common thing this is used for is when you have a list of valid values you got from somewhere, say @valid, and you want to check a value against it:
my $valid = join "|", @valid;
print "okay" if /^(?:$valid)$/;   # (?:...) so ^ and $ apply to every alternative
There are actually two improvements to make in the above code. First, the members of the valid list themselves might contain metacharacters in need of quoting; second, Perl has the qr// operator to make this more efficient:
# don't run this code on every match: the idea is the qr// needs
# to be computed only once.
my $valid = join "|", map { quotemeta } @valid;
my $valid_re = qr/^(?:$valid)$/;   # group the alternation so the anchors cover all of it
# now match as many times as you like.
print "$_: " . (/$valid_re/ ? "okay" : "not okay") . "\n"
for @a_bunch_of_inputs;
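For anyone who wants to run the whitelist idea end to end, here is a self-contained sketch. The list contents, the inputs and the names $alternatives and %verdict are made up. I also use \z instead of $ so that a trailing newline is not accepted, which is my choice rather than something the code above requires.

```perl
use strict;
use warnings;

# Hypothetical whitelist; note the metacharacter in the second entry.
my @valid = ('foo', 'a.b', 'bar');

# Quote each member, then group the alternation so that ^ and \z
# bind to the whole list and not just to the first and last entries.
my $alternatives = join '|', map { quotemeta } @valid;
my $valid_re     = qr/^(?:$alternatives)\z/;

# Hypothetical inputs: two valid values and three near misses.
my @inputs = ('foo', 'a.b', 'axb', 'foobar', 'xbar');

my %verdict;
for my $input (@inputs) {
    $verdict{$input} = ($input =~ $valid_re) ? 'okay' : 'not okay';
    print "$input: $verdict{$input}\n";
}
```

Without the (?:...) group, 'foobar' and 'xbar' would both pass, because ^ would attach only to 'foo' and the end anchor only to 'bar'.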
Re: Regular expression help: why does this not match?
by vaticide (Scribe) on Jan 18, 2007 at 23:02 UTC
The ? is a regex metacharacter. If you want to match it, match on "\?".
Otherwise, you should be checking for string equality using $a eq $b.
#!/usr/local/perl
$a = "http://www.myurl.com/e.php?a";
$b = qr|http://www.myurl.com/e.php\?a|;
if ($a =~ /$b/) {
print "FOUND\n";
} else {
print "NOT FOUND\n";
}
$a = "http://www.myurl.com/e.php";
$b = "http://www.myurl.com/e.php";
if ($a =~ /$b/) {
print "FOUND\n";
} else {
print "NOT FOUND\n";
}
$a = "http://www.myurl.com/e.php";
$b = "http://www.myurl.com/e.php";
if ($a eq $b) {
print "FOUND\n";
} else {
print "NOT FOUND\n";
}