Re: Regex conditional match if previous match in same expression is true?
by ikegami (Patriarch) on Apr 09, 2007 at 18:31 UTC
What in the world am I missing?
The error is in your problem definition. Your regexp does exactly what you said you wanted it to do. It's searching for a string optionally surrounded by '{' and '}'. The hello in {hello satisfies that: it is not surrounded by {...}, and "optionally" means that is allowed.
'oh {hello there' =~ /
([{]{0,1}) # Matches '' (after some backtracking)
hello # Matches 'hello'
(?(1)\}) # Matches ''
/x;
'oh hello} there' =~ /
([{]{0,1}) # Matches ''
hello # Matches 'hello'
(?(1)\}) # Matches ''
/x;
As you can see, searching for a string optionally surrounded by something is the same thing as searching for the string itself.
From your expected results, I deduce you actually want a string that is surrounded by {...}, or one that is neither preceded by { nor followed by }.
/
   \{hello\}      # '{hello}'
|
   (?<! \{ )      # Not preceded by '{'
   hello          # 'hello'
   (?! \} )       # Not followed by '}'
/x
From your expected results, I deduce you actually want a string that is surrounded by {...}, or one that is neither preceded by { nor followed by }.
That's correct: I want the string, optionally surrounded by braces. One brace on only one side is not acceptable. I'm glad that my listing of expected results was clearer than my description. ;-)
Your alternation approach certainly functions. However, I was also hoping to learn to use the conditional ( (?(COND)...) ) notation. So you answered the question I asked (thanks!); but left me with the one I didn't ask.
For my own education, can you think of a solution that uses the conditional notation, or would I be horribly abusing said notation to solve this problem?
<–radiant.matrix–>
Ramblings and references
The Code that can be seen is not the true Code
I haven't found a problem yet that can't be solved by a well-placed trebuchet
/
   (?(?<=(.))
      # We are not at the start of the string.
      # The preceding character is in $1.
      hello
      (?(?{ $1 eq "\{" })
         # The char before 'hello' is '{'.
         }
      |
         # The char before 'hello' is not '{'.
         (?! } )
      )
   |
      # We are at the start of the string.
      hello
      (?! } )
   )
/x
Yikes!
Other posters were correct when they changed things from using the {min,max} quantifier notation on the inside of the capture buffer to using the '?' quantifier on the outside. There is a crucial difference between "empty but matching" and "not matching", and {0,1} doesn't have the same behaviour as '?' even though they are functionally equivalent. (This may be construed as a bug :-) Alternatively, you can do what I do below, which is to put the capture inside an alternation.
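To make the "empty but matching" versus "not matching" distinction concrete, here is a minimal sketch. It assumes a reasonably recent Perl and uses '?' in both patterns, so it says nothing about the {0,1} quirk mentioned above; the helper name check is made up for illustration.

```perl
use strict;
use warnings;

# Quantifier inside the group: group 1 always takes part in the match
# (possibly matching ''), so the condition (?(1)...) is always true.
my $inside  = qr/ ( \{? ) hello (?(1) \} ) /x;

# Quantifier outside the group: group 1 is skipped when there is no '{',
# so the condition is false and nothing more is required.
my $outside = qr/ ( \{ )? hello (?(1) \} ) /x;

# Hypothetical helper: reports whether $text matches $re.
sub check {
    my ($re, $text) = @_;
    return $text =~ $re ? 'match' : 'no match';
}

for my $text ('hello', '{hello}') {
    printf "%-8s inside: %-8s outside: %s\n",
        $text, check($inside, $text), check($outside, $text);
}
```

As far as I can tell, only the "inside" pattern rejects a plain hello, because its always-set group forces the closing brace.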
Anyway, the following uses the conditional pattern with lookahead/lookbehind to match as you requested. You can play around with it to easily forbid {{hello}}, as the current code allows it.
use strict;
use warnings;
for (
    'oh {hello} there',
    'oh {hello there',
    'oh hello there',
    'oh hello} there',
    'of {{hello}} there',
) {
    if ( $_ =~ /
        (?:
            ( \{ )      # capture a '{'
            |           # or
            (?<! \{ )   # not preceded by a '{'
        )
        hello           # .. followed by 'hello'
        (?(1)           # if (defined $1)
            \}          # match a '}'
            |           # else
            (?! \} )    # not followed by a '}'
        )               #
    /x)
    {
        print "YEP : $_ : ",
            defined $1 ? "'$1'" : 'undef',
            " - $&\n";
    } else {
        print "NOPE: $_\n";
    }
}
__END__
YEP : oh {hello} there : '{' - {hello}
NOPE: oh {hello there
YEP : oh hello there : undef - hello
NOPE: oh hello} there
YEP : of {{hello}} there : '{' - {hello}
Also, I changed your diagnostic code: you were using $1 even when the match failed, which meant you were printing the wrong thing.
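For anyone wondering why reading $1 after a failed match is misleading, here is a small sketch of the general rule that a failed match leaves the capture variables untouched. The strings and variable names are mine, not from the original post.

```perl
use strict;
use warnings;

my $re = qr/ ( \{ ) hello \} /x;

my $first  = 'oh {hello} there' =~ $re;   # succeeds and sets $1
my $second = 'oh goodbye there' =~ $re;   # fails and leaves $1 alone

# Unguarded: $1 still holds the capture from the earlier successful match.
my $stale = $1;

# Guarded: only trust $1 when the match that should have set it succeeded.
my $safe = $second ? $1 : undef;

print "stale: ", (defined $stale ? "'$stale'" : 'undef'), "\n";
print "safe:  ", (defined $safe  ? "'$safe'"  : 'undef'), "\n";
```

The guarded form is the same idea as testing the match in an if before printing $1.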
---
$world=~s/war/peace/g
Re: Regex conditional match if previous match in same expression is true?
by merlyn (Sage) on Apr 09, 2007 at 17:23 UTC
Looks to me like $1 is always going to be defined. It just may be empty. Maybe what you want is (\{)? to match the open brace, very much like the example in perlre.
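A quick way to see the "defined but empty" point is to capture in list context with the quantifier in each position. This is only a sketch; the variable names are invented for illustration.

```perl
use strict;
use warnings;

# Quantifier inside the parentheses: the group always takes part in the
# match, so its capture is defined (it is '' when there is no brace).
my ($inside) = 'oh hello there' =~ / ( \{? ) hello /x;

# Quantifier outside the parentheses: the group is skipped when there is
# no brace, so its capture is undef.
my ($outside) = 'oh hello there' =~ / ( \{ )? hello /x;

printf "inside:  %s\n", defined $inside  ? "'$inside'"  : 'undef';
printf "outside: %s\n", defined $outside ? "'$outside'" : 'undef';
```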
use strict;
use warnings;
for (
    'oh {hello} there',
    'oh {hello there',
    'oh hello there',
    'oh hello} there',
) {
    print '',
        ( $_ =~ /
            (\{)?       # optional opening brace
            hello       # .. followed by 'hello'
            (?(1)\})    # a closing brace iff the open brace was there
        /x ? 'YEP ' : 'NOPE' ),
        " - $1\n";
}
But my output is:
YEP - {
YEP -
YEP -
YEP -
However, now warnings are thrown for using an uninitialized '$1' in the last three cases, as it should be.
What I'm struggling with is how to say that '{hello}' is OK, and 'hello' is OK, but '{hello' and 'hello}' are NOT OK. Or, put another way, I want to require a closing brace if an opening brace is found; however, if there is no opening brace then there must be no closing brace either.
Thanks for the help, though, it's an improvement.
<–radiant.matrix–>
Ramblings and references
The Code that can be seen is not the true Code
I haven't found a problem yet that can't be solved by a well-placed trebuchet
Re: Regex conditional match if previous match in same expression is true?
by Rhandom (Curate) on Apr 09, 2007 at 19:02 UTC
The following doesn't use the (?() pat) construct but it does use a similar setup - and it prints out the correct output:
for (
    'oh {hello} there',
    'oh {hello there',
    'oh hello there',
    'oh hello} there',
) {
    our $paren;
    if (/
        ({ | (?<!{))                    # optional opening brace
        (?{ $paren = $^N })             # store for later
        hello                           # .. followed by 'hello'
        (??{$paren ? "\}" : "(?!\})"})  # a closing brace if the open brace was there
    /x) {
        print "Yep - $1\n";
    } else {
        print "Nope\n";
    }
}
I tried briefly to get the (?() ) form to work but couldn't get it to go.
my @a=qw(random brilliant braindead); print $a[rand(@a)];
That doesn't give me the desired results at all. I was using Perl 5.6, but $^N was only introduced in 5.8.
By the way, you should localize your package variables whenever possible. Replace
our $paren;
with
local our $paren;
I'd also add a comment along the lines of "Always use package variables with regular expressions." Someone reading or maintaining the code could very well not know that lexical variables used inside (?{ }) and (??{ }) blocks can cause problems.
Finally, to avoid continually compiling regexp fragments, replace
(??{$paren ? "\}" : "(?!\})"})
with
(?(?{ $paren }) } | (?!}) )
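Putting the two replacements together, the loop might look like the sketch below. It assumes Perl 5.8 or later (for $^N); the %result hash and the escaped braces are my additions, not part of either post.

```perl
use strict;
use warnings;

my %result;
for my $text (
    'oh {hello} there',
    'oh {hello there',
    'oh hello there',
    'oh hello} there',
) {
    # Always use package variables with regular expressions.
    local our $paren;
    $result{$text} = $text =~ /
        ( \{ | (?<! \{ ) )               # opening brace, or no brace before us
        (?{ $paren = $^N })              # remember what group 1 just matched
        hello                            # .. followed by 'hello'
        (?(?{ $paren }) \} | (?! \} ) )  # closing brace only if we saw an opening one
    /x ? 'Yep' : 'Nope';
    print "$result{$text} - $text\n";
}
```

Compared with the (??{ }) version, the conditional is compiled once with the rest of the pattern instead of building a new regexp fragment on every attempt.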
Note that Rhandom and I basically posted the same pattern, the main difference being that mine doesn't need the (?{})/(??{})/$^N stuff, using the conditional pattern instead (as you originally requested).
Actually, to be honest, I didn't really look deeply at Rhandom's post before I posted mine. It's cool that we came up with pretty much the same thing, but using two different advanced feature sets.
---
$world=~s/war/peace/g
Re: Regex conditional match if previous match in same expression is true?
by kyle (Abbot) on Apr 09, 2007 at 17:44 UTC
Your expression just never needs to match the braces at all. If you add \s at the beginning and end of the expression (to force Perl to look at something beyond 'hello'), you get the expected result.
Update: In my fooling around, I'd also changed ([{]{0,1}) to ([{])? (because the first version, as merlyn says, will always match, but sometimes an empty string). That's also necessary to make it work as expected.
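Roughly what that looks like, as a sketch using the four test strings from the thread (the %verdict hash is just for illustration):

```perl
use strict;
use warnings;

my %verdict;
for my $text (
    'oh {hello} there',
    'oh {hello there',
    'oh hello there',
    'oh hello} there',
) {
    # The \s on each side stops the engine from settling for a bare
    # 'hello' while ignoring a brace sitting right next to it.
    $verdict{$text}
        = $text =~ / \s ( \{ )? hello (?(1) \} ) \s /x ? 'YEP ' : 'NOPE';
    print "$verdict{$text} - $text\n";
}
```

Note that this requires whitespace on both sides, so a string that starts or ends with hello would need something like (?:^|\s) and (?:\s|$) in place of the plain \s.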
use Test::More;
my %tests
    = ( 'hello'            => 1,
        '{hello}'          => 1,
        'hello}'           => 0,
        '{hello'           => 0,
        'oh {hello} there' => 1,
        'oh {hello there'  => 0,
        'oh hello there'   => 1,
        'oh hello} there'  => 0,
    );
plan 'tests' => scalar keys %tests;
while ( my ( $text, $result ) = each %tests ) {
    my $hello = qr/hello/;
    my $test_result
        = ( $text =~
            /
                \{$hello\}  # hello with braces
                |           # or
                (?<! \{ )   # not a brace
                $hello      # hello
                (?! \} )    # also not a brace
            /x
        ) ? 1 : 0;
    is( $test_result, $result, $text );
}
Put your (possibly complicated) match text in a variable so you don't have to change it in two places when it changes. After that, literally match the text with braces and without.