Re: What it mathches`
by ikegami (Patriarch) on Aug 17, 2009 at 04:42 UTC
Nothing. It's impossible to find a double quote before the start of the string.
- Matches a double quote
- followed by the start of the string
- followed by "abc"
- followed by "A" or a backspace
- followed by "def"
- followed by whatever the contents of the variable $" match (a single space by default).
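To see the breakdown above in action, here is a minimal sketch. The candidate strings are made up for illustration and are not from the original question; the point is only that a quote can never sit in front of the start of the string, so nothing matches.

```perl
use strict;
use warnings;

# The pattern from the question. Inside the regex, $" is interpolated;
# with the default list separator it becomes a single space.
my $re = qr/"^abc[A\b]def$"/;

# Hypothetical inputs, each shaped like part of the breakdown.
my @candidates = (
    qq{abcAdef},
    qq{"abcAdef },
    qq{"abc\bdef },
    qq{x\n"abcAdef },
);

# grep in scalar context returns the number of candidates that match.
my $hits = grep { $_ =~ $re } @candidates;

print "matches: $hits\n";    # prints "matches: 0"
```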
Re: What it mathches`
by Tanktalus (Canon) on Aug 17, 2009 at 05:14 UTC
I had to fight with this a lot to get a "true" output. And even then, I cheated. The most basic of the problems is that ^ is a zero-width assertion. Think about another zero-width assertion, \b: the boundary between a word character and a non-word character. If you have /a\bc/, this can never match anything because there is not, by definition, a change between word and non-word between an a and a c. Can't happen.
Similarly, ^ is a zero-width assertion that asserts this is the beginning of a line. Some pedants may point out that it actually is the beginning of the string, but that's not quite true. The m modifier allows ^ to match after any newline in the string - in fact, according to perlre, the m modifier is merely removing the optimisation that perl has that assumes there is only one line in the string you're testing. That means that it's assuming it's a single line, thus ^ is the beginning of the string because of the assumption there is only one line.
Anyway, ^, being zero-width, must be right after either the physical beginning of the string, or right after a \n. It can't be right after a quote.
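A small sketch of the two zero-width assertions discussed above. The test strings are hypothetical, chosen only to show that /a\bc/ has nothing to match and that ^ after a newline needs the m modifier.

```perl
use strict;
use warnings;

# \b between two word characters: there is no boundary between "a" and "c",
# so none of these inputs can match.
my $boundary_hits = grep { /a\bc/ } ("ac", "a c", "a-c", "abc");

# A quote, a newline, then "abc": ^ gets tested at offset 2.
my $text = qq{"\nabc};

my $without_m = $text =~ /"\n^abc/  ? 1 : 0;   # ^ only at the start of the string
my $with_m    = $text =~ /"\n^abc/m ? 1 : 0;   # ^ also right after a newline

# prints "a\bc hits: 0, without /m: 0, with /m: 1"
print "a\\bc hits: $boundary_hits, without /m: $without_m, with /m: $with_m\n";
```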
However, if we insert a \n into your regex right before the ^, we still don't quite get it to work because you're missing the m modifier. I'm also assuming you haven't set the deprecated $* variable (see perlvar, but don't use it - it's deprecated). Let's say we use the m modifier. It still doesn't work because $_=<> will only drag in a single line. Typing in "\nabc... won't match because $_ will only have the ", terminating the input on the newline. There is more cheating to be had: adding local $/; before the input line. Now I have:
(echo '"'; echo "abcd") | perl -le 'local $/;$_=<>; $*=1; print "[$_]"; if (/"\n^abc/) { print "true" }'
And, lo and behold, it works. But notice: I added the $/ and $* (which you shouldn't do) variables, and the \n inside your regex.
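For anyone who wants to try this without $*, here is a rough equivalent that uses the m modifier instead. It is only a sketch: the in-memory filehandle stands in for the piped input, and the variable names are mine.

```perl
use strict;
use warnings;

# Stand-in for: (echo '"'; echo "abcd") | perl ...
my $input = qq{"\nabcd\n};

# Line-at-a-time read: only the quote and its newline arrive.
open my $fh, '<', \$input or die "cannot open in-memory handle: $!";
my $first_line = <$fh>;
close $fh;

# Slurp: with $/ undefined, the read returns everything at once.
open $fh, '<', \$input or die "cannot open in-memory handle: $!";
my $slurped = do { local $/; <$fh> };
close $fh;

# Same regex as above, with /m doing the job that $* used to do.
my $line_ok  = $first_line =~ /"\n^abc/m ? 1 : 0;
my $slurp_ok = $slurped    =~ /"\n^abc/m ? 1 : 0;

print "single line: $line_ok, slurped: $slurp_ok\n";   # single line: 0, slurped: 1
```

The slurp is what makes the difference: without it the regex only ever sees the first line, whatever modifiers are in play.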
I do have to wonder, though, why you're asking this question. It has a slight odor of XY Problem ... or maybe homework. But only slight.
|
thus ^ is the beginning of the string because of the assumption there is only one line.
No, ^ is the beginning of the string because /m wasn't used. No assumption was made.
By the way, $* doesn't exist anymore.
Re: What it mathches`
by CountZero (Bishop) on Aug 17, 2009 at 06:09 UTC
Obviously, this was written by someone who thinks all strings must be double-quoted in Perl! Consequently, he put double quotes around the regex. Not only is this unnecessary, it actually breaks the regex. Most probably, it was meant to be:
if (/^abc[A\b]def$/) {
    print "true";
}
else {
    print "false";
}
Which means: match a string starting with 'abc', followed by either a capital 'A' or a backspace, followed by 'def' which ends the string. Please note that normally \b in a regex means "match on a word boundary", but inside a character class (the square brackets) it means 'backspace'.
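A quick way to check that reading is to run the unquoted regex against a few strings. The test cases below are hypothetical, picked to cover each part of the description; note that $ also matches just before a trailing newline.

```perl
use strict;
use warnings;

my $re = qr/^abc[A\b]def$/;

# "\b" inside a double-quoted string is a backspace character (0x08).
my @cases = (
    [ "abcAdef",   "capital A" ],
    [ "abc\bdef",  "backspace" ],
    [ "abcAdef\n", "trailing newline" ],
    [ "abcdef",    "nothing between abc and def" ],
    [ "abcBdef",   "wrong letter" ],
    [ "xabcAdef",  "does not start with abc" ],
);

# 1 for a match, 0 otherwise, in the same order as @cases.
my @results = map { $_->[0] =~ $re ? 1 : 0 } @cases;

for my $i (0 .. $#cases) {
    printf "%-30s %s\n", $cases[$i][1], $results[$i] ? "match" : "no match";
}
```

The first three cases match and the last three do not.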
CountZero
"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
Re: What it mathches`
by Anonymous Monk on Aug 17, 2009 at 05:03 UTC
use YAPE::Regex::Explain;
print YAPE::Regex::Explain->new( qr/"^abc[A\b]def$"/ )->explain;
__END__
The regular expression:
(?-imsx:"^abc[A\b]def )
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  "                        '"'
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  abc                      'abc'
----------------------------------------------------------------------
  [A\b]                    any character of: 'A', '\b' (backspace)
----------------------------------------------------------------------
  def                      'def '
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
Re: What it mathches`
by bobf (Monsignor) on Aug 17, 2009 at 05:56 UTC
As other monks have suggested, this regex is a bit odd. Are the quotes supposed to be part of the pattern? The pattern does not need to be quoted inside the regex delimiters. See perlre and perlreref.
If the quotes were added in error (i.e., they are not part of the pattern), then the regex becomes
/^abc[A\b]def$/
which not only makes more sense but also makes predicting what it matches trivial.
If you need a hint, see Re: What it mathches` and ignore the parts about the quotes. I also wonder if \b (backspace) is in error, but without additional information speculation on intent is merely that.
Re: What it mathches`
by Anonymous Monk on Aug 17, 2009 at 06:36 UTC
use re 'debug';
$_="abc def";
#$_= qq!"abc def$"!;
if (/"^abc[A\b]def$"/ ){
print "\ntrue\n";
}
else
{
print "\nfalse\n";
}
__END__
Compiling REx `"^abc[A\b]def '
size 19 Got 156 bytes for offset annotations.
first at 1
1: EXACT <">(3)
3: BOL(4)
4: EXACT <abc>(6)
6: ANYOF[\10A](17)
17: EXACT <def >(19)
19: END(0)
anchored ""abc" at 0 (checking anchored) minlen 9
Offsets: [19]
1[1] 0[0] 2[1] 3[3] 0[0] 6[5] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 11[4] 0[0] 15[0]
false
Freeing REx: `"\"^abc[A\\b]def "'
use re 'debug';
#$_="abc def";
$_= qq!"abc def$"!;
if (/"^abc[A\b]def$"/ ){
print "\ntrue\n";
}
else
{
print "\nfalse\n";
}
__END__
Compiling REx `"^abc[A\b]def '
size 19 Got 156 bytes for offset annotations.
first at 1
1: EXACT <">(3)
3: BOL(4)
4: EXACT <abc>(6)
6: ANYOF[\10A](17)
17: EXACT <def >(19)
19: END(0)
anchored ""abc" at 0 (checking anchored) minlen 9
Offsets: [19]
1[1] 0[0] 2[1] 3[3] 0[0] 6[5] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 11[4] 0[0] 15[0]
Guessing start of match, REx ""^abc[A\b]def " against ""abc def "...
Found anchored substr ""abc" at offset 0...
Guessed: match at offset 0
Matching REx ""^abc[A\b]def " against ""abc def "
Setting an EVAL scope, savestack=3
0 <> <"abc def > | 1: EXACT <">
1 <"> <abc def > | 3: BOL
failed...
Match failed
false
Freeing REx: `"\"^abc[A\\b]def "'
|
die qq!{$"}!;
__END__
{ } at - line 2.
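The output above shows that $" holds a single space by default. As a rough illustration of what that variable does, here is a sketch of it being interpolated in a string and in a regex; the array and the test string are made up for the example.

```perl
use strict;
use warnings;

my @parts = ('abc', 'def');

# Default list separator: a single space between interpolated elements.
my $default = "@parts";

# Temporarily change $" for one interpolation; local restores it afterwards.
my $custom = do { local $" = '-'; "@parts" };

# In a regex, $" interpolates the same way, so this pattern is really /def /.
my $in_regex = "abcdef " =~ /def$"/ ? 1 : 0;

print "[$default] [$custom] $in_regex\n";   # prints "[abc def] [abc-def] 1"
```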
Re: What it mathches`
by targetsmart (Curate) on Aug 18, 2009 at 09:09 UTC
Hi abubacker,
I think you covered enough about regular expressions when you learnt sed in your UNIX course.
If you have doubts about the basics, just check with your local mentor there, or go back and read up on regular expressions. :)
Vivek
-- 'I' am not the body, 'I' am the 'soul', which has no beginning or no end, no attachment or no aversion, nothing to attain or lose.