Am I on the pipe, or what?

BorgCopyeditor has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Am I on the pipe, or what? by Zaxo (Archbishop) on Jul 09, 2002 at 23:15 UTC
Perl is seeing the pipe as alternation with nothing, i.e. 'either this or the empty pattern', and the empty pattern matches anything. Here are the details from a debugging perl (5.8.0-RC1):

    $ perl -Dr pipe.pl
    Omitting $` $& $' support.

    EXECUTING...

    Compiling REx `lo/gos'
    size 4 Got 36 bytes for offset annotations.
    first at 1
    rarest char / at 2
    1: EXACT <lo/gos>(4)
    4: END(0)
    anchored `lo/gos' at 0 (checking anchored isall) minlen 6
    Offsets: [4]
    1[6] 0[0] 0[0] 7[0]
    Guessing start of match, REx `lo/gos' against `lo/gou'...
    Did not find anchored substr `lo/gos'...
    Match rejected by optimizer
    Freeing REx: `lo/gos'
    Compiling REx `lo/gou'
    size 4 Got 36 bytes for offset annotations.
    first at 1
    rarest char / at 2
    1: EXACT <lo/gou>(4)
    4: END(0)
    anchored `lo/gou' at 0 (checking anchored isall) minlen 6
    Offsets: [4]
    1[6] 0[0] 0[0] 7[0]
    Guessing start of match, REx `lo/gou' against `lo/gou'...
    Found anchored substr `lo/gou' at offset 0...
    Guessed: match at offset 0
    lo/gou
    Freeing REx: `lo/gou'
    Compiling REx `lo/gw|'
    size 7 Got 60 bytes for offset annotations.
    1: BRANCH(5)
    2: EXACT <lo/gw>(7)
    5: BRANCH(7)
    6: NOTHING(7)
    7: END(0)
    minlen 0
    Offsets: [7]
    0[0] 1[5] 0[0] 0[0] 6[1] 6[0] 7[0]
    Matching REx `lo/gw|' against `lo/gou'
    Setting an EVAL scope, savestack=16
    0 <> <lo/gou> | 1: BRANCH
    Setting an EVAL scope, savestack=22
    0 <> <lo/gou> | 2: EXACT <lo/gw>
    failed...
    0 <> <lo/gou> | 6: NOTHING
    0 <> <lo/gou> | 7: END
    Match successful!
    lo/gw|
    Freeing REx: `lo/gw|'
    Compiling REx `lo/gon'
    size 4 Got 36 bytes for offset annotations.
    first at 1
    rarest char / at 2
    1: EXACT <lo/gon>(4)
    4: END(0)
    anchored `lo/gon' at 0 (checking anchored isall) minlen 6
    Offsets: [4]
    1[6] 0[0] 0[0] 7[0]
    Guessing start of match, REx `lo/gon' against `lo/gou'...
    Did not find anchored substr `lo/gon'...
    Match rejected by optimizer
    Freeing REx: `lo/gon'

After Compline, Zaxo
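To make the trace above concrete, here is a minimal sketch. It assumes the search term was the betacode string `lo/gw|` and the text being searched was `lo/gou`, both read off the trace; the sub and variable names are hypothetical and not taken from the original script. It shows that the unescaped term matches through its empty second alternative, and that `\Q...\E` (or `quotemeta`) makes the pipe literal.

```perl
use strict;
use warnings;

# Hypothetical helper: use $term as a regex, as `$query =~ $term` does.
sub matches_as_regex {
    my ($query, $term) = @_;
    return $query =~ $term ? 1 : 0;
}

# Hypothetical helper: use $term as a literal string by escaping
# every regex metacharacter in it with \Q...\E.
sub matches_literally {
    my ($query, $term) = @_;
    return $query =~ /\Q$term\E/ ? 1 : 0;
}

my $query = 'lo/gou';
my $term  = 'lo/gw|';    # trailing pipe: "lo/gw" OR the empty pattern

print "as regex:  ", matches_as_regex($query, $term),  "\n";   # 1
print "literally: ", matches_literally($query, $term), "\n";   # 0
print "quotemeta: ", quotemeta($term), "\n";                   # lo\/gw\|
```

For data such as betacode, where most terms contain metacharacters, escaping every term before it reaches `=~` is probably the safer default.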
Re: Re: Am I on the pipe, or what? by flounder99 (Friar) on Jul 10, 2002 at 04:40 UTC
If you don't want to compile a debugging version, use YAPE::Regex::Explain:

    use strict;
    use YAPE::Regex::Explain;
    print YAPE::Regex::Explain->new(qr'lo/gw|')->explain();
    __OUTPUT__
    The regular expression:

    (?-imsx:lo/gw|)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      lo/gw                    'lo/gw'
    ----------------------------------------------------------------------
     |                        OR
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

-- flounder
Regex without 'm' or '/' by dvergin (Monsignor) on Jul 10, 2002 at 01:30 UTC
The example that BorgCopyeditor supplies with his question brings up an interesting point. Why does this work:

    if($query=~$term) {...}

Or, to offer a very plain example, why does this work:

    if ( 'abc' =~ 'bc' ) { print "yes\n"; }

The rule I learned was (quoting from perlop): "If "/" is the delimiter then the initial m is optional." Fair enough. The implication is:

    if ( 'abc' =~ m/bc/ ) {   # good
    if ( 'abc' =~ /bc/ ) {    # good
    if ( 'abc' =~ m%bc% ) {   # good (for any non-alphanum)
    but:
    if ( 'abc' =~ 'bc' ) {    # BAD! (or so we assume)

But it is not so. As I said above, both of the examples at the top of this response work. Why?

What I cannot find in the on-line docs (update: see danger's response below) but I know from experimentation and the words of Camel 3, page 144, is that, even without the 'm' or the '/', the right-hand side of =~ "still counts as a m// matching operation, but there'll be no place to put any trailing modifiers, and you'll have to handle your own quoting."

So...

    if ( 'abc' =~ 'bc' ) {        # works
    if ( 'abc' =~ $pattern ) {    # works
    if ( 'abc' =~ "$pattern" ) {  # works
    if ( 'abc' =~ bc ) {          # works!!
    if ( 'abc' =~ 'bc'g ) {       # Error: bareword 'g'

The present writer is not responsible for any sideways looks any of these may earn you from your peers.

------------------------------------------------------------
"Perl is a mess and that's good because the problem space is also a mess." - Larry Wall
Re: Regex without 'm' or '/' by danger (Priest) on Jul 10, 2002 at 04:05 UTC
"What I cannot find in the on-line docs"

It's an operator thing, not a regex thing --- from perlop under "Binding Operators":

    ... If the right argument is an expression rather than a
    search pattern, substitution, or transliteration, it is
    interpreted as a search pattern at run time. This can be
    less efficient than an explicit search, because the pattern
    must be compiled every time the expression is evaluated.

This can sometimes cause problems for newcomers, especially when they use split with a double-quoted string as the split pattern (as seems to happen with undue frequency) and have an escaped metacharacter in the pattern:

    $_ = 'this has a | pipe';
    @a = split /\|/;    # good
    print join(":", @a),"\n";
    @a = split "\|";    # oops
    print join(":", @a),"\n";

In the second case, the double-quoted string is first evaluated and the resulting string (sans backslash) is then used as the pattern in the regex.
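The difference described in the split example above can be checked with a short script. This is only a sketch, assuming the input string contains a bare pipe; the variable names are illustrative. It adds a single-quoted pattern for comparison, since single quotes leave the backslash in place.

```perl
use strict;
use warnings;

my $line = 'this has a | pipe';

# Regex literal: the backslash reaches the regex engine, so the pipe is literal.
my @good = split /\|/, $line;

# Double-quoted string: "\|" has already become "|" when split sees it.
# That pattern is two empty alternatives, so it matches between every
# pair of characters and the line is split into single characters.
my @oops = split "\|", $line;

# Single-quoted string: the backslash survives, so this behaves like /\|/.
my @single = split '\|', $line;

print scalar(@good),   " fields: ", join(':', @good),   "\n";
print scalar(@oops),   " fields\n";
print scalar(@single), " fields: ", join(':', @single), "\n";
```

Writing the pattern as a regex literal avoids the question entirely, which is probably why it is the usual recommendation.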
Re: Re: Regex without 'm' or '/' by hv (Prior) on Feb 24, 2003 at 01:19 UTC
"If the right argument is an expression rather than a search pattern, substitution, or transliteration, it is interpreted as a search pattern at run time. This can be less efficient than an explicit search, because the pattern must be compiled every time the expression is evaluated."

This perlop text must be a holdover from a while back. At least as far back as 5.005_03, code like `$str1 =~ 'bc'` (with a constant string for the pattern) would be compiled only once. Between 5.6.0 and 5.6.1 an extra check was added, so that even `$str =~ $str2` would not be recompiled as long as `$str2` had not changed.

I guess the second statement should simply be deleted from that paragraph.

Hugo
Thanks, all by BorgCopyeditor (Friar) on Jul 10, 2002 at 02:03 UTC
Thanks to TexasTess, Zaxo, and dvergin for patient explanations of both the obvious and the arcane. Also, next time I have a regex problem, maybe I'll brave the debugger. That was very enlightening.

FWIW, the data I'm parsing is in 'betacode', an ASCII transcription scheme for Ancient Greek. It's convenient in some ways, but chock full of what I still have to remind myself are metacharacters. Grrr.

BCE
--Your punctuation skills are insufficient!
Re: Am I on the pipe, or what? by TexasTess (Beadle) on Jul 09, 2002 at 23:14 UTC
You have to escape the metacharacter to ensure it's evaluated literally...but that really does not explain why it picks up the w as well....try escaping the pipe and see if it still returns the same..

TexasTess
"Great Spirits Often Encounter Violent Opposition From Mediocre Minds" --Albert Einstein

UPDATE: After re-reading this..I think I must have been on the pipe myself when I wrote it!