http://qs321.pair.com?node_id=525794

Articuno has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!

Please, I need some enlightenment about a regex error...

Why am I getting these errors? I thought \Q and \E were valid escapes for regexen :-(
Unrecognized escape \Q passed through in regex; marked by <-- HERE in m/\b\Q <-- HERE Renata\E\b/ at -e line 1.
Unrecognized escape \E passed through in regex; marked by <-- HERE in m/\b\QRenata\E <-- HERE \b/ at -e line 1.
The $regexp is retrieved from a DB, and came from a user (the interface appended \Q...\E before storing it in the DB). The offending code snippet is this:
# ...
print STDERR "/$regexp/\n";
$throw_away = ($window =~ /$regexp/ism) ? 'TRUE' : 'FALSE';
# ...
And the "STDERR" output before the error is:
/\b\QRenata\E\b/

Thanks in Advance :-)


Update: Maybe I wasn't clear about one point: the regexes come from a DB, and in the DB they already had \b...\b, and now some (few) of them have \b\Q...\E\b there. It isn't my script that is messing with them...

Last update: Thanks ikegami-san (who explained the point about \Q...\E in interpolations), and others who helped. I'm accepting more suggestions, but for now, as the regexes are "simple" (whatever that means :-)), I'll go with quotemeta()'ing whatever is inside \Q...\E (the text in question has no "\", so I won't fall into the \\Quux trap :-))

-- 6x9=42

Re: Unrecognized escape \Q passed through in regex
by ikegami (Patriarch) on Jan 26, 2006 at 18:33 UTC

    \Q and \E don't work inside interpolations. They only work in regexp literals (m/here/, s/here// and qr/here/). Solutions:

    # Text.
    $text = 'Renata';
    $window =~ /\b\Q$text\E\b/ism
    and
    # Uncompiled regexp.
    $text = 'Renata';
    $regexp = '\\b' . quotemeta($text) . '\\b';
    $window =~ /$regexp/ism
    and
    # Compiled regexp.
    $regexp = qr/\b\QRenata\E\b/ism;
    $window =~ $regexp

    Note:
    The s modifier is useless if you don't use ".".
    The m modifier is useless if you don't use "^" or "$".
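
A minimal sketch of the first two solutions above, for anyone who wants to try them. The sample `$window` and `$text` values are made up for illustration (the "." in `$text` is there only to show the effect of quoting); they are not the OP's data.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the OP's $window and DB value.
my $window = 'Dear Renata, hello';
my $text   = 'Re.ata';    # the "." is meant to be matched literally

# 1. \Q...\E written in the regexp literal itself: $text gets quoted.
my $literal = ($window =~ /\b\Q$text\E\b/i) ? 'TRUE' : 'FALSE';

# 2. Pattern built as a string with quotemeta(), then interpolated.
my $pattern = '\\b' . quotemeta($text) . '\\b';
my $built   = ($window =~ /$pattern/i) ? 'TRUE' : 'FALSE';

# For contrast: without quoting, "." is a wildcard and matches the "n".
my $unquoted = ($window =~ /\b$text\b/i) ? 'TRUE' : 'FALSE';

print "literal=$literal built=$built unquoted=$unquoted\n";
```

Both quoted forms should refuse to match "Renata" here, while the unquoted one matches it.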

      OK, I'll go with something along the lines of the second solution...

      I know the meaning of the switches... They are there because some regexen on DB are allowed to be "real regexes" and not just "bare words".

      Maybe I'll emulate \Q...\E then...
      $regexp =~ s/\\Q(.*?)\\E/quotemeta($1)/eg;
      Should that do what I want ? :-)
      update: /e ==> /eg
      -- 6x9=42
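
For the simple case described above, the substitution can be tried like this. It is only a sketch: `expand_simple` is a hypothetical helper name, and the stored value is an invented example with a "." in it so the quoting is visible.

```perl
use strict;
use warnings;

# Hypothetical helper: emulate \Q...\E in a pattern held in a string.
sub expand_simple {
    my ($re) = @_;
    $re =~ s/\\Q(.*?)\\E/quotemeta($1)/eg;
    return $re;
}

my $stored   = '\b\Qa.b\E\b';            # as it might come from the DB
my $expanded = expand_simple($stored);   # \Q...\E replaced by quoted text

my $hit  = ('x a.b y' =~ /$expanded/) ? 1 : 0;   # literal "a.b" is present
my $miss = ('x acb y' =~ /$expanded/) ? 1 : 0;   # "." must not act as a wildcard

print "$expanded hit=$hit miss=$miss\n";
```

This covers only well-formed, non-nested \Q...\E pairs with no backslashes inside.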

        Not necessarily. It won't work for

        print($regexp)  Expect        Gives
        --------------  ------------  ----------
        \Q*\E\Q*\E      \*\*          \*\Q*\E
        \Q**LOL**       \*\*LOL\*\*   \Q**LOL**
        \\Quit\\Exit    \\Quit\\Exit  \Quit\Exit
        \Qfoo\\E        foo\\\\E      foo\\

        Solution:

        my $in_quote = 0;
        $regexp =~ s/([^\\]|\\.)/
           if ($in_quote) {
              if ($1 eq '\\E') { $in_quote = 0; '' }
              else { quotemeta($1) }
           } else {
              if ($1 eq '\\Q') { $in_quote = 1; '' }
              else { $1 }
           }
        /eg;

        Update: Alternative:

        $regexp =~ s/
           \G
           ( (?:[^\\]|\\[^Q])* )
           (?:
              \\Q
              ( (?:[^\\]|\\[^E])* )
              (?:\\E)?
           )?
        /
           $1 . (defined($2) ? quotemeta($2) : '')
        /xge;

        Neither snippet is fully tested. In fact, both are known to be unable to handle regexps in which (?{...}) or (??{...}) are used. What's wrong with what I suggested in my earlier post?
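
One way to check the token-by-token idea against the table above is a small harness like the following. It is a sketch, not the exact code from this post: `expand_qe` is a hypothetical name, the logic is restated with an explicit `$tok` variable, and /s is added so a backslash followed by a newline is also treated as one token.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the pattern one token at a time (a single
# character, or a backslash plus the character after it) and quote
# everything between \Q and \E.
sub expand_qe {
    my ($re) = @_;
    my $in_quote = 0;
    $re =~ s{([^\\]|\\.)}{
        my $tok = $1;
        if    ($in_quote && $tok eq '\\E') { $in_quote = 0; $tok = '' }
        elsif ($in_quote)                  { $tok = quotemeta($tok) }
        elsif ($tok eq '\\Q')              { $in_quote = 1; $tok = '' }
        $tok;
    }seg;
    return $re;
}

# Input and expected columns copied from the table; a single-quoted
# heredoc keeps every backslash exactly as written.
my @cases = map { [ split ' ', $_ ] } split /\n/, <<'END';
\Q*\E\Q*\E     \*\*
\Q**LOL**      \*\*LOL\*\*
\\Quit\\Exit   \\Quit\\Exit
\Qfoo\\E       foo\\\\E
END

my $failures = 0;
for my $case (@cases) {
    my ($in, $want) = @$case;
    my $got = expand_qe($in);
    $failures++ if $got ne $want;
    printf "%-14s => %-14s %s\n", $in, $got, ($got eq $want ? 'ok' : 'NOT ok');
}
```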

      \Q and \E don't work inside interpolations. They only work in regexp literals
      Fascinating theory, but easily proven wrong:
      print "\Qabc*def\E\n"; # prints abc\*def
      The truth is that \Q means "add backslashes to special chars until \E" in the exact same places that \n becomes a newline and $x expands to its value: every double-quoted string. A regex (that doesn't have special single-quote quoting) is just one example of that.

      -- Randal L. Schwartz, Perl hacker
      Be sure to read my standard disclaimer if this is a reply.

        \Q and \E don't work inside interpolations. They only work in regexp literals
        Fascinating theory, but easily proven wrong:
        print "\Qabc*def\E\n"; # prints abc\*def

        Your example shows a \Q being used inside of a quoted string, not inside of a string value being interpolated (into a string or regex). You've misunderstood how "interpolation" was originally used.

        I'll agree that the original use of "interpolation" wasn't clear enough. Though double-quoted strings allow interpolation, I wouldn't call a quoted string that didn't contain any variables "an interpolation".

        - tye        
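
The distinction drawn above, between \Q written in a quoted literal and \Q sitting inside a value that is later interpolated, can be illustrated with a short sketch. The variable names are invented for the example, and the single-quoted assignment merely stands in for a value read from a DB.

```perl
use strict;
use warnings;

# \Q...\E written in the source of a double-quoted literal is processed
# when the literal is parsed, just like quotemeta() on the same text.
my $from_literal = "\Qabc*def\E";
my $from_func    = quotemeta('abc*def');

# Here the characters backslash-Q and backslash-E are part of the *value*.
my $stored = '\Qabc*def\E';

# Interpolating the variable copies its value; the copy is not parsed again.
my $copied = "$stored";

print "$from_literal\n$from_func\n$copied\n";
```

The first two lines printed are the quoted text; the third still contains the raw \Q and \E.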

        I thought it worked in double-quoted literals, but my test showed otherwise. I must have made a typo in my test. In any case, that \Q works in double-quoted literals doesn't help the OP any.
Re: Unrecognized escape \Q passed through in regex
by diotalevi (Canon) on Jan 26, 2006 at 18:27 UTC

    The \Q and \E can't be interpolated into your regexp and still function like you expect \Q and \E to. At the time the \Q and \E are seen, they aren't special anymore and just mean Q and E.

    ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

      But if they "just mean" Q and E, why does perl give me an error when it (*) sees them ?

      (*) offtopic: I was about to type "he" instead of "it" :-)
      -- 6x9=42

        It's not an error, it's a warning. It's a warning because you escape things because you expect them to be special. Perl is telling you that you were wrong and that they aren't.

        ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊
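
A small sketch of the warning-versus-error point: the match still runs, with \Q and \E taken as plain Q and E. The test strings are invented, and the warning is caught with a local `$SIG{__WARN__}` handler only so that it can be inspected.

```perl
use strict;
use warnings;

my $stored = '\Qfoo\E';    # stands in for a pattern read from the DB

my @warnings;
my ($matched_literal_qe, $matched_plain);
{
    # Collect warnings instead of letting them go to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    # With \Q and \E passed through, the pattern is effectively "QfooE".
    $matched_literal_qe = ('xQfooEx' =~ /$stored/) ? 1 : 0;
    $matched_plain      = ('foo'     =~ /$stored/) ? 1 : 0;
}

print "matched 'xQfooEx': $matched_literal_qe, matched 'foo': $matched_plain\n";
print "warning: $_" for @warnings;
```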

Re: Unrecognized escape \Q passed through in regex
by BrowserUk (Patriarch) on Jan 26, 2006 at 18:40 UTC

    It would be a lot more helpful if you would supply a single snippet that produced the error, rather than lots of little bits with no context.

    I haven't been able to reproduce your error exactly, but at least part of the problem is that \b means something different when interpolated in a string, than when interpolated in a regex. A backspace versus word boundary.
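
To make the backspace-versus-word-boundary difference concrete, here is a short sketch with invented variable names. Note that only a double-quoted literal turns \b into a backspace; a value that already holds backslash-b reaches the regex engine unchanged.

```perl
use strict;
use warnings;

my $double = "\b";    # double-quoted literal: one character, backspace (ASCII 8)
my $single = '\b';    # single-quoted literal: two characters, backslash and "b"

my $text = 'foo bar';

# Backslash-b reaching the regex engine is a word boundary.
my $as_boundary = ($text =~ /${single}bar/) ? 1 : 0;

# A real backspace character in the pattern only matches a backspace.
my $as_backspace = ($text =~ /${double}bar/) ? 1 : 0;

printf "ord=%d len=%d boundary=%d backspace=%d\n",
    ord($double), length($single), $as_boundary, $as_backspace;
# prints: ord=8 len=2 boundary=1 backspace=0
```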

    However, the fact that when you print $regex out having interpolated it into a string, it gets printed as \b\QRenata\E\b means that the contents of $regex must originally be (something like):

    \\b\\QRenata\\E\\b

    And when you try to use that as a regex, the escaped (doubled) backslashes means that the regex engine will not recognise the escape sequences.

    That I cannot reproduce the errors you are seeing probably means that your snippets do not reflect what you are really doing in your code, and I haven't been able to read between the lines sufficiently to guess what it is that you are actually doing.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Last Update: BrowserUK: this reproduces the errors, err, warnings I've received
      perl -e 'use warnings; $teste=q(\Qfoo\E);print 42 if "(foo" =~ m/\b$teste\b/'



      1. The "\b" is not the problem... It used to work before my "\Q...\E" problem.
      update: At least I thought it worked. All my regexes should have \b...\b as boundaries... I think if they were interpreted as backspaces, the entire regex would never match (the text doesn't contain backspaces), and we would've noticed the absence of filtering. (The program is a kind of filter.)

      2. The code is a real copy-paste from my program. What I've omitted was unimportant (for example, DB queries, what I do with the return value, etc.).

      3. I don't know why you say the content of $regex has to have "double slashes"... below is an example where the content of $x has only 1 slash and the printed version has 1 slash too

      perl -e '$x = q(\b); $y = "$x"; print $y'
      Update: OK, I know the two, '\b' and "\\b", are equivalent. The point is, there is no "..." involved. I think when I get a string from a DB it comes as a '...', and putting it inside a "..." doesn't interpolate its own content.
      -- 6x9=42
Re: Unrecognized escape \Q passed through in regex
by HelenCr (Monk) on Oct 13, 2012 at 07:59 UTC

    I had the same problem. I think

    $window =~ eval 'qr{'.$regexp.'}' ;

    should work, and not produce the error.
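
A sketch of how that might look, with an invented stored value. Because the string eval parses the text as a qr{} literal, \Q...\E is processed as usual. The flip side is that string eval runs whatever is in the string as Perl code, so this seems reasonable only if the stored patterns are trusted; a value containing "}" could end the literal early.

```perl
use strict;
use warnings;

my $stored = '\b\Qa.b\E\b';    # hypothetical value as stored in the DB

# Build a qr{} literal in source form and let perl parse it.
my $re = eval 'qr{' . $stored . '}';
my $ok = defined $re ? 1 : 0;  # eval returns undef if the pattern does not compile

my $hit  = ($ok && 'x a.b y' =~ $re) ? 1 : 0;   # literal "a.b" is present
my $miss = ($ok && 'x acb y' =~ $re) ? 1 : 0;   # "." was quoted, so no match

print "compiled=$ok hit=$hit miss=$miss\n";
```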