PerlMonks  

Re^3: bracket processing

by kcott (Archbishop)
on Apr 01, 2020 at 00:37 UTC


in reply to Re^2: bracket processing
in thread bracket processing

"As a general practice, I find it's much safer to interpolate strings ... into regexes using \Q \E ..."

As a rule, for regexes in general, that's fine and I'd usually do the same; however, bracketed character classes are different.

Take a look at "perlrecharclass: Special Characters Inside a Bracketed Character Class". I'll leave you to acquaint yourself with the full text. Here are some pertinent extracts (my emphasis added):

Most characters that are meta characters in regular expressions ... lose their special meaning and can be used inside a character class without the need to escape them.
...
Characters that may carry a special meaning inside a character class are: \ , ^ , - , [ and ] , and are discussed below.
...
A [ is not special inside a character class, unless it's the start of a POSIX character class ... It normally does not need escaping.

So, none of the characters in $delim required escaping.

Furthermore, I generally aim to thoroughly test my solutions before posting them. In this instance, I had added a temporary print statement:

my $prefix = qr{[^$delim]*}; print "$prefix\n";

which output:

(?^:[^([{<]*)

That's exactly the regex I wanted.
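As a minimal sketch of the check described above: the script below builds the same `$prefix` regex and applies it to a sample string. The string `foo.bar+baz(qux)` and the variable `$lead` are made up for illustration and are not from the original thread.

```perl
use strict;
use warnings;

# $delim and the qr{} construction are taken from the post.
my $delim  = '([{<';
my $prefix = qr{[^$delim]*};

# Hypothetical sample input: regex metacharacters, then a bracket.
my $sample = 'foo.bar+baz(qux)';

# Capture everything before the first opening bracket.
my ($lead) = $sample =~ /\A($prefix)/;

print "$prefix\n";   # the compiled pattern, as in the temporary print
print "$lead\n";     # the text preceding the first delimiter
```

With this input, `$lead` should hold `foo.bar+baz`: the class stops at the unescaped `(`, and the `.` and `+` in the sample are just ordinary characters being matched.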

— Ken

Replies are listed 'Best First'.
Re^4: bracket processing
by AnomalousMonk (Archbishop) on Apr 01, 2020 at 18:23 UTC
    ... for regexes in general, that's fine ... bracketed classes are different. ... Most characters that are meta characters in regular expressions ... lose their special meaning ... none of the characters in $delim required escaping.

    In a character class, no one can hear your metacharacters scream. For the most part. The consequences of the occasional exception are what I seek to avoid with defensive measures like this. The effects of a change from
        my $delim = '([{<';
    to
        my $delim = '(-[{<';
    may not be readily apparent, yet still be very significant. One would hope that thorough testing would reveal a problem like this, but better IMHO to obviate the problem to begin with.
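The difference can be sketched as follows, assuming the delimiter class is used un-negated for simplicity. The names `$raw` and `$quoted` and the list of test characters are illustrative assumptions, not code from the thread.

```perl
use strict;
use warnings;

my $delim = '(-[{<';   # the hyphen now sits between '(' and '['

my $raw    = qr{[$delim]};       # '(-[' is read as a character range
my $quoted = qr{[\Q$delim\E]};   # every character is taken literally

# Compare how each class treats a few sample characters.
for my $char ('A', '5', '-', '(') {
    printf "%s raw=%d quoted=%d\n",
        $char,
        ($char =~ $raw    ? 1 : 0),
        ($char =~ $quoted ? 1 : 0);
}
```

Here the unquoted class matches `A` and `5` because they fall inside the range from `(` to `[`, while the `\Q...\E` version matches only the five characters actually listed.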


    Give a man a fish:  <%-{-{-{-<

      "... a change from my $delim = '([{<'; to my $delim = '(-[{<'; may not be readily apparent ..."

      I find it hard to believe that you think that range of 52 characters "may not be readily apparent".

      $ perl -E 'say "@{[map chr, ord q{(} .. ord q{[}]}"'
      ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [

      The OP hadn't asked for a hyphen to take on a parenthetical function; however, if that was wanted, the correct way to write it would be:

      my $delim = '([{<-';

      And still nothing needs to be escaped.
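A short sketch of that claim, with the class used un-negated and a made-up list of candidate characters (`$class` and `@hits` are illustrative names):

```perl
use strict;
use warnings;

my $delim = '([{<-';          # hyphen placed last, as suggested above
my $class = qr{[$delim]};     # no escaping applied

# Keep only the candidates that the class accepts.
my @hits = grep { $_ =~ $class } ('(', '[', '{', '<', '-', 'A', '5');

print "@hits\n";
```

A hyphen at the end of a bracketed class cannot form a range, so this should print only the five listed delimiters and leave out `A` and `5`.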

      What you've contrived as an example is a novice mistake. It's a good mistake to learn from. I'm sure I made that mistake a quarter of a century ago; found out what I did wrong; and, didn't do it again.

      Defensive programming is all well and good when dealing with data from an unknown or untrusted source. When the data is just four characters you've written yourself, defensive programming is overkill.

      — Ken

        You have the wisdom and discipline (and steel-trap memory, apparently :) to make a mistake once and never make it again. I saw long ago that I had not. One of the lessons I drew from that observation was to program defensively. Even when I have absolute control of both code and data, however simple, I find the effort of defensive programming is usually, in the end, repaid.

        Defensive programming is all well and good when dealing with data from an unknown or untrusted source. When the data is just four characters you've written yourself, defensive programming is overkill.

        It's been observed time and again that one is oneself the person best equipped to lead one down the garden path.

        What you've contrived as an example is a novice mistake. It's a good mistake to learn from.

        I should have made it clear in my initial comment that it was not intended for you; I think you are well aware of the issues involved. It's hard to contrive an example of a novice mistake that's not... well, contrived, but please be assured that the example in question was directed not to you but to novices.


        Give a man a fish:  <%-{-{-{-<
