PerlMonks  

Re^2: bracket processing

by AnomalousMonk (Archbishop)
on Mar 31, 2020 at 06:09 UTC ( [id://11114828] )


in reply to Re: bracket processing
in thread bracket processing

my $delim = '([{<';
my $prefix = qr{[^$delim]*};

As a general practice, I find it's much safer to interpolate strings like $delim into regexes using \Q \E metaquote escapes:
    my $delim = '([{<';
    my $prefix = qr{[^\Q$delim\E]*};
Of course, one could metaquote the string variable upon definition:
    my $delim = quotemeta '([{<';
but that might screw up subsequent use of the string; e.g., its use in something like
    my @parts = extract_bracketed($string, $delim, $prefix);
might become problematic.
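
A minimal sketch of the difference between the two approaches above, using only core Perl. It does not call extract_bracketed; it only shows that metaquoting inside the regex leaves $delim unchanged, while quotemeta at definition produces a different string. The name $quoted and the sample text 'abc(def' are illustrative assumptions.

```perl
use strict;
use warnings;

my $delim = '([{<';

# Metaquoting at the point of interpolation leaves $delim itself untouched.
my $prefix = qr{[^\Q$delim\E]*};

# Metaquoting at definition yields a different string: every
# non-word character gains a leading backslash.
my $quoted = quotemeta $delim;

print "delim:  $delim\n";     # ([{<
print "quoted: $quoted\n";    # \(\[\{\<
print "prefix: $prefix\n";

# The prefix pattern consumes everything up to the first opening bracket.
print "matched: $1\n" if 'abc(def' =~ /\A($prefix)/;    # matched: abc
```

Anything that later expects the raw four-character string would receive the eight-character $quoted instead, which is the concern raised above.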


Give a man a fish:  <%-{-{-{-<

Replies are listed 'Best First'.
Re^3: bracket processing
by kcott (Archbishop) on Apr 01, 2020 at 00:37 UTC
    "As a general practice, I find it's much safer to interpolate strings ... into regexes using \Q \E ..."

    As a rule, for regexes in general, that's fine and I'd usually do the same; however, bracketed classes are different.

    Take a look at "perlrecharclass: Special Characters Inside a Bracketed Character Class". I'll leave you to acquaint yourself with the full text. Here are some pertinent extracts (my emphasis added):

    Most characters that are meta characters in regular expressions ... lose their special meaning and can be used inside a character class without the need to escape them.
    ...
    Characters that may carry a special meaning inside a character class are: \ , ^ , - , [ and ] , and are discussed below.
    ...
    A [ is not special inside a character class, unless it's the start of a POSIX character class ... It normally does not need escaping.

    So, none of the characters in $delim required escaping.

    Furthermore, I generally aim to thoroughly test my solutions before posting them. In this instance, I had added a temporary print statement:

    my $prefix = qr{[^$delim]*}; print "$prefix\n";

    which output:

    (?^:[^([{<]*)

    That's exactly the regex I wanted.
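
    One way to check the claim beyond printing the regex is to run the plain and the metaquoted class against the same inputs. This is only a sketch: the sample strings and the names $plain, $escaped and @mismatches are assumptions made for illustration.

```perl
use strict;
use warnings;

my $delim   = '([{<';
my $plain   = qr{[^$delim]*};        # class built from the raw string
my $escaped = qr{[^\Q$delim\E]*};    # class built from the metaquoted string

# Hypothetical sample inputs, one per opening bracket.
my @samples = ('foo(bar)', 'a-b[c]', 'x.y{z}', '1+2<3>');

my @mismatches;
for my $string (@samples) {
    my ($p) = $string =~ /\A($plain)/;
    my ($e) = $string =~ /\A($escaped)/;
    printf "%-10s plain='%s' escaped='%s'\n", $string, $p, $e;
    push @mismatches, $string if $p ne $e;
}

print @mismatches ? "differ on: @mismatches\n" : "both classes agree\n";
```

    For this particular $delim the two patterns should capture the same prefix for every sample.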

    — Ken

      "... for regexes in general, that's fine ... bracketed classes are different. ... Most characters that are meta characters in regular expressions ... lose their special meaning ... none of the characters in $delim required escaping."

      In a character class, no one can hear your metacharacters scream. For the most part. The consequences of the occasional exception are what I seek to avoid with defensive measures like this. The effects of a change from
          my $delim = '([{<';
      to
          my $delim = '(-[{<';
      may not be readily apparent, yet still be very significant. One would hope that thorough testing would reveal a problem like this, but better IMHO to obviate the problem to begin with.
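
      A rough sketch of the effect, assuming a made-up input string. With the hyphen added, the unescaped class treats "(-[" as a range that takes in digits and uppercase letters, while the metaquoted class still contains five literal characters.

```perl
use strict;
use warnings;

my $delim   = '(-[{<';
my $plain   = qr{[^$delim]*};        # "(-[" is read as a range from "(" to "["
my $escaped = qr{[^\Q$delim\E]*};    # five literal characters, hyphen included

# Hypothetical input: lowercase text, then digits, then a bracket.
my $string = 'total2020(bracketed)';

my ($p) = $string =~ /\A($plain)/;
my ($e) = $string =~ /\A($escaped)/;

print "plain:   '$p'\n";    # 'total'     (stops at "2", which lies inside the range)
print "escaped: '$e'\n";    # 'total2020' (stops at the "(")
```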


      Give a man a fish:  <%-{-{-{-<

        "... a change from my $delim = '([{<'; to my $delim = '(-[{<'; may not be readily apparent ..."

        I find it hard to believe that you think that range of 52 characters "may not be readily apparent".

        $ perl -E 'say "@{[map chr, ord q{(} .. ord q{[}]}"'
        ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [

        The OP hadn't asked for a hyphen to take on a parenthetical function; however, if that was wanted, the correct way to write it would be:

        my $delim = '([{<-';

        And still nothing needs to be escaped.
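
        A short sketch of why the trailing position matters, with hypothetical sample strings: a hyphen placed last in the class is taken literally, so no range is formed and digits and uppercase letters remain part of the prefix.

```perl
use strict;
use warnings;

my $delim  = '([{<-';
my $prefix = qr{[^$delim]*};    # the trailing "-" is a literal hyphen, not a range

# Hypothetical inputs: one stops at the hyphen, one at a bracket.
my ($a_prefix) = 'total2020-rest' =~ /\A($prefix)/;
my ($b_prefix) = 'ABC(x)'         =~ /\A($prefix)/;

print "'$a_prefix'\n";    # 'total2020'
print "'$b_prefix'\n";    # 'ABC'
```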

        What you've contrived as an example is a novice mistake. It's a good mistake to learn from. I'm sure I made that mistake a quarter of a century ago, found out what I did wrong, and didn't do it again.

        Defensive programming is all well and good when dealing with data from an unknown or untrusted source. When the data is just four characters you've written yourself, defensive programming is overkill.

        — Ken
