http://qs321.pair.com?node_id=786123


in reply to Re^2: About a piece of code
in thread About a piece of code

Whitespace is \n\f\s\f\r Update: oof..goof should be \n\f\s\t\r order doesn't matter,

\s is not a whitespace character. In a regular expression it is a character class that matches any of the characters [ \t\n\r\f], and in a double-quoted string it is just the character s.

Re^4: About a piece of code
by Marshall (Canon) on Aug 05, 2009 at 15:47 UTC
    \s is the single space ' ' character. In your set [ \t\n\r\f] that thing right before the \t is \s.

    Oooh, I see now... in a character set [\s\t\f\r\n], \s means a single space.
    In a regex, \s means any of the chars in this set: [\s\t\f\r\n]. Yep, confusing!!
    \s has a context-dependent meaning. Such as it is.

      \s is not a single character in a regular expression; it is the character class [ \t\n\r\f]. See perlre.

      $ perl -le' use Data::Dumper; $Data::Dumper::Useqq = 1; print Dumper grep /\s/, map chr, 0 .. 255; '
      $VAR1 = "\t";
      $VAR2 = "\n";
      $VAR3 = "\f";
      $VAR4 = "\r";
      $VAR5 = " ";

      In a double quoted string it is just the character "s".

      $ perl -le' use Data::Dumper; $Data::Dumper::Useqq = 1; print Dumper "\s"; '
      $VAR1 = "s";
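On the question of whether \s narrows to a single space inside a bracketed character class: a quick way to check is to compare what /\s/ and /[\s]/ match over the same range of characters. This is only a sketch; the helper name matching_chars is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the characters in 0..255 that match a pattern.
sub matching_chars {
    my ($re) = @_;
    return grep { $_ =~ $re } map { chr } 0 .. 255;
}

my @bare    = matching_chars(qr/\s/);
my @bracket = matching_chars(qr/[\s]/);

# If the brackets changed the meaning of \s, these two lists would differ.
print "same set\n" if "@bare" eq "@bracket";

# A tab matching [\s] shows that \s inside brackets is not just ' '.
print "tab matches [\\s]\n" if "\t" =~ /[\s]/;
```

Both messages should print, which suggests \s keeps its "any whitespace" meaning inside brackets as well.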
      I think that, since the common regex wisdom involves replacing a literal space in an /x-modified regex with \s, it's easy to think of \s as a replacement for a literal space. But if you actually want a literal space (rather than any single whitespace character) in such a regex, outside a character class, then what you want is '\ ' (i.e., an escaped space).
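To make the difference concrete, here is a small sketch comparing three /x patterns: one with an unescaped space, one with \s, and one with an escaped space. The variable names and test strings are invented for the example.

```perl
use strict;
use warnings;

# Under /x, unescaped whitespace in the pattern is ignored,
# so this behaves like /foobar/.
my $ignored = qr/foo bar/x;

# \s matches any single whitespace character (space, tab, newline, ...).
my $any_ws = qr/foo\sbar/x;

# An escaped space matches only a literal space, even under /x.
my $literal = qr/foo\ bar/x;

for my $str ("foobar", "foo bar", "foo\tbar") {
    (my $shown = $str) =~ s/\t/\\t/;    # make the tab visible when printing
    printf "%-10s ignored=%d any_ws=%d literal=%d\n",
        $shown,
        ($str =~ $ignored ? 1 : 0),
        ($str =~ $any_ws  ? 1 : 0),
        ($str =~ $literal ? 1 : 0);
}
```

Only the escaped-space pattern should accept "foo bar" while rejecting "foo\tbar"; the \s pattern accepts both.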