PerlMonks  

How to introduce a frustrating bug with a single whitespace

by radiantmatrix (Parson)
on Dec 24, 2007 at 20:00 UTC ( [id://658921]=perlmeditation )

There's probably more than one way to accomplish the goal in the title -- and I'd love to hear stories about it, if only because it'd make me feel better.

I had the following, very simple line of code:

$user = ~ s/^\s+|\s+$//gs;

The intention, of course, was to trim white space from the beginning and end of a user name previously parsed from a record. Instead, though, $user ended up containing a large and random-seeming number.

Spot the bug?

Stepping through the code with the debugger found the line where the oddness was occurring, but it seemed like inexplicable madness (for something like 10 minutes) until I noticed that one extra space. Man, do I feel stupid!

The lesson? I probably should have used Text::Trim. :)
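For anyone who wants to see the effect in isolation, here is a minimal sketch. With the space, `= ~` is plain assignment followed by unary bitwise negation, so the substitution runs against `$_` rather than `$user`, and `$user` receives the negated substitution count. The sample strings and the variable names other than `$user` are assumptions made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical record text sitting in $_, as it might inside a read loop.
$_ = "  some record  ";
my $user = "  alice  ";

# Buggy form: parsed as  $user = ~( s/^\s+|\s+$//gs )
# s///gs runs on $_ and returns the number of substitutions (2 here);
# unary ~ then flips every bit of that count.
$user = ~ s/^\s+|\s+$//gs;
my $buggy       = $user;   # a huge integer; exact value depends on integer width
my $topic_after = $_;      # $_ got trimmed instead of $user

# Intended form: bind the substitution to the variable with =~
my $fixed = "  alice  ";
$fixed =~ s/^\s+|\s+$//g;

print "buggy: $buggy\n";
print "topic: '$topic_after'\n";
print "fixed: '$fixed'\n";
```

Because the count changes with how much whitespace `$_` happens to carry, the resulting number can look random from one record to the next.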

<radiant.matrix>
Ramblings and references
The Code that can be seen is not the true Code
I haven't found a problem yet that can't be solved by a well-placed trebuchet

Replies are listed 'Best First'.
Re: How to introduce a frustrating bug with a single whitespace
by ikegami (Patriarch) on Dec 24, 2007 at 20:17 UTC
    In a similar vein,
    print (4+5)*20; # Prints 9, returns 20

      Yes, but at least that gets caught with warnings (or -w):

      $ perl -we 'print (4+5)*20;'
      print (...) interpreted as function at -e line 1.
      Useless use of multiplication (*) in void context at -e line 1.
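A short sketch of what the print example does and two common ways around it. Perl takes the parentheses as print's argument list, so the multiplication applies to print's return value. The variable names are hypothetical.

```perl
use strict;
use warnings;

# Equivalent to the original line: print receives 9, and its return
# value (1 on success) is what gets multiplied by 20.
my $result = print(4 + 5) * 20;
print "\n";

# Fix 1: wrap the whole expression in print's own parentheses.
print((4 + 5) * 20, "\n");

# Fix 2: a leading unary plus stops the parentheses from being read
# as the argument list.
print +(4 + 5) * 20, "\n";

# Fix 3: compute first, print second.
my $product = (4 + 5) * 20;
print "$product\n";
```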

      The whitespace bug described in the original node is the first bug I've had in a while that:

      • wasn't caught by syntax checking (it's valid Perl)
      • wasn't griped about with strict and warnings on
      • wasn't a design error
      <radiant.matrix>
      Ramblings and references
      The Code that can be seen is not the true Code
      I haven't found a problem yet that can't be solved by a well-placed trebuchet
        Well, it's kind of you to say that, but given the fact that Perl 6 changed =~ to ~~, it arguably was a design error. (Though this particular failure mode is not why it was changed, but rather the failure of reversing the operator to say ~= instead. It doesn't matter if you reverse ~~.) Anyway, if you split ~~ in Perl 6 with whitespace, it'll parse, but will likely give you a "useless use of ~ in void context" warning.

        By the way, Perl 6 also fixes the "print (4+5)*20" faq...

        The whitespace bug described in the original node is the first bug I've had in a while that:
        • wasn't caught by syntax checking (it's valid Perl)
        • wasn't griped about with strict and warnings on
        • wasn't a design error

        I've got one of those, though not a whitespace bug. Just recently I was messing with a data-structure that was supposed to be an array of arrays of arrays:

        $rect_list = [
            [ [0, 0], [3, 9 ] ],
            [ [5, 4], [3, 10] ],
            [ [7, 8], [9, 11] ].
            [ [2, 4], [5, 15] ],
            [ [1, 9], [9, 13] ],
        ];

        First question: Can you spot the bug? Second question: Do you know what it does?
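For readers who would rather run it than stare at it, a hedged sketch of the behavior: a period where a comma was meant is the concatenation operator, so the two neighboring array references are stringified and joined into one plain string. The variables `$count` and `$third` are made up for this illustration.

```perl
use strict;
use warnings;

my $rect_list = [
    [ [0, 0], [3, 9 ] ],
    [ [5, 4], [3, 10] ],
    [ [7, 8], [9, 11] ].   # period here, not a comma
    [ [2, 4], [5, 15] ],
    [ [1, 9], [9, 13] ],
];

# The list now has four elements instead of five.
my $count = scalar @$rect_list;

# The third element is a string made of two stringified references,
# not an array reference.
my $third = $rect_list->[2];

print "elements: $count\n";
print "third is ", (ref $third ? "a reference" : "a plain string: $third"), "\n";
```

Neither strict nor warnings complains here, since concatenating references is legal; the failure only shows up later, when code tries to dereference the string.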

Re: How to introduce a frustrating bug with a single whitespace
by ambrus (Abbot) on Dec 25, 2007 at 09:24 UTC

    Did you know that <$x> is a readline, but <$ x> is a glob?

Re: How to introduce a frustrating bug with a single whitespace
by doom (Deacon) on Dec 31, 2007 at 02:44 UTC

    The lesson? I probably should have used Text::Trim. :)

    It could be, but I would say the lesson is to not try to do it all in one step. I remember reading a Randal Schwartz rant some time back about people continually trying to do whitespace trimming in one step, when it's much simpler and more straightforward to do it in two:

    $user =~ s/^\s+//s;
    $user =~ s/\s+$//s;
    I think you'll find that that runs much faster than one regexp involving alternation....
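The speed claim is easy to check rather than take on faith. Below is a minimal sketch using the core Benchmark module; the sub names, the sample string, and the iteration count are assumptions chosen for illustration, and the numbers will vary by Perl version and input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $sample = "   a user name with padding   ";

# One substitution with alternation, applied globally.
sub trim_one_step {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

# Two anchored substitutions, one per end.
sub trim_two_steps {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

# Both forms should agree before their speed is compared.
die "results differ" unless trim_one_step($sample) eq trim_two_steps($sample);

# Run each form 200_000 times and print a comparison table.
cmpthese(200_000, {
    one_step  => sub { trim_one_step($sample) },
    two_steps => sub { trim_two_steps($sample) },
});
```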

      I think you'll find that that runs much faster than one regexp involving alternation....

      Well, I don't really care much about faster runtimes -- most of what I do waits on disk access and network I/O long before it runs into RAM or CPU limits, so which regex is fastest almost never comes into play.

      I try to write primarily for readability. I realize, of course, that "readability" is subjective. But, I find s{^\s+|\s+$}{}g to be more readable than the two-line alternative. When I see that one line, I immediately think "ok, trim whitespace from the ends". When I see the two, I have to stop and think about it. Just me.

      One of the things I like about Python [1] is the readability of performing this action:

      text = " a string of some sort, with space at either end "
      text = text.strip()

      I think that's what draws me to Text::Trim (if I have to do the step repeatedly, especially):

      my $text = " a string of some sort, with space at either end ";
      $text = trim( $text );

      Of course, creating a dependency for something like that seems a little excessive, so I've recently taken to

      my $TRIM_WHITESPACE_RE = qr/^\s+|\s+$/;
      ## ... later on .. ##
      $text =~ s/$TRIM_WHITESPACE_RE//g;

      1: I know, I know. :) Seriously, though, Python has some nifty things in it, even if I still like Perl better.

      <radiant.matrix>
      Ramblings and references
      The Code that can be seen is not the true Code
      I haven't found a problem yet that can't be solved by a well-placed trebuchet
