PerlMonks  

You have used the regex / / (match one space) as the delimiter for split. Unfortunately, the string in $line contains several consecutive spaces, so perl does what it's told and splits on every single space. As an example:
$line = "foo  bar";
print ++$i, ". $_\n" for split / /, $line;
$line is assigned the string "foo..bar", where each '.' stands for a space (it's more visible that way). That would output:
1. foo
2. 
3. bar
It prints out three lines, one of which is empty (because there is nothing between the two spaces). So we have to deal with more than one whitespace character. We could do this:
split /\s+/, $line
Now perl will look for one or more whitespace characters when figuring out where to split, and in your example that would probably work out just fine. (Don't use \s* as the delimiter: it can match the empty string, so split would break the string between every character.)
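To make the difference between the three delimiters concrete, here is a minimal sketch that counts the fields each one produces. It assumes the two-space "foo  bar" string from the example above; the array names are only illustrative.

```perl
use strict;
use warnings;

my $line = "foo  bar";    # two spaces between the words

my @single = split / /,   $line;   # exactly one space per delimiter
my @plus   = split /\s+/, $line;   # one or more whitespace characters
my @star   = split /\s*/, $line;   # can also match the empty string

print scalar(@single), " fields with / /\n";            # 3 (the middle one is empty)
print scalar(@plus),   " fields with /\\s+/\n";         # 2
print scalar(@star),   " fields with /\\s*/: @star\n";  # 6: f o o b a r
```

The last line shows why \s* is a poor delimiter: the empty match between every pair of characters counts as a place to split.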
But what if your string looked something like this:
$line = "  foo   bar  ";
Using \s+ as the delimiter would garble things again: it returns an extra, empty element at the front. How do we fix this, then? Well, the solution may look a bit counterintuitive given what we have learned so far:
split " ", $line
Hm, wouldn't that just match one single space again? Well, it should, but it doesn't. This is a special case in perl: a delimiter that is a string of one space makes split skip any leading whitespace and then split on runs of whitespace. If we apply that to our test case we get the following result:
1. foo
2. bar
Ah, just what we want (in almost all cases I would dare to say). As an interesting note, consider the following:
perl -MO=Deparse -e 'split " "'
split(/\s+/, $_, 0);
Whereas
perl -MO=Deparse -e 'split /\s+/'
split(/\s+/, $_, 0);
generates the exact same output, but we know that the semantics are not the same. But as long as perl does The Right Thing(TM), I'm happy :-)
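Even though the Deparse output is identical, the two forms can be told apart by running them. The following is a small sketch, assuming the padded test string used earlier; the array names are made up for the illustration.

```perl
use strict;
use warnings;

my $line = "  foo   bar  ";   # leading, repeated and trailing spaces

my @regex   = split /\s+/, $line;   # ordinary pattern: the leading empty field is kept
my @special = split " ",   $line;   # special case: leading whitespace is skipped first

# Wrap each field in brackets so the empty one is visible.
print '/\s+/ : [', join("][", @regex),   "]\n";   # [][foo][bar]
print '" "   : [', join("][", @special), "]\n";   # [foo][bar]
```

In both cases the trailing empty field is dropped, because split removes trailing empty fields when no limit is given.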

Autark


In reply to RE: strip multiple spaces by autark
in thread strip multiple spaces by Anonymous Monk
