PerlMonks  

You have used the regex / / (match one space) as the delimiter for split. Unfortunately, the string in $line contains several consecutive spaces, so perl does exactly what it's told and splits on each single space. As an example:
$line = "foo  bar";
print ++$i, ". $_\n" for split / /, $line;
$line is assigned the string "foo..bar", where each '.' stands for a space (it's more visible that way). That would output:
1. foo
2. 
3. bar
It prints three lines, one of which is empty (because there is nothing between the two spaces). So we have to deal with runs of more than one whitespace character. We could do this:
split /\s+/, $line
Now perl will look for one or more whitespace characters when figuring out where to split, and in your example that would probably work out just fine. (Don't use \s* as the delimiter: it can match the empty string, so it would match between every pair of characters and split the string into single characters.)
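To see the three delimiters side by side, here is a minimal sketch. The variable names (@one_space, @whitespace, @too_greedy) are made up for illustration, and the fields are joined with '|' so the empty one is visible:

```perl
use strict;
use warnings;

my $line = "foo  bar";    # two spaces between the words

my @one_space  = split / /,   $line;   # splits at every single space
my @whitespace = split /\s+/, $line;   # splits at each run of whitespace
my @too_greedy = split /\s*/, $line;   # also matches the empty string

print join("|", @one_space),  "\n";    # foo||bar
print join("|", @whitespace), "\n";    # foo|bar
print join("|", @too_greedy), "\n";    # f|o|o|b|a|r
```

The second form is the one you want here; the third shows why \s* is a poor delimiter.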
But what if your string looked something like this:
$line = "  foo   bar  ";
Using \s+ as the delimiter would now garble things again: it returns an extra, empty element at the front, because the string starts with a delimiter. (Trailing empty elements are dropped by default, so the spaces at the end do no harm.) How do we fix this? Well, the solution may look a bit counterintuitive given what we have learned so far:
split " ", $line
Hm, wouldn't that just match one single space again? Well, it should, but it doesn't. A string consisting of a single space is a special case in perl's split: leading whitespace is skipped and the rest is split on runs of whitespace. If we apply that to our test case we get the following result:
1. foo
2. bar
Ah, just what we want (in almost all cases I would dare to say). As an interesting note, consider the following:
perl -MO=Deparse -e 'split " "'
split(/\s+/, $_, 0);
Whereas
perl -MO=Deparse -e 'split /\s+/'
split(/\s+/, $_, 0);
generates the exact same output, but we know that the semantics are not the same. But as long as perl does The Right Thing(TM), I'm happy :-)
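Whatever Deparse prints, the difference in behaviour is easy to check directly. A small sketch on the padded test string ($padded, @by_regex and @by_string are just illustrative names):

```perl
use strict;
use warnings;

my $padded = "  foo   bar  ";

my @by_regex  = split /\s+/, $padded;  # leading empty field is kept
my @by_string = split " ",   $padded;  # special case: leading whitespace skipped

print scalar(@by_regex),  " fields: ", join("|", @by_regex),  "\n";  # 3 fields: |foo|bar
print scalar(@by_string), " fields: ", join("|", @by_string), "\n";  # 2 fields: foo|bar
```

In both cases the trailing whitespace produces no extra field, since split drops trailing empty fields when no limit is given.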

Autark


In reply to RE: strip multiple spaces by autark
in thread strip multiple spaces by Anonymous Monk
