You have used the regex / / (match one space) as the delimiter for split.
Unfortunately, the string in $line contains several consecutive spaces, so
perl does what it's told and splits on every single space. As an example:
$line = "foo  bar";
print ++$i, ". $_\n" for split / /, $line;
$line is assigned the string "foo..bar", where each '.' stands for a space
(it's more visible that way). That would output:
1. foo
2.
3. bar
It prints three lines, one of which is empty (because there is
nothing between the two spaces). So we have to deal with more than
one whitespace character. We could do this:
split /\s+/, $line
Now perl will look for one or more whitespace characters when figuring out
where to split, and in your example that would probably work out
just fine. (Don't use \s* as the delimiter: it can match the empty string,
so it would split between every character.)
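To make the difference between the three delimiters concrete, here is a small sketch using the same two-space string as in the example above. The variable names are only for illustration.

```perl
use strict;
use warnings;

# Two consecutive spaces between the words, as in the example above.
my $line = "foo  bar";

my @single = split / /,   $line;   # exactly one space per delimiter
my @plus   = split /\s+/, $line;   # one or more whitespace characters
my @star   = split /\s*/, $line;   # may match the empty string

print scalar(@single), " fields with / /\n";      # 3: "foo", "", "bar"
print scalar(@plus),   " fields with /\\s+/\n";   # 2: "foo", "bar"
print join("|", @star), "\n";                     # f|o|o|b|a|r
```

The last line shows why \s* is a poor delimiter: since it is allowed to match nothing, perl finds a "delimiter" between every pair of characters.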
But what if your string looked something like this:
$line = " foo bar ";
Using \s+ as the delimiter now would garble things up again.
It would return an extra, empty element at the front, because the
leading whitespace is treated as a delimiter with nothing before it.
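A quick way to see that extra element is to print each field in brackets. This is just a sketch with the test string from above:

```perl
use strict;
use warnings;

my $line = " foo bar ";    # leading and trailing space

my @fields = split /\s+/, $line;

# Brackets make the empty leading field visible.
print "[$_]" for @fields;
print "\n";                # [][foo][bar]
```

Note that only the leading empty field shows up. split drops trailing empty fields by default, so the space at the end of the string causes no trouble.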
How do we fix this, then? Well, the solution may look a bit counterintuitive
given what we have learned so far:
split " ", $line
Hm, wouldn't that just match one single space again? Well, it should,
but it doesn't. A string consisting of a single space is a special case
in perl: split skips any leading whitespace and then splits on runs of
whitespace. If we apply that to our test case we get the following result:
1. foo
2. bar
Ah, just what we want (in almost all cases I would dare to say).
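For a side-by-side comparison, the sketch below runs all three forms on the same test string. The helper sub show is hypothetical and exists only to format the output.

```perl
use strict;
use warnings;

my $line = " foo bar ";

my @special = split ' ',   $line;   # special case: the string " "
my @regex   = split /\s+/, $line;   # ordinary regex, one or more whitespace
my @literal = split / /,   $line;   # ordinary regex, exactly one space

# Hypothetical helper: wrap each field in brackets.
sub show { return join "", map { "[$_]" } @_ }

print "' '    : ", show(@special), "\n";   # [foo][bar]
print "/\\s+/  : ", show(@regex),   "\n";   # [][foo][bar]
print "/ /    : ", show(@literal), "\n";   # [][foo][bar]
```

Only the quoted single space triggers the special behaviour. The regex / / still means "exactly one space", so it keeps the empty leading field.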
As an interesting note, consider the following:
perl -MO=Deparse -e 'split " "'
split(/\s+/, $_, 0);
Whereas
perl -MO=Deparse -e 'split /\s+/'
split(/\s+/, $_, 0);
generates the exact same output, but we know that the semantics are
not the same. But as long as perl does The Right Thing(TM), I'm happy :-)
Autark