http://qs321.pair.com?node_id=1230570


in reply to Re^2: Avoiding repeated undefs
in thread [Solved] Avoiding repeated undefs

Yes, split isn't exactly consistent there. There is a third parameter, LIMIT, which caps the number of fields the string is split into. E.g. split '_', 'a_b_c_d', 2 returns the list ('a', 'b_c_d'), because the string has been split in two.

The thing is, if LIMIT is 0 or omitted, all empty fields at the end are removed. So split '_', 'a_b___', 0; returns the list ('a', 'b'). That is why you get an undefined value in your second case.
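A minimal sketch of that trimming, reusing the 'a_b___' string from above. The variable names are only for illustration.

```perl
use strict;
use warnings;

# LIMIT omitted: trailing empty fields are dropped.
my @omitted = split /_/, 'a_b___';
# A LIMIT of 0 behaves the same way.
my @zero    = split /_/, 'a_b___', 0;

printf "omitted: %d fields (%s)\n", scalar(@omitted), join(',', @omitted);
printf "zero:    %d fields (%s)\n", scalar(@zero),    join(',', @zero);

# Index 2 does not exist in the trimmed list, so the slice yields undef there.
my $third = (split /_/, 'a_b___')[2];
print defined $third ? "third is defined\n" : "third is undef\n";
```

Both arrays should hold just 'a' and 'b', and the slice at index 2 should come back undefined.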

Now, the tricky bit: as an optimization, when perl knows how many values you are assigning to and LIMIT is omitted (or zero), it sets LIMIT to the number of elements + 1 (one field per element, plus a remainder that is ignored). So

my ($key, @array[0,1], undef, $value) = split " ", $string;
is actually interpreted as
my ($key, @array[0,1], undef, $value) = split " ", $string, 6; # Split five times, ignore the sixth value
In that case, LIMIT is not 0 so the empty fields at the end are not removed.
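To see the implicit LIMIT at work, here is a small sketch that assumes a comma-separated sample string (not the data from the thread) and compares a direct assignment to scalars with a detour through an array:

```perl
use strict;
use warnings;

my $line = 'x,y,,';

# Direct assignment to three scalars: split behaves as if LIMIT were 4,
# so the trailing empty fields survive and $c1 is an empty string.
my ($a1, $b1, $c1) = split /,/, $line;

# Going through an array first: no implicit LIMIT, trailing empty fields
# are dropped, and the third scalar ends up undefined.
my @fields = split /,/, $line;
my ($a2, $b2, $c2) = @fields;

print "direct:    ", (defined $c1 ? "defined" : "undef"), "\n";
print "via array: ", (defined $c2 ? "defined" : "undef"), "\n";
```

Same string and same pattern, but only the direct assignment keeps the empty third field.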

You could write ($key, $value) = (split(" ", $string, 8))[0,6]; (there are seven values from 0 to 6, so the remainder is the 8th), but the 8 seems to come out of nowhere, which invites mistakes. Luckily, if LIMIT is negative, it is treated as infinite, i.e. split keeps splitting until the end of the input and does not remove empty fields at the end:

my ($key, $value) = (split(" ", $string, -1))[0,6];
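As a rough illustration of the difference, the sketch below slices the same indexes with and without the negative LIMIT. The comma-separated $record is a made-up sample whose last field is empty.

```perl
use strict;
use warnings;

my $record = 'k,1,2,3,4,5,';   # seven fields, the last one empty

# No LIMIT: the trailing empty field is removed, so index 6 is out of range.
my ($key1, $value1) = (split /,/, $record)[0, 6];

# LIMIT of -1: every field is kept, so index 6 is the empty string.
my ($key2, $value2) = (split /,/, $record, -1)[0, 6];

print "no limit: ", (defined $value1 ? "defined" : "undef"), "\n";
print "limit -1: ", (defined $value2 ? "defined" : "undef"), "\n";
```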
Do note that split " " is a special case of split: it behaves like split /\s+/ except that leading whitespace is skipped, so no empty field is produced at the start.
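A quick way to compare the two forms is a string with leading whitespace (the sample string is an assumption for the demo):

```perl
use strict;
use warnings;

my $text = '  a b';

# The special " " pattern skips leading whitespace.
my @awk   = split ' ', $text;
# A plain /\s+/ pattern produces an empty first field instead.
my @regex = split /\s+/, $text;

printf "split ' '   : %d fields\n", scalar(@awk);
printf "split /\\s+/ : %d fields\n", scalar(@regex);
```

The first call should give ('a', 'b'), the second ('', 'a', 'b').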

Edit: for what it's worth, hippo's solution doesn't have that problem.

Re^4: Avoiding repeated undefs
by rsFalse (Chaplain) on Feb 27, 2019 at 22:41 UTC
    The split behavior here really looks like a bug. Limit "inheritance" is not documented, so it comes as a surprise to the user.

    UPD.: Thanks, choroba, somehow I missed that line :( . Now it seems like normal behaviour. But in the '(split ... )[ ... ]' case it doesn't look like DWIM when split generates too few elements (so the higher indexes that were used can't reach anything).
      If what you mean by "inheritance" is

      > when assigning to a list, if LIMIT is omitted (or zero), then LIMIT is treated as though it were one larger than the number of variables in the list;

      then note that the quote was taken directly from the documentation of split.

      map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]