PerlMonks  

[Solved] Avoiding repeated undefs

by davies (Prior)
on Feb 26, 2019 at 09:43 UTC ( [id://1230550]=perlquestion )

davies has asked for the wisdom of the Perl Monks concerning the following question:

I have some working code. One line of it smells. That line is:

my ($key, undef, undef, undef, undef, undef, $val) = split(/\s+/, $_);

It seems to me to be inelegant to repeat undef. I tried the suggestions in Generating an array of n identical elements and the replies, but they threw error messages when used in a my statement. Is there a more elegant idiom?

Regards,

John Davies

Update: Thanks, all, for scratching my itch!

Replies are listed 'Best First'.
Re: Avoiding repeated undefs
by Eily (Monsignor) on Feb 26, 2019 at 09:55 UTC

    my ($key, $val) = (split(" ", $_))[0,6];
    would work, although you might just as well write:

    my @values = split(" ", $_);
    my ($key, $value) = @values[0,6];
    I find the second one more elegant personally.

    Edit: this is not an exact equivalent though, as davies demonstrated below

      Trying the first one, which I thought extremely elegant, I ran into some unexpected behaviour.

      dr@dns:~$ cat sscce.pl
      use strict;
      use warnings;
      use feature 'say';

      my $str = 'a b c d e f ';
      my ($key, undef, undef, undef, undef, undef, $val) = split(" ", $str);
      say "Val = <$val>";
      my ($key2, $val2) = (split(" ", $str))[0,6];
      say "Val2 = <$val2>";
      dr@dns:~$ perl sscce.pl
      Val = <>
      Use of uninitialized value $val2 in concatenation (.) or string at sscce.pl line 9.
      Val2 = <>

      Some of the data I am hacking have only six data points, but with a trailing space. In my original version I was getting a zero length string as the last value as shown in the SSCCE above. But using the more elegant version, it becomes undef and I get the warning shown. Actually, I find it more logical to get undef and changing my code would be no problem, but I can't find any indication why split should behave so apparently inconsistently. I have tried using the regex in my OP, but get the same results as in the SSCCE.

      As I said in my OP, I have working code. This isn't important, but it is a gap in my knowledge that I'd like to close.

      Regards,

      John Davies

        Yes, split isn't exactly consistent there. There is a third parameter, LIMIT, which limits the number of fields the string is split into. E.g. split '_', 'a_b_c_d', 2 will actually return the list ('a', 'b_c_d') because the string has been split in two.

        The thing is, if LIMIT is 0 or omitted, all empty fields at the end are removed. So split '_', 'a_b___', 0; will return the list ('a', 'b'). Which is why you get an undefined value in your second case.
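        The three LIMIT cases described above can be checked with a minimal sketch like the following. The variable names are made up for illustration; the strings are the ones used in this reply.

```perl
use strict;
use warnings;

# Positive LIMIT: at most that many fields; the last field keeps the remainder.
my @limited = split '_', 'a_b_c_d', 2;

# LIMIT of 0 (same as omitting it): trailing empty fields are dropped.
my @trimmed = split '_', 'a_b___', 0;

# Negative LIMIT: split as often as possible and keep trailing empty fields.
my @kept = split '_', 'a_b___', -1;

print join('|', @limited), "\n";    # a|b_c_d
print scalar(@trimmed), "\n";       # 2
print scalar(@kept), "\n";          # 5
```

        The negative-limit case returns five fields because the four underscores separate 'a', 'b' and three empty strings.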

        Now, the tricky bit: as an optimization, when perl knows how many values you are assigning to, it will actually set LIMIT to the number of elements + 1 (one field per element, and the remainder is ignored). So

        my ($key, $first, $second, undef, $value) = split " ", $string;
        is actually interpreted as
        my ($key, $first, $second, undef, $value) = split " ", $string, 6; # Split five times, ignore the sixth value
        In that case, LIMIT is not 0 so the empty fields at the end are not removed.

        You could write ($key, $value) = (split(" ", $string, 8))[0,6]; (there are seven values from 0 to 6, so the remainder is the 8th), but the 8 seems to come out of nowhere, which invites mistakes. Luckily, if LIMIT is negative, it is treated as infinite, i.e. split will continue splitting until the end of the input and won't remove empty fields at the end:

        my ($key, $value) = (split(" ", $string, -1))[0,6];
        Do note that split " " is a special case of split which is the same as split /\s+/ except empty fields at the start are removed.
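        For anyone who wants to see the three behaviours side by side, here is a small sketch using the string from the SSCCE. The names $sliced, $kept, @awk and @regex are only illustrative, and the results are what I would expect from the documented split rules rather than anything version-specific I have tested widely.

```perl
use strict;
use warnings;

my $str = 'a b c d e f ';    # six fields plus a trailing space

# List assignment to seven targets: perl supplies a limit itself,
# so the trailing empty field survives and $val is an empty string.
my ($key, undef, undef, undef, undef, undef, $val) = split ' ', $str;

# Slicing the result: no implicit limit, so the trailing empty field
# is removed, index 6 does not exist and $sliced is undef.
my ($key2, $sliced) = (split ' ', $str)[0, 6];

# A negative limit keeps the trailing empty field, so index 6 is ''.
my ($key3, $kept) = (split ' ', $str, -1)[0, 6];

# split ' ' skips leading whitespace; split /\s+/ yields a leading empty field.
my @awk   = split ' ',   '  a b';
my @regex = split /\s+/, '  a b';

print defined $val    ? "val is defined\n"    : "val is undef\n";
print defined $sliced ? "sliced is defined\n" : "sliced is undef\n";
print defined $kept   ? "kept is defined\n"   : "kept is undef\n";
print scalar(@awk), " vs ", scalar(@regex), " fields\n";
```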

        Edit: for what it's worth, hippo's solution doesn't have that problem.

        Could you explain where/how do you think split() is inconsistent? . . . Wait, this is where ...

        In my original version I was getting a zero length string as the last value as shown in the SSCCE above. But using the more elegant version, it becomes undef and I get the warning shown.

        ... in which case, sorry to bother you. I see that in perl 5.24.0.

        So in your own version, split() gets the number of fields (limit) to generate; in Eily's version, split() behaves as expected in that (just see the B::Deparse output) ...

        # Inserted newlines for legibility of one-liner run under Windows.
        perl -MO=Deparse -e "
          use strict; use warnings;
          my $x = q[a b c d e f ];
          my ( $one , undef , undef, undef , undef, undef , $other ) = split q[ ] , $x;
          my @all = split q[ ] , $x;
        "
        use warnings;
        use strict;
        my $x = 'a b c d e f ';
        my($one, undef, undef, undef, undef, undef, $other) = split(' ', $x, 8);
        my @all = split(' ', $x, 0);
        -e syntax OK

        At least in perl 5.24.0, supplying 0 or -1 as the limit to split() in the OP's version has no effect on the outcome, i.e. $other gets the value of empty string not undef. $other becomes undef only if one sets the limit to 6.

        Then force the issue by splitting in list context: my ( $x , ... , $y ) = () = split( ... ).

        After all that, I am unable to answer your question: Why?

Re: Avoiding repeated undefs
by Discipulus (Canon) on Feb 26, 2019 at 09:54 UTC
    Hello davies,

    > It seems to me to be inelegant to repeat undef

    No, it is not. The compiler does not bother with elegance ;)

    Anyway, I suspect you cannot avoid repeating them: you are on the left side of an assignment, and inside a my declaration: no array (well, you mean list?) can be there.

    PS

    Any array as the second (or any later) element in the assignment will slurp everything

    perl -e "$str = 'a b c d e f g';@arr=(1,2,3,4,5); ($key, @arr, $val) = split(/\s+/, $str); print qq($key $val\n@arr\n)"
    a
    b c d e f g
    # even if the array is presized:
    perl -e "$str = 'a b c d e f g';$#arr=4; ($key, @arr, $val) = split(/\s+/, $str); print qq($key $val\n@arr\n)"
    a
    b c d e f g

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
      even if the array is presized

      If you specify the slice in the assignment it's fine though:

      use strict;
      use warnings;
      use Test::More tests => 1;

      $_ = "Anyway I suspect you cannot avoid them repeated: you are are in left side";
      my @undef;
      my ($key, $val);
      ($key, @undef[0..4], $val) = split(/\s+/, $_);
      is ($val, 'them');

      I still prefer Eily's approach, however.

        Similar approach:
        ( my $key, (undef) x 5, my $val) = split(/\s+/, $_);
        Note: 'undef' must be enclosed with parentheses to force list context for 'x' operator.
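        A minimal sketch of the repetition approach with sample data, assuming perl 5.22 or later (as far as I know, that is the release that allowed list repetition on the left side of an assignment). The sample record is hypothetical.

```perl
use 5.022;    # assumption: (undef) x N as an lvalue needs perl 5.22+
use strict;
use warnings;

my $line = 'k 1 2 3 4 5 v';    # hypothetical record with seven fields

# (undef) x 5 expands to five placeholders that discard fields 2 through 6.
my ($key, $val);
($key, (undef) x 5, $val) = split /\s+/, $line;

print "key=$key, val=$val\n";    # key=k, val=v
```

        Declaring the variables first and assigning afterwards keeps the repetition out of the my list.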
Re: [Solved] Avoiding repeated undefs
by talexb (Chancellor) on Feb 27, 2019 at 00:40 UTC

    It sounds like you already have a good solution, but for me, the obvious one is to store the output from split in an array, then take the first and last values.

    my @array = split(/\s+/, $_);
    my ( $key, $value ) = @array[ 0, -1 ];
    That approach doesn't care if the number of values changes -- you always get the first and the last values. Also, the split could be simplified to just
    my @array = split(/\s+/);
    because $_ is the default parameter for this function.
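    To illustrate the point about varying field counts, here is a small sketch with hypothetical sample lines. Note that the third line has only six data points plus a trailing space (the case davies mentions above), so the "last" value there is the sixth field rather than an empty seventh one; whether that is acceptable depends on the data.

```perl
use strict;
use warnings;

# Hypothetical sample lines: seven fields, four fields,
# and six fields followed by a trailing space.
my @lines = ('k 1 2 3 4 5 v', 'k 1 2 v', 'k 1 2 3 4 5 ');

my @pairs;
for (@lines) {
    my @fields = split /\s+/;              # splits $_ by default
    my ($key, $value) = @fields[0, -1];    # first and last field
    push @pairs, "$key=$value";
}

print "$_\n" for @pairs;    # k=v, k=v, k=5
```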

    Alex / talexb / Toronto

    Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

Re: Avoiding repeated undefs
by bliako (Monsignor) on Feb 26, 2019 at 15:07 UTC

    Are equally inelegant entries accepted?

    my $str = 'a b c d e f g';
    my @two = $str =~/^([^\s]*)(?:(?:\s+[^\s]+){5})\s+([^\s]+)/;
    # or my ($key, $val) = $str =~ ...
    print join(",", @two)."\n";

Re: Avoiding repeated undefs
by bliako (Monsignor) on Feb 26, 2019 at 15:15 UTC

    That passes my elegance test (trust me I drove a mercedes for years):

    my $str = 'a b c d e f g';
    my ($key, $val) = split(/(?:\s+[^\s]+)+\s+/, $str);
    print "key=$key, val=$val\n";

    bw, bliako

Node Type: perlquestion [id://1230550]
Approved by marto
Front-paged by Corion