PerlMonks  

undef vs empty string '' from split

by chrism01 (Friar)
on Jun 20, 2007 at 23:35 UTC ( [id://622420]=perlquestion )

chrism01 has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

Given the following code:

#!perl -w
use strict;

my ( $var1, $var2, @t_arr );

open(SL,"<s.log");
@t_arr=<SL>;
close(SL);
chomp(@t_arr);
$t_arr[0] =~ s/\s+$//;    # trailing whitespace
($var1, $var2) = split( /\s*=\s*/, $t_arr[0], 2);
if( !defined($var2) )
{
    print "undefined\n";
}
else
{
    print "$var1 X${var2}X\n";
}
exit;
and file is:
event_handler=
i.e. nothing (except \n) after '='. Why do I get:
event_handler XX
as the output? I was expecting "undefined".

Cheers
Chris

Replies are listed 'Best First'.
Re: undef vs empty string '' from split
by ikegami (Patriarch) on Jun 20, 2007 at 23:44 UTC

    The separator is matched, so it must be separating two things. One of them must therefore be "". This allows the following three cases to be distinguished: a=b vs a= vs a.

    for ('a=b', 'a=', 'a') {
        my ($p1, $p2) = split(/=/, $_, 2);
        print(
            "$_\t",
            ( !defined($p2) ? 'undef'
            : !length($p2)  ? '""'
            :                 qq{"$p2"}
            ),
            "\n"
        );
    }

    a=b     "b"
    a=      ""
    a       undef
      OK, that makes sense, i.e. being able to tell whether it matched the separator at all.
Re: undef vs empty string '' from split
by Joost (Canon) on Jun 20, 2007 at 23:57 UTC
    You're specifying a limit of 2. Splitting the string "bla=" on = yields 2 fields: the last field is the empty string following the = sign.

    If you hadn't specified the limit, you would have got what you expected:

    Splits the string EXPR into a list of strings and returns that list. By default, empty leading fields are preserved, and empty trailing ones are deleted. (If all fields are empty, they are considered to be trailing.)

    ...

    If LIMIT is unspecified or zero, trailing null fields are stripped (which potential users of "pop" would do well to remember).

    (from split, emphasis mine). Note that using a limit of 1 does not split the string (which was news to me):

    #!/usr/bin/perl -w
    use strict;

    sub pr {
        print "'$_[0]' (no limit) => [".join(",",map { defined $_ ? "'$_'" : 'undef' } split /=/,$_[0])."]\n" ;
        print "'$_[0]' (limit 1) => [".join(",",map { defined $_ ? "'$_'" : 'undef' } split /=/,$_[0],1)."]\n" ;
        print "'$_[0]' (limit 2) => [".join(",",map { defined $_ ? "'$_'" : 'undef' } split /=/,$_[0],2)."]\n" ;
    }

    pr('');
    pr('a');
    pr('=');
    pr('a=');
    pr('a=b');
    update: output:
    '' (no limit) => []
    '' (limit 1) => []
    '' (limit 2) => []
    'a' (no limit) => ['a']
    'a' (limit 1) => ['a']
    'a' (limit 2) => ['a']
    '=' (no limit) => []
    '=' (limit 1) => ['=']
    '=' (limit 2) => ['','']
    'a=' (no limit) => ['a']
    'a=' (limit 1) => ['a=']
    'a=' (limit 2) => ['a','']
    'a=b' (no limit) => ['a','b']
    'a=b' (limit 1) => ['a=b']
    'a=b' (limit 2) => ['a','b']

    updated again: fixed code & output.
      I haven't tried your more exotic version, but if you remove the limit count from my simple code, you still get the empty string...
      Actually, that's the first thing I tried after it didn't 'work', given that my data only has var=val (if there is a val to get)
        Hmm.. You mean like this:
        #!perl -w
        use strict;

        $_ = 'event_handler=';
        my ($var1, $var2) = split( /=/);
        if( !defined($var2) )
        {
            print "undefined\n";
        }
        else
        {
            print "'$var1' '$var2'\n";
        }
        output:
        'event_handler' ''
        That does seem to be inconsistent with the docs. And what's even more confusing is that:

        #!perl -w
        use strict;

        $_ = 'event_handler=';
        my @arr = split(/=/);
        my ($var1, $var2) = @arr;
        if( !defined($var2) )
        {
            print "undefined\n";
        }
        else
        {
            print "'$var1' '$var2'\n";
        }
        output:
        undefined
        Confirms the docs.

        updated: added output for perl 5.8.8 / linux
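The difference between the two snippets above is most likely explained by another passage in perldoc -f split: when the result is assigned to a list and LIMIT is omitted (or zero), Perl supplies a LIMIT one larger than the number of variables in the list. So `my ($var1, $var2) = split(/=/)` behaves as if LIMIT were 3, and a positive LIMIT keeps trailing empty fields. Assigning to an array implies no LIMIT, so the trailing empty field is stripped. A minimal sketch of the comparison follows; the helper name `show` is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: render a list with undef made visible.
sub show {
    return '[' . join( ',', map { defined $_ ? "'$_'" : 'undef' } @_ ) . ']';
}

my $line = 'event_handler=';

# List assignment to two scalars: Perl supplies an implicit LIMIT of 3,
# so the trailing empty field is kept.
my ( $k1, $v1 ) = split /=/, $line;

# Array assignment: no implicit LIMIT, so trailing empty fields are stripped.
my @fields = split /=/, $line;
my ( $k2, $v2 ) = @fields;

# Explicit LIMIT of 3 into an array, for comparison with the list assignment.
my @explicit = split /=/, $line, 3;

print 'list assignment:  ', show( $k1, $v1 ), "\n";
print 'array assignment: ', show( $k2, $v2 ), "\n";
print 'explicit limit 3: ', show(@explicit), "\n";
```

If this reading is right, the list assignment and the explicit limit of 3 should agree with each other, and only the array assignment should leave the second variable undefined.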

Re: undef vs empty string '' from split
by GrandFather (Saint) on Jun 21, 2007 at 00:08 UTC

    You can achieve what (I presume) you want by ensuring there is at least one character following the match:

    ( $var1, $var2 ) = split( /\s*=\s*(?=.)/, $t_arr[0] );

    Note, however, that if you are hand-rolling an .ini file parser, there are a plethora of modules on CPAN to do the job. See Config::Ini and Config::IniFiles for example.
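To see how the lookahead pattern above treats the three shapes of input discussed earlier in the thread, here is a small sketch. The wrapper name `split_kv` is hypothetical, and the input is assumed to have had its trailing whitespace stripped already, as in the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper around the lookahead split: the separator only
# matches when at least one character follows it.
sub split_kv {
    my ($line) = @_;
    my ( $key, $val ) = split /\s*=\s*(?=.)/, $line;
    return ( $key, $val );
}

for my $line ( 'a=b', 'a=', 'a' ) {
    my ( $key, $val ) = split_kv($line);
    printf "%-4s key=%s val=%s\n", $line, "'$key'",
        defined $val ? "'$val'" : 'undef';
}
```

One thing to watch for: when nothing follows the '=', the pattern does not match at all, so the first field comes back with the '=' still attached ('a=' rather than 'a'). A caller that wants a clean key would need to strip it separately.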


    DWIM is Perl's answer to Gödel
      As it happens, the empty string is easier (simpler code), because it's going to be inserted into an Ingres NOT NULL column and I don't have to check/convert the value from 'undef'.
      No, this isn't a .ini file parser btw.
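As noted earlier in the thread, a line with no '=' at all still yields undef for the second field, even with a limit of 2. If such lines can occur, a sketch along the following lines would guarantee a defined value before the insert. The helper name `parse_pair` is hypothetical and not part of the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split "name=value" and make sure the value is
# always defined, so it can go straight into a NOT NULL column.
sub parse_pair {
    my ($line) = @_;
    $line =~ s/\s+$//;                                # trailing whitespace
    my ( $key, $val ) = split /\s*=\s*/, $line, 2;
    $val = '' unless defined $val;                    # no '=' on the line
    return ( $key, $val );
}

for my $line ( 'event_handler=', 'event_handler', 'event_handler = x' ) {
    my ( $key, $val ) = parse_pair($line);
    print "$key X${val}X\n";
}
```

The `unless defined` form is used instead of the `//=` operator so the sketch also runs on Perl 5.8, which the thread mentions.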
