PerlMonks  

UU-decode unpack of empty string yields undefined value

by AnomalousMonk (Archbishop)
on May 13, 2009 at 01:45 UTC ( [id://763648]=perlquestion )

AnomalousMonk has asked for the wisdom of the Perl Monks concerning the following question:

Oh, Most Monkilicious Monks...

If an empty string is packed with the 'u*' template and then unpacked with the same template, the resulting value is undefined.

Is this behavior specified? Expected? Desired?

(I have never worked with (pack and unpack) UU-encoding before, and have found a simple work-around for the 'problem', if such it be, and there is no compelling need to use this encoding in the first place. But now I'm curious...)

    >perl -v
    This is perl, v5.8.2 built for MSWin32-x86-multi-thread
    ...
    Binary build 808 provided by ActiveState Corp.
    Built Dec 9 2003 10:19:40
    ...

    >perl -wMstrict -le
    "my $ue = pack 'u*', '';
     print 'ue undefined' unless defined $ue;
     print qq{:$ue:};
     my $ud = unpack 'u*', $ue;
     print 'ud undefined' unless defined $ud;
     print qq{:$ud:};
    "
    ::
    ud undefined
    Use of uninitialized value in concatenation (.) or string at ...
    ::
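The question mentions a simple work-around without showing it. One possible form, offered only as a sketch, is to map an undefined decode result back to the empty string. The helper name `uu_roundtrip` is hypothetical and not from the original post, and the sketch uses the plain 'u' template rather than 'u*'.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): uuencode a string,
# decode it again, and map an undefined result back to the empty string.
sub uu_roundtrip {
    my ($text) = @_;
    my $encoded = pack('u', $text);
    my $decoded = unpack('u', $encoded);    # scalar context: first value only
    return defined $decoded ? $decoded : '';
}

for my $text ('', 'hello') {
    my $back = uu_roundtrip($text);
    print "round trip ", ($back eq $text ? "ok" : "FAILED"), " for [$text]\n";
}
```

The explicit `defined` test, rather than the `//` operator, keeps the sketch usable on the Perl 5.8 build shown in the question.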

Replies are listed 'Best First'.
Re: UU-decode unpack of empty string yields undefined value
by almut (Canon) on May 13, 2009 at 11:05 UTC

    If encoding and decoding '' did reproduce '', you could then complain that encoding and decoding undef would not reproduce undef (but '' instead)...

    Reason is that uuencoding-wise, there is no way to distinguish an empty string from an undefined value, so it's simply an implementation choice which variant appears to be less of an issue in the typical case.

      "If encoding and decoding '' did reproduce '', you could then complain that encoding and decoding undef would not reproduce undef (but '' instead)..."
      But isn't the 'standard' (or at any rate the typical) Perl way to handle undef in numeric and string contexts to coerce it to 0 or the empty string, respectively, and issue warnings as appropriate? Conversely, is there any other numeric or string context in which the empty string is coerced to undef?
      "Reason is that uuencoding-wise, there is no way to distinguish an empty string from an undefined value..."
      Are you saying by this that the UU-encoding standard specifies no way to encode an empty string, i.e., that an empty string is, literally, 'undefined' in the standard? If this is the case, I would expect undef to be produced by the encoding step as well as by decoding.

      Perhaps the basic problem here is that I am simply unfamiliar with and do not understand the UU-encoding process.

        "Are you saying by this that the UU-encoding standard specifies no way to encode an empty string, i.e., that an empty string is, literally, 'undefined' in the standard?"

        Not quite. All I was trying to say is that while Perl variables internally have meta information to tell apart undef from empty, the uuencoding format doesn't have any such provision.  I.e. an empty string encodes to an empty string (nothing, zero bytes), and as it isn't really specified how to encode an undefined value, the nearest approximation would be to also encode it as nothing (zero encoded chars/bytes).

        Now, when you're faced with having to decode that nothing, you cannot tell whether it has been generated by an empty string, or undef. So you just have to make some choice... and it's kind of arbitrary whether you consider decoding nothing into undef or the empty string as more appropriate.
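The point that an empty string and undef encode to the same "nothing" can be checked with a short sketch. The helper name `uu_encode` is made up for illustration, and the `uninitialized` warning that pack issues for undef is silenced on purpose.

```perl
use strict;
use warnings;

# Hypothetical helper: uuencode a value that may be undef.
sub uu_encode {
    my ($value) = @_;
    no warnings 'uninitialized';    # pack warns when handed undef
    return pack('u', $value);
}

my $from_empty = uu_encode('');
my $from_undef = uu_encode(undef);

printf "empty: %d bytes, undef: %d bytes, identical: %s\n",
    length($from_empty), length($from_undef),
    ($from_empty eq $from_undef ? 'yes' : 'no');
```

Since both inputs produce the same zero-length output, a decoder that sees only that output has no information left to decide between them.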

Re: UU-decode unpack of empty string yields undefined value
by december (Pilgrim) on May 13, 2009 at 06:42 UTC
    I'm running Perl v5.10.0 on Linux and I get the exact same output. I would assume, in a string context, an empty string to remain an empty string and not to become an "undefined value". An empty string is valid unicode (or indeed should be valid in any and every encoding) after all. I too am curious about the reason for this "feature", as I do a lot of unicode handling in my code – I've never noticed this before...
      ...as I do a lot of unicode handling in my code

      Just to avoid confusion: uuencoding has nothing directly to do with Unicode.

      Note the difference between upper and lower case U. From pack:

          u  A uuencoded string.
          U  A Unicode character number. Encodes to UTF-8 internally
             (or UTF-EBCDIC in EBCDIC platforms).
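To make the difference between the two templates concrete, here is a small sketch. The variable names are illustrative only. Lowercase 'u' turns a byte string into a line of uuencoded text, while uppercase 'U' turns a code point number into a character.

```perl
use strict;
use warnings;

# Lowercase 'u': uuencode the one-byte string "A" into a line of text.
my $uuencoded = pack('u', 'A');

# Uppercase 'U': build the character whose code point is 65.
my $character = pack('U', 65);

print "uuencoded form has ", length($uuencoded), " characters\n";
print "character form is '$character'\n";
print "decoded uuencoded form is '", scalar(unpack('u', $uuencoded)), "'\n";
```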
