Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

A question about method return values and error checking

by wee (Scribe)
on Nov 03, 2015 at 19:03 UTC ( [id://1146837]=perlquestion: print w/replies, xml ) Need Help??

wee has asked for the wisdom of the Perl Monks concerning the following question:

So a coworker and I were having a friendly debate about something and I'd like get the opinions of you fine folks.

In short: Do you think it's good style for a method that is normally expected to return a string to return undef in some cases?

In particular: The Sys::Info::OS module has a name() object method. It's normally is expected to return the operating system's name. The example given in the module's POD offers this as sample usage:

printf "Operating System: %s\n", $os->name( long => 1 );

The problem is that on certain linuxes, the name() method returns undef, which causes the above example code to output something like this:

Use of uninitialized value in printf at /tmp/test.pl line 20. Operating System:

The debate: I'm of the opinion that -- in general -- returning undef isn't a great idea. For subroutines, I think it's fairly terrible due to a caller's possible scalar or list context, which is unknown to the subroutine author and certainly of his control. For methods that are expected to be used in concatenation and the like, as in the example above, I think it's a little rude.

My coworker says that it's up to the end user to check for a valid string being returned. I say that there's nothing in the docs which says that undef might sometimes be emitted and that it would be more polite to have it return the output of 'uname -o -r' or 'unknown' or whatever. It sets up the expectation that a string will be returned, and then forces you to deal with a possible undef. Does every method call return undef sometimes?

I mean, it's not hard to do this instead of the module's example:

my $osname = $os->name() ? $os->name() : `uname -o -r`; chomp($osname); print "Operating System: $osname\n";

But there's a part of me that says it'd be better to have the name() method do that rather than require it for (I suppose; I don't know) every method call the module offers. Where else might there be errors waiting to be checked for?

Also, this comes from code that was shipped to a customer. We've got perhaps 6 various Linuxes in house (including a very old version of RHEL), and all of them print the OS name reliably. The customer got the "Use of uninitialized value ..." error, and we, frankly, looked like chumps because of it.

It's nothing earth-shattering, and the fix was easy, but it's the genesis of the debate. I think it's not really our fault since the module doesn't do what it says it would do (the is_unknown() method notwithstanding). The coworker claims it is our fault due to shoddy error checking. And I can certainly see his side of things since we shipped the code, after all. We're ultimately responsible for what it does.

Your thoughts?

Replies are listed 'Best First'.
Re: A question about method return values and error checking
by kennethk (Abbot) on Nov 03, 2015 at 19:17 UTC
    This is definitely subjective, and a point for documentation. In both cases, $os->name is trying to communicate that the $os object has no name. It's actually more consistent with this interpretation in my mind to return an undef here, since an undef feels more like a null to me, as opposed to an empty string. But either is acceptable so long as it's documented that way.

    Assuming you are dealing with 5.10.0 or later (I think), the Logical Defined Or actually gives a very elegant way of writing your case, and makes undef a preferred choice:

    printf "Operating System: %s\n", $os->name( long => 1 ) // 'undef';
    or even
    printf "Operating System: %s\n", $os->name( long => 1 ) // (`uname -o +-r` =~ s/\n//r);

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      I've been thinking about it and I am leaning toward this. We had no idea that it might even return undef. What operating system has no name? And with the docs like they were, it's just not something we accounted for. But the logical defined or looks like the way to go.
        If you wanted to be particularly crazy, you could even go with:
        printf "Operating System: %s\n", $os->name( long => 1 ) // (`uname -o -r` =~ s/\n//r) || 'undefined';

        #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re: A question about method return values and error checking
by GotToBTru (Prior) on Nov 03, 2015 at 19:28 UTC

    If the end result of the processing done by the subroutine is an undefined quantity, I believe that is the value it should return. poNumber() should return undef if no PO Number exists. If an undefined value would only result because of an unexpected condition, the subroutine should raise an error. An example of the latter is a subroutine that is supposed to return the name of the OS. The OS has a name, but if our code can't arrive at it, that's an error in the programming and this should be noted as such.

    Dum Spiro Spero
Re: A question about method return values and error checking
by dsheroh (Monsignor) on Nov 04, 2015 at 09:19 UTC
    Because of the context issues you mentioned, I'd use a blank return; instead of return undef;, but, other than that, I tend to agree with your coworker. If there's no defined value to return, return an undefined value.

    As far as warnings and error detection/handling, you need to check the result either way to see whether it failed to get an OS name or not and checking the truthiness of the return value works just as well (and doesn't emit any warnings) regardless of whether the return value on failure is undef or an empty string.

      A counter example to consider: let's say the code in question was:
      printf <<EOF, $os->name( long => 1 ), $os->version; Operating System: %s Version : %s EOF
      In order for the output to make sense, you would actually want a placeholder in the list for the name. In the general case, invoking $os->name in a list context doesn't really make much sense unless you really have an explicit list of values.

      #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

        That's resolved trivially:
        printf <<EOF, $os->name( long => 1 ) // 'unknown OS', $os->version // +'unknown version'; Operating System: %s Version : %s EOF
        If you return empty strings on failure instead of undef, then you'd need to use || instead of //, which could cause issues if 0 is a potential return value. (Not likely in the specific example of OS name/version, but relatively common in the general case.) So, personally, I'd chalk that up as another point in undef's favor.

      Because of the context issues you mentioned, I'd use a blank return; instead of return undef;

      I hate you! Functions that are expected to return a scalar should not suddenly returning nothing. It rarely has any advantage, and it fails very poorly in common situations.

      f( name => get_name(), foo => 1 );
        > f( name => get_name(), foo => 1 );

        That's because the design of => is flawed.

        It should enforce scalar context, if people want to use it as a pairing operator.

        DB<106> sub tst { return } DB<107> x a => scalar( tst() ), b=>2 => ("a", undef, "b", 2)

        And yes I know that Moose relies on this "feature", but you can't have it all.

        Cheers Rolf
        (addicted to the Perl Programming Language and ☆☆☆☆ :)
        Je suis Charlie!

Re: A question about method return values and error checking
by mr_mischief (Monsignor) on Nov 04, 2015 at 15:47 UTC

    I think you're asking writers of code you're calling to consider the code around it, which is code you're writing. Why don't you read the documentation for code you call and be prepared for it to return what it's documented to return? The code you write is your responsibility, not the responsibility of those providing properly documented libraries to you.

    If you want to introduce a code shim around the library call that allows you to handle the undef differently and return something else, then feel free to do that. Don't try to call out the library's implementors as doing the wrong thing. They told you what their code does. You chose to call it. From there your code is your responsibility, not theirs.

    Given the opportunity to write code that does it the same way as Sys::Info::OS or another way, choose either. Then document what your code returns and why. Then people can decide whether to use your code or not, and perhaps can blame you for their code doing the wrong thing after they ignore its documentation.

    I just read the documentation for the module, and it doesn't promise that the name method returns a defined value nor does it clearly say it may return undef. I'm sure the author would appreciate a POD patch. Meanwhile, there are is_linux, is_bsd, is_windows, and is_unknown methods to test in a boolean manner for each of those. Have you tried those methods? Are you depending upon code you're unsure of working the way you're using it?

    If you read the fairly simple code of Sys::Info::OS to understand what it does, you'd see it's checking against Sys::Info::Constants's OSID value for those boolean methods. In the name method, it's deferring to Sys::Info::Driver::Linux::OS in the case of Linux, which in turn defers in its private _populate_os_version method to Sys::Info::Driver::Linux::OS::Distribution. That module does a best guess by reading the conventional (and LSB-required) informational files in /etc and /proc to determine the system name, version, edition, and more. The synopsis of Sys::Info::Driver::Linux::OS::Distribution very clearly shows that its name method returns a false value if it can't determine the OS. The description clearly states which distributions it does currently support, and welcomes patches.

    This document describes version C<0.7903> of C<Sys::Info::Driver::Linux::OS::Distribution> released on C<8 May 2013>. This is a simple module that tries to guess on what linux distribution we are running by looking for release's files in /etc. It now looks for 'lsb-release' first as that should be the most correct and adds ubuntu support. Secondly, it will look for the distro specific files. It currently recognizes slackware, debian, suse, fedora, redhat, turbolinux, yellowdog, knoppix, mandrake, conectiva, immunix, tinysofa, va-linux, trustix, adamantix, yoper, arch-linux, libranet, gentoo, ubuntu and redflag. It has function to get the version for debian, suse, redhat, gentoo, slackware, redflag and ubuntu(lsb). People running unsupported distro's are greatly encouraged to submit patches.

    Instead of complaining about the author who did all that work so you could use a single method not returning exactly what you think it should, why don't you either do the work yourself, handle the return value properly for what you're getting, or contribute to his project? Patches are "greatly encouraged" and I think you've volunteered to submit some quite well.

      The world is better place if you're not rude to strangers.

      Give that a try sometime.

        Presuming strangers owe you their time is among the most rude things one can do. If you don't like the free and open code provided, don't use it. If you don't like my advice, you don't have to follow it.

        Now you're not just complaining about the free and open code. Now you're complaining that I pointed out in my thorough review of your situation that you were complaining about the free and open code.

        So far as I can tell from reading the thread, I've put more time into understanding and fixing your problem than you have. I'm pretty certain the group as a whole has spent more person-hours on this than you have. Frankly, I think that's rude of you.

        Nothing requires me to help you at all. Neither requires the other members here to help, either. You're ranting about a stranger to strangers and expecting them not only to help you for free, but apparently to agree with you as well. Here's the thing: when you're wrong (and I do think you're wrong) agreeing with you doesn't help. Disagreeing with you isn't an attempt to be rude. It's an attempt to make your situation better in the long run even if it's a little uncomfortable for you now.

        Then after all the free time people here have put into trying to help you with your complaint about the free code you're using to make your living, you make an entire new response to tell me you don't like the tone you read into my advice? Which one of us exactly is being rude?

Re: A question about method return values and error checking
by thargas (Deacon) on Nov 04, 2015 at 19:10 UTC

    My personal take on this is: it depends. If the thing to be returned is something which is supposed to exist, then I would throw an exception (die). If it's something which sometimes exists and sometimes doesn't, then undef is (IMHO) the way to go.

    So then, in this case, what should we do? Well, I'd say that an OS would be expected to have a name, so, lacking documentation to say whether undef is a valid thing to return, I would expect it to die.

    However, when we start to investigate further, there is no standard for how to determine such a thing as OS name, at least not portably, so I would accept the author's choice to return undef, as it seems, in the real world, this is indeed a piece of information which may be lacking.

    The best thing to do (IMHO) would be to provide a patch to the author which updates the docs to say that undef may be returned, and in addition, if it's about to return undef for name, try running "uname -o' and use that.

Re: A question about method return values and error checking
by nikosv (Deacon) on Nov 04, 2015 at 08:17 UTC
    I think it's another perspective of the 'return error code vs throw an exception' debate and not just a matter of instead of returning undef upon error, to return another true holding value
Re: A question about method return values and error checking
by sectokia (Pilgrim) on Nov 05, 2015 at 12:20 UTC
    At cpan, I notice its very common not to properly mention the way error handling is done. I have seen undef, "", 0, throwing exceptions, dieing. Basically if the author doesn't explicitly say what happens, you have to eval the call, then check the result to figure out what it is, or better yet just open the module code and have a look. I prefer to just return empty string, in your particular situation, since it's a value and it's correct in that no name could be obtained.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1146837]
Front-paged by stevieb
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others perusing the Monastery: (4)
As of 2024-04-19 02:56 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found