http://qs321.pair.com?node_id=368936

...and octals, and decs.  Specifically, I refer you to the functions oct() and hex().

Now, consider:  man perlfunc groups oct() and hex() in 'functions for scalars or strings', and also in 'numeric functions' along with sin(), cos(), tan(), sqrt().  So they can be used both for strings and for numeric values.  Considering how we represent octal and hexadecimal values, this is perfectly reasonable.  But let us consider for a few moments some of the other functions with which they are grouped.

crypt(), for instance, returns a crypted version of plaintext input.  reverse() returns a reversed version of its input.  abs() returns the absolute value of input of indeterminate sign; sin() returns the sine of the value it is called on; and so on.  In this vein, if I find a function hex(), I normally expect it to be a function which returns a hexadecimal representation of its input, and oct() to return an octal representation.  Just like any of the functions above, I expect its name to be indicative of what it does.

Perl's hex() and oct(), however, violate this convention.   They return, in fact, the decimal value of their input, which hex() assumes is in hexadecimal representation, and oct() is actually quite happy to accept in hexadecimal or binary as well as octal.  They are, in fact, not so much hex() and oct() functions as they are unhex() and unoct() functions.  They could both be replaced, with more clarity and no loss of generality, by a single dec() function which uses logic similar to the following to determine the nature of its input:

    /^(0x)?[0-9A-Fa-f]+$/i    Assume hexadecimal
    /^0b[01]+$/i              Assume binary
    /^[0-7]+$/                Assume octal
    /^\d+$/                   Assume decimal
    Any other input           Error

One could then call dec() to obtain decimal representations of binary, octal and hexadecimal input, freeing up oct() and hex() to return octal and hexadecimal representations of their input respectively (which could trivially be done using the input logic above in a wrapper around sprintf).

Now, I'm not necessarily advocating making this change, because it would break much existing code.  But would anyone care to speculate as to why it might be that hex() and oct(), unlike almost every other predefined function in every programming language I know, are named not for the nature of the output they return, but for the nature of the input they expect?  This is as counter-intuitive as having a "Start" button on the dash of your car which, when pushed, shuts off the engine (and which is labelled "Start" because it assumes it will only ever be pushed when the engine has already been started).  It is, in short, Bad Design.

Re: A philosophical pondering concerning hexes
by diotalevi (Canon) on Jun 23, 2004 at 04:10 UTC

    I like the idea you are presenting but I balk at your interpretation that oct() and hex() convert or return decimal. They consume a properly formed string and return a number. Perl happens to stringify numbers as their decimal representation. It is only incidental that print hex "0x20" appears to be a hex_string->decimal converter. It is just a hex_string->number converter.
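
The distinction drawn above can be checked directly. This is a minimal sketch (the variable names are illustrative only): the value returned by hex() behaves as a plain number, and decimal only enters the picture when that number is turned back into a string.

```perl
use strict;
use warnings;

# hex() turns a string of hexadecimal digits into an ordinary number.
my $n = hex("0x20");

# The result takes part in arithmetic like any other number.
my $sum = $n + 1;

# Decimal only appears when the number is stringified, e.g. by print.
my $as_decimal = "$n";               # "32"

# The same number can just as easily be rendered in another base.
my $as_hex    = sprintf("%x", $n);   # "20"
my $as_binary = sprintf("%b", $n);   # "100000"

print "$as_decimal $as_hex $as_binary $sum\n";
```

Nothing in $n records that the input was written in hexadecimal; the base is a property of the string on the way in and of the format on the way out.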

    I am not quite sure what the name for numify_hex_string() should be. unhex() is awkward and is still not descriptive of its function.

    I also note that oct() doesn't happily consume non-octal data. It just screws up when you do that.

      I also note that oct() doesn't happily consume non-octal data. It just screws up when you do that.
      huh? oct("0xf") will happily return 15. "0xf" is non-octal in my view...
        My guess is that he means something like:
        $ perl -wle 'print oct 108'
        Illegal octal digit '8' ignored at -e line 1.
        8
        To me, this is just what Perl is about - regardless of what you throw at it, it does its utmost best to make something out of it, and it issues a warning that there's something odd about the input. Although in this case, it isn't just Perl that does so - the C functions strtol, atoi and friends also consume the initial portion of a string, stopping at the first character that isn't valid.

        Abigail
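
The one-liner above can also be reproduced inside a script. The following is a small sketch, assuming warnings are enabled; it captures the warning with a local $SIG{__WARN__} handler so that both the returned value and the message can be inspected. The variable names are illustrative.

```perl
use strict;
use warnings;

my @warnings;
my $input = "108";   # '8' is not an octal digit
my $value;
{
    # Collect warnings instead of letting them go to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    $value = oct($input);
}

# oct() stops at the first invalid digit and converts what came before it.
print "value: $value\n";
print "warning: $_" for @warnings;
```

Only the leading "10" is converted, which is why the result is 8 rather than an error.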

Re: A philosophical pondering concerning hexes
by BigLug (Chaplain) on Jun 23, 2004 at 04:53 UTC
    I've often felt that base manipulation was one of perl's shortcomings -- not because it can't do it, but because it's difficult and, as you point out, the functions work in reverse.

    I don't want to see your magic function because we'd then have:

    dec(998)  == 998
    dec(999)  == 999
    dec(1000) == 8
    dec(1001) == 9
    dec(1002) == 514
    which I'm sure you'll agree is just wrong.

    I'd prefer that the existing functions did what they appear to do: hex() returns a hexadecimal representation of the input data and oct() returns an octal representation. Similarly a bin() function should return a binary representation.

    As changing now would be an absolute backward-compatibility nightmare, I'd suggest using the full names for decimal-to-base-n conversion: hexadecimal(255) eq 'FF'.

    We might also include generic base manipulator, but it's probably more a loadable (module) function (the functionality below probably already exists in a module, I haven't checked):

    # changebase($number, $from[, $to]);
    print changebase(255, 10, 16);
    # FF
    print changebase(0xFF, 16, 10);
    # 255
    print changebase('FF', 16); # assume change to base 10 when no 'to' is supplied
    # 255
    print changebase('FF', 16, 2);
    # 11111111
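
One way such a function might look is sketched below. changebase() is hypothetical (it is not a core function, and no existing module is being quoted); this version assumes non-negative integers, bases 2 to 36, and upper-case output digits. A literal such as 0xFF is already the number 255 by the time a function sees it, so the sketch takes its digits as a string.

```perl
use strict;
use warnings;

my @DIGITS = ('0' .. '9', 'A' .. 'Z');

# changebase($digits, $from[, $to]): hypothetical helper, not a core function.
# $digits is a string of digits written in base $from; the result is a string
# of digits in base $to (default 10). Non-negative integers only.
sub changebase {
    my ($digits, $from, $to) = @_;
    $to = 10 unless defined $to;
    for my $base ($from, $to) {
        die "base must be between 2 and 36\n" if $base < 2 || $base > 36;
    }

    # Parse: accumulate the numeric value digit by digit.
    my %value_of;
    @value_of{@DIGITS} = (0 .. $#DIGITS);
    my $number = 0;
    for my $char (split //, uc $digits) {
        my $digit = $value_of{$char};
        die "invalid digit '$char' for base $from\n"
            if !defined $digit || $digit >= $from;
        $number = $number * $from + $digit;
    }

    # Format: peel off the least significant digit until nothing is left.
    my $out = '';
    do {
        $out    = $DIGITS[$number % $to] . $out;
        $number = int($number / $to);
    } while ($number > 0);
    return $out;
}

print changebase(255, 10, 16), "\n";   # FF
print changebase('FF', 16), "\n";      # 255
print changebase('FF', 16, 2), "\n";   # 11111111
```

Routing every conversion through a plain number keeps the parsing and formatting halves independent, which is the same split that hex()/oct() and sprintf already provide.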
    An OO module could even be more transparent:
    use Base;
    $hex = new Base 16 => 'FF';
    print $hex;            # Stringify
    # FF
    print $hex->binary;    # Return value only ($hex->binary is an alias for $hex->base(2))
    # 11111111
    print $hex;
    # FF
    print $hex->to_binary; # Change value
    # 11111111
    print $hex;
    # 11111111
    After all that, my main point is that I absolutely agree that those two functions are completely non-intuitive.
    "Get real! This is a discussion group, not a helpdesk. You post something, we discuss its implications. If the discussion happens to answer a question you've asked, that's incidental." -- nobull@mail.com in clpm

      You're absolutely right on the 1000/1001 etc. case, I didn't think that through far enough. I was thinking more in terms of "This is wrong, what would be a better approach?" than specific implementations.

      A better implementation would be to add functions like oct2dec(), hex2dec(), dec2binary() ....

Re: A philosophical pondering concerning hexes
by Abigail-II (Bishop) on Jun 23, 2004 at 10:47 UTC
    Your dec has a problem. Anything matched by /^[0-7]+$/ or /^\d+$/ is going to be matched by /^(0x)?[0-9A-Fa-f]+$/i as well, so if you do the tests in the order you present them, it'll assume either hexadecimal, or binary. Reversing the order will give a problem as well. How would you distinguish between 10₁₆ == 16₁₀, 10₁₀, and 10₈ == 8₁₀ (subscripts indicate base)?
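
The overlap is easy to see by running the four patterns in the order they were posted. This is only an illustrative sketch; classify() and @rules are made-up names, not part of any proposal in the thread.

```perl
use strict;
use warnings;

# The tests from the original post, in the order they were listed.
my @rules = (
    [ hexadecimal => qr/^(0x)?[0-9A-Fa-f]+$/i ],
    [ binary      => qr/^0b[01]+$/i ],
    [ octal       => qr/^[0-7]+$/ ],
    [ decimal     => qr/^\d+$/ ],
);

# Return the name of the first rule that matches, or 'error'.
sub classify {
    my ($input) = @_;
    for my $rule (@rules) {
        my ($name, $pattern) = @$rule;
        return $name if $input =~ $pattern;
    }
    return 'error';
}

printf "%-6s => %s\n", $_, classify($_) for qw(10 777 1999 0b101 0x1f xyz);
```

Every digit string is claimed by the first pattern, so the octal and decimal rules are never reached. Because "b" is itself a hexadecimal digit, even "0b101" falls to the first pattern when the tests run in this order.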

    Of course, as pointed out, your fallacy lies in assuming that oct and hex return decimal representations of numbers - they don't. They return numbers:

    $ perl -MDevel::Peek -wle 'Dump hex 10'
    SV = IV(0x8192014) at 0x8181270
      REFCNT = 1
      FLAGS = (IOK,READONLY,pIOK)
      IV = 16
    It's a number - without a stringified value.

    Also, Perl already has a function to turn a number into a hexadecimal, octal or binary representation: it's called sprintf.

    Abigail
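
For reference, the sprintf conversions mentioned above look like this. It is a short sketch using standard format specifiers; the variable names are illustrative.

```perl
use strict;
use warnings;

my $n = 255;

my $hex_lower = sprintf("%x",   $n);   # ff
my $hex_upper = sprintf("%X",   $n);   # FF
my $octal     = sprintf("%o",   $n);   # 377
my $binary    = sprintf("%b",   $n);   # 11111111
my $prefixed  = sprintf("%#x",  $n);   # 0xff
my $padded    = sprintf("%08b", 5);    # 00000101

print join(" ", $hex_lower, $hex_upper, $octal, $binary, $prefixed, $padded), "\n";
```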

Re: A philosophical pondering concerning hexes
by ysth (Canon) on Jun 23, 2004 at 08:28 UTC
    Your description of dec() is what oct() does, except for the "Assume decimal" line, which is pretty useless in perl.

    At least it isn't named something horrid like strtoul.

    Oh, and you've got reverse "backward". You pass it a reversed string and it returns a string in the correct order, so it is analogous to oct() after all :)

    Update: oops; I missed the ? in (0x)?; oct() requires the 0x to be there. Sounds like what you ought to do is implement sscanf in POSIX.pm and use it instead of oct/hex.
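
The difference noted in the update can be seen with a few inputs. This is a minimal sketch (the %got hash is just an illustrative container): oct() picks the base from a 0x or 0b prefix and otherwise reads octal, while hex() accepts hexadecimal digits with or without the prefix.

```perl
use strict;
use warnings;

my @inputs = ('755', '0x1f', '0b101', 'ff');

# oct() chooses the base from the prefix; without one it reads octal digits.
my %got = map { $_ => oct($_) } @inputs;

printf "oct(%-7s) = %d\n", "'$_'", $got{$_} for @inputs;

# hex() treats the 0x prefix as optional.
my $bare     = hex('ff');
my $prefixed = hex('0xff');
print "hex('ff') = $bare, hex('0xff') = $prefixed\n";
```

Without the prefix, "ff" contains no octal digits at all, so oct() has nothing to convert and returns 0.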

Re: A philosophical pondering concerning hexes
by hardburn (Abbot) on Jun 23, 2004 at 21:27 UTC

    Personally, I think it's a good argument for making numbers true objects. It doesn't matter how they're represented internally, as long as operations (addition, subtraction, etc.) all work. You can specify the format of input and output, but that doesn't change how it operates.

    Such a change wouldn't fit into Perl5, but would probably work in Perl6, and is probably already done in Python and Ruby.

    ----
    send money to your kernel via the boot loader.. This and more wisdom available from Markov Hardburn.