
Trinary Operator Semantics

by hardburn (Abbot)
on May 26, 2005 at 14:27 UTC

NOTE: This was originally a mailing to my local LUG after a Perl vs Python debate and a discussion with one Pythoner about how much the trinary operator annoys him.

At a recent LUG meeting, a small debate arose over the trinary operator (so-called because it's the only three-arg operator in most (all?) languages). It was stated that a compiler can create exactly the same code for both the trinary operator and an if-else construct.

It is true that they are functionally identical. In other words, if you create two functions with each construct and give them the same input, they will always return the same output. However, this statement says nothing about the efficiency of the given function.

In programming languages, there is another kind of identity to keep in mind (and this is the one that I think Python's designers forget a lot): semantic identity. This deals with how the compiler generates the underlying machine instructions. Optimizers might seem like they are deep voodoo, but they are actually straightforward mathematically once you realize how semantics work.

To prove it, I wrote this C program:

#include <stdio.h>

int trinary( int var );
int ifelse( int var );

int trinary( int var )
{
    int to_return = var ? 1 : 0;
    return to_return;
}

int ifelse( int var )
{
    int to_return;

    if( var ) {
        to_return = 1;
    }
    else {
        to_return = 0;
    }

    return to_return;
}

int main(int argc, char **argv)
{
    int got_trinary = trinary( 1 );
    int got_ifelse  = ifelse( 1 );

    return 0;
}

This was compiled with gcc -fverbose-asm -S trinary_test.c. This will place the generated assembly code in 'trinary_test.s'. The '-fverbose-asm' arg has gcc make the ASM a little more readable.

My gcc is:

$ gcc -v
Reading specs from /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.5-20050130/specs
Configured with: /var/tmp/portage/gcc-3.3.5.20050130-r1/work/gcc-3.3.5/configure
  --enable-version-specific-runtime-libs --prefix=/usr
  --bindir=/usr/i686-pc-linux-gnu/gcc-bin/3.3.5-20050130
  --includedir=/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.5-20050130/include
  --datadir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3.5-20050130
  --mandir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3.5-20050130/man
  --infodir=/usr/share/gcc-data/i686-pc-linux-gnu/3.3.5-20050130/info
  --with-gxx-include-dir=/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.5-20050130/include/g++-v3
  --host=i686-pc-linux-gnu --disable-altivec --enable-nls
  --without-included-gettext --with-system-zlib --disable-checking
  --disable-werror --disable-libunwind-exceptions --disable-multilib
  --disable-libgcj --enable-languages=c,c++,f77 --enable-shared
  --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
Thread model: posix
gcc version 3.3.5-20050130 (Gentoo Hardened Linux 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)

Below are the outputs for the two functions in question. I've put in comments for where the param-passing sections begin and end, which are the same for both functions, since they have the same function signature.

# trinary function
.LC0:
        .string "trinary"
        .text
.globl trinary
        .type   trinary, @function
trinary:
        # Start param passing
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        subl    $52, %esp
        call    __i686.get_pc_thunk.bx
        addl    $_GLOBAL_OFFSET_TABLE_, %ebx
        movl    __guard@GOT(%ebx), %eax
        movl    (%eax), %eax
        movl    %eax, -24(%ebp)
        # End param passing
        cmpl    $0, 8(%ebp)         # var
        setne   %al
        movzbl  %al, %eax
        movl    %eax, -28(%ebp)     # to_return
        movl    -28(%ebp), %eax     # to_return
        movl    __guard@GOT(%ebx), %edx
        movl    (%edx), %edx
        cmpl    %edx, -24(%ebp)
        je      .L2
        movl    -24(%ebp), %eax
        movl    %eax, 4(%esp)
        leal    .LC0@GOTOFF(%ebx), %eax
        movl    %eax, (%esp)
        call    __stack_smash_handler@PLT
.L2:
        addl    $52, %esp
        popl    %ebx
        popl    %ebp
        ret
        .size   trinary, .-trinary
.globl __stack_smash_handler
        .section        .rodata

# ifelse function
.LC1:
        .string "ifelse"
        .text
.globl ifelse
        .type   ifelse, @function
ifelse:
        # Start param passing
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        subl    $52, %esp
        call    __i686.get_pc_thunk.bx
        addl    $_GLOBAL_OFFSET_TABLE_, %ebx
        movl    __guard@GOT(%ebx), %eax
        movl    (%eax), %eax
        movl    %eax, -24(%ebp)
        # End param passing
        cmpl    $0, 8(%ebp)         # var
        je      .L4
        movl    $1, -28(%ebp)       # to_return
        jmp     .L5
.L4:
        movl    $0, -28(%ebp)       # to_return
.L5:
        movl    -28(%ebp), %eax     # to_return
        movl    __guard@GOT(%ebx), %edx
        movl    (%edx), %edx
        cmpl    %edx, -24(%ebp)
        je      .L6
        movl    -24(%ebp), %eax
        movl    %eax, 4(%esp)
        leal    .LC1@GOTOFF(%ebx), %eax
        movl    %eax, (%esp)
        call    __stack_smash_handler@PLT
.L6:
        addl    $52, %esp
        popl    %ebx
        popl    %ebp
        ret
        .size   ifelse, .-ifelse
.globl __stack_smash_handler
        .section        .rodata

I warn against trying to make an opcode-by-opcode timing comparison. Both Intel and AMD stopped publishing per-instruction cycle counts years ago, because modern processors have so many optimizations built in that you can no longer get a predictable count. Unlike compiler optimizations, I think processor optimizations actually are deep voodoo, to the point where they are borderline non-deterministic.

What can be compared is that the ifelse has 3 jumps to trinary's 1. That will be a significant hit on performance, especially if the processor makes a branch mis-prediction.

Also, notice how the two handle the return variable:

# trinary
movl    %eax, -28(%ebp)     # to_return

# ifelse
movl    $1, -28(%ebp)       # to_return

The trinary function is handling the variable between registers, whereas ifelse is using a memory location. This is another boost to the trinary operator.

Semantics are basically a way to give the compiler more information about what you're trying to do. In this case, the trinary operator is designed around the idea that you will directly assign to a variable (to_return = var ? 1 : 0;). OTOH, an if-else may or may not be used that way, so it must be compiled in a much more general way.

As it happens, adding -O2 to the compiler args makes both functions compile to identical assembly. Apparently, gcc is smart enough to figure out that the if-else's only side effect is modifying the to_return variable. Consider all this a lesson in how semantics work in general, because such smart optimization may not always be possible without introducing different syntax to produce better semantics. Also, gcc may not be able to pull it off when the operation is buried inside a much more complex function.

All this may not apply to other programming languages. I suspect Perl's implementation isn't very efficient, on the basis that Perl's semantics in general make C look good (it only got constant-folding fairly recently). It isn't so much that Perl's designers ignored good semantics, but that the language wasn't really designed at all. Perl6 should be a great improvement here.

Since the original writing, I've been shown output from B::Terse demonstrating that Perl will indeed optimize the trinary operator.
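
If you want to poke at this yourself, a command-line session along these lines is a minimal sketch of how to check it (assuming a perl with the stock B:: backend modules, which ship with the core); with a constant condition, B::Deparse should show the conditional folded away at compile time:

# Dump the optree for a trinary whose condition is only known at runtime:
$ perl -MO=Terse -e 'my $x = $ARGV[0] ? 1 : 0;'

# With a constant condition, the trinary should be folded away entirely:
$ perl -MO=Deparse -e 'my $x = 1 ? "yes" : "no";'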

Language designers often create a less-general alternative to an existing statement precisely because the restricted form has better semantics (the (semi-)functional languages do this in particular). That's why C's switch() statement is limited in the conditionals it can handle compared to an if-elseif-else: the limitation means switch() can be compiled to a jump table with O(1) dispatch, whereas an if-elseif-else chain is O(n). (See Re: Perl Idioms Explained: && and || "Short Circuit" operators for an interesting discussion on this.)
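
Perl 5 has no switch() of its own, but the same trade-off shows up in the familiar hash-of-coderefs dispatch idiom: by restricting yourself to exact-match keys, one hash lookup picks the handler no matter how many cases there are, much like a jump table. A minimal sketch (the operation names here are made up for illustration):

#!/usr/bin/perl
use strict;
use warnings;

# One O(1) hash lookup selects the handler, regardless of the number of
# cases, rather than walking an O(n) if-elsif-else chain.
my %dispatch = (
    add      => sub { $_[0] + $_[1] },
    subtract => sub { $_[0] - $_[1] },
    multiply => sub { $_[0] * $_[1] },
);

my $op = shift(@ARGV) || 'add';
my $handler = $dispatch{$op}
    or die "unknown operation '$op'\n";

print $handler->(6, 7), "\n";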

Functional languages won't generally force a programmer to learn syntax for the sake of learning syntax (i.e., obfuscation), but they freely add (or remove) syntax if it means the compiler can reason about the program more effectively. While this admittedly makes the learning curve harder, it also produces a much more efficient language.

"There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

Replies are listed 'Best First'.
Re: Trinary Operator Semantics
by pfaut (Priest) on May 26, 2005 at 19:39 UTC

    I just compiled your code on a VAX (VMS) both with and without optimization enabled. In each case, both routines generated identical code.

    To me, the ternary operator is most useful in a function call argument list or some other complex expression where one element needs to change based on a condition. These become much easier and more concise to express with a ternary operator than with two separate statements inside an if/else construct.

    Contrived example:

    printf("File name: %s\n", (ptr == NULL) ? "<unspecified>" : ptr);

    ...or...

    char *s;

    if (ptr == NULL)
        s = "<unspecified>";
    else
        s = ptr;

    printf("File name: %s\n", s);

    What it boils down to is that the ternary operator presents an expression and can be used anywhere an expression can be used. if/else is a (compound) statement and is much more restrictive about where it can be used.
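
    Since this is PerlMonks, a Perl aside: the "it's an expression" point goes even further there, because Perl's ternary can also be used as an lvalue when both branches are legal lvalues, which no if/else statement can manage. A small contrived sketch (the variable names are made up):

        #!/usr/bin/perl
        use strict;
        use warnings;

        my ($odd_total, $even_total) = (0, 0);

        for my $n (1 .. 10) {
            # The ternary is an expression, so it can even sit on the left
            # side of an assignment when both branches are lvalues.
            ($n % 2 ? $odd_total : $even_total) += $n;
        }

        print "odd: $odd_total, even: $even_total\n";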

    # trinary
    movl    %eax, -28(%ebp)     # to_return

    # ifelse
    movl    $1, -28(%ebp)       # to_return
    The trinary function is handling the variable between registers, whereas ifelse is using a memory location. This is another boost to the trinary operator.

    Doesn't the $1 notation mean that the value 1 is encoded in the instruction itself (immediate value)? You seem to imply that it is referencing a variable somewhere else in memory.

    90% of every Perl application is already written.
    dragonchild

      Doesn't the $1 notation mean that the value 1 is encoded in the instruction itself (immediate value)?

      You are correct. My ASM is rusty.

      "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

Re: Trinary Operator Semantics
by chromatic (Archbishop) on May 26, 2005 at 21:25 UTC
    It isn't so much that Perl's designers ignored good semantics, but that the language wasn't really designed at all.

    I think that misses a much more interesting issue (and is potentially insulting as well). Perl is a very different type of language from C. The fact that it may not do the same types of optimizations as C compilers, or to the same degree, doesn't necessarily mean that Perl had no design stage or that the Perl developers are lazy or ignorant.

    By giving up the "compile once and for all and that's that" step, Perl has traded some potential optimizations for additional flexibility. I don't think that's an accident.

      Tail recursion would be a great optimization to have, but Perl doesn't do it except through the clunky use of goto BLOCK;. And that doesn't appear to give any time optimization (admittedly, that p5p message could be long out of date).

      I don't think there's any support for the idea that until the Perl6 effort, Perl was anything but ad hoc design. The difficulties in disambiguating the defined-or operator from an empty regex should be proof of that. The fact that you can't get a context-free grammar for Perl should be proof of that. The steps perl has to go through (and still occasionally gets wrong) to figure out whether map { ... } @array contains a block or a hashref should be proof of that.

      I don't mean to insult the p5p people, as they make my work easy in many cases. Perl has simply been extended over the years from an awk/sed killer into a part imperative, part functional, part OO language, and that process inevitably introduced some rather odd designs.

      "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

        Tail recursion would be a great optimization to have, but Perl doesn't do it except through the clunky use of goto BLOCK;. And that doesn't appear to give any time optimization (admittedly, that p5p message could be long out of date).

        I think you mean goto &sub;

        #!perl
        use strict;
        use Benchmark;

        sub normal {
            return 0 unless $_[0];
            @_ = ($_[0] - 1);
            return normal(@_);
        }

        sub tail {
            return 0 unless $_[0];
            @_ = ($_[0] - 1);
            goto &tail;
        }

        timethese( 10, {
            "normal" => sub { normal(500000) },
            "tail"   => sub { tail(500000) },
        } );
        Output:
        normal: 25 wallclock secs (23.19 usr + 0.59 sys = 23.78 CPU) @ 0.42/s (n=10)
          tail: 18 wallclock secs (17.44 usr + 0.03 sys = 17.47 CPU) @ 0.57/s (n=10)
        Not very impressive, but there is a clear improvement in speed. The improvement in wallclock seconds becomes far larger once the normal recursive call starts swapping (on my machine, at 1_000_000 levels of recursion). I tried this with only 1 iteration because I got impatient:
        #!perl
        use strict;
        use Benchmark;

        sub normal {
            return 0 unless $_[0];
            @_ = ($_[0] - 1);
            return normal(@_);
        }

        sub tail {
            return 0 unless $_[0];
            @_ = ($_[0] - 1);
            goto &tail;
        }

        timethese( 1, {
            "normal" => sub { normal(1000000) },
            "tail"   => sub { tail(1000000) },
        } );
        output:
        Benchmark: timing 1 iterations of normal, tail...
            normal: 123 wallclock secs ( 6.58 usr + 1.13 sys = 7.71 CPU) @ 0.13/s (n=1)
                    (warning: too few iterations for a reliable count)
              tail:   3 wallclock secs ( 3.50 usr + 0.01 sys = 3.51 CPU) @ 0.28/s (n=1)
                    (warning: too few iterations for a reliable count)
        I don't think there's any support for the idea that until the Perl6 effort, Perl was anything but ad hoc design.

        Compare it to a real example of ad hoc language design: PHP.

        The fact that you can't get a context-free grammar for Perl should be proof of that.

        I wouldn't jump to that conclusion so fast. Human languages are not context-free. Giving a language a CFG certainly makes it easier for computers to understand it; it's less clear that it makes it easier for the human programmer.

Re: Trinary Operator Semantics
by tilly (Archbishop) on May 26, 2005 at 22:00 UTC
    Supporting what chromatic said, the fact that the time to compile is typically part of the time to run a Perl program means that there are a lot of optimizations that are not worth doing in Perl. The time needed to look for the optimization often exceeds the time that you're likely to save.
Re: Trinary Operator Semantics
by demerphq (Chancellor) on May 27, 2005 at 07:55 UTC

    It is true that they are functionally identical. In other words, if you create two functions with each construct and give them the same input, they will always return the same output.

    Well, insofar as you can always use one to simulate the other. The relationship isn't symmetrical, though: generally speaking (that is, without the use of do{}) the ternary cannot simulate if(){}else{}. To me this is a much more important issue than their relative efficiency. When you use a ternary you are signalling that your conditional logic is used to select a value. When you use an if statement you are signalling that your conditional logic determines what set of actions occurs. To me these are totally different signals, and they are very useful in understanding the code. If a conditional has no side effects then I prefer it to be a ternary (or even a chained ternary) so that it is very clear that there are no side effects to be considered. As soon as my eyes see an if statement they start looking for multiple side effects; when they see a ternary they start looking for where the resulting variable is used, because that is where the side effects will manifest themselves. These are totally different cognitive processes, IMO.
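
    A small sketch of the chained-ternary style described above (the names and cutoffs are just for illustration):

        #!/usr/bin/perl
        use strict;
        use warnings;

        my $score = shift(@ARGV) || 75;

        # The chained ternary only *selects* a value; a reader knows at a
        # glance that there are no side effects hiding in the branches.
        my $grade = $score >= 90 ? 'A'
                  : $score >= 80 ? 'B'
                  : $score >= 70 ? 'C'
                  :                'F';

        print "grade: $grade\n";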

    ---
    $world=~s/war/peace/g
