http://qs321.pair.com?node_id=1078511


in reply to Re: Given When Syntax
in thread Given When Syntax

Hi Marshall. Thanks for including both an if/else and hash example. I wasn't aware of the more efficient way of using IF and that is what I was looking for (hard to find that stuff by just searching online). Also these make it a lot easier for me to try different solutions. Unfortunately I couldn't get your examples to work. I'm sure I'm overlooking something simple, but when I pass a value to the subroutines I'm not getting any output (see example below for hash). I thought the return would print it to the screen. Please pardon my newbness.

Beyond that, the example I created for the question is rather simplified. In real life I have to deal with a string where I am only concerned with the first number. E.g., n.yy.zzz, where "n" is the number I need to map to. I can get this number simply enough with regex, but I'm not sure how to implement that within your example (can't test because of the output issue). Here's an attempt with an illustration of what I'm trying to do:

use strict;
use warnings;

#Using a hash table
#Evaluate on the first digit of the string
{
    my ($var2) = @_;
    # $var1 = shift; # is slightly faster with one var
    # my ($var1, $var2) = @_; # slightly faster than 2 shifts
    # normally this minor difference doesn't matter.
    # Perl can implement the if/else idea very code efficiently.
    return ("One")   if ($var2 =~ /^1/ );
    return ("Two")   if ($var2 == /^2/ );
    return ("Three") if ($var2 == /^3/ );
    return undef;
}

# I want to evaluate the number 1.00.000
# I thought I would just have to pass it
# to the subroutine to get a return value.
test2(1.00.000);

Once again, many thanks. I appreciate your time.

Replies are listed 'Best First'.
Re^3: Given When Syntax
by Marshall (Canon) on Mar 16, 2014 at 15:03 UTC
    Try this:
    sub test2
    {
        my ($var2) = shift;
        # Perl can implement the if/else idea very code efficiently.
        return ("One")   if ($var2 =~ /^1\./ );
        return ("Two")   if ($var2 =~ /^2\./ );
        return ("Three") if ($var2 =~ /^3\./ );
        return ("UNKNOWN");
    }
    print test2 ("1.00.000"), "\n";
    print test2 ("11.00.000"), "\n";
    print test2 ("3.abc"), "\n";
    __END__
    prints:
    One
    UNKNOWN
    Three
    Update:
    Well I think we both know that this is a simple example.
    The regexes aren't complicated. They are just there to demo the idea.

      Thanks, Marshall. That did the trick. I was an idiot for not seeing the subroutine was not identified in your code snippet as it was in my example. Thanks again for your time and patience.

Re^3: Given When Syntax
by ww (Archbishop) on Mar 16, 2014 at 15:01 UTC
    1. You have no sub test2 { ...}
    2. You show use of strict and warnings but you should have seen a message that return is not valid as your code presents it: "Can't return outside a subroutine..."
    3. You don't have any mechanism for output -- no print of the return value, were the return actually valid.

    Further, inclusion of comments which have no relevance to your problem merely makes your code more verbose, and thus, less likely to be a candidate for a reply by Monks who have other demands on their time. You shouldn't waste it if you really "appreciate (our) time."

    So here's an Rx: make sure you have your fundamentals down... and use that knowledge to recognize when you've been given an outline; not a line-for-line code solution. (Offering that kind of response is wholly in keeping with the Monastery ethos: we're here to help you learn; not to spoonfeed you with solutions!)

    Updated by addition of thoughts in last para following the ellipsis.

    Update2/clarification/correction: (yeah, my remark re comments is too broad and insufficiently precise.) My intent -- now "better phrased" I hope -- was to point out that non-code info on the problem belongs in the narrative -- not in the code -- and that code that's been commented out should be omitted unless it casts some NEW light on the problem -- which is not the case with OP's reposting of Marshall's well-taken comment on the efficiency of the construct shown in the node to which Deep_Plaid is addressing himself.


    Questions containing the words "doesn't work" (or their moral equivalent) will usually get a downvote from me unless accompanied by:
    1. code
    2. verbatim error and/or warning messages
    3. a coherent explanation of what "doesn't work" actually means.

      Sorry to waste your time, WW. I will be more thoughtful in the future. I do appreciate the time and help I have received from people on this site. I try to do as much research as I can before posting so I don't waste people's time. I am also taking a Perl course through Pluralsight and I haven't gotten through the course yet - real time deadlines have gotten in the way.

Re^3: Given When Syntax
by Laurent_R (Canon) on Mar 16, 2014 at 18:03 UTC
    In addition to the errors that have already been pointed out to you (especially the fact that you don't have a test2 subroutine), please note that you should pass a string as a parameter to your sub:
    test2(1.00.000);
    should be:
    test2("1.00.000");
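    A note on why the quotes matter: as far as I know, Perl parses a bare literal with two or more dots, such as 1.00.000, as a v-string (a string built from characters with those ordinals) rather than as the text "1.00.000". So the sub never sees a leading "1" character. A minimal sketch, using a hypothetical helper show_arg that is not part of the code in this thread:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (not from the thread): report what a sub actually receives.
    sub show_arg {
        my ($arg) = @_;
        # %vd prints the ordinal of each character, joined by dots.
        return sprintf "length=%d ordinals=%vd", length($arg), $arg;
    }

    print show_arg(1.00.000),   "\n";   # bare literal: a 3-character v-string
    print show_arg("1.00.000"), "\n";   # quoted: the 8-character text
    ```

    The first call reports three characters with ordinals 1, 0 and 0, which is why a regex such as /^1/ cannot match it.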
    I would also submit that this:
    sub test2 {
        my ($var2) = @_;
        return ("One")   if ($var2 =~ /^1/ );
        return ("Two")   if ($var2 == /^2/ );
        return ("Three") if ($var2 == /^3/ );
        return undef;
    }
    is not correct for input values starting with 2 and 3 (== is a numeric comparison, not a pattern match) and is not very efficient in terms of performance, nor in terms of coding simplicity. The immediate correction of the error is to replace == with =~ for cases 2 and 3:
    sub test2 {
        my ($var2) = @_;
        return ("One")   if ($var2 =~ /^1/ );
        return ("Two")   if ($var2 =~ /^2/ );
        return ("Three") if ($var2 =~ /^3/ );
        return undef;
    }
    Note that Marshall corrected these two errors, but I thought it would be useful to point these out to you for your benefit. An additional improvement would be to remove the triple regex and to extract the first digit from the string only once:
    sub test2 {
        my $var2 = substr shift, 0, 1;
        return ("One")   if $var2 == 1 ;
        return ("Two")   if $var2 == 2 ;
        return ("Three") if $var2 == 3 ;
        return undef;
    }
    Doing the extraction of the first digit only once is cleaner, removes the risk of the error I pointed out just above and is likely to be faster if that matters (although it is true that an anchored regex is pretty fast). And it paves the way for yet another improvement, the use of an array rather than multiple evaluations. The full program may now be this:
    use strict;
    use warnings;

    my @translation = qw / Zero One Two Three/;

    sub test2 {
        return $translation[(substr shift, 0, 1)];
    }

    print test2("1.00.000");
    Now, assuming you have a very large amount of data and performance matters, we may want to benchmark this against your (corrected) triple regex version and an intermediate solution extracting the first digit only once:
    use strict;
    use warnings;
    use Benchmark qw/cmpthese/;

    my @translation = qw / Zero One Two Three/;

    sub test1 {
        my $var2 = shift;
        return ("One")   if ($var2 =~ /^1/ );
        return ("Two")   if ($var2 =~ /^2/ );
        return ("Three") if ($var2 =~ /^3/ );
        return undef;
    }

    sub test2 {
        my $var2 = substr shift, 0, 1;
        return ("One")   if ($var2 == 1 );
        return ("Two")   if ($var2 == 2 );
        return ("Three") if ($var2 == 3 );
        return undef;
    }

    sub test3 {
        return $translation[(substr shift, 0, 1)];
    }

    cmpthese( -1, {
        test_1 => sub {test1("3.01.000")},
        test_2 => sub {test2("3.01.000")},
        test_3 => sub {test3("3.01.000")},
    } )
    which gives the following results:
    $ perl test_if.pl
                Rate test_1 test_2 test_3
    test_1 1294050/s     --   -11%   -51%
    test_2 1451608/s    12%     --   -45%
    test_3 2642856/s   104%    82%     --
    As you can see, the array solution is about twice as fast. Having said that, performance is often not so important (the code is often fast enough anyway), and I quite regularly use solutions similar to Marshall's proposals.
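    Since the original question also mentioned a hash, here is a hedged sketch of a hash-based variant that pulls the leading number out of an "n.yy.zzz" string with a regex. The names %name_for and test_hash and the "UNKNOWN" fallback are illustrative assumptions, not code from this thread, and the // operator needs Perl 5.10 or later:

    ```perl
    use strict;
    use warnings;

    # Lookup table; the names and the "UNKNOWN" fallback are illustrative assumptions.
    my %name_for = (1 => 'One', 2 => 'Two', 3 => 'Three');

    sub test_hash {
        my ($version) = @_;
        # Capture all leading digits before the first dot, so "11.00.000" is 11, not 1.
        my ($first) = $version =~ /^(\d+)\./;
        return 'UNKNOWN' unless defined $first;
        return $name_for{$first} // 'UNKNOWN';
    }

    print test_hash($_), "\n" for "1.00.000", "11.00.000", "3.abc", "x.1";
    ```

    Unlike the substr approach, this does not confuse 11 with 1, and input that does not start with digits followed by a dot falls through to the fallback without a numeric warning.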

      You're using a constant input which starts with a "3" though, which unfairly penalizes test1 and test2 (it's the final situation they check for). For inputs starting with a "1", test3 is still the fastest, but the difference between it and the other tests is much smaller.

      Also, I'd recommend running your benchmarks like this:

      cmpthese(-1, {
          test_1 => q{ test1("3.01.000") },
          test_2 => q{ test2("3.01.000") },
          test_3 => q{ test3("3.01.000") },
      });

      ... using q{ ... } instead of sub { ... }. If you use sub { ... } you're wrapping each iteration in an extra sub call layer. For micro-optimization benchmarks like this, that extra layer can make a significant difference to the results.

      use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name

        You're using a constant input which starts with a "3" though, which unfairly penalizes test1 and test2 (it's the final situation they check for). For inputs starting with a "1", test3 is still the fastest, but the difference between it and the other tests is much smaller.

        Yes, you are absolutely right, tobyink, I did unfairly penalize test1 and test2, and I did it consciously and voluntarily, because, in a real situation, I would assume that the first digit in the input can take any value between 1 and 9 (and possibly 0), so that having a match at the third value actually gives an unfair advantage to test1 and test2. Having said that, with three possible values, matching at the second value should be fair if values are more or less equally distributed. Changing my benchmark test to:

        cmpthese( -1, {
            test_1 => sub {test1("2.01.000")},
            test_2 => sub {test2("2.01.000")},
            test_3 => sub {test3("2.01.000")},
        } )
        I obtain the following result:
        $ perl test_if.pl
                    Rate test_1 test_2 test_3
        test_1 1451608/s     --    -8%   -46%
        test_2 1578202/s     9%     --   -41%
        test_3 2667353/s    84%    69%     --
        which still shows a very clearcut advantage to the array solution.

        As for using

        cmpthese(-1, {
            test_1 => q{ test1("3.01.000") },
            test_2 => q{ test2("3.01.000") },
            test_3 => q{ test3("3.01.000") },
        });
        I was not aware of the possibility of doing it this way. Thank you for the information; I'll investigate this further. I doubt, though, that it really makes a huge difference: a factor of two between one solution and the others is not exactly what I would call micro-optimization.
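
        One way to sidestep the question of which leading digit is "fair" is to run each candidate over a mix of inputs on every iteration. The sketch below is only an illustration under assumptions: the @inputs sample (one string per leading digit) and the fixed iteration count are my own choices, and only the test1 and test3 subs from above are reproduced.

        ```perl
        use strict;
        use warnings;
        use Benchmark qw/cmpthese/;

        my @translation = qw/ Zero One Two Three /;

        # Assumed sample: one input per possible leading digit.
        my @inputs = ("1.01.000", "2.01.000", "3.01.000");

        # Triple-regex version, as in the thread.
        sub test1 {
            my $var2 = shift;
            return ("One")   if ($var2 =~ /^1/ );
            return ("Two")   if ($var2 =~ /^2/ );
            return ("Three") if ($var2 =~ /^3/ );
            return undef;
        }

        # Array-lookup version, as in the thread.
        sub test3 {
            return $translation[(substr shift, 0, 1)];
        }

        # A fixed iteration count keeps the run short; a negative value
        # (CPU seconds, as used above) should give steadier numbers.
        cmpthese( 500_000, {
            test_1 => sub { test1($_) for @inputs },
            test_3 => sub { test3($_) for @inputs },
        } );
        ```

        With real data, replacing @inputs with a representative sample of actual strings would probably be more telling than any single constant.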

      Hello again, Laurent. Let me just start by saying "I'm not worthy! I'm not worthy!" This is great stuff. I had asked about performance and you replied. This is huge because the amount of data I'm dealing with is significant. I probably won't be able to fully examine your notes until later today or tomorrow (I'm under some deadlines), but just wanted to let you know your contribution is highly valued. Hope you are having a smashing weekend. Cheers, DP.