Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling
 
PerlMonks  

shift v. indexing into an array

by 7stud (Deacon)
on Nov 15, 2009 at 12:46 UTC ( [id://807258]=perlquestion: print w/replies, xml ) Need Help??

7stud has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I have two questions:

1) I've read that monks avoid indexing into an array. When using shift, does perl use some kind of magic to avoid the overhead of sliding all the other elements over one spot. For instance, if a monk had an array with 1 million elements would they call shift to get the first element, or would a monk break down and write $big_arr[0]?

2) Should I be concerned about declaring my variables inside a while loop? For instance $val:

my $x = 0; while ($x < 5) { my $val = $x + 2; #repeatedly declaring a new variable? say $val; $x++; }

versus:

my ($x, $val); $x = 0; while ($x < 5) { $val = $x + 2; #same old variable every time say $val; $x++; }

Replies are listed 'Best First'.
Re: shift v. indexing into an array
by moritz (Cardinal) on Nov 15, 2009 at 13:09 UTC
    I've read that monks avoid indexing into an array.

    I think that's a mis-perception. Monks avoid iterating over array indexes simply to index it later on. So instead of

    for (my $i = 0; $i < @array; $i++) { use $array[$i] here }

    They write

    for my $a (@array) { use $a directly here }

    But there is absolutely no reason to avoid array indexing in general.

    Should I be concerned about declaring my variables inside a while loop?

    No.

    Perl 6 - links to (nearly) everything that is Perl 6.
Re: shift v. indexing into an array
by bobr (Monk) on Nov 15, 2009 at 15:26 UTC
    Hello, I tried to benchmark those constructs like this:
    use Benchmark qw(cmpthese); use 5.010; cmpthese(-2, { shift => sub { my @array = (1..10)x10000; my $item = shift @array; }, index => sub { my @array = (1..10)x10000; my $item = $array[0]; }, }); cmpthese(-2, { inside => sub { my $x = 0; while ($x < 5) { my $val = $x + 2; #repeatedly declaring a new variable? say $val; $x++; } }, outside => sub { my ($x, $val); $x = 0; while ($x < 5) { $val = $x + 2; #same old variable every time say $val; $x++; } }, });
    For 1) it shows shift is slightly faster (it actually just moves pointer to start of the array, no moving occur).
    Rate index shift index 62.6/s -- -1% shift 63.1/s 1% --
    For 2) after filtering all printed output it seems declaring my inside loop makes it slightly slower. I would say inside declaration is cleaner and speedup is not worth avoiding it.
    Rate inside outside inside 132994/s -- -9% outside 146050/s 10% --
    All measured on Activestate perl 5.10 on Win32 ... YMMV

    -- regards, Roman

      Hi,

      Thanks for the benchmarks. The shift results surprise me, but the my declaration results are what I would expect.

      Is there a "setup" option for benchmarking? For instance, allowing you to create the big array before the benchmarking begins.

        The benchmarking works by running the subroutines multiple times (the number is specified by first parameter of cmpthese). It is possible to move initialization outside as far as you don't modify them, which is not the case of pop nor shift.

        -- Roman

      For 1) it shows shift is slightly faster

      The difference is so small that I'd be more inclined to write this off as statistical noise and/or a CPU-specific characteristic and consider them identical.

      For 2) after filtering all printed output it seems declaring my inside loop makes it slightly slower. I would say inside declaration is cleaner and speedup is not worth avoiding it.

      Your conclusion is correct and the first sentence is why. In real code, you're going to be doing more in your loop than just declaring a variable, so a 10% difference in the time required to do that declaration will quickly fade into insignificance. Better to keep the declaration close to where the variable is first used to make the source clearer and eliminate any chance of an old value persisting from one iteration into the next.

Re: shift v. indexing into an array
by BrowserUk (Patriarch) on Nov 15, 2009 at 16:07 UTC
Re: shift v. indexing into an array
by Joost (Canon) on Nov 15, 2009 at 21:58 UTC
    The reason seasoned programmers avoid indexes when they can is because there are nice abstractions for many common cases that are easy to understand, make the intent of the code clearer and are more concise.

    Perl's builtins include map and grep as simple examples: if you want to filter out some information in a large or small array it's clearer to use grep, if you want to transform an array into a new one, it's clearer to use map.

    If you want to modify the array in place, you can generally use simple for(each): for (@array) { $_ = $_+1 }. You really only want to use indexing if the index is more than just the index in your algorithm. It's also sometimes the case that using an index makes it easier to do operations on related elements in a list (though that has more to do with a lack of builtins that deal with groups of elements - you can easily write functions that handle the problem in a more generic way).

    My rule of thumb, inspired by more functional languages like Scheme, is this: most of the time, you want to produce a new list/array, and that entails you probably don't want to use for() with or without indexes. But when you instead want to produce side-effects (like modifying the original array, or printing each element) you do want to use for().

    I try to use while(defined shift(@array)) etc only on buffers and the like that get filled in one place and are "simultaniously" drained in another, and shift() on its own pretty much only when reading function arguments.

    On the subject of shift() - perl arrays keep a "start-of-array" number in their datastructures, so using shift() usually doesn't move any items at all and it's generally pretty efficient to shift() even very large arrays. See Perlguts illustrated. I'm assuming shift/push combos move stuff around every once in a while, since I write code that depends on that behaviour and it seems to work, but I'm only 90% certain on that.

    As for where to declare variables: declare them in the smallest block you can, if only because the memory they're using can potentially be freed when the code leaves that block of code.

    Don't worry about potential cpu problems there unless you have serious reasons to think they're any problem in a real program. On the whole of your program's runtime they're almost always a completely trivial concern, and there are even potential optimizations to make on "locally declared" variables - though perl doesn't usually make those optimizations. Also note that looping over a lexical variable declaration does not 're-declare' that variable. It just gets marked as uninitialized.

      perl arrays keep a "start-of-array" number in their datastructures, so using shift() usually doesn't move any items at all and it's generally pretty efficient to shift() even very large arrays.

      shift never moves any items, and thus it's extremely efficient no matter the size of the array.

      @a = qw( a b c ); +---+---+---+---+ @a = | a | b | c | X | ( X = spare ) +---+---+---+---+ ^ ^ | | start end shift @a; +---+---+---+---+ @a = | X | b | c | X | +---+---+---+---+ ^ ^

      I'm assuming shift/push combos move stuff around every once in a while, since I write code that depends on that behaviour and it seems to work, but I'm only 90% certain on that.

      It's not the shift/push combo, it's push alone that causes the moving.

      For efficiency, arrays can have more elements allocated than necessary. If a push is performed, these spare elements will be used. If a push is performed and there no spare elements, a new bigger array is allocated and the elements (pointers) are moved to the new array. This occurs whether shift was used or not.

      [ continuing from above ] push @a, 'd'; +---+---+---+---+ @a = | X | b | c | d | +---+---+---+---+ ^ ^ push @a, 'e'; -> No more space! Re-allocation occurs. -> $new_buf_size = $old_buf_size * 2 + 4 -> Pointers to elements are quickly copied to newly allocated buffer. +---+---+---+---+ | X | b | c | d | +---+---+---+---+ / / / / / / / / / v v v +---+---+---+---+---+---+---+---+---+---+---+---+ @a = | b | c | d | e | X | X | X | X | X | X | X | X | +---+---+---+---+---+---+---+---+---+---+---+---+ ^ ^
      Also note that looping over a lexical variable declaration does not 're-declare' that variable. It just gets marked as uninitialized.

      I was reading the Monk tutorial "Coping with Scoping", and I found an example which is relevant to this discussion:

      Every time control reaches a my declaration, Perl creates a new, fresh variable. For example, this code prints x=1 fifty times:

      for (1 .. 50) { my $x; $x++; print "x=$x\n"; }

      You get a new $x, initialized to undef, every time through the loop.

      If the declaration were outside the loop, control would only pass by it once, so there would only be one variable:

      { my $x; for (1 .. 50) { $x++; print "x=$x\n"; } }

      This prints x=1, x=2, x=3, ... x=50.

      That seems at odds with what you are saying. In any case, there are situations where declaring a my variable inside a loop can cause problems.
Re: shift v. indexing into an array
by jwkrahn (Abbot) on Nov 15, 2009 at 13:10 UTC
    I've read that monks avoid indexing into an array.

    I've never read that myself.

    if a monk had an array with 1 million elements would they call shift to get the first element

    If you really needed to modify the array then there is no other way, else you should not modify it and just use an index into the array.

      You could add new items to the end of an array by simply storing them as elements with new, larger indices. But real Perl programmers don't use indices.*

      *Of course, we're joking. But there's a kernel of truth in this joke.

      Learning Perl 5th, p. 45

      Is the Llama infecting me with fleas? I feel comfortable manipulating the right side of an array with pop and push, but messing around with the left side of an array using shift and unshift sets off alarm bells.

Re: shift v. indexing into an array
by GrandFather (Saint) on Nov 15, 2009 at 21:36 UTC
Re: shift v. indexing into an array
by zentara (Archbishop) on Nov 15, 2009 at 14:33 UTC
    ....funny.... just a few days ago, a new language was discussed here, seeRe: Thoughts on "Go"? ; where one of the benefits of doing your type of array operation was discussed

    ...you might want to check out the map function too, see Auto-filling an array for example


    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku
Re: shift v. indexing into an array
by dwm042 (Priest) on Nov 16, 2009 at 18:45 UTC
    Just my 0.02, but this is an exceptionally good thread.

    It makes me wish we could vote for whole threads in some fashion.

    David

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://807258]
Approved by moritz
Front-paged by biohisham
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others taking refuge in the Monastery: (4)
As of 2024-04-24 11:44 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found