How foreach loops decide what to iterate through

lefthanded has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: How foreach loops decide what to iterate through by Lawliet (Curate) on Feb 13, 2009 at 23:27 UTC
Welcome to The Monastery, and thank you for writing your node. I have little experience with the Benchmark module, but after conducting some tests, I would say that you should use the latter method, the one you prefer. It is more concise, easier to read, ~~and a little more efficient.~~ (thanks for the notice, ikegami.) #!/usr/bin/perl use warnings; use strict; use 5.010; use Benchmark qw(:all); use Data::Dumper; cmpthese(100000, { 'LowOutside' => sub { my @array = qw(8 1 3 5 6 2); @array = sort @array; foreach (@array) { my $num = $_; } }, 'LowInside' => sub { my @array = qw(8 1 3 5 6 2); foreach (sort @array) { my $num = $_; } }, 'Outside' => sub { my @array = qw(8 1 3 5 6 2) x 100; @array = sort @array; foreach (@array) { my $num = $_; } }, 'Inside' => sub { my @array = qw(8 1 3 5 6 2) x 100; foreach (sort @array) { my $num = $_; } } }); __END__ Rate Outside Inside LowOutside LowInside Outside 1681/s -- -1% -99% -99% Inside 1691/s 1% -- -99% -99% LowOutside 169492/s 9985% 9924% -- -5% LowInside 178571/s 10525% 10461% 5% -- [download] And you didn't even know bears could type.	[reply] [d/l]
Re^2: How foreach loops decide what to iterate through by ikegami (Patriarch) on Feb 14, 2009 at 03:09 UTC
I would say that you should use the latter method, the one you prefer. It is more concise, easier to read, and a little more efficient. 1% and even 5% is within the margin or error of Benchmark, so that last phrase is inaccurate. I'm actually surprised they're the same speed. `@a = sort @a;` is optimized to be done in place, and I think `for (@a)` is optimized over the more general `for (LIST)` (it's definitely implemented differently).	[reply] [d/l] [select]
Re: How foreach loops decide what to iterate through by jethro (Monsignor) on Feb 13, 2009 at 23:34 UTC
Both methods are practically the same in terms of speed. When you do foreach (sort @a), the foreach doesn't loop over @a but instead over the anonymous (and temporary) array that was returned from the sort function. The only difference of the two methods is that the first method stores the (anonymous) sorted array into @a before going into the foreach loop, while the second method leaves the sorted array anonymous.	[reply]
Re^2: How foreach loops decide what to iterate through by jvector (Friar) on Feb 14, 2009 at 00:25 UTC
That's clear and understandable, and what I would have thought too. (</metoo> ;-) But I am still puzzled as to what is happening in the example of `my @a = (1); foreach (@a) { push @a, 1; }` [download] Ceci n'est pas une signature	[reply] [d/l]
Re^3: How foreach loops decide what to iterate through by jethro (Monsignor) on Feb 14, 2009 at 02:54 UTC
You are looping over an array you are adding to, so you are never reaching the end. This is somewhat similar to the following loop `my $i=1; my $end= 2; while ($i++<$end) { $end++; }` [download] Note that `y @a = (1); foreach (sort @a) { push @a, 1; }` [download] stops immediately, because here you are looping over the anonymous array that is not expanding.	[reply] [d/l] [select]
Re^3: How foreach loops decide what to iterate through by jasonk (Parson) on Feb 15, 2009 at 18:33 UTC
Essentially what you are doing is something like this: `my @a = (1); for ( my $i = 0 ; $i <= $#a ; $i++ ) { push( @a, 1 ); }` [download] So the loop isn't being restarted or re-evaluated, but because you are extending the thing it's iterating over, it never gets to the end. www.jasonkohles.com We're not surrounded, we're in a target-rich environment!	[reply] [d/l]
Re^4: How foreach loops decide what to iterate through by jvector (Friar) on Feb 15, 2009 at 22:38 UTC
Re^5: How foreach loops decide what to iterate through by ikegami (Patriarch) on Feb 16, 2009 at 02:24 UTC
Re: How foreach loops decide what to iterate through (push array inside foreach) by toolic (Bishop) on Feb 14, 2009 at 02:10 UTC
I notice that the following code runs forever: Adding a print inside the loop might shine some light on it: `foreach (@a) { push @a, 1; print scalar @a, "\n" }` [download] Prints: `2 3 4 5 ...` [download] Apparently, the push causes the array to grow by one element, somehow triggering the loop to iterate again. I found this in the docs for Foreach Loops: If any part of LIST is an array, foreach will get very confused if you add or remove elements within the loop body, for example with splice. So don't do that. Now I am curious as to what perl is doing internally. Can any monks offer a deeper explanation?	[reply] [d/l] [select]
Re^2: How foreach loops decide what to iterate through (push array inside foreach) by ikegami (Patriarch) on Feb 14, 2009 at 03:14 UTC
Now I am curious as to what perl is doing internally. Can any monks offer a deeper explanation? I think it iterates over the array instead of putting it on the stack. You can circumvent the optimisation by changing `for (@a)` to `for ((),@a)`.	[reply] [d/l] [select]
Re^2: How foreach loops decide what to iterate through (push array inside foreach) by jethro (Monsignor) on Feb 14, 2009 at 03:33 UTC
I don't know, but my guess is that perl simply keeps a pointer to the array-element it is at in the loop. I did a few tests with splice and perl did exactly what I expected from it. It looped over the array positions (!! not the array values) until it was outside the array length (because of looping or a shrinking of the array). Would be nice if someone could show me a case where perl gets confused. I could not create one	[reply]
Re^3: How foreach loops decide what to iterate through (push array inside foreach) by ikegami (Patriarch) on Feb 14, 2009 at 03:51 UTC
Would be nice if someone could show me a case where perl gets confused. I could not create one Confused by addition: `>perl -wle"@a=1; for ((),@a) { push @a,$_+1; print }" 1 >perl -wle"@a=1; for (@a) { push @a,$_+1; print }" 1 2 3 4 ...` [download] Confused by deletion: `>perl -wle"@a=(1,2); for (@a) { pop @a; print qq{<$_>} }" <1> >perl -wle"@a=(1,2); for ((),@a) { pop @a; print qq{<$_>} }" <1> Use of freed value in iteration at -e line 1.` [download]	[reply] [d/l] [select]
Re^4: How foreach loops decide what to iterate through (push array inside foreach) by jethro (Monsignor) on Feb 14, 2009 at 05:25 UTC
Re^5: How foreach loops decide what to iterate through (push array inside foreach) by ikegami (Patriarch) on Feb 14, 2009 at 05:37 UTC
Re: How foreach loops decide what to iterate through by bichonfrise74 (Vicar) on Feb 13, 2009 at 23:10 UTC
I would think both methods would just be the same and so I would prefer method 2 since it has less clutter. `foreach (sort @a) { ... }` [download]	[reply] [d/l]
Re: How foreach loops decide what to iterate through by ysth (Canon) on Feb 15, 2009 at 18:50 UTC
In principle, foreach iterates over a list that is constructed before the looping begins, so you should have no concerns about making extra work for each iteration by having a complex expression for the list being iterated over. In practice, iterations over a range (`foreach ($x .. $y)`) or an array (even one built on the fly: `foreach (@{[ sort @x ]})`) are optimized to not need to load the full list ahead of time. Since this optimization shouldn't introduce differences in behaviour, what you are seeing with a push in the loop is a bug. -- Online Fortune Cookie Search Office Space merchandise	[reply] [d/l] [select]


laziness, impatience, and hubris
	PerlMonks