Benchmarking Quiz

I spent a good deal of time using Benchmark.pm over the last week and have learned a lot. I hope this provides a diversion that most will find interesting and demonstrates some unusual results of the benchmarking process. I suppose that if I had to single out one of the most abused notions of benchmarking, it would have to be that just because a snippet of code performs the most executions per second in a timethis or cmpthese result doesn't necessarily mean that it will perform the best in real world conditions.

It is quite likely that my results will not match those obtained on different computers, versions of Perl, operating systems or other application mixes, but I can say that all results were run multiple times and then averaged.

1. Which of the following snippets averages the most executions per second?

  a) 's/\s*#.*//'
  b) '$_ = $` if /#/'
  c) '($_) = split/#/'
  d) '$_ = substr($_, 0, $-[0]) if /#/'
  e) 'if ((my $p = index($_, "#")) > -1) { substr($_, $p, -1, "") }'
[download]

Answer: d

2. Which snippet from the previous question averages the fewest executions per second?

Answer: c

3. Which of the following snippets can cause a runtime error?

   a) '{}'
   b) ''
   c) ';'
   d) All of the above
   e) None of the above
[download]

Answer: If you answered a or c, more certain would be b. If you answered b, correct for most computers. If you answered d, you have a much better computer than mine. If you answered e, your computer might be a collector's item.

4. What runtime error is most likely produced by the previous question?

Answer: Range iterator outside integer range

5. Which of the following snippets averages the most executions per second?

   a) '$_ = (split/#/)[0]'
   b) '($_) = split/#/'
   c) Too close to tell
[download]

Answer: b, by about 9%

6. Which of the following snippets averages the most executions per second?

   a) '$i = ""; $i = 1 if ($_ % 2) == 0'
   b) 'if (($_ % 2) == 0) {$i = 1} else {$i = ""}'
   c) Too close to tell
[download]

Answer: b, by about 4%

7. Which of the following snippets averages the most executions per second?

   a) '($str1, $str2) = split/:/ if /:/',
   b) 'if (/:/) {$str1 = $`; $str2 = $\'}',
   c) 'if (/:/) {$str1 = substr($_,0, $-[0]); $str2 = substr($_,$+[0])
+}'
   d) '($str1,$str2) = m/(\d*):(\d*)/ if /:/'
   e) Too close to call.
[download]

Answer: e, allowing for at least 3% error.

8. Which of the snippets in the previous question runs the fastest in the following code?

   $_ = "123:45678";
   
   foreach (1..2000000) {
      # snippet goes here ;
   }
[download]

Answer: e, allowing for at least 3% error.

9. Which of the following snippets averages the most executions per second?

   a) 'foreach (1..$n){$m = $_}'         
   b) '$m = $_ for (1..$n)'              
   c) 'for ($i=1;$i<$n+1;$i++) {$m = $i}'
   d) Too close to call.
[download]

Answer: c, by about 10%

10. Which of the snippets from the previous question runs the fastest in the following code?

   my ($i, $m);
   my $n = 5000000;
   # snippet goes here
[download]

Answer: b, by about 7%

--Jim

Comment on Benchmarking Quiz Select or Download Code

Replies are listed 'Best First'.
Re: Benchmarking Quiz by Arguile (Hermit) on Nov 11, 2001 at 16:50 UTC
"Premature optimisation is the root of all evil." -- Donald Knuth Knuth's famous quotation applies to many of these constructs. Especially as readability is often never included in such benchmarks :). I was glad to see your warning at the top of the quiz but I thought I'd add my own. Question 9-10 in particular can be very misleading; especially given that you don't show your method. As it's showing a counting loop, instead of an array iteration loop, it could easily trap the unwary. See Efficient Looping Constructs for some good discussion on this. I'm also a little inclined to question your methodology. Are you using subs with lexicals to make sure they don't have any interaction with each other? I'm not saying you're doing anything wrong, just some of these results suprised me and don't compare with my benchmarks. This could just be differences in our approach and choice of test strings. As a side note, consider `for ($i=1; $i < $n+1; $i++)`. You're doing an extra assignment in there every loop. That slows it down considerably `for ($i=1; $i <= $n; $i++)` is quite a bit faster (and clearer IMO, though not as clear as Perl-ish for). Seeing a foreach along the lines of `for my $i (1..$n)` might have been nice too, and the results might suprise you :). Counting loops have already been covered, but I can't seem to find a good node comparing C style for(;;) loops against Perl style foreach when iterating over lists/arrays (probably their most common usage). So read on if you want a quick benchmark. Read more... (2 kB)	[reply] [d/l] [select]
Re: Re: Benchmarking Quiz by jlongino (Parson) on Nov 12, 2001 at 03:06 UTC
The purpose of the quiz was not to claim the superiority of any one given snippet over another. Only in the context that it is presented. We should question the methodology of others, particularly when they start making claims about efficiency and performance. Take for example the classic example of the bubble sort (which BTW, was mentioned in a reply to a SOPW recently). To say that it is inefficient is a generalization. It can be more efficient than almost any other sort given the right conditions. But it would tick me off if responder had made the generalization and didn't qualify it (they did qualify it). This doesn't happen frequently enough. As for questions 9 & 10, yes they can be misleading (and that was part of the point) but only if you don't examine it closely or over-generalize. Also, if the results were all predictable, it wouldn't make for a very interesting meditation. Any differences in your benchmarks whether seemingly major or minor can have a big impact on the metrics. I realize that you already know this, because you knew enough to question it. What worries me is all the people who see metrics and just blindly accept them. The program used for question 9 was a simple cmpthese: `use strict; use Benchmark qw(cmpthese); cmpthese (-30, { a => 'foreach (1..$n){$m = $_}', b => '$m = $_ for (1..$n)', c => 'for ($i=1;$i<$n+1;$i++) {$m = $i}' });` [download] No, I didn't go to the extremes I did in question 1 (see my reply to blakem above). For question 10, I did however and wrote one program for each snippet. I also ran them for $n=10,000,000 several times and got the same results: `use strict; use Benchmark; my ($i, $m); my $t0 = new Benchmark; $_ = "123:45678"; my $n = 5000000; # snippet goes here. my $t1 = new Benchmark; my $td = timediff($t1, $t0); print "x: ",timestr($td),"\n";` [download] BTW: thanks for adding the reference to turnsteps tutorial. I read it before starting with Benchmark.pm and it does cover the basics (which I intentionally left out due to the length of my original post, a poor decision in retrospect). Another interesting read is How to Lie with Statistics by Darrell Huff, illustrated by Irving Geis. Not that I lied here of course :) --Jim	[reply] [d/l] [select]
Re: Benchmarking Quiz by blakem (Monsignor) on Nov 11, 2001 at 14:24 UTC
I'd be interested to see the actual benchmarking code that gave you these results. Not that I'm clever enough to immediately question any of the numbers, but several of your snippets do have potential benchmark pitfalls that are easy to fall into. For instance, in 1,2, and 5 the input value of $_ is being modified... are you reseting it before each loop? Consider the following: `# snippet 1 $_ = 'abc'; for (1..1000) { sleep 1 if $_ eq 'abc'; $_ = 'def'; } # snippet 2 $_ = 'abc'; for (1..1000) { sleep 2 if $_ eq 'abc'; $_ = 'def'; }` [download] Notice that after the first time through the loop, you aren't actually executing the same code. If you tested this as is, you might come to the conclusion that 'sleep 2' is only fractionaly slower than 'sleep 1' since that 1 second differential is spread out over 1000 loops.... Also, I don't understand what questions 3 and 4 are getting at. Could you give a bit more explanation for them? -Blake	[reply] [d/l]
Re: Re: Benchmarking Quiz by jlongino (Parson) on Nov 12, 2001 at 01:22 UTC
I think you're missing the point of the post. Perhaps I should've given more details on the methods used but I thought that the wording in the questions "the most executions per second" shows that the snippet was evaluated using a timethis or cmpthese mode, generally in absence of any real data values. Also, see my response to Arguile (once I get time to reply). While researching questions 1 & 2 I used both timethis and cmpthese methods in many different programs. I am well aware of the fact that using $POSTMATCH and $PREMATCH can taint the compilation (not just execution) of other regexen in the same program. This was pointed out by shotgunefx in a reply to Support for hash comments on a line. That premise was what sparked my interest in benchmarking. Not that I didn't believe shotgunefx, because he was echoing information also stated in Programming Perl. I wanted to know to what extent and see some metrics. BTW, a->fastolfe, b->me, c->buckaduck, d->blakem and e->demerphq. The following program is what I used for the results of questions 1 & 2: use strict; use Benchmark; my %snippet = ( a => 's/\s#.//', b => '$_ = $` if /#/', c => 'my ($line) = split/#/', d => '$_ = substr($_, 0, $-[0]) if /#/', e => 'if ((my $p = index($_, "#")) > -1) { substr($_, $p, -1, "") } +' ); if (exists $snippet{$ARGV[0]}) { print "Processing $ARGV[0] snippet: '", $snippet{$ARGV[0]}, "'\n"; timethis (-10, $snippet{$ARGV[0]}); } else { print "Missing or invalid parameter to single1.p\n\n"; } [download] The results table (note that since I was using cmpthis in separate programs I compiled the following table using an excel spreadsheet from the averaged results of three runs each): `rate c e a b d c 521648/s -- -4% -61% -68% -68% e 543110/s 4% -- -59% -66% -67% a 1335220/s 156% 146% -- -17% -19% b 1614068/s 209% 197% 21% -- -2% d 1639269/s 214% 202% 23% 2% --` [download] Also, I suppose I could have put that b was a correct response as well since I had specified a 3% error margin elsewhere, but then I probably ran the benchmarks from this program over 30 times with very similar results. I'll also add that these benchmarks were done after ending all tasks except for Explorer and Systray on Windows 98SE. This boosted the executions per second and lessened the margin of error between runs but did not change the final percentages much. My experiments with using cmpthese instead: use strict; use Benchmark qw(cmpthese); cmpthese (-10, { a => 's/\s#.//', b => '$_ = $` if /#/', c => 'my ($line) = split/#/', d => '$_ = substr($_, 0, $-[0]) if /#/', e => 'if ((my $p = index($_, "#")) > -1) { substr($_, $p, -1, "") } +' }); [download] Points out the fact that, had I used cmpthese instead of the timethis method, I could have skewed the results in my favor (b is my snippet), but of course knowing that someone might question the results I opted for the honest approach (not to say that I may have still erred unknowingly). Notice however in the three consecutive executions below that the only snippets significantly affected as far as questions 1 & 2 are concerned were mine and yours (d is blakems snippet). I originally had more questions based on those snippets but thought they were getting too tedious and omitted them. Yes a is slowed by about 3-4% (a is fastolfes snippet) but it is irrelevant to questions 1 & 2. Rate c e a d b c 518715/s -- -2% -62% -67% -68% e 530337/s 2% -- -61% -66% -67% a 1364925/s 163% 157% -- -14% -15% d 1582165/s 205% 198% 16% -- -1% b 1600284/s 209% 202% 17% 1% -- Rate c e a d b c 508712/s -- -5% -63% -68% -69% e 532810/s 5% -- -61% -66% -68% a 1365789/s 168% 156% -- -14% -17% d 1589576/s 212% 198% 16% -- -3% b 1639949/s 222% 208% 20% 3% -- Rate c e a d b c 509240/s -- -6% -63% -68% -68% e 544492/s 7% -- -60% -66% -66% a 1365411/s 168% 151% -- -15% -15% d 1599654/s 214% 194% 17% -- -1% b 1608742/s 216% 195% 18% 1% -- [download] As for questions 3 & 4, I ran the '' snippet trying to determine some sort of baseline for the execution results and out of pure curiosity. For a one second timethis method the benchmark ran for about 5 minutes or so (I didn't time it by the wall clock) and then produced the error displayed. I interpret (maybe erroneously) it to mean that it was incrementing so quickly that it exceeded the integer range before the 1 second time interval elapsed. The same holds true for the ';' snippet since it is essentially the same as '' once compiled (an assumption). The '{}' snippet on the other hand completed in the 1 second time interval on my computer. Meaning that if someone runs the same benchmark on a faster CPU (an assumption) they will probably get the runtime error on all three snippets. On a slower CPU (again, an assumption), it is possible that all three snippets would complete without error. I also ran some 'real world' benchmarks to determine which of the snippets in questions 1 & 2 ran the fastest on 20MB data sets and came up with interesting results, but as you can see this node is already overlong. If there are people interested in seeing those results, /msg me and I'll post them on my scratch pad. I don't want to bore everyone with tons of programs and metrics without knowing if there was enough interest to warrant it. --Jim	[reply] [d/l] [select]