PerlMonks  

Re: Faster way to do this?

by mr_mischief (Monsignor)
on Nov 12, 2015 at 21:20 UTC (#1147607)


in reply to Faster way to do this?

There's a temptation to run a benchmark once and call one solution the fastest. It's not that simple. If you want to know what's fastest, you need to run lots of iterations, run that program lots of times, and know your data and environment. Code that's fast enough can usually be found pretty quickly. If you want the fastest you can get, well, your search becomes longer and more complex the shorter and simpler the code you want doing the same amount of work.

Here's a sampling from some intentionally slow methods through to some of the faster methods mentioned. This is 100,000 iterations each time, using the small five-line data section that choroba used in Re: Faster way to do this?:

$ perl foo
                       Rate end_of_substring end_slurped tr_and_end   all tr_and_match while split tr_and_split_redux tr_and_split   tr tr_moar_magic
end_of_substring    56818/s               --         -7%        -7%  -15%         -16%  -37%  -52%               -61%         -62% -63%          -65%
end_slurped         60976/s               7%          --        -1%   -9%         -10%  -32%  -49%               -59%         -60% -60%          -62%
tr_and_end          61350/s               8%          1%         --   -9%          -9%  -32%  -48%               -58%         -60% -60%          -62%
all                 67114/s              18%         10%         9%    --          -1%  -26%  -44%               -54%         -56% -56%          -58%
tr_and_match        67568/s              19%         11%        10%    1%           --  -25%  -43%               -54%         -55% -56%          -58%
while               90090/s              59%         48%        47%   34%          33%    --  -24%               -39%         -41% -41%          -44%
split              119048/s             110%         95%        94%   77%          76%   32%    --               -19%         -21% -23%          -26%
tr_and_split_redux 147059/s             159%        141%       140%  119%         118%   63%   24%                 --          -3%  -4%           -9%
tr_and_split       151515/s             167%        148%       147%  126%         124%   68%   27%                 3%           --  -2%           -6%
tr                 153846/s             171%        152%       151%  129%         128%   71%   29%                 5%           2%   --           -5%
tr_moar_magic      161290/s             184%        165%       163%  140%         139%   79%   35%                10%           6%   5%            --

$ perl foo
                       Rate end_of_substring tr_and_end end_slurped   all tr_and_match while split   tr tr_and_split tr_and_split_redux tr_moar_magic
end_of_substring    59172/s               --        -2%         -5%  -12%         -15%  -37%  -50% -61%         -62%               -62%          -63%
tr_and_end          60606/s               2%         --         -2%  -10%         -13%  -35%  -48% -60%         -61%               -61%          -62%
end_slurped         62112/s               5%         2%          --   -8%         -11%  -34%  -47% -59%         -60%               -60%          -61%
all                 67568/s              14%        11%          9%    --          -3%  -28%  -43% -55%         -56%               -56%          -58%
tr_and_match        69930/s              18%        15%         13%    3%           --  -25%  -41% -54%         -55%               -55%          -57%
while               93458/s              58%        54%         50%   38%          34%    --  -21% -38%         -39%               -39%          -42%
split              117647/s              99%        94%         89%   74%          68%   26%    -- -22%         -24%               -24%          -27%
tr                 151515/s             156%       150%        144%  124%         117%   62%   29%   --          -2%                -2%           -6%
tr_and_split       153846/s             160%       154%        148%  128%         120%   65%   31%   2%           --                -0%           -5%
tr_and_split_redux 153846/s             160%       154%        148%  128%         120%   65%   31%   2%           0%                 --           -5%
tr_moar_magic      161290/s             173%       166%        160%  139%         131%   73%   37%   6%           5%                 5%            --

$ perl foo
                       Rate end_of_substring tr_and_end end_slurped   all tr_and_match while split tr_and_split tr_and_split_redux tr_moar_magic   tr
end_of_substring    58140/s               --        -2%         -6%  -11%         -15%  -38%  -48%         -60%               -60%          -62% -62%
tr_and_end          59172/s               2%         --         -4%   -9%         -14%  -37%  -47%         -60%               -60%          -61% -62%
end_slurped         61728/s               6%         4%          --   -6%         -10%  -34%  -45%         -58%               -58%          -59% -60%
all                 65359/s              12%        10%          6%    --          -5%  -30%  -42%         -56%               -56%          -57% -58%
tr_and_match        68493/s              18%        16%         11%    5%           --  -27%  -39%         -53%               -53%          -55% -55%
while               93458/s              61%        58%         51%   43%          36%    --  -17%         -36%               -36%          -38% -39%
split              112360/s              93%        90%         82%   72%          64%   20%    --         -24%               -24%          -26% -27%
tr_and_split       147059/s             153%       149%        138%  125%         115%   57%   31%           --                -0%           -3%  -4%
tr_and_split_redux 147059/s             153%       149%        138%  125%         115%   57%   31%           0%                 --           -3%  -4%
tr_moar_magic      151515/s             161%       156%        145%  132%         121%   62%   35%           3%                 3%            --  -2%
tr                 153846/s             165%       160%        149%  135%         125%   65%   37%           5%                 5%            2%   --

$ perl foo
                       Rate end_of_substring tr_and_end end_slurped   all tr_and_match while split tr_and_split_redux   tr tr_and_split tr_moar_magic
end_of_substring    58140/s               --        -2%         -4%  -14%         -15%  -37%  -49%               -61% -62%         -62%          -63%
tr_and_end          59524/s               2%         --         -2%  -12%         -12%  -35%  -48%               -60% -61%         -61%          -62%
end_slurped         60606/s               4%         2%          --  -10%         -11%  -34%  -47%               -59% -60%         -60%          -61%
all                 67568/s              16%        14%         11%    --          -1%  -26%  -41%               -55% -55%         -55%          -57%
tr_and_match        68027/s              17%        14%         12%    1%           --  -26%  -41%               -54% -55%         -55%          -56%
while               91743/s              58%        54%         51%   36%          35%    --  -20%               -39% -39%         -39%          -41%
split              114943/s              98%        93%         90%   70%          69%   25%    --               -23% -24%         -24%          -26%
tr_and_split_redux 149254/s             157%       151%        146%  121%         119%   63%   30%                 --  -1%          -1%           -4%
tr                 151515/s             161%       155%        150%  124%         123%   65%   32%                 2%   --          -0%           -3%
tr_and_split       151515/s             161%       155%        150%  124%         123%   65%   32%                 2%   0%           --           -3%
tr_moar_magic      156250/s             169%       162%        158%  131%         130%   70%   36%                 5%   3%           3%            --

$ perl foo
                       Rate end_of_substring tr_and_end end_slurped   all tr_and_match while split tr_and_split_redux   tr tr_and_split tr_moar_magic
end_of_substring    57803/s               --        -4%         -6%  -12%         -14%  -36%  -50%               -62% -63%         -63%          -65%
tr_and_end          60241/s               4%         --         -2%   -8%         -11%  -34%  -48%               -61% -61%         -61%          -64%
end_slurped         61728/s               7%         2%          --   -6%          -9%  -32%  -47%               -60% -60%         -60%          -63%
all                 65359/s              13%         8%          6%    --          -3%  -28%  -44%               -58% -58%         -58%          -61%
tr_and_match        67568/s              17%        12%          9%    3%           --  -26%  -42%               -56% -57%         -57%          -59%
while               90909/s              57%        51%         47%   39%          35%    --  -22%               -41% -42%         -42%          -45%
split              116279/s             101%        93%         88%   78%          72%   28%    --               -24% -26%         -26%          -30%
tr_and_split_redux 153846/s             166%       155%        149%  135%         128%   69%   32%                 --  -2%          -2%           -8%
tr                 156250/s             170%       159%        153%  139%         131%   72%   34%                 2%   --           0%           -6%
tr_and_split       156250/s             170%       159%        153%  139%         131%   72%   34%                 2%   0%           --           -6%
tr_moar_magic      166667/s             188%       177%        170%  155%         147%   83%   43%                 8%   7%           7%            --

See how much things moved around there? Still, it's all the same order of magnitude, so who cares, right? Well, here's the code:

#!/opt/local/bin/perl
use warnings;
use strict;
use Benchmark qw(:all);

my $fh    = *DATA;
my $start = tell DATA;    # start of the data, so each sub can rewind to it

cmpthese(100000, {
    all                => '_all',
    while              => '_while',
    split              => '_split',
    end_of_substring   => '_end_of_substring',
    end_slurped        => '_end_slurped',
    tr_and_match       => '_tr_and_match',
    tr                 => '_tr',
    tr_moar_magic      => '_tr_moar_magic',
    tr_and_split       => '_tr_and_split',
    tr_and_split_redux => '_tr_and_split_redux',
    tr_and_end         => '_tr_and_end',
});

# Line by line, counting the matches of /(M+)/g in list context.
sub _all {
    seek $fh, $start, 0;
    my $count = 0;
    while (<$fh>){
        $count += () = /(M+)/g;
    }
    # print "all: $count\n";
}

# Line by line, incrementing once per match.
sub _while {
    seek $fh, $start, 0;
    my $count=0;
    while(<$fh>){
        while($_=~/(M+)/g){
            $count++;
        }
    }
    # print "while: $count\n";
}

# Line by line, counting each M that is followed by a non-M.
sub _end_of_substring {
    seek $fh, $start, 0;
    my $count = 0;
    while ( <$fh> ) {
        $count += () = m/M[^M]/g;
    }
    # print "end_of_substring: $count\n";
}

# Same match as above, but on the whole input slurped at once.
sub _end_slurped {
    seek $fh, $start, 0;
    local $/;
    $_ = <$fh>;
    my $count = () = m/M[^M]/g;
    # print "end_slurped: $count\n";
}

# Slurp, then split on runs of non-M characters.
sub _split {
    seek $fh, $start, 0;
    local $/;
    my $count = ( split /[^M]+/, <$fh> ) - 1;
    # print "split: $count\n";
}

# Uses tr///r, so this needs 5.14 or later.
# NB: as written, tr/M/M/sr is bound to the value returned by tr/M//
# (the count of every M in $_), not to $_ itself.
sub _tr {
    seek $fh, $start, 0;
    local $/;
    $_ = <$fh>;
    my $count = tr/M// =~ tr/M/M/sr;
    # print "tr: $count\n";
}

# Squeeze each run of M to a single M in place, then count the Ms.
sub _tr_moar_magic {
    seek $fh, $start, 0;
    local $/;
    $_ = <$fh>;
    tr/M/M/s;
    my $count = tr/M//;
    # print "tr_moar_magic: $count\n";
}

# Squeeze every run of non-M characters to a single I, then split on I.
sub _tr_and_split {
    seek $fh, $start, 0;
    local $/;
    $_ = <$fh>;
    tr/M/I/cs;
    my $count = (split /I/) - 1;
    # print "tr_and_split: $count\n";
}

# As above, but also squeeze the runs of M before splitting.
sub _tr_and_split_redux {
    seek $fh, $start, 0;
    local $/;
    $_ = <$fh>;
    tr/M/I/cs;
    tr/M/M/s;
    my $count = (split /I/) - 1;
    # print "tr_and_split_redux: $count\n";
}

# Squeeze the non-M runs first, then count each M followed by a non-M.
sub _tr_and_end {
    seek $fh, $start, 0;
    local $/;
    $_ = <$fh>;
    tr/M/I/cs;
    my $count = () = m/M[^M]/g;
    # print "tr_and_end: $count\n";
}

# Uses tr///r, so this needs 5.14 or later.
sub _tr_and_match {
    seek $fh, $start, 0;
    local $/;
    $_ = <$fh>;
    $_ = tr/M/ /csr;
    my $count = () = m/(M)+/g;
    # print "tr_and_match: $count\n";
}

__DATA__
IIIIIIIIIIIMMMMMMMMMMMMOOOOOOOOOOOOMMMMMMMMMIIIIIIIIIMM
IIIIIIMMMMOOOOOMMMMIIIIIIIIIIIIIMMIIII
MIM
IMI
M

What happens, though, with five times the data? I sloppily copied and pasted, so there are twenty-five lines that are quite similar, but not identical, to what I'd have had if I had repeated the same five lines five times. Here's where that order of magnitude is reached:

$ perl foo
                       Rate end_of_substring tr_and_end end_slurped   all tr_and_match while split tr_and_split_redux tr_and_split tr_moar_magic   tr
end_of_substring    13263/s               --       -12%        -14%  -18%         -32%  -54%  -72%               -89%         -89%          -90% -91%
tr_and_end          15083/s              14%         --         -3%   -7%         -23%  -47%  -68%               -87%         -88%          -88% -89%
end_slurped         15480/s              17%         3%          --   -5%         -21%  -46%  -67%               -87%         -88%          -88% -89%
all                 16234/s              22%         8%          5%    --         -17%  -43%  -66%               -86%         -87%          -88% -89%
tr_and_match        19608/s              48%        30%         27%   21%           --  -32%  -58%               -83%         -84%          -85% -86%
while               28653/s             116%        90%         85%   77%          46%    --  -39%               -76%         -77%          -78% -80%
split               47170/s             256%       213%        205%  191%         141%   65%    --               -60%         -62%          -64% -67%
tr_and_split_redux 117647/s             787%       680%        660%  625%         500%  311%  149%                 --          -6%           -9% -18%
tr_and_split       125000/s             842%       729%        707%  670%         537%  336%  165%                 6%           --           -4% -13%
tr_moar_magic      129870/s             879%       761%        739%  700%         562%  353%  175%                10%           4%            --  -9%
tr                 142857/s             977%       847%        823%  780%         629%  399%  203%                21%          14%           10%   --

$ perl foo
                       Rate end_of_substring tr_and_end end_slurped   all tr_and_match while split tr_and_split_redux tr_and_split tr_moar_magic   tr
end_of_substring    12594/s               --       -16%        -18%  -18%         -35%  -55%  -73%               -89%         -89%          -90% -91%
tr_and_end          15015/s              19%         --         -2%   -3%         -23%  -46%  -68%               -86%         -87%          -88% -89%
end_slurped         15267/s              21%         2%          --   -1%         -21%  -45%  -67%               -86%         -87%          -87% -89%
all                 15432/s              23%         3%          1%    --         -20%  -45%  -67%               -86%         -87%          -87% -89%
tr_and_match        19380/s              54%        29%         27%   26%           --  -30%  -58%               -82%         -83%          -84% -86%
while               27855/s             121%        86%         82%   81%          44%    --  -40%               -75%         -76%          -77% -79%
split               46512/s             269%       210%        205%  201%         140%   67%    --               -58%         -60%          -61% -66%
tr_and_split_redux 109890/s             773%       632%        620%  612%         467%  295%  136%                 --          -4%           -9% -19%
tr_and_split       114943/s             813%       666%        653%  645%         493%  313%  147%                 5%           --           -5% -15%
tr_moar_magic      120482/s             857%       702%        689%  681%         522%  333%  159%                10%           5%            -- -11%
tr                 135135/s             973%       800%        785%  776%         597%  385%  191%                23%          18%           12%   --

$ perl foo
                       Rate end_of_substring   all tr_and_end end_slurped tr_and_match while split tr_and_split_redux tr_and_split tr_moar_magic   tr
end_of_substring    12063/s               --  -16%       -19%        -19%         -36%  -57%  -74%               -89%         -89%          -90% -91%
all                 14430/s              20%    --        -3%         -4%         -24%  -49%  -69%               -87%         -87%          -88% -89%
tr_and_end          14837/s              23%    3%         --         -1%         -22%  -47%  -68%               -86%         -87%          -88% -89%
end_slurped         14970/s              24%    4%         1%          --         -21%  -47%  -68%               -86%         -87%          -87% -88%
tr_and_match        18904/s              57%   31%        27%         26%           --  -33%  -59%               -82%         -83%          -84% -85%
while               28169/s             134%   95%        90%         88%          49%    --  -39%               -74%         -75%          -76% -78%
split               46296/s             284%  221%       212%        209%         145%   64%    --               -57%         -59%          -61% -64%
tr_and_split_redux 107527/s             791%  645%       625%        618%         469%  282%  132%                 --          -5%          -10% -17%
tr_and_split       113636/s             842%  687%       666%        659%         501%  303%  145%                 6%           --           -5% -12%
tr_moar_magic      119048/s             887%  725%       702%        695%         530%  323%  157%                11%           5%            --  -8%
tr                 129870/s             977%  800%       775%        768%         587%  361%  181%                21%          14%            9%   --

$ perl foo
                       Rate end_of_substring tr_and_end end_slurped   all tr_and_match while split tr_and_split_redux tr_and_split tr_moar_magic   tr
end_of_substring    13210/s               --       -13%        -14%  -19%         -33%  -54%  -72%               -89%         -89%          -90% -91%
tr_and_end          15267/s              16%         --         -1%   -6%         -23%  -46%  -68%               -87%         -88%          -88% -89%
end_slurped         15432/s              17%         1%          --   -5%         -22%  -46%  -67%               -87%         -88%          -88% -89%
all                 16287/s              23%         7%          6%    --         -17%  -43%  -66%               -86%         -87%          -87% -88%
tr_and_match        19724/s              49%        29%         28%   21%           --  -31%  -58%               -83%         -84%          -85% -86%
while               28409/s             115%        86%         84%   74%          44%    --  -40%               -76%         -77%          -78% -80%
split               47393/s             259%       210%        207%  191%         140%   67%    --               -60%         -62%          -64% -66%
tr_and_split_redux 119048/s             801%       680%        671%  631%         504%  319%  151%                 --          -5%           -8% -15%
tr_and_split       125000/s             846%       719%        710%  667%         534%  340%  164%                 5%           --           -4% -11%
tr_moar_magic      129870/s             883%       751%        742%  697%         558%  357%  174%                 9%           4%            --  -8%
tr                 140845/s             966%       823%        813%  765%         614%  396%  197%                18%          13%            8%   --

$ perl foo
                       Rate end_of_substring tr_and_end end_slurped   all tr_and_match while split tr_and_split_redux tr_and_split tr_moar_magic   tr
end_of_substring    13175/s               --       -12%        -14%  -18%         -33%  -52%  -72%               -89%         -90%          -90% -91%
tr_and_end          14948/s              13%         --         -3%   -7%         -24%  -46%  -68%               -87%         -88%          -88% -90%
end_slurped         15337/s              16%         3%          --   -4%         -22%  -44%  -67%               -87%         -88%          -88% -89%
all                 16051/s              22%         7%          5%    --         -18%  -42%  -65%               -87%         -87%          -88% -89%
tr_and_match        19569/s              49%        31%         28%   22%           --  -29%  -58%               -84%         -85%          -85% -86%
while               27548/s             109%        84%         80%   72%          41%    --  -40%               -77%         -79%          -79% -81%
split               46296/s             251%       210%        202%  188%         137%   68%    --               -61%         -64%          -64% -68%
tr_and_split_redux 119048/s             804%       696%        676%  642%         508%  332%  157%                 --          -7%           -8% -17%
tr_and_split       128205/s             873%       758%        736%  699%         555%  365%  177%                 8%           --           -1% -10%
tr_moar_magic      129870/s             886%       769%        747%  709%         564%  371%  181%                 9%           1%            --  -9%
tr                 142857/s             984%       856%        831%  790%         630%  419%  209%                20%          11%           10%   --

Do you see how some things are pulling far out ahead? See how they are getting into a more stable order, too? Yet see how the fastest few are still pretty close, comparatively, to one another? Here's that sloppy copy, BTW:

__DATA__ IIIIIIIIIIIMMMMMMMMMMMMOOOOOOOOOOOOMMMMMMMMMIIIIIIIIIMM IIIIIIMMMMOOOOOMMMMIIIIIIIIIIIIIMMIIII MIM IMI M IIIIIIIIIIIIIIIMMMMMMMMMMMMOOOOOOOOOOOOMMMMMMMMMIIIIIIIIIMM IIIIIIMMMMOOOOOMMMMIIIIIIIIIIIIIMMIIII MIM IMI MIIIIIIIIIIMMMMMMMMMMMMOOOOOOOOOOOOMMMMMMMMMIIIIIIIIIMM IIIIIIMMMMOOOOOMMMMIIIIIIIIIIIIIMMIIII MIM IMI MIIIIIIIIIIMMMMMMMMMMMMOOOOOOOOOOOOMMMMMMMMMIIIIIIIIIMM IIIIIIMMMMOOOOOMMMMIIIIIIIIIIIIIMMIIII MIM IMI MIIIIIIIIIIMMMMMMMMMMMMOOOOOOOOOOOOMMMMMMMMMIIIIIIIIIMM IIIIIIMMMMOOOOOMMMMIIIIIIIIIIIIIMMIIII MIM IMI MIIIIIIIIIIMMMMMMMMMMMMOOOOOOOOOOOOMMMMMMMMMIIIIIIIIIMM IIIIIIMMMMOOOOOMMMMIIIIIIIIIIIIIMMIIII MIM IMI M

Here I did the same sort of sloppy copy and paste to go from 25 lines to 125. The top is still pretty stable. The low end is changing, but if you're worried about speed then you're not using any of those by now anyway. It's still an interesting note. More interesting, though, is that the top option, tr, is processing 25 times the data and still doing more than half as many iterations per second as when we started, while the other tr-based options are still doing more than a third. Others that were already the slower options but seemed within arm's reach dropped to a tenth or less of their previous iteration count on five lines:

$ perl foo
                       Rate end_of_substring tr_and_end end_slurped   all tr_and_match while split tr_and_split_redux tr_and_split tr_moar_magic   tr
end_of_substring     2634/s               --       -18%        -18%  -23%         -39%  -58%  -77%               -95%         -96%          -96% -97%
tr_and_end           3194/s              21%         --         -1%   -7%         -26%  -49%  -72%               -94%         -95%          -95% -97%
end_slurped          3227/s              23%         1%          --   -6%         -25%  -48%  -72%               -94%         -95%          -95% -97%
all                  3419/s              30%         7%          6%    --         -21%  -45%  -70%               -94%         -94%          -94% -97%
tr_and_match         4325/s              64%        35%         34%   27%           --  -30%  -62%               -92%         -93%          -93% -96%
while                6207/s             136%        94%         92%   82%          44%    --  -46%               -89%         -90%          -90% -94%
split               11416/s             333%       257%        254%  234%         164%   84%    --               -79%         -81%          -82% -88%
tr_and_split_redux  54054/s            1952%      1592%       1575% 1481%        1150%  771%  374%                 --          -9%          -13% -45%
tr_and_split        59172/s            2147%      1753%       1734% 1631%        1268%  853%  418%                 9%           --           -5% -40%
tr_moar_magic       62112/s            2258%      1845%       1825% 1717%        1336%  901%  444%                15%           5%            -- -37%
tr                  99010/s            3659%      3000%       2968% 2796%        2189% 1495%  767%                83%          67%           59%   --

So far these have all been under Perl 5.16.3, though, and maybe that's skewing the results. So here are a couple from 5.22.0 with the 125 lines for you cutting-edge folks:

$ perl5.22 foo
                       Rate end_of_substring   all tr_and_end end_slurped while tr_and_match split tr_and_split_redux tr_and_split tr_moar_magic   tr
end_of_substring     3160/s               --  -16%       -29%        -30%  -39%         -48%  -72%               -95%         -95%          -95% -97%
all                  3775/s              19%    --       -15%        -17%  -27%         -38%  -67%               -94%         -94%          -94% -96%
tr_and_end           4439/s              40%   18%         --         -2%  -14%         -28%  -61%               -92%         -93%          -93% -96%
end_slurped          4537/s              44%   20%         2%          --  -12%         -26%  -60%               -92%         -93%          -93% -96%
while                5160/s              63%   37%        16%         14%    --         -16%  -54%               -91%         -92%          -92% -95%
tr_and_match         6124/s              94%   62%        38%         35%   19%           --  -46%               -90%         -91%          -91% -94%
split               11274/s             257%  199%       154%        148%  118%          84%    --               -81%         -83%          -83% -89%
tr_and_split_redux  59172/s            1773% 1467%      1233%       1204% 1047%         866%  425%                 --          -8%           -9% -44%
tr_and_split        64516/s            1942% 1609%      1354%       1322% 1150%         954%  472%                 9%           --           -1% -39%
tr_moar_magic       64935/s            1955% 1620%      1363%       1331% 1158%         960%  476%                10%           1%            -- -38%
tr                 105263/s            3232% 2688%      2272%       2220% 1940%        1619%  834%                78%          63%           62%   --

$ perl5.22 foo
                       Rate end_of_substring   all tr_and_end end_slurped while tr_and_match split tr_and_split_redux tr_and_split tr_moar_magic   tr
end_of_substring     3176/s               --  -17%       -31%        -33%  -41%         -50%  -73%               -95%         -95%          -95% -97%
all                  3828/s              21%    --       -17%        -19%  -29%         -39%  -67%               -94%         -94%          -94% -97%
tr_and_end           4598/s              45%   20%         --         -3%  -15%         -27%  -61%               -92%         -93%          -93% -96%
end_slurped          4730/s              49%   24%         3%          --  -12%         -25%  -59%               -92%         -93%          -93% -96%
while                5405/s              70%   41%        18%         14%    --         -14%  -54%               -91%         -92%          -92% -95%
tr_and_match         6313/s              99%   65%        37%         33%   17%           --  -46%               -89%         -90%          -91% -94%
split               11641/s             267%  204%       153%        146%  115%          84%    --               -81%         -82%          -83% -90%
tr_and_split_redux  59880/s            1786% 1464%      1202%       1166% 1008%         849%  414%                 --          -9%          -11% -47%
tr_and_split        65789/s            1972% 1618%      1331%       1291% 1117%         942%  465%                10%           --           -2% -41%
tr_moar_magic       67114/s            2013% 1653%      1360%       1319% 1142%         963%  477%                12%           2%            -- -40%
tr                 112360/s            3438% 2835%      2344%       2275% 1979%        1680%  865%                88%          71%           67%   --

How about 5.14?

$ perl5.14.4 foo
                       Rate end_of_substring end_slurped tr_and_end   all tr_and_match while split tr_and_split_redux tr_and_split tr_moar_magic   tr
end_of_substring     2672/s               --        -19%       -19%  -22%         -38%  -59%  -78%               -95%         -96%          -96% -97%
end_slurped          3298/s              23%          --        -0%   -3%         -23%  -49%  -73%               -94%         -95%          -95% -97%
tr_and_end           3312/s              24%          0%         --   -3%         -23%  -49%  -73%               -94%         -95%          -95% -97%
all                  3406/s              27%          3%         3%    --         -20%  -47%  -72%               -93%         -94%          -94% -97%
tr_and_match         4277/s              60%         30%        29%   26%           --  -34%  -65%               -92%         -93%          -93% -96%
while                6443/s             141%         95%        95%   89%          51%    --  -47%               -88%         -89%          -89% -94%
split               12255/s             359%        272%       270%  260%         187%   90%    --               -77%         -80%          -80% -88%
tr_and_split_redux  52356/s            1859%       1487%      1481% 1437%        1124%  713%  327%                 --         -14%          -15% -48%
tr_and_split        60606/s            2168%       1738%      1730% 1679%        1317%  841%  395%                16%           --           -1% -40%
tr_moar_magic       61350/s            2196%       1760%      1752% 1701%        1334%  852%  401%                17%           1%            -- -39%
tr                 101010/s            3680%       2963%      2949% 2866%        2262% 1468%  724%                93%          67%           65%   --

I happen to also have a 5.18 on this system:

$ perl5.18.4 foo
                       Rate end_of_substring tr_and_end end_slurped   all tr_and_match while split tr_and_split_redux tr_moar_magic tr_and_split   tr
end_of_substring     2651/s               --       -13%        -15%  -16%         -35%  -49%  -77%               -95%          -95%         -95% -98%
tr_and_end           3049/s              15%         --         -3%   -4%         -25%  -42%  -74%               -94%          -95%         -95% -97%
end_slurped          3131/s              18%         3%          --   -1%         -23%  -40%  -73%               -94%          -95%         -95% -97%
all                  3168/s              19%         4%          1%    --         -22%  -40%  -73%               -94%          -95%         -95% -97%
tr_and_match         4060/s              53%        33%         30%   28%           --  -23%  -65%               -92%          -93%         -93% -96%
while                5244/s              98%        72%         67%   66%          29%    --  -55%               -90%          -91%         -91% -95%
split               11710/s             342%       284%        274%  270%         188%  123%    --               -77%          -80%         -80% -89%
tr_and_split_redux  51020/s            1824%      1573%       1530% 1511%        1157%  873%  336%                 --          -13%         -13% -53%
tr_moar_magic       58480/s            2106%      1818%       1768% 1746%        1340% 1015%  399%                15%            --          -1% -46%
tr_and_split        58824/s            2119%      1829%       1779% 1757%        1349% 1022%  402%                15%            1%           -- -45%
tr                 107527/s            3956%      3427%       3334% 3295%        2548% 1951%  818%               111%           84%          83%   --

Well, 5.22 might be too new for some of you, and 5.18 too old. So the Goldilocks 5.20.3 to the rescue:

$ perl5.20 foo
                       Rate end_of_substring   all tr_and_end end_slurped tr_and_match while split tr_and_split_redux tr_and_split tr_moar_magic   tr
end_of_substring     2752/s               --  -13%       -27%        -28%         -42%  -49%  -77%               -95%         -96%          -96% -98%
all                  3159/s              15%    --       -16%        -17%         -33%  -42%  -74%               -94%         -95%          -95% -97%
tr_and_end           3759/s              37%   19%         --         -2%         -20%  -31%  -69%               -93%         -94%          -94% -97%
end_slurped          3828/s              39%   21%         2%          --         -19%  -30%  -68%               -93%         -94%          -94% -97%
tr_and_match         4728/s              72%   50%        26%         23%           --  -13%  -61%               -91%         -92%          -93% -96%
while                5432/s              97%   72%        44%         42%          15%    --  -55%               -90%         -91%          -92% -95%
split               12019/s             337%  281%       220%        214%         154%  121%    --               -78%         -81%          -82% -89%
tr_and_split_redux  55249/s            1908% 1649%      1370%       1343%        1069%  917%  360%                 --         -12%          -15% -51%
tr_and_split        62500/s            2171% 1879%      1563%       1533%        1222% 1051%  420%                13%           --           -4% -44%
tr_moar_magic       65359/s            2275% 1969%      1639%       1607%        1282% 1103%  444%                18%           5%            -- -42%
tr                 112360/s            3983% 3457%      2889%       2835%        2276% 1969%  835%               103%          80%           72%   --

"But...", you may say, "... but, mr_mischief, I have to run against a really old perl. I use 5.8.9, or 5.10.1, or 5.12.5 and I want to see these results for my version." Well, that may well be. However, the program used here won't compile on your version. Which sub causes the problem? The tr version does. Yes, that one, the one that is consistently higher in these results (for this system at least) across multiple versions when the data size is larger. The tr_and_match version doesn't work either. Both rely on the /r flag on tr///, which only exists as of 5.14, so you'll have to find something else.
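For those older perls, one way around the missing /r flag is to copy the string first and then transliterate the copy. This is only a minimal sketch: count_m_runs is a hypothetical helper name, not one of the benchmarked subs, and it trades the /r flag for an explicit copy made when the argument is unpacked.

```perl
use strict;
use warnings;

# Hypothetical helper: count runs of consecutive 'M' without tr///r,
# so it should also compile on perls older than 5.14.
sub count_m_runs {
    my ($text) = @_;           # a copy, so the caller's string is untouched
    $text =~ tr/M/M/s;         # squeeze each run of M down to a single M
    return $text =~ tr/M//;    # an empty replacement list just counts the Ms
}

my $sample = "IIIMMMMOOOMMII\nMIM\nIMI\nM\n";
print count_m_runs($sample), "\n";    # prints 6
```

Whether the extra copy costs more or less than tr///r on a given perl is exactly the kind of thing to measure rather than assume.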

Still, among the subs that are left there are options much faster than others. The code really needs some fixing up if you get warnings, though. 5.8 and 5.10 don't like this code much:

$ perl5.8.9 foo
Use of implicit split to @_ is deprecated at foo line 65.
Use of implicit split to @_ is deprecated at foo line 91.
Use of implicit split to @_ is deprecated at foo line 101.
                       Rate end_of_substring   all tr_and_end end_slurped split while tr_and_split_redux tr_and_split tr_moar_magic
end_of_substring     3911/s               --  -17%       -18%        -21%  -47%  -57%               -72%         -73%          -93%
all                  4686/s              20%    --        -2%         -6%  -36%  -49%               -66%         -67%          -92%
tr_and_end           4796/s              23%    2%         --         -3%  -35%  -47%               -65%         -66%          -92%
end_slurped          4960/s              27%    6%         3%          --  -32%  -45%               -64%         -65%          -91%
split                7348/s              88%   57%        53%         48%    --  -19%               -47%         -48%          -87%
while                9099/s             133%   94%        90%         83%   24%    --               -34%         -36%          -84%
tr_and_split_redux  13831/s             254%  195%       188%        179%   88%   52%                 --          -3%          -76%
tr_and_split        14245/s             264%  204%       197%        187%   94%   57%                 3%           --          -75%
tr_moar_magic       57803/s            1378% 1134%      1105%       1065%  687%  535%               318%         306%            --

$ perl5.10.1 foo
Use of implicit split to @_ is deprecated at foo line 65.
Use of implicit split to @_ is deprecated at foo line 91.
Use of implicit split to @_ is deprecated at foo line 101.
                       Rate end_of_substring tr_and_end end_slurped   all split while tr_and_split tr_and_split_redux tr_moar_magic
end_of_substring     2899/s               --       -20%        -21%  -23%  -54%  -63%         -77%               -77%          -95%
tr_and_end           3634/s              25%         --         -1%   -4%  -42%  -54%         -72%               -72%          -94%
end_slurped          3679/s              27%         1%          --   -3%  -41%  -53%         -71%               -71%          -94%
all                  3779/s              30%         4%          3%    --  -40%  -52%         -70%               -71%          -94%
split                6258/s             116%        72%         70%   66%    --  -20%         -51%               -51%          -90%
while                7843/s             171%       116%        113%  108%   25%    --         -39%               -39%          -87%
tr_and_split        12788/s             341%       252%        248%  238%  104%   63%           --                -0%          -79%
tr_and_split_redux  12837/s             343%       253%        249%  240%  105%   64%           0%                 --          -79%
tr_moar_magic       60606/s            1990%      1568%       1547% 1504%  868%  673%         374%               372%            --

$ perl5.12.5 foo
                       Rate end_of_substring end_slurped tr_and_end   all while split tr_and_split_redux tr_and_split tr_moar_magic
end_of_substring     2596/s               --        -18%       -19%  -21%  -61%  -77%               -95%         -96%          -96%
end_slurped          3185/s              23%          --        -1%   -3%  -52%  -72%               -94%         -95%          -95%
tr_and_end           3214/s              24%          1%         --   -2%  -52%  -72%               -94%         -95%          -95%
all                  3270/s              26%          3%         2%    --  -51%  -71%               -94%         -95%          -95%
while                6693/s             158%        110%       108%  105%    --  -41%               -87%         -89%          -89%
split               11429/s             340%        259%       256%  249%   71%    --               -78%         -81%          -82%
tr_and_split_redux  52910/s            1938%       1561%      1546% 1518%  690%  363%                 --         -12%          -15%
tr_and_split        60241/s            2220%       1792%      1774% 1742%  800%  427%                14%           --           -3%
tr_moar_magic       62112/s            2293%       1850%      1832% 1799%  828%  443%                17%           3%            --

It's really striking how much better some of those perform in more recent versions. That's impressive progress for a "dead language". Some of you may remember when demerphq was working on making the regex engine more efficient. I think that was around 5.10 and 5.12 IIRC. Well, I can't credit it all there, but a speedup of more than 300% on tr_and_split between 5.10.1 and 5.12.5 seems like a good indicator that something has improved. On the other hand, it looks like something in the sub called while may have undergone a performance regression between 5.8 and 5.18 on these builds on this platform, according to this limited data. Yes, this is still very limited data.

If you really care about speed then follow this advice: measure multiple ways, then repeat as needed. First, measure by checking quickly if things are fast enough already. Then measure by profiling to make sure you're looking at the right hot spot to measure further. Measure the complexity of the logic in the source. Measure the complexity of how things are done after your source is processed. Measure the pieces in isolation. Then, measure your actual implementation's runtime again in order to make sure you need to keep looking for things to speed up. Do each of these on several sizes of input which vary in content. Pay the most attention to typical inputs, but also to pathological worst cases.
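As a rough illustration of the "several sizes of input" step, here is a minimal sketch that reruns one comparison at 5, 25, and 125 lines. The helpers make_data, runs_by_regex, and runs_by_tr are hypothetical names made up for this example, the input is built by cycling the five sample lines rather than by sloppy copy and paste, and the half-second CPU budget per candidate is an arbitrary choice to keep the run short.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# The five sample lines from the original DATA section.
my @sample = (
    'IIIIIIIIIIIMMMMMMMMMMMMOOOOOOOOOOOOMMMMMMMMMIIIIIIIIIMM',
    'IIIIIIMMMMOOOOOMMMMIIIIIIIIIIIIIMMIIII',
    'MIM',
    'IMI',
    'M',
);

# Hypothetical helper: build $lines lines of input by cycling the sample.
sub make_data {
    my ($lines) = @_;
    my $text = '';
    $text .= $sample[ $_ % @sample ] . "\n" for 0 .. $lines - 1;
    return $text;
}

# Two candidate ways to count runs of M, each working on its own copy.
sub runs_by_regex { my ($text) = @_; my $n = () = $text =~ /M+/g; return $n }
sub runs_by_tr    { my ($text) = @_; $text =~ tr/M/M/s; return $text =~ tr/M// }

for my $lines (5, 25, 125) {
    my $data = make_data($lines);
    print "--- $lines lines ---\n";
    # A negative count asks Benchmark for at least that many CPU seconds.
    cmpthese(-0.5, {
        regex => sub { runs_by_regex($data) },
        tr    => sub { runs_by_tr($data) },
    });
}
```

Running the whole script several times, and on each perl you care about, gives a feel for how much of any difference is noise.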

Oh, about that varied input? The program bar is the same as the program foo except for the DATA section:

$ tail -8 bar
__DATA__
oevoerfgveoevervevoooooevevevoooevevevevoooor4tghew4rtghgwooorgeqrgferfgoooo
refrorgoregerorrefM
oorreoooooerferfgqerfgrt4hgetryhy5oooo
erfewrfewrfertwgerwGoorefeqwrefrfewer
rfeeorrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrooorr
rferrrrrrrrrrrrroooooooooooooooooooooooooooooooooooooooooooooooooooooorrrrr

$ perl5.22.0 bar
                       Rate while   all split end_of_substring tr_and_match tr_and_end end_slurped tr_and_split_redux tr_moar_magic tr_and_split   tr
while              111111/s    --   -4%   -4%              -6%          -8%       -13%        -18%               -23%          -24%         -26% -26%
all                116279/s    5%    --    0%              -1%          -3%        -9%        -14%               -20%          -21%         -22% -22%
split              116279/s    5%    0%    --              -1%          -3%        -9%        -14%               -20%          -21%         -22% -22%
end_of_substring   117647/s    6%    1%    1%               --          -2%        -8%        -13%               -19%          -20%         -21% -21%
tr_and_match       120482/s    8%    4%    4%               2%           --        -6%        -11%               -17%          -18%         -19% -19%
tr_and_end         128205/s   15%   10%   10%               9%           6%         --         -5%               -12%          -13%         -14% -14%
end_slurped        135135/s   22%   16%   16%              15%          12%         5%          --                -7%           -8%          -9%  -9%
tr_and_split_redux 144928/s   30%   25%   25%              23%          20%        13%          7%                 --           -1%          -3%  -3%
tr_moar_magic      147059/s   32%   26%   26%              25%          22%        15%          9%                 1%            --          -1%  -1%
tr_and_split       149254/s   34%   28%   28%              27%          24%        16%         10%                 3%            1%           --  -0%
tr                 149254/s   34%   28%   28%              27%          24%        16%         10%                 3%            1%           0%   --

See how that carries even less information about differences than the original five lines? Well, with baz the faster ones pull away sooner, in line with what an anonymous monk mentioned in Re: Faster way to do this? about the tr/// operator when 'M' is common in the data:

$ perl5.22.0 baz
                       Rate end_of_substring tr_and_end end_slurped   all while split tr_and_match tr_and_split_redux tr_and_split tr_moar_magic   tr
end_of_substring    19048/s               --        -3%         -4%  -71%  -74%  -75%         -75%               -84%         -86%          -86% -87%
tr_and_end          19608/s               3%         --         -1%  -70%  -74%  -74%         -75%               -84%         -85%          -86% -87%
end_slurped         19881/s               4%         1%          --  -70%  -73%  -74%         -74%               -83%         -85%          -86% -86%
all                 65359/s             243%       233%        229%    --  -12%  -14%         -16%               -45%         -51%          -53% -56%
while               74627/s             292%       281%        275%   14%    --   -2%          -4%               -37%         -44%          -46% -49%
split               76336/s             301%       289%        284%   17%    2%    --          -2%               -36%         -43%          -45% -48%
tr_and_match        77519/s             307%       295%        290%   19%    4%    2%           --               -35%         -42%          -44% -47%
tr_and_split_redux 119048/s             525%       507%        499%   82%   60%   56%          54%                 --         -11%          -14% -19%
tr_and_split       133333/s             600%       580%        571%  104%   79%   75%          72%                12%           --           -4%  -9%
tr_moar_magic      138889/s             629%       608%        599%  112%   86%   82%          79%                17%           4%            --  -6%
tr                 147059/s             672%       650%        640%  125%   97%   93%          90%                24%          10%            6%   --

These three data sets with similar sizes are just about as important to test as varied sizes of input. You'd never know from a single sample the amount of variance in some of these solutions.
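Before comparing timings across data sets like these, it can help to confirm that the candidates actually return the same count on each one. The following is only a sketch under stated assumptions: the three inputs are made up for illustration (same length, different density of 'M') and are not the bar or baz data, and repeat_chunk, runs_by_regex, runs_by_tr, and candidates_agree are hypothetical names.

```perl
use strict;
use warnings;

# Two candidate ways to count runs of M, each working on its own copy.
sub runs_by_regex { my ($text) = @_; my $n = () = $text =~ /M+/g; return $n }
sub runs_by_tr    { my ($text) = @_; $text =~ tr/M/M/s; return $text =~ tr/M// }

# Hypothetical helper: repeat a chunk as a string (not as a list).
sub repeat_chunk { my ($chunk, $times) = @_; return $chunk x $times }

# Made-up inputs: each 1000 characters long, with different 'M' density.
my %inputs = (
    sparse => repeat_chunk(('o' x 99) . 'M', 10),
    mixed  => repeat_chunk('MMMMMooooo', 100),
    dense  => repeat_chunk(('M' x 99) . 'o', 10),
);

# True when both candidates report the same number of runs.
sub candidates_agree {
    my ($text) = @_;
    return runs_by_regex($text) == runs_by_tr($text);
}

for my $name (sort keys %inputs) {
    my $text = $inputs{$name};
    printf "%-6s length=%d runs=%d %s\n",
        $name, length $text, runs_by_tr($text),
        candidates_agree($text) ? 'ok' : 'MISMATCH';
}
```

Once the counts agree, the same hash of inputs can be fed to cmpthese one entry at a time to see how much the ranking depends on content rather than size.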
