Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Re^4: Fast Replacement (0.01 seconds)

by davido (Cardinal)
on Jun 14, 2013 at 15:49 UTC ( [id://1038984]=note: print w/replies, xml ) Need Help??


in reply to Re^3: Fast Replacement (0.01 seconds)
in thread Fast Replacement

For what it's worth, depending on how I measure, this is at least ten times faster than my solution. In real life it would be quite a bit more than 10x the speed of my solution -- I just used Benchmark to test, and that required making a copy of the input string on each test iteration so as not to mess with the original. Since the OP wouldn't be making copies (hopefully), that could be factored out, and would make the difference between our two algorithms all the more significant.


Dave

Replies are listed 'Best First'.
Re^5: Fast Replacement (0.01 seconds)
by BrowserUk (Patriarch) on Jun 14, 2013 at 19:42 UTC
    making a copy of the input string on each test iteration so as not to mess with the original.

    Indeed, it is a pig to benchmark. Here's my attempt.

    What I did was have the first iteration do tr[!][\n] and the second tr[\n][!], using a flag to keep track of odd & even. It also shows how the problem some people level at tr -- the need to know the lists at compile time -- can be addressed:

    #! perl -slw use strict; use Benchmark qw[ cmpthese ]; sub makeTR{ eval "sub{ \$_[ 0 ] =~ tr[$_[0]][$_[1]] }"; } our $N //= 10; die "$N must be even and positive" if $N &1 or $N < 2; our $tr1 = makeTR( '!', "\n" ); our $tr2 = makeTR( "\n", '!' ); our $flag = 0; our $s = '1234!' x 55e3; cmpthese $N, { a => q[ if( $flag ) { my( $p, $c ) = ( 0, 50e3 ); 1 while --$c and $p = index $s, "\n", $p; $tr2->( substr $s, 0, $p ); $flag ^= 1; } else { my( $p, $c ) = ( 0, 50e3 ); 1 while --$c and $p = index $s, "!", $p; $tr1->( substr $s, 0, $p ); $flag ^= 1; } ], b => q[ if( $flag ) { $s =~ s/\n(??{ ( $myregexp::count++ < 50000 ) ? '' : '(?!) +' })/!/g; $flag ^= 1; } else { $s =~ s/!(??{ ( $myregexp::count++ < 50000 ) ? '' : '(?!)' + })/\n/g; $flag ^= 1; } ], };

    And the results put tr 5x to 30x times faster, so your benchmark isn't bad at all:

    C:\test>junk71 -N=2 (warning: too few iterations for a reliable count) (warning: too few iterations for a reliable count) s/iter a b a 5.84 -- -85% b 0.899 550% -- C:\test>junk71 -N=4 (warning: too few iterations for a reliable count) s/iter a b a 5.81 -- -92% b 0.492 1081% -- C:\test>junk71 -N=10 s/iter a b a 5.78 -- -95% b 0.273 2013% -- C:\test>junk71 -N=20 s/iter a b a 5.74 -- -97% b 0.176 3167% --

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Wait, please tell me I'm reading the results wrong.... In my benchmarks yours was faster. But in your benchmarks, "a", which is your algorithm, is taking 5.xx seconds per iteration, whereas "b", which is mine, is taking 0.17-0.9 seconds per iteration. Your benchmark seems to be showing the regexp approach winning by a landslide.


      Dave

        Using the eval subroutines was a step too far. Whilst much better than eval for every line, the additional subroutine call still has a substantial impact.

        Going back to hardcoded trs, and you get the picture we were both expecting:

        C:\test\ACA>..\junk71 -N=4 (warning: too few iterations for a reliable count) Rate b a b 1.45/s -- -98% a 85.1/s 5751% -- C:\test\ACA>..\junk71 -N=10 (warning: too few iterations for a reliable count) Rate b a b 2.94/s -- -97% a 91.7/s 3025% -- C:\test\ACA>..\junk71 -N=20 (warning: too few iterations for a reliable count) Rate b a b 4.55/s -- -96% a 116/s 2453% -- C:\test\ACA>..\junk71 -N=50 (warning: too few iterations for a reliable count) Rate b a b 6.67/s -- -95% a 133/s 1900% -- C:\test\ACA>..\junk71 -N=100 Rate b a b 7.93/s -- -95% a 149/s 1776% --

        Updated benchmark code:

        #! perl -slw use strict; use Benchmark qw[ cmpthese ]; sub makeTR{ eval "sub{ \$_[ 0 ] =~ tr[$_[0]][$_[1]] }"; } our $N //= 10; die "$N must be even and positive" if $N &1 or $N < 2; our $tr1 = makeTR( '!', "\n" ); our $tr2 = makeTR( "\n", '!' ); our $flag = 0; our $s = '1234!' x 55e3; cmpthese $N, { a => q[ if( $flag ) { my( $p, $c ) = ( 0, 50e3 ); 1 while --$c and $p = index $s, "\n", $p; $s =~ tr[\n][!]; #$tr2->( substr $s, 0, $p ); $flag ^= 1; } else { my( $p, $c ) = ( 0, 50e3 ); 1 while --$c and $p = index $s, "!", $p; $s =~ tr[!][\n]; #$tr1->( substr $s, 0, $p ); $flag ^= 1; } ], b => q[ if( $flag ) { $s =~ s/\n(??{ ( $myregexp::count++ < 50000 ) ? '' : '(?!) +' })/!/g; $flag ^= 1; } else { $s =~ s/!(??{ ( $myregexp::count++ < 50000 ) ? '' : '(?!)' + })/\n/g; $flag ^= 1; } ], };

        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        ,

        Holy crap! You're right! (I saw what I was expecting to see :( )

        Unless there is some bug I haven't spotted, ...


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1038984]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others learning in the Monastery: (6)
As of 2024-04-20 00:11 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found