Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Re^8: Faster Luhn Check Digit Calculation?

by BrowserUk (Patriarch)
on Dec 02, 2018 at 07:45 UTC ( #1226617=note: print w/replies, xml ) Need Help??


in reply to Re^7: Faster Luhn Check Digit Calculation?
in thread Faster Luhn Check Digit Calculation?

The optomiser eliminates any differences. Yours (c_luhn3 below) can be slightly faster one run and slightly slower on the next; but the difference is less that 0.5% either way:

#! perl -slw use strict; use Inline C => Config => BUILD_NOISY => 1; use Inline C => <<'END_C', NAME => '_luhn', CLEAN_AFTER_BUILD =>0; int c_luhn( char *s ) { int i, total = 0; for( i=0; i < 15; ++i ) { int d = s[ i ] - '0'; if( !( i & 1 ) ) { d *= 2; if( d > 9 ) d -= 9; } total += d; } total *= 9; return total % 10; } int l[] = { 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 }; int c_luhn2( char *s ) { int total = 0; total += l[ s[ 0] - '0' ]; total += s[ 1] - '0'; total += l[ s[ 2] - '0' ]; total += s[ 3] - '0'; total += l[ s[ 4] - '0' ]; total += s[ 5] - '0'; total += l[ s[ 6] - '0' ]; total += s[ 7] - '0'; total += l[ s[ 8] - '0' ]; total += s[ 9] - '0'; total += l[ s[10] - '0' ]; total += s[11] - '0'; total += l[ s[12] - '0' ]; total += s[13] - '0'; total += l[ s[14] - '0' ]; total *= 9; return total % 10; } int ll[] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 4, 6, 8, 1, 3, 5, 7, 9 }; int c_luhn3( char *s ) { return (( ll[ (int)s[ 0] ] + s[ 1] + ll[ (int)s[ 2] ] + s[ 3] + ll[ (int)s[ 4] ] + s[ 5] + ll[ (int)s[ 6] ] + s[ 7] + ll[ (int)s[ 8] ] + s[ 9] + ll[ (int)s[10] ] + s[11] + ll[ (int)s[12] ] + s[13] + ll[ (int)s[14] ] - 7 * '0' ) * 9) % 10; } END_C use Time::HiRes qw[ time ]; my @samples = qw[ 4011350000000008 4011350000000016 4011350000000024 4011350000000032 4011350000000040 4011350000000057 4011350000000065 4011350000000073 4011350000000081 4011350000000099 ]; sub luhn { use integer; my $s = $_[ 0 ]; my $total = 0; for my $i ( 0 .. 14 ) { my $d = substr( $s, $i, 1 ); unless( $i & 1 ) { $d *= 2; $d -= 9 if $d > 9; } $total += $d; } $total *= 9; return chop $total; } for ( @samples ) { print "$_: ", luhn( substr $_, 0, 15 ); } my $start = time; #for ( 401135000000000..401135000999999 ) { # my $chk = luhn( $_ ); #} #printf "Took %.9f seconds.\n", time() - $start; #for ( @samples ) { # print "$_: ", c_luhn( $_ ); #} $start = time; for ( 401135000000000..401135000999999 ) { my $chk = c_luhn( $_ ); } printf "Took %.9f seconds.\n", time() - $start; for ( @samples ) { print "$_: ", c_luhn2( $_ ); } $start = time; for ( 401135000000000..401135000999999 ) { my $chk = c_luhn2( $_ ); } printf "Took %.9f seconds.\n", time() - $start; for ( @samples ) { print "$_: ", c_luhn3( $_ ); } $start = time; for ( 401135000000000..401135000999999 ) { my $chk = c_luhn3( $_ ); } printf "Took %.9f seconds.\n", time() - $start;
C:\test>luhn 4011350000000008: 8 4011350000000016: 6 4011350000000024: 4 4011350000000032: 2 4011350000000040: 0 4011350000000057: 7 4011350000000065: 5 4011350000000073: 3 4011350000000081: 1 4011350000000099: 9 Took 0.751899958 seconds. 4011350000000008: 8 4011350000000016: 6 4011350000000024: 4 4011350000000032: 2 4011350000000040: 0 4011350000000057: 7 4011350000000065: 5 4011350000000073: 3 4011350000000081: 1 4011350000000099: 9 Took 0.695394039 seconds. 4011350000000008: 8 4011350000000016: 6 4011350000000024: 4 4011350000000032: 2 4011350000000040: 0 4011350000000057: 7 4011350000000065: 5 4011350000000073: 3 4011350000000081: 1 4011350000000099: 9 Took 0.693738222 seconds. C:\test>luhn 4011350000000008: 8 4011350000000016: 6 4011350000000024: 4 4011350000000032: 2 4011350000000040: 0 4011350000000057: 7 4011350000000065: 5 4011350000000073: 3 4011350000000081: 1 4011350000000099: 9 Took 0.751657963 seconds. 4011350000000008: 8 4011350000000016: 6 4011350000000024: 4 4011350000000032: 2 4011350000000040: 0 4011350000000057: 7 4011350000000065: 5 4011350000000073: 3 4011350000000081: 1 4011350000000099: 9 Took 0.694734097 seconds. 4011350000000008: 8 4011350000000016: 6 4011350000000024: 4 4011350000000032: 2 4011350000000040: 0 4011350000000057: 7 4011350000000065: 5 4011350000000073: 3 4011350000000081: 1 4011350000000099: 9 Took 0.697062969 seconds.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
In the absence of evidence, opinion is indistinguishable from prejudice. Suck that fhit

Replies are listed 'Best First'.
Re^9: Faster Luhn Check Digit Calculation?
by Corion (Patriarch) on Dec 02, 2018 at 08:06 UTC

    At least according to godbolt.org, the three routines don't get folded into the same assembly code. What's (somewhat) interesting is that gcc and clang create different addressing modes for the accesses, so it might be worth to switch between compilers and compiler versions if calculating the check digits was material to the program operation.

      At least according to godbolt.org, the three routines don't get folded into the same assembly code.

      I didn't really expect that they would be; but the empirical evidence is that they take the same amount of time. The routine is now so stripped down to the essentials that the calling overheads are beginning to dominate.

      For example, the OPs original test code used a numeric range for testing:401135000000000..401135000999999 which I just copy&pasted. That means that every call to the function carries the overhead of converting those integers to strings. Switching the range to strings: '401135000000000' .. '401135000999999' avoids that conversion and saves more time than tybalt's optimisations.

      Other changes that would require knowledge of how that code is used are also more likely to affect the performance. Eg. If the code's purpose is to verify existing CC numbers (rather than constructing new ones) then is is likely that the 16-digits strings are being read from a file or DB. Then the 16-digits are truncated to 15, passed into the routine, the checkdigit is calculated and returned and that is then compared against the 16th digit of the input.

      If the routine was rewritten to accept the 16 digits and return a boolean indicating T/F then several operations external to the routine could be eliminated and overall throughput improved even though the function itself might be a little slower.

      Only the OP can decide what comes next.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
      In the absence of evidence, opinion is indistinguishable from prejudice. Suck that fhit
        "Only the OP can decide what comes next."

        Based on a huge performance difference with C, I think I'll make an XS clone of Algorithm::LUHN so this is more generally useful for others. It introduces some requirements that will slow things down (like this, but it seems to be the most popular module with LUHN10 in it.

        Appreciate all the help from everyone. If I do get get it done, I'll reference/credit all in the module.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1226617]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (2)
As of 2022-01-17 02:15 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    In 2022, my preferred method to securely store passwords is:












    Results (50 votes). Check out past polls.

    Notices?