http://qs321.pair.com?node_id=638659


in reply to Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)

[Updated.]

I've been using Inline lately as a way to eval_xs(). Here are two straight-up implementations, one in C and one in C++. I did the C/libc implementation first because it was so incredibly straightforward. I followed up with C++ because I wanted to see whether I could regain Unicode support along the way. I failed at that one: I don't have std::wstring in my g++ and I didn't want to go looking for it. I suspect the C++ version is slower than the C one because std::string(<char*>...) copies the array.
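To illustrate the copy I'm suspicious of, here is a minimal standalone sketch. The helper names wrap_bytes and owns_separate_storage are made up for this example and are not part of the module. It only shows that constructing a std::string from a (pointer, length) pair gives the string its own storage; it does not measure whether that copy accounts for the benchmark gap.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the post): wraps a raw byte buffer in a
// std::string using the (pointer, length) constructor, as dio_cpp does.
// The constructor copies len bytes into storage owned by the string.
std::string wrap_bytes(const char *buf, std::size_t len) {
    return std::string(buf, len);
}

// Hypothetical helper: true when the string's storage is separate from
// the caller's buffer, i.e. the bytes were copied rather than aliased.
bool owns_separate_storage(const std::string &s, const char *buf) {
    return s.data() != buf;
}
```

Because the string owns its bytes, later writes to the original buffer are not visible through it, which is why dio_cpp has to write through the raw SvPVX pointer instead of through the string.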

I noticed that moritz's code didn't pass tests when it was enabled. An off-by-one error, I'm sure.

#          Rate avar2 dio_cpp moritz dio_c
#avar2    215/s    --    -42%   -64%  -90%
#dio_cpp  374/s   73%      --   -38%  -82%
#moritz   598/s  178%     60%     --  -71%
#dio_c   2084/s  868%    458%   248%    --
#
#Later, just the C version with everyone else's
#                     Rate  split1 substr1 ikegami_s  avar avar2 ikegami_tr avar2_pos corion moritz avar2_pos_inplace dio_c
#split1            0.994/s      --    -88%      -91%  -99%  -99%       -99%      -99%  -100%  -100%             -100% -100%
#substr1            8.33/s    738%      --      -28%  -92%  -93%       -94%      -96%   -96%   -98%              -99% -100%
#ikegami_s          11.6/s   1069%     39%        --  -89%  -90%       -91%      -94%   -95%   -97%              -98%  -99%
#avar                102/s  10168%   1125%      778%    --   -9%       -25%      -46%   -56%   -73%              -85%  -96%
#avar2               112/s  11194%   1247%      866%   10%    --       -17%      -41%   -52%   -70%              -84%  -95%
#ikegami_tr          136/s  13548%   1528%     1068%   33%   21%         --      -29%   -42%   -64%              -80%  -94%
#avar2_pos           191/s  19090%   2189%     1542%   87%   70%        41%        --   -18%   -49%              -72%  -92%
#corion              233/s  23354%   2698%     1907%  128%  108%        72%       22%     --   -38%              -66%  -90%
#moritz              374/s  37512%   4387%     3118%  266%  233%       176%       96%    60%     --              -46%  -84%
#avar2_pos_inplace   691/s  69381%   8188%     5844%  577%  515%       409%      262%   196%    85%                --  -70%
#dio_c              2315/s 232762%  27677%    19822% 2168% 1962%      1606%     1113%   893%   519%              235%    --

BEGIN {
    package diotalevi;
    local @diotalevi::ISA;

    my $c_src = <<'XS';
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"
#include <string.h>
#include <string>
using namespace std;

MODULE = diotalevi  PACKAGE = diotalevi

void
dio_c(dsv_rv, ssv_rv)
    SV *dsv_rv
    SV *ssv_rv
  INIT:
    SV *dsv;
    SV *ssv;
    STRLEN len;
    char *dpv;
    char *spv;
    char *ptr;
    char *ptr_end;
  PPCODE:
    if ( ! (    dsv_rv && ssv_rv
             && SvRV( dsv_rv ) && SvRV( ssv_rv )
             && SvPOK(SvRV(dsv_rv)) && SvPOK(SvRV(ssv_rv))))
        croak( "Blaaa" );

    /* Fetch my SVs */
    dsv = SvRV( dsv_rv );
    ssv = SvRV( ssv_rv );

    /* Fetch my (char*)s. */
    dpv = SvPVX( dsv );
    spv = SvPVX( ssv );

    /* Operate only on the minimum length. */
    len = SvCUR( dsv ) < SvCUR( ssv ) ? SvCUR( dsv ) : SvCUR( ssv );

    /* Establish bounds. */
    ptr = dpv;
    ptr_end = dpv + len;

    /* Do it: jump to each NUL byte and copy over the source byte. */
    while ( 1 ) {
        ptr = (char*)memchr( ptr, '\0', ptr_end - ptr );
        if ( ! ( ptr && ptr < ptr_end ) )
            break;

        *ptr = *(ptr - dpv + spv);
        ++ ptr;
    }

    if ( G_VOID != GIMME_V ) {
        XPUSHs( SvRV( ST(0) ) );
    }

static void
dio_cpp( dsv_rv, ssv_rv )
    SV * dsv_rv
    SV * ssv_rv
  PPCODE:
    // Dereference the references.
    SV *dsv = SvRV( dsv_rv );
    SV *ssv = SvRV( ssv_rv );

    // Get mah lengths
    const STRLEN lend = SvCUR( dsv );
    const STRLEN lens = SvCUR( ssv );
    const STRLEN len = min( lend, lens );

    // Make mah strings
    char *tgt = SvPVX( dsv );
    const string dstr( tgt, lend );
    const string sstr( (char const * const)SvPVX(ssv), lens );

    // The Replacements
    string::size_type offset = 0;
    while ( 1 ) {
        offset = dstr.find( '\0', offset );
        // Stop at the shorter length so sstr is never read out of bounds.
        if ( offset == string::npos || offset >= len )
            break;

        tgt[offset] = sstr[offset];
        ++ offset;
    }

    if ( G_VOID != GIMME_V ) {
        XPUSHs( SvRV( ST(0) ) );
    }
XS

    Inline->bind(
        CPP    => $c_src,
        NAME   => 'diotalevi',
        XSMODE => 1,
    );

    *main::dio_c   = \&diotalevi::dio_c;
    *main::dio_cpp = \&diotalevi::dio_cpp;
}
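For anyone who wants to try the memchr loop without Perl or XS in the way, here is a rough standalone sketch of the same idea. The function name fill_nul_bytes and its std::string interface are assumptions made for this example, not part of the module above; the XS version works directly on the SV buffers instead.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical standalone helper (not part of the XS module above).
// For every NUL byte in dst, within the first min(dst.size(), src.size())
// bytes, copy in the byte at the same offset in src.
// Returns the number of bytes overwritten.
std::size_t fill_nul_bytes(std::string &dst, const std::string &src) {
    const std::size_t len = dst.size() < src.size() ? dst.size() : src.size();
    char *const base = &dst[0];
    char *const end = base + len;
    char *ptr = base;
    std::size_t overwritten = 0;

    while (ptr < end) {
        // Let memchr skip straight to the next NUL byte, if any.
        ptr = static_cast<char *>(
            std::memchr(ptr, '\0', static_cast<std::size_t>(end - ptr)));
        if (ptr == nullptr)
            break;

        *ptr = src[static_cast<std::size_t>(ptr - base)];
        ++overwritten;
        ++ptr;
    }
    return overwritten;
}
```

The point of the design is that memchr does the scanning, so the loop body only runs once per NUL byte rather than once per byte of the string.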

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊
