http://qs321.pair.com?node_id=638715


in reply to Re^3: Challenge: CPU-optimized byte-wise or-equals (for a meter of beer) (addr math)
in thread Challenge: CPU-optimized byte-wise or-equals (for a meter of beer)

I'd already coded up an XS version that uses memchr for the search instead of your explicit loop. I don't know why, but the memchr version (dio_c) came out nearly twice as fast as the explicit loop (dio_c2): 2118/s against 1172/s. The searching part of each is included below the benchmark.

                    Rate  avar avar2 ikegami_tr avar2_pos corion moritz avar2_pos_inplace dio_c2 dio_c
             avar  109/s    --   -6%       -18%      -40%   -51%   -70%              -84%   -91%  -95%
            avar2  116/s    6%    --       -13%      -36%   -48%   -68%              -83%   -90%  -95%
       ikegami_tr  133/s   22%   15%         --      -27%   -41%   -64%              -81%   -89%  -94%
        avar2_pos  182/s   66%   56%        36%        --   -19%   -50%              -74%   -84%  -91%
           corion  224/s  105%   93%        68%       23%     --   -39%              -67%   -81%  -89%
           moritz  366/s  235%  215%       175%      101%    63%     --              -47%   -69%  -83%
avar2_pos_inplace  686/s  527%  490%       415%      278%   206%    87%                --   -41%  -68%
           dio_c2 1172/s  971%  908%       779%      545%   422%   220%               71%     --  -45%
            dio_c 2118/s 1836% 1722%      1488%     1066%   844%   479%              209%    81%    --

dio_c2

    // Do it
    while ( dpv < dpv_end ) {
        if ( ! *dpv )
            *dpv = *spv;
        ++spv;
        ++dpv;
    }

dio_c

    // Do it
    while ( 1 ) {
        ptr = (char*)memchr( ptr, '\0', ptr_end - ptr );
        if ( ! ( ptr && ptr < ptr_end ) )
            break;
        *ptr = *(ptr - dpv + spv);
        ++ptr;
    }
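For anyone who wants to try the two loops outside of XS, here is a minimal standalone sketch in plain C. The function names or_equals_loop and or_equals_memchr, and the (dst, src, len) parameters, are made up for illustration and are not part of the XS code above; the sketch assumes both buffers are at least len bytes long and leaves out all the SV handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the dio_c2 fragment:
 * visit every byte and copy from src wherever dst holds a NUL. */
static void or_equals_loop(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    char *end = dst + len;
    while (dst < end) {
        if (!*dst)
            *dst = *src;
        ++src;
        ++dst;
    }
}

/* Hypothetical helper mirroring the dio_c fragment:
 * let memchr find the next NUL in dst, then copy the byte
 * at the same offset from src. */
static void or_equals_memchr(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    char *ptr = dst;
    char *end = dst + len;
    while (ptr < end) {
        ptr = memchr(ptr, '\0', (size_t)(end - ptr));
        if (!ptr)
            break;              /* no more NUL bytes in dst */
        *ptr = src[ptr - dst];  /* same offset in the source buffer */
        ++ptr;
    }
}
```

Both functions should leave dst in the same state for the same input. One possible explanation for the speed difference, not verified here, is that libc memchr implementations are often heavily optimized and skip runs of non-NUL bytes faster than a byte-at-a-time loop does.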

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊