http://qs321.pair.com?node_id=11113808


in reply to Re^3: Detecting whether UV fits into an NV
in thread Detecting whether UV fits into an NV

My first guess is that the overhead in SvUV(ST(i)) is causing twiddle to be slower

Yes, I thought of replacing those calls with a local variable, but decided there wouldn't be much difference between reading the value from an SV's IV slot and reading the value of a plain IV.
For a few calls there probably isn't much difference, but when you're making 36 million of them it's not hard to believe that the overhead adds up, and I should have thought that through a little better. (Actually, a "lot better".)

Fixing that alone makes uv_fits_double_bitfiddle almost twice as fast as uv_fits_double3 for me:
Benchmark: timing 1 iterations of uv_fits_double3, uv_fits_double_bitfiddle...
uv_fits_double3: 1 wallclock secs ( 0.45 usr + 0.00 sys = 0.45 CPU) @ 2.21/s (n=1)
            (warning: too few iterations for a reliable count)
uv_fits_double_bitfiddle: 0 wallclock secs ( 0.25 usr + 0.00 sys = 0.25 CPU) @ 4.02/s (n=1)
            (warning: too few iterations for a reliable count)

This is pretty much the type of approach whose existence I had wondered about.
It had never been pointed out to me that iv & -iv would identify the least significant set bit, and I'm certainly not sharp enough to have ever realized it myself.
This method is just brilliant ... and it's great that it turns out to be faster, too!
I'll certainly be using it (with due credit to you) unless further testing, contrary to my expectations, reveals some problem with it.
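
For anyone else who hadn't met the iv & -iv trick, here is a minimal standalone sketch of how I understand it. This is not the actual uv_fits_double_bitfiddle code from this thread: the names lowest_set_bit, fits_double_sketch and self_check are made up for illustration, it uses uint64_t in place of a UV, and it assumes a double with a 53-bit significand (the usual IEEE 754 format).

```c
#include <assert.h>
#include <stdint.h>

/* Isolate the least significant set bit of x (returns 0 when x is 0).
   Unsigned subtraction wraps modulo 2^64, so 0 - x is well defined. */
uint64_t lowest_set_bit(uint64_t x) {
    return x & (0 - x);
}

/* Hypothetical helper: returns 1 if x can be held in a double without
   rounding, ASSUMING a 53-bit significand. The value fits when the
   span from its highest set bit down to its lowest set bit is at
   most 53 bits wide. */
int fits_double_sketch(uint64_t x) {
    if (x == 0) return 1;
    /* Dividing by the lowest set bit (a power of two) strips the
       trailing zero bits; what remains must be below 2^53. */
    return (x / lowest_set_bit(x)) < (UINT64_C(1) << 53);
}

/* A few spot checks of the two helpers above. */
void self_check(void) {
    assert(lowest_set_bit(12) == 4);                          /* 0b1100 -> 0b100 */
    assert(fits_double_sketch(UINT64_C(1) << 53) == 1);       /* 2^53 is exact */
    assert(fits_double_sketch((UINT64_C(1) << 53) + 1) == 0); /* needs 54 bits */
    assert(fits_double_sketch(UINT64_MAX) == 0);              /* needs 64 bits */
}
```

The division is only there to keep the sketch short; a shift by the number of trailing zero bits would do the same job.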

Cheers,
Rob