Re: How to do popcount (aka Hamming weight) in Perl

by danaj (Friar)
on Sep 26, 2017 at 19:12 UTC


in reply to How to do popcount (aka Hamming weight) in Perl (popcount References)

There are a fair number of modules that do this. For instance, RosettaCode shows:

    use ntheory qw/hammingweight/;
    say hammingweight(1234567);

    use Math::GMPz qw/Rmpz_popcount/;
    say Rmpz_popcount(Math::GMPz->new(1234567));

    use Math::BigInt;
    say 0 + (Math::BigInt->new(1234567)->as_bin() =~ tr/1//);

    use Bit::Vector;
    say Bit::Vector->new_Dec(64,1234567)->Norm;
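
If you don't want a module at all, plain Perl handles native-size integers fine. A minimal sketch of the usual idioms (not one of the benchmark entries below):

    use feature 'say';
    my $n = 1234567;

    # Count the '1' characters in the binary-string form:
    my $bits = sprintf '%b', $n;
    say $bits =~ tr/1//;

    # Or take unpack's bit checksum; pack 'Q' requires a 64-bit Perl:
    say unpack '%32b*', pack 'Q', $n;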

The speed of the underlying C code matters most if you're calling the function from C, where a difference of a few cycles is visible. Once you add Perl overhead it's much less obvious. ntheory, for example, adds an input-validation layer when called from Perl, which means it croaks if you give it undef, "hello", or a strange object reference; it also recognizes bigints and gives you the correct answer. Bit::Fast and BrowserUK's Inline::C code have no input validation -- they coerce the input into a C integer type and proceed whether that makes any sense or not. In many cases validation is just wasted time, so it's a tradeoff.

ntheory's C code is as fast as or faster than BrowserUK's code on 64-bit machines (it uses the popcnt asm instruction if possible, otherwise identical C code). Bit::Fast's C code is, depending on the C compiler, the same speed as ntheory's, the same speed but with incorrect results, slightly slower, or infinitely slower due to a segfault or a failure to compile.
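
To make the validation difference concrete, here is a small sketch; the values are just illustrations of the behaviour described above:

    use feature 'say';
    use ntheory qw/hammingweight/;
    use Math::BigInt;

    # Bigints are recognized and counted correctly:
    say hammingweight(Math::BigInt->new(2)**100 - 1);   # prints 100

    # Garbage input croaks instead of being silently coerced to a C integer:
    eval { hammingweight("hello") };
    say "rejected: $@" if $@;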

Once we start calling from Perl, the differences in overhead become more apparent. Note that BrowserUK's benchmark as posted greatly favors his Inline::C function, since all the others get an extra Perl subroutine call wrapped around them, and Perl sub calls are not at all cheap compared to this function. Correcting that, and adding some modules, I get:

                Rate bigint  GMPz bitvector Three Four  Two ntheory Inline raw_nt bitfast
bigint    4.56e-02/s     --  -93%      -94%  -99% -99% -99%   -100%  -100%  -100%   -100%
GMPz         0.667/s  1361%    --      -18%  -85% -85% -86%    -93%   -95%   -96%    -96%
bitvector    0.811/s  1676%   22%        --  -81% -82% -83%    -92%   -93%   -95%    -95%
Three         4.32/s  9363%  548%      433%    --  -4% -10%    -57%   -65%   -71%    -71%
Four          4.50/s  9763%  575%      455%    4%   --  -6%    -56%   -64%   -70%    -70%
Two           4.81/s 10434%  621%      493%   11%   7%   --    -53%   -61%   -68%    -68%
ntheory       10.1/s 22087% 1419%     1149%  134% 125% 111%      --   -18%   -32%    -33%
Inline        12.4/s 27108% 1763%     1432%  188% 176% 158%     23%     --   -17%    -18%
raw_nt        15.0/s 32729% 2148%     1748%  247% 233% 212%     48%    21%     --     -1%
bitfast       15.1/s 33012% 2167%     1764%  250% 236% 214%     49%    22%     1%      --

Some of these functions have more features than others; e.g. Math::GMPz, bigint, and ntheory will handle bigints correctly. raw_nt is ntheory with a trivial new XS function added that just takes a UV and returns the result of popcnt -- no validation or input-size checks, just like Bit::Fast or the Inline::C function. Not surprisingly, it comes out at the same speed on this machine (both use gcc with asm popcnt support). We are talking about an instruction with a one-cycle latency on newer processors.
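
For reference, here is a minimal sketch of the wrapper-vs-direct effect mentioned before the table; the sub names and sizes are illustrative, not the actual benchmark that produced the numbers above:

    use strict;
    use warnings;
    use Benchmark qw/cmpthese/;
    use ntheory qw/hammingweight/;

    my @nums = map { int rand 2**32 } 1 .. 100_000;

    sub wrapped { hammingweight($_[0]) }    # the extra Perl sub call

    cmpthese(-3, {
        direct  => sub { hammingweight($_) for @nums },  # call the XS sub directly
        wrapped => sub { wrapped($_) for @nums },        # pay for one more sub call per number
    });

The gap between those two rows is purely the cost of that extra call, which is what the corrected numbers above try to remove.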
