
Re: How to do popcount (aka Hamming weight) in Perl

by marioroy (Parson)
on Sep 24, 2017 at 16:12 UTC (#1199998)


in reply to How to do popcount (aka Hamming weight) in Perl

Update 2: Thanks Dana, the popcnt function moved to util.h in Math::Prime::Util v0.62.

Update 1: Added links to popcount.cpp and popcnt. Dana replaced popcnt with mpu_popcount_string in Math::Prime::Util v0.62.

Found in mce-sandbox/src/bits.h and used here. I had help from reading popcount.cpp from primesieve.org and util.c (popcnt) from Math::Prime::Util <= v0.61. Although the following is tailored for counting set bits inside a string, I'm sharing the code to showcase 64-bit and 32-bit support inside a single function, selected at compile time by the predefined __LP64__ macro.

#ifndef BITS_H
#define BITS_H

#include <stdint.h>

typedef unsigned char byte_t;

/* Number of set bits in each possible byte value. */
static const int popcnt_byte[256] = {
  0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,
  1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,
  1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,
  2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,
  1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,
  2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,
  2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,
  3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,4,5,5,6,5,6,6,7,5,6,6,7,6,7,7,8
};

/* Count the set bits in the first `size` bytes of bytearray. */
static uint64_t popcount(const byte_t *bytearray, uint64_t size)
{
  uint64_t asize, i, count = 0;

  if (bytearray == 0 || size == 0)
    return count;

  if (size > 8) {
    /* Process all but the last (full or partial) word with bit tricks. */
#ifdef __LP64__
    static const uint64_t m1  = UINT64_C(0x5555555555555555);
    static const uint64_t m2  = UINT64_C(0x3333333333333333);
    static const uint64_t m4  = UINT64_C(0x0f0f0f0f0f0f0f0f);
    static const uint64_t h01 = UINT64_C(0x0101010101010101);

    const uint64_t *a = (uint64_t *) bytearray;

    asize = (size + 7) / 8 - 1;

    for (i = 0; i < asize; i++) {
      uint64_t b = a[i];
      b =  b       - ((b >> 1)  & m1);   /* bit count of each 2-bit field */
      b = (b & m2) + ((b >> 2)  & m2);   /* bit count of each 4-bit field */
      b = (b       +  (b >> 4)) & m4;    /* bit count of each byte */
      count += (b * h01) >> 56;          /* sum of the 8 byte counts */
    }

    i = asize * 8;
#else
    static const uint32_t m1  = UINT32_C(0x55555555);
    static const uint32_t m2  = UINT32_C(0x33333333);
    static const uint32_t m4  = UINT32_C(0x0f0f0f0f);
    static const uint32_t h01 = UINT32_C(0x01010101);

    const uint32_t *a = (uint32_t *) bytearray;

    asize = (size + 3) / 4 - 1;

    for (i = 0; i < asize; i++) {
      uint32_t b = a[i];
      b =  b       - ((b >> 1)  & m1);   /* bit count of each 2-bit field */
      b = (b & m2) + ((b >> 2)  & m2);   /* bit count of each 4-bit field */
      b = (b       +  (b >> 4)) & m4;    /* bit count of each byte */
      count += (b * h01) >> 24;          /* sum of the 4 byte counts */
    }

    i = asize * 4;
#endif
  }
  else
    i = 0;

  /* Remaining bytes (or all of them when size <= 8) use the lookup table. */
  for (; i < size; i++)
    count += popcnt_byte[bytearray[i]];

  return count;
}

#endif
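For anyone who wants to poke at the bit-twiddling steps in isolation, here is a minimal sketch that applies the same four steps to a single 64-bit word and checks them against a plain loop. The names swar_popcount64, naive_popcount64 and popcount_buf are made up for this illustration and are not part of bits.h. The buffer helper loads each word with memcpy instead of casting the pointer, which is an assumption on my part for buffers that may not be 8-byte aligned, not what the header above does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative only: the four SWAR steps applied to one 64-bit word. */
static unsigned swar_popcount64(uint64_t b)
{
    b = b - ((b >> 1) & UINT64_C(0x5555555555555555));            /* 2-bit field counts */
    b = (b & UINT64_C(0x3333333333333333))
      + ((b >> 2) & UINT64_C(0x3333333333333333));                /* 4-bit field counts */
    b = (b + (b >> 4)) & UINT64_C(0x0f0f0f0f0f0f0f0f);            /* per-byte counts */
    return (unsigned)((b * UINT64_C(0x0101010101010101)) >> 56);  /* sum of the 8 bytes */
}

/* Reference: clear the lowest set bit until the word is zero. */
static unsigned naive_popcount64(uint64_t b)
{
    unsigned n = 0;
    while (b) {
        b &= b - 1;
        n++;
    }
    return n;
}

/* Hypothetical helper: count bits in a buffer of any alignment.
   Whole 8-byte words go through the SWAR routine, the tail goes
   through the reference loop one byte at a time. */
static uint64_t popcount_buf(const unsigned char *p, size_t size)
{
    uint64_t count = 0, w;
    size_t i = 0;

    for (; i + 8 <= size; i += 8) {
        memcpy(&w, p + i, 8);   /* alignment-safe load */
        count += swar_popcount64(w);
    }
    for (; i < size; i++)
        count += naive_popcount64(p[i]);

    return count;
}
```

Byte order does not matter for the count, so the memcpy load gives the same total on little-endian and big-endian machines.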

Regards, Mario

Re^2: How to do popcount (aka Hamming weight) in Perl
by danaj (Friar) on Sep 26, 2017 at 17:04 UTC

    Dana replaced popcnt with mpu_popcount_string in Math::Prime::Util v0.62.

    I moved the native popcnt to util.h. mpu_popcount_string is a new function to handle bigints or sufficiently magic input without using a bigint library. It's faster than getting Math::BigInt involved until the numbers get over 500 digits. It should be optimized, but it's not exactly the common case.
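To make the "without a bigint library" idea concrete, here is a rough C sketch of one way to count the set bits of a decimal digit string: halve the number in place with schoolbook long division and add up the remainders. This is not the Math::Prime::Util code, and popcount_decimal is a hypothetical name used only for this illustration. It is quadratic in the number of digits, so treat it as a demonstration of the idea rather than something tuned.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the Math::Prime::Util implementation.
   Counts the set bits of a non-negative decimal digit string.
   Returns -1 on empty or malformed input, or if malloc fails. */
static long popcount_decimal(const char *s)
{
    size_t len = strlen(s), start = 0, i;
    long count = 0;
    unsigned char *d;

    if (len == 0)
        return -1;
    d = malloc(len);
    if (d == NULL)
        return -1;

    /* Copy the digits as small integers, rejecting anything else. */
    for (i = 0; i < len; i++) {
        if (s[i] < '0' || s[i] > '9') {
            free(d);
            return -1;
        }
        d[i] = (unsigned char)(s[i] - '0');
    }

    while (start < len) {
        unsigned rem = 0;

        /* Long division of d[start..len-1] by 2, most significant digit first. */
        for (i = start; i < len; i++) {
            unsigned cur = rem * 10 + d[i];
            d[i] = (unsigned char)(cur / 2);
            rem = cur % 2;
        }
        count += rem;   /* the remainder is the current lowest bit */

        /* Skip leading zeros so the number shrinks as it is halved. */
        while (start < len && d[start] == 0)
            start++;
    }

    free(d);
    return count;
}
```

For example, popcount_decimal("255") gives 8 and popcount_decimal("18446744073709551616") (that is 2**64) gives 1, with no 64-bit arithmetic involved.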

    It's also useful for making sure we don't get really slow for 64-bit numbers on a 32-bit Perl. Not as fast as forcing 64-bit code, but that's further narrowing down the space -- people on 64-bit machines who install a 32-bit Perl.

    Using Math::BigInt to handle a long digit string would be something like:

    use feature 'say';
    use Math::BigInt;
    my $n = "16" x 1000;
    say 0 + (Math::BigInt->new("$n")->as_bin() =~ tr/1//);

    where only the last part is needed if it's already a bigint. Math::GMPz has Rmpz_popcount which is ridiculously fast if the input is already a Math::GMPz object.
