http://qs321.pair.com?node_id=817827

remzak has asked for the wisdom of the Perl Monks concerning the following question:

I'm new to perl, yet an old dog with other languages. I'm looking for a way to scan through a very long list of 64-bit integers and find all pairs in the list that share only the least significant bit being set (($num1 & $num2)==1). The code below takes much too long.
# find all integers in the list that do not share any bits except the +lsb. my %MatchedIntegers=(); #my output foreach my $i1 (@LongListOfIntegers) { foreach (@LongListOfIntegers) { #if a match (only least significant bit overlaps), # add to hash of hash, I don't care about the value. $MatchedIntegers{$i1}{$_}=() if (($i1 & $_) == 1) ; } }
Any ideas? It seems that this should be a very fast thing to do. For a list of ~4,000 values, it takes 2 seconds. I want to keep the code in perl (not link to C/C++). I need the output in a hash table, but it could be a hash of arrays instead of a hash of hashes. Thanks,

Replies are listed 'Best First'.
Re: Need a faster way to find matches
by BrowserUk (Patriarch) on Jan 17, 2010 at 11:11 UTC
    1. You can save a small amount by removing a level of scope:
      foreach my $i1 (@LongListOfIntegers) { ($i1 & $_)==1 and undef $MatchedIntegers{$i1}{$_} foreach @LongListOfIntegers; }
    2. A little more by use integer;
      foreach my $i1 (@LongListOfIntegers) { use integer; ($i1 & $_) == 1 and undef $MatchedIntegers{$i1}{$_} foreach @LongListOfIntegers; }
    3. And if the values are evenly distributed, potentially save 75% of the runtime, by pre-filtering out any even values:
      @LongListOfIntegers = grep $_ &1, @LongListOfIntegers; foreach my $i1 (@LongListOfIntegers) { use integer; ($i1 & $_) == 1 and undef $MatchedIntegers{$i1}{$_} foreach @LongListOfIntegers; }

    Beyond that, any kind of categorisation of the values by the bits they have set (other than the LSB) will take far longer to set up.

    It even seems unlikely that you could save much time by moving this into (Inline)C, given you want a hash as the result. To be honest 2 seconds for 16 million comparisons doesn't seem too bad.

    Update: There's probably no need to have results for $x & $y and $y & $x in the hash, and even if there is, there's no need to test both. So,

    @LongListOfIntegers = grep $_ &1, @LongListOfIntegers; foreach my $i1 ( 0 .. $#LongListOfIntegers) { use integer; my $v = $LongListOfIntegers[ $i1 ]; ($v & $_) == 1 and undef $MatchedIntegers{$i1}{$_} foreach @LongListOfIntegers[ $i1+1 .. $#LongListOfIntegers ]; }

    Will save a bit more, though not as much as you'd think.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Ok, using BrowserUk's suggestions we improved the speed by 25.5 percent.

      I am not a perl whiz, and do not know how to "remove a level of scope" in the syntax below.

      "use integer" helps, and more so (i think) when I moved it outside of the loop.

      the "grep $_ &1, @LongListOfIntegers" doesn't do anything since I know the values are all odd.

      The last idea provided the most savings. I do need both values as keys in the hash, since I am using this as a lookup in subsequent steps.

      { use integer; foreach my $i1 ( 0 .. $#LongListOfIntegers) { my $v = $LongListOfIntegers[ $i1 ]; foreach (@LongListOfIntegers[ $i1+1 .. $#LongListOfIntegers]) { if (($v & $_) == 1) { $MatchedIntegers{$v}{$_}=(); $MatchedIntegers{$_}{$v}=(); } } } }
      Thanks. Now I am going to investigate kikuchiyo's comments.
        One last improvement... by moving the inner loop of the above code into the code where I build the qualified list of integers, I was able to take the problem from N^2/2 to N^2/4.

        At first the list is small, only once the list is done is it N long. By doing the checks as I add numbers I am performing 1/4 the number of comparisons.

        Doing without the array slice may improve it:
        { use integer; for my $i1 (0 .. $#LongListOfIntegers) { for my $i2 ($i1+1 .. $#LongListOfIntegers) { my $v = $LongListOfIntegers[$i1]; my $w = $LongListOfIntegers[$i2]; unless ($v & $w & ~1) { $MatchedIntegers{$v}{$w} = 1; $MatchedIntegers{$w}{$v} = 1; } } } }
Re: Need a faster way to find matches
by CountZero (Bishop) on Jan 17, 2010 at 08:17 UTC
    Two comments:
    1. Your test is wrong: it does not check at all whether the least significant bit is set. Consider:
      • print 3 & 5; => 1
      • print 7 & 5; => 5
      In both of these tests the arguments have their LSB set, yet only the first test results in "1".

      The following tests the LSB of a variable: $value & 1

    2. Your inner loop is too long: if you know that both A and B have their significant bits set, then also B and A have their significant bits set By employing this symmetry you can eliminate half of your tests.
    Update: Are you sure you want to have a hash with all the pairs of elements with the LSB set? This will quickly grow to enormous numbers: Assuming your integers are equally divided over odd and even numbers, the number of pairwise permutations will be 2000! (the factorial of 2000) which is larger than 10**499. Even if you can generate 1 million pairs per second, the program will run for more than 10**486 years.

    Updated Update: Actually I was wrong: The number of permutations of subsets k drawn from a set of size n is n!/(n-k)! only, or in this case 2000! / 1998! or 2 000 * 1 999 = 3 998 000. Still a big number, but doable.

    CountZero

    A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      CountZero, I did a poor job of explaining the test. BrowserUk realized what I intended. The list of integers is very specific, and was constructed meeting a rigid set of criteria; it is a pre-filtered list, the only known characteristic is that they are all odd numbers. The algorithm to build the list is quite fast (considering the large number of lines of code), and if the input parameters change the length of this list of integers will shrink or grow. Having said that, the list ranges in size from 10 elements (not as problem) to around 4,000 elements (my problem).
        If they are all odd numbers, then their LSB is always 1, isn't it?
        Why don't you just drop that bit then, and test for the bitwise ANDs of the resulting numbers being 0?

        my @chopped_array = map { $_ >> 1 } @original_array; my %result; foreach my $i1 (0..$#chopped_array) { foreach my $i2 ($i1..$#chopped_array) { $result{ $original_array[$i1] } = $original_array[$i2] unless +$chopped_array[$i1] & $chopped_array[$i2]; } }

        This algorithm is still quadratic, though - you still need to do N^2/2 comparisons for an N-sized array.

        But what if you build an auxiliary hash out of those lsb-shifted-off integers, then for every value simply check whether the bitwise negated value is present among the hash keys?

        my %hash = map { ($_ >> 1) => 1 } @original_array; my %result; foreach (keys %hash) { $result{($_ << 1) + 1} = ~($_ << 1) if $hash{ ~$_ }; }
        Or something like this.

        Beware of bugs, as I have not tested the code above, nor have I proved it correct :)
Re: Need a faster way to find matches
by bobf (Monsignor) on Jan 17, 2010 at 08:00 UTC

    I may be missing something here, but if you are only testing the least significant bit aren't you simply testing whether the integers are even or odd?

    Since there are only two possible groups (even or odd), it seems overkill to use a hash to store all possible combinations of integers in each group. Why not store the odd numbers (since you want the least significant bit set) in an array, then compute the possible combinations afterwards? See Math::Combinatorics for one option.

    Update: If you need the results in a hash, it could be populated while constructing the combinations of odd numbers. I suspect that most of the time (and memory) required for your routine is spent on the storage steps, not on the bitwise comparison itself.

    Update 2: I misread the question, thinking that the comparison only included the last bit. As BrowserUK notes below, this is not correct. Apologies for the misunderstanding.

      only least significant bit overlaps

      Is quite different to just odd or even:

      $x = 0b1111_1110_1101_1100_1011_1010_1001_1000_0111_0110_0101_0100_001 +1_0011;; $y = 0b0000_0001_0010_0011_0100_0101_0110_0111_1000_1001_1010_1011_110 +0_1101;; printf "%064b\n", $x & $y;; 0000000000000000000000000000000000000000000000000000000000000001

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
      Math::Combinatorics will not do the trick: it will give you all the permutations of an array or list, but cannot do nPk where n < k. Use Algorithm::Permute for that.

      CountZero

      A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      BrowserUk gave a great example... I am really looking for the integers in the list that do not share any bits except the lsb.
Partition in to 63 parts
by gam3 (Curate) on Jan 17, 2010 at 21:43 UTC

    The secret is that there are only 63 bits and that any number with a certain bit set can not 'match' any number with that same bit set.

    So all you need to do is get a list of all the elements with a bit set and compare them with all of the numbers with out that bit set.

    By using a hash in place of the list we can do this pretty efficiently.

    1. Create the hash from the list.
    2. get a list of all the numbers with the 2**1 bit set.
    3. remove those numbers from the hash.
    4. compare them against the numbers in the hash
    5. repeat for 2**2 through 2**63
    You do need to fix up the generated hash afterwards, as it will on have half of the keys in it.
    use strict; use vars qw (@LongListOfIntegers %bighash %MatchedIntegers); my $x = 20; push( @LongListOfIntegers, int( rand( 2**8 ) ) ) while ( $x-- ); $bighash{$_} = 0 for @LongListOfIntegers; for my $bit (1..63) { my $n = 2 ** $bit; my @slist = grep( { ($_ & $n) } keys %bighash); delete $bighash{$_} for @slist; my $x = 0; for my $i (@slist) { for my $j (keys %bighash) { $MatchedIntegers{$i}{$j}++ if ( ( $i & $j ) == 1 ); $x++ if ( ( $i & $j) == 1 ); } } } normalize(\%MatchedIntegers); use Data::Dumper; print Dumper \%MatchedIntegers; sub normalize { my $hash = shift; for my $o (keys %{$hash}) { for my $i (keys %{$hash->{$o}}) { $hash->{$i}{$o}++; } } }
    note: It is also easy to split the problem up into 63 problems and use POE or some such to run each on a separate processor -- if you have 63 processors on your computer.
    -- gam3
    A picture is worth a thousand words, but takes 200K.
      Partitioning the problem is a really cool idea, especially since it shows how to easily split up the processing (if only I could). Also I appreciate your code. When I plugged it into my program, it wasn't faster than my starting point. I am guessing it would be faster if my list was larger than 10,000, or if I had a computer with many processors running the algorithm in parallel. At the current size of my list (slightly less than 4,000) the current optimization is the fastest. Just as an FYI, with all the stimulating ideas that were shared, I managed to improve the speed by more than 47 percent.
        Yes it's only real advantage is that you could stick it in POE and use all your cores.

        If the data is not random you might find some advantage in removing the bit that is set the most first.

        If the input data has some bits set more than 1/63 of the time you can get some advantage. You can see this in that if all of the data had a bit set then this algorithm would remove all the entries from the list on the first pass if it checked that bit first.

        But I assume your data is random.

        -- gam3
        A picture is worth a thousand words, but takes 200K.
How to use my two processors
by remzak (Acolyte) on Jan 18, 2010 at 04:06 UTC
    I managed to cut the run time in half (thank you all for your help). During this process, the algorithm was changed in a manner such that there is now opportunity for the logic to be implemented in parallel; previously the logic was strictly sequential.

    I tried to use threads, but the numerous calls to create the threads created way too much overhead for any speed improvement. My program is not memory intensive; the memory usage maxes out at just over 4MB before the program finishes; yet, by using threads the required memory exceeded the 2GB on my machine.

    Dual core machines are very common these days. I was hoping to re-write the program to use both processors. I was thinking of having a service run on one processor (which takes requests to find pairs of numbers) while the main program looks for valid numbers, waits for all the pairs to be found, then finishes the last step of the process.

    I'm using windows (it sounds like perl behaves slightly differently on windows). I need some type of inter-process communication mechanism to submit new requests, eventually wait and receive back the (hash table) results. At this point, I don't need detailed explanations, just your insights and recommendations on which modules are best suited to this problem.

      There are some gains to be had from using threads, but not as much as you might hope for. The following are the results of a single threaded test, and a multi-threaded version operating on the same list using 1, 2, 3 & 4 threads:

      C:\test>817827 Took 1.622 Odd numbers in list: 4000 Pairs found: 99135 C:\test>817827-t -T=1 Took 1.790000 Odd numbers in list: 4000 Pairs found: 99135 C:\test>817827-t -T=2 Took 1.365000 Odd numbers in list: 4000 Pairs found: 99135 C:\test>817827-t -T=3 Took 1.047500 Odd numbers in list: 4000 Pairs found: 99135 C:\test>817827-t -T=4 Took 0.832500 Odd numbers in list: 4000 Pairs found: 99135

      As you can see, the overhead of threading costs you relative to the non-threaded version for the 1-thread case. You make a small gain using two. And further small gains using 3 or 4. At the cost of complexity. The tests were run on a 4-core system.

      Note: These are only test apps; the results produced--a single level hash with composite keys--may not suit your purposes.

      Non-threaded:

      #! perl -slw use strict; use Time::HiRes qw[ time ]; use Data::Dump qw[ pp ]; our $N ||= 4000; our $S ||= 1; srand( $S ); my @list = map 1 | int( rand( 2**32 )), 1 .. $N; my $start = time; my %hash; for my $i ( 0 .. $#list ) { use integer; my $v = $list[ $i ]; ( $v & $_ ) == 1 and undef $hash{ "$i:$_" } for @list[ $i+1 .. $#list ]; } printf "Took %.3f\n", time() - $start; print 'Odd numbers in list: ', scalar @list; print 'Pairs found: ', scalar keys %hash; #<>; pp \%hash;

      Threaded

      #! perl -slw use strict; use threads; use Time::HiRes qw[ time ]; use Data::Dump qw[ pp ]; $|++; our $N ||= 4000; our $T ||= 4; our $S ||= 1; srand( $S ); my @list = map 1|int( rand( 2**32 ) ), 1 .. $N; my $cStart = 0; my $cSize = int( @list / $T ); my $start = time; my @threads = map { my( $t ) = threads->create( sub{ my( $lo, $hi ) = @_; use integer; my $tid = threads->tid; my @keys; for my $i ( 0 .. $#list ) { my $v = $list[ $i ]; my $start = $i > $lo ? $i+1 : $lo; ( $v & $_ ) == 1 and push @keys, "$i:$_" for @list[ $start + .. $hi ]; } return @keys; }, $cStart, $_<($T) ? $cStart + $cSize + 1 : $#list ); $cStart += $cSize; $t; } 1 .. $T; my %hash; for( @threads ) { my @keys = $_->join; undef $hash{ $_ } for @keys; } printf "Took %.6f\n", time() - $start; print 'Odd numbers in list: ', scalar @list; print 'Pairs found: ', scalar keys %hash; <>; pp \%hash;

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        I appreciate your code; your partitioning approach was very helpful to see. I will see if I can incorporate it into my program (I've moved the logic around such that I embedded the search for pairs into the agorithm that creates the list... this one step made big improvements in the performance but it makes using threads more difficult).

        Using your program, my machine starts tailing out at six or seven threads. To get the results below, I added some loops, and ran each number of threads 20 times to get an average. Here are the results:

        Threads Used: 1 Time: 2.193750 Threads Used: 2 Time: 1.646408 Threads Used: 3 Time: 1.418214 Threads Used: 4 Time: 1.311989 Threads Used: 5 Time: 1.228125 Threads Used: 6 Time: 1.221352 Threads Used: 7 Time: 1.218237 Threads Used: 8 Time: 1.211988 Threads Used: 9 Time: 1.233855 Threads Used: 10 Time: 1.212500

        What I would really like to try is to create a parallel process/thread that receives new numbers to "pair-up" while the main process keeps working to build the numbers. The process would need an event mechanism to add a new number and at the end, return its list. I think POE provides just such a framework. This is much more complicated than partitioning the algorithm, so I'll try to leverage what you've shown me first.

Closure & Results
by remzak (Acolyte) on Jan 19, 2010 at 16:12 UTC
    I completed the project (as complete as it will ever be). I ended up using one child thread with a queue to create two parallel processes. It was exactly what I wanted, and it was extremely simple to implement. The main process created the valid numbers; the child process paired them up. The two ran very much in parallel since the pairing algorithm actually ran more efficiently working on a growing list of numbers. If I had more processors, I'm not sure how I'd partition the logic.

    I think I'd have to re-frame the overall algorithm to take advantage of many processors; then, it may have been more efficient to multi-thread and partition like (some of) you suggested.

    The time dropped to 44% of the original (2.25 times faster). Some of the speed came not from the threading, but just learning how to write more efficient perl statements. I learned that there are expensive operations in perl and some very fast operations... I ended up tweaking many, and changing how I was storing the data to take advantage of these realities.

    I am sure the program could be made even faster once I understand perl a bit better.

    Here are the code snippets relevant to the threading:

    # This child thread builds an encoded array of valid pairs. # "valid pairs" do not have any of the same bits set. # Output format is row1_integer:row2_integer my $DataQueue = Thread::Queue->new(); my ($thr) = threads->create({'context' => 'list'}, sub { # I tried to use a shared hash of hashes, but the required work to + explicity define the hash of hashes in the manner required by thread +s::shared was just too expensive. use integer; # This thread uses queues to get its parameter. my @ValidPairs=(); my @output=(); while (my $validNum = $DataQueue->dequeue()) { !($validNum & $_) and push @output, $validNum .':'.$_ foreach +(@ValidPairs); push (@ValidPairs, $validNum ); #must add number to this threa +d's list of valid numbers } return (@output); }); #... Inside loop and logic to build numbers # send this new number to the child thread waiting on queued par +ameters. # The child thread builds the array of valid pairs $DataQueue->enqueue($i); #... After block that builds the numbers $DataQueue->enqueue(undef); #tell the child process to finish what has + been queued and then exit. my @encodedPairedNumbers = $thr->join(); #subsequent code decodes our encoded paired numbers into a hash of has +hes

    Thank you to everyone for your help!