Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Re: Finding all connected nodes in an all-against-all comparison

by BrowserUk (Patriarch)
on May 06, 2010 at 22:52 UTC ( #838803=note: print w/replies, xml ) Need Help??


in reply to Finding all connected nodes in an all-against-all comparison

#! perl -slw use strict; use Data::Dump qw[ pp ]; my %h; while( <DATA> ) { chomp; my( $k, $v ) = split; push @{ $h{ $k } }, $v; push @{ $h{ $v } }, $k; } my @keys = sort{ substr( $a, 6 ) <=> substr( $b, 6 ) } keys %h; my $n = 0; my %offsets = map{ $_ => $n++ } @keys; my %masks; for my $k ( @keys ) { $masks{ $k } //= chr(0)x2; vec( $masks{ $k }, $offsets{ $_ }, 1 ) = 1 for $k, @{ $h{ $k } }; } for my $i ( 0 .. $#keys ) { for my $j ( 0 .. $#keys ) { if( ( $masks{ $keys[ $i ] } & $masks{ $keys[ $j ] } ) ne chr(0 +)x2 ) { $masks{ $keys[ $i ] } |= $masks{ $keys[ $j ] }; } } } my %uniq; $uniq{ $_ } = 1 for values %masks; $n = 0; for my $group ( keys %uniq ) { printf "Group %d : ", ++$n; print join ' ', map{ $keys[ $_ ] } grep{ vec( $group, $_, 1 ) } 0 .. $#keys; } __DATA__ Contig1 Contig2 Contig1 Contig3 Contig2 Contig1 Contig2 Contig3 Contig3 Contig1 Contig3 Contig2 Contig3 Contig4 Contig4 Contig3 Contig4 Contig5 Contig6 Contig7 Contig7 Contig6 Contig8 Contig9 Contig9 Contig10 Contig10 Contig8 Contig10 Contig11 Contig11 Contig10

Gives:

c:\test>838787 Group 1 : Contig8 Contig9 Contig10 Contig11 Group 2 : Contig6 Contig7 Group 3 : Contig1 Contig2 Contig3 Contig4 Contig5

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: Finding all connected nodes in an all-against-all comparison
by GrandFather (Saint) on May 08, 2010 at 01:41 UTC

    Adding:

    C12 Contig11 C12 Contig5

    to the test data prints:

    Group 1 : Contig6 Contig7 Group 2 : C12 Contig3 Contig4 Contig5 Contig8 Contig9 Contig10 Contig1 +1 Group 3 : C12 Contig1 Contig2 Contig3 Contig4 Contig5 Contig8 Contig9 +Contig10 Contig11

    where something like:

    Group 1 : Contig6 Contig7 Group 2 : C12 Contig1 Contig2 Contig3 Contig4 Contig5 Contig8 Contig9 +Contig10 Contig11

    is expected, at least according to my understanding of the OP's 'connected by at least one edge (including non-reciprocal edges)' criteria.

    True laziness is hard work

      Indeed. I need to |= the sets both ways:

      (Note: I've changed your C12 to Config12 because it was easier than re-writing the sort Which isn't really necessary anyway, but makes the output nicer.)

      #! perl -slw use strict; use Data::Dump qw[ pp ]; my %h; while( <DATA> ) { chomp; my( $k, $v ) = split; push @{ $h{ $k } }, $v; push @{ $h{ $v } }, $k; } my @keys = sort{ substr( $a, 6 ) <=> substr( $b, 6 ) } keys %h; my $n = 0; my %offsets = map{ $_ => $n++ } @keys; my %masks; for my $k ( @keys ) { $masks{ $k } //= chr(0)x2; vec( $masks{ $k }, $offsets{ $_ }, 1 ) = 1 for $k, @{ $h{ $k } }; } for my $i ( 0 .. $#keys ) { for my $j ( 0 .. $#keys ) { if( ( $masks{ $keys[ $i ] } & $masks{ $keys[ $j ] } ) ne chr( +0)x2 ) { $masks{ $keys[ $i ] } |= $masks{ $keys[ $j ] }; $masks{ $keys[ $j ] } |= $masks{ $keys[ $i ] }; } } } my %uniq; $uniq{ $_ } = 1 for values %masks; $n = 0; for my $group ( keys %uniq ) { printf "Group %d : ", ++$n; print join ' ', map{ $keys[ $_ ] } grep{ vec( $group, $_, 1 ) } 0 .. $#keys; } __DATA__ Contig1 Contig2 Contig1 Contig3 Contig2 Contig1 Contig2 Contig3 Contig3 Contig1 Contig3 Contig2 Contig3 Contig4 Contig4 Contig3 Contig4 Contig5 Contig6 Contig7 Contig7 Contig6 Contig8 Contig9 Contig9 Contig10 Contig10 Contig8 Contig10 Contig11 Contig11 Contig10 Contig12 Contig11 Contig12 Contig5

      Gives:

      c:\test>838787 Group 1 : Contig6 Contig7 Group 2 : Contig1 Contig2 Contig3 Contig4 Contig5 Contig8 Contig9 Cont +ig10 Contig11 Contig12

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://838803]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (3)
As of 2022-01-24 19:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    In 2022, my preferred method to securely store passwords is:












    Results (65 votes). Check out past polls.

    Notices?