Indeed. I need to |= the sets both ways:
(Note: I've changed your C12 to Contig12 because it was easier than re-writing the sort, which isn't really necessary anyway, but makes the output nicer.)
#! perl -slw
use strict;
use Data::Dump qw[ pp ];

my %h;
while( <DATA> ) {
    chomp;
    my( $k, $v ) = split;
    ## Record each link in both directions
    push @{ $h{ $k } }, $v;
    push @{ $h{ $v } }, $k;
}

## Sort numerically on the digits following 'Contig'
my @keys = sort{ substr( $a, 6 ) <=> substr( $b, 6 ) } keys %h;

## Give each key a bit position
my $n = 0;
my %offsets = map{ $_ => $n++ } @keys;

## One bit mask per key: its own bit plus the bits of its neighbours.
## 2 bytes = 16 bits, which is enough for the 12 keys here.
my %masks;
for my $k ( @keys ) {
    $masks{ $k } //= chr(0)x2;
    vec( $masks{ $k }, $offsets{ $_ }, 1 ) = 1 for $k, @{ $h{ $k } };
}

## Wherever two masks overlap, OR each into the other (both ways)
for my $i ( 0 .. $#keys ) {
    for my $j ( 0 .. $#keys ) {
        if( ( $masks{ $keys[ $i ] } & $masks{ $keys[ $j ] } ) ne chr(0)x2 ) {
            $masks{ $keys[ $i ] } |= $masks{ $keys[ $j ] };
            $masks{ $keys[ $j ] } |= $masks{ $keys[ $i ] };
        }
    }
}

## Each distinct mask is one group
my %uniq; $uniq{ $_ } = 1 for values %masks;

$n = 0;
for my $group ( keys %uniq ) {
    printf "Group %d : ", ++$n;
    print join ' ', map{
        $keys[ $_ ]
    } grep{
        vec( $group, $_, 1 )
    } 0 .. $#keys;
}
__DATA__
Contig1 Contig2
Contig1 Contig3
Contig2 Contig1
Contig2 Contig3
Contig3 Contig1
Contig3 Contig2
Contig3 Contig4
Contig4 Contig3
Contig4 Contig5
Contig6 Contig7
Contig7 Contig6
Contig8 Contig9
Contig9 Contig10
Contig10 Contig8
Contig10 Contig11
Contig11 Contig10
Contig12 Contig11
Contig12 Contig5
Gives:
c:\test>838787
Group 1 : Contig6 Contig7
Group 2 : Contig1 Contig2 Contig3 Contig4 Contig5 Contig8 Contig9 Contig10 Contig11 Contig12
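
For comparison only: the masks above are fixed at 2 bytes (16 bits), so that code as written assumes no more than 16 distinct keys. If the real data has more, one alternative is a union-find over the key names, which needs no bit positions at all. The following is a minimal sketch of that different technique, not the method used above. The sub name connected_groups and its list-of-pairs interface are made up for illustration, and it keeps the same assumption that every name is 'Contig' followed by a number.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the code above): takes a list of
# [ $a, $b ] pairs and returns the connected groups as array refs.
sub connected_groups {
    my @pairs = @_;
    my %parent;

    # Find the representative of $x, shortening the path as we go.
    my $find = sub {
        my $x = shift;
        $parent{ $x } //= $x;
        while( $parent{ $x } ne $x ) {
            $parent{ $x } = $parent{ $parent{ $x } };
            $x = $parent{ $x };
        }
        return $x;
    };

    # Merge the two groups that each pair links together.
    for my $pair ( @pairs ) {
        my $ra = $find->( $pair->[0] );
        my $rb = $find->( $pair->[1] );
        $parent{ $ra } = $rb if $ra ne $rb;
    }

    # Collect every name under its representative.
    my %groups;
    push @{ $groups{ $find->( $_ ) } }, $_ for keys %parent;

    # Sort members, then groups, on the digits after 'Contig' (assumed prefix).
    my @out = map {
        [ sort { substr( $a, 6 ) <=> substr( $b, 6 ) } @$_ ]
    } values %groups;
    return sort { substr( $a->[0], 6 ) <=> substr( $b->[0], 6 ) } @out;
}

# Usage with a few of the pairs from the data above.
my @groups = connected_groups(
    [ 'Contig1', 'Contig2' ],
    [ 'Contig2', 'Contig3' ],
    [ 'Contig6', 'Contig7' ],
);
my $n = 0;
printf "Group %d : %s\n", ++$n, join ' ', @$_ for @groups;
```

Because a pair is merged whichever way round it appears, the reversed duplicates in the input (Contig2 Contig1 and so on) are harmless but not required.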