http://qs321.pair.com?node_id=838795


in reply to Finding all connected nodes in an all-against-all comparison

I don't really know what you mean by "non-reciprocal edges", but if I understand correctly, you want to find all the connected components in a directed graph, so try this:
use strict;
use warnings;
use Data::Dump qw( pp );

sub find_parts {
    my %graph = %{ shift() };
    my %seen;
    my @parts;
    my $i = 0;
    my $helper;
    $helper = sub {
        my $start = shift;
        return if $seen{$start}++;
        push @{ $parts[$i] }, $start;
        $helper->($_) for @{ $graph{$start} };
    };
    for ( keys %graph ) {
        $helper->($_);
        $i = $#parts + 1;
    }
    undef $helper;
    return @parts;
}

my %graph;
while (<DATA>) {
    my ( $src, $dst ) = split;
    push @{ $graph{$src} }, $dst;
}
pp \%graph;
pp find_parts( \%graph );

__DATA__
Contig1 Contig2
Contig1 Contig3
Contig2 Contig1
Contig2 Contig3
Contig3 Contig1
Contig3 Contig2
Contig3 Contig4
Contig4 Contig3
Contig4 Contig5
Contig6 Contig7
Contig7 Contig6
Contig8 Contig9
Contig9 Contig10
Contig10 Contig8
Contig10 Contig11
Contig11 Contig10
foo bar
bar foo
quux quux
I haven't tested all the edge cases, but this will give you the idea.

Hope that helps.

Update: LanX is right; my code above does not work for the case he has shown. It would have been better to implement something like Tarjan's strongly connected components algorithm.
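
For anyone who wants a starting point, here is a minimal, untested-in-anger sketch of how Tarjan's algorithm could look in Perl. The sub name tarjan_scc and the tiny example graph are my own choices for illustration, not part of the code above; the graph uses the same hash-of-arrays layout that the while loop builds. Unlike find_parts, the grouping here does not depend on which node the search happens to start from.

```perl
use strict;
use warnings;

# Sketch of Tarjan's strongly connected components algorithm (recursive).
# Input:  hashref of adjacency lists, { node => [ successors ] }.
# Output: list of array refs, one per strongly connected component.
sub tarjan_scc {
    my ($graph) = @_;
    my ( %index, %lowlink, %on_stack, @stack, @components );
    my $counter = 0;

    my $visit;
    $visit = sub {
        my ($v) = @_;
        $index{$v} = $lowlink{$v} = $counter++;
        push @stack, $v;
        $on_stack{$v} = 1;

        # Nodes that only ever appear as a destination have no entry
        # in the hash, so fall back to an empty successor list.
        for my $w ( @{ $graph->{$v} || [] } ) {
            if ( !defined $index{$w} ) {
                # Unvisited successor: recurse, then inherit its lowlink.
                $visit->($w);
                $lowlink{$v} = $lowlink{$w} if $lowlink{$w} < $lowlink{$v};
            }
            elsif ( $on_stack{$w} ) {
                # Successor is on the stack, so it is in the current component.
                $lowlink{$v} = $index{$w} if $index{$w} < $lowlink{$v};
            }
        }

        # $v is the root of a component: pop the stack down to it.
        if ( $lowlink{$v} == $index{$v} ) {
            my ( @component, $w );
            do {
                $w = pop @stack;
                $on_stack{$w} = 0;
                push @component, $w;
            } until $w eq $v;
            push @components, [ sort @component ];
        }
    };

    # Sorted only to make the output order repeatable.
    for my $v ( sort keys %$graph ) {
        $visit->($v) unless defined $index{$v};
    }
    undef $visit;    # break the closure's reference to itself
    return @components;
}

# Small example: Contig3 and Contig4 point at each other,
# Contig5 is only reachable from Contig4.
my %example = (
    Contig3 => ['Contig4'],
    Contig4 => [ 'Contig3', 'Contig5' ],
);
print join( ' ', @$_ ), "\n" for tarjan_scc( \%example );
```

In the example, Contig3 and Contig4 end up in one component and Contig5 in a component of its own, because there is no edge leading back from Contig5. Since the sketch is recursive, Perl may emit "Deep recursion" warnings on graphs with very long paths; an iterative version with an explicit stack would avoid that.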