http://qs321.pair.com?node_id=466308


in reply to pattern matching and array comparison

Using a hash is probably a better way to do it. Something like this may be what you want:

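use strict;
use warnings;

my %Ids;

# Load the wanted IDs, one per line, as hash keys.
open my $ids, '<', $ARGV[0] or die "unable to open $ARGV[0]: $!\n";
while (<$ids>) {
    chomp;
    $Ids{$_} = undef;
}
close $ids;

# Each line starting with '>' begins a new record; other lines extend it.
open my $genes, '<', $ARGV[1] or die "unable to open $ARGV[1]: $!\n";
my @genes;
while (<$genes>) {
    if (/^>/) {
        unshift @genes, substr "$_ ", 1;
    }
    else {
        $genes[0] .= $_;
    }
}
close $genes;

# The ID is the header text up to the first comma or whitespace.
# Mark it as seen only if it was listed in the ID file.
foreach my $gene (@genes) {
    my ($id) = $gene =~ /^(.*?)[,\s]/;
    next if !defined $id;
    ++$Ids{$id} if exists $Ids{$id};
}

# Print the listed IDs that were found among the records.
foreach (sort keys %Ids) {
    print "$_\n" if defined $Ids{$_};
}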

This assumes the sample data given in the original node. Prints:

dbj|BA000040|:2701685-2702539
gi|11995001:156374-156649
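To show why a hash helps here, the following is a minimal sketch of the lookup idea on its own, with no file handling. The data in @wanted and @headers is made up for illustration and is not taken from the original node. Building the hash once means each header costs a single lookup rather than a scan of the whole ID list.

```perl
use strict;
use warnings;

# Hypothetical in-memory data standing in for the two input files.
my @wanted  = ('id_a', 'id_c');
my @headers = ('>id_a, first record', '>id_b second record', '>id_c third record');

# Build the lookup table once: one hash key per wanted ID.
my %wanted = map { $_ => 1 } @wanted;

# Extract the ID from each header and keep it only if it is a hash key.
my @found;
for my $header (@headers) {
    my ($id) = $header =~ /^>(.*?)[,\s]/ or next;
    push @found, $id if exists $wanted{$id};
}

print "$_\n" for sort @found;
```

Using exists rather than testing the value avoids creating new keys for IDs that were never in the wanted list.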

Perl is Huffman encoded by design.