http://qs321.pair.com?node_id=911898


in reply to Using hash keys to separate data

Nearly there. :-)
#!/usr/bin/perl
use warnings;
use strict;

open(KEY, "<hashKey.txt") or die "error reading key list";
open(REG, "<testReg.txt") or die "error reading file";

my %Chr;
while (my $key = <KEY>) {
    chomp $key;
    $Chr{$key} = undef;
}

my %R;
while (my $reg = <REG>) {
    chomp $reg;
    my @reg_split = split("\t", $reg);
    push @{$R{$reg_split[0]}}, $reg;
}

foreach my $key (sort keys %R) {
    next unless exists $Chr{$key};
    for my $out (@{$R{$key}}){
        print "$out\n";
    }
    print q{-} x 20, qq{\n};
}

close(KEY);
close(REG);
Output:

chr1 100 159 0
chr1 200 260 0
chr1 500 750 0
--------------------
chr11 679 687 0
--------------------
chr22 100 200 0
chr22 300 400 0
--------------------
chr3 450 700 0
--------------------
chr4 100 300 0
--------------------
chr7 350 600 0
--------------------
chr9 100 125 0
--------------------

The first while loop creates a lookup table (%Chr). The source file only has 1 field per record so there is no need for the split.

The second while loop creates a hash of arrays (%R) from your input file. The key is the first field (chromosome) and the value is an array of records. That's what the push is doing.
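
If the push looks like magic, here is a minimal sketch of just that step. The three records are made up for illustration and are not from your file.

```perl
use strict;
use warnings;

my %R;

# Sample tab-separated records (invented for this sketch).
my @records = (
    "chr1\t100\t159\t0",
    "chr3\t450\t700\t0",
    "chr1\t200\t260\t0",
);

for my $reg (@records) {
    # The first tab-separated field is the chromosome name.
    my ($chr) = split /\t/, $reg;

    # @{ $R{$chr} } autovivifies an empty array the first time a key is seen,
    # so there is no need to create the array yourself.
    push @{ $R{$chr} }, $reg;
}

# Report how many records ended up under each key.
printf "%s has %d record(s)\n", $_, scalar @{ $R{$_} } for sort keys %R;
```

Each hash value is a reference to an array, which is why the final loop in the script dereferences it with @{$R{$key}}.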

Finally, we print the records for each chromosome if it exists in the lookup table. In your case you want to print to a file rather than to STDOUT as we do here.
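
A rough sketch of the file version follows. I'm assuming you want one file per chromosome named after the key (chr1.txt and so on); that naming, the temporary output directory and the sample data are my assumptions, so adjust to taste.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Sample data standing in for %R and %Chr as built by the script above.
my %R = (
    chr1 => [ "chr1\t100\t159\t0", "chr1\t200\t260\t0" ],
    chr3 => [ "chr3\t450\t700\t0" ],
    chrX => [ "chrX\t1\t2\t0" ],    # not in the lookup table, so it is skipped
);
my %Chr = map { $_ => undef } qw(chr1 chr3);

# Hypothetical output location: a temporary directory removed at exit.
my $out_dir = tempdir(CLEANUP => 1);

foreach my $key (sort keys %R) {
    next unless exists $Chr{$key};

    # One output file per chromosome (the naming scheme is an assumption).
    my $path = "$out_dir/$key.txt";
    open my $out_fh, '>', $path or die "cannot write $path: $!";
    print {$out_fh} "$_\n" for @{ $R{$key} };
    close $out_fh or die "cannot close $path: $!";
}
```

The lexical filehandle and the three-argument open are not required, but they make it harder to clobber a handle by accident when you open one file per key.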

As an aside, you could rewrite the first while loop with map.
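
Something along these lines, as a sketch. It reads from an in-memory string instead of hashKey.txt so it runs on its own; the three keys are sample data.

```perl
use strict;
use warnings;

# Stand-in for hashKey.txt: one chromosome name per line (sample data).
my $key_list = "chr1\nchr3\nchr22\n";
open my $key_fh, '<', \$key_list or die "cannot open in-memory key list: $!";

# Read every line in one go and strip the trailing newlines.
chomp(my @keys = <$key_fh>);
close $key_fh;

# map turns each key into a (key, undef) pair, which builds the lookup table.
my %Chr = map { $_ => undef } @keys;

print join(',', sort keys %Chr), "\n";
```

This slurps the whole key list into memory first, which is fine for a short list of chromosome names.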

Hope that helps.

Update
Reading your question again, I see:

"hashKey.txt gives a list of all the possible chromosome values there could be in a given input file."

If that is the case, why do you need the lookup table? I could see it being useful if there could be values in your input that you weren't interested in.