I don't have a fix for you, but I can offer some refactoring advice:
- Load what you need from the annotation file into a hash.
- Process the main file one line at a time; there is no need to hold the whole file in memory.
Here is some code that implements these first thoughts. I have not changed your use of the globals @window and @probe, which would be my next targets for refactoring. I think that once you refactor, you will find the code easier to debug.
open my $annotation_read_handle, '<', $annotation_file
    or die "Cannot open '$annotation_file': $!";
my %annotation_for;
while (my $ad = <$annotation_read_handle> ) {
    # remove the newline so it cannot end up in $name
    chomp $ad;
    # read $an_chrom, $prol, $pror out of $ad
    my (
        $an_chrom, undef, undef,
        $prol,     $pror, undef,
        undef,     undef, $mess,
    ) = split(/\t/, $ad);
    # read $name out of $mess
    my (undef, undef, $name) = split(/;/, $mess);
    # store for future lookups; a later line for the same chromosome
    # overwrites the earlier one
    $annotation_for{$an_chrom} = [ $name, $prol, $pror ];
}
close $annotation_read_handle;
# loop through the main data file
OLC:
while (my $md = <$main_read_handle> ) {
    # remove newlines
    chomp $md;
    # pull out chromosome #, window start, end
    my ($main_chrom, $winl, $winr) = split(/\t/, $md);
    # see if $main_chrom has been annotated
    next OLC if !exists $annotation_for{$main_chrom};
    my $array_ref = $annotation_for{$main_chrom};
    ( my $name, @probe ) = @$array_ref;
    # put the window start, end into array for further processing
    @window = ($winl, $winr);
    # call the range_finding sub to look for matches
    my $return = range_find();
    next OLC if !$return;
    # upon matching, print out the name of the gene along with the original values
    print OUTPUT "$name\t $md\n";
}
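As for the next refactoring step, here is a rough sketch of what dropping the globals might look like. I have not seen your range_find, so everything here is a guess: ranges_overlap and matching_names are names I made up, I am assuming a "match" means the two closed ranges overlap, and I am assuming you would want to keep every annotation for a chromosome (a hash of arrays) rather than a single one. The sample data is invented purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for range_find(): both ranges arrive as arguments
# rather than through the globals @window and @probe.
# ASSUMPTION: a "match" means the closed intervals overlap.
sub ranges_overlap {
    my ($winl, $winr, $prol, $pror) = @_;
    return ($winl <= $pror && $prol <= $winr) ? 1 : 0;
}

# Hypothetical helper: return the names of all annotations on a chromosome
# that overlap the window. Expects the hash to map
# chromosome => [ [ $name, $prol, $pror ], ... ].
sub matching_names {
    my ($annotation_for, $chrom, $winl, $winr) = @_;
    my $entries = $annotation_for->{$chrom} or return;
    return map  { $_->[0] }
           grep { ranges_overlap($winl, $winr, $_->[1], $_->[2]) }
           @$entries;
}

# Made-up sample data, for illustration only.
my %annotation_for = (
    chr1 => [ [ 'geneA', 100, 200 ], [ 'geneB', 500, 600 ] ],
);
print join(',', matching_names(\%annotation_for, 'chr1', 150, 550)), "\n";
```

To build that structure while reading the annotation file, you would push onto the list instead of assigning, e.g. push @{ $annotation_for{$an_chrom} }, [ $name, $prol, $pror ]; the main loop then prints one line per name that matching_names returns. Because the sub takes its inputs as arguments, you can test it on its own with a handful of known ranges, which should help with the debugging.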