UPDATE: Solved. I removed the if (defined $seq) line.
Hello monks,
I have been working through a bio Perl challenge site, so meta hints are welcome, too. I am having trouble reading a file into a hash properly. My data is a FASTA file in this format:
>sequence_5849
CCTGCGGAAGATCGGCACTAGAATAGCCAGAACCGTTTCTCTGAGGCTTCCGGCCTTCCC
TCCCACTAATAATTCTGAGG
>sequence_5959
CCATCGGTAGCGCATCCTTAGTCCAATTAAGTCCCTATCCAGGCGCTCCGCCGAAGGTCT
ATATCCATTTGTCAGCAGACACGC
>sequence_0808
CCACCCTCGTGGTATGGCTAGGCATTCAGGAACCGGAGAACGCTTCAGACCAGCCCGGAC
TGGGAACCTGCGGGCAGTAGGTGGAAT
My code is:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use feature qw(say);

my $file = 'file.txt';
open (my $fh, '<', $file) or die "Could not open file '$file' $!";

my (%sequence_hash, $header, $seq, $count);
while ( my $line = <$fh> ) {
    chomp($line);
    if ( $line =~ m/^>(.*)/ ) {
        if ( $seq ) {
            say $seq;
            $sequence_hash{$header} = $seq;
        }
        $header = $1;
        $seq = '';
    }
    else {
        $seq .= $line;
    }
}
close $fh;

print Dumper(\%sequence_hash);
My problem is that I am not getting the headers and sequences into the hash. I have a feeling that it's because I clear out the $seq variable, but I am not sure how else to get the headers and sequences into the hash. Any insight would be highly appreciated :)
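
One thing that stands out in the code as posted: a record is only stored when the next header line shows up, so the last record in the file never reaches the hash. Here is a minimal sketch of one way around that, appending each sequence line straight onto the hash entry for the current header so nothing has to be flushed at end of file. The sub name parse_fasta and the in-memory filehandle are just for illustration and are not from the original code; swap in the real file as needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_fasta is a hypothetical helper name used only for this sketch.
# It reads FASTA records from an open filehandle and returns a hash
# reference mapping each header (without the leading '>') to its sequence.
sub parse_fasta {
    my ($fh) = @_;
    my ( %sequence_hash, $header );

    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line =~ /^\s*$/;    # skip blank lines

        if ( $line =~ m/^>(.*)/ ) {
            $header = $1;                    # start a new record
            $sequence_hash{$header} = '';
        }
        elsif ( defined $header ) {
            # Append to the current record; lines before the first
            # header are ignored.
            $sequence_hash{$header} .= $line;
        }
    }
    return \%sequence_hash;
}

# Demo on an in-memory filehandle holding the sample data from the post.
my $data = <<'END';
>sequence_5849
CCTGCGGAAGATCGGCACTAGAATAGCCAGAACCGTTTCTCTGAGGCTTCCGGCCTTCCC
TCCCACTAATAATTCTGAGG
>sequence_5959
CCATCGGTAGCGCATCCTTAGTCCAATTAAGTCCCTATCCAGGCGCTCCGCCGAAGGTCT
ATATCCATTTGTCAGCAGACACGC
>sequence_0808
CCACCCTCGTGGTATGGCTAGGCATTCAGGAACCGGAGAACGCTTCAGACCAGCCCGGAC
TGGGAACCTGCGGGCAGTAGGTGGAAT
END

open( my $fh, '<', \$data ) or die "Could not open in-memory file: $!";
my $sequences = parse_fasta($fh);
close $fh;

# Print each header with the length of its joined sequence.
for my $name ( sort keys %$sequences ) {
    printf "%s\t%d\n", $name, length $sequences->{$name};
}
```

If you would rather keep the separate $seq accumulator, the equivalent fix is to store $seq under $header once more after the while loop ends.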