
Yes, GenBank files can be problematic at times: they may be incomplete, or BioPerl gets cranky with them. You have not provided code that I can test, but you may consider the following workaround: work with the FASTA files in conjunction with the feature table provided in the GenBank files.

1. Convert the GenBank files to GFF (with genbank2gff3.pl).
2. Convert the GenBank files to FASTA, or download the FASTA equivalent.
3. Parse the GFF files and extract the CDS features with their coordinate information (type, start, end, strand):
    perl -F'\t' -lane 'if($F[2] eq "CDS"){print}' GCA_000153565.1_ASM15356v1_genomic.gbff.gff | cut -f3,4,5,7 > GCA_000153565.1_ASM15356v1_genomic.coordinates.txt
4. Extract the subsequences from the FASTA files using the coordinates saved in GCA_000153565.1_ASM15356v1_genomic.coordinates.txt.

For the last step you may use Bio::SeqIO to read the FASTA file and $seq->subseq($start, $stop) to cut out each feature, but make sure you take the reverse complement of the sequences on the negative strand.
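To make the strand handling concrete, here is a minimal sketch in plain Perl, not tested against your files. The genome string, the two coordinate lines and the names revcom and extract_feature are invented for illustration. It assumes a single sequence and the four columns written by the cut command above (type, start, end, strand). For an assembly with several contigs you would also need to keep GFF column 1 (the sequence ID) to know which sequence each feature belongs to.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Reverse-complement a DNA string. Only A, C, G and T (either case) are
# complemented; any other character (e.g. N) is left unchanged.
sub revcom {
    my ($dna) = @_;
    my $rc = reverse $dna;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Cut one feature out of a sequence. $start and $end are 1-based and
# inclusive, as in GFF; features on the '-' strand are reverse-complemented.
sub extract_feature {
    my ($genome, $start, $end, $strand) = @_;
    my $sub = substr($genome, $start - 1, $end - $start + 1);
    return $strand eq '-' ? revcom($sub) : $sub;
}

# Toy data, invented for illustration: one short "genome" and two lines
# in the format of the coordinates file (type, start, end, strand).
my $genome      = 'AAATGCCCTAGGG';
my @coordinates = ("CDS\t3\t8\t+", "CDS\t5\t10\t-");

for my $line (@coordinates) {
    my ($type, $start, $end, $strand) = split /\t/, $line;
    my $cds = extract_feature($genome, $start, $end, $strand);
    print join("\t", $start, $end, $strand, $cds), "\n";
}
```

With BioPerl objects the same idea should be expressible as $seq->trunc($start, $end)->revcom->seq for the minus strand and $seq->subseq($start, $end) for the plus strand. Please check the Bio::PrimarySeq documentation before relying on that.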

A 4 year old monk

In reply to Re: Debugging Bioperl warnings for Genebank files that are missing info by biohisham
in thread Debugging Bioperl warnings for Genebank files that are missing info by Sosi
