PerlMonks |
Debugging Bioperl warnings for Genebank files that are missing info
by Sosi (Sexton) on Oct 23, 2014 at 22:33 UTC ( [id://1104819]=perlquestion )
Sosi has asked for the wisdom of the Perl Monks concerning the following question:

O masters of Perl wisdom, please enlighten me. I am struggling to read GenBank files (.gbff) with BioPerl. I am trying to extract CDS and translation sequences using $feat->spliced_seq->seq and $feat->get_tag_values("translation"). My problem is that many of the GenBank files are incomplete or do not match the "correct" (example) format:
If all files were in the correct format, it would be relatively straightforward to extract FASTA files with each gene or protein in the format
and
Many of the files that I have either lack the "ORIGIN" field at the bottom (example) or have multiple "ORIGIN" fields (example), each just after a "CDS" field. This results in warnings and die errors that prevent me from doing what I need to do. Most of the warnings indicate that BioPerl was unable to infer the sequence (because the "ORIGIN" field is missing).

So my questions are the following:

1. Could you give me any tips on how to find which files have this incorrect format? My plan is: if $feat->spliced_seq->seq fails, push the filename onto a list and manually download those files again :( But I haven't been able to test this properly yet, and maybe BioPerl already has something for these cases?

2. How can I prevent the automatic die every time a warning comes up, so that I can get the whole list of files that are not formatted as they should be? Curiously, of the ~1000 files I am processing, the script runs through a few hundred, outputting those errors, but quits at some point. I should say that I have use autodie; in the preamble, but I think the die is being issued by BioPerl.

Note: this has since (27 Oct) been crossposted on Biostars in here
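Both questions can be approached without changing BioPerl's behaviour. What follows is only a minimal sketch in plain Perl, not an established BioPerl facility; the helper names origin_counts and collect_failures are hypothetical. It assumes the ORIGIN keyword starts at the beginning of a line and that records end with a line containing only "//". The first helper counts ORIGIN lines per record, so files with zero or several can be listed before any parsing. The second runs a callback inside eval, so an exception is recorded per file instead of ending the run. Note that autodie only changes Perl built-ins such as open and close, so it is unlikely to be the source of a die raised inside BioPerl.

```perl
use strict;
use warnings;

# Count ORIGIN lines per record in a GenBank flat file.
# Returns a list of [locus_name, origin_count] pairs.
sub origin_counts {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my (@records, $locus, $count);
    while (my $line = <$fh>) {
        if ($line =~ /^LOCUS\s+(\S+)/) {
            # A new record starts; flush a previous one that was never closed.
            push @records, [$locus, $count] if defined $locus;
            ($locus, $count) = ($1, 0);
        }
        elsif ($line =~ /^ORIGIN\b/) {
            $count++ if defined $locus;
        }
        elsif ($line =~ m{^//\s*$}) {
            # End-of-record marker.
            push @records, [$locus, $count] if defined $locus;
            ($locus, $count) = (undef, 0);
        }
    }
    close $fh;
    push @records, [$locus, $count] if defined $locus;
    return @records;
}

# Run $worker->($file) for each file inside eval, so that an exception
# is recorded instead of stopping the loop.
# Returns a hash ref mapping file name => error message.
sub collect_failures {
    my ($worker, @files) = @_;
    my %failed;
    for my $file (@files) {
        my $ok = eval { $worker->($file); 1 };
        next if $ok;
        my $err = "$@";    # stringify in case the exception is an object
        $failed{$file} = length $err ? $err : 'unknown error';
    }
    return \%failed;
}

# Example: print every file given on the command line that has at least
# one record without exactly one ORIGIN line.
my @suspect = grep {
    my $file = $_;
    grep { $_->[1] != 1 } origin_counts($file);
} @ARGV;
print "$_\n" for @suspect;
```

The callback passed to collect_failures could be the existing per-file extraction code, including the $feat->spliced_seq->seq call, and the returned hash would then be the list of files to download again. A record with no ORIGIN line is not necessarily corrupt: some GenBank records may carry a CONTIG line in place of the sequence, so the flagged files are worth inspecting by hand.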