Re: Error with Bio::Seq Translate

Re^2: Error with Bio::Seq Translate by dmbNEWB (Initiate) on Jan 06, 2015 at 16:45 UTC
Thanks for your reply! Well, it's only the `my $prot3 = $seq->translate(2);` part that does not work, which makes me think the input is fine, since it works for $prot and $prot2. However, here is sample input in case it helps. The file contains 194,503 of these sequences; here are a few to give an idea:

>3HSAA000001 CA000001
tgcagccgcgggcccagggtgctgttggtgtcctcagaagtgccggggattctggactgg
ctccctcccctcctgttgcagcacaaggccggggtctctggggggctggagaagcctccc
tcattcctcccaggaattaataaatgtgaagagaggctctgtttaaaatgtctttggact
cccagggctgagtgggctgggatctcgtgtcctcaa
>3HSAA000002 CA000002
tgcagccgcgggcccagggtgctgttggtgtcctcagaagtgccggggattctggactgg
ctccctcccctcctgttgcagcacaaggccggggtctctggggggctggagaagcctccc
tcattcctcccaggaattaataaatgtgaagagaggctctgtttaaaatgtctttggact
cccagggctgagtgggctgggatctcgtgtcctcaaagtggatgggttctggggtggctc
ctgaggtagaggagtggagaactggctcttaagagcacagttttcttttcttttttcttt
tttgagacagggtctcacactgtcacccaggctggagtgcagtgttgcagtccggctcac
tgcagttttgacctcccaggctcaagcgatcctcctgcctcagcctcccaagtagctggg
agcctgggcatgcatcgccacgtctggctaattattattttttgtagacagggtctcact
gtgttgcccaagcttgtcttgaactcctggccttaagtgatcctcccacctcagcctcct
gagtagctgggactacaggcatgagccaccatgcctggccaactcacatttttctttcta
ttatttattttttgtagagatgagtctcactatgttgcccaggctggtcttgaactcttg
ggctcaagtgatcctcccgcctcagcctcccaaagtgttgggattacaggtgtgagccac
agcacctggccaaccaacacttctcagggcctctttcatctgtgctcttccaggatgctg
cctcttactccctgggcacctcggcctggtcccagcaggtatgggcagttgcttgaggct
ccagacatactcacctctacctcgaccacatcaaccccatcaccaagaggaggttcaggg
aagctgcattttgtggtcttgtcctcccagtccagggtggtagtgctgggctgttcccag
cctcccacagcagagctggacgctgtggaaatggctggattcctctgtgttctttcccat
tatccctcagctctgagtcctcttgtgccatctgacatctgatgccttcccaaccactgc
cgagtttctgcctggagcagggcctcaaggccctggcacacagaagatgcgtatcagtat
tatcaaccaatagttgatgaattgtgtttttcaacgaatttgctagtgatctggtttact
gccttagtaatatctagttcctaatatgcctatgccttttaatgtctgcacagtctatga
tgatgtcacttctctcaagcctgatgtgtcctctctctctttcttgctagtggtttatca
atttttaaaaccttttcttgtttatttgtttttgagacagaattttgctcttgttgccca
ggctggagtgcaatggcgtgatcttggctcacagcaacctctacctcctgggttcaagtg
attctcgtgcctcagcctcctgagtagctgggctgacaggcacccaccaccacacccagc
taatttttatgtttttagtagagatggggtttcaccttgttgggtggccaggctggtctc
caactcctgagctcaagtgatccacccgcctcagcctcccaaagtgctgggattacaggg
gtgagccaccacgcccagccccagcttagttttttaaaaagtttattttagccattctaa
taggtatgtagacatatctaatagtggttttccttgcgtttccttaatgtctgatgatgt
taagcattttttccccaagtgcttagttgccatctatatagcatctttgatgaaatgtct
gtttatatattttgcacactttaaaatattgggttgttttcttt
>3HSAA000003 CA000003
tgcagccgcgggcccagggtgctgttggtgtcctcagaagtgccggggattctggactgg
ctccctcccctcctgttgcagcacaaggccggggtctctggggggctggagaagcctccc
tcattcctcccaggaattaataaatgtgaagagaggctctgtttaaaatgtc
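With 194,503 records, a quick scan may help rule the input in or out before digging into the translate call. The following is only a sketch: `scan_fasta` is a hypothetical helper (not part of BioPerl), and the two checks it makes (characters outside the IUPAC nucleotide alphabet, and records too short to hold a full codon in the chosen frame) are guesses at what could trip up a frame-2 translation. Nothing in the thread confirms that either is what happens at line 5401.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): read FASTA records from a
# filehandle and report any that look unsuitable for translation in $frame.
sub scan_fasta {
    my ($fh, $frame) = @_;
    my (@problems, $id, $seq);

    # Run both checks on the record collected so far, if there is one.
    my $check = sub {
        return unless defined $id;
        # Anything outside the IUPAC nucleotide alphabet is suspicious.
        push @problems, "$id: unexpected characters"
            if $seq =~ /[^acgtunrykmswbdhv]/i;
        # After skipping $frame bases, at least one full codon should remain.
        push @problems, "$id: too short for frame $frame"
            if length($seq) - $frame < 3;
    };

    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;          # strip newline and trailing whitespace
        next if $line eq '';
        if ($line =~ /^>(\S+)/) {
            $check->();              # finish the previous record
            ($id, $seq) = ($1, '');
        }
        elsif (defined $id) {
            $seq .= $line;           # sequence may span several lines
        }
    }
    $check->();                      # finish the last record
    return @problems;
}
```

Calling it as `print "$_\n" for scan_fasta($fh, 2);` on an opened FASTA filehandle prints one line per suspicious record, identified by the first word of its header.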
Re^3: Error with Bio::Seq Translate by PerlSufi (Friar) on Jan 06, 2015 at 17:01 UTC
You're welcome. And what is the goal of the script? To convert the DNA to ribonucleic acid (RNA)?

Update: code changed to reflect translating to RNA.

For starters, here is code I made (with acknowledgements to the Perl Monks for help) to parse a FASTA file:

    use strict;
    use warnings;
    use Data::Dumper;

    my $file = '/path/to/your/file/here.fasta';
    open(my $fh, '<', $file) or die "Could not open file '$file': $!";

    # Collect each record's sequence lines under its header line.
    my (%sequence_hash, $header);
    while ( my $line = <$fh> ) {
        chomp($line);
        if ( $line =~ m/^>(.*)/ ) {
            $header = $1;
        }
        elsif ( defined $header ) {    # ignore anything before the first header
            $sequence_hash{$header} .= $line;
        }
    }
    close($fh);

    # look at the header and sequence
    print Dumper(\%sequence_hash);

    # Replace every t/T with u/U (case preserved) to get the RNA form.
    my @translated_sequences;
    foreach my $seq ( keys %sequence_hash ) {
        $sequence_hash{$seq} =~ tr/tT/uU/;
        push @translated_sequences, $sequence_hash{$seq};
    }
    print Dumper(\@translated_sequences);
Re^4: Error with Bio::Seq Translate by dmbNEWB (Initiate) on Jan 07, 2015 at 15:22 UTC
My overall goal is converting DNA to AMINO ACIDS (protein). (DNA to RNA is transcription, not translation.) Here is the information from BioPerl on this function: http://search.cpan.org/~cjfields/BioPerl-1.6.924/Bio/PrimarySeqI.pm#translate

"Function: Provides the translation of the DNA sequence using full IUPAC ambiguities in DNA/RNA and amino acid codes."

Here is the code example from the Beginner's guidelines: `$prot_obj = $my_seq_object->translate(-frame => 1);`

And here is example code I found online, which uses Bio::SeqIO:

    #!/usr/bin/perl -w
    use strict;
    use Bio::Seq;
    use Bio::SeqIO;

    my $input = $ARGV[0];
    my $seqio_obj = Bio::SeqIO->new(-file => $input, -format => "fasta" );
    my $line_count;

    # process multi-fasta sequences
    while (my $seq_obj = $seqio_obj->next_seq) {
        $line_count++;
        # print "===========================================================\n";
        # obtain id of the nt sequence
        my $id = $seq_obj->display_id;
        # print id of nt sequence
        # print "SEQ ID\t>\t", $id, "\n";
        # print "===========================================================\n";

        # use translate to convert nucleotide sequence to protein sequence from frame 0 (default)
        # 'complete' - do some checks for ORF
        # print ">>>>>>>> Frame 0 <<<<<<<\n";
        print ">", $id, "-0", "\n";
        my $prot_obj = $seq_obj->translate();
        # print the protein sequence
        print $prot_obj->seq,"\n";
        # print "\n";

        # print ">>>>>>>> Frame 1 <<<<<<<\n";
        # translation starting from the second nucleotide (frame 1)
        print ">", $id, "-1", "\n";
        $prot_obj = $seq_obj->translate(-frame => 1);
        # print the protein sequence
        print $prot_obj->seq,"\n";
        # print "\n";
    }

My script is working fine for the default frame, for the 1st frame, and until line 5401 for the 2nd frame. The part that messes up is when it gets to the (-frame => 2) file, but even that file works until line 5401. So I don't think it is an input issue. And thank you for the above code, but I am translating to amino acids.
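To make the frame numbering discussed here concrete, here is a minimal pure-Perl sketch of what translating in frame 0, 1 or 2 means. `translate_frame` is a hypothetical helper written for illustration, not BioPerl's implementation. It assumes the standard genetic code, turns any codon with an ambiguous base into `X`, and simply drops an incomplete trailing codon; BioPerl's own `translate` may well treat ambiguity codes and partial codons differently.

```perl
use strict;
use warnings;

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases t, c, a, g in the first, second and third position.
my $AMINO = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %INDEX = (t => 0, c => 1, a => 2, g => 3);

# Hypothetical helper, not BioPerl's implementation: translate $dna starting
# $frame bases in (0, 1 or 2). Codons containing anything other than
# a/c/g/t become 'X', and an incomplete trailing codon is dropped.
sub translate_frame {
    my ($dna, $frame) = @_;
    $dna = lc $dna;
    $dna =~ tr/u/t/;                          # accept RNA input as well
    my $protein = '';
    for (my $i = $frame; $i + 3 <= length($dna); $i += 3) {
        my @b = split //, substr($dna, $i, 3);
        if (grep { !exists $INDEX{$_} } @b) {
            $protein .= 'X';                  # ambiguous or unknown base
            next;
        }
        my $pos = 16 * $INDEX{$b[0]} + 4 * $INDEX{$b[1]} + $INDEX{$b[2]};
        $protein .= substr($AMINO, $pos, 1);
    }
    return $protein;
}

print "frame $_: ", translate_frame('atggcctga', $_), "\n" for 0 .. 2;
```

For the input `atggcctga` this prints `MA*` for frame 0, `WP` for frame 1 and `GL` for frame 2: each frame skips one more leading base, so a record needs at least `frame + 3` bases to yield any amino acid at all.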
Re^5: Error with Bio::Seq Translate by PerlSufi (Friar) on Jan 08, 2015 at 15:48 UTC

